NVIDIA H100 PCIe, SXM5 & NVL GPU Comparison: What Is the Difference?
The NVIDIA H100 GPU comes in different form factors — mainly PCIe, SXM5, and the special H100 NVL version — and choosing the wrong one is an expensive mistake.
NVIDIA H100 GPUs
In Stock: PCIe, SXM5 & NVL GPUs
This guide compares H100 PCIe, H100 SXM5, and H100 NVL GPUs technically, explains the bandwidth architecture, covers real server options, and helps you decide which configuration fits your infrastructure and workload.

Comparison: NVIDIA H100 PCIe, SXM5 & NVL GPUs
Quick comparison
Variant | Best for | VRAM | Power |
H100 PCIe 80GB | Flexible AI servers, inference, smaller fine-tuning | 80GB | 300–350W |
H100 SXM5 80GB | HGX H100, large training, HPC | 80GB | Up to 700W |
H100 NVL 94GB | LLM inference, memory-heavy inference | 94GB per GPU | 350–400W |
Technical differences
Variant | Memory type | Memory bandwidth | GPU-to-GPU bandwidth |
H100 PCIe 80GB | HBM2e | ~2.0 TB/s | PCIe Gen5 / optional NVLink bridge |
H100 SXM5 80GB | HBM3 | ~3.35 TB/s | 900 GB/s via NVLink / NVSwitch |
H100 NVL 94GB | HBM3 | ~3.9 TB/s | 600 GB/s between paired GPUs |
Decision Shortcut: NVIDIA H100 PCIe, SXM5 & NVL GPUs
If you only need one thing from this article:
1 GPU → H100 PCIe
2 GPUs, LLM inference or memory-heavy inference → H100 NVL
2 GPUs, general inference or smaller fine-tuning → H100 PCIe
4–8 GPUs, large model training → H100 SXM5 / HGX H100
Large AI clusters, foundation models, HPC → H100 SXM5 with InfiniBand networking
Everything else in this article is the technical reasoning behind that shortcut.
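The shortcut above can be written as a small lookup. This is only a sketch of the list in this article: the function name and the workload labels ("llm_inference", "inference", "fine_tuning", "training", "cluster") are hypothetical, and any combination the shortcut does not mention returns None instead of guessing.

```python
from typing import Optional


def recommend_h100(num_gpus: int, workload: str) -> Optional[str]:
    """Return the variant the decision shortcut suggests, or None if it is silent.

    The workload labels are illustrative names, not NVIDIA terminology.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if workload == "cluster":
        # Large AI clusters, foundation models, HPC.
        return "H100 SXM5 with InfiniBand networking"
    if num_gpus == 1:
        return "H100 PCIe"
    if num_gpus == 2:
        if workload == "llm_inference":
            # Memory-heavy inference favors the 94GB NVL pair.
            return "H100 NVL"
        if workload in ("inference", "fine_tuning"):
            return "H100 PCIe"
    if 4 <= num_gpus <= 8 and workload == "training":
        return "H100 SXM5 / HGX H100"
    # Not covered by the shortcut (for example, 3 GPUs).
    return None
```

For example, recommend_h100(2, "llm_inference") returns "H100 NVL", while recommend_h100(8, "training") returns "H100 SXM5 / HGX H100".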
What Is the Difference Between PCIe, SXM5 & NVL NVIDIA H100 GPUs?
NVIDIA H100 PCIe uses a standard PCIe Gen5 x16 interface, so it works in many enterprise GPU servers and is a flexible choice for inference, smaller training jobs, and standard server deployments. Its main limitation is GPU-to-GPU communication: PCIe Gen5 is fast, but it cannot match a full HGX system with NVLink and NVSwitch.
NVIDIA H100 NVL is a special PCIe-based H100 version built mainly for LLM inference. It offers 94GB memory per GPU and up to 600 GB/s NVLink bandwidth between two GPUs, making it a strong choice for dual-GPU inference without a full HGX platform.
NVIDIA H100 SXM5 does not use a standard PCIe slot. It runs on an HGX H100 platform with NVLink and NVSwitch, giving 4-GPU and 8-GPU servers much stronger GPU-to-GPU communication.
The simple difference:
Form factor | Best for | GPU-to-GPU bandwidth |
H100 PCIe | Flexibility, inference, smaller workloads | PCIe Gen5 fabric, optional NVLink bridge for 2 GPUs |
H100 NVL | LLM inference, memory-heavy inference, dual-GPU setups | Up to 600 GB/s between paired GPUs |
H100 SXM5 | Multi-GPU training, HPC, large models | 900 GB/s via NVLink / NVSwitch |
PCIe, NVL, and SXM5 are different H100 configurations. SXM5 needs a complete HGX H100 system, while NVL is a PCIe-based dual-GPU inference solution, not a replacement for HGX H100.
What Is the Difference Between H100 PCIe, H100 SXM5, and H100 NVL GPUs?
Variant | Main purpose | Memory |
H100 PCIe 80GB | Standard enterprise GPU servers | 80GB |
H100 SXM5 80GB | HGX training systems | 80GB |
H100 NVL 94GB | LLM inference, usually dual-GPU NVLink pair | 94GB per GPU |
The H100 PCIe is usually the flexible option.
The H100 SXM5 is usually the maximum-performance training option.
The H100 NVL is a special version mainly designed for large language model inference, where more memory per GPU and a strong two-GPU NVLink connection can help.
The H100 SXM5 also has much higher memory bandwidth than H100 PCIe: around 3.35 TB/s vs around 2.0 TB/s. The H100 NVL goes even higher at around 3.9 TB/s.
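To see why memory bandwidth matters for LLM inference, here is a rough ceiling estimate. It assumes single-stream decoding where every generated token reads all model weights from GPU memory once, and it ignores KV cache traffic, compute time, and kernel overhead, so real throughput will be lower. The function name and the 13B FP16 example model are illustrative assumptions; only the bandwidth figures come from this article.

```python
# Peak memory bandwidth in GB/s, taken from the figures in this article.
MEMORY_BW_GBPS = {"H100 PCIe": 2000, "H100 SXM5": 3350, "H100 NVL": 3900}


def decode_tokens_per_second_ceiling(
    variant: str, params_billion: float, bytes_per_param: float
) -> float:
    """Upper bound on tokens/s if decoding is limited only by reading weights.

    Assumption: each token reads every weight exactly once from GPU memory.
    """
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes = GB
    return MEMORY_BW_GBPS[variant] / weights_gb


# Hypothetical example: a 13B-parameter model stored in FP16 (2 bytes each).
for variant in MEMORY_BW_GBPS:
    ceiling = decode_tokens_per_second_ceiling(variant, 13, 2)
    print(f"{variant}: at most ~{ceiling:.0f} tokens/s per stream")
```

Under these assumptions the ceiling is about 77 tokens/s on H100 PCIe, 129 on H100 SXM5, and 150 on H100 NVL. The absolute numbers are optimistic; the ratio between variants is the useful part.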
NVIDIA H100 PCIe, SXM5 & NVL GPUs: Bandwidth Architecture in Detail
Connection | Bandwidth | Topology |
H100 SXM5 NVLink 4 | 900 GB/s per GPU | HGX H100 with NVSwitch |
H100 PCIe Gen5 x16 | 128 GB/s | PCIe fabric |
H100 PCIe with NVLink bridge | Up to 600 GB/s between 2 GPUs | Two-GPU bridge only |
H100 SXM5 memory bandwidth | ~3.35 TB/s | HBM3 |
H100 PCIe memory bandwidth | ~2.0 TB/s | HBM2e |
H100 NVL memory bandwidth | ~3.9 TB/s | HBM3 |
The SXM5 gives you much stronger GPU-to-GPU communication: 900 GB/s per GPU over NVLink is roughly seven times the 128 GB/s of a PCIe Gen5 x16 link.
That matters when several GPUs must work together as one system, for example:
large language model training
tensor parallelism
model parallel workloads
large HPC simulations
multi-GPU scientific computing
workloads where GPUs exchange data constantly
For inference, smaller fine-tuning jobs, computer vision, recommendation systems, and single-GPU workloads, H100 PCIe can still be very strong.
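The bandwidth table can be turned into a quick back-of-the-envelope comparison. The sketch below divides a payload by each link's headline rate. It assumes the peak figures are fully achievable and ignores latency and protocol overhead, so treat the results as best-case values. The 10 GB payload and the function name are hypothetical.

```python
# Headline GPU-to-GPU figures in GB/s from the bandwidth table above.
LINK_GBPS = {
    "PCIe Gen5 x16": 128,
    "NVLink bridge (2 GPUs)": 600,
    "SXM5 NVLink": 900,
}


def ideal_transfer_ms(payload_gb: float, link: str) -> float:
    """Best-case time in milliseconds to move payload_gb at the link's peak rate."""
    return payload_gb / LINK_GBPS[link] * 1000.0


# Hypothetical example: moving 10 GB of activations or weights between GPUs.
for name in LINK_GBPS:
    print(f"{name}: {ideal_transfer_ms(10, name):.1f} ms for 10 GB")
```

With these inputs the same 10 GB takes about 78 ms over PCIe Gen5 x16, 17 ms over a two-GPU NVLink bridge, and 11 ms over SXM5 NVLink.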
NVIDIA H100 PCIe, SXM5 & NVL GPUs: Full Technical Specifications
Architecture
All variants: NVIDIA Hopper architecture
GPU: GH100
Manufacturing process: TSMC 4N
Tensor Cores: 4th generation
Transformer Engine: Yes
FP8 support: Yes
Confidential Computing: Yes
CUDA cores
Variant | CUDA cores |
H100 PCIe | 14,592 |
H100 SXM5 | 16,896 |
H100 NVL | 16,896 |
H100 NVL is based on the higher-performance H100 configuration and is mainly designed for LLM inference in dual-GPU systems.
Tensor Core performance
Precision | H100 PCIe | H100 SXM5 | H100 NVL |
FP64 | 26 TFLOPS | 34 TFLOPS | 34 TFLOPS |
FP64 Tensor Core | 51 TFLOPS | 67 TFLOPS | 67 TFLOPS |
FP32 | 51 TFLOPS | 67 TFLOPS | 67 TFLOPS |
TF32 Tensor Core (with sparsity) | 756 TFLOPS | 989 TFLOPS | 989 TFLOPS |
FP16 / BF16 Tensor Core (with sparsity) | 1,513 TFLOPS | 1,979 TFLOPS | 1,979 TFLOPS |
FP8 Tensor Core (with sparsity) | 3,026 TFLOPS | 3,958 TFLOPS | 3,958 TFLOPS |
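A quick way to read the table is to compute the PCIe-to-SXM5 ratio for each precision. The sketch below uses only the peak figures listed above; the dictionary and function names are illustrative.

```python
# Peak TFLOPS from the table above, as (H100 PCIe, H100 SXM5).
PEAK_TFLOPS = {
    "FP64": (26, 34),
    "FP64 Tensor Core": (51, 67),
    "FP32": (51, 67),
    "TF32 Tensor Core": (756, 989),
    "FP16 / BF16 Tensor Core": (1513, 1979),
    "FP8 Tensor Core": (3026, 3958),
}


def pcie_to_sxm5_ratio(precision: str) -> float:
    """Fraction of SXM5 peak throughput that the PCIe card reaches on paper."""
    pcie, sxm5 = PEAK_TFLOPS[precision]
    return pcie / sxm5


for precision in PEAK_TFLOPS:
    print(f"{precision}: {pcie_to_sxm5_ratio(precision):.2f}")
```

Every row comes out at about 0.76, so on paper the PCIe card delivers roughly three quarters of the SXM5 peak at any precision. These are peak ratings, not measured results.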
VRAM and memory bandwidth
Variant | VRAM | Memory type | Memory bandwidth |
H100 PCIe | 80GB | HBM2e | ~2.0 TB/s |
H100 SXM5 | 80GB | HBM3 | ~3.35 TB/s |
H100 NVL | 94GB per GPU | HBM3 | ~3.9 TB/s |
GPU-to-GPU interconnect
Variant | Interconnect |
H100 PCIe | PCIe Gen5 x16, 128 GB/s |
H100 PCIe with NVLink bridge | Up to 600 GB/s between two GPUs |
H100 NVL | Up to 600 GB/s between paired GPUs |
H100 SXM5 | Fourth-generation NVLink, 900 GB/s per GPU |
HGX H100 8-GPU systems | NVLink + NVSwitch topology |
Power draw
Variant | Power draw |
H100 PCIe | 300–350W configurable |
H100 SXM5 | Up to 700W configurable |
H100 NVL | 350–400W configurable |
Server compatibility
Variant | Server compatibility |
H100 PCIe | Enterprise GPU servers with PCIe Gen5 support |
H100 SXM5 | HGX H100 baseboard only |
H100 NVL | Special PCIe-based dual-GPU configuration with NVLink |
MIG support
H100 PCIe supports Multi-Instance GPU
H100 SXM5 supports Multi-Instance GPU
H100 NVL supports Multi-Instance GPU
Up to 7 isolated GPU instances per GPU
H100 PCIe is the flexible enterprise server option. H100 SXM5 is the HGX version for large training and HPC. H100 NVL is the special LLM inference version with 94GB memory per GPU and strong dual-GPU NVLink bandwidth.
NVIDIA H100 PCIe, SXM5 & NVL GPUs: Scale-Up vs Scale-Out
SXM5 is better for scale-up.
This means fewer servers, but each server has tightly connected GPUs through NVLink and NVSwitch. This is important when one large model must be split across several GPUs.
PCIe is better for scale-out.
This means more standard servers, usually connected with InfiniBand or Ethernet. It is often more flexible and easier to deploy in existing data centers.
H100 NVL sits between these two options.
It is PCIe-based, but built for dual-GPU LLM inference, with more memory than standard H100 PCIe and fast NVLink between two GPUs. It is not the same as a 4-GPU or 8-GPU HGX H100 SXM5 system.
Strategy | Best option | Best for |
Scale-up | H100 SXM5 | Large models, tensor parallelism, heavy GPU-to-GPU communication |
Scale-out | H100 PCIe | Inference, smaller training jobs, flexible server deployments |
Dual-GPU LLM inference | H100 NVL | Large inference workloads, high memory per GPU |
Flexible enterprise AI | H100 PCIe | Standard data center servers, easier deployment |
Maximum multi-GPU performance | H100 SXM5 / HGX H100 | 4-GPU and 8-GPU training systems |
Choose H100 SXM5 when GPUs inside one server must work very closely together.
Choose H100 PCIe when you want flexible standard servers.
Choose H100 NVL when you need a strong two-GPU setup for LLM inference, especially when GPU memory is important.
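To make the scale-up argument concrete, here is a rough estimate of one gradient synchronization step. It uses the standard bandwidth term of a ring all-reduce, 2 * (N - 1) / N * S / B, and plugs in the headline per-GPU bandwidth figures from this article as B. That is an optimistic assumption: it ignores latency, and real libraries and topologies behave differently. The 10 GB payload is hypothetical.

```python
def ring_allreduce_seconds(payload_gb: float, num_gpus: int, link_gbps: float) -> float:
    """Bandwidth term of a ring all-reduce: 2 * (N - 1) / N * S / B.

    payload_gb: data each GPU contributes (S), in GB.
    link_gbps:  per-GPU link bandwidth (B), in GB/s.
    """
    if num_gpus < 2:
        return 0.0  # nothing to synchronize with a single GPU
    return 2 * (num_gpus - 1) / num_gpus * payload_gb / link_gbps


# Hypothetical example: 8 GPUs synchronizing 10 GB of gradients.
nvlink = ring_allreduce_seconds(10, 8, 900)  # H100 SXM5 NVLink figure
pcie = ring_allreduce_seconds(10, 8, 128)    # PCIe Gen5 x16 figure
print(f"SXM5 NVLink: {nvlink * 1000:.1f} ms, PCIe Gen5: {pcie * 1000:.1f} ms")
```

With these inputs the step takes about 19 ms over NVLink and about 137 ms over PCIe Gen5. If that synchronization happens on every training step, the gap adds up quickly, which is why tightly coupled workloads favor SXM5.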
Multi-Instance GPU: NVIDIA H100 PCIe, SXM5 & NVL GPUs
MIG is available on NVIDIA H100 GPUs. It allows one physical GPU to be split into smaller isolated GPU instances.
This is useful when several users, teams, or workloads need access to GPU resources without interfering with each other.
For example, one H100 can be used for:
MIG setup | Use case |
7 small GPU instances | Many small inference workloads |
3 medium GPU instances | Medium inference workloads |
1 full GPU | One large training or inference workload |
This makes H100 very useful for:
AI cloud providers
Kubernetes GPU clusters
internal enterprise AI platforms
research teams
multi-tenant inference
GPU-as-a-Service environments
MIG is especially important when you do not want every user to reserve a full H100 GPU.
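As a planning aid, the three MIG setups in the table above can be modeled in a few lines. This is a sketch, not a MIG configuration tool. The per-instance memory sizes (10, 20, and 80 GB on an 80GB H100) are assumptions based on NVIDIA's MIG profile names 1g.10gb, 2g.20gb, and 7g.80gb, and usable memory is slightly lower in practice. The function names are hypothetical.

```python
import math

# (setup name, instances per GPU, approx. memory per instance in GB).
# Memory sizes are assumptions for an 80GB H100, not figures from this article.
MIG_SETUPS = [
    ("7 small GPU instances", 7, 10),
    ("3 medium GPU instances", 3, 20),
    ("1 full GPU", 1, 80),
]


def pick_mig_setup(workload_memory_gb: float) -> str:
    """Return the densest setup whose instances can hold one workload."""
    for name, _count, memory_gb in MIG_SETUPS:
        if workload_memory_gb <= memory_gb:
            return name
    raise ValueError("workload does not fit on a single 80GB H100")


def gpus_needed(num_workloads: int, workload_memory_gb: float) -> int:
    """Physical GPUs required to host num_workloads identical workloads."""
    for _name, count, memory_gb in MIG_SETUPS:
        if workload_memory_gb <= memory_gb:
            return math.ceil(num_workloads / count)
    raise ValueError("workload does not fit on a single 80GB H100")
```

For example, 20 inference services that each need about 8 GB fit on 3 GPUs with the seven-instance setup, while 20 services that need 16 GB each require 7 GPUs with the three-instance setup.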
NVIDIA H100 PCIe, SXM5 & NVL GPU Workloads
NVIDIA H100 PCIe 80GB
H100 PCIe is a strong choice for inference, fine-tuning, data analytics, and smaller training jobs where standard server architecture is enough.
Good fit for:
single-GPU inference
2–4 GPU inference servers
smaller LLM fine-tuning
enterprise AI pilots
multi-tenant GPU platforms
Kubernetes-based GPU clusters
companies that want H100 performance without HGX complexity
NVIDIA H100 SXM5 80GB
H100 SXM5 is the stronger option for heavy multi-GPU work. It is built for HGX H100 systems with 4 or 8 GPUs, NVLink, and NVSwitch.
Good fit for:
large language model training
large-scale fine-tuning
HPC workloads
simulation
scientific computing
AI research clusters
foundation model development
workloads where GPU-to-GPU bandwidth matters
This is the version customers usually need when the GPUs must act like one tightly connected system.
NVIDIA H100 NVL 94GB
H100 NVL is a special H100 version mainly designed for LLM inference. It uses two H100 PCIe-style GPUs connected with NVLink and gives 94GB memory per GPU.
Good fit for:
LLM inference
large model serving
high batch-size inference
dual-GPU deployments
customers who need more memory than standard H100 PCIe
It is not the same as an 8-GPU HGX H100 system. It is more of a focused two-GPU inference solution.
NVIDIA H100 Servers: PCIe, SXM5 & NVL GPU Models
These are the server models most commonly used with NVIDIA H100 GPUs, both from new inventory and the secondary market.
SXM5 / HGX H100 servers — for multi-GPU training and HPC:
NVIDIA DGX H100 — NVIDIA’s own 8-GPU H100 system with HGX H100, NVLink, NVSwitch, and high-speed networking.
Dell PowerEdge XE9680 — 6U enterprise AI server, commonly used with 8× H100 SXM5 GPUs.
Dell PowerEdge XE8640 — 4U platform for 4× H100 SXM GPUs, useful when 8 GPUs are not required.
Supermicro HGX H100 systems — common in GPU cloud, AI infrastructure, and research environments, usually in 4-GPU or 8-GPU HGX configurations.
Lenovo ThinkSystem SR675 V3 / SR680a V3 — Lenovo GPU platforms for AI and HPC workloads, depending on the exact configuration.
HPE Cray XD / Apollo-style GPU systems — high-density GPU platforms for enterprise AI, HPC, and research clusters.
PCIe H100 servers — for inference and flexible deployments:
Dell PowerEdge R760xa — 2U enterprise GPU server, often used with up to 4× double-wide PCIe GPUs.
Dell PowerEdge R760 / R7625 GPU configurations — standard rack servers for smaller PCIe GPU setups.
Supermicro PCIe GPU servers — common for GPU cloud, AI labs, and secondary market H100 PCIe systems.
Lenovo ThinkSystem SR675 V3 — flexible PCIe GPU platform for AI, HPC, and visualization workloads.
HPE ProLiant DL380 / DL385 GPU configurations — useful when the customer already uses HPE infrastructure.
H100 NVL servers — for LLM inference:
H100 NVL systems — PCIe-based dual-GPU configurations with NVLink between two GPUs. Best for LLM inference where high GPU memory and strong two-GPU bandwidth matter.
Always check the exact H100 configuration, including GPU form factor, PCIe/SXM5/NVL compatibility, power and cooling, risers, GPU enablement kit, firmware, network cards, rack power capacity, warranty, and testing.
Common Mistakes When Buying NVIDIA H100 PCIe, SXM5 & NVL GPUs
Buying SXM5 without HGX infrastructure
H100 SXM5 modules need a compatible HGX H100 platform. They cannot be installed in a normal PCIe server.
Assuming PCIe and SXM5 perform the same
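For single-GPU workloads, the difference is smaller and comes from the SXM5's higher core count, memory bandwidth, and power limit rather than from the interconnect. For heavy multi-GPU training, SXM5 is much stronger because of NVLink and NVSwitch.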
Ignoring power and cooling
H100 SXM5 systems need serious power and cooling planning, especially 4-GPU and 8-GPU HGX servers. Always check rack power, PDU capacity, airflow, cooling, power cables, and redundant PSUs.
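A simple power budget check helps here. The sketch below adds GPU power to everything else in the chassis and then counts how many servers fit in a rack budget. Only the 700W SXM5 figure comes from this article. The 3,000W for CPUs, fans, NICs, and drives and the 20 kW rack budget are placeholder assumptions that you should replace with the values from your server datasheet and facility.

```python
import math


def server_power_watts(num_gpus: int, gpu_watts: float, other_watts: float) -> float:
    """GPU power plus everything else in the chassis (CPUs, fans, NICs, drives)."""
    return num_gpus * gpu_watts + other_watts


def servers_per_rack(rack_budget_watts: float, server_watts: float) -> int:
    """Whole servers that fit inside the rack's power budget."""
    return math.floor(rack_budget_watts / server_watts)


# 8x H100 SXM5 at up to 700W each; other_watts is a placeholder assumption.
hgx_server = server_power_watts(num_gpus=8, gpu_watts=700, other_watts=3000)
print(f"Estimated server draw: {hgx_server} W")
# The 20 kW rack budget is also a placeholder assumption.
print(f"Servers per 20 kW rack: {servers_per_rack(20_000, hgx_server)}")
```

The GPUs alone account for 5,600W in an 8-GPU HGX server, so even a generous rack budget holds only a few of these systems.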
Buying H100 PCIe when you really need HGX
H100 PCIe is excellent for many workloads, but if your model needs strong GPU-to-GPU communication across many GPUs, HGX H100 with SXM5 is usually the better architecture.
Buying HGX H100 when PCIe would be enough
Some workloads do not need SXM5. For inference, smaller fine-tuning, or flexible enterprise AI deployments, H100 PCIe can be easier, cheaper, and more practical.
Confusing H100 NVL with H100 SXM5
H100 NVL is mainly a dual-GPU LLM inference solution with 94GB memory per GPU. H100 SXM5 is for 4-GPU and 8-GPU HGX systems with stronger scale-up architecture.
Forgetting about networking
For serious AI clusters, the GPU is only one part of the system. You also need the right NICs, InfiniBand or Ethernet speed, switches, cables, topology, firmware, and tested ports. A weak network design can limit the value of expensive H100 servers.
Summary: NVIDIA H100 PCIe, SXM5 & NVL GPUs
Specification | H100 PCIe 80GB | H100 SXM5 80GB | H100 NVL 94GB |
Best for | Inference, fine-tuning, flexible servers | Training, HPC, large models | LLM inference |
VRAM | 80GB | 80GB | 94GB per GPU |
Memory type | HBM2e | HBM3 | HBM3 |
Memory BW | ~2.0 TB/s | ~3.35 TB/s | ~3.9 TB/s |
GPU-to-GPU BW | PCIe Gen5 / optional 2-GPU NVLink | 900 GB/s NVLink / NVSwitch | 600 GB/s between paired GPUs |
Topology | PCIe fabric | HGX H100 | Dual-GPU NVLink pair |
Infrastructure | Standard GPU server | HGX H100 server | Special NVL server config |
TDP | 300–350W | Up to 700W | 350–400W
MIG | Yes | Yes | Yes |
Scale strategy | Scale-out | Scale-up | Dual-GPU inference |
We have NVIDIA H100 GPUs and complete H100 servers available across PCIe, SXM5, and NVL configurations. Tell us what you are building and we will send you a recommended configuration with pricing.
FAQ: NVIDIA H100 PCIe, SXM5 & NVL GPUs
Can I install an H100 SXM5 in a standard server?
No. H100 SXM5 modules require a compatible HGX H100 system. They cannot be installed in a regular PCIe slot.
Is H100 PCIe much slower than H100 SXM5?
For single-GPU inference, the difference may be smaller. For multi-GPU training, SXM5 can be much faster because of NVLink and NVSwitch bandwidth.
Does H100 PCIe support NVLink?
Yes, H100 PCIe can support NVLink bridge between two GPUs in compatible systems. This is not the same as HGX H100 SXM5 with NVSwitch.
What is the main difference between H100 PCIe and H100 SXM5?
H100 PCIe is for flexible standard servers. H100 SXM5 is for HGX systems where multiple GPUs need very fast communication.
What is H100 NVL?
H100 NVL is a special H100 version mainly designed for large language model inference. It offers 94GB memory per GPU and uses NVLink between two GPUs.
Which H100 is best for LLM training?
For serious LLM training, H100 SXM5 in an HGX H100 server is usually the best option because of NVLink, NVSwitch, and high memory bandwidth.
Which H100 is best for LLM inference?
For many inference workloads, H100 PCIe is enough. For larger LLM inference workloads, H100 NVL can be a very strong option because it has more memory per GPU.
How many H100 GPUs do I need?
It depends on model size, precision, batch size, and whether you are training or only serving inference. For small inference, one H100 may be enough. For large training, you may need 4, 8, or many more GPUs across a cluster.
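For inference, a first estimate is whether the model weights fit in GPU memory. The sketch below computes the weight size from parameter count and precision, then divides by usable memory per GPU. The 20% headroom reserved for KV cache and activations is an assumption, and the 70B example model is hypothetical. Training needs far more memory for gradients and optimizer state, so this covers serving only.

```python
import math

# Memory per GPU in GB, from the tables in this article.
VRAM_GB = {"H100 PCIe": 80, "H100 SXM5": 80, "H100 NVL": 94}


def min_gpus_for_weights(
    params_billion: float,
    bytes_per_param: float,
    variant: str,
    headroom: float = 0.2,
) -> int:
    """Smallest GPU count whose combined usable memory holds the weights.

    headroom is the fraction of each GPU kept free for KV cache and
    activations; 0.2 is an assumed value, not a measured requirement.
    """
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes = GB
    usable_gb = VRAM_GB[variant] * (1 - headroom)
    return math.ceil(weights_gb / usable_gb)


# Hypothetical 70B-parameter model at FP16 (2 bytes) and FP8 (1 byte).
for variant in ("H100 PCIe", "H100 NVL"):
    fp16 = min_gpus_for_weights(70, 2, variant)
    fp8 = min_gpus_for_weights(70, 1, variant)
    print(f"{variant}: {fp16} GPU(s) at FP16, {fp8} GPU(s) at FP8")
```

Under these assumptions the 70B model needs 3 H100 PCIe GPUs at FP16 but fits on an H100 NVL pair, and at FP8 it fits on a single H100 NVL versus 2 H100 PCIe GPUs.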
Should I buy H100 PCIe, NVL or SXM5?
Choose PCIe if you want flexibility, standard servers, and good inference performance. Choose NVL if you need a strong two-GPU setup for LLM inference where memory per GPU matters. Choose SXM5 if you need maximum multi-GPU performance for large training or HPC workloads.

