NVIDIA H100 PCIe, SXM5 & NVL GPU Comparison: What Is the Difference?

May 18
Updated: Jun 6

The NVIDIA H100 GPU comes in different form factors — mainly PCIe, SXM5, and the special H100 NVL version — and choosing the wrong one is an expensive mistake.

This guide compares H100 PCIe, H100 SXM5, and H100 NVL GPUs technically, explains the bandwidth architecture, covers real server options, and helps you decide which configuration fits your infrastructure and workload.

Comparison: NVIDIA H100 PCIe, SXM5 & NVL GPUs

Quick comparison

Variant	Best for	VRAM	Power
H100 PCIe 80GB	Flexible AI servers, inference, smaller fine-tuning	80GB	Up to 350W
H100 SXM5 80GB	HGX H100, large training, HPC	80GB	Up to 700W
H100 NVL 94GB	LLM inference, memory-heavy inference	94GB per GPU	350–400W

Technical differences

Variant	Memory type	Memory bandwidth	GPU-to-GPU bandwidth
H100 PCIe 80GB	HBM2e	~2.0 TB/s	PCIe Gen5 / optional NVLink bridge
H100 SXM5 80GB	HBM3	~3.35 TB/s	900 GB/s via NVLink / NVSwitch
H100 NVL 94GB	HBM3	~3.9 TB/s	600 GB/s between paired GPUs

Decision Shortcut: NVIDIA H100 PCIe, SXM5 & NVL GPUs

If you only need one thing from this article:

1 GPU → H100 PCIe
2 GPUs, LLM inference or memory-heavy inference → H100 NVL
2 GPUs, general inference or smaller fine-tuning → H100 PCIe
4–8 GPUs, large model training → H100 SXM5 / HGX H100
Large AI clusters, foundation models, HPC → H100 SXM5 with InfiniBand networking

Everything else in this article is the technical reasoning behind that list.
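The shortcut can also be written as a small lookup. This is only a sketch of the list above: the function name and the workload labels ("llm_inference", "inference", "fine_tuning", "training", "cluster") are hypothetical, and any combination the list does not mention is reported as not covered rather than guessed.

```python
def recommend_h100(gpu_count: int, workload: str) -> str:
    """Return the variant named in the decision shortcut.

    `workload` uses hypothetical labels: "llm_inference", "inference",
    "fine_tuning", "training", or "cluster".
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    if workload == "cluster":
        return "H100 SXM5 with InfiniBand networking"
    if gpu_count == 1:
        return "H100 PCIe"
    if gpu_count == 2:
        # Two GPUs: NVL for LLM or memory-heavy inference, PCIe otherwise.
        return "H100 NVL" if workload == "llm_inference" else "H100 PCIe"
    if 4 <= gpu_count <= 8 and workload == "training":
        return "H100 SXM5 / HGX H100"
    # Anything the shortcut list does not cover (for example 3 GPUs).
    return "not covered by the shortcut; compare the specifications"


print(recommend_h100(2, "llm_inference"))
print(recommend_h100(8, "training"))
```

Treat the result as a starting point for the comparison that follows, not as a sizing tool.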

What Is the Difference Between PCIe, SXM5 & NVL NVIDIA H100 GPUs?

NVIDIA H100 PCIe uses a standard PCIe Gen5 x16 interface, so it works in many enterprise GPU servers and is a flexible choice for inference, smaller training jobs, and standard server deployments. Its main limitation is GPU-to-GPU communication: PCIe Gen5 is fast, but it cannot match a full HGX system with NVLink and NVSwitch.

NVIDIA H100 NVL is a special PCIe-based H100 version built mainly for LLM inference. It offers 94GB memory per GPU and up to 600 GB/s NVLink bandwidth between two GPUs, making it a strong choice for dual-GPU inference without a full HGX platform.

NVIDIA H100 SXM5 does not use a standard PCIe slot. It runs on an HGX H100 platform with NVLink and NVSwitch, giving 4-GPU and 8-GPU servers much stronger GPU-to-GPU communication.

The simple difference:

Form factor	Best for	GPU-to-GPU bandwidth
H100 PCIe	Flexibility, inference, smaller workloads	PCIe Gen5 fabric, optional NVLink bridge for 2 GPUs
H100 NVL	LLM inference, memory-heavy inference, dual-GPU setups	Up to 600 GB/s between paired GPUs
H100 SXM5	Multi-GPU training, HPC, large models	900 GB/s via NVLink / NVSwitch

PCIe, NVL, and SXM5 are different H100 configurations. SXM5 needs a complete HGX H100 system, while NVL is a PCIe-based dual-GPU inference solution, not a replacement for HGX H100.

What Is the Difference Between H100 PCIe, H100 SXM5, and H100 NVL GPUs?

Variant	Main purpose	Memory
H100 PCIe 80GB	Standard enterprise GPU servers	80GB
H100 SXM5 80GB	HGX training systems	80GB
H100 NVL 94GB	LLM inference, usually dual-GPU NVLink pair	94GB per GPU

The H100 PCIe is usually the flexible option.
The H100 SXM5 is usually the maximum-performance training option.
The H100 NVL is a special version mainly designed for large language model inference, where more memory per GPU and a strong two-GPU NVLink connection can help.

The H100 SXM5 also has much higher memory bandwidth than H100 PCIe: around 3.35 TB/s vs around 2.0 TB/s. The H100 NVL goes even higher at around 3.9 TB/s.

NVIDIA H100 PCIe, SXM5 & NVL GPUs: Bandwidth Architecture in Detail

Connection	Bandwidth	Topology
H100 SXM5 NVLink 4	900 GB/s per GPU	HGX H100 with NVSwitch
H100 PCIe Gen5 x16	128 GB/s	PCIe fabric
H100 PCIe with NVLink bridge	Up to 600 GB/s between 2 GPUs	Two-GPU bridge only
H100 SXM5 memory bandwidth	~3.35 TB/s	HBM3
H100 PCIe memory bandwidth	~2.0 TB/s	HBM2e
H100 NVL memory bandwidth	~3.9 TB/s	HBM3

The SXM5 gives you much stronger GPU-to-GPU communication.

That matters when several GPUs must work together as one system, for example:

large language model training
tensor parallelism
model parallel workloads
large HPC simulations
multi-GPU scientific computing
workloads where GPUs exchange data constantly

For inference, smaller fine-tuning jobs, computer vision, recommendation systems, and single-GPU workloads, H100 PCIe can still be very strong.
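To get a feel for what the interconnect figures mean, the sketch below estimates how long it would take to move a fixed amount of data between two GPUs at each peak rate from the bandwidth table. The 10 GB payload is a hypothetical example, and the result is an idealized lower bound: the table values are peak figures, and real transfers also pay for latency and protocol overhead.

```python
# Peak GPU-to-GPU figures from the bandwidth table, in GB/s.
LINK_GBPS = {
    "H100 PCIe Gen5 x16": 128,
    "NVLink bridge (PCIe pair or NVL pair)": 600,
    "H100 SXM5 NVLink 4": 900,
}


def transfer_ms(payload_gb: float, link_gbps: float) -> float:
    """Idealized time in milliseconds to move payload_gb at link_gbps."""
    return payload_gb / link_gbps * 1000.0


payload_gb = 10.0  # hypothetical amount of data exchanged between GPUs
for name, gbps in LINK_GBPS.items():
    print(f"{name}: {transfer_ms(payload_gb, gbps):.1f} ms")
```

With these assumptions the PCIe path takes roughly seven times as long as SXM5 NVLink. That gap is small for workloads that rarely exchange data and large for workloads that exchange it at every step.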

NVIDIA H100 PCIe, SXM5 & NVL GPUs: Full Technical Specifications

Architecture

All variants: NVIDIA Hopper architecture
GPU: GH100
Manufacturing process: TSMC 4N
Tensor Cores: 4th generation
Transformer Engine: Yes
FP8 support: Yes
Confidential Computing: Yes

CUDA cores

Variant	CUDA cores
H100 PCIe	14,592
H100 SXM5	16,896
H100 NVL	16,896

H100 NVL is based on the higher-performance H100 configuration and is mainly designed for LLM inference in dual-GPU systems.

Tensor Core performance

Precision	H100 PCIe	H100 SXM5	H100 NVL (per GPU)
FP64	26 TFLOPS	34 TFLOPS	30 TFLOPS
FP64 Tensor Core	51 TFLOPS	67 TFLOPS	60 TFLOPS
FP32	51 TFLOPS	67 TFLOPS	60 TFLOPS
TF32 Tensor Core	756 TFLOPS	989 TFLOPS	835 TFLOPS
FP16 / BF16 Tensor Core	1,513 TFLOPS	1,979 TFLOPS	1,671 TFLOPS
FP8 Tensor Core	3,026 TFLOPS	3,958 TFLOPS	3,341 TFLOPS

The TF32, FP16 / BF16, and FP8 Tensor Core figures are NVIDIA's peak values with sparsity. Dense throughput is about half of these numbers.

VRAM and memory bandwidth

Variant	VRAM	Memory type	Memory bandwidth
H100 PCIe	80GB	HBM2e	~2.0 TB/s
H100 SXM5	80GB	HBM3	~3.35 TB/s
H100 NVL	94GB per GPU	HBM3	~3.9 TB/s
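Memory bandwidth matters most for LLM inference at small batch sizes, where generating each token means reading the model weights from GPU memory. A common rule of thumb, used here as an assumption and not as a benchmark, is that the token rate cannot exceed memory bandwidth divided by the size of the weights. The sketch below applies that rule to the table values. The 70 GB weight size is hypothetical, and the rule ignores KV-cache traffic, batching, and compute limits.

```python
# Memory bandwidth from the table above, in GB/s.
MEM_BW_GBPS = {
    "H100 PCIe": 2000,
    "H100 SXM5": 3350,
    "H100 NVL": 3900,
}


def decode_tokens_per_s_ceiling(weights_gb: float, mem_bw_gbps: float) -> float:
    """Rough upper bound on single-stream tokens/s when decoding is
    limited by reading all weights from memory once per token."""
    return mem_bw_gbps / weights_gb


weights_gb = 70.0  # hypothetical model that fits on one GPU
for name, bw in MEM_BW_GBPS.items():
    ceiling = decode_tokens_per_s_ceiling(weights_gb, bw)
    print(f"{name}: at most about {ceiling:.0f} tokens/s per stream")
```

The absolute numbers are only ceilings, but the ratio between variants follows the bandwidth ratio directly, which is why H100 NVL is positioned for inference.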

GPU-to-GPU interconnect

Variant	Interconnect
H100 PCIe	PCIe Gen5 x16, 128 GB/s
H100 PCIe with NVLink bridge	Up to 600 GB/s between two GPUs
H100 NVL	Up to 600 GB/s between paired GPUs
H100 SXM5	Fourth-generation NVLink, 900 GB/s per GPU
HGX H100 8-GPU systems	NVLink + NVSwitch topology

Power draw

Variant	Power draw
H100 PCIe	Up to 350W (configurable 300–350W)
H100 SXM5	Up to 700W configurable
H100 NVL	350–400W configurable

Server compatibility

Variant	Server compatibility
H100 PCIe	Enterprise GPU servers with PCIe Gen5 support
H100 SXM5	HGX H100 baseboard only
H100 NVL	Special PCIe-based dual-GPU configuration with NVLink

MIG support

H100 PCIe supports Multi-Instance GPU
H100 SXM5 supports Multi-Instance GPU
H100 NVL supports Multi-Instance GPU
Up to 7 isolated GPU instances per GPU

H100 PCIe is the flexible enterprise server option. H100 SXM5 is the HGX version for large training and HPC. H100 NVL is the special LLM inference version with 94GB memory per GPU and strong dual-GPU NVLink bandwidth.

NVIDIA H100 PCIe, SXM5 & NVL GPUs: Scale-Up vs Scale-Out

SXM5 is better for scale-up.

This means fewer servers, but each server has tightly connected GPUs through NVLink and NVSwitch. This is important when one large model must be split across several GPUs.

PCIe is better for scale-out.

This means more standard servers, usually connected with InfiniBand or Ethernet. It is often more flexible and easier to deploy in existing data centers.

H100 NVL sits between these two options.

It is PCIe-based, but built for dual-GPU LLM inference, with more memory than standard H100 PCIe and fast NVLink between two GPUs. It is not the same as a 4-GPU or 8-GPU HGX H100 SXM5 system.

Strategy	Best option	Best for
Scale-up	H100 SXM5	Large models, tensor parallelism, heavy GPU-to-GPU communication
Scale-out	H100 PCIe	Inference, smaller training jobs, flexible server deployments
Dual-GPU LLM inference	H100 NVL	Large inference workloads, high memory per GPU
Flexible enterprise AI	H100 PCIe	Standard data center servers, easier deployment
Maximum multi-GPU performance	H100 SXM5 / HGX H100	4-GPU and 8-GPU training systems

Choose H100 SXM5 when GPUs inside one server must work very closely together.

Choose H100 PCIe when you want flexible standard servers.

Choose H100 NVL when you need a strong two-GPU setup for LLM inference, especially when GPU memory is important.
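The scale-up argument can be made concrete with the textbook ring all-reduce cost model, in which each GPU sends about 2 × (N − 1) / N times the buffer size. The sketch below is an estimate under stated assumptions, not a measurement: it treats the headline 900 GB/s and 128 GB/s figures as bidirectional totals and uses half of each as the per-direction rate, it ignores latency, and real collective libraries may pick a different algorithm on NVSwitch systems. The 10 GB buffer is hypothetical.

```python
def ring_allreduce_ms(size_gb: float, n_gpus: int, per_direction_gbps: float) -> float:
    """Bandwidth-only estimate of ring all-reduce time in milliseconds."""
    if n_gpus < 2:
        return 0.0  # nothing to exchange with a single GPU
    # Each GPU sends (and receives) this much data over the ring.
    traffic_gb = 2 * (n_gpus - 1) / n_gpus * size_gb
    return traffic_gb / per_direction_gbps * 1000.0


size_gb = 10.0  # hypothetical gradient buffer
n_gpus = 8

# Assumption: per-direction rate is half of the headline figure.
nvlink_ms = ring_allreduce_ms(size_gb, n_gpus, 900 / 2)
pcie_ms = ring_allreduce_ms(size_gb, n_gpus, 128 / 2)

print(f"SXM5 NVLink: {nvlink_ms:.1f} ms per all-reduce")
print(f"PCIe Gen5:   {pcie_ms:.1f} ms per all-reduce")
```

If a training step performs this exchange every iteration, the difference repeats thousands of times, which is the practical meaning of "GPUs must work very closely together".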

Multi-Instance GPU: NVIDIA H100 PCIe, SXM5 & NVL GPUs

MIG is available on NVIDIA H100 GPUs. It allows one physical GPU to be split into smaller isolated GPU instances.

This is useful when several users, teams, or workloads need access to GPU resources without interfering with each other.

For example, one H100 can be used for:

MIG setup	Use case
7 small GPU instances	Many small inference workloads
3 medium GPU instances	Medium inference workloads
1 full GPU	One large training or inference workload

This makes H100 very useful for:

AI cloud providers
Kubernetes GPU clusters
internal enterprise AI platforms
research teams
multi-tenant inference
GPU-as-a-Service environments

MIG is especially important when you do not want every user to reserve a full H100 GPU.
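As a planning aid, the sketch below checks whether a requested mix of MIG instances stays within the slice budget of one 80GB H100. It assumes the standard profile names for the 80GB cards and a budget of 7 compute slices and 8 memory slices. The check is necessary but not sufficient, because the driver also enforces placement rules that are not modeled here. The authoritative list of profiles for a given GPU comes from `nvidia-smi mig -lgip`.

```python
# (compute slices, memory slices) per profile on an 80GB H100 (assumed values).
MIG_PROFILES = {
    "1g.10gb": (1, 1),
    "2g.20gb": (2, 2),
    "3g.40gb": (3, 4),
    "4g.40gb": (4, 4),
    "7g.80gb": (7, 8),
}

MAX_COMPUTE_SLICES = 7
MAX_MEMORY_SLICES = 8


def fits_slice_budget(profiles: list[str]) -> bool:
    """True if the requested instances stay within the per-GPU slice budget.

    Placement constraints enforced by the driver are not modeled.
    """
    compute = sum(MIG_PROFILES[p][0] for p in profiles)
    memory = sum(MIG_PROFILES[p][1] for p in profiles)
    return compute <= MAX_COMPUTE_SLICES and memory <= MAX_MEMORY_SLICES


print(fits_slice_budget(["1g.10gb"] * 7))         # 7 small instances
print(fits_slice_budget(["2g.20gb"] * 3))         # 3 medium instances
print(fits_slice_budget(["4g.40gb", "4g.40gb"]))  # exceeds 7 compute slices
```

The first two calls correspond to the "7 small" and "3 medium" rows of the table above.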

NVIDIA H100 PCIe, SXM5 & NVL GPU Workloads

NVIDIA H100 PCIe 80GB

H100 PCIe is a strong choice for inference, fine-tuning, data analytics, and smaller training jobs where standard server architecture is enough.

Good fit for:

single-GPU inference
2–4 GPU inference servers
smaller LLM fine-tuning
enterprise AI pilots
multi-tenant GPU platforms
Kubernetes-based GPU clusters
companies that want H100 performance without HGX complexity

NVIDIA H100 SXM5 80GB

H100 SXM5 is the stronger option for heavy multi-GPU work. It is built for HGX H100 systems with 4 or 8 GPUs, NVLink, and NVSwitch.

Good fit for:

large language model training
large-scale fine-tuning
HPC workloads
simulation
scientific computing
AI research clusters
foundation model development
workloads where GPU-to-GPU bandwidth matters

This is the version customers usually need when the GPUs must act like one tightly connected system.

NVIDIA H100 NVL 94GB

H100 NVL is a special H100 version mainly designed for LLM inference. It is deployed as a pair of PCIe form factor GPUs connected with NVLink, with 94GB of memory per GPU.

Good fit for:

LLM inference
large model serving
high batch-size inference
dual-GPU deployments
customers who need more memory than standard H100 PCIe

It is not the same as an 8-GPU HGX H100 system. It is more of a focused two-GPU inference solution.

NVIDIA H100 Servers: PCIe, SXM5 & NVL GPU Models

These are the server models most commonly used with NVIDIA H100 GPUs, both from new inventory and the secondary market.

SXM5 / HGX H100 servers — for multi-GPU training and HPC:

NVIDIA DGX H100 — NVIDIA’s own 8-GPU H100 system with HGX H100, NVLink, NVSwitch, and high-speed networking.
Dell PowerEdge XE9680 — 6U enterprise AI server, commonly used with 8× H100 SXM5 GPUs.
Dell PowerEdge XE8640 — 4U platform for 4× H100 SXM GPUs, useful when 8 GPUs are not required.
Supermicro HGX H100 systems — common in GPU cloud, AI infrastructure, and research environments, usually in 4-GPU or 8-GPU HGX configurations.
Lenovo ThinkSystem SR675 V3 / SR680a V3 — Lenovo GPU platforms for AI and HPC workloads, depending on the exact configuration.
HPE Cray XD / Apollo-style GPU systems — high-density GPU platforms for enterprise AI, HPC, and research clusters.

PCIe H100 servers — for inference and flexible deployments:

Dell PowerEdge R760xa — 2U enterprise GPU server, often used with up to 4× double-wide PCIe GPUs.
Dell PowerEdge R760 / R7625 GPU configurations — standard rack servers for smaller PCIe GPU setups.
Supermicro PCIe GPU servers — common for GPU cloud, AI labs, and secondary market H100 PCIe systems.
Lenovo ThinkSystem SR675 V3 — flexible PCIe GPU platform for AI, HPC, and visualization workloads.
HPE ProLiant DL380 / DL385 GPU configurations — useful when the customer already uses HPE infrastructure.

H100 NVL servers — for LLM inference:

H100 NVL systems — PCIe-based dual-GPU configurations with NVLink between two GPUs. Best for LLM inference where high GPU memory and strong two-GPU bandwidth matter.

Always check the exact H100 configuration, including GPU form factor, PCIe/SXM5/NVL compatibility, power and cooling, risers, GPU enablement kit, firmware, network cards, rack power capacity, warranty, and testing.

Common Mistakes When Buying NVIDIA H100 PCIe, SXM5 & NVL GPUs

Buying SXM5 without HGX infrastructure

H100 SXM5 modules need a compatible HGX H100 platform. They cannot be installed in a normal PCIe server.

Assuming PCIe and SXM5 perform the same

For single-GPU workloads, the gap is smaller and comes mainly from the SXM5's higher CUDA core count, memory bandwidth, and power limit. For heavy multi-GPU training, SXM5 is much stronger because of NVLink and NVSwitch.

Ignoring power and cooling

H100 SXM5 systems need serious power and cooling planning, especially 4-GPU and 8-GPU HGX servers. Always check rack power, PDU capacity, airflow, cooling, power cables, and redundant PSUs.
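A quick budget check shows why this matters. The sketch below multiplies the 700W per-GPU figure by the GPU count and then asks how many such servers fit in a rack. The 4 kW allowance for CPUs, memory, NICs, fans, and PSU losses and the 20 kW rack budget are hypothetical placeholders; use the vendor's rated system power for real planning.

```python
def gpu_power_kw(gpu_count: int, watts_per_gpu: float) -> float:
    """Power drawn by the GPUs alone, in kW."""
    return gpu_count * watts_per_gpu / 1000.0


def servers_per_rack(rack_budget_kw: float, server_kw: float) -> int:
    """Whole servers that fit within the rack power budget."""
    return int(rack_budget_kw // server_kw)


gpus_kw = gpu_power_kw(8, 700)      # 8x H100 SXM5 at up to 700W each
other_kw = 4.0                      # hypothetical: CPUs, NICs, fans, PSU losses
server_kw = gpus_kw + other_kw
rack_budget_kw = 20.0               # hypothetical rack power budget

print(f"GPUs alone: {gpus_kw:.1f} kW, whole server: about {server_kw:.1f} kW")
print(f"Servers per rack: {servers_per_rack(rack_budget_kw, server_kw)}")
```

Under these assumptions the GPUs alone draw 5.6 kW per server, so a rack sized for conventional servers may hold only one or two HGX systems.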

Buying H100 PCIe when you really need HGX

H100 PCIe is excellent for many workloads, but if your model needs strong GPU-to-GPU communication across many GPUs, HGX H100 with SXM5 is usually the better architecture.

Buying HGX H100 when PCIe would be enough

Some workloads do not need SXM5. For inference, smaller fine-tuning, or flexible enterprise AI deployments, H100 PCIe can be easier, cheaper, and more practical.

Confusing H100 NVL with H100 SXM5

H100 NVL is mainly a dual-GPU LLM inference solution with 94GB memory per GPU. H100 SXM5 is for 4-GPU and 8-GPU HGX systems with stronger scale-up architecture.

Forgetting about networking

For serious AI clusters, the GPU is only one part of the system. You also need the right NICs, InfiniBand or Ethernet speed, switches, cables, topology, firmware, and tested ports. A weak network design can limit the value of expensive H100 servers.

Summary: NVIDIA H100 PCIe, SXM5 & NVL GPUs

	H100 PCIe 80GB	H100 SXM5 80GB	H100 NVL 94GB
Best for	Inference, fine-tuning, flexible servers	Training, HPC, large models	LLM inference
VRAM	80GB	80GB	94GB per GPU
Memory type	HBM2e	HBM3	HBM3
Memory BW	~2.0 TB/s	~3.35 TB/s	~3.9 TB/s
GPU-to-GPU BW	PCIe Gen5 / optional 2-GPU NVLink	900 GB/s NVLink / NVSwitch	600 GB/s between paired GPUs
Topology	PCIe fabric	HGX H100	Dual-GPU NVLink pair
Infrastructure	Standard GPU server	HGX H100 server	Special NVL server config
TDP	Up to 350W	Up to 700W	350–400W
MIG	Yes	Yes	Yes
Scale strategy	Scale-out	Scale-up	Dual-GPU inference

We have NVIDIA H100 GPUs and complete H100 servers available across PCIe, SXM5, and NVL configurations. Tell us what you are building and we will send you a recommended configuration with pricing.

FAQ: NVIDIA H100 PCIe, SXM5 & NVL GPUs

Can I install an H100 SXM5 in a standard server?

No. H100 SXM5 modules require a compatible HGX H100 system. They cannot be installed in a regular PCIe slot.

Is H100 PCIe much slower than H100 SXM5?

For single-GPU inference, the gap is smaller and comes mainly from the SXM5's higher CUDA core count, memory bandwidth, and power limit. For multi-GPU training, SXM5 can be much faster because of NVLink and NVSwitch bandwidth.

Does H100 PCIe support NVLink?

Yes. Two H100 PCIe GPUs can be connected with an NVLink bridge in compatible systems. This is not the same as HGX H100 SXM5 with NVSwitch, which connects all GPUs in the server.

What is the main difference between H100 PCIe and H100 SXM5?

H100 PCIe is for flexible standard servers. H100 SXM5 is for HGX systems where multiple GPUs need very fast communication.

What is H100 NVL?

H100 NVL is a special H100 version mainly designed for large language model inference. It offers 94GB memory per GPU and uses NVLink between two GPUs.

Which H100 is best for LLM training?

For serious LLM training, H100 SXM5 in an HGX H100 server is usually the best option because of NVLink, NVSwitch, and high memory bandwidth.

Which H100 is best for LLM inference?

For many inference workloads, H100 PCIe is enough. For larger LLM inference workloads, H100 NVL can be a very strong option because it has more memory per GPU.

How many H100 GPUs do I need?

It depends on model size, precision, batch size, and whether you are training or only serving inference. For small inference, one H100 may be enough. For large training, you may need 4, 8, or many more GPUs across a cluster.
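A first estimate can be made from the weights alone: parameters times bytes per parameter, divided by the usable memory per GPU. The sketch below does that arithmetic. The 70B parameter count and the 80% usable fraction (leaving room for KV cache, activations, and runtime overhead) are hypothetical, and the result covers inference only. Training needs considerably more memory for gradients and optimizer state.

```python
import math


def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB."""
    return params_billion * bytes_per_param


def min_gpus_for_weights(model_gb: float, vram_gb: float,
                         usable_fraction: float = 0.8) -> int:
    """Smallest GPU count whose usable memory holds the weights.

    usable_fraction is a hypothetical allowance for KV cache and overhead.
    """
    return math.ceil(model_gb / (vram_gb * usable_fraction))


model_gb = weights_gb(70, 2)  # hypothetical 70B model in FP16/BF16
print(f"Weights: {model_gb:.0f} GB")
print(f"80GB GPUs needed: {min_gpus_for_weights(model_gb, 80)}")
print(f"94GB GPUs needed: {min_gpus_for_weights(model_gb, 94)}")
```

With these assumptions the model needs three 80GB GPUs but fits on an H100 NVL pair, which illustrates why the extra memory per GPU matters for inference.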

Should I buy H100 PCIe, NVL or SXM5?

Choose PCIe if you want flexibility, standard servers, and good inference performance. Choose NVL if you need a two-GPU setup for LLM inference with more memory per GPU. Choose SXM5 if you need maximum multi-GPU performance for large training or HPC workloads.

Sources: NVIDIA H100 PCIe, SXM5 & NVL GPUs

NVIDIA H100 Tensor Core GPU official product page:

https://www.nvidia.com/en-us/data-center/h100/

NVIDIA H100 PCIe GPU product brief:

https://www.nvidia.com/content/dam/en-zz/Solutions/gtcs22/data-center/h100/PB-11133-001_v01.pdf

NVIDIA HGX H100 / HGX H200 official datasheet:

https://resources.nvidia.com/en-us-hopper-architecture/hpc-hgx-h100-hgx-h20

NVIDIA DGX H100 / H200 system documentation:

https://docs.nvidia.com/dgx/dgxh100-user-guide/

NVIDIA H100 NVL GPU product brief:

https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/h100/PB-11773-001_v01.pdf
