GPU Benchmarks That Actually Help You Pick a Graphics Card

The first thing you need to know about GPU names is that they were designed by marketing departments, not engineers. A GTX 1660 Ti sounds like it should be comfortably behind an RTX 2060, since 20 is a bigger series number than 16. In fact both are Turing-generation cards released in early 2019. In PassMark's benchmark, the RTX 2060 scores about 14,100 and the GTX 1660 Ti scores about 12,660. That's an 11% gap. Not the generation leap the naming implies.
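The gap is simple arithmetic on the two scores quoted above. A minimal sketch, where the helper name `percent_gap` is just an illustration and not part of any benchmark tool:

```python
def percent_gap(faster, slower):
    """How much higher `faster` is than `slower`, as a percentage of `slower`."""
    return (faster - slower) / slower * 100

# Approximate PassMark scores quoted in the text.
rtx_2060 = 14_100
gtx_1660_ti = 12_660

gap = percent_gap(rtx_2060, gtx_1660_ti)
print(f"RTX 2060 leads the GTX 1660 Ti by about {gap:.0f}%")
```

The same calculation works for any two cards once you have their scores, which is the whole argument for looking at numbers instead of names.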

Now imagine picking a laptop. You find one with an RTX 5090. You think: flagship GPU. Top tier. The RTX 5090 desktop hits around 575 watts and benchmarks as the fastest consumer card available. The RTX 5090 laptop version runs at a maximum of 150 watts and delivers roughly half the performance of its desktop counterpart. Same name. Half the performance. This is not a quirk. It is policy.

The only way to cut through this is to compare actual benchmark numbers, not names.

Why the naming schemes are designed to confuse

NVIDIA's lineup has several families on sale at once: the old GTX cards (no ray tracing), the RTX 30-series (Ampere), the RTX 40-series (Ada Lovelace), and the RTX 50-series (Blackwell) that the RTX 5090 belongs to. Within each generation, the last two digits indicate tier: 60 is mid-range, 70 is upper-mid, 80 is high-end, 90 is flagship. Simple enough. But then NVIDIA adds Ti and Super suffixes within a tier, creating RTX 4060, RTX 4060 Ti, RTX 4070, RTX 4070 Super, RTX 4070 Ti, and RTX 4070 Ti Super. Six cards across two adjacent tiers.

AMD's RX series has its own logic: the first digit is the generation (7 = RDNA 3), the second is the tier (6 = mid-range, 7 = upper-mid, 9 = high-end), and XT marks the faster variant of a model. Except AMD kept selling the RX 6000 series alongside the RX 7000 series, and a high-end RX 6800 XT often trades blows with or beats an upper-mid RX 7700 XT depending on the workload. The numbers 6800 and 7700 imply the 7700 is faster. That is not always true.
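To make the scheme concrete, here is a rough sketch that decodes an RX-style name using only the rules described above. The function name `decode_rx` and the tier table are illustrative assumptions; the entry for the digit 8 is inferred from the RX 6800 XT being called high-end, not from any official AMD table.

```python
import re

# Tier labels follow the description in the text; "8" is an inference.
TIERS = {"6": "mid-range", "7": "upper-mid", "8": "high-end", "9": "high-end"}

def decode_rx(name):
    """Split an 'RX 7700 XT'-style name into generation digit, tier label, XT flag."""
    match = re.fullmatch(r"RX (\d)(\d)\d\d( XT)?", name.strip())
    if match is None:
        raise ValueError(f"not an RX-style name: {name!r}")
    generation, tier, suffix = match.groups()
    return {
        "generation": int(generation),
        "tier": TIERS.get(tier, "unknown"),
        "xt": suffix is not None,
    }

older = decode_rx("RX 6800 XT")
newer = decode_rx("RX 7700 XT")
print(older)
print(newer)
```

Notice what the decoded fields cannot tell you: the 7700 XT is a newer generation and the 6800 XT is a higher tier, and nothing in the name says which of those wins. Only a benchmark settles that.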

Intel Arc adds a third system entirely. The A-series used three digits where the first indicates tier and the latter two indicate rank within that tier. Then Intel released the B-series with different naming logic. The Arc A770 with 16GB of VRAM is a legitimately good card for the money, but "Arc" as a brand communicates nothing about performance tier to someone who hasn't already looked it up.

Cross-vendor comparisons are even messier. The RX 7600 (AMD) and RTX 4060 (NVIDIA) compete at the same price point. The AMD card uses higher numbers but that doesn't mean it's faster. In many gaming benchmarks, they're within a few percent of each other. The number tells you nothing without knowing the vendor's tier system.

The mobile GPU problem

This is where the confusion becomes expensive. When a laptop lists an RTX 4060 or RTX 4070, those names describe mobile versions of those chips. NVIDIA dropped the "Max-Q" and "Max-P" suffixes it briefly used to differentiate power variants, so a laptop RTX 4060 could mean anything from 35 watts to 115 watts depending on the manufacturer's choice. Same model name. Performance that can vary by 40% or more between two laptops.

The RTX 5090 example is the extreme case, but the pattern holds all the way down the stack. A laptop RTX 4080 often performs similarly to a desktop RTX 4070. A laptop RTX 4060 can perform like a desktop RTX 3060 or worse when power limits are set aggressively by a thin-and-light chassis.

If you're comparing a desktop card to a laptop card, or comparing two laptops with the same GPU name, you need the benchmark scores. The name is not sufficient.

VRAM: the spec everyone misreads

More VRAM is better. Except when it isn't, and except when it's necessary in ways a benchmark score doesn't capture.

Here is the practical breakdown by workload:

Gaming. At 1080p, 8GB is still a functional minimum, though some titles in 2025-2026 are pushing against that limit at high settings. At 1440p, 12GB is the realistic floor for demanding titles without hitting texture streaming issues. At 4K with high-resolution texture packs, 16GB is where you want to be. The benchmark score matters more than VRAM for pure frame rate, but low VRAM creates stuttering and texture pop-in that no benchmark measures well.

Video editing. A 4K timeline with complex effects and color grading in DaVinci Resolve or Premiere Pro works well at 12GB. For RED raw footage or uncompressed 4K at 10-bit, 16GB is more comfortable. VRAM here is a buffer for the GPU to hold frames it needs to process without reading from slower system RAM. Running out causes project stalls, not just frame drops.

Local AI and machine learning. VRAM requirements scale with model size. Running a 7 billion parameter model for local inference needs roughly 14-15GB of VRAM in 16-bit mode, because each parameter takes two bytes. A 13-14B model needs around 26-28GB in the same mode. Quantizing to 4-bit cuts the weights to about a quarter of their 16-bit size, so a 7B model at 4-bit needs about 3.5GB for weights and fits in 6-8GB with room for context and runtime overhead. The point is that a card with a high PassMark score but 8GB of VRAM can't run models that a slower card with 24GB can run. For ML work specifically, VRAM is often the binding constraint, not compute performance.
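The model-size figures come from multiplying parameter count by bytes per parameter. A back-of-the-envelope sketch follows; the function names and the 1.5GB overhead allowance are assumptions for illustration, and real usage varies with context length and runtime.

```python
def weights_gb(params_billion, bits_per_param):
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8

def fits(params_billion, bits_per_param, vram_gb, overhead_gb=1.5):
    """Rough check: weights plus an assumed fixed overhead must fit in VRAM."""
    return weights_gb(params_billion, bits_per_param) + overhead_gb <= vram_gb

print(weights_gb(7, 16))   # 7B model, 16-bit weights
print(weights_gb(13, 16))  # 13B model, 16-bit weights
print(weights_gb(7, 4))    # 7B model, 4-bit quantized weights

print(fits(7, 16, vram_gb=8))   # fast card with 8GB
print(fits(7, 16, vram_gb=24))  # slower card with 24GB
print(fits(7, 4, vram_gb=8))    # quantized model on the 8GB card
```

The weights are only the floor. The working memory needed during inference sits on top of that, which is why the practical figures run a little higher than the raw multiplication.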

This is why a high-score card on a benchmark chart isn't automatically the right choice. A card that scores 15% higher but has 8GB instead of 16GB is the wrong pick for a machine learning workflow or a 4K editing setup.

What PassMark scores actually measure

PassMark's G3D Mark runs a mix of rendering tests covering geometric complexity, texture operations, lighting, shading, and compute operations. It produces a composite score that reflects overall GPU performance reasonably well for general-purpose workloads.

What it does not measure specifically: ray tracing performance, DLSS or FSR upscaling quality, video encode/decode throughput, AI tensor performance, or game-specific optimizations. A card with weak ray tracing hardware can score well in PassMark because ray tracing is a small part of the composite. A card with excellent CUDA performance for ML may score similarly to one that's better optimized for rasterization games.

Use the score as a reliable filter, not as the final answer. If you're picking between two cards for gaming and one scores 18,000 while the other scores 14,000, the first card is faster in most gaming scenarios. If you're picking for stable diffusion or video encoding and the specs show different VRAM or encode hardware, the score tells you less.

A practical decision process

Rather than a tier list, here's how I'd actually approach a GPU purchase:

Start with your primary use case and note the hard requirements first. If you're running local language models at 13B parameters with 8-bit weights, 16GB of VRAM is not optional, and that immediately rules out a large portion of the market. If you're editing 4K video professionally, 12GB is a realistic minimum. If you're gaming at 1080p, 8GB works today. These constraints come before looking at benchmark scores.

Then use benchmark scores to rank the cards that clear those hard requirements. At that point, the PassMark number is a useful proxy for "which of these is faster for compute." A 20% higher score generally means a card that is roughly 20% faster in practice, though the relationship loosens for very workload-specific tasks.

Finally, check whether you're comparing a desktop card to a desktop card. If a laptop is involved, find benchmark data for that specific machine configuration, not just the GPU model name.
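The three steps above can be sketched as a filter followed by a sort. Everything here is hypothetical: the `Card` type, the card names, and the scores are placeholders, not real benchmark data. Laptop entries are excluded by default on the assumption that you have not yet found scores for the specific machine.

```python
from dataclasses import dataclass

@dataclass
class Card:
    name: str
    score: int          # benchmark score (placeholder values below)
    vram_gb: int
    laptop: bool = False

def shortlist(cards, min_vram_gb, allow_laptop=False):
    """Drop cards that miss the hard requirements, then rank the rest by score."""
    eligible = [
        card for card in cards
        if card.vram_gb >= min_vram_gb and (allow_laptop or not card.laptop)
    ]
    return sorted(eligible, key=lambda card: card.score, reverse=True)

cards = [
    Card("Card A", 18_000, 8),
    Card("Card B", 15_500, 16),
    Card("Card C", 14_000, 24),
    Card("Card D (laptop)", 16_000, 16, laptop=True),
]

picks = shortlist(cards, min_vram_gb=16)
for card in picks:
    print(card.name, card.score, f"{card.vram_gb}GB")
```

The ordering of operations is the design choice that matters. Card A has the highest score in this made-up list, but it never reaches the ranking step because it fails the VRAM requirement first.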

The single most useful thing about a benchmark table is that it makes performance gaps visible in a way that product names never will. An older high-end card such as the RTX 3080 can still outscore a newer mid-range card such as the RTX 4060 Ti in PassMark, even though 40 sounds newer and faster than 30. A high-end card from two generations ago sometimes beats a current-gen mid-range card. The name tells you neither of those things.

My opinion, stated plainly: NVIDIA's decision to drop the Max-Q/Max-P laptop suffix was a mistake that saves the company some marketing awkwardness at the cost of consumers consistently overpaying for laptop GPUs that perform significantly below expectations. Until naming standardizes across desktop and mobile, benchmark scores are the only honest comparison.
