Is the RTX 50 Super refresh cancelled?

Not officially. As of May 2026, board-partner reporting from TweakTown, VideoCardz, and guru3d indicates NVIDIA has told its AIB partners the GeForce RTX 50 SUPER refresh is 'delayed indefinitely' — a backchannel notice, not a public announcement. Some roadmap leaks suggest outright cancellation, but NVIDIA has officially confirmed neither a delay nor a cancellation. The practical difference is small for buyers: a card with no announced date and no confirmed specs cannot be planned around.

Should I wait for the RTX 5070 Ti Super 24GB?

No. The rumored 24GB RTX 5070 Ti SUPER and RTX 5080 SUPER have no announced release date, and the GDDR7 memory shortage that caused the delay also means any eventual launch would likely arrive well above MSRP. The decision rule is simple: if the card you are waiting for has no date and prices are rising, waiting is a losing trade. A used RTX 4090 (24GB) or RTX 3090 (24GB) bought today gives you the same VRAM tier now, with resale value that caps your downside if the Super ever ships.

What is the best 24GB GPU for local AI now that the Super is delayed?

The best 24GB GPU for local AI in 2026 is a used RTX 3090 ($699 – $999) for value, or an RTX 4090 ($1,599 – $1,999) for speed. Both deliver the 24GB VRAM tier that the rumored RTX 5070 Ti SUPER / 5080 SUPER would have brought to the sub-$1,100 segment. 24GB is the threshold that unlocks 32B-class models such as Gemma 3 27B and CodeLlama 34B at Q4 quantization with room for context, a tier no 16GB card runs comfortably.

Why was the RTX 50 Super delayed?

The RTX 50 SUPER cards depended on dense 3GB GDDR7 memory modules to deliver their rumored ~50% VRAM increase. That exact memory is being absorbed by the AI datacenter boom, which has created a GDDR7 shortage. TechPowerUp reported the shortage could stall the SUPER rollout entirely. With no competitive pressure from AMD at the high end, NVIDIA has little incentive to fight for scarce memory to ship a modest consumer refresh.

Is a used RTX 3090 still worth buying in 2026?

Yes. The used RTX 3090 ($699 – $999) is the strongest VRAM-per-dollar pick at the 24GB tier for local AI in 2026. It delivers 24GB of GDDR6X, the same capacity as an RTX 4090, from a memory supply chain barely touched by the GDDR7 shortage. It handles 32B-class models comfortably, and community-reported figures (needs verification) put it at roughly 9 tokens per second on 70B-class models at Q4. It is the exact card the delayed Super refresh was meant to replace, and it is available today.

Economics · 14 min read

The RTX 50 Super Refresh Is Delayed Indefinitely — What Local AI Builders Should Buy Instead (2026)

NVIDIA has quietly told board partners the RTX 50 SUPER refresh — the 18GB/24GB cards local AI builders were waiting for — is delayed indefinitely over the GDDR7 shortage. Here's why waiting is now a losing trade, and the exact GPU to buy instead at every budget.

Compute Market Team

Published May 22, 2026

Our Top Pick

NVIDIA GeForce RTX 3090

$699 – $999

24GB GDDR6X · 10,496 CUDA cores · 936 GB/s


The verdict, up front: NVIDIA's GeForce RTX 50 SUPER refresh — the cards that would have brought 24GB of VRAM to the RTX 5070 Ti and 5080 — has been delayed indefinitely, and for local AI builders that changes the math entirely. Do not keep waiting. The right move now is to buy the correct card for your VRAM tier today: a used RTX 3090 ($699 – $999) or an RTX 4090 ($1,599 – $1,999) if you need 24GB, an RTX 5090 ($1,999 – $2,199) if you need 32GB, and a new 16GB card only if you were never going to touch 32B-class models anyway.

If you saw the CES-era leaks, decided to "wait for the 24GB Super," and have now been waiting five-plus months with nothing to show for it — this post is the verdict you came for. Here is exactly what happened, why it matters far more to AI builders than to gamers, and the specific SKU to buy at every budget.

Quick Answer: Should You Wait for the RTX 50 Super?

No. NVIDIA's RTX 50 SUPER refresh, which would have brought 24GB cards to the RTX 5070 Ti and 5080, has been delayed indefinitely due to the GDDR7 memory shortage. The best 24GB GPU for local AI in 2026 is therefore a used RTX 3090 ($699 – $999) or an RTX 4090 ($1,599 – $1,999) bought today, not a Super card that may never ship.

Three-line decision summary, keyed to budget:

Under $1,000 and you want 24GB: buy a used RTX 3090 now. The Super was never going to be cheaper.
$1,000–$2,000 and you want 24GB done right: buy an RTX 4090. It is the real winner of this delay.
You need 32GB+ for 70B-class models: buy an RTX 5090, or go Apple Silicon for raw model fit.

What the RTX 50 SUPER Refresh Was Supposed to Be

When the RTX 50 SUPER refresh leaked around CES 2026, it was the most-anticipated hardware on the local-AI calendar. The rumored lineup — and every spec below is leaked/rumored, never officially announced by NVIDIA — looked like this:

Rumored card	Rumored VRAM	Replaces	Memory tech
RTX 5070 SUPER	18GB (vs 12GB)	RTX 5070	Dense 3GB GDDR7 modules
RTX 5070 Ti SUPER	24GB (vs 16GB)	RTX 5070 Ti	Dense 3GB GDDR7 modules
RTX 5080 SUPER	24GB (vs 16GB)	RTX 5080	Dense 3GB GDDR7 modules

Source: leaked specifications aggregated by VideoCardz and wccftech, CES-2026 era. NVIDIA never officially confirmed these cards. Treat all figures as rumored.

The headline was a roughly 50% VRAM bump on the same Blackwell silicon, achieved by swapping standard 2GB GDDR7 modules for denser 3GB ones, paired with a modest ~5–10% raw performance gain from slightly higher core counts and clocks. For gamers, that is a minor mid-cycle refresh. For local AI, the 24GB tier on a sub-$1,100 card would have been the single most important value release of the year — which is exactly why its disappearance hurts.

What Actually Happened: "Delayed Indefinitely"

In mid-May 2026, multiple board-partner outlets reported the same thing within days of each other: NVIDIA had quietly informed its AIB (add-in-board) partners that the RTX 50 SUPER refresh is delayed indefinitely. There was no press release, no roadmap update, no public statement — just a backchannel notice to the companies that build the cards.

TweakTown reported the SUPER series is "delayed indefinitely" as NVIDIA informed its partners.
VideoCardz corroborated with its own board-partner sourcing, framing the release as "put on hold."
guru3d independently confirmed the AIB-channel notice that the refresh had slipped.

One important distinction for anyone parsing the headlines: "delayed indefinitely" is not the same as "cancelled." Some roadmap leaks point toward outright cancellation, but NVIDIA has officially confirmed neither. For a buyer, though, the practical effect is identical. As we put it in our companion analysis on whether to wait for the RTX 5090 Ti / Titan: a card with no announced date, no confirmed price, and no confirmed specs cannot be planned around. It is a phantom. You cannot run inference on a rumor.

The purchasing logic is blunt: an indefinite delay with no replacement date is, for buying purposes, a cancellation. The only people who should still be "waiting" are those who don't actually need a GPU yet.

Why It Was Delayed — the GDDR7 Shortage

The cause is not manufacturing trouble with the GPU dies. The Blackwell silicon is fine. The bottleneck is memory.

The entire point of the SUPER refresh was the VRAM bump, and that bump depended on dense 3GB GDDR7 modules. Those exact modules are in ferocious demand from the AI datacenter buildout — the same structural squeeze we documented in depth in our 2026 DRAM shortage analysis. When memory makers can sell every wafer of dense, high-margin memory into datacenter accelerators, allocating it to a mid-cycle consumer refresh becomes a low priority.

TechPowerUp reported directly that the GDDR7 shortage could stop the RTX 50-series SUPER rollout. Two factors make the delay rational from NVIDIA's side:

The memory is worth more elsewhere. Datacenter demand for dense memory is effectively unlimited at current AI-buildout pace. A consumer SUPER card is the least profitable home for a scarce 3GB GDDR7 module.
There is no competitive pressure. AMD is not contesting the high end of the consumer market in 2026. With nobody forcing NVIDIA's hand, there is no strategic reason to burn scarce memory to ship a refresh that mostly helps buyers, not margins.

The takeaway: this delay is not a temporary hiccup that resolves next quarter. It is downstream of a structural memory shortage with no clean end date — which is precisely why "just wait" is the wrong instinct.

Why This Matters More for Local AI Than for Gamers

Here is the point no gaming-focused outlet is making. Every TweakTown, VideoCardz, and wccftech writeup frames this as an FPS story. For a gamer, losing the SUPER refresh means losing roughly 5–10% of frame rate — annoying, forgettable, not decisive.

For a local AI builder, the delayed 24GB RTX 5070 Ti SUPER / 5080 SUPER would have been the value sweet spot of the entire year. The reason is the way model sizes map to VRAM. The jump from 16GB to 24GB is not incremental — it crosses a hard threshold:

16GB runs 13B–14B models comfortably at Q4 quantization, and squeezes a 27B model in only at tight Q4 with little room for context.
24GB unlocks the entire 32B-parameter class with breathing room: Gemma 3 27B (~16GB at Q4), CodeLlama 34B (~20GB at Q4), and Qwen 32B-class models all become genuinely usable.

So when the Super refresh disappears, gamers lose a single-digit FPS percentage. AI builders lose a whole model tier. The card that would have brought 32B-class local inference to a sub-$1,100 price point has no ship date, which is why your buying decision can't wait on the gaming-press timeline.
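The 16GB-versus-24GB threshold can be checked with back-of-envelope arithmetic: weight size is roughly parameter count times bits per weight, divided by eight. The sketch below is illustrative only. It assumes an effective rate of about 4.7 bits per weight for Q4-style quantization (an assumption chosen because it reproduces the ~16GB and ~20GB figures quoted above; real files vary by quantization scheme) and a 2GB reserve for context, which is likewise an assumed margin rather than a measured one.

```python
def q4_weight_gb(params_billion: float, bits_per_weight: float = 4.7) -> float:
    """Approximate size of quantized weights in GB (1 GB = 1e9 bytes).

    bits_per_weight=4.7 is an assumed effective rate for Q4-style formats,
    not a figure taken from any specific model file.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, vram_gb: float, reserve_gb: float = 2.0) -> bool:
    """True if the weights fit with reserve_gb left over for context (assumed margin)."""
    return q4_weight_gb(params_billion) + reserve_gb <= vram_gb


for name, size_b in [("14B", 14), ("27B", 27), ("34B", 34)]:
    print(f"{name}: ~{q4_weight_gb(size_b):.1f} GB at Q4, "
          f"16GB fits={fits(size_b, 16)}, 24GB fits={fits(size_b, 24)}")
```

Under these assumptions a 27B model technically squeezes into 16GB with no reserve, but not once any context margin is set aside, which matches the "tight Q4" caveat above.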

The VRAM Tiers You're Actually Choosing Between Now

With the phantom Super out of the picture, here are the real cards on the table and what each one runs. VRAM-per-model is the only spec that decides which models you can load at all — see our full VRAM guide for the complete breakdown.

VRAM tier	Cards	What it runs (Q4)	Price range
12GB	Intel Arc B580	7B–8B comfortably (Llama 3 8B)	$249 – $289
16GB	RTX 5060 Ti, RTX 4060 Ti, RTX 5080	13B–14B comfortably (Phi-4 14B); 27B only at tight Q4	$399 – $1,099
24GB	used RTX 3090, RTX 4090	32B-class at Q4; 70B at heavy quant	$699 – $1,999
32GB	RTX 5090	70B at Q4 with headroom (DeepSeek R1 70B)	$1,999 – $2,199
Unified memory	Mac Mini M4 Pro (24GB), Mac Studio M4 Max (up to 128GB)	Sidesteps the GDDR7 shortage entirely	$1,399 – $5,999

VRAM-per-model figures based on model weight sizes at Q4; actual usable headroom depends on context length and KV cache. Card prices are current MSRP/street ranges and move with the ongoing memory shortage.
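To make the KV-cache caveat concrete, here is a rough Python sketch of how cache size grows with context length. It uses the standard estimate of one key and one value vector per layer per token, stored at 16 bits. The model shape (32 layers, 8 KV heads, head dimension 128) is an assumed Llama 3 8B-style configuration used purely for illustration; other models will differ.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache size in GiB: one key and one value vector per layer per token."""
    per_token_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token_bytes * context_tokens / 2**30


# Assumed Llama 3 8B-style shape: 32 layers, 8 KV heads, head dimension 128.
for ctx in (8_192, 32_768):
    print(f"{ctx} tokens: {kv_cache_gib(32, 8, 128, ctx):.1f} GiB")
```

The cache scales linearly with context, so a card that loads the weights with only a gigabyte or two to spare will run out of room quickly on long prompts.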

Should You Wait? The Decision Framework

Three buyer profiles, one hard decision rule. The rule: if the card you're waiting for has no announced date and prices are rising, waiting is a losing trade. The GDDR7 shortage means GPU prices are drifting up, so every month you wait, the thing you eventually buy costs more — and the thing you're waiting for still doesn't exist.

Profile 1: You need 16GB or less

Buy now. The Super refresh wouldn't have helped you — the 18GB RTX 5070 SUPER was a marginal step, and you were never targeting 32B-class models. A new RTX 5060 Ti 16GB or RTX 4060 Ti 16GB covers the 13B–14B tier today. Compare them directly in our RTX 5060 Ti vs RTX 4060 Ti breakdown.

Profile 2: You want 24GB

Do not wait — this is the profile the delay actually hurts. The 24GB Super has no date, and GDDR7-era pricing means even an eventual launch would not be cheap. Buy a used RTX 3090 or an RTX 4090 today. You get the 24GB tier now, and — critically — a used 24GB card holds resale value, so your downside is capped (more on that below).

Profile 3: You want 32GB or more

Buy an RTX 5090 — the only new consumer card above 24GB — or step to Apple Silicon for raw big-model fit. The Super refresh topped out at a rumored 24GB; it was never going to serve the 70B-class buyer anyway.

Best Buys Right Now, by Budget

Concrete picks, honest caveats, current pricing. For the broader buyer's guide this funnels into, see our best consumer GPU for local LLMs and the AI GPU buying guide hub.

Under $500: Intel Arc B580 (or stretch to a used RTX 3090)

The Intel Arc B580 ($249 – $289) is the budget floor: 12GB of VRAM, a low 150W TDP, and a community-reported ~28 tok/s on Llama 3 8B Q4 (LM Studio Community, needs verification). The caveat is real — Intel's OpenVINO and IPEX stack is less mature than CUDA, so expect occasional friction. See our used RTX 3090 vs RTX 5060 Ti deep-dive for how the budget tiers stack up.

But the smarter sub-$1,000 buy — and the headline value pick of this entire delay — is a used RTX 3090 ($699 – $999). It is the exact card the Super refresh was meant to dethrone, and it is sitting on the shelf right now. 24GB of GDDR6X, ~48 tok/s on Llama 3 8B Q4 and ~9 tok/s on Llama 3 70B Q4 (LM Studio Community, needs verification), drawn from a memory supply chain the GDDR7 shortage barely touches. It runs the entire 32B-class tier today. Best for: maximum VRAM-per-dollar.

$500–$1,000: 16GB new, or 24GB used

If you genuinely never needed 24GB, the RTX 5060 Ti 16GB ($429 – $479) is the best new 16GB card under $500 — Blackwell tensor cores with FP4 support and ~42 tok/s on Llama 3 8B Q4 (LM Studio Community, needs verification). The RTX 4060 Ti 16GB ($399 – $449) is the cheaper Ada alternative. But if 24GB is the goal, a used RTX 3090 still wins this bracket outright — that is the trade-off our RTX 5080 vs RTX 3090 comparison lays out.

$1,000–$2,000: RTX 4090, the real winner of this delay

The RTX 4090 ($1,599 – $1,999) is the card the delayed Super was supposed to make redundant — and now it isn't. 24GB of GDDR6X, 16,384 CUDA cores, ~62 tok/s on Llama 3 8B Q4 and ~12 tok/s on Llama 3 70B Q4 (LM Studio Community, needs verification), full CUDA support, and a mountain of community documentation. It runs 32B-class models with room to spare and 70B-class models at heavy quant. The alternative new option here is the 16GB RTX 5080 ($999 – $1,099) — current-gen Blackwell, but the same 16GB ceiling a 5080 SUPER would have lifted. The RTX 5080 vs RTX 4090 comparison is the core decision: newer 16GB silicon, or the 24GB model tier. For local AI, 24GB wins.

$2,000+: RTX 5090 — the only new card above 24GB

If you want 70B-class models at Q4 with genuine headroom and refuse to buy used, the RTX 5090 ($1,999 – $2,199) is the answer: 32GB of GDDR7, ~95 tok/s on Llama 3 8B Q4 and ~18 tok/s on Llama 3 70B Q4 (LM Studio Community, needs verification). It is the only new consumer card above 24GB and the only one that comfortably runs Qwen 2.5 72B at Q4. The catch is a 575W TDP demanding a 1000W+ PSU. For buyers chasing 32GB specifically, our cheapest 32GB GPU guide compares the alternatives.
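The VRAM-per-dollar argument for the 24GB-and-up picks can be sketched in a few lines of Python. The VRAM sizes and price ranges are the ones quoted in this article; using the midpoint of each range is an assumption made for simplicity, and street prices will move.

```python
# (VRAM in GB, low price, high price), from the price ranges quoted in this article.
CARDS = {
    "used RTX 3090": (24, 699, 999),
    "RTX 4090": (24, 1599, 1999),
    "RTX 5090": (32, 1999, 2199),
}


def gb_per_1000_usd(vram_gb: float, low: float, high: float) -> float:
    """VRAM per $1,000 at the midpoint of the quoted range (midpoint is an assumption)."""
    midpoint = (low + high) / 2
    return vram_gb / midpoint * 1000


ranked = sorted(CARDS, key=lambda name: gb_per_1000_usd(*CARDS[name]), reverse=True)
for name in ranked:
    print(f"{name}: {gb_per_1000_usd(*CARDS[name]):.1f} GB per $1,000")
```

The comparison is deliberately limited to cards with at least 24GB, since that is the tier in question. Cheaper 12GB and 16GB cards can score higher on raw GB per dollar but cannot load the 32B class with room for context.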

The Apple Silicon Escape Hatch

Here is the option most GPU-focused coverage ignores entirely: Apple Silicon's unified memory is not made of GDDR7. It uses LPDDR5X from a separate supply chain, so Apple's memory capacity is untouched by the shortage that stalled the Super.

The Mac Mini M4 Pro ($1,399 – $1,599) gives you 24GB of unified memory in a silent, palm-sized box: the same effective VRAM tier as the rumored 5070 Ti SUPER, available today. Step up to the Mac Studio M4 Max ($1,999 – $5,999) and you can configure up to 128GB of unified memory, large-model capacity no consumer GPU, Super or otherwise, can touch. Both run local models cleanly via Ollama, MLX, and GGUF-based tooling.

The trade-offs are honest: lower raw tok/s than an equivalent NVIDIA card, and no CUDA, so some training and fine-tuning frameworks won't run. But for pure inference and big-model fit, it sidesteps the entire problem. The Mac Studio M4 Max vs RTX 5090 comparison covers this decision in full, and our local LLM guide has setup walkthroughs.

What If the Super Eventually Launches?

Plan for the realistic scenario, not the hopeful one. Even if NVIDIA revives the SUPER refresh in early 2027, it would launch into the GDDR7 shortage, not after it. That means inflated, memory-shortage-era pricing — not the clean MSRP the CES leaks implied. A "$799" 24GB card that ships at $1,100-plus street is not the deal you were waiting for.

This is where the used-24GB-as-a-hedge argument matters, and it is the genuinely useful financial insight pure-news sites won't give you. When you buy a used RTX 3090 today, you are buying an asset with a real resale market. If the Super somehow launches at a compelling price, you sell the 3090 for close to what you paid and upgrade. Your downside is capped. Meanwhile you had a working 24GB card the entire time instead of an empty PCIe slot.
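The capped-downside argument reduces to simple arithmetic: the net cost of the hedge is the purchase price minus whatever the card resells for. The Python sketch below is hypothetical throughout. The $849 purchase price is the midpoint of the range quoted in this article, and the 85% resale fraction is an illustrative assumption, not a market figure. Fees and shipping are ignored.

```python
def hedge_cost(purchase_price: float, resale_fraction: float) -> float:
    """Net cost of buying now and reselling later (ignores fees and shipping)."""
    return purchase_price * (1 - resale_fraction)


# Hypothetical inputs: $849 used RTX 3090, resold at an assumed 85% of purchase price.
print(f"Net cost of the hedge: ${hedge_cost(849, 0.85):.2f}")
```

Plug in your own purchase price and a resale fraction drawn from recent sold listings to see what months of working 24GB hardware would actually cost you.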

The principle to carry out of this: don't price a phantom card into a real budget. Buy the hardware that exists, keep the receipts, and treat any future Super launch as an optional upgrade — not a plan.

Bottom Line — the Decision Tree

The RTX 50 SUPER refresh is delayed indefinitely, the GDDR7 shortage that caused it has no clean end date, and prices are rising while you wait. Stop waiting. Here is the decision tree:

Want 24GB for under $1,000? Buy a used RTX 3090 ($699 – $999) today.
Want 24GB done right, $1,000–$2,000? Buy an RTX 4090 ($1,599 – $1,999) — the real winner of this delay.
Only ever needed 16GB? Buy a new RTX 5060 Ti 16GB ($429 – $479) — the Super wouldn't have changed your build.
Need 32GB+ for 70B-class models? Buy an RTX 5090 ($1,999 – $2,199).
Want shortage immunity and big-model fit? Go Apple Silicon — Mac Mini M4 Pro ($1,399 – $1,599) or Mac Studio M4 Max ($1,999 – $5,999).
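For readers who prefer the tree as logic, here is a simplified Python sketch. The function name `pick_gpu` is hypothetical, and the budget cutoff is one interpretation of the list above: the RTX 4090 is chosen only when the budget reaches its quoted $1,599 floor. The Apple Silicon branch is left out because it turns on preference rather than on VRAM or budget.

```python
def pick_gpu(vram_gb_needed: int, budget_usd: int) -> str:
    """Return the article's pick for a VRAM need and budget (simplified sketch)."""
    if vram_gb_needed > 24:
        return "RTX 5090"  # only new consumer card above 24GB
    if vram_gb_needed > 16:
        # 24GB tier: the RTX 4090 only if the budget reaches its quoted $1,599 floor.
        return "RTX 4090" if budget_usd >= 1599 else "used RTX 3090"
    return "RTX 5060 Ti 16GB"  # 16GB or less


for need, budget in [(16, 500), (24, 900), (24, 1800), (32, 2200)]:
    print(f"{need}GB, ${budget}: {pick_gpu(need, budget)}")
```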

For deeper context on where prices go from here, read our 2026 GPU pricing guide and GPU market trends analysis. If you're sizing system RAM alongside VRAM, our how much RAM for local AI guide is the companion piece, and the best local LLMs for the RTX 50 series covers what the current Blackwell lineup actually runs. Budget-focused builders should also check the AI on a budget hub.

Last updated: May 22, 2026. Delay reporting sourced from TweakTown, VideoCardz, guru3d, and TechPowerUp; rumored SUPER specifications from VideoCardz and wccftech leak aggregation and are not officially confirmed by NVIDIA. Performance figures are community-sourced and marked "needs verification." Prices reflect current MSRP/street ranges and move with the ongoing memory shortage.

Pair-buy essentials

Pairs with your NVIDIA GeForce RTX 3090

A 24GB card is wasted without clean power, fresh paste, and fast storage. Pair-buys that keep the rig stable.

Corsair RM850x ATX 3.1 (Native 12V-2x6)
$130 – $170
Native 12V-2x6 at 850W, 80+ Gold, fully modular. Skips the melted-adapter saga on RTX 40/50 builds.

Arctic MX-6 Thermal Paste (4g)
$8 – $14
Drops sustained-load temps 4–8°C vs. dried-out stock paste. Reapply on day one.

Samsung 990 Pro 2TB Gen4 NVMe
$160 – $210
7,450 MB/s reads cut 70B-class quant cold-loads to seconds. 2TB fits ~10 quantized models.

Arctic P14 PWM PST 140mm Fans (5-pack)
$40 – $55
High static pressure + PWM daisy-chain. A full tower's worth of airflow for ~$50.

CyberPower CP1500PFCLCD Pure-Sine UPS
$200 – $260
1500VA pure sine + AVR. Protects PSUs from the brownouts that corrupt model files mid-run.

Acer GPU Support Bracket (Magnetic Base)
$15 – $25
Stops a heavy 3-slot card from sagging into the PCIe pins. Magnetic base + non-slip foot, 30-second install.

Includes paid promotion from Acer via Amazon Creator Connections. We earn a commission on qualifying purchases at no cost to you.
