Essay No. 017  ·  AI Infrastructure  ·  Melbourne, Australia
AI Infrastructure · Apple Silicon · TSMC · SRAM · Moore’s Law · Semiconductors · AI Chips · Transistor Density · Packaging · Memory Bandwidth · ASML · Advanced Nodes

The Density Illusion
Original analysis · Not investment advice

Why Apple’s A14 exposed the end of easy transistor scaling, and why AI chips now depend on SRAM, memory bandwidth, packaging, power delivery, and architecture.
Pugalenthi Magendran
March 2026  ·  Melbourne, Australia
12 min read

A14 looked like a perfect Moore’s Law win: TSMC 5nm, 11.8 billion transistors, and a tiny die. The real lesson was the gap between theoretical density and usable density. In the AI era, progress no longer comes from shrinking logic alone. It comes from SRAM, memory bandwidth, packaging, power delivery, and architecture.

In 2020, Apple’s A14 looked like a clean Moore’s Law victory. It was built on TSMC’s 5nm process and packed 11.8 billion transistors into a die of roughly 88 mm², which works out to about 134 million transistors per square millimetre.1, 2 On paper, that sounds like pure progress.

The more interesting story was the gap. The A14 did not reach TSMC’s full theoretical N5 density. SemiAnalysis estimated that Apple captured only around 78% of theoretical N5 density on this chip, even though prior Apple SoCs had landed closer to the theoretical line. That was not because Apple forgot how to design chips. It was not because TSMC failed. It was because a real SoC is not made of one thing.1
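The arithmetic behind these figures is short enough to check by hand. A minimal sketch in Python, using only the numbers quoted above. The implied theoretical density is back-calculated from the 78% estimate rather than taken from a TSMC disclosure, so treat it as an approximation.

```python
# Figures quoted in the text for Apple's A14.
TRANSISTORS = 11.8e9        # transistor count
DIE_AREA_MM2 = 88.0         # approximate die area in mm^2
CAPTURED_FRACTION = 0.78    # estimated share of theoretical N5 density


def effective_density_mtr_per_mm2(transistors: float, area_mm2: float) -> float:
    """Whole-chip density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6


effective = effective_density_mtr_per_mm2(TRANSISTORS, DIE_AREA_MM2)

# Back-calculated, not an official figure: the density the node would need
# for `effective` to be 78% of it.
implied_theoretical = effective / CAPTURED_FRACTION

print(f"effective density:   {effective:.1f} MTr/mm^2")
print(f"implied theoretical: {implied_theoretical:.1f} MTr/mm^2")
```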

A process node can shrink logic transistors aggressively, but an actual SoC contains CPU cores, GPU cores, neural engines, SRAM caches, analog circuits, memory controllers, I/O blocks, display and media engines, security IP, and physical-design overhead. Those parts do not scale at the same rate.

That is the density illusion.

Key idea

The correct claim is not that Moore’s Law is dead. The correct claim is that Moore’s Law became a system problem. Transistors still improve. Nodes still advance. But the benefit no longer arrives automatically as cheaper, denser, smaller chips. It arrives as a negotiation across SRAM, memory bandwidth, packaging, power delivery, and architecture.


I. The 2020 thesis

In October 2020, Dylan Patel published a SemiAnalysis piece on Apple’s A14. He estimated 11.8 billion transistors, a die size around 88 mm², and a resulting effective density near 134 million transistors per square millimetre.1 Those numbers were not the point. The point was the delta between Apple’s effective density and TSMC’s headline N5 logic-density number, and the explanation: SRAM and analog do not shrink like pure logic, and a real SoC is dominated by exactly the blocks that scale worst.

I revisited that piece because it stopped looking like a one-chip footnote and started looking like the structural story of the next six years.

2020 thesis

A foundry can improve theoretical logic density, but a real SoC only captures part of that gain because SRAM, analog, I/O, and design overhead do not shrink evenly.


II. Marketing density vs product density

The industry talks about density as if the whole chip shrinks evenly. It does not. Marketing density is usually a representative logic figure that the foundry publishes for node-to-node comparison. Product density is what a real chip achieves once SRAM, analog, I/O, accelerators, and physical-design overhead are included.

Marketing density

An ideal logic shrink

  • Basis · representative logic cells.
  • Use · comparing nodes against each other.
  • Strength · useful for roadmap framing.
  • Limit · not what a product actually achieves.

Product density

What a real SoC delivers

  • Basis · logic + SRAM + analog + I/O + accelerators + overhead.
  • Use · tracking real chip generations.
  • Strength · honest about scaling friction.
  • Limit · chip-specific, harder to compare cleanly.
Diagram 01 · A real SoC is not one thing
Logic · SRAM · Analog · Accelerators · I/O · Overhead
Six rough block types, six different scaling rates. The headline density number describes one of them, not the chip.

III. SRAM is the hidden bottleneck

SRAM is the fast memory built directly into the chip. It shows up everywhere: registers, L1, L2, system-level cache, GPU cache, neural-engine buffers, branch predictors, interconnect buffers. Modern designs need more SRAM, not less, because larger caches and local scratchpads are how chips hide the latency of going off-die.

The problem is that SRAM does not scale like logic. The SemiAnalysis piece identified slow SRAM scaling as a primary reason A14 fell short of N5’s theoretical line, and the same dynamic shows up at later nodes.1 TSMC’s own 2 nm SRAM research reports 38.1 Mb/mm² in a 2 nm CMOS nanosheet technology with a 0.021 μm² high-density bitcell, and frames the result as roughly 1.1x denser than the prior node thanks to design-technology co-optimisation.3
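One way to read those two TSMC numbers together: a 0.021 μm² bitcell sets a ceiling on bits per unit area, and the reported macro density shows how much of that ceiling survives once the array periphery is included. A rough sketch, assuming that 1 Mb means 10^6 bits (with 2^20 bits the ratio comes out a few points higher) and that the 38.1 Mb/mm² figure is built from the same high-density bitcell:

```python
BITCELL_AREA_UM2 = 0.021      # high-density bitcell area quoted above
MACRO_DENSITY_MB_MM2 = 38.1   # reported SRAM macro density
UM2_PER_MM2 = 1_000_000       # 1 mm^2 = 10^6 um^2
BITS_PER_MB = 1_000_000       # assumption: decimal megabits

# Ceiling if the area were tiled with bitcells and nothing else.
bitcell_only_mb_mm2 = UM2_PER_MM2 / BITCELL_AREA_UM2 / BITS_PER_MB

# Share of that ceiling left after decoders, sense amplifiers, and wiring.
array_efficiency = MACRO_DENSITY_MB_MM2 / bitcell_only_mb_mm2

print(f"bitcell-only ceiling: {bitcell_only_mb_mm2:.1f} Mb/mm^2")
print(f"array efficiency:     {array_efficiency:.0%}")
```

Under these assumptions, roughly a fifth of the SRAM macro area is not bitcells, which is one more layer of gap between a headline cell size and usable density.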

That is real progress. It is also the kind of progress that no longer looks like a clean shrink. SRAM moves more slowly than logic. Analog and I/O move more slowly than SRAM. So the effective SoC density is a weighted average, dragged down by exactly the parts AI chips need more of.
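The weighted average here behaves more like Amdahl’s law than a simple mean: each block’s area shrinks by its own factor, and the blocks that barely shrink come to dominate the new die. A minimal sketch with made-up area shares and per-block density gains. None of these numbers come from TSMC or Apple; they only illustrate the shape of the effect.

```python
# Hypothetical die on the old node: share of area taken by each block type.
area_share = {"logic": 0.50, "sram": 0.30, "analog_io": 0.20}

# Hypothetical density gain each block type gets from the new node.
density_gain = {"logic": 1.8, "sram": 1.3, "analog_io": 1.1}


def effective_gain(shares: dict[str, float], gains: dict[str, float]) -> float:
    """Whole-chip density gain when each block shrinks at its own rate."""
    # Area of each block on the new node, relative to the old total die area.
    new_area = sum(shares[block] / gains[block] for block in shares)
    return 1.0 / new_area


gain = effective_gain(area_share, density_gain)
print(f"headline logic gain: {density_gain['logic']:.2f}x")
print(f"effective chip gain: {gain:.2f}x")
```

With these inputs the whole chip gains well under the headline logic figure, and moving area share from logic to SRAM lowers it further.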

Diagram 02 · The scaling mismatch
Logic shrink · strong
SRAM shrink · moderate
Analog / I/O · weak
Effective SoC · weighted
Illustrative directional bars, not exact ratios. Logic keeps moving. SRAM is dragging its feet. AI chips need more of both.1, 3



IV. Why AI makes the problem worse

AI workloads do not only need raw compute. They need data movement. On-device AI needs model weights resident in fast memory, activation buffers, cache, memory bandwidth, low latency, and low energy per generated token. Cloud AI multiplies the same demands: HBM, DRAM, interconnect, advanced packaging, power delivery, and data-center-scale infrastructure to keep accelerators fed.

ASML’s 2025 strategic report makes this explicit. AI requires leading-edge, high-performance processor chips, and a significant increase in DRAM chips compared with traditional compute architectures. Advanced Logic and AI-related DRAM are the two structural drivers of lithography demand.8

The trouble is that DRAM and on-die SRAM are exactly the places where scaling is harder. AI is pulling on the part of the stack that does not shrink cleanly.

The A14 density gap was an early warning for the AI era.


V. TSMC is still scaling, but scaling changed shape

TSMC is not failing. It is the leader. What changed is the texture of scaling. Node progress is now multi-dimensional. Nanosheets, backside power, design-technology co-optimisation, advanced packaging, and 3D stacking each contribute a slice of the gain, where a node-to-node shrink used to do most of the work alone.

The 2025-2026 roadmap looks like this on the public record. N2 entered volume production in 4Q25 with first-generation nanosheet transistors. N2P is scheduled for second-half 2026 volume production. A16 integrates nanosheet transistors with Super Power Rail backside power delivery. A14 here means TSMC’s process node, not Apple’s 2020 chip; it is scheduled for 2028. A13, debuted at TSMC’s 2026 symposium as a direct shrink of A14, provides a 6% area saving over A14 and is scheduled for production in 2029, one year after A14.4, 5, 9

Table · TSMC node roadmap, 2025–2029

| Node | What it is | Production |
|------|------------|------------|
| N2 | First-generation nanosheet transistor process. The big node transition. | 4Q 2025 |
| N2P | Enhanced N2 with performance and power improvements. | H2 2026 |
| A16 | Nanosheet transistors plus Super Power Rail backside power delivery. | 2026 / 2027 ramp |
| A14 | Next-generation logic node beyond A16, AI/HPC focused. | 2028 |
| A13 | Direct shrink of A14 with backward-compatible design rules and ~6% area savings. | 2029 |

Scaling is no longer one big shrink. It is many small wins layered together. Sources: TSMC technology pages and 2026 North America Technology Symposium briefings.4, 5

The old node shrink was like pressing the whole chip smaller. The new node shrink is like negotiating with physics, one block at a time.


VI. Apple’s answer is architecture, memory, and on-device AI

Apple’s 2026 silicon argument is no longer mostly about transistor counts. It is about what the chip can do once the easy shrink stops paying.

M5 is built on third-generation 3 nm technology. Apple says it adds a next-generation 10-core GPU architecture with a Neural Accelerator in each GPU core, a 16-core Neural Engine, and unified memory bandwidth of 153 GB/s, with over 4x the peak GPU compute performance for AI compared with M4. Apple frames the unified memory architecture as the reason devices can run larger AI models entirely on-device.6 M4 itself, on second-generation 3 nm, was already a 28 billion-transistor chip with a 38 TOPS Neural Engine.7
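A back-of-envelope way to see why bandwidth sits in the headline: during single-stream, token-by-token generation, a dense model typically has to read all of its weights from memory for every token, so memory bandwidth divided by model size gives a rough upper bound on tokens per second. The sketch below uses Apple’s 153 GB/s figure. The model size and quantisation levels are hypothetical, and real throughput will be lower once the KV cache, activations, and compute limits are counted.

```python
M5_BANDWIDTH_GB_S = 153.0   # unified memory bandwidth quoted by Apple


def decode_tokens_per_second_bound(
    params_billion: float, bits_per_weight: float, bandwidth_gb_s: float
) -> float:
    """Upper bound on tokens/s if every weight is read once per token."""
    weight_gb = params_billion * bits_per_weight / 8  # GB, with 1 GB = 1e9 bytes
    return bandwidth_gb_s / weight_gb


# Hypothetical dense 8-billion-parameter model at two weight precisions.
for bits in (16, 4):
    bound = decode_tokens_per_second_bound(8.0, bits, M5_BANDWIDTH_GB_S)
    print(f"{bits:>2}-bit weights: at most ~{bound:.0f} tokens/s")
```

In this simple model, the bound moves only with bandwidth or with bytes per weight; extra compute alone does not raise it.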

The marketing structure of those announcements is the giveaway. The headline is not “we shrank transistors.” The headline is “we improved GPU AI acceleration, neural acceleration, memory bandwidth, performance per watt, and the ability to run real models on the device.”

Timeline · A14 to M5, where the story changed
2020
A14 on TSMC N5. 11.8B transistors, ~88 mm², ~134M transistors/mm², well below the theoretical N5 density line.1, 2
2022–23
Apple’s M2 and A16 Bionic family (the Apple chips, not TSMC’s A16 node). Performance-per-watt and accelerator gains start to lead marketing language. Density growth slows relative to N5.
2024
M4 on second-gen 3 nm. 28 billion transistors, 38 TOPS Neural Engine, explicit AI framing.7
2025
M5 on third-gen 3 nm. Neural Accelerator in each GPU core, 16-core NE, 153 GB/s unified memory, >4x M4 GPU AI compute. The pitch is system, not transistor count.6
2026 +
N2 era opens with first-generation nanosheets, with Super Power Rail backside power following at TSMC’s A16 node. Apple keeps converting limited density gains into AI throughput, memory, and on-device model size.4
Apple’s marketing story shifted from “more transistors” to “more useful compute per watt, more memory bandwidth, more models on-device.”

The best chip is no longer simply the densest chip. The best chip is the one that converts limited density gains into useful system performance.


VII. The new scaling stack

If A14 told us that one number cannot describe a chip, the AI era tells us that the scaling story spans the full stack. Each layer contributes a slice of the gain, and the layer mix is what wins generations.

Diagram 03 · The new scaling stack
10 · Application / AI workload · Demand
09 · Software frameworks & compilers · Lever
08 · Architecture & workload-specific accelerators · Design
07 · Compute cores (CPU, GPU, NPU) · Logic
06 · SRAM & cache hierarchy · Memory
05 · Memory bandwidth & unified memory · Bandwidth
04 · Packaging, chiplets, 3D stacking · Integration
03 · Backside power & interconnect · Routing
02 · Transistor process (nanosheet, future CFET) · Device
01 · Lithography & equipment (EUV, High-NA) · Print
Moore’s Law became a full-stack problem. No single layer dominates the curve anymore.

VIII. Why cost scaling also matters

Density is not the only thing that broke down. Even where transistors continue to shrink, the cost per useful transistor has not fallen the way it used to. The reasons are mundane and additive. Advanced wafers cost more. EUV tools are expensive. Process complexity rises. Yield learning takes time. Design and verification costs rise. Packaging costs rise. HBM and advanced memory costs rise. Every node demands more DTCO engineering than the last.

The SemiAnalysis piece argued that the decline in cost per transistor was already slowing around N5.1 Six years later, that observation reads as conservative. The economic magic of Moore’s Law was not only more transistors. It was cheaper useful transistors. That part is harder now.

If density only buys you a fraction of the historical shrink, and cost stops falling at the same rate, the only way forward is to extract more value per transistor.
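That trade-off reduces to one ratio: the relative cost per transistor is the wafer-cost increase divided by the density gain a product actually captures. A sketch with hypothetical inputs, assuming the same wafer size and the same yield on both nodes:

```python
def relative_cost_per_transistor(wafer_cost_ratio: float, density_gain: float) -> float:
    """New-node cost per transistor relative to the old node (1.0 = unchanged).

    Assumes the same wafer size and the same yield on both nodes.
    """
    return wafer_cost_ratio / density_gain


# Hypothetical: a wafer on the new node costs 1.5x as much as on the old one.
WAFER_COST_RATIO = 1.5

# Hypothetical headline logic gain versus the gain a real chip captures.
headline = relative_cost_per_transistor(WAFER_COST_RATIO, density_gain=1.8)
captured = relative_cost_per_transistor(WAFER_COST_RATIO, density_gain=1.4)

print(f"at headline density gain: {headline:.2f}x cost per transistor")
print(f"at captured density gain: {captured:.2f}x cost per transistor")
```

With these inputs, the headline gain still lowers cost per transistor, while the captured gain makes each transistor slightly more expensive than before.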




IX. What could break the thesis

A serious piece needs counterarguments. The case that Moore’s Law has become a system problem has plausible challenges.

Risks & counterarguments
  1. SRAM keeps improving. DTCO and new bitcell designs continue to lift on-die memory density, just at a slower pace than logic.3
  2. 3D stacking restores effective density. Stacking logic, memory, or both vertically can recover area that 2D shrink no longer provides.
  3. Backside power frees routing. Super Power Rail and similar approaches make denser logic practical even when 2D shrink slows.4
  4. Chiplets improve product economics. Yield improves when dies are smaller, and integration costs may fall as packaging matures.
  5. Specialised accelerators trade density for performance. Domain-specific designs can outpace general-purpose density gains.
  6. Future device architectures. CFETs and stacked devices may extend the device-physics curve beyond nanosheets.
  7. Software optimisation. Better compilers, quantisation, and inference scheduling can reduce hardware pressure per workload.
  8. AI model efficiency. Sparsity, mixture-of-experts, and smaller specialised models can soften memory and density demand.
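Counterargument 4 can be made concrete with the simple Poisson yield model, Y = exp(-A·D0), where A is die area and D0 is defect density. The sketch below compares one large die against four chiplets with the same total area. The defect density and die sizes are hypothetical, and the comparison assumes chiplets are tested before assembly while ignoring packaging cost and assembly yield.

```python
import math


def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a Poisson defect model."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_product(die_mm2: float, dies_per_product: int, d0: float) -> float:
    """Wafer area (mm^2) consumed per working product, scrapping bad dies."""
    return dies_per_product * die_mm2 / poisson_yield(die_mm2, d0)


D0 = 0.1  # hypothetical defect density, defects per cm^2

monolithic = silicon_per_good_product(600.0, 1, D0)
chiplets = silicon_per_good_product(150.0, 4, D0)

print(f"monolithic 600 mm^2: {monolithic:.0f} mm^2 of silicon per good product")
print(f"4 x 150 mm^2:        {chiplets:.0f} mm^2 of silicon per good product")
```

The gap narrows as defect density falls and widens as dies grow, which is why the chiplet case is strongest for the largest AI accelerators.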

Scaling is not dead. The easy interpretation of scaling is dead. The mistake is not believing in progress. The mistake is believing progress still comes from one number.


X. The density illusion

A14 was not a failure. It was a warning. The node kept scaling. The chip did not scale like the marketing number. In 2026, that warning has become the semiconductor story.

Logic still shrinks. Transistors still improve. TSMC still leads. Apple still builds world-class chips. But real progress now comes from how well a company turns partial density gains into full system performance. SRAM matters. Memory bandwidth matters. Packaging matters. Power delivery matters. Architecture matters. Software matters.

Moore’s Law is no longer one curve. It is a system problem.

That is the density illusion. And in the AI era, it is the constraint that shapes everything above it.


1 Patel, D. (Oct 2020). Apple’s A14 Packs 134 Million Transistors/mm², but Falls Short of TSMC’s Density Claims. SemiAnalysis. Historical anchor for the A14 effective-density gap and the SRAM-scaling explanation. Used as inspiration only. No content, structure, or charts reproduced.

2 Apple (Sep 2020). Apple introduces the all-new iPad Air. Official A14 context including the 5 nm chip, 11.8 billion transistor count, and Neural Engine framing.

3 TSMC Research. 2 nm CMOS Nanosheet SRAM. 38.1 Mb/mm² SRAM, 0.021 μm² high-density bitcell, ~1.1x SRAM density improvement versus the previous node via DTCO.

4 TSMC. A16 technology page. Nanosheet transistors with Super Power Rail backside power delivery. N2 volume production in 4Q25 and N2P scheduled for the second half of 2026 per TSMC technology disclosures.

5 TSMC (2026). North America Technology Symposium 2026 briefing. A13 debuted as a direct shrink of A14 with ~6% area savings, backward-compatible design rules, and production scheduled in 2029.

6 Apple (Oct 2025). Apple unleashes M5. Third-generation 3 nm process, 10-core GPU with a Neural Accelerator in each core, 16-core Neural Engine, 153 GB/s unified memory bandwidth, and over 4x the peak GPU compute for AI versus M4.

7 Apple (May 2024). Apple introduces M4. Second-generation 3 nm, 28 billion transistors, 38 TOPS Neural Engine, performance-per-watt framing.

8 ASML (2025). 2025 Annual Report, strategic report section. AI requires leading-edge, high-performance processor chips and a significant increase in DRAM chips compared with traditional compute architectures, with advanced Logic and AI-related DRAM as the structural drivers of lithography demand.

9 TSMC. 2025 Annual Report. Advanced process roadmap and AI/HPC packaging context including 3DFabric and CoWoS positioning. Used as confirmatory context for the node roadmap.

*   *   *

This is Essay No. 017. The topics: intelligence, AI, systems, knowledge, and the questions underneath the questions everyone else is asking. If you read this far and disagreed with any part of it, write to me. I read everything.

Pugalenthi Magendran