Essay No. 052 · Apple Silicon & Semiconductor Strategy

The A15 Die Shot That Predicted Apple Silicon's Future

Die-shot analysis · Not investment advice

In 2021, the A15 looked weak if judged only by CPU gains. But the die shot showed the real strategy: Apple was moving silicon budget into cache, GPU, neural compute, media, ISP, and system-level efficiency.

Pugalenthi Magendran

Published May 27, 2026

Thesis

Apple stopped spending silicon like a CPU company and started spending it like a system company. The A15 die shot showed this years before Apple's marketing did.

The most useful thing about a die shot is that it shows what a company really values.

Marketing tells the story the company wants to tell. Benchmarks show what a chip does in a selected set of workloads. But die area is harder to spin. It shows where the silicon budget actually went, block by block.

In 2021, the easy story about the A15 was that its headline CPU gains looked weak. The deeper story was that Apple was reallocating silicon budget away from a pure CPU-hero narrative and toward the system around the CPU. Bigger system cache. Larger GPU area. Revised neural compute. New ISP and media blocks. Tighter hardware and software integration^[1].

A15 looked like the year Apple's CPU gains slowed. The die shot showed something more important: Apple was reallocating silicon toward the whole system.

Section 01 What the A15 die shot showed

The 2021 SemiAnalysis die-shot analysis is the historical base for this essay^[1]. It documented a set of changes that, together, told a different story than the headline CPU benchmarks.

A15 die area increased to about 107.7 mm² from about 87.76 mm² on A14. Transistor count increased from roughly 11.8 billion to about 15 billion. But the analysis described process density as effectively unchanged, with SRAM cells described as the same size and LPDDR4x PHYs described as identical in size to A14^[1].

That combination suggested Apple was likely using a refined version of the same process generation rather than a major new density step, consistent with N5P rather than a new node. In other words, Apple was not getting a free area win from process scaling. The extra silicon was a real choice, not a side-effect of shrinking transistors.
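The "no free shrink" reading can be cross-checked with simple arithmetic on the quoted figures. The sketch below is a rough check, not part of the original analysis: it uses the rounded die areas and transistor counts cited above, and it computes average density across the whole die, which also depends on the mix of logic, SRAM, and analog blocks rather than on the process alone.

```python
# Figures quoted from the 2021 die-shot analysis (approximate, rounded values).
A14 = {"area_mm2": 87.76, "transistors": 11.8e9}
A15 = {"area_mm2": 107.7, "transistors": 15.0e9}


def density_mtr_per_mm2(chip: dict) -> float:
    """Average transistor density in millions of transistors per mm^2."""
    return chip["transistors"] / chip["area_mm2"] / 1e6


def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


area_growth = pct_change(A14["area_mm2"], A15["area_mm2"])
transistor_growth = pct_change(A14["transistors"], A15["transistors"])
density_growth = pct_change(density_mtr_per_mm2(A14), density_mtr_per_mm2(A15))

print(f"Die area:    {area_growth:+.1f}%")
print(f"Transistors: {transistor_growth:+.1f}%")
print(f"Density:     {density_growth:+.1f}%")
```

On these inputs, die area grows by about 22.7% and transistor count by about 27.1%, so average density moves by only about 3.6%. Given the rounding in the transistor counts, that is small enough to fit the "effectively unchanged" description: most of the added transistors came from added area.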

Source notes — A15 die-shot claims (from the 2021 SemiAnalysis analysis)

A15 die area around 107.7 mm², up from about 87.76 mm² on A14.
Transistor count up from about 11.8B to 15B.
Process density appears unchanged, consistent with N5P rather than a new node.
SRAM cells described as unchanged in size.
LPDDR4x PHYs described as identical in area.
Performance core plus L1 area increased by about 15.26%.
Shared L2 cache grew from 8 MB to 12 MB.
System cache doubled from 16 MB to 32 MB.
GPU block grew by about 30%.
NPU stayed at 16 cores but was revised architecturally; ISP also revised.

The interesting part is not only that the die got bigger. The interesting part is where the extra area went. Once you map the change, it becomes a budget map of Apple's priorities for the next several years.

Die budget map — where A15 spent its extra area

CPU complex

Performance core plus L1 area up about 15%. Modest IPC gain, but the cache and MMU around the cores were reshaped.

Shared L2

Grew from 8 MB to 12 MB. Matched M1's shared L2 despite A15 having fewer big cores.

System cache

Doubled from 16 MB to 32 MB. The single largest area contributor in the redesign.

GPU

Block grew about 30%, mostly through the added fifth GPU core and shared logic. FP32 ALUs per core were also doubled.

Neural Engine

Still 16 cores, but architecturally revised for higher per-core throughput.

ISP & media

Heavily revised. Computational photography and video pipelines became larger silicon citizens.

Section 02 The CPU was not ignored, but it was not the whole story

A15's performance core did change. The MMU and cache layout were reshaped, and the L1 behavior shifted in measurable ways, including faster same-page access patterns observed by independent reviewers at the time. The SemiAnalysis breakdown described the performance core plus L1 area as up by roughly 15.26%, with shared L2 increasing from 8 MB to 12 MB, matching the shared L2 of the M1 despite A15 having fewer big cores^[1].

But the headline IPC gain was described as the smallest in recent Apple history at the time. The reading was not that Apple had run out of ideas. It was that Apple was no longer trying to win every generation on a single CPU number.

A15 told the more honest version of what was happening across the high-end SoC industry. Single-thread IPC gains were getting harder to extract from process and microarchitecture alone, and the cost of every extra percent of IPC was rising. So Apple kept improving the CPU, but stopped letting the CPU stand alone as the marketing exhibit.

Section 03 Cache became a strategy

The single biggest area contributor in A15's redesign was the system cache, which doubled from 16 MB to 32 MB^[1]. That is the kind of change that almost never shows up in marketing slides, because it does not map cleanly to a benchmark headline. But it is one of the most important design choices in the chip.

System cache sits between the on-die compute blocks and external DRAM, and it is shared across CPU, GPU, Neural Engine, ISP, media engines, display pipeline, and the rest of the SoC. Every time a piece of compute can satisfy a memory request from system cache instead of DRAM, two things happen at once: latency drops, and energy per access drops by roughly an order of magnitude.

Cache is not glamorous, but it is one of the most important ways to turn silicon area into system efficiency.

The 2021 analysis drew a loose parallel between A15's bigger system cache and AMD's Infinity Cache, because both designs improved performance without a corresponding increase in external memory bandwidth^[1]. The mechanism is the same. When the pin count and DRAM cost cannot scale, you bring memory on-die. You spend area to buy bandwidth and save power.

For a phone, that is more than a performance trick. It is a battery and thermal trick. More cache means fewer round trips to DRAM. Fewer round trips mean lower energy per inference, per frame, per photo. That cascades into longer battery life and higher sustained performance, both of which Apple's later marketing would lean on heavily.
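The link between hit rate and energy can be sketched with a simple expected-cost model. Everything numeric here is an assumption for illustration: the energy units are relative, the 10x DRAM cost just mirrors the rough order-of-magnitude figure used in this essay, and the hit rates are hypothetical, not measured A14 or A15 values.

```python
def avg_energy_per_access(hit_rate: float, e_cache: float, e_dram: float) -> float:
    """Expected energy per memory request.

    Every request pays the cache lookup; misses additionally pay the DRAM access.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return e_cache + (1.0 - hit_rate) * e_dram


# ASSUMPTION: illustrative relative units, not measured silicon values.
E_CACHE = 1.0   # on-die system cache access
E_DRAM = 10.0   # off-chip DRAM access, taken as roughly 10x the cache cost

# ASSUMPTION: hypothetical hit rates for a smaller and a larger system cache.
smaller = avg_energy_per_access(0.50, E_CACHE, E_DRAM)
larger = avg_energy_per_access(0.70, E_CACHE, E_DRAM)
saving_pct = (1.0 - larger / smaller) * 100.0

print(f"smaller cache: {smaller:.1f} units/access")
print(f"larger cache:  {larger:.1f} units/access")
print(f"energy saved:  {saving_pct:.0f}%")
```

In this toy setup, lifting the hit rate from 50% to 70% cuts average energy per request by about a third. The real figure depends on the workload's working set, which is why a bigger cache helps some workloads far more than others.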

Section 04 The GPU was the louder signal

The most visible change in the A15 die shot was the GPU. The GPU block grew by roughly 30%, mainly through the added fifth GPU core on the Pro variant and shared logic, with the individual GPU core itself only modestly larger^[1]. Inside each core, the architectural changes were more interesting than the area numbers suggested. The 2021 analysis flagged doubled FP32 ALUs, memory-saving renderable textures, sparse depth and stencil support, and new SIMD instructions among the changes Apple highlighted in developer materials.

That is a graphics architecture trying to push more arithmetic per memory access. Combined with the doubled system cache, it forms a single design philosophy: extract more compute per byte of DRAM traffic.
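A roofline-style estimate shows why more ALUs and more cache belong to the same philosophy. This is a minimal sketch under stated assumptions: the peak rates, DRAM bandwidth, kernel intensity, and hit rate are all hypothetical round numbers, and the model assumes the cache itself is never the bottleneck.

```python
def flops_per_dram_byte(flops_per_requested_byte: float, cache_hit_rate: float) -> float:
    """Arithmetic intensity measured against DRAM traffic only.

    Requests served by the on-die cache never reach DRAM, so a higher hit
    rate raises the work done per byte that actually crosses the pins.
    """
    if not 0.0 <= cache_hit_rate < 1.0:
        raise ValueError("cache_hit_rate must be in [0, 1)")
    return flops_per_requested_byte / (1.0 - cache_hit_rate)


def attainable_gflops(peak_gflops: float, dram_gb_per_s: float, intensity: float) -> float:
    """Roofline bound: the lower of the ALU peak and what DRAM can feed."""
    return min(peak_gflops, dram_gb_per_s * intensity)


# ASSUMPTION: hypothetical values chosen for round arithmetic, not Apple specs.
DRAM_GB_PER_S = 50.0
KERNEL_FLOPS_PER_BYTE = 15.0

# name -> (peak GFLOP/s, system cache hit rate)
scenarios = {
    "baseline": (1000.0, 0.0),
    "doubled ALUs only": (2000.0, 0.0),
    "bigger cache only": (1000.0, 0.5),
    "both": (2000.0, 0.5),
}

results = {}
for name, (peak, hit_rate) in scenarios.items():
    intensity = flops_per_dram_byte(KERNEL_FLOPS_PER_BYTE, hit_rate)
    results[name] = attainable_gflops(peak, DRAM_GB_PER_S, intensity)
    print(f"{name:>18}: {results[name]:.0f} GFLOP/s")
```

Under these made-up numbers, doubling the ALUs alone changes nothing because the kernel is memory-bound. The bigger cache alone lifts it to the compute ceiling, and the two together double the attainable rate.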

This is the part of the A15 story that aged the best. Later Apple Silicon generations kept investing in the GPU and made it more central to Apple's performance messaging. By 2025, the GPU was not just a graphics engine. It was also where Apple placed its new Neural Accelerators.

Section 05 NPU, ISP, media, and the rise of system silicon

A15's Neural Engine kept its 16-core configuration but was described as architecturally revised, with a sizable per-core performance increase. The ISP was described as highly revised^[1]. Neither block produces a clean benchmark number that consumers compare across phones, which is precisely why this part of the die shot is interesting.

By 2021, a modern phone workload was already a coordinated machine. Computational photography. Video processing. Real-time camera pipelines. Display processing. Gaming. On-device machine learning. Voice and image features. Security and sensor fusion. Each of those workloads touches several blocks on the SoC at once, and none of them is purely a CPU benchmark.

The SoC was no longer just a CPU with extras around it. It was becoming a coordinated machine.

The A15 die shot showed Apple investing more area into the blocks that handled those workloads rather than into squeezing one more percentage point of single-thread IPC. The strategic implication was clear in hindsight. Apple was preparing for a world where the chip won by being a good system, not by being a great CPU.

Section 06 A15 to A19, the roadmap confirms the die shot

The clearest evidence that A15 was a strategy shift is what followed it. Every generation from A17 Pro through A19 Pro and into M5 doubled down on the same priorities the A15 die shot revealed.

2021 · A15

The die shot reveals the system-silicon shift

Bigger die at unchanged density. System cache doubled. GPU block up about 30%. NPU and ISP revised. Headline CPU IPC gain is modest. The silicon budget moved into the system around the CPU^[1].

2023 · A17 Pro

GPU and Neural Engine become the headline

First 3 nm iPhone chip. Apple claimed up to 10% faster CPU but framed A17 Pro as the biggest GPU redesign in Apple history, up to 20% faster, with hardware-accelerated ray tracing and Neural Engine up to 2x faster^[2].

2024 · A18

Apple Intelligence reframes the chip

Apple claimed A18 CPU up to 30% faster than A16, and GPU up to 40% faster and 35% more efficient than A16. Neural Engine optimized for generative AI workloads. The marketing centered on on-device AI, not raw CPU^[3].

2025 · A19 and A19 Pro

The asymmetry confirms the shift

Apple's A19 disclosures compared CPU as 1.5x faster than A15 and GPU as more than 2x faster than A15^[4]. A19 Pro arrived alongside a vapor chamber in iPhone 17 Pro, with up to 40% better sustained performance versus the prior generation, plus Neural Accelerators inside each GPU core, larger cache, more memory, and an explicit focus on local LLM workloads^[5].

2025 · M5

The same strategy scales to Mac and iPad

M5 brought a Neural Accelerator into each GPU core, over 4x peak GPU compute for AI versus M4, up to 45% higher graphics performance, up to 15% faster multithreaded CPU, and 153 GB/s unified memory bandwidth^[6].

Look at the asymmetry in Apple's own A19 framing. CPU is 1.5x faster than A15. GPU is more than 2x faster than A15. Across four years, the GPU's cumulative gain (more than 100%) was at least twice the CPU's (50%). That is the A15 die shot, four generations later, in the form of an Apple keynote slide.
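The same asymmetry can be restated as a compound annual rate. The sketch below takes Apple's marketing multiples at face value and treats the "more than 2x" GPU claim as exactly 2x, so the GPU figure is a floor. The four-year span (A15 in 2021 to A19 in 2025) comes from the timeline above.

```python
def annualized_rate(total_multiple: float, years: int) -> float:
    """Compound annual growth rate implied by a total multiple over `years`."""
    if years <= 0 or total_multiple <= 0:
        raise ValueError("years and total_multiple must be positive")
    return total_multiple ** (1.0 / years) - 1.0


YEARS = 4            # A15 (2021) to A19 (2025)
CPU_MULTIPLE = 1.5   # Apple's claim versus A15
GPU_MULTIPLE = 2.0   # Apple says "more than 2x", so this is a lower bound

cpu_rate = annualized_rate(CPU_MULTIPLE, YEARS)
gpu_rate = annualized_rate(GPU_MULTIPLE, YEARS)

print(f"CPU: {cpu_rate * 100:.1f}% per year")
print(f"GPU: at least {gpu_rate * 100:.1f}% per year")
print(f"Cumulative gain ratio (GPU/CPU): {(GPU_MULTIPLE - 1) / (CPU_MULTIPLE - 1):.1f}x")
```

With these inputs the annualized rates come out to roughly 10.7% for CPU and at least 18.9% for GPU. The cumulative GPU gain is at least twice the CPU gain, while the per-year gap is somewhat narrower because of compounding.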

Section 07 Why this happened, process scaling got harder

The A15 die shot caught Apple in the middle of a wider industry change. TSMC's 3 nm family extended through N3, N3E, and N3P, with N2 entering volume production in Q4 2025 using first-generation nanosheet transistor technology^[7]^[8]. Scaling is still alive, but each generation costs more, takes longer, and gives back less raw density than the comparable step a decade earlier.

ASML's own 2026 AGM material put it plainly. AI compute demand has outpaced Moore's Law alone, and the future of compute scaling is not just smaller transistors. It is 2D scaling combined with 3D integration, including advanced packaging, hybrid bonding, and chiplet architectures^[9].

When process scaling stops giving effortless gains, the design choices matter more. You have to spend silicon where the system is actually constrained. For a phone SoC in 2021, that meant cache, GPU, Neural Engine, ISP, media engines, memory efficiency, thermals, and unified software-hardware integration. Which is exactly the list of places where A15 spent its area.

Section 08 The real Apple Silicon moat

Once you read the A15 die shot as a system story rather than a CPU story, Apple's competitive position becomes more legible. The advantage was never one fast CPU core. The advantage is the coordination of a long list of pieces.

Custom CPU cores. Custom GPU. Neural Engine. Neural Accelerators inside each GPU core. ISP. Media engines. Secure Enclave. System cache. Unified memory architecture. Metal. Core ML. The iOS and macOS frameworks that sit on top. The camera pipeline. The battery and thermal management. The product-level integration where all of those decisions get made against the same constraints.

Apple's moat is not only that it designs chips. It designs the chip, the OS, the APIs, the camera stack, the thermal envelope, and the product around the same set of tradeoffs.

Old framing

CPU company

Headline CPU IPC.
Frequency.
Core count.
Benchmark-first framing.
One number for marketing.
Generation defined by CPU uplift.

New framing

System company

Cache and unified memory.
GPU and Neural Accelerators.
Neural Engine and on-device AI.
ISP and media engines.
Sustained performance and thermals.
Software and product integration.

Section 09 What people got wrong in 2021

The weak interpretation in 2021 went something like this. A15's CPU gains slowed, so Apple Silicon's momentum was fading, and the bar for the next generation would be hard to clear. That reading misread the die shot.

The better interpretation was that A15's CPU gains slowed because Apple was moving the battleground. The CPU still mattered. It continued to improve every generation. But Apple was increasingly optimizing for the real product workloads where users actually felt the chip: photos, video, games, on-device machine learning, battery life, sustained performance, local AI, memory efficiency, and display and media features.

A15 die-shot signals — what the analysis actually revealed

Bigger die at unchanged density. About 107.7 mm² and around 15B transistors, consistent with N5P rather than a new node.

Modest CPU IPC gain. Performance core plus L1 area up about 15%, but headline IPC framed as Apple's smallest in years.

Bigger shared L2. 8 MB to 12 MB, matching the M1's shared L2 despite fewer big cores.

System cache doubled. 16 MB to 32 MB. The single largest area contributor in the redesign.

Larger GPU. Roughly 30% more area, doubled FP32 ALUs per core, plus memory-saving features.

Revised NPU. Same 16-core count, but architectural changes pushed per-core throughput up.

Revised ISP. Heavily reworked, which fit Apple's growing emphasis on computational photography.

The CPU did not stop mattering. It stopped being the only visible proof of progress.

Section 10 Risks and limits

The story is cleaner in hindsight than it was in 2021. There are real reasons to read this argument with care.

Risk 01

Apple's official performance claims are marketing, not independent benchmarks. They should be treated as directional, not definitive.

Risk 02

Die-shot area analysis involves annotation and estimation. Different reviewers can disagree on block boundaries by several percent.

Risk 03

Benchmark gains vary by workload. A bigger system cache helps some workloads much more than others.

Risk 04

Specialized blocks like ISP and Neural Engine produce uneven user-visible improvement depending on app and OS support.

Risk 05

The smartphone thermal envelope still caps sustained performance, regardless of peak silicon throughput.

Risk 06

Apple remains structurally dependent on TSMC process execution and yields for each generation.

Risk 07

On-device AI depends on more than silicon. Model quality, memory capacity, developer adoption, and privacy constraints all matter.

Risk 08

Cache and bandwidth efficiency wins do not appear cleanly in every cross-vendor benchmark, which can hide their real value.

Risk 09

The A15 die shot does not prove every later choice. It reveals a direction. The roadmap evidence is what makes the direction credible.

Risk 10

System-silicon advantages can be eroded if competitors close the gap on cache, neural acceleration, or unified memory at the package level.

The point is not that the A15 predicted every Apple chip in detail. The point is that it revealed the strategic direction, and the next four generations executed against that direction more consistently than against any single CPU benchmark.

Section 11 Final verdict

The A15 die shot was not just evidence of weak CPU gains. It was an early map of Apple Silicon's future. Apple's CPU did not stop improving. But the silicon budget increasingly moved into the system: cache, GPU, neural compute, media, ISP, memory behavior, thermals, and software-defined product experiences.

By 2025, that shift had become the dominant story. M5 carried it into the Mac and iPad. A19 carried it into the phone. A19 Pro shipped alongside a vapor chamber and an explicit focus on local LLM workloads. The Neural Engine was no longer the only neural block on the chip, because Apple had also placed Neural Accelerators inside each GPU core.

You can draw a straight line from the doubled system cache and reshaped ISP on the A15 die shot to the way Apple talks about its silicon today. The vocabulary changed, but the underlying decision was already visible in 2021.

Apple stopped spending silicon like a CPU company and started spending it like a system company.

Section 12 Evidence ledger and source notes

Evidence ledger — load-bearing claims with their sources

Source	Claim	Why it matters
SemiAnalysis (2021)	A15 die area ~107.7 mm², transistors ~15B, density unchanged.	Establishes that A15's extra area was a real choice, not a free shrink.
SemiAnalysis (2021)	System cache doubled from 16 MB to 32 MB.	Biggest single area contributor. Anchors the cache-as-strategy argument.
SemiAnalysis (2021)	GPU block grew ~30%, FP32 ALUs doubled per core.	Shows the GPU was the louder area signal of the redesign.
SemiAnalysis (2021)	NPU and ISP architecturally revised.	Frames the rise of system silicon beyond CPU.
Apple A17 Pro (2023)	Up to 20% faster GPU, biggest GPU redesign in Apple history, ray tracing.	Confirms GPU-first messaging after A15.
Apple A18 (2024)	GPU up to 40% faster than A16, 35% more efficient. Neural Engine retuned for generative AI.	Apple Intelligence reframes the chip around on-device AI.
Apple A19 (2025)	CPU 1.5x and GPU 2x+ versus A15.	The four-year asymmetry, GPU gain outruns CPU gain.
Apple A19 Pro (2025)	Vapor chamber, +40% sustained performance, Neural Accelerators per GPU core.	Sustained performance and local LLM become explicit goals.
Apple M5 (2025)	4x+ peak GPU compute for AI vs M4, 153 GB/s unified memory.	The same system-silicon strategy scales to Mac and iPad.
ASML 2026 AGM	AI compute demand has outpaced Moore's Law alone.	Justifies why design choices, not process scaling, increasingly drive gains.

Footnotes & sources

[1] SemiAnalysis, “Apple A15 Die Shot and Annotation — IP Block Area Analysis,” 2021 (PDF supplied by author). Source for A15 die area, transistor count, density observations, L1 and L2 sizing, system cache doubling, GPU growth, NPU and ISP descriptions.
[2] Apple Newsroom, “Apple unveils iPhone 15 Pro and iPhone 15 Pro Max,” September 2023. apple.com/newsroom/2023/09. Source for A17 Pro CPU, GPU, ray tracing, and Neural Engine claims.
[3] Apple Newsroom, “Apple introduces iPhone 16 and iPhone 16 Plus,” September 2024. apple.com/au/newsroom/2024/09. Source for A18 CPU, GPU efficiency, and Apple Intelligence framing.
[4] Apple Newsroom, “Apple introduces iPhone 17,” September 2025. apple.com/au/newsroom/2025/09. Source for A19 CPU 1.5x versus A15, GPU 2x+ versus A15, display engine, ISP, Neural Engine updates, and Neural Accelerators in each GPU core.
[5] Apple Newsroom, “Apple unveils iPhone 17 Pro and iPhone 17 Pro Max,” September 2025. apple.com/au/newsroom/2025/09 (Pro). Source for A19 Pro vapor chamber, sustained-performance claim, GPU Neural Accelerators, larger cache, more memory, and local LLM framing.
[6] Apple Newsroom, “Apple unleashes M5,” October 2025. apple.com/au/newsroom/2025/10. Source for M5 Neural Accelerators per GPU core, 4x+ AI compute vs M4, 45% graphics uplift, 15% multithreaded CPU uplift, and 153 GB/s unified memory bandwidth.
[7] TSMC, “3nm Technology,” tsmc.com/dedicatedFoundry/technology/logic/l_3nm. Source for N3, N3E, and N3P family context.
[8] TSMC, “2nm Technology,” tsmc.com/dedicatedFoundry/technology/logic/l_2nm. Source for N2 Q4 2025 volume production and nanosheet transistor framing.
[9] ASML, 2026 AGM Presentation. ourbrand.asml.com/…/2026-AGM-presentation.pdf. Source for the statement that AI compute demand has outpaced Moore's Law alone, and that future scaling combines 2D scaling with 3D integration.