
CEO Andy Jassy’s 2025 Letter to Shareholders

soAsian

An interesting post -- thank you.

Do Amazon's chips have any fundamental improvements that AMD, Nvidia, Intel, or Qualcomm don't already offer, or is this more of "we have chips available to buy"?

(I don't see how this isn't going to end up in a glut of products at some point in the future, similar to the over-ordering that occurred after the COVID chip-related "shortage". That said, I do think this is a good opportunity right now for Amazon.)
 
An interesting post -- thank you.

Do Amazon's chips have any fundamental improvements that AMD, Nvidia, Intel, or Qualcomm don't already offer, or is this more of "we have chips available to buy"?
I read the shareholder letter, and Amazon is not going to be selling chips as a merchant vendor. Like Google, they are going to be renting chips out through AWS. The letter has some convoluted wording that speaks of their chip development and deployment as if it were a merchant chip business, but nothing in it implies they actually have such a plan:

"Our annual revenue run rate for our chips business (inclusive of Graviton, Trainium, and Nitro—our EC2 NIC) is now over $20 billion, and growing triple digit percentages YoY. To dimensionalize this versus other chips companies, that run rate is somewhat understated by our currently only monetizing our chips through EC2. If our chips business was a stand-alone business, and sold chips produced this year to AWS and other third parties (as other leading chips companies do), our annual run rate would be ~$50 billion. There’s so much demand for our chips that it’s quite possible we’ll sell racks of them to third parties in the future."

The word "sell" in that last sentence clearly refers to specially reserved and rented racks in AWS, and I'm surprised Jassy let the word "sell" get past him. Obviously, Amazon is very proud of their internal chip development and the Annapurna Labs acquisition that made it possible, but selling server chips to others is a problem of another magnitude altogether.
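As a back-of-the-envelope reading of the letter's two figures (my arithmetic, not Amazon's), the claim amounts to saying that merchant pricing would book each chip at roughly 2.5x what EC2 monetization does:

Code:
# Rough arithmetic on the run rates quoted in the letter (illustrative only).
internal_run_rate = 20e9   # >$20B/yr, monetized solely through EC2
merchant_run_rate = 50e9   # ~$50B/yr claimed for a standalone chip business

implied_uplift = merchant_run_rate / internal_run_rate
print(f"Implied merchant-vs-EC2 pricing uplift: {implied_uplift:.1f}x")  # 2.5x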


(I don't see how this isn't going to end up in a glut of products at some point in the future, similar to the over-ordering that occurred after the COVID chip-related "shortage". That said, I do think this is a good opportunity right now for Amazon.)
I highly doubt Amazon becomes a merchant chip vendor. After all that Broadcom hype about selling Google TPUs to Anthropic, it turned out it was just a Google Cloud deployment. Being a merchant server chip vendor is hugely expensive and you need very high margins to support it, and I think Arm is about to find that out the hard way.
 
Do Amazon's chips have any fundamental improvements that AMD, Nvidia, Intel, or Qualcomm don't already offer, or is this more of "we have chips available to buy"?
OK - claimed benefits via Perplexity, with associated caveats (below):

Amazon’s in-house chips are mainly claimed to excel on price-performance, throughput, latency, memory bandwidth, and energy efficiency versus general-purpose GPUs. The strongest claims are for AWS Inferentia and Trainium: lower inference cost, higher throughput, and better performance per watt for AI workloads.[247wallst +1]

Claimed strengths (mostly compared to earlier Amazon chips / no direct references to merchant market competitors)
• Lower cost for AI inference and training. AWS says Inferentia2 delivers up to 4x higher throughput and up to 10x lower latency than Inferentia, while Inf1 instances can provide up to 70% lower cost per inference than comparable EC2 instances.[aws.amazon]
• Better price-performance than GPUs. Amazon has said its Trainium-based systems can cut AI training and inference costs by up to half versus comparable GPU setups, framing custom silicon as a cost-leadership play.[247wallst]
• Higher throughput and memory capacity. AWS says Inferentia2 has 32 GB of HBM per chip, 4x the memory of Inferentia, and 10x the memory bandwidth, which helps with larger and more complex models.[aws.amazon]
• Scale-out inference support. Inf2 instances are described as the first inference-optimized EC2 instances with ultra-high-speed chip-to-chip connectivity for distributed inference.[aws.amazon]
• Framework compatibility. AWS says Neuron integrates natively with PyTorch and TensorFlow, so customers can use existing workflows with fewer code changes (see the sketch after this list).[aws.amazon]
• Energy efficiency. AWS claims Inf2 instances offer up to 50% better performance per watt than comparable EC2 instances.[aws.amazon]
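
To make the "fewer code changes" point concrete, here's a minimal sketch of the Neuron/PyTorch flow, assuming a Trn1/Inf2 instance with the torch-neuronx package installed. torch_neuronx.trace is the documented compile entry point, but treat this as a sketch and check the current Neuron docs for your SDK version:

Code:
import torch
import torch_neuronx  # AWS Neuron SDK PyTorch integration (pip install torch-neuronx)

# An ordinary PyTorch model; nothing Neuron-specific in its definition.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).eval()

example = torch.rand(1, 128)

# The one Neuron-specific step: ahead-of-time compile for NeuronCores.
# The result is a TorchScript module that runs on the Neuron device.
neuron_model = torch_neuronx.trace(model, example)

with torch.no_grad():
    logits = neuron_model(example)
print(logits.shape)  # torch.Size([1, 10])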

Important caveat
These are Amazon’s own claims, and some outside reporting says the chips still trail Nvidia in certain areas, especially latency and software maturity. So the core technical story is not “best overall chip,” but rather “optimized for AWS workloads at lower cost, with improving performance generation by generation”.

Real Benchmark Results
I’m personally waiting to see how Amazon Trainium does on InferenceX data-center-scale inference benchmarks, and whether any of their claims are anywhere near true in a third-party assessment. Per a couple of sources, Amazon is signed up to do the InferenceX work with SemiAnalysis, but hasn’t produced any real results yet vis-à-vis existing NVIDIA and AMD comparisons. The longer it takes, the less likely it is that they have a competitive product at the rack level.


That Amazon has paired with Cerebras on disaggregated inference also tells me that their chips aren't particularly good on decode when compared to other new solutions.
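
For readers unfamiliar with the term: "disaggregated inference" splits LLM serving into a compute-bound prefill phase and a memory-bandwidth-bound decode phase, runs each on hardware suited to it, and hands the attention KV cache between the two pools. A toy sketch of the control flow (my illustration, not AWS or Cerebras code):

Code:
from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Per-request attention state handed from the prefill pool to the decode pool."""
    request_id: int
    tokens: list = field(default_factory=list)

def prefill(request_id, prompt):
    # Compute-bound phase: the whole prompt is processed in one pass,
    # so it favors compute-dense hardware.
    return KVCache(request_id=request_id, tokens=list(prompt))

def decode(cache, max_new_tokens):
    # Memory-bandwidth-bound phase: one token per step, re-reading the
    # growing cache each time. This is the phase where high-bandwidth
    # parts like Cerebras's wafer-scale engine are claimed to shine.
    out = []
    for _ in range(max_new_tokens):
        next_token = (sum(cache.tokens) + len(cache.tokens)) % 50_000  # stand-in for a model
        cache.tokens.append(next_token)
        out.append(next_token)
    return out

cache = prefill(request_id=1, prompt=[101, 2023, 2003])  # runs on the prefill pool
print(decode(cache, max_new_tokens=4))                   # runs on the decode pool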
 
OK - claimed benefits via Perplexity, with associated caveats (below): ...
The Perplexity analysis is terrible.
Real Benchmark Results
I’m personally waiting to see how Amazon Trainium does on InferenceX data-center-scale inference benchmarks, and whether any of their claims are anywhere near true in a third-party assessment. Per a couple of sources, Amazon is signed up to do the InferenceX work with SemiAnalysis, but hasn’t produced any real results yet vis-à-vis existing NVIDIA and AMD comparisons. The longer it takes, the less likely it is that they have a competitive product at the rack level.

This sounds like a real issue, but not a surprise, since the Trainium and Inferentia software stack was probably developed and tested only against AWS's internal software.
That Amazon has paired with Cerebras on disaggregated inference also tells me that their chips aren't particularly good on decode when compared to other new solutions.
I'm not sure; the Cerebras solution is purported to be the fastest inference available, so it gets a lot of attention. And even though Cerebras has a cloud service of their own, AWS is probably a much lower impedance path for people trying out Cerebras software and systems. I also guess Cerebras gave AWS a "special price" and probably priority support.
 
The Perplexity analysis is terrible.
Yeah - but it's only based on claims, not real technical analysis outside of what Amazon has published.

AWS is probably a much lower impedance path for people trying out Cerebras software and systems. I also guess Cerebras gave AWS a "special price" and probably priority support.

And I think the benefit to Amazon is that they can offer a lower-latency inference product that still includes some of their own infrastructure.
 
I read the shareholder letter, and Amazon is not going to be selling chips as a merchant vendor. Like Google, they are going to be renting chips out through AWS. ...
You are correct. Amazon will sell their own chips via AWS.


It's a golden age for the semis industry. I love to see options out there besides Intel, AMD, and Nvidia. Even though you can't buy Amazon's chips, you have the option to "lease" them. Besides, everything is a subscription in today's world.
 
And I think the benefit to Amazon is that they can offer a lower-latency inference product that still includes some of their own infrastructure.
I agree. AWS is so profitable that anything that gets people to use AWS more often is a win for Amazon.
 
It's a golden age for the semis industry. I love to see options out there besides Intel, AMD, and Nvidia. Even though you can't buy Amazon's chips, you have the option to "lease" them. Besides, everything is a subscription in today's world.
I agree. The AWS chips aren't all that different from Google TPUs; you need the rack systems, the networks, and the proprietary software that make Trainium and Inferentia into a usable system. Microsoft's Maia is almost certainly the same way. Cerebras is even more extreme in this respect, but at least they're already in the business of selling systems and supporting them.
 