If you can't handle me at my OTM you don't deserve me at my ITM

JAX

J. Alexander's Holdings Inc C

Price Data Unavailable



Premarket Buzz: 0 comments today (12am to 9:30am EST)

Comment Volume (7 days):
15 total comments on WallStreetBets
0 total comments on 4chan's /biz/


Recent Comments

I'm an Nvidia bull, but I will say this: JAX has improved significantly over the past few years, and TPUs are a genuine threat to Nvidia in the event that compute demand slows down (not very likely). But Nvidia's biggest moat at this point is that CoWoS capacity is growing slower than demand, and Nvidia owns the overwhelming majority of the current and new capacity. Google is competing for crumbs in CoWoS allocation, and they need a lot of their current allocation for their own internal workloads. There's a reason direct TPU sales are expected to generate peanuts in revenue over the next 5 years. TPUs are a lot less of a direct competitor than people are making them out to be. This is definitely a buy-the-rumor, sell-the-news moment.
Hardware specs don't matter; their biggest hurdle is convincing developers to adopt a new software stack. There have been plenty of chips that outperformed Nvidia on paper over the last few years. CUDA has always been the moat. The only reason TPUs are even remotely threatening to Nvidia is that JAX has matured a lot, but much of that threat is diminished by the fact that Google's TPUs are on the same process as Nvidia's chips and Google only has access to a small portion of that allocation. Not to mention that they need a lot of it for their own internal workloads.
Early in the year. I moved to JAX a while back, so I don't follow what Torch is doing too closely anymore. Do you know if the XLA backend has gotten any better?
Not compatible with CUDA, but Google has a framework called JAX that abstracts away TPU vs GPU vs CPU, so you can "write once, run it everywhere". In use by "Anthropic, xAI, and Apple": [https://developers.googleblog.com/building-production-ai-on-google-cloud-tpus-with-jax/](https://developers.googleblog.com/building-production-ai-on-google-cloud-tpus-with-jax/)
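As a minimal sketch of what that "write once, run it everywhere" claim looks like in practice: the same jitted function compiles through XLA for whichever backend is present, with no device-specific code. Only public jax APIs are used here; the shapes are illustrative.

```python
import jax
import jax.numpy as jnp

# The same jitted function is compiled by XLA for whatever backend
# JAX finds at runtime: CPU, GPU, or TPU. No device-specific code.
@jax.jit
def predict(w, b, x):
    return jnp.tanh(x @ w + b)

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
w = jax.random.normal(k1, (128, 64))
b = jax.random.normal(k2, (64,))
x = jax.random.normal(k3, (8, 128))

print(jax.devices())            # e.g. [CpuDevice(id=0)] or TPU/GPU devices
print(predict(w, b, x).shape)   # (8, 64), identical on every backend
```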
All of the LLMs you named are running more or less the same Transformer architecture. There's nothing stopping you from running those on TPUs; if PyTorch/XLA is not the flaming garbage heap it was when I last tried using it years ago, you can probably even do it without touching JAX (but JAX is in many ways more pleasant to work with in my experience, so you might opt to just use that. It's a bit of a learning curve because it forces you to do things a certain way (the right way), and prototyping is a bit slower on it). Nvidia GPGPUs do a *lot* more than just tensor operations; TPUs optimize for a specific subset of those operations. If you don't need anything like multi-process access or integration of hardware video encoding/decoding, you can do it on a TPU. My main criticism, as someone who has used both platforms with some depth, is that the TPU is much more closed off and I can never have a TPU in my own hands like I can with an Nvidia GPU (though Nvidia sure is trying to match Google on this matter with their pricing).
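For the "forces you to do things a certain way" point: JAX traces pure functions over immutable arrays, so in-place mutation and hidden state are out. A hedged sketch of the functional idiom it pushes you toward; the SGD-style step and the names here are illustrative, not any particular library's API.

```python
import jax
import jax.numpy as jnp

# JAX arrays are immutable: `x[0] = 1.0` raises a TypeError. Updates
# go through the functional .at[] API instead, which XLA can still
# lower to an in-place operation under jit.
x = jnp.zeros(4)
y = x.at[0].set(1.0)   # returns a new array; x is unchanged
print(x, y)            # [0. 0. 0. 0.] [1. 0. 0. 0.]

# State is threaded explicitly: a pure SGD-style step takes params and
# grads in and returns new params, mutating nothing.
def step(params, grads, lr=1e-3):
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

params = {"w": jnp.ones(3)}
grads = {"w": jnp.full(3, 0.5)}
print(step(params, grads)["w"])   # [0.9995 0.9995 0.9995]
```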
That analogy was spot-on three years ago, but it's outdated for the modern stack (PyTorch/XLA 2.0+ with PJRT). It is true that PyTorch is fundamentally 'eager' (dynamic) and TPUs are 'graph-based' (static). If you treat a TPU exactly like a GPU and throw dynamic shapes or constantly changing tensor sizes at it, the XLA compiler will thrash and performance will tank. But the 'translation layer' isn't the bottleneck anymore. With the move to the PJRT runtime (the same one JAX uses) and torch.compile, the issue is largely gone for properly written code. But yeah, if you just put "import tpu" in your code, it's gonna be shit.
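A small sketch of the shape-thrash problem described above, in JAX terms (the same dynamic applies to PyTorch/XLA): every new input shape triggers a fresh XLA compilation, and the standard workaround is padding inputs up to a few fixed bucket sizes. The bucket sizes here are arbitrary.

```python
import jax
import jax.numpy as jnp

@jax.jit
def f(x):
    return (x * x).sum()

# Each distinct input shape triggers a separate XLA compilation, so
# constantly varying sizes make the compiler thrash:
for n in (10, 11, 12):
    f(jnp.ones(n))   # three compilations for three shapes

# Standard fix: pad inputs up to a small set of fixed bucket sizes so
# the compile cache gets reused.
def pad_to_bucket(x, buckets=(16, 32, 64)):
    n = next(b for b in buckets if b >= x.shape[0])
    return jnp.pad(x, (0, n - x.shape[0]))

for n in (10, 11, 12):
    f(pad_to_bucket(jnp.ones(n)))   # all three hit the size-16 compilation
```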
No, but TensorFlow, PyTorch, JAX, XLA, and other frameworks do. As long as those work, the CUDA part is irrelevant; that's too deep down the software stack. It is not easy to train/serve on a fungible fleet, but that is why we are paid the big bucks.
If you're doing absolute cutting-edge work you just use TensorFlow or JAX, though.
You're treating "can deploy" and "makes sense to deploy" like they're the same thing. Sure, any big company could hire people to deal with the TPU/JAX/XLA workflow. That's not really the point. Outside of Google, almost nobody wants to, because you lose a ton of the kernel ecosystem, tooling, and debugging support that everyone already relies on with GPUs.

And this idea that inference is just a static graph you compile once isn't how modern LLMs actually run. Real-world inference stacks use things like fused attention kernels, dynamic batching, paged KV caches, speculative decoding, and other tricks that come straight out of the GPU ecosystem. On TPUs a lot of that either doesn't exist or has to be rebuilt around XLA's rules (see the sketch below).

Yeah, a company could throw money at hiring TPU specialists, but that's exactly what I mean about the switching cost. On GPUs, everything already works with the frameworks people use by default. On TPUs you have to adopt Google's entire way of doing things before you get the same performance.

So sure, companies could adapt to TPUs. They just usually don't, because the cost of changing the whole stack is way higher than you're making it sound. TPU TCO only wins if you restructure a big chunk of your system to fit Google's setup. GPUs don't force you to do that.
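To make the KV-cache point concrete: under XLA's static-shape rules you typically can't grow a cache token by token the way GPU-side paged caches do; the usual pattern is to preallocate to max length and write into it at a position. A hedged sketch, where the shapes, names, and single-layer layout are all illustrative.

```python
import jax
import jax.numpy as jnp

MAX_LEN, HEADS, DIM = 2048, 8, 64

# Preallocated, fixed-shape cache: XLA compiles the update once and
# reuses it for every decode step.
k_cache = jnp.zeros((MAX_LEN, HEADS, DIM))

@jax.jit
def append_kv(cache, new_k, pos):
    # Write one step's keys at position `pos` without changing the
    # cache's shape; dynamic_update_slice keeps everything static.
    return jax.lax.dynamic_update_slice(cache, new_k[None], (pos, 0, 0))

new_k = jnp.ones((HEADS, DIM))
k_cache = append_kv(k_cache, new_k, 5)
print(k_cache.shape)   # (2048, 8, 64); the shape never changes
```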
Uh, this sounds like you asked AI to respond to my comment and prove it wrong with a bunch of info that's borderline misinformation. All the bleeding-edge labs except DeepMind are using CUDA-based platforms for training their SOTA models. Obviously they go deeper than just CUDA, but it still starts with Nvidia GPUs. The moat is fully intact and getting stronger than ever. Anthropic is the only one using a hybrid approach, but considering they started out GPU-based and only disclose vague statements about TPU deployment, it's likely they only run some highly specific inference on TPUs. Also, their job postings and hires almost never mention JAX. So I'm pretty sure the vague Google partnership statement was purely marketing, just a sign that they offer customers access to and integration with GCP. PyTorch on TPUs is a joke and is not a thing for these labs.
