
I can see that. It can implement a basic transformer in JAX/PyTorch just fine.

Anything else, it breaks.




There are a bunch of frameworks built on top of PyTorch too (fastai, Lightning, torchbearer, ignite...). I don't see why this should be a problem (or at least a problem for JAX but not for PyTorch).

The underlying concept of JAX, function transformations, is very powerful and elegant.

PyTorch 2.0 now has a similar underlying mechanism in torch.compile. https://pytorch.org/get-started/pytorch-2.0/
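As a minimal sketch of what those composable transformations look like in JAX (the toy loss function and names here are made up for illustration):

  import jax
  import jax.numpy as jnp

  def loss(w, x):
      # toy quadratic loss, only here to have something to transform
      return jnp.sum((x @ w) ** 2)

  grad_loss = jax.grad(loss)                              # differentiate w.r.t. w
  batched_grad = jax.vmap(grad_loss, in_axes=(None, 0))   # vectorize over a batch of x
  fast_batched_grad = jax.jit(batched_grad)               # compile the composition with XLA

  w = jnp.ones(3)
  xs = jnp.ones((8, 3))
  print(fast_batched_grad(w, xs).shape)                   # (8, 3): one gradient per example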


Can you explain how JAX compares to PyTorch? AFAIK PyTorch also closely resembles the NumPy API.

> transitioning to Jax (and by extension, Pytorch)

Wait, what? Why would transitioning to Jax imply transitioning to PyTorch?


People have already ported a lot of stuff from PyTorch to JAX.

If you're a research scientist or grad student, a lot of projects are "greenfield" to a certain extent, so it's easy to jump on a new framework if it's nice to use and offers some advantage.


How does PyTorch compare to JAX and its stack?

PyTorch is working on catching up; I think they've already got "vmap"-style function transformations in beta. And I'm sure they'll figure out good higher-order derivatives too. That's like 90% of what people want out of Jax, so I think they'll be able to compete.
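For reference, recent PyTorch releases expose this under torch.func (the former functorch APIs); a rough sketch, assuming a PyTorch version that ships torch.func:

  import torch
  from torch.func import grad, vmap

  def loss(w, x):
      # toy scalar loss, just to have something to transform
      return ((x @ w) ** 2).sum()

  # per-sample gradients: grad w.r.t. w, vmapped over the batch dimension of x
  per_sample_grad = vmap(grad(loss), in_dims=(None, 0))

  w = torch.ones(3)
  xs = torch.ones(8, 3)
  print(per_sample_grad(w, xs).shape)  # torch.Size([8, 3])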

The downside of Jax is it’s not easy to debug. PyTorch, for better or for worse, will actually run your Python code as you wrote it.
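To make that concrete: under jax.jit the function body runs once with abstract tracers, so ordinary Python debugging sees tracers rather than values, while eager PyTorch executes the body with real tensors on every call. A small sketch, assuming both libraries are installed:

  import jax
  import jax.numpy as jnp
  import torch

  @jax.jit
  def f(x):
      print("tracing, x is:", x)   # runs at trace time; x is an abstract tracer here
      return x * 2

  f(jnp.arange(3))   # prints a tracer once, during tracing
  f(jnp.arange(3))   # cached compilation: nothing is printed the second time

  def g(x):
      print("eager, x is:", x)     # runs on every call, with the actual tensor
      return x * 2

  g(torch.arange(3))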


Also I bet the PyTorch training code is written with CUDA semantics. Maybe a JAX version would work without messing with the code.

Do JAX and functorch have the same level of built-in functionality (operations) as the original PyTorch library?

Where to learn about it other than the documentation?


PyTorch and Jax have a NumPy-like API, so you can use PyTorch for other things too.
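To show how close both are to NumPy, here is the same expression written three ways (a toy snippet, assuming numpy, jax and torch are all installed):

  import numpy as np
  import jax.numpy as jnp
  import torch

  a = np.linspace(0.0, 1.0, 5)

  # essentially the same NumPy-style expression in each library
  print(np.mean(np.sin(a) ** 2))
  print(jnp.mean(jnp.sin(jnp.asarray(a)) ** 2))
  print(torch.mean(torch.sin(torch.as_tensor(a)) ** 2))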

JAX is very barebones and will require you to write much more code for the same task than you would in PyTorch.

Functorch is still new, and honestly, there is little to learn if you already know JAX. There are some talks from Meta, and then there are always the docs.


IMO, FX is more of a toolkit for writing transforms over your FX modules than "moving in Jax's direction" (although there are certainly some similarities!)

It's not totally clear what "Jax's direction" means to you, but I'd consider its defining characteristics to be 1. composable transformations, and 2. a functional style of programming (related to its function transformations).

I'd say that Pytorch is moving towards the first (see https://github.com/pytorch/functorch) but not the second.

Disclaimer: I work on PyTorch, and Functorch more specifically, although my opinions here aren't on behalf of PyTorch.


Why would you use Jax over PyTorch? Even if it has technical merits, it lacks an ecosystem of readily available models to study and tweak.

The example that fails in Jax would work fine in PyTorch. If you're working purely on training the model, TorchScript doesn't give many benefits, if any.

> why not Pytorch?

JAX enables using (parts of) existing numpy codebases in disciplines other than deep learning. Autodiff and compilation to GPUs are very useful for all kinds of algorithms and processing pipelines.
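A sketch of what that looks like outside deep learning: take an ordinary NumPy-style function, differentiate it, and compile it, with no neural-network machinery involved (the toy "spring energy" here is made up for illustration):

  import jax
  import jax.numpy as jnp

  def energy(x, k=2.0):
      # toy spring energy of a 1-D chain of points, written like plain numpy
      return 0.5 * k * jnp.sum((x[1:] - x[:-1]) ** 2)

  force = jax.jit(jax.grad(energy))   # compiled derivative; runs on GPU/TPU if available

  x = jnp.linspace(0.0, 1.0, 6)
  print(energy(x), force(x))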


In my experience, the answer comes down to "does your code use classes liberally?"

If not, and you're just passing things between functions, then go ahead with Jax! But porting larger codebases that use classes goes significantly better with PyTorch, even if they use different method names etc.
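A rough sketch of the stylistic difference (the names here are made up): idiomatic Jax passes state explicitly through pure functions, while a class-heavy design maps directly onto a PyTorch nn.Module.

  import jax.numpy as jnp
  import torch
  import torch.nn as nn

  # Jax-ish: parameters are an explicit argument, the function stays pure
  def linear_apply(params, x):
      return x @ params["w"] + params["b"]

  params = {"w": jnp.ones((3, 2)), "b": jnp.zeros(2)}
  y = linear_apply(params, jnp.ones((4, 3)))

  # PyTorch-ish: state lives on the object, which is what class-based code expects
  class MyLinear(nn.Module):
      def __init__(self):
          super().__init__()
          self.layer = nn.Linear(3, 2)

      def forward(self, x):
          return self.layer(x)

  y2 = MyLinear()(torch.ones(4, 3))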


I really don't feel that there is magic in PyTorch or Jax, but that may be because I have written my own autograd libs.

In PyTorch you have a graph that is created at runtime by connecting the operations together in a transparent manner.

Jax may feel a bit magic, but all it does is send tracers through your function, record the operations, and compile them; by limiting the language, you get controlled branching with the proper semantics.
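A small example of that controlled branching: a data-dependent Python if can't be evaluated on a tracer under jit, so you reach for the structured primitive instead (a sketch using the standard jax.lax.cond pattern):

  import jax
  import jax.numpy as jnp

  @jax.jit
  def relu_pyif(x):
      # raises an error when called: under jit, x is a tracer with no concrete truth value
      return x if x > 0 else 0.0 * x

  @jax.jit
  def relu_cond(x):
      # controlled branching: both branches are traced, the selection is compiled in
      return jax.lax.cond(x > 0, lambda v: v, lambda v: 0.0 * v, x)

  print(relu_cond(jnp.asarray(-3.0)))  # 0.0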

---

The main reason I have had such trouble with Julia is simply because of how early I ended up needing to get into the messed up things, whereas with the other languages you can get away without touching them.


When I first read about JAX I thought it would kill Pytorch, but I'm not sure I can get on with an immutable language for tensor operations in deep learning.

If I have an array `x` and want to set index 0 to 10, I cannot do:

  x[0] = 10

I instead have to do:

  y = x.at[0].set(10)

I'm sure I could get used to it, but it really puts me off.
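For what it's worth, the .at syntax covers more than set (add, multiply, min, max, and so on) and chains like any other expression; a small sketch:

  import jax.numpy as jnp

  x = jnp.zeros(5)
  y = x.at[0].set(10.0)     # functional "write": returns a new array, x is untouched
  z = y.at[1:3].add(2.0)    # sliced update, also out of place
  w = z.at[-1].max(7.0)     # elementwise max against the existing value at that index
  print(w)                  # [10.  2.  2.  0.  7.]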

How does it compare to JAX? After TensorFlow and PyTorch, JAX seems very simple, basically an accelerated NumPy with just a few additional useful features like automatic differentiation, vectorization, and jit-compilation. In terms of API I don't see how you could go any simpler.