With Truffle you have to map your language’s semantics to Java’s. I am unfortunately out of my depth on the details, but my guess would be that LLVM operates here with this in mind in a completely safe way (I guess pointers to the stack are not safe), so presumably it should work for these as well.
Thank you. What's the status of Truffle? I often hear news from the project, but I never read anything about its scope. Is it highly experimental? Are people using it in production? Is it intended to be used in production in the near future?
Also, if it's faster and the C-extension problem is solved by the LLVM interpreter, what's holding it back from being used?
The Truffle VM stuff takes an interesting approach by (afaik) executing the language's getters/setters and then having a common data format (I presume?). What if someone could make an LLVM version of that?
I believe it’s more about Truffle’s LLVM interpreter (which is perhaps the successor of this project), but one motivation is that many scripting languages (Python, Ruby) use C (and Fortran) libraries extensively through FFI.
Truffle can give these scripting languages a huge boost in performance (TruffleRuby is 3x faster than the second fastest implementation), but the JVM “doesn’t like to” rely on FFI all that much - and also, Truffle is polyglot, with the ability to optimize across different languages. So with an LLVM interpreter, Ruby or Python code calling into it can also be optimized, e.g. by inlining, in certain cases beating the performance of native FFI.
Other than becoming truly cross-platform, running on top of a single runtime gives it the ability to observe these parts as well (which is in itself a huge advantage, because the JVM has some killer observability tools). So, for example, Project Loom might be applicable to a Python script using C libs for IO, putting the whole thing on a virtual thread and magically making its blocking calls non-blocking.
In a theoretical sense, yes. But it would be very hard to avoid introducing nonportable elements in your code. There is no practical way to go from any existing language to LLVM and keep it portable, with any existing LLVM frontend that I am familiar with.
There's actually a fairly long history of cross-language VMs, with various degrees of success. What usually happens is that they work fine for languages that look, semantically, basically like the native language on the VM. So LLVM works well as long as your language is mostly like C (C, C++, Objective-C, Rust, Swift). Parrot works if your language is mostly like Perl 6 (Perl, Python, PHP, Ruby). .NET works if your language is mostly like C# (C#, F#, Boo, IronPython). The JVM works if your language is mostly like Java (Java, Clojure, Scala, Kotlin). Even PyPy has frontends for Ruby, PHP, Smalltalk, and Prolog.
"Mostly" in this case means semantically, not syntactically. It's things like concurrency model; floating point behavior; memory layout; FFI; semantics of basic libraries; performance characteristics; and level of dynamism. You can write a Python frontend for both the JVM and CLR, and it will look like Python but act like Java and .NET, respectively, with no guarantee that native CPython libraries will work on it.
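To make "semantically, not syntactically" a bit more concrete, here is a rough sketch in Python. The helper `wrap_int32` and both `*_add` functions are hypothetical names made up for illustration; they only emulate what fixed-width 32-bit integer arithmetic (as a Java-like host might impose) would do, and are not a claim about how any real Python-on-JVM or Python-on-CLR implementation behaves. The point is just that the same-looking `a + b` can mean different things depending on the host's semantics.

```python
def wrap_int32(value: int) -> int:
    """Reduce an arbitrary-precision int to the signed 32-bit range
    using two's-complement wraparound (hypothetical host behaviour)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def python_style_add(a: int, b: int) -> int:
    # CPython semantics: integers grow as needed, no overflow.
    return a + b


def fixed_width_add(a: int, b: int) -> int:
    # Assumed host semantics: the result is truncated to 32 bits.
    return wrap_int32(a + b)


INT32_MAX = 2**31 - 1

if __name__ == "__main__":
    print(python_style_add(INT32_MAX, 1))
    print(fixed_width_add(INT32_MAX, 1))
```

Identical source text, two different results at the boundary; the same kind of gap shows up for concurrency, FFI and library behaviour, only less visibly.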
The problem is that this is where basically all the interesting language-design research is. I wouldn't use Rust for the syntax; I'd use it because I want properties like complete memory safety, manual control over memory allocation, easy (and fast!) access to C libraries, and short startup time. These are all things that Truffle explicitly does not deliver.
It's a great tool if you work in the JVM ecosystem and want to play around as a language designer. But most of the interesting languages lately have created their own ecosystems, and they succeed by solving problems so fundamental that people will put up with having to learn a new ecosystem to gain the benefits they offer.
Strong caveat about LLVM: it's not a good target for precise garbage collection, because at a certain point the distinction between pointers and integers is lost. Fixing that would require diving into and writing a lot of C++, and I'd rather do that sort of work with a safe pointer language or drop all pretenses for the lowest level stuff.
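A toy model of why losing the pointer/integer distinction matters, sketched in Python. Everything here (`Slot`, `HEAP_ADDRESSES`, the two `*_roots` functions, the addresses) is hypothetical and has nothing to do with LLVM's actual APIs; it only illustrates the general idea that a collector without type information has to treat any word that looks like a heap address as a possible pointer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    bits: int         # raw machine word as stored on the stack
    is_pointer: bool  # type info that a precise collector relies on


# Hypothetical addresses of live heap objects.
HEAP_ADDRESSES = {0x1000, 0x2000, 0x3000}


def precise_roots(stack):
    """With type info: only slots known to be pointers count as roots."""
    return {s.bits for s in stack if s.is_pointer and s.bits in HEAP_ADDRESSES}


def conservative_roots(stack):
    """Without type info: any word matching a heap address must be kept."""
    return {s.bits for s in stack if s.bits in HEAP_ADDRESSES}


stack = [
    Slot(0x1000, True),   # genuine pointer
    Slot(0x2000, False),  # integer that happens to equal an address
    Slot(42, False),      # plain integer
]

if __name__ == "__main__":
    print(sorted(precise_roots(stack)))
    print(sorted(conservative_roots(stack)))
```

In this model the conservative scan keeps the object at `0x2000` alive even though nothing points to it, and it also could not safely move that object, since rewriting the "pointer" might corrupt what is really an integer.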
We've been JITing using LLVM for a number of years without an obvious problem. It obviously depends on the size of the code you are wanting to JIT, but the OP was discussing a toy language as a side project, and LLVM is certainly perfect for that use case.
No, that's because in the specific case of LLVM only the C API is currently mapped. We'd have to work a little bit harder for the C++ API, but there is interest in doing so, if only to use it as a better parser for JavaCPP itself:
Improve Parser: Use the Clang API
https://github.com/bytedeco/javacpp/issues/51
Could you please expand on this/link me somewhere? I am not familiar with LLVM; I am only familiar with the JVM spec (currently in the process of writing a templated interpreter), and not yet with OpenJDK’s existing code base or with a complete JIT compiler.
As far as I know, LLVM does exactly the same thing; the LLVM representation of that code would have a different syntax of course, but semantically it would be exactly equivalent to the GIMPLE version you've presented.
As the siblings have mentioned, LLVM doesn't really do this. What does, though, are the JVM and .NET, although those aren't really oriented towards low-level C programming.