
Io can do fast vector processing if you use its vector library. That's what I assume is going on.



I heard that the vector operations were very slow though. Has this changed?

Support for vector chaining, I imagine.

I once implemented an in-memory "vector DB" over a million vectors using a simple linear scan. A query takes several milliseconds.
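For a sense of scale, here's a minimal sketch of that kind of brute-force scan (the 128-dimension embeddings and dot-product scoring are my assumptions, not from the original comment):

```cpp
#include <cstddef>
#include <vector>

// Brute-force nearest-neighbour over n embeddings stored row-major in one
// contiguous buffer. DIM = 128 and dot-product scoring are illustrative.
constexpr std::size_t DIM = 128;

std::size_t nearest(const std::vector<float>& db, const float* query) {
    const std::size_t n = db.size() / DIM;
    std::size_t best = 0;
    float best_score = -1e30f;
    for (std::size_t i = 0; i < n; ++i) {
        float score = 0.0f;
        for (std::size_t d = 0; d < DIM; ++d)
            score += db[i * DIM + d] * query[d];  // hot loop, vectorizes well
        if (score > best_score) { best_score = score; best = i; }
    }
    return best;
}
```

A million 128-float vectors is roughly 512 MB to stream per query, which at typical DRAM bandwidth does land in the few-milliseconds range.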

What you think of as "vector" processing is currently being used by compilers to speed up things you didn't think were vectorizable. This is possible only because these instructions are pretty cheap latency-wise. By introducing huge latency, you'd be ruining performance of autovectorization, which accounts for a lot of the performance gains in the past decade.
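As a concrete illustration (a standard textbook example, not from the comment above), this kind of scalar loop is routinely auto-vectorized:

```cpp
#include <cstddef>

// A scalar loop that GCC and Clang routinely auto-vectorize at -O2/-O3
// (check with gcc -fopt-info-vec or clang -Rpass=loop-vectorize).
// __restrict is a common compiler extension promising no aliasing.
void saxpy(float* __restrict y, const float* __restrict x,
           float a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];  // compiles down to packed SIMD mul/add
}
```

The whole win depends on those packed mul/add instructions being nearly as cheap as their scalar counterparts; add big per-instruction latency and the compiler would have to stop doing this.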

It seems like you're saying you'd rather have a slower implementation, given that a number of single instructions useful for this sort of thing aren't exposed in the Vector API and must be built from sequences of Vector methods, each of which is itself implemented using multiple instructions.

By vector operations do you mean using something like the Accelerate framework? Or SSE/NEON primitives? Or just retooling your code so that your compiler can attempt to vectorize where possible?

It's not the vector instructions, it's the careful scheduling of instructions: spending just enough time manipulating pointers when what you actually want is to crunch data, all while respecting dependency chains and memory stall times. (Hyperthreading helps a lot with the latter; see Nvidia MaxAs (its author is now at Nervana Systems) for details on how a flexible number of threads lets you weigh hiding memory-load stalls against the register pressure that causes more data shuffling.)

Java does as well; check out the Vector API.

Where is vectored I/O used in practice? I'd think the requirement of using multiple buffers for scatter/gather is not very efficient...
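One common case is framed network writes: a protocol header and a payload that live in separate buffers can go out in a single syscall without first being copied into one contiguous buffer. A sketch using POSIX writev (the 8-byte length-prefix framing here is made up for illustration):

```cpp
#include <cstddef>
#include <cstring>
#include <sys/uio.h>   // writev, struct iovec (POSIX)
#include <unistd.h>

// Write a length-prefixed frame in one syscall: header and payload stay
// in their own buffers instead of being memcpy'd into a combined one.
ssize_t send_framed(int fd, const char* payload, std::size_t len) {
    char header[8];
    std::memcpy(header, &len, sizeof(len));       // assumes 64-bit size_t
    iovec iov[2];
    iov[0] = {header, sizeof(header)};
    iov[1] = {const_cast<char*>(payload), len};   // iov_base is non-const
    return writev(fd, iov, 2);
}
```

Gather reads with readv/preadv get used the same way by databases and network stacks, precisely to avoid the copy into a single buffer.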

You lose a lot of performance not using vectorised functions. Maybe not an issue if you're only dealing with small amounts of data.

Unfortunately, the complexity argument is generally bullshit and you really need to profile. It turns out multiple very respected authors have found that, under typical workloads, `vector` performs very well on a lot of machines.
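The canonical experiment (often attributed to Stroustrup's vector-vs-list talk) is easy to reproduce yourself; here's a rough sketch, with the element count chosen arbitrarily:

```cpp
#include <chrono>
#include <cstdio>
#include <list>
#include <vector>

// Insert-in-sorted-order: the classic case where list "should" win on
// paper (O(1) splice) but vector usually wins in practice (cache locality).
template <typename Seq>
long long insert_sorted_us(int n) {
    Seq s;
    auto t0 = std::chrono::steady_clock::now();
    for (int v = 0; v < n; ++v) {
        int key = v % 1000;
        auto it = s.begin();
        while (it != s.end() && *it < key) ++it;  // linear search in both
        s.insert(it, key);
    }
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}

int main() {
    std::printf("vector: %lld us\n", insert_sorted_us<std::vector<int>>(20000));
    std::printf("list:   %lld us\n", insert_sorted_us<std::list<int>>(20000));
}
```

On most machines the vector wins despite doing O(n) element moves per insert, because the linear traversal dominates and contiguous memory is far friendlier to the cache.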

[citation needed, in the form of actual benchmarks]

The thing is that list fusion and whatnot is all just there to get around the handicap that was placed there in the first place by the language paradigm. So you start by insisting on shooting yourself in the foot, then put lots of armor on your boot so the bullet hopefully bounces off.

I assume by "vectors" you mean arrays ... there is no case in which this can be faster than arrays, because in the limit, if the list fusion system works perfectly, it is just making an array. A thing can't be faster than itself.


Yup, Vector is the one I have in mind. (Vector is also an array library.)

I already read that article about ILL; it did not make much sense to me.

Where do you explain why the vector is slower?


Vectors can be slow if you create and destroy them a lot, since they allocate. You can work around this to some extent by providing a custom allocator, but using something like LLVM's SmallVector or absl::InlinedVector can be much faster when the typical N is known.
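For illustration, absl::InlinedVector keeps the first N elements inside the object itself (a sketch; the N = 8 and the surrounding function are made up):

```cpp
#include "absl/container/inlined_vector.h"  // requires the Abseil library

// The first 8 elements live inline in the object; the heap is only
// touched if the vector grows past that. N = 8 is a tuning choice you
// make from knowing your workload's typical sizes.
void collect_small(int n) {
    absl::InlinedVector<int, 8> items;
    for (int i = 0; i < n; ++i)
        items.push_back(i);  // no heap allocation while size() <= 8
}
```

The trade-off is that the object itself gets bigger (it always reserves the inline buffer), so it pays off on hot paths that create and destroy many small vectors, not for long-lived large ones.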

Vector API is very far from being hardware intrinsics. Unfortunately for Java programmers, it’s merely a least common denominator of SIMD instructions across different ISAs. This makes the feature very limited by design, IMO borderline useless.

Heard the same thing from Bjarne Stroustrup. The cache properties of vectors are incredible.

Cool to see languages besides C running on small hardware.

I would guess that memory consumption, not speed, is the limiting factor vs. C. I skimmed through the source code and couldn't find a way to define heterogeneous packed data types (i.e. structs). That would be a serious turn-off for me. Cons cells are a lot of overhead. At least it has vectors.
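To put rough numbers on the cons-cell overhead, a back-of-the-envelope sketch (in C++ only because the sizes are the point; the layouts shown are typical 64-bit ones, not taken from this implementation):

```cpp
#include <cstdio>

// A packed struct of floats: 12 contiguous bytes per point.
struct Point { float x, y, z; };

// A typical cons cell: two pointers, 16 bytes per list element on a
// 64-bit machine, before counting the boxed values they point at.
struct Cons { void* car; Cons* cdr; };

int main() {
    std::printf("packed Point:      %zu bytes\n", sizeof(Point));      // 12
    std::printf("one cons cell:     %zu bytes\n", sizeof(Cons));       // 16
    std::printf("3-cell list spine: %zu bytes\n", 3 * sizeof(Cons));   // 48
}
```

The list spine alone is 4x the packed struct, before counting the boxed floats themselves, which is why vectors help but packed structs would matter on a memory-constrained target.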

