You do, in fact. It’s called a memory write barrier. It ensures consistency of data structures as needed. And it can stall the CPU pipeline, so there’s a nontrivial cost involved.
They both involve flushing cache to backing stores, and waiting for confirmation of the write. It’s literally the same thing. It’s just that writing a cache line to RAM is orders of magnitude faster than writing a disk sector to storage, even with NVMe SSDs. Optane is/was somewhere in the middle.
> They both involve flushing cache to backing stores, and waiting for confirmation of the write.
No they don't. A fence only imposes ordering. It's instant. It can increase the chance of a stall when it forbids certain optimizations, but it won't cause a stall by itself.
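To make "only imposes ordering" concrete, here is a minimal sketch in Rust using `std::sync::atomic::fence`. The function name `publish_and_read` and the variables are made up for illustration. The release fence guarantees that the data store becomes visible before the flag store, and the acquire fence guarantees the reader sees the data once it has seen the flag. Nothing in the language-level contract says anything about writing cache lines back to RAM; what the hardware does underneath is a separate question this sketch doesn't answer.

```rust
use std::sync::atomic::{fence, AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: one thread publishes `value`, the caller reads it back.
/// All loads and stores are Relaxed; only the two fences provide ordering.
fn publish_and_read(value: u64) -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(value, Ordering::Relaxed);
            // Release fence: the data store above is ordered before the flag store below.
            fence(Ordering::Release);
            ready.store(true, Ordering::Relaxed);
        })
    };

    // Spin until the flag is observed.
    while !ready.load(Ordering::Relaxed) {
        std::hint::spin_loop();
    }
    // Acquire fence: pairs with the release fence, so the data load below
    // is guaranteed to observe the producer's store.
    fence(Ordering::Acquire);
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}
```

Calling `publish_and_read(42)` returns 42 on every run. Without the two fences, the Relaxed accesses alone would not guarantee that.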
CLWB is a small flush, but as tanelpoder explained, the more recent CPUs did not need CLWB.
Do you mean the latency of ensuring fsync safety is lower?
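For reference, this is roughly what "fsync safety" means at the application level, sketched in Rust. The helper name `durable_append` is hypothetical. `File::sync_all` asks the OS to push the file's data and metadata to the storage device and blocks until that is reported complete (on Unix-like systems this is typically an `fsync` call). That wait is where the latency being discussed shows up.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: append `record` to the file at `path` and do not
/// return until the OS reports the data has reached the storage device.
fn durable_append(path: &Path, record: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    // write_all only hands the bytes to the OS; they may still sit in the page cache.
    file.write_all(record)?;
    // sync_all blocks until data and metadata are flushed to the device.
    file.sync_all()?;
    Ok(())
}
```

The sketch only covers the file's own contents. Making a newly created file's directory entry durable may need a separate sync on the parent directory, depending on the filesystem.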