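Unless I've misunderstood something, sliding-window attention should reduce memory usage at inference compared to full attention with FlashAttention, since the KV cache only has to hold the most recent window of tokens rather than the whole context.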
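
To put rough numbers on that, here is a back-of-the-envelope sketch of KV cache size with and without a sliding window. The model shape (32 layers, 8 KV heads, head dimension 128, fp16) and the 4096-token window are made-up illustrative values, not taken from any particular model, and the helper `kv_cache_bytes` is hypothetical. It also assumes the implementation actually evicts tokens that fall outside the window.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim,
                   bytes_per_elem=2, window=None):
    """Estimate KV cache size in bytes for a single sequence.

    All parameters are illustrative assumptions. If `window` is given,
    only the most recent `window` tokens are assumed to be cached.
    """
    cached_tokens = seq_len if window is None else min(seq_len, window)
    # Factor of 2 covers keys and values, stored per layer and per KV head.
    return 2 * n_layers * n_kv_heads * head_dim * cached_tokens * bytes_per_elem


# Hypothetical model shape and context length, fp16 (2 bytes per element).
full = kv_cache_bytes(32768, n_layers=32, n_kv_heads=8, head_dim=128)
sliding = kv_cache_bytes(32768, n_layers=32, n_kv_heads=8, head_dim=128,
                         window=4096)

print(f"full attention cache: {full / 2**30:.2f} GiB")
print(f"sliding window cache: {sliding / 2**30:.2f} GiB")
print(f"ratio: {full // sliding}x")
```

Under these assumptions the cache stops growing once the sequence is longer than the window, so the saving scales with context length divided by window size. FlashAttention by itself avoids materializing the attention matrix but does not shrink the KV cache.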

