izard ([personal profile] izard) wrote 2010-01-28 01:45 pm

Cache partitioning.

Shared cache in a CPU is a great thing for multicore: it allows efficient data sharing between cores and almost always shares capacity efficiently.

What if a developer thinks the cache is not shared fairly between, e.g., 2 cores? There are no means to control this explicitly. But here is a workaround, albeit a weird one. If we write a custom allocator that only hands out data at addresses that map to cache sets 0-7 on the 1st core and to sets 8-15 on the 2nd core, then we effectively make the cache a non-shared one. Unfortunately, the biggest contiguous area of memory such an allocator could hand out is then 512 bytes (a 64-byte cache line multiplied by 16 cache sets, divided by 2 cores). The more data is allocated through this "weird cache-conscious allocator", the fairer the split gets.
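The set arithmetic behind this can be sketched roughly as follows. This is only a toy model, not a real allocator: it assumes a cache with 64-byte lines and just 16 sets (the numbers used above), simple modulo indexing, and that the addresses are the ones the cache actually indexes with. All names here (`set_index`, `owner_core`, `core_regions`) are made up for illustration.

```python
LINE_SIZE = 64   # bytes per cache line (from the post)
NUM_SETS = 16    # toy value matching the post's arithmetic (assumption)
NUM_CORES = 2

SETS_PER_CORE = NUM_SETS // NUM_CORES


def set_index(addr: int) -> int:
    """Cache set an address maps to, assuming plain modulo indexing."""
    return (addr // LINE_SIZE) % NUM_SETS


def owner_core(addr: int) -> int:
    """Core whose half of the sets this address falls into (0 or 1)."""
    return set_index(addr) // SETS_PER_CORE


def max_contiguous_bytes() -> int:
    """Largest contiguous chunk that stays inside one core's sets."""
    return LINE_SIZE * NUM_SETS // NUM_CORES


def core_regions(core: int, arena_base: int, arena_size: int):
    """Yield (start, length) chunks of the arena usable by `core`.

    The set pattern repeats every LINE_SIZE * NUM_SETS bytes, so each
    core gets one chunk per repetition. Any partial repetition at an
    unaligned start of the arena is simply skipped.
    """
    stride = LINE_SIZE * NUM_SETS
    chunk = max_contiguous_bytes()
    first = -(-arena_base // stride) * stride  # round up to stride
    start = first + core * chunk
    while start + chunk <= arena_base + arena_size:
        yield (start, chunk)
        start += stride
```

For example, `list(core_regions(1, 0, 4096))` gives four 512-byte chunks for the second core, and every address inside them maps to sets 8-15. The gaps between the chunks belong to the other core, which is exactly why nothing bigger than 512 bytes can be allocated in one piece.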

The 512-byte cap is very annoying and thus likely not realistic for practical use, but if we had a 128-way rather than a 16-way last-level shared cache, the cap would go up to 4k, which would work naturally with OS VM mechanics :). Fortunately, the last-level cache is tagged by physical address, so the OS could back a contiguous virtual range with physical pages that all land in the right part of the cache. That would lift the contiguous memory limit altogether rather than just raise it to 4k, and would move the complexity from the allocator to the OS.

Upd: after careful study of prior art, it looks like I've reinvented the wheel, and made it a square one rather than a round one. There is a better way to partition a shared cache than the one I've described above.