Description
Java has a generational collector that relies on the fact that most objects die young. Sometimes the young generation comes under pressure because large objects are being allocated, and some objects that were actually going to die young are promoted to the old generation in order to make space for new objects. This is bad because collections of the old generation are typically much more costly, so we should try to avoid it whenever possible.
So here comes the cache recycler: by reusing large objects, these objects are promoted to the old generation (because we keep strong refs on them), but on the other hand they help reduce the allocation rate of large objects in the young generation, and this makes short-lived objects less likely to be promoted to the old generation.
Although this is nice from a GC point of view, it can have bad implications on the application side. Today, these cached data structures grow over time and at some point they may become oversized for the data that they need to carry. For example, if you store 10 entries in a hash table of capacity 1M, the cost of iterating over all entries is proportional to 1M, not 10. Moreover, oversized data structures also tend not to play nicely with CPU caches.
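To make the iteration cost concrete, here is a minimal sketch of an open-addressing hash set. `OversizedTableDemo` and `LongHashSet` are hypothetical names made up for this illustration, not the structures the cache recycler actually holds; the point is only that a full scan inspects every slot, whatever the number of entries.

```java
import java.util.Arrays;

final class OversizedTableDemo {
    /** Minimal open-addressing set of non-negative longs (illustration only, no resizing). */
    static final class LongHashSet {
        private static final long EMPTY = -1L; // marks a free slot
        private final long[] slots;

        LongHashSet(int capacity) {
            if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
                throw new IllegalArgumentException("capacity must be a positive power of two");
            }
            slots = new long[capacity];
            Arrays.fill(slots, EMPTY);
        }

        void add(long key) {
            if (key < 0) {
                throw new IllegalArgumentException("only non-negative keys are supported");
            }
            int mask = slots.length - 1;
            int i = Long.hashCode(key) & mask;
            // Linear probing; assumes the table is never full.
            while (slots[i] != EMPTY && slots[i] != key) {
                i = (i + 1) & mask;
            }
            slots[i] = key;
        }

        /** Scans the table and returns {entries found, slots inspected}. */
        long[] scan() {
            long entries = 0;
            long inspected = 0;
            for (long slot : slots) {
                inspected++;
                if (slot != EMPTY) {
                    entries++;
                }
            }
            return new long[] {entries, inspected};
        }
    }

    public static void main(String[] args) {
        LongHashSet set = new LongHashSet(1 << 20); // a recycled, oversized table
        for (long key = 0; key < 10; key++) {
            set.add(key);
        }
        long[] result = set.scan();
        // prints "10 entries, 1048576 slots inspected"
        System.out.println(result[0] + " entries, " + result[1] + " slots inspected");
    }
}
```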
One idea to improve this: instead of recycling whole data structures, we could build paged data structures and recycle pages individually. This is nice on several levels:
- it would retain the advantage of cache recycling while avoiding over-sized structures
- the recycling logic would be simpler since it would only care about recycling fixed-length arrays
- it is easy to compute the size of Java arrays, so we could do byte accounting and make the cache size (in bytes) configurable, e.g. "reuse at most 50MB of memory per thread".
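A rough sketch of what such a page recycler with byte accounting could look like. Everything here is an assumption made for illustration: the names `PageRecyclerSketch` and `PageRecycler`, the page size of 2048 longs, and the 16-byte array header (typical for 64-bit HotSpot, but JVM-dependent).

```java
import java.util.ArrayDeque;
import java.util.Arrays;

final class PageRecyclerSketch {
    static final int PAGE_SIZE = 2048;          // longs per page (hypothetical value)
    static final long ARRAY_HEADER_BYTES = 16;  // assumption: typical 64-bit HotSpot header
    static final long PAGE_BYTES = ARRAY_HEADER_BYTES + (long) Long.BYTES * PAGE_SIZE;

    /** Caches fixed-length long[] pages up to a byte budget. Not thread-safe by design. */
    static final class PageRecycler {
        private final ArrayDeque<long[]> free = new ArrayDeque<>();
        private final long maxCachedBytes;
        private long cachedBytes;

        PageRecycler(long maxCachedBytes) {
            this.maxCachedBytes = maxCachedBytes;
        }

        /** Returns a zero-filled page, reusing a cached one when available. */
        long[] obtain() {
            long[] page = free.pollFirst();
            if (page == null) {
                return new long[PAGE_SIZE];
            }
            cachedBytes -= PAGE_BYTES;
            Arrays.fill(page, 0L); // recycled pages may hold stale data
            return page;
        }

        /** Offers a page back; returns false if the budget is full and the page is left to the GC. */
        boolean release(long[] page) {
            if (page.length != PAGE_SIZE) {
                throw new IllegalArgumentException("unexpected page length: " + page.length);
            }
            if (cachedBytes + PAGE_BYTES > maxCachedBytes) {
                return false;
            }
            free.addFirst(page);
            cachedBytes += PAGE_BYTES;
            return true;
        }

        long cachedBytes() {
            return cachedBytes;
        }
    }

    // One recycler per thread, each allowed to keep at most 50MB of pages.
    static final ThreadLocal<PageRecycler> PER_THREAD =
            ThreadLocal.withInitial(() -> new PageRecycler(50L * 1024 * 1024));

    public static void main(String[] args) {
        PageRecycler recycler = PER_THREAD.get();
        long[] page = recycler.obtain();
        recycler.release(page);
        System.out.println("cached bytes: " + recycler.cachedBytes());
    }
}
```

Keeping one recycler per thread avoids synchronization, at the cost of a total footprint that scales with the number of threads; a shared, bounded pool would be the obvious alternative to compare against.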
My plan is to use aggregations to experiment with this idea: they already use paged arrays and hash tables that we could modify to reuse pages.
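As an illustration of how a paged array could be wired to reuse pages, here is a minimal sketch. `PagedLongArray` is a hypothetical class written for this issue, not the one aggregations currently use; pages come from a `Supplier` and go back through a `Consumer`, so any recycling policy can be plugged in. It assumes the supplier hands out zero-filled pages of exactly `PAGE_SIZE` longs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class PagedLongArray implements AutoCloseable {
    static final int PAGE_SHIFT = 11;              // hypothetical: 2048 longs per page
    static final int PAGE_SIZE = 1 << PAGE_SHIFT;
    static final int PAGE_MASK = PAGE_SIZE - 1;

    private final List<long[]> pages = new ArrayList<>();
    private final Supplier<long[]> pageSource;     // where new pages come from
    private final Consumer<long[]> pageSink;       // where pages go on close()

    PagedLongArray(Supplier<long[]> pageSource, Consumer<long[]> pageSink) {
        this.pageSource = pageSource;
        this.pageSink = pageSink;
    }

    long get(long index) {
        int pageIndex = pageIndex(index);
        if (pageIndex >= pages.size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return pages.get(pageIndex)[(int) (index & PAGE_MASK)];
    }

    /** Stores a value, growing the array one page at a time as needed. */
    void set(long index, long value) {
        int pageIndex = pageIndex(index);
        while (pages.size() <= pageIndex) {
            pages.add(pageSource.get());
        }
        pages.get(pageIndex)[(int) (index & PAGE_MASK)] = value;
    }

    int pageCount() {
        return pages.size();
    }

    /** Hands every page back to the sink so that it can be reused. */
    @Override
    public void close() {
        for (long[] page : pages) {
            pageSink.accept(page);
        }
        pages.clear();
    }

    private static int pageIndex(long index) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return Math.toIntExact(index >>> PAGE_SHIFT);
    }

    public static void main(String[] args) {
        // Trivial unbounded pool, for demonstration only (it does not zero pages on reuse).
        ArrayDeque<long[]> pool = new ArrayDeque<>();
        try (PagedLongArray array = new PagedLongArray(
                () -> pool.isEmpty() ? new long[PAGE_SIZE] : pool.pop(), pool::push)) {
            array.set(5000, 42L);
            System.out.println(array.pageCount() + " pages in use"); // prints "3 pages in use"
        }
        System.out.println(pool.size() + " pages recycled"); // prints "3 pages recycled"
    }
}
```

Because the structure only ever holds as many pages as its highest index requires, a small request never inherits a large backing array from a previous request.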