We have a great technical team here at Infinio, and starting today they'll be sharing their points of view on technology, practitioner tips, and technical answers to common questions.
Today's question is:
"Dear Cache Guys,
After installing a cache product the system consumes more CPU; is that normal? If so, why?"
That’s a great question, and yes, it is normal. There are a few reasons why.
First, the host no longer spends time waiting for disk when the cache contains the requested blocks. The normal application pattern is to request blocks from disk, wait for the disk to return them, and then do some processing. With a disk cache, the wait is eliminated for blocks served from the cache, so a lot more work gets done in the same amount of time.
A good analogy is a wood cutter. He carries an armload of wood into the axe yard, splits it, and then walks back to the wood pile for another armload. The distance between the wood pile and the axe yard determines how much time is spent working versus walking. In short, time spent walking is time not spent splitting wood. The wood cutter will be more productive if he can greatly reduce or eliminate the walking, even though it means he works harder.
A cache with a high hit rate is like a wood pile that's inside the axe yard: no more walking back and forth for more wood, so the wood cutter gets more wood split. No more waiting for the I/O request to be returned from the storage, so the CPU gets more work done.
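To put rough numbers on the wood cutter analogy, here is a minimal sketch of a single worker that alternates between computing and waiting on storage. The function names and the millisecond figures are our own illustrative assumptions, not measurements, and the model assumes a cache hit costs no wait time at all.

```python
def average_wait_ms(disk_wait_ms, hit_rate):
    """Average storage wait per request; cache hits are assumed to cost zero wait."""
    return (1.0 - hit_rate) * disk_wait_ms


def cpu_utilization(compute_ms, disk_wait_ms, hit_rate):
    """Fraction of wall-clock time the CPU is busy for one synchronous worker."""
    wait_ms = average_wait_ms(disk_wait_ms, hit_rate)
    return compute_ms / (compute_ms + wait_ms)


def requests_per_second(compute_ms, disk_wait_ms, hit_rate):
    """Requests completed per second by the same worker."""
    wait_ms = average_wait_ms(disk_wait_ms, hit_rate)
    return 1000.0 / (compute_ms + wait_ms)


if __name__ == "__main__":
    # Hypothetical workload: 1 ms of processing and 4 ms of disk wait per request.
    for hit_rate in (0.0, 0.5, 0.9, 1.0):
        busy = cpu_utilization(1.0, 4.0, hit_rate)
        rate = requests_per_second(1.0, 4.0, hit_rate)
        print(f"hit rate {hit_rate:.0%}: CPU busy {busy:.0%}, {rate:.0f} requests/s")
```

In this simplified model, CPU utilization and throughput rise together as the hit rate climbs: the CPU is busier precisely because it is finishing more requests.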
The second answer to why CPU utilization increases is that a cache does require a small amount of CPU to function. Computers have limited resources and there are always tradeoffs; when a cache is installed it will use RAM and CPU to make storage I/O much faster. A cache uses the most CPU when it starts up and is “warming” because it is retrieving blocks from storage and performing computations to save them into the cache. Once the cache is loaded, the extra CPU consumption is negligible; in fact, it's roughly the same amount required to request disk blocks.
A third cause of increased CPU utilization comes from shared storage getting faster. Consider an environment where several servers are connected to a shared storage device. When a cache offloads disk requests, it is also reducing the load on the storage. The shared storage gets more responsive and CPU consumption goes up because more work gets done. If the shared storage isn't overloaded to begin with then this doesn’t come into play, but in many cases it's another benefit from using a disk cache.
We built the Infinio cache to use CPU as well as possible. Our cache limits the amount of CPU that is used on startup so servers aren't overwhelmed when the cache is warming. And because Infinio is a content-based cache, identical blocks are deduplicated, which means the cache reaches a higher hit rate (and thus lower CPU consumption for cache purposes) relatively quickly.
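For readers curious about what "content-based" means in general, here is a minimal sketch of the idea. It is not Infinio's implementation: the class name, the choice of SHA-256, and the simple oldest-first eviction are all assumptions made for illustration. Blocks are stored under a digest of their content, so identical blocks at different addresses occupy a single cache slot.

```python
import hashlib


class ContentCache:
    """Toy content-addressed block cache (hypothetical, for illustration only)."""

    def __init__(self, capacity_blocks):
        # Assumes capacity_blocks >= 1.
        self.capacity_blocks = capacity_blocks
        self.digest_by_addr = {}  # block address -> content digest
        self.data_by_digest = {}  # content digest -> block bytes, stored once

    def insert(self, addr, data):
        """Record a block that was just read from storage."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.data_by_digest:
            if len(self.data_by_digest) >= self.capacity_blocks:
                # Evict the oldest stored content (dicts keep insertion order).
                oldest = next(iter(self.data_by_digest))
                del self.data_by_digest[oldest]
            self.data_by_digest[digest] = data
        self.digest_by_addr[addr] = digest

    def read(self, addr):
        """Return the cached bytes for addr, or None on a miss."""
        digest = self.digest_by_addr.get(addr)
        if digest is None:
            return None
        return self.data_by_digest.get(digest)
```

With a capacity of two blocks, this sketch can still serve four addresses from cache if they hold only two distinct contents between them. That is the sense in which deduplication lets a cache of a given size cover more of the working set.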