Caching is one of the simplest ways to make websites and applications feel faster. Done well, it reduces load on your servers, improves user experience, and can cut infrastructure costs. Done badly, it causes stale data, confusing bugs, and hard-to-diagnose performance issues.
This guide explains caching in plain English, then walks through the most useful cache strategies, where they fit, and the practical best practices that help you avoid common pitfalls. Whether you run a content site, an ecommerce shop, or an API-driven product, you will find patterns you can apply immediately.
What is caching and why it matters
A cache is a place that stores a copy of data so it can be served faster next time. Instead of recalculating a result or fetching it from a slower system, you reuse a saved response.
Most systems use multiple layers of caching at once, for example:
- Browser caching for static files like images, CSS, and JavaScript.
- CDN caching at the edge, close to users, to reduce latency.
- Server-side caching for rendered pages, API responses, or computed results.
- Database caching to reduce repetitive queries and improve throughput.
- In-memory caching using tools like Redis or Memcached for very fast reads.
The value is simple: caching reduces repeated work. That usually means lower response times, fewer database hits, and better resilience under traffic spikes.
How caching works in practice
At a high level, caching follows a loop:
- A request arrives for some data.
- The system checks the cache for a stored copy.
- If it exists and is still valid, you get a cache hit and return it quickly.
- If not, you get a cache miss, fetch or compute the data, store it, then return it.
Two ideas control most caching behaviour:
- Cache keys: how you uniquely identify what is stored. Poor keys cause incorrect data to be served.
- Freshness rules: how long data stays valid, often managed with a TTL (time to live) or explicit invalidation.
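The hit/miss loop, cache keys, and TTL-based freshness can be sketched with a minimal in-process cache. This is an illustrative sketch, not any particular library's API; the `TTLCache` name and the key format are invented for the example.

```python
import time

class TTLCache:
    """Minimal in-process cache: a dict mapping key -> (value, expiry time)."""

    def __init__(self, default_ttl=60.0):
        self._store = {}
        self._default_ttl = default_ttl

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None                   # cache miss: nothing stored
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]          # past its TTL: treat as a miss
            return None
        return value                      # cache hit

    def set(self, key, value, ttl=None):
        ttl = self._default_ttl if ttl is None else ttl
        self._store[key] = (value, time.monotonic() + ttl)

cache = TTLCache(default_ttl=0.05)        # deliberately tiny TTL for the demo
cache.set("user:42:profile", {"name": "Ada"})
hit = cache.get("user:42:profile")        # fresh: returns the stored value
time.sleep(0.06)
miss = cache.get("user:42:profile")       # expired: returns None
```

Note the key `user:42:profile` encodes exactly what identifies the data; a key missing the user ID would serve one user's profile to everyone.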
Cache hit ratio and why it is not the only metric
A high hit ratio is usually good, but it is not the whole story. You also need to consider:
- How expensive a miss is.
- Whether cached responses are correct for each user and context.
- How quickly you can recover from stale or poisoned cache entries.
Caching strategies you should know
Different cache strategies suit different systems. The best choice depends on how often data changes, how critical freshness is, and where the bottlenecks sit.
Cache-aside (lazy loading)
With cache-aside, the application checks the cache first. On a miss, it loads from the database or service, then writes the result to the cache.
- Pros: simple, flexible, works well for read-heavy workloads.
- Cons: first request is slow, and you must handle cache invalidation carefully.
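The cache-aside flow can be sketched in a few lines. A plain dict stands in for the cache store, and `load_product_from_db` is a hypothetical stand-in for a slow database query; the counter just makes the hit/miss behaviour visible.

```python
import time

db_calls = 0

def load_product_from_db(product_id):
    """Stand-in for a slow database query (hypothetical schema)."""
    global db_calls
    db_calls += 1
    return {"id": product_id, "price": 9.99}

cache = {}    # key -> (value, expires_at); a dict standing in for a cache store
TTL = 60.0

def get_product(product_id):
    key = f"product:{product_id}"
    entry = cache.get(key)
    if entry and entry[1] > time.monotonic():
        return entry[0]                               # hit: serve from cache
    value = load_product_from_db(product_id)          # miss: load from source
    cache[key] = (value, time.monotonic() + TTL)      # then populate the cache
    return value

first = get_product(1)    # miss: goes to the database
second = get_product(1)   # hit: no database call
```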
Read-through caching
Read-through caching moves the loading logic into the cache layer. The app asks the cache for data, and the cache fetches it from the source if needed.
- Pros: cleaner application code, consistent behaviour.
- Cons: more complex cache infrastructure, harder to customise per endpoint.
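The contrast with cache-aside is where the loading logic lives. In this sketch the cache object owns the loader, so callers never touch the source directly; the class and function names are illustrative.

```python
class ReadThroughCache:
    """The cache layer owns the loading logic; callers only ever call get()."""

    def __init__(self, loader):
        self._loader = loader     # function that fetches from the source
        self._store = {}

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)  # cache fetches on a miss
        return self._store[key]

loads = []

def load_from_source(key):
    """Stand-in for a fetch from the database or an upstream service."""
    loads.append(key)
    return key.upper()

cache = ReadThroughCache(load_from_source)
a = cache.get("hello")   # miss: the cache calls the loader itself
b = cache.get("hello")   # hit: the loader is not called again
```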
Write-through caching
With write-through, every write goes to the cache and the database together. Reads can then be served from cache with high confidence.
- Pros: fewer stale reads, predictable behaviour.
- Cons: higher write latency, extra load on the cache.
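A minimal write-through sketch, with dicts standing in for both the database and the cache. The point is that a single write path updates both stores, so reads from cache stay consistent with the source of truth.

```python
database = {}   # stand-in for the system of record
cache = {}

def write_through(key, value):
    """Every write updates the database and the cache together."""
    database[key] = value   # write to the source of truth
    cache[key] = value      # keep the cache in step

def read(key):
    # Reads can trust the cache, falling back to the database if needed.
    return cache.get(key, database.get(key))

write_through("stock:sku-1", 12)
write_through("stock:sku-1", 11)   # both stores see the newer value
```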
Write-back (write-behind) caching
Write-back stores writes in cache first and flushes them to the database later.
- Pros: very fast writes, good for high-throughput systems.
- Cons: risk of data loss if the cache fails, more complex consistency guarantees.
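A write-back sketch: writes land in the cache and are marked dirty, and a separate flush step persists them later. In a real system the flush would run on a timer or queue; here it is called by hand, and the data-loss risk is visible in the window before `flush()` runs.

```python
cache = {}
database = {}
dirty = set()   # keys written to the cache but not yet flushed

def write_back(key, value):
    """Writes land in the cache first; the database is updated later."""
    cache[key] = value
    dirty.add(key)

def flush():
    """Persist pending writes (would run on a timer or background worker)."""
    for key in list(dirty):
        database[key] = cache[key]
        dirty.discard(key)

write_back("counter", 1)
write_back("counter", 2)                 # second write coalesces with the first
before_flush = database.get("counter")   # None: nothing persisted yet
flush()
after_flush = database.get("counter")    # now the database has the latest value
```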
Cache invalidation approaches
Most caching problems are really cache invalidation problems. You typically choose one of these approaches:
- TTL-based: data expires after a set time. Simple and safe, but can serve stale data until expiry.
- Event-based invalidation: purge or update the cache when data changes. Fresher, but requires reliable events.
- Versioned keys: change the key when content changes, often used for static assets.
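The versioned-key approach can be sketched like this: bump a version number when content changes, so the old cache entry simply becomes unreachable and ages out on its own. The item names and the key format are invented for the example.

```python
cache = {}
versions = {"homepage": 1}   # current version per logical item

def cache_key(name):
    # The version is part of the key, so a bump "invalidates" old entries
    # without deleting anything.
    return f"{name}:v{versions[name]}"

def get_page(name, render):
    key = cache_key(name)
    if key not in cache:
        cache[key] = render()
    return cache[key]

first = get_page("homepage", lambda: "old content")
versions["homepage"] += 1    # content changed: bump the version
second = get_page("homepage", lambda: "new content")
```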
Where to cache: common layers and what to store
Browser caching for static assets
Browser caching is ideal for files that rarely change. Use long cache lifetimes for versioned assets, and short lifetimes for anything that can change without a filename change.
Practical tip: use fingerprinted filenames for CSS and JavaScript so you can set a long max-age without worrying about users seeing old files.
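Fingerprinting usually means embedding a short hash of the file's contents in its name, so any change produces a new filename and a guaranteed cache miss. A minimal sketch (the truncation to 8 hex characters is an arbitrary choice for the example):

```python
import hashlib

def fingerprint_filename(name, content):
    """Embed a short content hash in the filename, e.g. app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, ext = name.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{name}.{digest}"

# Different contents yield different filenames, so browsers and CDNs
# holding the old file can never serve it for the new URL.
v1 = fingerprint_filename("app.js", b"console.log('v1')")
v2 = fingerprint_filename("app.js", b"console.log('v2')")
```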
CDN caching at the edge
CDN caching reduces latency by serving content from locations close to users. It also protects your origin servers during spikes.
- Cache images, fonts, scripts, and public pages where appropriate.
- Be careful with personalised pages. If you cache them at the edge, you must vary by the right headers or avoid caching altogether.
Server-side caching for pages and API responses
Server-side caching can store rendered HTML, fragments, or API responses. It is often the biggest win for dynamic sites where database queries are the bottleneck.
- Cache expensive computations like pricing rules, search facets, or recommendation blocks.
- Cache API responses that are identical for many users.
- Use short TTLs for fast-changing data such as stock levels.
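One way to keep those freshness rules explicit and reviewable is a small TTL table per content type. The names and numbers here are illustrative, not recommendations for any particular stack.

```python
# Hypothetical per-content-type TTLs, in seconds:
# short for volatile data, long for stable or versioned content.
TTLS = {
    "stock_level": 5,            # changes constantly; tolerate little staleness
    "product_page": 300,
    "category_page": 600,
    "static_asset": 31_536_000,  # a year; safe with fingerprinted filenames
}

def ttl_for(content_type, default=60):
    """Look up the agreed TTL for a content type, with a safe default."""
    return TTLS.get(content_type, default)
```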
Database caching and query optimisation
Database caching can happen inside the database engine, in your application, or via a dedicated cache store. It helps when the same queries run repeatedly.
Do not use caching to hide poor indexing or slow queries. Fix the query plan first, then cache the results where it makes sense.
In-memory caching with Redis or Memcached
In-memory caching is fast because it avoids disk and reduces network round trips. Redis is popular when you need richer data structures, persistence options, or pub/sub. Memcached is often used for simple key-value caching at very high speed.
Cache eviction policies and memory management
Caches have limited space. When they fill up, they remove entries based on an eviction policy. Common cache eviction approaches include:
- LRU (least recently used): removes items not accessed recently. Good general purpose choice.
- LFU (least frequently used): removes items accessed least often. Useful when some keys are consistently popular.
- FIFO (first in, first out): simple, but can remove still useful items.
Choose an eviction policy that matches your traffic patterns. If you have a small set of very popular pages, LFU can work well. If popularity changes quickly, LRU is often safer.
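LRU is straightforward to implement on top of an ordered map: move a key to the "most recent" end on every access, and evict from the other end when the cache is full. A minimal sketch using Python's `OrderedDict`:

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)         # mark as most recently used
        return self._store[key]

    def set(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the least recently used

lru = LRUCache(2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")      # touch "a", so "b" becomes least recently used
lru.set("c", 3)   # over capacity: "b" is evicted
```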
Common caching pitfalls and how to avoid them
Serving the wrong content to the wrong user
This happens when cache keys do not include the right context, such as language, currency, device type, or authentication state. For personalised content, be cautious about caching full pages. Consider caching fragments that are safe to share.
Stale data and inconsistent experiences
Stale data is not always a problem, but it must be a deliberate choice. For example, a news homepage can tolerate a short delay, but a checkout page cannot.
Use different TTLs per content type, and document the expected freshness. If you rely on event-based invalidation, make sure it is reliable and monitored.
Cache stampede
A cache stampede occurs when many requests hit an expired or missing key at the same time, causing a surge of load on the database or upstream service.
- Use request coalescing or locking so only one request rebuilds the value.
- Add jitter to TTLs so many keys do not expire together.
- Pre-warm caches for predictable traffic peaks.
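The first two mitigations can be sketched together: a lock with a double-check so only one caller rebuilds a missing key, and a jitter helper that spreads expiry times. A shared in-process dict stands in for the cache, and the `expensive_rebuild` name is invented for the example.

```python
import random
import threading

lock = threading.Lock()
cache = {}
rebuilds = 0

def expensive_rebuild():
    """Stand-in for a slow database query or computation."""
    global rebuilds
    rebuilds += 1
    return "value"

def get_with_lock(key):
    """Only one caller rebuilds a missing key; the rest wait and reuse it."""
    if key in cache:
        return cache[key]
    with lock:
        if key not in cache:            # re-check after acquiring the lock
            cache[key] = expensive_rebuild()
    return cache[key]

def jittered_ttl(base_ttl, spread=0.1):
    """Randomise TTLs slightly so many keys do not expire at the same instant."""
    return base_ttl * (1 + random.uniform(-spread, spread))

# Eight concurrent requests for the same missing key: one rebuild, not eight.
threads = [threading.Thread(target=get_with_lock, args=("k",)) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
```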
Cache poisoning and security risks
Cache poisoning can occur when an attacker tricks a cache into storing a harmful or incorrect response that is then served to others. Reduce risk by:
- Validating and normalising inputs used in cache keys.
- Avoiding caching responses that depend on untrusted headers unless you explicitly vary on them.
- Setting correct cache control rules for private data.
Hard-to-debug behaviour
Caching can make issues intermittent. Add observability from the start:
- Log whether responses are hits or misses.
- Track hit ratio by endpoint.
- Expose cache age and key metadata in internal headers for troubleshooting.
Best practices for caching that holds up in production
- Start with a clear goal: reduce database load, improve TTFB (time to first byte), or stabilise spikes. Measure before and after.
- Cache the expensive parts: focus on slow queries, heavy computations, and repeated API calls.
- Use sensible TTLs: short for volatile data, longer for stable content. Avoid one-size-fits-all.
- Design cache keys carefully: include only what changes the response. Keep keys consistent and documented.
- Plan invalidation: decide whether TTL, events, or versioning is the right model for each data type.
- Protect against stampedes: use locking, jitter, and pre-warming where needed.
- Do not cache errors by accident: ensure 500 responses and transient failures are not stored.
- Separate public and private content: avoid caching authenticated pages in shared caches unless you are confident in your vary rules.
- Test with real scenarios: include logins, different locales, promotions, and edge cases.
Choosing the right caching approach for your site or app
If you want a practical starting point, use this simple mapping:
- Content sites: strong browser caching and CDN caching for assets, plus server-side caching for rendered pages.
- Ecommerce: cache category pages and product details carefully, keep stock and pricing freshness tight, and avoid caching personalised basket and checkout flows.
- SaaS dashboards: cache shared reference data, use in-memory caching for permissions lookups, and avoid shared caching of user-specific responses unless keys are strict.
- APIs: cache idempotent GET responses, use ETags where possible, and set clear cache control rules.
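For the API case, an ETag lets clients revalidate cheaply: the server hashes the response body into a tag, and a matching `If-None-Match` earns a 304 with an empty body. A simplified sketch of that exchange (the tuple-returning `respond` helper is invented for the example; real frameworks handle the headers for you):

```python
import hashlib

def make_etag(body):
    """Strong ETag derived from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body, if_none_match=None):
    """Return (status, body), honouring a conditional GET."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b""       # client's cached copy is still fresh
    return 200, body

# First request: full response. Second request presents the ETag: 304.
status1, body1 = respond(b'{"ok": true}')
etag = make_etag(b'{"ok": true}')
status2, body2 = respond(b'{"ok": true}', if_none_match=etag)
```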
FAQ: caching
What is the difference between caching and a CDN?
Caching is the general technique of storing copies of data for faster access. A CDN is a network that often provides caching at the edge, closer to users, typically for static assets and sometimes for public HTML responses.
How do I know what to cache first?
Start with your slowest, most repeated work. Look for endpoints with high traffic and high database time. Cache expensive queries or rendered pages that are identical for many users.
How long should a TTL be?
It depends on how quickly the underlying data changes and how much staleness you can tolerate. Use short TTLs for stock, pricing, and time-sensitive information. Use longer TTLs for static content and versioned assets.
Can caching hurt SEO?
It can if it serves the wrong content to crawlers, such as showing a logged in view, incorrect language, or inconsistent canonical tags. Keep caching rules clear, avoid personalisation in shared caches, and test key pages with and without cache.
What causes a cache stampede and how can I prevent it?
A cache stampede happens when many requests rebuild the same expired key at once. Prevent it with request locking, TTL jitter, and pre-warming for predictable peaks.
Is Redis always the best choice for in-memory caching?
No. Redis is powerful and flexible, but Memcached can be simpler and very fast for basic key-value caching. Choose based on your needs for data structures, persistence, clustering, and operational complexity.
What is cache poisoning and should I worry about it?
Cache poisoning is when a cache stores a response that should not be shared, or stores a manipulated response that then reaches other users. You should take it seriously for shared caches like CDNs. Use safe cache keys, validate inputs, and set correct cache control for private data.































