Caching overview

This page documents the traffic routing and caching infrastructure from a high level. It details which layers of the infrastructure exist in edge caching PoPs, which exist in core data centers, and how traffic flows through them.

Cache software

We currently (July 2019) use Apache Traffic Server (ATS) for TLS termination, Varnish as the in-memory cache ("frontend"), and a second ATS instance for on-disk, persistent caching ("backend").

Prior to 2019, we used Nginx for TLS termination (instead of ATS) and a second Varnish instance for the backend cache (instead of ATS). In older documentation, "Varnish" may thus refer to the cache backend (which now runs ATS).

Graph

 __________________
| browser/the webz |
|__________________|
    |
    |
    |
  ____________________
 |  LVS               |
 |    (load balancer) |
 |____________________|
     |
     |
     |
     |
    __________________________________________________________
   |  Edge Frontend (ATS for tls + Varnish cache)             |
   |     Short-lived cache (~10sec, mostly to prevent DDOS)   |
   |     Stored in memory                                     |
   |__________________________________________________________|
         |
         |
         |
       _______________________________
      | Edge Backend (ATS cache)      |
      |  Long-lived cache             |
      |  Stored on disk               |
      |_______________________________|
            |
            |
            |
           _______________________________________
          | Apaches (MediaWiki PHP)               |
          |                                       |
          |   * Cache-Control for page view HTML: |
          |      max-age is 14 days               |
          |   * wikitext parsercache:             |
          |      Expires at 22 days               |
          |       / wgParserCacheExpireTime       |
          |_______________________________________|

Routing

When LVS balances traffic to ports :80 (varnish) and :443 (nginx), it uses a hash of the client IP to help with TCP Fast Open and SSL session persistence, respectively.

Within the caching layer (the cpNNNN machines), the jump from nginx:443 to varnish:80 is direct on the local host.

However, the jump from varnish:80 (frontend) to varnish:3128 (backend) is different: for that jump we hash on the URL (and other request metadata) when balancing to the backends, dividing the cache space among all machines, so the request typically moves from one machine to another within the same cluster.
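
To make the URL hashing concrete, here is a minimal Python sketch of dividing the cache space by hashing the request URL onto a set of backend hosts. The host names are illustrative only, and the production balancing is done inside the cache/routing layer itself, not by a script like this:

  import hashlib

  # Illustrative backend hosts in one caching cluster; real host lists differ.
  BACKENDS = ["cp3040", "cp3041", "cp3042", "cp3043"]

  def pick_backend(url: str) -> str:
      # Hash the URL so every request for the same object lands on the
      # same backend, dividing the cache space among all machines.
      digest = hashlib.md5(url.encode("utf-8")).hexdigest()
      return BACKENDS[int(digest, 16) % len(BACKENDS)]

  print(pick_backend("/wiki/Main_Page"))       # always the same backend
  print(pick_backend("/wiki/Special:Random"))  # possibly a different one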

Diagram of "text" traffic flow through Wikimedia front edge LVS/nginx/Varnish infrastructure.

Legend:

  • eqiad is representative of whichever data center is currently primary (codfw is similar).
  • esams is representative of all caching sites (ulsfo is similar).
  • This diagram is for the "text" cache cluster (see #Cache clusters), but traffic for the "upload" cluster flows similarly.

Cache clusters

Current cache clusters in all data centers:

  • cache_text - Primary cluster for MediaWiki and various app/service (e.g. RESTBase, phabricator) traffic
  • cache_upload - Serves upload.wikimedia.org and maps.wikimedia.org exclusively (images, thumbnails, map tiles)

Former clusters (no longer exist):

  • cache_bits - Used to exist just for static content and ResourceLoader; now decommissioned (traffic went to cache_text)
  • cache_mobile - Was like cache_text but just for the (m|zero)\. mobile hostnames; now decommissioned (traffic went to cache_text)
  • cache_parsoid - Legacy entrypoint for Parsoid and related *oid services; now decommissioned (traffic goes via cache_text to RESTBase)
  • cache_maps - Served maps.wikimedia.org exclusively, which is now served by cache_upload
  • cache_misc - Miscellaneous lower-traffic / support services (e.g. phabricator, metrics, etherpad, graphite, etc.). Now moved to cache_text.

Headers

See also MediaWiki HTTP cache headers

X-Cache

X-Cache is a comma-separated list of cache hostnames with information such as hit/miss status for each entry. The header is read right to left: the rightmost entry is the outermost cache, and entries to the left are progressively deeper towards the applayer. The rightmost cache is the in-memory cache; all others are disk caches.

In case of a cache hit, the number of times the object has been returned is also specified. Once a "hit" is encountered while reading right to left, everything to its left describes the cached object that got hit: those entries record whether the deeper caches missed, passed, or hit when that object was first pulled into the cache that now reports the hit. For example:

X-Cache: cp1066 hit/6, cp3043 hit/1, cp3040 hit/26603
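
As an illustration, the following Python sketch (not part of the production stack) reads an X-Cache value right to left, so the outermost, in-memory cache comes first:

  def parse_x_cache(value: str):
      # Rightmost entry = outermost (in-memory) cache; walk right to left.
      entries = []
      for part in reversed([p.strip() for p in value.split(",")]):
          host, _, rest = part.partition(" ")
          status, _, hits = rest.partition("/")
          entries.append((host, status, int(hits) if hits else None))
      return entries  # ordered from outermost to innermost

  for host, status, hits in parse_x_cache(
          "cp1066 hit/6, cp3043 hit/1, cp3040 hit/26603"):
      print(host, status, hits)
  # cp3040 hit 26603  <- outermost, in-memory frontend
  # cp3043 hit 1
  # cp1066 hit 6      <- innermost, closest to the applayer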

An explanation of the possible information contained in X-Cache follows.

Not talking to other servers

  • hit: a cache hit in cache storage. There was no need to query a deeper cache server (or the applayer, if already at the last cache server)
  • int: locally-generated response from the cache. For example, a 301 redirect. The cache did not use a cache object and it didn't need to contact another server

Talking to other servers

  • miss: the object might be cacheable, but we don't have it
  • pass: the object was uncacheable, talk to a deeper level

Some subtleties on pass: different caches (e.g. in-memory vs. on-disk) might disagree on whether an object is cacheable or not. A pass on the in-memory cache (for example, because the object is too big) could be a hit for an on-disk cache. Also, it is sometimes not clear that an object is uncacheable until the moment we fetch it; in that case, we cache, for a short while, the fact that the object is uncacheable. In Varnish terminology, this is a hit-for-pass.

If we don't know an object is uncacheable until after we fetch it, the request initially looks identical to a normal miss, which means coalescing: other requests for the same object wait for the first response. But if that first fetch returns an uncacheable object, it cannot answer the other requests that may have queued, so they all get serialized, destroying the performance of hot (high-parallelism) objects that are uncacheable. Hit-for-pass is the answer to that problem: when the first request (made with no knowledge) gets an uncacheable response, we create a special cache entry that says, in effect, "this object cannot be cached, remember that for 10 minutes", and all remaining requests for the next 10 minutes proceed in parallel without coalescing, because it is already known that the object isn't cacheable.
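
The following Python sketch illustrates the hit-for-pass idea only; it is not Varnish's actual implementation. The first fetch that turns out to be uncacheable stores a short-lived marker, so later requests skip the cache (and coalescing) and go to the backend in parallel:

  import time

  HIT_FOR_PASS_TTL = 600   # "remember this is uncacheable" for 10 minutes
  cache = {}               # url -> cached body
  hit_for_pass = {}        # url -> expiry timestamp of the marker

  def handle(url, fetch_from_backend):
      # Recently learned the object is uncacheable: skip the cache and
      # request coalescing, go straight to the backend in parallel.
      if hit_for_pass.get(url, 0) > time.time():
          body, _ = fetch_from_backend(url)
          return body
      if url in cache:
          return cache[url]
      body, cacheable = fetch_from_backend(url)   # first fetch, coalesced
      if cacheable:
          cache[url] = body
      else:
          # Cache the *fact* that the object is uncacheable.
          hit_for_pass[url] = time.time() + HIT_FOR_PASS_TTL
      return body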

The content of the X-Cache header is recorded for every request in the webrequest log table.

Functionalities provided by cache backends

The following functionalities are provided by all cache backends:

  • Path normalization
  • Pass everything which is not GET or HEAD
  • Pass X-Wikimedia-Debug and X-Wikimedia-Security-Audit
  • Pass Authorization
  • Pass Set-Cookie responses
  • Pass CC:private, no-cache, no-store
  • Pass X-MISS2PASS
  • Performance hack to assign a single Vary slot for HFP to logged in users
  • Provide custom error HTML if the error response has no body
  • Set X-Cache-Int
  • Compress compressible things if the origin didn't already
  • Set various TTL caps
  • Return 403 to client IPs not in wikimedia_trust
  • Unset Accept-Encoding to avoid some corner-cases (see T125938)
  • Unset Public-Key-Pins Public-Key-Pins-Report-Only
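
A few of the generic pass rules above, sketched in Python purely for illustration (the production logic lives in the cache configuration, not in code like this):

  def should_pass(method, req_headers, resp_headers=None):
      # Request-side rules.
      if method not in ("GET", "HEAD"):
          return True                                  # pass non-GET/HEAD
      if "Authorization" in req_headers:
          return True                                  # pass Authorization
      if "X-Wikimedia-Debug" in req_headers:
          return True                                  # pass debug requests
      # Response-side rules (only known once a response is available).
      if resp_headers is not None:
          cc = resp_headers.get("Cache-Control", "").lower()
          if any(t in cc for t in ("private", "no-cache", "no-store")):
              return True                              # pass CC:private etc.
          if "Set-Cookie" in resp_headers:
              return True                              # pass Set-Cookie responses
      return False

  print(should_pass("POST", {}))                                # True
  print(should_pass("GET", {"Authorization": "Basic ..."}))     # True
  print(should_pass("GET", {}, {"Cache-Control": "private"}))   # True
  print(should_pass("GET", {}, {}))                             # False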

Specific to cache_text:

  • Pass the beta variant of the mobile site
  • Pass cxserver.wikimedia.org
  • Request mangling for MediaWiki (keywords: Host, X-Dt-Host, X-Subdomain)
  • Request mangling for RESTBase (/api/rest_v1/ -> /v1/)
  • Request mangling for w.wiki (send to meta.wikimedia.org/wiki/Special:UrlRedirector)
  • Vary slotting for PHP7 (X-Seven)
  • Vary slotting for X-Forwarded-Proto on 301/302
  • Reduce TTL to 60s for mobileaction= / useformat=
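
As an illustration of the RESTBase request mangling listed above, here is a rough Python equivalent of the path rewrite (the real rewriting happens in the cache layer configuration):

  def mangle_restbase_path(path: str) -> str:
      # Public REST API paths are exposed as /api/rest_v1/... but the
      # service itself is addressed with /v1/... paths.
      prefix = "/api/rest_v1/"
      if path.startswith(prefix):
          return "/v1/" + path[len(prefix):]
      return path

  print(mangle_restbase_path("/api/rest_v1/page/summary/Earth"))
  # -> /v1/page/summary/Earth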

Specific to cache_upload:

  • Storage binning to try to work around scalability issues of -sfile
  • Request mangling for X-MediaWiki-Original
  • Disable streaming if Content-Length is missing
  • Pass small objects
  • Pass objects >= ~1GB
  • Pass 200 responses with CL:0 (T144257)

Specific to cache_misc:

  • Pass objects >= ~1GB
  • Disable streaming if Content-Length is missing
  • Cache requests with Google Analytics cookies and our own WMF-Last-Access, WMF-Last-Access-Global, GeoIP, and CP cookies

MediaWiki

  • Default max-age setting in Cache-Control headers on page views is 14 days. ($wgCdnMaxAge; wmf-config)
    • ... however, both Varnish and ATS-BE have an internal cap to only cache documents for a maximum of 24 hours.
  • Default parsercache expiration is 22 days. ($wgParserCacheExpireTime; wmf-config)
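
Put together, the effective TTL for cached page-view HTML in the CDN layers is the smaller of the advertised max-age and the internal cap, for example:

  # Effective caching time of page-view HTML in the CDN layers.
  WG_CDN_MAX_AGE = 14 * 24 * 3600   # 1209600 s, from $wgCdnMaxAge
  INTERNAL_TTL_CAP = 24 * 3600      # 86400 s cap in Varnish and ATS-BE

  effective_ttl = min(WG_CDN_MAX_AGE, INTERNAL_TTL_CAP)
  print(effective_ttl)              # 86400 -> at most 24 hours in cache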

Invalidating content

For Varnish:

  • When pages are edited, their canonical URL is proactively purged in Varnish by MediaWiki.
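
For illustration only, such a purge of a canonical URL might look roughly like the following (the frontend host below is a placeholder; in production MediaWiki issues these purges itself):

  import http.client

  # Placeholder cache frontend host; not a real production hostname.
  conn = http.client.HTTPConnection("cache-frontend.example", 80)
  conn.request("PURGE", "/wiki/Some_Article",
               headers={"Host": "en.wikipedia.org"})
  response = conn.getresponse()
  print(response.status, response.reason)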

For ParserCache: Values in ParserCache are verifiable by revision ID. Edits will naturally update it.

  • puppet: manifests/misc/maintenance.pp
    • class misc::maintenance::parsercachepurging
      • Set to 22 days (expire age=2592000)

Past events

  • 2013: Prevent white-washing of expired page-view HTML.
    • Various static aspects of a page are not tracked or versioned; as such, when the max-age expires, a conditional revalidation (If-Modified-Since) must not report the page as unmodified after expiry, even if the database entry of the wiki page was unchanged.
    • More info: https://phabricator.wikimedia.org/T46570
  • 2016: Decrease the maximum object TTL in Varnish