Varnish Purge
Varnish Purging or Banning
Everything
varnishadm "ban req.url ~ /"
Everything for one domain
varnishadm "ban req.http.host == blaataap.com"
Specific URL on one domain
varnishadm "ban req.http.host == example.com && req.url == /some/url/"
URLs starting with /blaat
varnishadm "ban req.url ~ ^/blaat"
https://www.hipex.io/docs/nl/varnish/flushen/
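The difference between an unanchored and an anchored pattern is easy to get wrong. As a rough way to preview what a ban regex would hit, the sketch below runs two patterns over a few made-up URLs with grep -E. This is only an approximation, since Varnish uses PCRE rather than POSIX extended regular expressions, but for simple anchors the two agree. The function name matching_urls and the sample URLs are hypothetical.

```shell
#!/bin/sh
# Preview which URLs a ban pattern would match, using grep -E as a stand-in
# for Varnish's regex engine (an approximation: Varnish uses PCRE).
# Usage: matching_urls PATTERN < list-with-one-URL-per-line
matching_urls() {
    # "|| true" keeps an empty result from counting as an error
    grep -E -- "$1" || true
}

# Hypothetical sample URLs, one per line.
sample_urls='/blaat
/blaat/page
/shop/blaat
/other'

# Unanchored: matches /blaat anywhere in the URL, including /shop/blaat.
printf '%s\n' "$sample_urls" | matching_urls '/blaat'

# Anchored: matches only URLs that start with /blaat.
printf '%s\n' "$sample_urls" | matching_urls '^/blaat'
```

Running the preview before issuing a broad ban is cheap insurance against invalidating more of the cache than intended.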
More complex
The first ban below immediately invalidates every object in memory whose Content-Type response header matches the regular expression ^image/. The second narrows that to images requested under /feature.
varnishadm "ban obj.http.content-type ~ ^image/"
varnishadm "ban obj.http.content-type ~ ^image/ && req.url ~ ^/feature"
HTTP Purging
HTTP Purging is the most straightforward of these methods. Instead of sending a GET /url to Varnish, you would send PURGE /url. Varnish would then discard that object from the cache.
Add an access control list to Varnish so that not just anyone can purge objects from your cache; other than that, though, you’re home free.
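As a minimal sketch of the client side, the functions below send a PURGE with curl and turn the HTTP status into a short verdict. The names purge_verdict and purge_url are hypothetical, and the status codes are assumptions: which code Varnish answers with depends entirely on your VCL (here 200 is taken to mean success and 403 or 405 a refusal). curl reports 000 when it gets no HTTP response at all.

```shell
#!/bin/sh
# Translate the HTTP status of a PURGE request into a short verdict.
# The mapping is an assumption; adapt it to the codes your VCL returns.
purge_verdict() {
    case "$1" in
        200) echo "purged" ;;
        403|405) echo "refused" ;;
        000) echo "unreachable" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Send PURGE for one URL and print the verdict.
# -s: silent, -o /dev/null: drop the body, -m 5: give up after 5 seconds,
# -w '%{http_code}': print only the status code.
purge_url() {
    status=$(curl -s -o /dev/null -m 5 -w '%{http_code}' -X PURGE "$1")
    purge_verdict "$status"
}
```

A call such as purge_url http://www.example.com/some/url/ would then print one of the verdicts, which makes it easy to spot a request that the access control list turned away.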
Shortcomings of Purging
HTTP purging falls short when a piece of content has a complex relationship to the URLs it appears on. A news article, for instance, might show up on a number of URLs.
The article might have a desktop view and a mobile view, and it might show up on a section page and on the front page. Therefore, you would have to either get the content management system to keep track of all of these manifestations or let Varnish do it for you.
To let Varnish do it, you would use bans, which we’ll get into now.
Bans
A ban is a feature specific to Varnish and one that is frequently misunderstood. It enables you to ban Varnish from serving certain content in memory, forcing Varnish to fetch new versions of these pages.
An interesting aspect is how you specify which pages to ban. Varnish has a language that provides quite a bit of flexibility. You could tell Varnish to ban by giving the ban command in the command-line interface, typically connecting to it with varnishadm.
You could also do it through the Varnish configuration language (VCL), which provides a simple way to implement HTTP-based banning.
Let’s start with an example. Suppose we need to purge our website of all images.
varnishadm "ban obj.http.content-type ~ ^image/"
As a result, every object in memory whose HTTP response header Content-Type matches the regular expression ^image/ is invalidated immediately.
Here’s what happens in Varnish. First, the ban command puts the ban on the “ban list.” When this command is on the ban list, every cache hit that serves an object older than the ban itself will start to look at the ban list and compare the object to the bans on the list. If the object matches, then Varnish kills it and fetches a newer one. If the object doesn’t match, then Varnish will make a note of it so that it does not check again.
Let’s build on our example. Now, we’ll only ban images that are placed somewhere under the /feature URL. Note the logical “and” operator, &&.
varnishadm "ban obj.http.content-type ~ ^image/ && req.url ~ ^/feature"
You’ll notice that it says obj.http.content-type and req.url. In the first part of the ban, we refer to an attribute of an object stored in Varnish. In the latter, we refer to a part of a request for an object. This might be a bit unconventional, but you can actually use attributes on the request to invalidate objects in cache. Now, req.url isn’t normally stored in the object, so referring to the request is the only thing we can do here.
Issuing bans that depend on the request opens up some interesting possibilities. However, there is one downside to the process: A very long list of bans could slow down content delivery.
There is a worker thread assigned to the task of shortening the list of bans, “the ban lurker”. The ban lurker tries to match a ban against applicable objects. When a ban has been matched against all objects older than itself, it is discarded.
As the ban lurker iterates through the bans, it doesn’t have an HTTP request that it is trying to serve. So, any bans that rely on data from the request cannot be tested by the ban lurker. To keep ban performance up, then, we would recommend not using request data in the bans. If you need to ban something that is typically in the request, like the URL, you can copy the data from the request to the object in vcl_backend_response (called vcl_fetch before Varnish 4, where the request is req rather than bereq), like this:
set beresp.http.x-url = bereq.url;
Now, you’ll be able to use bans on obj.http.x-url. Remember that beresp turns into obj when the object is stored in the cache.
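To make such bans easy to issue from the command line, a small wrapper can build an expression that references only obj.* fields, which the ban lurker can evaluate on its own. This is a sketch under assumptions: the function names are hypothetical, and it presumes your VCL copies the URL and the Host header into the x-url and x-host response headers.

```shell
#!/bin/sh
# Build a ban expression that only references obj.*, so the ban lurker can
# test it without a request. Assumes the VCL stores the URL and Host header
# in the x-url and x-host response headers.
# Usage: lurker_ban_expr HOST URL_REGEX
lurker_ban_expr() {
    printf 'obj.http.x-host == %s && obj.http.x-url ~ %s' "$1" "$2"
}

# Hand the whole ban command to varnishadm as one quoted argument,
# so the shell does not interpret && itself.
lurker_ban() {
    varnishadm "ban $(lurker_ban_expr "$1" "$2")"
}
```

For example, lurker_ban example.com '^/feature' would ban everything under /feature on that host. Arguments containing spaces would need extra quoting for the Varnish CLI, which this sketch does not attempt.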
Graceful Cache Invalidations
Imagine purging something from Varnish and then the origin server that was supposed to replace the content suddenly crashes. You’ve just thrown away your only workable copy of the content. What have you done?! Turns out that quite a few content management systems crash on a regular basis.
Ideally, we would want to put the object in a third state — to invalidate it on the condition that we’re able to get some new content. This third state exists in Varnish: It is called “grace,” and it is used with TTL-based invalidations. After an object expires, it is kept in memory in case the back-end server crashes. If Varnish can’t talk to the back end, then it checks to see whether any graced objects match, and it serves those instead.
One Varnish module (or VMOD), named softpurge, allows you to invalidate an object by putting it into the grace state. Using it is simple. Just replace the PURGE VCL with the VCL that uses the softpurge VMOD.
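import softpurge;

# Varnish 4.x syntax: synth() replaces the "error" statement of Varnish 3.
sub vcl_hit {
    if (req.method == "PURGE") {
        # Expire the object but keep it around for grace
        softpurge.softpurge();
        return (synth(200, "Successful softpurge"));
    }
}

sub vcl_miss {
    if (req.method == "PURGE") {
        softpurge.softpurge();
        return (synth(200, "Successful softpurge"));
    }
}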
Distributing Cache Invalidation Events
All of the methods listed above describe the process of invalidating content on a single cache server. Most serious configurations would have more than one Varnish server. If you have two, which should give enough oomph for most websites, then you would want to issue one invalidation event for each server. However, if you have 20 or 30 Varnish servers, then you really wouldn’t want to bog down the application by having it loop through a huge list of servers.
Instead, you would want a single API end point to which you can send your purges, having it distribute the invalidation event to all of your Varnish servers. For reference, here is a very simple invalidation service written in shell script. It will listen on port 2000 and invalidate URLs to three different servers (alfa, beta and gamma) using cURL.
nc -l 2000 | while read -r url
do
    for srv in "alfa" "beta" "gamma"
    do
        # -x sends the request through $srv as a proxy; -m 2 caps each call at 2 seconds
        curl -m 2 -x "$srv" -X PURGE "$url"
    done
done
It might not be suitable for production because the error handling leaves something to be desired!
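One way to tighten the error handling, sketched here under assumptions, is to record which servers failed and report them. The function name purge_everywhere and the SERVERS variable are hypothetical; the host names are the ones from the script above. curl's -f flag makes HTTP errors (status 400 and up) count as failures.

```shell
#!/bin/sh
# Hypothetical cache hosts, reused from the script above.
SERVERS="alfa beta gamma"

# Send PURGE for one URL through every server. Prints the servers that
# failed to stderr and returns non-zero if any of them did.
purge_everywhere() {
    url=$1
    failed=""
    for srv in $SERVERS; do
        # -s: silent, -f: fail on HTTP errors, -m 2: 2 second timeout,
        # -x: send the request through $srv as a proxy
        if ! curl -sf -o /dev/null -m 2 -x "$srv" -X PURGE "$url"; then
            failed="$failed $srv"
        fi
    done
    if [ -n "$failed" ]; then
        echo "purge of $url failed on:$failed" >&2
        return 1
    fi
}

# To serve requests the way the original script does:
#   nc -l 2000 | while read -r url; do purge_everywhere "$url"; done
```

This still purges the servers one after another and does not retry, so it remains a sketch rather than a production service.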
Cache invalidation is almost as important as caching. Therefore, having a sound strategy for invalidating the content is crucial to maintaining high performance and having a high cache-hit ratio. If you maintain a high hit rate, then you’ll need fewer servers and will have happier users and probably less downtime. With this, you’re hopefully more comfortable using tools like these to get stale content out of your cache.
Purging/Banning via an HTTP request
You can use the following template to write ban lurker friendly bans:
# Assumed ACL: the template below references "purge", but these notes do not
# define it. Adjust the addresses to the hosts allowed to invalidate.
acl purge {
    "localhost";
    "127.0.0.1";
}

sub vcl_backend_response {
    # For banning/purging
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
    # For banning/purging
    # We remove resp.http.x-* HTTP header fields,
    # because the client does not need them
    unset resp.http.x-url;
    unset resp.http.x-host;
}

sub vcl_recv {
    # For banning/purging
    if (req.method == "PURGE") {
        if (client.ip !~ purge) {
            return (synth(403, "Not allowed"));
        }
        return (purge);
    }
    if (req.method == "BAN") {
        if (client.ip !~ purge) {
            return (synth(403, "Not allowed"));
        }
        # Match the x-url header copied in vcl_backend_response.
        # Assumes req.url is a regex. This might be a bit too simple
        ban("obj.http.x-url ~ " + req.url);
        # Throw a synthetic page so the request won't go to the backend.
        return (synth(200, "Ban added"));
    }
    if (req.method == "REFRESH") {
        if (client.ip !~ purge) {
            return (synth(403, "Not allowed"));
        }
        # Fetch a fresh copy as a normal GET, bypassing the cached object
        set req.method = "GET";
        set req.hash_always_miss = true;
    }
}
View BAN logging
varnishlog -g request -q 'ReqMethod eq "PURGE"'
varnishlog -g request -q 'ReqMethod eq "BAN"'
varnishlog -g request -q 'ReqMethod eq "REFRESH"'
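The three queries can also be folded into one. The helper below is a small sketch that joins request methods with the or operator of the VSL query language; the function name method_query is hypothetical.

```shell
#!/bin/sh
# Join the given request methods into one VSL query using "or".
# Usage: method_query PURGE BAN REFRESH
method_query() {
    query=""
    for method in "$@"; do
        if [ -n "$query" ]; then
            query="$query or "
        fi
        query="${query}ReqMethod eq \"$method\""
    done
    printf '%s' "$query"
}
```

It could then be used as varnishlog -g request -q "$(method_query PURGE BAN REFRESH)" to watch all invalidation requests in a single stream.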
Actual PURGE/BAN/REFRESH
curl -X PURGE https://technotes.adelerhof.eu/test/add-test/
curl -X BAN https://technotes.adelerhof.eu/test/add-test/
curl -X REFRESH https://technotes.adelerhof.eu/test/add-test/
Response headers
curl -I https://technotes.adelerhof.eu/test/add-test/
Check BAN list
varnishadm ban.list
https://varnish-cache.org/docs/6.3/users-guide/purging.html
https://info.varnish-software.com/blog/wiki-highlights-cache-invalidation-varnish
http://book.varnish-software.com/4.0/chapters/Cache_Invalidation.html
https://www.smashingmagazine.com/2014/04/cache-invalidation-strategies-with-varnish-cache/