A Content Delivery Network (CDN) is designed to reduce the load on your primary web servers by caching your static assets on a network of servers. These servers will be closer to users, which can speed up your load times.
How Do CDNs Work?
A CDN doesn’t replace your web server; it sits between the user and the web server and caches your site’s content. Each CDN endpoint is called a point of presence (PoP), and most CDN providers have hundreds of them spread across the globe. A PoP physically close to the end user, at what is referred to as the “network edge,” reduces latency. CDNs try to serve as many requests as possible from the network edge, without having to make a request further into the network (and bug your web server).
This kind of CDN is called an “origin pull” or “mirror” CDN. A pull CDN mirrors the content on your website and delivers it with lower latency and improved caching. The other variant is called an “origin push” CDN, which can replace some parts of your web server. Push CDNs are primarily used to host content that would be infeasible to host on traditional web servers, such as video for streaming services or other large media. For example, images and videos stored in Amazon S3 can be served through Amazon’s CloudFront CDN in a push configuration, eliminating the need to host that content on your own web servers.
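To make the pull model concrete, here is a minimal sketch in Python of a single PoP that serves content from its local cache and only contacts the origin on a miss. The names `EdgeCache`, `fetch_from_origin`, and `demo_origin` are hypothetical and exist only for illustration; a real CDN also handles expiry, eviction, and invalidation.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy model of one pull-CDN PoP (hypothetical, for illustration only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # how often the origin server was contacted

    def get(self, path: str) -> bytes:
        if path not in self._store:
            # Cache miss: pull the content from the origin and keep a copy.
            self.origin_requests += 1
            self._store[path] = self._fetch_from_origin(path)
        # Cache hit (or freshly filled): serve from the edge.
        return self._store[path]


def demo_origin(path: str) -> bytes:
    # Stand-in for the real web server.
    return f"content of {path}".encode()


edge = EdgeCache(demo_origin)
for _ in range(1000):
    edge.get("/logo.png")
print(edge.origin_requests)  # 1
```

A thousand requests for the same asset reach the origin only once, which is where the reduced load on your web server comes from.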
Many CDNs are used primarily to cache images, files, and other static content. But some CDNs, particularly Cloudflare, Fastly, and Amazon CloudFront, can cache your entire site. Full-site caching can be configured to expire in just a few seconds, which keeps your site responsive while taking a lot of load off your web servers.
You can set up rules for each type of page you have, and choose how long you want the content to be cached. This can be done either through the admin panel with your CDN provider, or by adding Cache-Control headers to your HTTP responses, in which you can set max-age to a specific time in seconds.
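As a rough sketch of per-page-type rules, the following Python maps a page type to a Cache-Control value. The page types, the lifetimes, and the function name `cache_control_for` are assumptions chosen for illustration, not recommended settings.

```python
# Hypothetical per-page-type cache lifetimes, in seconds.
CACHE_RULES = {
    "static_asset": 86400,  # images, CSS, JavaScript
    "home_page": 10,
    "article": 300,
}


def cache_control_for(page_type: str) -> str:
    """Return a Cache-Control header value for the given page type."""
    max_age = CACHE_RULES.get(page_type)
    if max_age is None:
        # Unknown page types are not cached, to stay on the safe side.
        return "no-store"
    return f"public, max-age={max_age}"


print(cache_control_for("home_page"))     # public, max-age=10
print(cache_control_for("account_page"))  # no-store
```

The returned string would be sent as the value of the Cache-Control response header by whatever framework or server you use.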
However, you shouldn’t apply these settings indiscriminately to your entire site. Some things need to be dynamic. For example, a user’s profile page, or any page requiring authentication, should never be cached by a shared cache such as a CDN, or else users visiting their own profile page could find themselves viewing the information of another account. This is exactly what happened to Steam during their 2015 Christmas sale, when Valve updated their caching configuration to attempt to mitigate a traffic spike, and inadvertently cached user data. This didn’t let anyone log in as another user, but it did let them view a copy of that user’s private data, which is still a serious security breach.
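One way to guard against this, sketched below in Python, is to choose the Cache-Control value per request and refuse shared caching whenever the request looks personalized. The function name `cache_control_for_request` is hypothetical, and treating any Cookie or Authorization header as a sign of a logged-in user is a simplifying assumption; your application will have its own definition.

```python
from typing import Mapping


def cache_control_for_request(request_headers: Mapping[str, str], max_age: int) -> str:
    """Pick a Cache-Control value based on whether the request looks personalized."""
    # Header names are case-insensitive, so compare them in lowercase.
    names = {name.lower() for name in request_headers}
    # Assumption: a cookie or Authorization header means a logged-in user.
    if "cookie" in names or "authorization" in names:
        # "private" tells shared caches such as a CDN not to store the response,
        # and "no-store" asks every cache, including the browser, not to keep it.
        return "private, no-store"
    return f"public, max-age={max_age}"


print(cache_control_for_request({"Cookie": "session=abc123"}, 300))  # private, no-store
print(cache_control_for_request({"Accept": "text/html"}, 300))       # public, max-age=300
```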
In some cases, APIs can be cached. A site like Reddit, for example, doesn’t need to make a database request every time someone requests the top posts on the home page. You could instead cache the result for a minute or so, and only query the database when necessary, such as when someone requests new posts. However, some APIs may break with caching, so you’ll need to test whether it works for yours.
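The idea of caching an API result for about a minute can be sketched as a small time-to-live (TTL) cache. Everything here is hypothetical: `TTLCache`, `load_top_posts`, and the 60-second lifetime are illustrative, and the injected fake clock only exists so the example runs without waiting.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Caches the result of one expensive call for a fixed number of seconds."""

    def __init__(
        self,
        loader: Callable[[], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Nothing cached yet, or the cached copy has expired: reload it.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value


calls = {"count": 0}


def load_top_posts() -> list:
    # Stand-in for a database query.
    calls["count"] += 1
    return ["post-a", "post-b"]


fake_now = [0.0]  # controllable clock so the demo does not need to sleep
top_posts = TTLCache(load_top_posts, ttl_seconds=60, clock=lambda: fake_now[0])

top_posts.get()
top_posts.get()      # served from the cache, no second query
fake_now[0] = 61.0   # more than a minute later
top_posts.get()      # cache expired, so the loader runs again
print(calls["count"])  # 2
```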
Should You Use a CDN?
If your site gets a lot of traffic, full-site caching (or at least caching the main pages of your site) can take a lot of load off your web servers.
If you’re using a specialized hosting provider, such as Squarespace, Shopify, or WordPress, that provider probably already has its own CDN built in, and will usually handle the details of hosting your website for you.
You should also make use of the browser cache, which you can use alongside a CDN. Essentially, your assets will be stored in the user’s browser for a short time (5-10 minutes), so that if they click through to another page of your site, their browser won’t have to request content it already has. But if you make changes to the site and they come back the next day, they will be served an updated page from the CDN, because the browser cache has expired.
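One common way to give the browser and the CDN different lifetimes is to combine max-age, which applies to every cache, with s-maxage, which overrides it for shared caches only. The sketch below assumes your CDN honors s-maxage (check your provider’s documentation); the function name `split_cache_control` and the one-hour CDN lifetime are illustrative assumptions.

```python
def split_cache_control(browser_seconds: int, cdn_seconds: int) -> str:
    """Build a Cache-Control value with separate browser and CDN lifetimes."""
    if browser_seconds < 0 or cdn_seconds < 0:
        raise ValueError("cache lifetimes must not be negative")
    # max-age applies to the browser; s-maxage overrides it for shared caches.
    return f"public, max-age={browser_seconds}, s-maxage={cdn_seconds}"


# Five minutes in the browser, one hour (an assumed value) at the CDN.
print(split_cache_control(300, 3600))  # public, max-age=300, s-maxage=3600
```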