Serving assets faster with AWS CloudFront

· 3 min read · Amrith Vengalath

  • AWS
  • CloudFront
  • Web

We had a product catalog site where the image-heavy pages felt slow for anyone outside our city. The images lived in an S3 bucket in Mumbai, and a visitor in, say, Delhi was pulling every thumbnail across the country on each load. The fix was a CDN, and since we were already on AWS, CloudFront was the path of least resistance.

This is mostly a note to my future self, because I got a few things wrong the first time.

What CloudFront is actually doing

A CDN keeps copies of your files at edge locations close to your users. The first request to an edge still goes back to the origin (your S3 bucket), but after that the edge holds onto the file and serves it locally until it expires. So the second visitor in a region gets a much shorter round trip.
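
A quick way to watch this happen is the X-Cache response header, which CloudFront sets to "Miss from cloudfront" when the edge had to go back to the origin and "Hit from cloudfront" when it served its own copy. The helper below is only a sketch: `cache_status` is a name made up for this note, and the distribution domain in the usage comment is a placeholder, not our real one.

```shell
# Read raw HTTP response headers on stdin and print the X-Cache value.
cache_status() {
  # Header names are case-insensitive, so match without regard to case,
  # drop the carriage returns HTTP headers carry, then strip the header name.
  grep -i '^x-cache:' | tr -d '\r' | sed 's/^[^:]*:[[:space:]]*//'
}

# Usage (hypothetical domain and path):
#   curl -sI https://dxxxxxxxxxxxxx.cloudfront.net/thumbnails/example.jpg | cache_status
```

Run it twice against the same URL from the same place and you would expect a miss followed by a hit, assuming the object is cacheable.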

The setup itself is short:

  • Create a distribution, point the origin at the S3 bucket.
  • Restrict the bucket so it only allows access through CloudFront, not through the public S3 URL. We used an Origin Access Identity back then; AWS now recommends Origin Access Control for new setups.
  • Wait. Distributions take a while to deploy the first time - go get coffee, it's not broken.
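
If you would rather not keep refreshing the console during that wait, the AWS CLI can block until the deployment finishes. This is a minimal sketch, assuming the CLI is configured and you already know the distribution ID; `wait_for_distribution` is a hypothetical wrapper, not something from our original setup.

```shell
# Block until a distribution is deployed, then print its status and domain.
wait_for_distribution() {
  dist_id=$1
  # The CLI polls for us and returns once the status becomes "Deployed".
  aws cloudfront wait distribution-deployed --id "$dist_id" || return 1
  # Show the final status and the domain name to use in asset URLs.
  aws cloudfront get-distribution --id "$dist_id" \
    --query 'Distribution.[Status,DomainName]' --output text
}

# Usage (placeholder ID):
#   wait_for_distribution E1XXXXXXXXXXXX
```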

The part nobody warned me about is that none of this matters much if your cache headers are wrong.

Cache-Control is the whole game

CloudFront decides how long to keep a file by looking at the Cache-Control header the origin sends. If S3 doesn't send one, you fall back to the distribution's default TTL (24 hours unless you've changed it), which is usually not what you want.
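
To make that fallback concrete, here is a deliberately simplified model of the decision. It only looks at max-age and a default TTL you pass in; real CloudFront also applies the minimum and maximum TTL and understands directives like s-maxage and no-cache, which this sketch ignores. `effective_ttl` is a hypothetical helper name.

```shell
# Print the number of seconds an edge would keep an object, in a simplified
# model: use max-age if the header has one, otherwise the default TTL.
effective_ttl() {
  header=$1
  default_ttl=$2
  # Directives are case-insensitive; pull the digits after "max-age=".
  max_age=$(printf '%s\n' "$header" | tr 'A-Z' 'a-z' |
    sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p')
  if [ -n "$max_age" ]; then
    printf '%s\n' "$max_age"
  else
    printf '%s\n' "$default_ttl"
  fi
}

# Usage:
#   effective_ttl "public, max-age=31536000, immutable" 86400   # header wins
#   effective_ttl "" 86400                                      # default TTL wins
```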

For our product images - which basically never changed once uploaded - I set a long max-age on the objects in S3:

aws s3 cp ./thumbnails s3://our-catalog-assets/thumbnails \
  --recursive \
  --cache-control "public, max-age=31536000, immutable"

A year, marked immutable. The browser and the edge can both hold onto these for that long without re-checking. For the HTML and anything that changed often, I kept the max-age tiny so updates showed up quickly.

The rule I settled on: long cache for things that get a new filename when they change, short cache for things that keep the same URL.
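
That rule is simple enough to put in a deploy script. The sketch below assumes a naming convention where versioned files carry a `.v2.`-style or short hex segment before the extension; both the convention check and the 60-second value for "short" are assumptions for illustration, and the pattern is naive (a name like `photo.decade.jpg` would be mistaken for a hash).

```shell
# Pick a Cache-Control value from the filename alone.
cache_control_for() {
  name=$(basename "$1")
  # Versioned or hashed names: "<stem>.v<digits>.<ext>" or "<stem>.<6+ hex>.<ext>".
  if printf '%s\n' "$name" |
    grep -Eq '\.(v[0-9]+|[0-9a-f]{6,})\.[A-Za-z0-9]+$'; then
    # New content always gets a new URL, so cache for a year.
    echo "public, max-age=31536000, immutable"
  else
    # Same URL may change underneath, so keep it short (assumed: 60 seconds).
    echo "public, max-age=60"
  fi
}

# Usage (bucket path is illustrative):
#   aws s3 cp ./index.html s3://our-catalog-assets/index.html \
#     --cache-control "$(cache_control_for index.html)"
```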

The invalidation trap

Here's where I lost an afternoon. We replaced a banner image, kept the same filename, uploaded it to S3, and... the old one kept showing. Of course it did. I'd just told every edge, and every browser, to cache it for a year.

You can force CloudFront to drop a cached file with an invalidation:

aws cloudfront create-invalidation \
  --distribution-id E1XXXXXXXXXXXX \
  --paths "/banners/home-hero.jpg"

That works, but only the first 1,000 invalidation paths a month are free, it takes a while to propagate, and it does nothing for copies already sitting in visitors' browsers. The better habit - which I picked up the hard way - is to never reuse a filename for changed content. Add a hash or a version: home-hero.v2.jpg, or home-hero.a1b2c3.jpg. Then a "change" is just a new file at a new URL, the old cache entry is irrelevant, and you never invalidate anything.
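
One way to get those hashed names without thinking about it is to derive them from the file's contents, so the name changes exactly when the bytes do. This is a sketch, not what we ran at the time: `hashed_name` is a hypothetical helper, the 8-character prefix is an arbitrary choice, and it assumes the file has an extension.

```shell
# Print "<stem>.<first 8 hex chars of SHA-256>.<ext>" for a file.
hashed_name() {
  file=$1
  base=$(basename "$file")
  stem=${base%.*}   # everything before the last dot
  ext=${base##*.}   # everything after the last dot
  # sha256sum is the GNU tool; shasum is the usual fallback on macOS.
  if command -v sha256sum >/dev/null 2>&1; then
    digest=$(sha256sum "$file")
  else
    digest=$(shasum -a 256 "$file")
  fi
  hash=$(printf '%s' "$digest" | cut -c1-8)
  printf '%s.%s.%s\n' "$stem" "$hash" "$ext"
}

# Usage (bucket path is illustrative):
#   aws s3 cp ./home-hero.jpg \
#     "s3://our-catalog-assets/banners/$(hashed_name ./home-hero.jpg)" \
#     --cache-control "public, max-age=31536000, immutable"
```

The page that references the image then has to point at the new name, which is the price of never invalidating.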

I wish I'd known that on day one instead of day three.

Did it help?

Plainly, yes. Repeat page loads for users in other regions went from "noticeably laggy" to "fine." It wasn't a heroic optimization - it's a checkbox most teams should tick early - but for a small team without much infra experience at the time, it felt like a real win.

If you're putting CloudFront in front of S3 for the first time: get your cache headers right before you tune anything else, and version your filenames so you can forget invalidations exist.