I want to control cache using Cache-Control on Cloud CDN (Google Cloud)
Introduction
My name is Sumi. I work in the System Solutions Department, where I am involved in infrastructure.
In this article, we will show how to configure Cloud CDN, used with an Application Load Balancer, so that only specified files on the server are cached.
Cloud CDN caches content on edge servers, which helps reduce server load and improve response times.
Here we walk through a configuration example that controls the cache with HTTP headers so that certain static content is not cached.
If caching your entire site is not a problem, please also refer to "Invalidate cached content".
*We strongly recommend verifying the behavior in a test environment before making any changes.
Cloud CDN cache control parameter description
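Cache mode
Cloud CDN offers three cache modes that control how your content is cached. For a more detailed explanation, please refer to the official documentation linked below.
- CACHE_ALL_STATIC: Automatically caches static content (CSS, JavaScript, images, etc.), identified by the Content-Type of the response, unless the response carries a no-store, no-cache, or private directive. Responses with valid caching directives are cached as well.
- USE_ORIGIN_HEADERS: Caches only responses for which the origin sets valid caching directives in the Cache-Control header.
- FORCE_CACHE_ALL: Caches all content unconditionally, ignoring any private, no-store, or no-cache directives in the Cache-Control header.
Cloud CDN official documentation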
Cache period
The cache period (TTL) specifies how long cached content remains valid.
- CACHE_ALL_STATIC / FORCE_CACHE_ALL mode:
  - Client TTL: Specifies how long the browser or client should retain the cache.
  - Default TTL: Specifies the cache period applied when the response from the origin server does not include a valid caching directive in the Cache-Control header.
  - Maximum TTL: Specifies the maximum retention period for cached content (CACHE_ALL_STATIC only).
- USE_ORIGIN_HEADERS mode:
  - Specify the cache period in the max-age (or s-maxage) directive of the Cache-Control header included in the response from the origin server.
Example: Cache-Control: max-age=3600 (1 hour)
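To make the relationship between the header and the cache period concrete, here is a minimal shell sketch that reads the TTL a shared cache such as a CDN would take from a Cache-Control value. The function name shared_cache_ttl is hypothetical, and the logic only reflects the general HTTP rule that s-maxage takes precedence over max-age for shared caches. It does not model the Client, Default, or Maximum TTL settings.

```shell
# Hypothetical helper: print the TTL (in seconds) that a shared cache
# would derive from a Cache-Control header value.
# Assumption: s-maxage, when present, takes precedence over max-age.
shared_cache_ttl() {
  printf '%s\n' "$1" |
    tr ',' '\n' |          # one directive per line
    tr -d ' \t' |          # drop surrounding whitespace
    tr 'A-Z' 'a-z' |       # directive names are case-insensitive
    awk -F= '
      $1 == "s-maxage" { shared = $2 }
      $1 == "max-age"  { any = $2 }
      END {
        if (shared != "") print shared
        else if (any != "") print any
      }'
}

shared_cache_ttl "public, max-age=3600"               # prints 3600
shared_cache_ttl "public, max-age=60, s-maxage=600"   # prints 600
```

When neither directive is present (for example, no-cache), the function prints nothing.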
Cache key
By default, Cloud CDN uses the entire request URL as the cache key.
Each request is identified by its protocol (HTTP/HTTPS), host name, and query string; HTTP headers and named cookies can optionally be added to the key.
protocol
Identifies whether the request is HTTP or HTTPS.
Default: Enabled
setting | behavior | use case |
---|---|---|
Enabled | Cache HTTP and HTTPS requests separately | When providing different content over HTTP and HTTPS |
Disabled | Do not differentiate between HTTP and HTTPS requests | When serving the same content over HTTP and HTTPS |
With Cloud CDN it is common to serve HTTP and HTTPS content under the same host name, but there are cases where different content is delivered per protocol, for example because certain browsers require TLS. For details, see the following document.
Use the same host name for HTTP/HTTPS
host
Used when distributing different content using multiple host names.
Default: Enabled
query string
Identifies the parameters following the "?" in the URL.
Default: Enabled / Include everything except selected items / No items selected
setting | behavior | use case |
---|---|---|
Include only selected items | Include only the specified parameters in the cache key | When content changes based on specific parameters, such as search results or product detail pages |
Include everything except selected items | Exclude the specified parameters and include all others in the cache key | When there are many parameters and only certain ones should be ignored in the cache key |
Disabled | Do not identify requests by query parameters | When providing the same content regardless of the query string |
When the query string is enabled, the custom settings let you specify parameters as either an include list (whitelist) or an exclude list.
HTTP header
Identifies requests by additional information carried in HTTP request headers.
Default: Disabled
setting | use case |
---|---|
include | When content is optimized based on HTTP headers such as User-Agent |
exclude | When you want to serve the same content regardless of the headers |
named cookie
Identifies specific users by the value of a named cookie.
Default: Disabled
setting | use case |
---|---|
Enabled | When content changes depending on login status or user |
Disabled | When providing the same content regardless of cookies |
Use case for each parameter
use case | protocol | host | query string | HTTP header | named cookie |
---|---|---|---|---|---|
When delivering the same content over HTTP/HTTPS | exclude | include | exclude | exclude | exclude |
When distributing the same content under multiple host names | include | exclude | exclude | exclude | exclude |
When the content does not change depending on the query string | include | include | exclude | exclude | exclude |
When optimizing content by user agent | include | include | include | User-Agent | exclude |
When the content changes depending on login status | include | include | include | exclude | session_id |
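The effect of these settings can be illustrated with a small shell sketch. It is only a hypothetical model of how a cache key could be assembled from the protocol, host, and query string, and it is not the format Cloud CDN uses internally. The function name cache_key and its yes/no flags are assumptions made for this example; HTTP headers and named cookies are left out for brevity.

```shell
# Hypothetical model of a cache key (NOT Cloud CDN's internal format).
# Usage: cache_key URL INCLUDE_PROTOCOL INCLUDE_HOST INCLUDE_QUERY   (yes/no)
# The body runs in a subshell ( ... ) so its variables stay local.
cache_key() (
  url="$1"
  scheme="${url%%://*}"          # e.g. "https"
  rest="${url#*://}"             # host + path + query
  query=""
  case "$rest" in
    *\?*) query="${rest#*\?}"; rest="${rest%%\?*}" ;;
  esac
  case "$rest" in
    */*) host="${rest%%/*}"; path="/${rest#*/}" ;;
    *)   host="$rest";       path="/" ;;
  esac
  key="$path"
  if [ "$3" = "yes" ]; then key="${host}${key}"; fi
  if [ "$2" = "yes" ]; then key="${scheme}://${key}"; fi
  if [ "$4" = "yes" ] && [ -n "$query" ]; then key="${key}?${query}"; fi
  printf '%s\n' "$key"
)

# Protocol and query string excluded: both requests share one key.
cache_key "http://example.com/cache/test.html?a=1"  no yes no   # prints example.com/cache/test.html
cache_key "https://example.com/cache/test.html?b=2" no yes no   # prints example.com/cache/test.html
```

Two requests that produce the same key are served from the same cache entry, which is why excluding a component only makes sense when the content does not depend on it.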
How to use USE_ORIGIN_HEADERS mode
To use USE_ORIGIN_HEADERS mode, set the Cache-Control header on the server side.
The Cache-Control header is an HTTP response header that instructs browsers and cache servers how to cache content.
This header can contain the following directives:
- max-age: Specifies the cache expiration time in seconds.
- s-maxage: Specifies the cache expiration time in seconds for shared caches (such as CDNs).
- public: Allows the response to be stored in shared (public) caches.
- private: Allows the response to be stored only in private caches such as the browser.
- no-cache: Requires a cache to revalidate with the origin server before reusing a stored response. Cloud CDN does not cache responses that carry this directive.
- no-store: Prevents the response from being stored in any cache, including the browser's.
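As a rough illustration of how these directives combine, the following shell sketch classifies a Cache-Control value by where the response may be stored. The function name cache_scope and its output labels are hypothetical, and the precedence shown (no-store, then no-cache, then private) is a simplification for this article's test cases rather than a full implementation of HTTP caching rules.

```shell
# Hypothetical helper: classify a Cache-Control value.
# Output labels (assumptions made for this sketch):
#   none         - must not be stored anywhere (no-store)
#   revalidate   - must be revalidated with the origin before reuse (no-cache)
#   browser-only - private caches only (private)
#   shared       - may be stored by shared caches such as a CDN
#   unspecified  - no relevant directive found
cache_scope() (
  directives=$(printf '%s\n' "$1" | tr ',' '\n' | tr -d ' \t' | tr 'A-Z' 'a-z')
  # True when some directive matches the given pattern as a whole line.
  has() { printf '%s\n' "$directives" | grep -Eqx -- "$1"; }
  if   has 'no-store'; then echo "none"
  elif has 'no-cache'; then echo "revalidate"
  elif has 'private';  then echo "browser-only"
  elif has 'public|s-maxage=[0-9]+|max-age=[0-9]+'; then echo "shared"
  else echo "unspecified"
  fi
)

cache_scope "public, max-age=3600"   # prints shared
cache_scope "no-cache"               # prints revalidate
cache_scope "private"                # prints browser-only
```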
Actual setting method
Environment (Rocky Linux9 + Nginx)
This time we will use Rocky Linux 9 + Nginx as a simple test environment. *The server construction steps are omitted. Because this test only checks the conditions for cache hits, no CMS such as WordPress is used.
Preparation on the server side
Cache control is configured per directory.
Create a directory to place the test files
mkdir -p /(document root path)/cache
mkdir -p /(document root path)/nocache
Place test files in each directory.
touch /(document root path)/cache/test.html
touch /(document root path)/nocache.html
touch /(document root path)/nocache/nocache.html
Add the following settings to the Nginx conf.
server {
    listen 80;                                  # Listen on the HTTP port (80)
    server_name example.com www.example.com;    # Domain names to serve
    root /var/www/example.com/public_html/;     # Document root
    index index.html index.htm;                 # Files to display by default

    # All other requests: private
    location / {
        try_files $uri $uri/ =404;              # Return a 404 error if the file does not exist
        add_header Cache-Control "private";
    }

    location /cache/ {
        add_header Cache-Control "public, max-age=3600";    # 1 hour cache
        #add_header X-Cache-Status $upstream_cache_status;
        # Removing the leading # adds a header that shows whether the cache was hit.
        # The same kind of header can be returned on the CDN side, so this time we set it on the CDN side.
    }

    location /nocache/ {
        add_header Cache-Control "no-cache";    # Cache disabled
        #add_header X-Cache-Status $upstream_cache_status;
        # Removing the leading # adds a header that shows whether the cache was hit.
        # The same kind of header can be returned on the CDN side, so this time we set it on the CDN side.
    }
}
After completing the settings, check the syntax and restart Nginx.
nginx -t
If you see something like the following, there is no syntax error.
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
If there are no problems, restart Nginx
systemctl restart nginx
Check header
After applying the settings, open the test site and confirm that the Cache-Control header is added for each directory.
If the domain is not registered in DNS, add the server's IP address to your hosts file and check the headers with the developer tools (F12) of a browser such as Chrome.
Cache-Control for each path should be as follows.
/cache/test.html
⇒Cache-Control: public, max-age=3600 (cached and retained for 1 hour)
/nocache/nocache.html
⇒Cache-Control: no-cache (not cached by the CDN; always retrieved from the origin server)
/nocache.html
⇒Cache-Control: private (treated as private; only browser caching is allowed)
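If you prefer the command line to browser tools, the same check can be sketched with curl. The helper below, extract_header, is a hypothetical function that reads raw response headers on standard input and prints the value of one header. The IP address 203.0.113.10 in the commented curl call is a placeholder for your server, and --resolve stands in for the hosts file entry.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and
# print the value of one header.
# $1: header name (matched case-insensitively)
extract_header() {
  tr -d '\r' | awk -v name="$1" '
    BEGIN { name = tolower(name) }
    {
      sep = index($0, ":")
      if (sep > 0 && tolower(substr($0, 1, sep - 1)) == name) {
        value = substr($0, sep + 1)
        sub(/^[ \t]+/, "", value)   # strip leading whitespace
        print value
      }
    }'
}

# Example with canned headers (what /cache/test.html is expected to return):
printf 'HTTP/1.1 200 OK\r\nCache-Control: public, max-age=3600\r\n\r\n' |
  extract_header cache-control   # prints: public, max-age=3600

# Against the real test server (203.0.113.10 is a placeholder IP):
# curl -s -o /dev/null -D - --resolve example.com:80:203.0.113.10 \
#   http://example.com/cache/test.html | extract_header cache-control
```

The curl call uses a GET request and discards the body, so the headers are the ones a browser would receive for the same path.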
Settings on Cloud CDN side
Cache mode
⇒Use origin settings based on Cache-Control headers
Cache key
⇒Use the default, or, if you need more detailed settings, choose the custom option and edit the cache key.
custom response headers
⇒ Header name: cdn_cache_status
⇒ Header value 1: {cdn_cache_status}
The cache status (hit or miss) is added to the response headers, so you can check it with debugging tools.
The settings are now complete.
Clicking Finish saves them immediately.
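For reference, the console settings described here could also be applied from the command line. The sketch below only prints a gcloud command for review and does not run it. It assumes a global external Application Load Balancer, and the backend service name web-backend is hypothetical; replace it with your own and check the flags against your gcloud version before running the printed command.

```shell
# Print (do not run) a gcloud command that mirrors the console settings.
# Assumptions: a global external Application Load Balancer, and a backend
# service whose name is passed as $1 ("web-backend" below is hypothetical).
print_cdn_update_command() {
  printf '%s \\\n' "gcloud compute backend-services update $1" \
    "  --global" \
    "  --enable-cdn" \
    "  --cache-mode=USE_ORIGIN_HEADERS"
  printf '%s\n' "  --custom-response-header='cdn_cache_status: {cdn_cache_status}'"
}

print_cdn_update_command web-backend
```

Printing the command first makes it easy to review the target backend service before any change is made, in line with the recommendation to verify before adjusting.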
Check cache status
Open your browser's incognito window and check each path again.
*The response headers may not appear as expected because of the browser cache. If necessary, try accessing the site with a different browser.
/cache/test.html
⇒Cache-Control: public, max-age=3600 #Cached and retained for 1 hour
⇒cdn_cache_status: hit #On the first access nothing is cached yet, so it is a miss. If you open the same page again in a separate window, it is a hit.
/nocache/nocache.html
⇒Cache-Control: no-cache #Not cached, always retrieved from the origin server.
⇒cdn_cache_status: miss #Caching is not performed, so it is a miss.
/nocache.html
⇒Cache-Control: private #Treated as private and only browser caching is allowed.
⇒cdn_cache_status: miss #Caching is not performed on the CDN side, so it is a miss.
After generating cache hits several times, you can check the hit rate in the GCP console: go to CLB > CDN, click the origin name, and view Cloud CDN in Monitoring.
Reviewing the hit rate and bandwidth as you refine the cache settings gives you a rough idea of how far the cache's effect extends, so adjust the cache scope to suit your conditions.
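For a quick sanity check outside the console, the hit rate can also be estimated from cdn_cache_status values you have collected yourself. The following is a minimal sketch; the function name hit_ratio is hypothetical, and it assumes one status value per line on standard input. It is not a substitute for the figures in Monitoring.

```shell
# Hypothetical helper: read one cdn_cache_status value per line on stdin
# and print the percentage of lines that are "hit".
hit_ratio() {
  awk '
    NF { total++; if (tolower($1) == "hit") hits++ }
    END { if (total > 0) printf "%.1f\n", hits * 100 / total }'
}

# One miss (first access) followed by three hits:
printf 'miss\nhit\nhit\nhit\n' | hit_ratio   # prints 75.0
```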
Summary
Combining Cache-Control headers with Cloud CDN gives you fine-grained control over how your static content is cached.
Caching also speeds up content delivery and reduces the load on origin servers.
We hope this blog post helps you leverage caching with Cloud CDN.
Beyond also provides CDN construction, operation, and maintenance services, so please feel free to contact us with any CDN-related inquiries!
https://beyondjapan.com/service/cdn/