One of our clients had an issue with their weekly deployments of a Node.js application that was under active development and had been made available to the public as a beta service. They had identified a caching issue but, despite putting in place sensible mitigations, still could not locate the cause of the issue.
They had an Nginx reverse proxy fronting their many websites. The website running the Node.js application was hosted on a Windows Server 2019 machine running IIS 10.0.
The weekly deployment process ran as follows:
- Build the application using Node.js
- Copy the application to the IIS server
- Clear caches on IIS and Nginx
- Test application
After each deployment, the web application showed significant errors until the browser cache was cleared, which indicated that caching was not behaving as intended.
They had sensibly included, in the Nginx configuration for this web app, the following as instructions to browsers:
add_header Cache-Control 'no-cache';
expires 0;
Cache-Control 'no-cache' does not do what its name suggests. It allows a browser to store the content but requires that the content be revalidated with the server on every use. expires 0 means the content is already deemed to have expired, but it may still be stored to allow revalidation. This is our first clue.
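To make the revalidation step concrete, here is a minimal sketch of how a server might answer a conditional request. The names `Response` and `handle_get` are hypothetical, and the comparison is deliberately simplified (it ignores weak validators and lists of ETags), so treat it as an illustration rather than how IIS or Nginx implement it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    etag: str
    body: Optional[bytes]


def handle_get(resource_body: bytes, resource_etag: str,
               if_none_match: Optional[str]) -> Response:
    """Answer a GET, honouring a conditional If-None-Match header."""
    if if_none_match is not None and if_none_match == resource_etag:
        # Validator matches: tell the browser to reuse its stored copy.
        return Response(304, resource_etag, None)
    # No validator, or it differs: send the full resource again.
    return Response(200, resource_etag, resource_body)


# First visit: nothing is stored yet, so the browser gets a full response.
first = handle_get(b"build 1", '"abc"', None)
# Later visit: 'no-cache' forces a revalidation that sends the stored ETag.
second = handle_get(b"build 1", '"abc"', first.etag)
```

The point of the sketch is that the browser's stored copy is only replaced when the validator changes. If the validator stays the same, the stored copy is reused, whatever the server now holds.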
In the response headers, we noticed that IIS was adding an identical ETag header for each resource. An ETag is meant to be a unique identifier for a specific version of a resource, so that it can be used to check whether the resource has changed. There is no standard method for producing ETags, but many web servers hash either the file contents or the last modified date. Since the ETag was the same for all files, we could rule out IIS hashing the file contents. It must instead have been deriving the value from something common to all files served as part of the web application. This is our second clue.
When a Node.js application is built, the files have a default modification date of 26 October 1985 08:15:00 (1985-10-26T08:15:00.000Z). A hard-coded default is used on purpose to allow Node.js to be compliant with reproducible builds (NPM:PR20027). This means the files in each of the weekly builds had exactly the same modification date, and therefore the same ETag from one build to the next. The development of the application was iterative, so many filenames remained the same between builds but had different contents. When the browser revalidated a cached file, the unchanged ETag told it the file had not changed, so it kept the old copy. The browser ended up mixing resources from different builds, which caused the significant errors. This is consistent with the observation that the errors were resolved when the browser cache was cleared.
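The effect can be reproduced with a small sketch. The two validator functions below are hypothetical stand-ins, not the algorithm IIS actually uses; they only contrast a validator derived from the modification time with one derived from the contents. The file name `app.js` and its contents are made up for the illustration.

```python
import hashlib
import os
import tempfile

# Fixed timestamp quoted above: 1985-10-26T08:15:00Z as seconds since the epoch.
FIXED_MTIME = 499162500


def mtime_etag(path: str) -> str:
    """Hypothetical validator derived only from the last modified time."""
    return f'"{int(os.stat(path).st_mtime):x}"'


def content_etag(path: str) -> str:
    """Hypothetical validator derived from the file contents."""
    with open(path, "rb") as handle:
        return f'"{hashlib.sha256(handle.read()).hexdigest()[:16]}"'


def write_build(directory: str, name: str, contents: bytes) -> str:
    """Write a file and stamp it with the fixed modification time."""
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(contents)
    os.utime(path, (FIXED_MTIME, FIXED_MTIME))
    return path


with tempfile.TemporaryDirectory() as week1, tempfile.TemporaryDirectory() as week2:
    old = write_build(week1, "app.js", b"console.log('week 1');")
    new = write_build(week2, "app.js", b"console.log('week 2');")
    same_by_mtime = mtime_etag(old) == mtime_etag(new)
    same_by_content = content_etag(old) == content_etag(new)

print(same_by_mtime, same_by_content)
```

Under these assumptions the time-based validators of the two builds match even though the contents differ, while the content-based validators do not match.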
The Cause
- Node.js creates builds with a default modification date.
- When iterative builds are created, many files will have the same names but may have different contents.
- IIS by default uses the last modified date to produce an ETag header, which is used to determine whether the file content has changed from the previously requested version. If the ETag is the same, the file content is assumed not to have changed.
Potential solutions
Choose one of the following:
- Prevent IIS from sending the ETag header
- Strip the ETag header as the response passes through Nginx
- Ensure the last modified date of the files is the actual build date and not the default
The client decided the third option was the most appropriate, as this would permit proper caching by browsers whilst avoiding the application errors on deployment.
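One way to apply the real build date is a small post-build step that rewrites the timestamps before the files are copied to the server. The sketch below is an assumption about how that could be done, not the client's actual script. The function name `stamp_build_time` is hypothetical, and the demo uses a throwaway directory with placeholder files and an arbitrary example timestamp.

```python
import os
import tempfile
import time
from typing import Optional


def stamp_build_time(root: str, build_time: Optional[float] = None) -> int:
    """Set the access and modification time of every file under root.

    Defaults to the current time. Returns the number of files updated.
    """
    if build_time is None:
        build_time = time.time()
    count = 0
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            os.utime(os.path.join(directory, filename), (build_time, build_time))
            count += 1
    return count


# Demo on a throwaway directory standing in for the build output.
names = ("index.html", os.path.join("static", "app.js"))
with tempfile.TemporaryDirectory() as build_dir:
    os.makedirs(os.path.join(build_dir, "static"))
    for name in names:
        with open(os.path.join(build_dir, name), "w") as handle:
            handle.write("placeholder")
    example_time = 1_700_000_000.0  # arbitrary example timestamp
    stamped = stamp_build_time(build_dir, example_time)
    mtimes = {int(os.stat(os.path.join(build_dir, n)).st_mtime) for n in names}
```

In a real pipeline the function would be called once on the build output directory, without the second argument, between the build and copy steps. Every file then carries the time of that build, so a validator based on the modification date changes with each deployment.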