I wondered whether Google Chrome's headless mode could be used for detailed appearance monitoring, so I tried it.
Hello.
I'm Mandai, in charge of Wild on the development team.
Beyond Co., Ltd. is an MSP, so we monitor various things depending on the customer's order.
One of these is what we call appearance monitoring (often known as synthetic or external monitoring): accessing the site in the same way users do, or close to it, and checking whether the page is displayed correctly.
There are various tools that can access a site and check whether the data is retrieved correctly, and you can also roll your own with curl and the like. With these methods, however, the person who receives the alert can only find the cause by actually visiting the site.
That works, but with Google Chrome's headless mode you can get detailed information via DevTools, such as which content is taking a long time to fetch. I would like to explore whether this makes it possible to investigate the cause without looking at the site.
Preparation
First, gather everything you need to try it out.
A sample that works as-is when copied and pasted appears later in this post.
Google Chrome
First, install Google Chrome version 59 or higher, which supports headless mode.
Version 59 is already in the stable channel, so you can install it as usual with the installer.
If automatic updates are enabled, you probably have it already.
Node.js
This time I used the latest version, 8.0.0, but I think a fairly old version would work too.
I installed the following modules using npm.
- chrome-launcher
- chrome-remote-interface
chrome-launcher is used to launch Google Chrome from Node.js.
chrome-remote-interface is used when accessing Google Chrome's DevTools from Node.js.
npm install chrome-launcher chrome-remote-interface
The sample script launches Google Chrome every time it starts and stops it when it finishes. If you use this seriously, you could probably leave Google Chrome running instead, although memory usage seems to grow quickly, so you may need to restart it periodically.
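As a rough sketch of the "leave Chrome running, but restart it now and then" idea, the helper below reuses one Chrome instance and relaunches it after a fixed number of checks. The names `createChromeKeeper` and `maxChecks` are made up for illustration, and `launch` is assumed to be any function that returns a promise for an object with a `kill()` method. Counting checks is only one possible restart policy, and I have not measured what threshold would be appropriate.

```javascript
// Hypothetical helper: keeps one Chrome instance alive and relaunches it
// after `maxChecks` uses, as a crude guard against growing memory usage.
// `launch` is assumed to return a Promise for an object with kill().
function createChromeKeeper(launch, maxChecks) {
  let chrome = null;
  let used = 0;

  return {
    // Returns a running instance, restarting it first if it has been
    // handed out `maxChecks` times already.
    async acquire() {
      if (chrome && used >= maxChecks) {
        await chrome.kill();
        chrome = null;
      }
      if (!chrome) {
        chrome = await launch();
        used = 0;
      }
      used += 1;
      return chrome;
    },

    // Stops the current instance, if there is one.
    async shutdown() {
      if (chrome) {
        await chrome.kill();
        chrome = null;
      }
    },
  };
}
```

With the real module, `launch` could be something like `() => ChromeLauncher.launch({ port: 9222, chromeFlags: ['--headless', '--disable-gpu'] })`, and each monitoring run would call `acquire()` before connecting.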
Once you have installed the above two things, you are ready to go.
For the basics of Google Chrome's headless mode, please refer to "Google Chrome 59 comes with headless mode as standard, so let's play around with it."
For the basics of using headless Google Chrome from Node.js, please refer to "Try using headless Google Chrome from Node.js."
Sample script
We have prepared the sample below, so feel free to use it!
const ChromeLauncher = require('chrome-launcher');
const CDP = require('chrome-remote-interface');

function launchChrome(headless = true) {
  ChromeLauncher.launch({
    port: 9222,
    autoSelectChrome: true,
    chromeFlags: [
      '--disable-gpu',
      headless ? '--headless' : '',
      '--no-sandbox',
    ],
  }).then(launcher => {
    CDP(protocol => {
      // The script never exits on its own, so shut everything down after 10 seconds.
      setTimeout(() => {
        protocol.close();
        launcher.kill();
        process.exit();
      }, 10000);

      const {Page, Network} = protocol;

      Promise.all([
        Page.enable(),
        Network.enable(),
      ]).then(() => {
        Page.navigate({url: 'https://beyondjapan.com/blog/2017/07/headless-chrome-networks'});

        // Print every response the page receives.
        Network.responseReceived(res => {
          console.log(res);
        });
      });
    });
  });
}

launchChrome();
Since the script does not exit on its own, I use setTimeout to end it after 10 seconds.
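Instead of always waiting the full 10 seconds, one option is to stop as soon as the page's load event fires and keep the timer only as a fallback. The sketch below assumes the `Page` object from the sample, whose `loadEventFired` accepts a callback in chrome-remote-interface. The function name `waitForLoad` and its return values are my own. Note that requests started after the load event would be missed with this approach.

```javascript
// Hypothetical helper: resolves with 'loaded' when the page's load event
// fires, or with 'timeout' if that does not happen within `timeoutMs`.
// `Page` is assumed to be the Page domain from chrome-remote-interface.
function waitForLoad(Page, timeoutMs) {
  return new Promise(resolve => {
    // Fallback so that a page that never finishes loading cannot hang us.
    const timer = setTimeout(() => resolve('timeout'), timeoutMs);

    Page.loadEventFired(() => {
      clearTimeout(timer);
      resolve('loaded');
    });
  });
}
```

In the sample, this could replace the fixed timer: call `waitForLoad(Page, 10000)` before `Page.navigate(...)`, then close the protocol and kill the launcher once the returned promise resolves.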
Pages that trigger a lot of requests produce a lot of output, but here is a single entry extracted from it:
{
  requestId: '30371.7375',
  frameId: '30371.1',
  loaderId: '30371.190',
  timestamp: 1197407.412915,
  type: 'Document',
  response: {
    url: 'https://beyondjapan.com/',
    status: 200,
    statusText: 'OK',
    headers: {
      Date: 'Tue, 18 Jul 2017 04:16:02 GMT',
      'Transfer-Encoding': 'chunked',
      Server: 'nginx',
      Connection: 'keep-alive',
      Link: '<https://beyondjapan.com/cms-json/>; rel="https://api.w.org/"',
      Vary: 'Accept-Encoding',
      'Content-Type': 'text/html; charset=UTF-8'
    },
    mimeType: 'text/html',
    connectionReused: false,
    connectionId: 5394,
    remoteIPAddress: '122.218.102.93',
    remotePort: 80,
    fromDiskCache: false,
    fromServiceWorker: false,
    encodedDataLength: 253,
    timing: {
      requestTime: 1197406.917743,
      proxyStart: -1,
      proxyEnd: -1,
      dnsStart: 0.944999977946281,
      dnsEnd: 2.30200006626546,
      connectStart: 2.30200006626546,
      connectEnd: 7.89500004611909,
      sslStart: -1,
      sslEnd: -1,
      workerStart: -1,
      workerReady: -1,
      sendStart: 7.98300001770258,
      sendEnd: 8.11700010672212,
      pushStart: 0,
      pushEnd: 0,
      receiveHeadersEnd: 494.426999939606
    },
    protocol: 'http/1.1',
    securityState: 'neutral'
  }
}
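It looks like most of the information shown in the Network tab of DevTools is available.
Judging from the values, timestamp and timing.requestTime are in seconds, while the other fields in timing are offsets from requestTime in milliseconds (timestamp minus requestTime is about 0.495, which matches receiveHeadersEnd at about 494).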
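To make the timing values easier to read, here is a minimal sketch that turns the `timing` object from a response into per-phase durations. It relies on the fields inside `timing` other than `requestTime` being millisecond offsets from `requestTime`, and it treats `-1` as "this phase did not happen", which is how the sample output above appears to use it. The function name `summarizeTiming` and the shape of the result are my own.

```javascript
// Turns the `timing` object of a Network.responseReceived event into
// per-phase durations in milliseconds. A value of -1 is taken to mean the
// phase did not happen (for example, no SSL handshake), reported as null.
function summarizeTiming(timing) {
  const span = (start, end) =>
    start < 0 || end < 0 ? null : end - start;

  return {
    dns: span(timing.dnsStart, timing.dnsEnd),
    connect: span(timing.connectStart, timing.connectEnd),
    ssl: span(timing.sslStart, timing.sslEnd),
    send: span(timing.sendStart, timing.sendEnd),
    // From the end of sending the request until response headers arrived.
    wait: span(timing.sendEnd, timing.receiveHeadersEnd),
    // Offset from requestTime until response headers arrived.
    total: timing.receiveHeadersEnd,
  };
}
```

For the entry above, this would show that nearly all of the roughly 494 ms was spent waiting for the server to respond, with DNS and connection setup taking only a few milliseconds.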
It also looks like almost anything you can do in DevTools is possible here, such as deleting and adding cookies (which may be useful for monitoring sites that use sessions), clearing the cache, and rewriting the user agent.
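As a sketch of how that might look, the helper below sets up the browser state before navigating. It assumes the `Network` object from the sample and uses the DevTools Protocol methods `Network.clearBrowserCache`, `Network.setUserAgentOverride`, and `Network.setCookie`. Their availability and exact parameters may differ between Chrome versions, so check the protocol documentation for the version you run. The name `prepareSession` and the `options` shape are made up for illustration.

```javascript
// Hypothetical helper: applies per-check browser state before navigating.
// `Network` is assumed to be the Network domain from chrome-remote-interface.
async function prepareSession(Network, options) {
  if (options.clearCache) {
    // Start each check from a cold cache.
    await Network.clearBrowserCache();
  }
  if (options.userAgent) {
    await Network.setUserAgentOverride({ userAgent: options.userAgent });
  }
  for (const cookie of options.cookies || []) {
    // Each cookie is expected to look like { url, name, value }.
    await Network.setCookie(cookie);
  }
}
```

In the sample, this would be called after `Network.enable()` resolves and before `Page.navigate(...)`, for example with a session cookie for the site being monitored.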
It looks like it would be fun to create a UI that displays the data obtained using D3.js!
Summary
What did you think of appearance monitoring with Google Chrome's headless mode?
Using an off-the-shelf monitoring tool is probably cheaper, but because you get the response for every piece of content the page loads, you can tell right away that, say, something went wrong at the CDN or an API is responding extremely slowly. So there does seem to be a use for it.
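For example, if the events passed to `Network.responseReceived` are collected into an array, a small filter like the one below could pick out the suspicious ones. This is only a sketch: the name `findProblemResponses`, the `slowMs` threshold, and the rule "status 400 or above, or slow to return headers" are my own assumptions, not something built into DevTools.

```javascript
// Hypothetical helper: picks out responses that failed or were slow.
// `events` is an array of Network.responseReceived payloads, and `slowMs`
// is the threshold for time until response headers, in milliseconds.
function findProblemResponses(events, slowMs) {
  return events
    .map(({ type, response }) => ({
      url: response.url,
      type,
      status: response.status,
      // timing may be missing (for example, for cached responses).
      waitMs: response.timing ? response.timing.receiveHeadersEnd : null,
    }))
    .filter(r => r.status >= 400 || (r.waitMs !== null && r.waitMs > slowMs));
}
```

Attaching the result to the alert would let the person receiving it see which URL was slow or failing without opening the site.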
That's it.