Headless mode is now standard in Google Chrome 59, so I'll try it out

Hello.
I'm Mandai, the Wild team member in charge of development.
Google Chrome version 59 introduced a headless mode.
Until now, the well-known options for headless browsing were PhantomJS and Selenium (I also personally used Watir; see the list of headless browsers).
All of these required a fair amount of environment setup, but with Google Chrome 59 you get a headless environment just by installing the browser! So I decided to give it a try.
As of version 59, headless mode does not support Windows, so I tested it on CentOS 7.
Update Google Chrome
If you installed it via yum, updating is easy
sudo yum -y upgrade google-chrome-stable
If it's not already installed, download the rpm package from this page and install it.
The repository for Google Chrome will be added automatically, so future updates can be performed using the yum command.
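Since headless mode needs version 59 or later, it can be handy to check the installed version before going further. The following is only a minimal sketch: the function names are my own, and it assumes the version string looks like "Google Chrome 59.0.3071.86", which is what "google-chrome --version" normally prints.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Chrome) for checking the version.

# Extract the major version from a string such as
# "Google Chrome 59.0.3071.86".
chrome_major_version() {
  # Keep the first run of digits that is followed by a dot.
  printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\)\..*$/\1/p'
}

# Succeed (exit status 0) when the version string is 59 or newer.
supports_headless() {
  local major
  major=$(chrome_major_version "$1")
  [ -n "$major" ] && [ "$major" -ge 59 ]
}
```

With Chrome installed, it could be used like this: supports_headless "$(google-chrome --version)" && echo "headless mode should be available"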
Let's try running it ~ DOM check ~
Let's try accessing Beyond's homepage through Google Chrome
google-chrome --headless --disable-gpu --dump-dom https://beyondjapan.com
<body id="index" style=""> ...
By running it with the option "--dump-dom", I was able to get the DOM of the site
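Because the DOM is written to standard output, it can be piped into ordinary text tools. As a rough sketch (the helper name extract_title is hypothetical, and a regex like this is only suitable for quick checks, not real HTML parsing), here is one way to pull the page title out of the dump:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the text of a <title> element read from stdin.
# If the document contains several <title> tags, the last one is printed.
extract_title() {
  # Join all lines first so a title split across lines still matches.
  tr '\n' ' ' | sed -n 's/.*<title[^>]*>\([^<]*\)<\/title>.*/\1/p'
}
```

Usage, assuming Chrome is installed: google-chrome --headless --disable-gpu --dump-dom https://beyondjapan.com | extract_title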
Let's try running it ~ Screenshot ~
Next, I'll try taking a screenshot
google-chrome --headless --disable-gpu --screenshot --window-size=1280,1440 https://beyondjapan.com
[0608/054855.748933:INFO:headless_shell.cc(436)] Written to file screenshot.png.
If you do not specify the browser window size when taking a screenshot, only a very small area is captured, so set the size with the "--window-size=[width],[height]" option
To save the screenshot under a specific file name, pass the file name as the value of the "--screenshot" option
google-chrome --headless --disable-gpu --screenshot=top.png --window-size=1280,1440 https://beyondjapan.com
[0608/055147.536344:INFO:headless_shell.cc(436)] Written to file top.png.
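Combining "--screenshot=[file name]" and "--window-size" makes it easy to capture the same page at several sizes. This is just a sketch under my own assumptions: the CHROME variable, the function names, and the "label_WIDTHxHEIGHT.png" naming scheme are all hypothetical, and only the flags shown earlier in this article are used.

```shell
#!/usr/bin/env bash
# Hypothetical variable: the Chrome binary to run (override it if needed).
CHROME=${CHROME:-google-chrome}

# Build a file name such as "top_1280x1440.png" from a label and a size.
screenshot_name() {
  printf '%s_%sx%s.png\n' "$1" "$2" "$3"
}

# Capture one URL at several window sizes.
# Arguments: URL, label, then any number of "width,height" pairs.
capture_sizes() {
  local url=$1 label=$2 size width height
  shift 2
  for size in "$@"; do
    width=${size%,*}    # text before the comma
    height=${size#*,}   # text after the comma
    "$CHROME" --headless --disable-gpu \
      --screenshot="$(screenshot_name "$label" "$width" "$height")" \
      --window-size="$width,$height" "$url"
  done
}
```

For example, capture_sizes https://beyondjapan.com top 1280,1440 375,667 would be expected to write top_1280x1440.png and top_375x667.png to the current directory.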
Let's try running it ~ PDF conversion ~
Next, try converting the site to PDF
google-chrome --headless --disable-gpu --print-to-pdf https://beyondjapan.com
[0608/033512.266562:INFO:headless_shell.cc(436)] Written to file output.pdf.
The PDF was output with the name output.pdf
Upon examining the contents, the output PDF perfectly replicates the website's layout.
While not particularly unusual, the website's header, which is fixed at the top, is displayed on each page, obscuring the top of subsequent pages.
I believe this is a fairly common issue, so it's good to be aware of it.
Also, since this command always writes to output.pdf, converting multiple pages one after another will overwrite the previous result each time
google-chrome --headless --disable-gpu --print-to-pdf=top.pdf https://beyondjapan.com
[0608/033723.196640:INFO:headless_shell.cc(436)] Written to file top.pdf.
By specifying a file name as an argument to the "--print-to-pdf" option, you can save the PDF with a different name
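To avoid overwriting when converting several pages, one option is to derive the PDF name from each URL. The following is a minimal sketch, not a definitive implementation: the CHROME variable, the function names, and the naming rule (drop the scheme, replace unusual characters with "_") are my own assumptions, and the URLs in the usage note are only examples.

```shell
#!/usr/bin/env bash
# Hypothetical variable: the Chrome binary to run (override it if needed).
CHROME=${CHROME:-google-chrome}

# Turn a URL into a file name,
# e.g. https://beyondjapan.com/blog/ becomes beyondjapan.com_blog.pdf
pdf_name_for_url() {
  local name
  name=${1#*://}   # drop the scheme ("https://")
  name=${name%/}   # drop one trailing slash, if any
  # Replace every character outside A-Z, a-z, 0-9, ".", "_" and "-".
  name=$(printf '%s' "$name" | tr -c 'A-Za-z0-9._-' '_')
  printf '%s.pdf\n' "$name"
}

# Convert every URL given as an argument to its own PDF file.
print_all_to_pdf() {
  local url
  for url in "$@"; do
    "$CHROME" --headless --disable-gpu \
      --print-to-pdf="$(pdf_name_for_url "$url")" "$url"
  done
}
```

For example, print_all_to_pdf https://beyondjapan.com https://beyondjapan.com/blog/ would be expected to produce beyondjapan.com.pdf and beyondjapan.com_blog.pdf instead of overwriting a single output.pdf.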
Summary
Google Chrome updates to the latest version automatically even when you use it casually, but I thought this update might have quite an impact, so I decided to cover it here
Actually, there's a way to control Google Chrome running in headless mode from Node.js via the DevTools Protocol, which I think makes it easier to use and allows for deeper exploration. I'd like to write an article about this in the future.
Today is Rock Day (6/9), so I wanted to do something rock-themed, but it didn't turn out that way
Addendum: I wrote a related article called [Using headless Google Chrome with Node.js | Beyond Inc.](https://beyondjapan.com/blog/2017/07/headless-chrome-with-nodejs)
Addendum 2: I also wrote a related article called [I tried using Google Chrome's headless mode, as it seemed like it would allow for detailed external monitoring | Beyond Inc.](https://beyondjapan.com/blog/2017/07/headless-chrome-networks)
That's all
