Headless mode is now standard in Google Chrome 59, so I'll try it out

Hello,
I'm Mandai, the Wild Team member of the development team.
Google Chrome version 59 now includes a headless mode.
Until now, tools such as PhantomJS and Selenium were the popular choices for headless browsing (the one I have personally used is Watir; see the Headless Browser List).
Both of these required various preparatory steps to set up the environment, but with Google Chrome 59, you can set up a headless environment just by installing it! So, I decided to give it a try.
As of version 59, headless mode is not available on Windows, so I verified everything on CentOS 7.
Update Google Chrome
If you installed it via yum, updating is easy
sudo yum -y upgrade google-chrome-stable
If you haven't installed it yet, see this page and install it.
The repository for Google Chrome will also be automatically added, so you can run future updates using the yum command.
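Before moving on, it may be worth confirming that the installed version is 59 or later. The following is a minimal sketch, not a definitive check: the function names chrome_major_version and supports_headless are hypothetical helpers introduced for illustration, and the script assumes the version string looks like "Google Chrome 59.0.3071.86".

```shell
#!/bin/sh
# Extract the major version from a string such as
# "Google Chrome 59.0.3071.86" (assumed output format of `google-chrome --version`).
chrome_major_version() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\)\..*$/\1/p'
}

# Succeed when the version string is 59 or later,
# the release this article uses for headless mode.
supports_headless() {
    major=$(chrome_major_version "$1")
    [ -n "$major" ] && [ "$major" -ge 59 ]
}

# Only query the real browser when it is installed.
if command -v google-chrome >/dev/null 2>&1; then
    version=$(google-chrome --version)
    if supports_headless "$version"; then
        echo "$version: headless mode should be available"
    else
        echo "$version: please upgrade to version 59 or later"
    fi
fi
```

If the check reports an older version, running the yum upgrade command above should bring it up to date.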
Let's try running it ~ DOM check ~
Let's try accessing Beyond's homepage through Google Chrome
google-chrome --headless --disable-gpu --dump-dom https://beyondjapan.com
<body id="index" style=""> ...
By running it with the option "--dump-dom", I was able to get the DOM of the site
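Because "--dump-dom" writes the DOM to standard output, it can be piped into other commands. As a rough sketch of a simple DOM check, the script below pulls the page title out of the dumped markup. The extract_title function is a hypothetical helper, and it assumes a single lowercase <title> element; a real HTML parser would be more robust.

```shell
#!/bin/sh
# Read HTML on stdin and print the text of the <title> element.
# Newlines are flattened first so a title split across lines still matches.
extract_title() {
    tr '\n' ' ' | sed -n 's/.*<title[^>]*>\([^<]*\)<\/title>.*/\1/p'
}

# Usage: ./title.sh https://beyondjapan.com
# Chrome is only launched when a URL is given on the command line.
if [ "$#" -gt 0 ]; then
    google-chrome --headless --disable-gpu --dump-dom "$1" | extract_title
fi
```

The same pattern works with grep if you only want to confirm that a particular element or string is present in the rendered page.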
Let's try running it ~ Screenshot ~
Next, I'll try taking a screenshot
google-chrome --headless --disable-gpu --screenshot --window-size=1280,1440 https://beyondjapan.com
[0608/054855.748933:INFO:headless_shell.cc(436)] Written to file screenshot.png.
When taking a screenshot, if you do not specify the browser window size, only a very small area is captured, so set the size with the "--window-size=[width],[height]" option.
To save the screenshot under a specific file name:
google-chrome --headless --disable-gpu --screenshot=top.png --window-size=1280,1440 https://beyondjapan.com
[0608/055147.536344:INFO:headless_shell.cc(436)] Written to file top.png.
The file name is passed as the value of the "--screenshot" option.
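Combining "--screenshot=[file]" with "--window-size" makes it possible to capture the same page at several sizes without overwriting earlier files. This is only a sketch: the screenshot_name and capture_sizes functions, the "top" prefix, and the two extra window sizes are assumptions chosen for illustration.

```shell
#!/bin/sh
# Build a file name such as "top-1280x1440.png" from a prefix, width and height.
screenshot_name() {
    printf '%s-%sx%s.png\n' "$1" "$2" "$3"
}

# Capture one URL at several window sizes (illustrative sizes, not recommendations).
capture_sizes() {
    url=$1
    for size in 1280,1440 768,1024 375,667; do
        width=${size%,*}    # part before the comma
        height=${size#*,}   # part after the comma
        google-chrome --headless --disable-gpu \
            --screenshot="$(screenshot_name top "$width" "$height")" \
            --window-size="$width,$height" "$url"
    done
}

# Usage: ./shots.sh https://beyondjapan.com
if [ "$#" -gt 0 ]; then
    capture_sizes "$1"
fi
```

Putting the size in the file name keeps each capture separate and makes it clear which window size produced which image.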
Let's try running it ~ PDF conversion ~
Next, I'll try converting the site to PDF
google-chrome --headless --disable-gpu --print-to-pdf https://beyondjapan.com
[0608/033512.266562:INFO:headless_shell.cc(436)] Written to file output.pdf.
The PDF was output with the name output.pdf
When I looked inside, I saw that the PDF output was exactly the same as the site's layout.
One thing to note: the site header, which is fixed to the top of the window, is printed on every page, so it hides the top of the second and subsequent pages.
I think there are many sites like this, so it's good to be aware of it.
Also, because this command always writes to output.pdf, converting several pages in a row overwrites the previous result each time
google-chrome --headless --disable-gpu --print-to-pdf=top.pdf https://beyondjapan.com
[0608/033723.196640:INFO:headless_shell.cc(436)] Written to file top.pdf.
By specifying a file name as an argument to the "--print-to-pdf" option, you can save the PDF with a different name
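When converting several pages, one way to avoid overwriting is to derive each PDF name from its URL. The sketch below assumes a naming scheme of my own (strip the scheme, replace unsafe characters with "_"); pdf_name_for_url is a hypothetical helper, not part of Chrome.

```shell
#!/bin/sh
# Turn a URL into a PDF file name: strip the scheme, drop one trailing slash,
# replace every character outside [A-Za-z0-9._-] with "_", then append ".pdf".
pdf_name_for_url() {
    printf '%s\n' "$1" \
        | sed -e 's|^[a-zA-Z]*://||' \
              -e 's|/$||' \
              -e 's|[^A-Za-z0-9._-]|_|g' \
              -e 's|$|.pdf|'
}

# Usage: ./pdfs.sh URL [URL ...]
# Each URL given on the command line is saved to its own PDF file.
for url in "$@"; do
    google-chrome --headless --disable-gpu \
        --print-to-pdf="$(pdf_name_for_url "$url")" "$url"
done
```

With this scheme, https://beyondjapan.com would be saved as beyondjapan.com.pdf. Note that two URLs differing only in replaced characters would still map to the same file name.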
Summary
Google Chrome updates to the latest version automatically even when you use it casually, but I thought this update might have quite an impact, so I decided to cover it here
In fact, you can drive Google Chrome launched in headless mode from Node.js via the DevTools Protocol, which makes it possible to control it in a more convenient and fine-grained way. I hope to write an article about this as well.
Today is Rock Day (June 9, or 6/9), so I wanted to do something rock-themed, but it didn't turn out very rock
Addendum: I wrote a related article called [Using headless Google Chrome with Node.js | Beyond Inc.](https://beyondjapan.com/blog/2017/07/headless-chrome-with-nodejs)
Addendum 2: I also wrote a related article called [I tried using Google Chrome's headless mode, as it seemed like it would allow for detailed external monitoring | Beyond Inc.](https://beyondjapan.com/blog/2017/07/headless-chrome-networks)
That's all