Headless mode is now standard in Google Chrome 59, so I'll try it out

2017.06.09


table of contents

1 Update Google Chrome
2 Let's try running it ~ DOM check ~
3 Let's try running it ~ Screenshot ~
4 Let's try running it ~ PDF conversion ~
5 summary

Hello.
I'm Mandai, the Wild team member in charge of development.

Google Chrome version 59 introduced a headless mode.
Until now, the well-known options for headless browsing were PhantomJS and Selenium (I personally also used Watir; see the list of headless browsers).
All of these required some environment setup beforehand, but with Google Chrome 59 you get a headless environment just by installing the browser! So I decided to give it a try.

As of version 59, headless mode does not support Windows, so I tested it on CentOS 7.

Update Google Chrome

If you installed Google Chrome via yum, updating is easy:

sudo yum -y upgrade google-chrome-stable

If it's not already installed, download the rpm package from this page and install it. The repository for Google Chrome will be added automatically, so future updates can be performed using the yum command.
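
Before trying headless mode, it can help to confirm that the installed Chrome really is version 59 or later. The following is only a rough sketch: the function names `chrome_major_version` and `supports_headless` are made up for this example, and it assumes that `google-chrome --version` prints a string in the form "Google Chrome 59.0.3071.86".

```shell
# Extract the major version from a string such as "Google Chrome 59.0.3071.86".
# (chrome_major_version is a hypothetical helper name, not part of Chrome.)
chrome_major_version() {
  # Drop everything before the first digit, keep the digits before the first dot.
  echo "$1" | sed -E 's/^[^0-9]*([0-9]+)\..*$/\1/'
}

# Succeed (exit status 0) when the given version string is 59 or newer.
supports_headless() {
  [ "$(chrome_major_version "$1")" -ge 59 ]
}

# Example (requires google-chrome on the PATH):
#   supports_headless "$(google-chrome --version)" && echo "headless mode available"
```

If the check fails, running the yum upgrade shown above should bring the package up to a version with headless mode.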

Let's try running it ~ DOM check ~

Let's try accessing Beyond's homepage through Google Chrome

google-chrome --headless --disable-gpu --dump-dom https://beyondjapan.com
<body id="index" style=""> ...

By running it with the option "--dump-dom", I was able to get the DOM of the site
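
Since the DOM is written to standard output, it can be piped into the usual text tools. As a minimal sketch, the helper below pulls the `href` values out of the dumped markup. The names `extract_links` and `page_links` are hypothetical, and the pattern only handles double-quoted `href` attributes, so treat it as a quick check rather than a real HTML parser.

```shell
# Read HTML on stdin and print every double-quoted href value, one per line.
# (extract_links is a hypothetical helper; it is a plain text match, not a parser.)
extract_links() {
  grep -o 'href="[^"]*"' | sed -e 's/^href="//' -e 's/"$//'
}

# Hypothetical wrapper: dump the DOM of the given URL and list its links.
page_links() {
  google-chrome --headless --disable-gpu --dump-dom "$1" | extract_links
}

# Example (requires google-chrome on the PATH):
#   page_links https://beyondjapan.com
```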

Let's try running it ~ Screenshot ~

Next, I'll try taking a screenshot

google-chrome --headless --disable-gpu --screenshot --window-size=1280,1440 https://beyondjapan.com
[0608/054855.748933:INFO:headless_shell.cc(436)] Written to file screenshot.png.

If you do not specify a window size when taking a screenshot, only a very small area is captured, so set the size with the "--window-size=[width],[height]" option.

To save the screenshot under a specific file name:

google-chrome --headless --disable-gpu --screenshot=top.png --window-size=1280,1440 https://beyondjapan.com
[0608/055147.536344:INFO:headless_shell.cc(436)] Written to file top.png.

Specify the file name as an argument to the "--screenshot" option
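
Combining the two options makes it possible to capture the same page at several window sizes in one go. This is only a sketch under a few assumptions: `screenshot_name` and `capture_sizes` are hypothetical helper names, and the sizes are passed in the same "width,height" form that "--window-size" takes.

```shell
# Build a file name such as "top_1280x1440.png" from a base name and "WIDTH,HEIGHT".
# (screenshot_name is a hypothetical helper.)
screenshot_name() {
  echo "${1}_$(echo "$2" | tr ',' 'x').png"
}

# Hypothetical wrapper: capture one URL at each of the given window sizes.
#   $1 = URL, $2 = base file name, remaining arguments = "WIDTH,HEIGHT" pairs
capture_sizes() {
  url="$1"
  base="$2"
  shift 2
  for size in "$@"; do
    google-chrome --headless --disable-gpu \
      --screenshot="$(screenshot_name "$base" "$size")" \
      --window-size="$size" "$url"
  done
}

# Example (requires google-chrome on the PATH):
#   capture_sizes https://beyondjapan.com top 1280,1440 375,667
```

Each size gets its own file name, so the captures do not overwrite one another.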

Let's try running it ~ PDF conversion ~

Next, try converting the site to PDF

google-chrome --headless --disable-gpu --print-to-pdf https://beyondjapan.com
[0608/033512.266562:INFO:headless_shell.cc(436)] Written to file output.pdf.

The PDF was output with the name output.pdf

Upon examining the contents, the output PDF perfectly replicates the website's layout.
While not particularly unusual, the website's header, which is fixed at the top, is displayed on each page, obscuring the top of subsequent pages.
I believe this is a fairly common issue, so it's good to be aware of it.

Also, because the default file name is always output.pdf, converting several pages with this command overwrites the previous PDF each time.

google-chrome --headless --disable-gpu --print-to-pdf=top.pdf https://beyondjapan.com
[0608/033723.196640:INFO:headless_shell.cc(436)] Written to file top.pdf.

By specifying a file name as an argument to the "--print-to-pdf" option, you can save the PDF with a different name
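
When converting several pages, one way to avoid the overwriting problem is to derive the file name from each URL. The sketch below is just one possible approach: `url_to_pdf_name` and `pdf_all` are hypothetical names, and the naming rule (strip the scheme, replace anything that is not a letter, digit, dot, underscore, or hyphen with "_") is my own assumption.

```shell
# Turn a URL into a flat PDF file name, e.g.
#   https://beyondjapan.com/blog/ -> beyondjapan.com_blog.pdf
# (url_to_pdf_name is a hypothetical helper.)
url_to_pdf_name() {
  echo "$1" | sed -E \
    -e 's#^[a-zA-Z]+://##' \
    -e 's#/+$##' \
    -e 's#[^A-Za-z0-9._-]+#_#g' \
    -e 's#$#.pdf#'
}

# Hypothetical wrapper: convert every URL given as an argument to its own PDF.
pdf_all() {
  for url in "$@"; do
    google-chrome --headless --disable-gpu \
      --print-to-pdf="$(url_to_pdf_name "$url")" "$url"
  done
}

# Example (requires google-chrome on the PATH):
#   pdf_all https://beyondjapan.com https://beyondjapan.com/blog/
```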

summary

Google Chrome updates to the latest version automatically even when you use it casually, but I thought this update might have quite an impact, so I decided to cover it here

Actually, you can also control Google Chrome in headless mode from Node.js via the DevTools Protocol, which I think makes it easier to use casually and lets you dig deeper. I'd like to write an article about this in the future.

Today, June 9 (6/9), is Rock Day, so I wanted to do something rock-themed, but it didn't turn out that rock after all.

Addendum: I wrote a related article called [Using headless Google Chrome with Node.js | Beyond Inc.](https://beyondjapan.com/blog/2017/07/headless-chrome-with-nodejs)

Addendum 2: I also wrote a related article called [I tried using Google Chrome's headless mode, as it seemed like it would allow for detailed external monitoring | Beyond Inc.](https://beyondjapan.com/blog/2017/07/headless-chrome-networks)

That's all

About the author

Yoichi Mandai

My main job is developing web APIs for social games, but thankfully I'm also given the opportunity to work on various other tasks, including marketing.
My image rights within Beyond are treated as CC0.
