[nginx] How to view, configure, and locate access logs

Hello everyone.
I am a member of the System Solutions Department, and I spend my days getting sleepy when I don't want to and having trouble sleeping when I do.

Whenever the operation and maintenance of web servers comes up, "access logs" are something we inevitably come into contact with.

In this article, I will explain how to view, configure, and locate the access logs of nginx, which has finally surpassed Apache in global market share in recent years.

Test environment

  • Linux environment
    OS: AlmaLinux release 9.2 (VirtualBox 7.0.12 environment)
    Middleware: nginx (1:1.20.1-14.el9_2.1.alma.1), HTTP(80)
  • Browser
    Chrome: 120.0.6099.217 (Official Build) (64 bit)

Test page

  • Domain: example.com
    * Because this is a local environment, the site is accessed by editing the hosts file
  • HTML: index.html (for top page), FAQ.html (FAQ page)

Nginx access log location and log examples

The default access log location is "/var/log/nginx/access.log".
If you just want to check the access log first, we recommend opening it with the less command, which puts little load on the server.

less /var/log/nginx/access.log

1️⃣ URL:example.com (index.html) Access log

192.168.33.1 - - [17/Jan/2024:08:47:50 +0000] "GET / HTTP/1.1" 200 37 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-"

2️⃣ index.html (internal link) → FAQ.html Access log

192.168.33.1 - - [17/Jan/2024:08:50:33 +0000] "GET /FAQ.html HTTP/1.1" 200 34 "http://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-"

I built a site in my local environment that responds to example.com and excerpted some of the logs recorded when accessing it from a browser.

These are the logs output when accessing the top page (index.html) of example.com and then following the link to the FAQ page (FAQ.html).

The leading IP address and the time are easy to read, but the rest may be harder to understand, so I will explain each field by comparing it with the configuration items.

Configuring the log format (log_format)

The main configuration file for nginx is "/etc/nginx/nginx.conf".

In this file, the "log_format" directive (setting) defines the format of the access log.
*The access log output destination is also defined in this file.

less /etc/nginx/nginx.conf

~Excerpt~
http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

The "log_format main" part gives the format the name "main".
What follows specifies the content to output: nginx variables, plus hyphens, square brackets, and other literal characters that shape the display.
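To make the "variables plus literal characters" idea concrete, here is a minimal Python sketch that turns the "main" format string into a regular expression and uses it to split access log 2️⃣ into named fields. The helper name format_to_regex is my own invention for illustration, and the lazy matching is a simplification that assumes no field contains the literal text that follows it.

```python
import re

# The "main" format from the nginx.conf excerpt, joined into one string.
MAIN_FORMAT = (
    '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"'
)

# Access log 2 from the article (FAQ.html opened from the top page).
SAMPLE_LINE = (
    '192.168.33.1 - - [17/Jan/2024:08:50:33 +0000] "GET /FAQ.html HTTP/1.1" '
    '200 34 "http://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-"'
)


def format_to_regex(log_format):
    """Hypothetical helper: build a regex with one named group per $variable."""
    parts = []
    pos = 0
    for match in re.finditer(r"\$(\w+)", log_format):
        # Literal text between variables (spaces, hyphens, brackets, quotes).
        parts.append(re.escape(log_format[pos:match.start()]))
        # Simplification: capture lazily up to the next literal text.
        parts.append(f"(?P<{match.group(1)}>.*?)")
        pos = match.end()
    parts.append(re.escape(log_format[pos:]))
    return re.compile("^" + "".join(parts) + "$")


fields = format_to_regex(MAIN_FORMAT).match(SAMPLE_LINE).groupdict()
for name, value in fields.items():
    print(f"{name:>22}: {value}")
```

Because the regex is derived from the format string itself, the same helper should also work for other named formats, as long as the literal separators are unambiguous.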

Log format explanation / Comparison table with access log 2️⃣

| log format | Content | Value in access log 2️⃣ | Remarks |
|---|---|---|---|
| $remote_addr | Connecting IP address | 192.168.33.1 | The client connected directly here, so its own IP is recorded. When going through an LB, the LB's IP is recorded. |
| - | Separator hyphen | - | |
| $remote_user | User name supplied for Basic authentication | - (blank) | Basic authentication is mostly used only during development and maintenance, so this is usually empty. |
| [$time_local] | [Local time when processing completed + time zone] | [17/Jan/2024:08:50:33 +0000] | The "+0000" part is the offset from UTC. "+0000" is UTC (Coordinated Universal Time) and "+0900" is JST (Japan Standard Time). |
| "$request" | "Request content" (method, request path, HTTP version) | "GET /FAQ.html HTTP/1.1" | A "GET" (retrieval) request for "/FAQ.html" was received over "HTTP/1.1". |
| $status | Status code | 200 | 200 means success. |
| $body_bytes_sent | Number of bytes sent to the client | 34 | Size in bytes of the main data (body) of FAQ.html etc. |
| "$http_referer" | "Referrer" (URL the access came from) | "http://example.com/" | The top page, i.e. the access went top page ⇨ FAQ. If "-" (blank), the URL was accessed directly. |
| "$http_user_agent" | "User agent" (browser/OS information) | "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" | Accessed from a Chrome browser on Windows (OS). |
| "$http_x_forwarded_for" | "X-Forwarded-For" (source IP) | "-" | When going through a Proxy or LB, the original source IP address is recorded. |

There is a lot of information available

The above table can be summarized more concisely as follows.

  • IP : 192.168.33.1
  • Username : None (= not authenticated)
  • Access time : 08:50:33 on January 17, 2024, UTC (add 9 hours for Japan time)
  • Access destination : FAQ page (FAQ.html)
  • Connection status : Success (200)
  • Data amount : 34 B (bytes)
  • Access source : From the top page (http://example.com/)
  • Environment : Chrome browser on Windows OS (self-declared)
  • Via LB or Proxy : No (the field is empty)
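The conversion from UTC to Japan time mentioned in the list can be sketched in Python as follows, using the timestamp from access log 2️⃣. This assumes an English (or C) locale so that "%b" recognizes "Jan"; the variable names are only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Timestamp from access log 2, without the surrounding square brackets.
raw = "17/Jan/2024:08:50:33 +0000"

# %z reads the "+0000" offset, so the result is a timezone-aware datetime.
parsed = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")

# JST is a fixed UTC+9 offset (Japan has no daylight saving time).
JST = timezone(timedelta(hours=9), "JST")
jst_time = parsed.astimezone(JST)

print(jst_time.isoformat())  # 2024-01-17T17:50:33+09:00
```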

As you can see, quite a lot of information can be read from the access log.
By aggregating these fields, you can investigate access trends and check whether there has been any malicious access.
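As a rough sketch of such aggregation, the Python below counts requests per IP, per status code, and per path. The first two lines are the logs from this article; the third line is a hypothetical entry I added only so that the counts are not all identical. The regex covers just the leading fields and assumes the default layout.

```python
import re
from collections import Counter

UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

LOG_LINES = [
    f'192.168.33.1 - - [17/Jan/2024:08:47:50 +0000] "GET / HTTP/1.1" 200 37 "-" "{UA}" "-"',
    f'192.168.33.1 - - [17/Jan/2024:08:50:33 +0000] "GET /FAQ.html HTTP/1.1" 200 34 "http://example.com/" "{UA}" "-"',
    # Hypothetical line (not from the article): a request for a missing page.
    '203.0.113.5 - - [17/Jan/2024:08:51:02 +0000] "GET /missing.html HTTP/1.1" 404 153 "-" "curl/8.0.1" "-"',
]

# Matches only the leading fields: IP, user, time, request, status, bytes.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+)'
)


def summarize(lines):
    """Count requests per client IP, per status code, and per request path."""
    by_ip, by_status, by_path = Counter(), Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not follow the expected layout
        by_ip[m["ip"]] += 1
        by_status[m["status"]] += 1
        parts = m["request"].split()
        if len(parts) == 3:  # method, path, HTTP version
            by_path[parts[1]] += 1
    return by_ip, by_status, by_path


by_ip, by_status, by_path = summarize(LOG_LINES)
print(by_ip.most_common())
print(by_status.most_common())
print(by_path.most_common())
```

On a real server you would read the lines from /var/log/nginx/access.log instead of a list. An unusually high count for a single IP, or a burst of 404s, is the kind of pattern worth a closer look.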

The default log format is very convenient, so please make use of it.

Terminology explanation

Basic Authentication

This is a simple authentication function that requires entering a predefined user name and password.
Because it is minimal and simple, it is used for temporary purposes such as during site construction or emergency maintenance.

Note that over HTTP (80), the authentication information is also sent in plain text (unencrypted), so security is weak.
Even for temporary use, it is desirable that the site accept only HTTPS (443) communication.
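To illustrate why this is weak over plain HTTP: the Basic scheme only Base64-encodes "user:password" in the Authorization header, and Base64 is an encoding, not encryption. The sketch below uses made-up credentials to show that anyone who can see the header can recover them.

```python
import base64

# Hypothetical credentials, for illustration only.
username = "staging"
password = "s3cret"

# What a browser sends: "Basic " + base64("user:password").
token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
header_value = f"Basic {token}"
print("Authorization:", header_value)

# What an eavesdropper on an unencrypted connection can do with that header.
scheme, _, encoded = header_value.partition(" ")
decoded = base64.b64decode(encoded).decode("utf-8")
recovered_user, _, recovered_password = decoded.partition(":")
print(recovered_user, recovered_password)
```

With HTTPS the header is still only Base64-encoded, but it travels inside the encrypted connection, which is why HTTPS-only is the safer choice.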

Referer

Indicates the URL of the previous page, that is, the page containing the link to the page being accessed.

If you open the home page from a Google search, the Google URL will be recorded in the log, and if you open the FAQ from the site's home page, the URL of the home page will be recorded in the log.

The term is a misspelling of the English word "referrer". It has an interesting history: the misspelling was adopted when the specification was written, and it is still used as is today.

HTTP status codes

This is a code that tells you the result of processing an HTTP(S) request.
Listing every code would take too long, so I will omit that, but the first digit (the hundreds place) is the important part.

  • 2xx: Success response
  • 3xx: Redirect response
  • 4xx: Client error response
  • 5xx: Server error response

As shown above, the outcome can be roughly determined from the first digit.

The codes you will see most often are 200 (success), 302 (temporary redirect), 404 (the requested resource does not exist), and 503 (the server cannot process the request).
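The "judge by the first digit" rule can be written as a tiny Python function. This is only a sketch covering the four classes listed above; the function name and the "other" fallback are my own choices.

```python
def status_class(code):
    """Classify an HTTP status code by its first digit (hundreds place)."""
    classes = {
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 extracts the first digit of a 3-digit code.
    return classes.get(code // 100, "other")


for code in (200, 302, 404, 503):
    print(code, status_class(code))
```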

User-Agent

The term "user agent" refers to "the software used to communicate with a website."

Websites are generally accessed with a browser, so in practice this is treated as "information about the browser the user is using (along with information about the OS, etc.)".

X-Forwarded-For

This is an item (header) that carries the source IP when communication goes through an LB or Proxy.

When an LB or Proxy sits between the client (user) and the web server, the web server records the IP of the LB or Proxy, and the IP of the original client behind it is not known.

Therefore, when communicating via an LB or Proxy, it has become a de facto standard to store the source IP in the X-Forwarded-For header.
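As a small sketch of how the two log fields relate, the Python below picks the original client IP from $remote_addr and $http_x_forwarded_for. It assumes the common convention that the header is a comma-separated list with the original client first; the function name and the addresses in the second call are hypothetical. Keep in mind that clients can send this header themselves, so the value is only trustworthy when it was set by your own LB or Proxy.

```python
def original_client(remote_addr, x_forwarded_for):
    """Return the leftmost X-Forwarded-For entry, or remote_addr if absent."""
    # nginx logs "-" when the header is not present.
    if x_forwarded_for in ("", "-"):
        return remote_addr
    first = x_forwarded_for.split(",")[0].strip()
    return first or remote_addr


# Direct access, as in the article's logs: the header is empty.
print(original_client("192.168.33.1", "-"))

# Hypothetical request that passed through a proxy and an LB.
print(original_client("10.0.0.5", "203.0.113.7, 10.0.0.1"))
```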

Side note: Why the log format is given the name "main"

Why define a name? Because when configuring log output, the log format to use is specified by name.

less /etc/nginx/nginx.conf

~Excerpt~
    access_log  /var/log/nginx/access.log  main;

The name is used in the "access_log" directive, which specifies the log output destination. Because the format is defined in one directive and used in another, it needs a name.

In other words, multiple formats can be defined.

For example, you can define a simplified format named "easy" that omits unnecessary information, or conversely, if you want more detail, define a format with more log items (variables) named "detailed".

This lets you use a different format for each domain and environment.

What happens if you don't specify a format name in the access_log directive?

Some readers may be thinking, "My environment has no format name specified."

In that case, there is no problem with either the syntax check or operation.

If no format name is specified in the access_log directive, the predefined "combined" format is used by default. It is built into nginx, so it is not written in the conf file.

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

The official nginx documentation states that the definition above is used.

Its content differs slightly from the "main" format written in the conf file by default: "$http_x_forwarded_for" is not included at the end.

Incidentally, this definition has the same name and output content as the default "combined" definition in Apache.

Lastly

I work with Apache logs often, and there is plenty of information available about them.

By comparison, I have fewer opportunities to work with nginx, so I thought it would be handy to have the information collected in one place and decided to write this article.

Personally, I like it because it's easier to understand than Apache's log format specification.

I hope this article provides some useful knowledge to those who read it.
Thank you for reading this far.

*If you want to know more about nginx, please also check out this blog.
[Super beginner] Just read this! NGINX explanation that even beginners can understand

Reference materials

Module ngx_http_log_module
https://nginx.org/en/docs/http/ngx_http_log_module.html

Module ngx_http_core_module
https://nginx.org/en/docs/http/ngx_http_core_module.html

The 'Basic' HTTP Authentication Scheme
https://datatracker.ietf.org/doc/html/rfc7617

Referer
https://developer.mozilla.org/ja/docs/Web/HTTP/Headers/Referer

HTTP response status code
https://developer.mozilla.org/ja/docs/Web/HTTP/Status

User agent
https://developer.mozilla.org/ja/docs/Glossary/User_agent

X-Forwarded-For
https://developer.mozilla.org/ja/docs/Web/HTTP/Headers/X-Forwarded-For

About the author

inside

Joined Beyond as a mid-career hire in 2022.
Belongs to the System Solutions Department.
Holds LPIC-3 304 and AWS SAA.
Only has three choices for regular drinks: milk, cola, and black tea.