[For absolute beginners] What is suspicious access?

2025.12.10

Table of contents

1 Introduction
2 Learn about access logs
3 Check the access log
- 3.1 Step 1: Check the trend in access numbers
- 3.2 Step 2: Identify suspicious activity
4 How to deal with suspicious access?
5 Summary

Introduction

Hello. I'm Paru, a second-year infrastructure engineer in the System Solutions Department who joined as a new graduate in 2024.

To get straight to the point, can you all immediately picture what an access log is?
As an infrastructure engineer who handles alerts, I see access logs every day as one of the most basic logs, but in my first year I had a strong aversion to them.

So, this time, I'd like to write an article for those who are still unfamiliar with access logs, such as those who "don't know what's written in access logs," "don't know how to investigate access logs," or "can't determine what kind of access is an attack."

*This article is about how to approach an investigation, so it does not cover the commands used for investigating. The reference sites listed at the bottom of the page include many excellent articles, so I hope you will take a look at them!

Learn about access logs

What is the purpose of an access log?

It's fairly common to encounter websites that are slow or that display errors and cannot be viewed. Increased traffic and the resulting server load are not the only causes of these issues, but examining access logs helps determine whether traffic is the root cause. Outputting access logs is essential for isolating the cause.

Let's take a look at the access log

For example, let's say you access a certain website (http://example.net).
The server then writes a line like the following to its access log.

192.168.100.101 - - [1/Nov/2025:10:20:50 +0900] "GET /index.php HTTP/1.1" 200 1042 "http://example.com" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"

At first glance, it looks like a cryptic string of characters, but in reality, it's just information about "when, who, where, and what they came for" written in a fixed order.
It's a bit long, but let's break it down below.

192.168.100.101 ... The original client's IP address (the IP address the access came from)
- ... Client identifier (usually not used, so it is -)
- ... Username of the requester (the name of the user who accessed a page that requires authentication; usually -)
[1/Nov/2025:10:20:50 +0900] ... Date and time of access (+0900 means the time is 9 hours ahead of UTC)
"GET ... HTTP method (there are various types, such as GET and POST)
/index.php ... Request URI (the path that tells the server, "I want this file!")
HTTP/1.1" ... HTTP version
200 ... Status code (200 means the request was successful!)
1042 ... Response bytes (size of the data returned by the server, in bytes)
"http://example.com" ... Referrer URL (the URL of the previous page from which you arrived via a link, that is, the page you were on before http://example.net/index.php)
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36" ... User agent (browser and OS information; it shows that Chrome is being used)

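To make the breakdown above concrete, here is a minimal Python sketch that splits the sample line into the same fields. The regular expression and the names LOG_PATTERN and parse_line are my own assumptions for illustration, and the sketch only works when the log uses exactly the field order shown above.

```python
import re

# Assumption: the log uses exactly the field order explained above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# The sample line from this article.
sample = (
    '192.168.100.101 - - [1/Nov/2025:10:20:50 +0900] "GET /index.php HTTP/1.1" '
    '200 1042 "http://example.com" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"'
)

fields = parse_line(sample)
for name, value in fields.items():
    print(f"{name:>10}: {value}")
```

If the log format has been customized, the pattern has to be adjusted to match it.
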
How to set the output format

There isn't a fixed format for access logs output by web servers; you can freely define "what information is recorded and in what order" in a configuration file.
This configuration file is generally located under /etc, but the syntax differs depending on the middleware being used (Apache, Nginx, etc.). Below is an example of how to write it.

Apache ver.

# %t already prints the surrounding [ ], so no extra brackets are written around it
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Nginx ver.

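# "combined" is predefined in Nginx and cannot be redefined, so a custom name is used
# (combined_xff is just an example name; specify the same name in access_log)
log_format combined_xff '$http_x_forwarded_for - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';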

Let's briefly touch on %{X-Forwarded-For}i in the Apache example and $http_x_forwarded_for in the Nginx example.

This item is not logged by default, but including it lets you find out which IP address the access originally came from, even in environments where there is a load balancer or reverse proxy in front of the server. Conversely, without this setting, only the IP address of the load balancer or reverse proxy is recorded, and a detailed investigation is not possible.

In short, the log format can be freely customized depending on how you write it in the configuration file. Changing the order in which you write the items changes the output order, and you can also record special values (header information) as in this case.

For a detailed explanation of the format, please see the article below.

[Apache] A simple guide to reading access logs! *Updated February 2025

[nginx] Explaining how to view, configure, and locate access logs

Let's investigate the access logs.

This is a very rough guide to investigating access logs, showing you what you might want to look at.
Since access content and investigation methods vary widely, please consider this merely one example.

Premise: Your server's CPU load average (LA, the number of processes waiting to run) is rising! Let's investigate!

Step 1: Check the trend in access numbers

First, to check whether access is the cause of the server load, let's count the number of accesses per minute.

10 [29/Oct/2025:12:00
10 [29/Oct/2025:12:01
11 [29/Oct/2025:12:02
2529 [29/Oct/2025:12:03
5107 [29/Oct/2025:12:04
2714 [29/Oct/2025:12:05
26 [29/Oct/2025:12:06
30 [29/Oct/2025:12:07

The surge in access for just three minutes starting at 12:03 is quite strange.
This appears to be the reason for the increased CPU load.
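
As one way to produce a per-minute count like the one above, here is a minimal Python sketch. The function name count_per_minute and the sample lines are hypothetical and shortened for illustration. It assumes the timestamp has the [day/month/year:hour:minute:second zone] form shown earlier.

```python
import re
from collections import Counter

# Captures the timestamp up to the minute, e.g. "29/Oct/2025:12:03".
MINUTE_PATTERN = re.compile(r'\[(\d{1,2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} ')


def count_per_minute(lines):
    """Count log lines per minute, keeping the order in which minutes first appear."""
    counts = Counter()
    for line in lines:
        match = MINUTE_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical, shortened log lines for illustration.
sample_lines = [
    '192.0.2.22 - - [29/Oct/2025:12:02:58 +0900] "GET / HTTP/1.1" 200 512 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:02 +0900] "GET /xmlrpc.php HTTP/1.1" 404 153 "-" "-"',
]

per_minute = count_per_minute(sample_lines)
for minute, count in per_minute.items():
    print(f"{count:>6} [{minute}")
```

With a real log file, you would pass the open file object instead of sample_lines.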

Step 2: Identify suspicious activity

Once you've identified a sudden surge in access, the next step is to determine "who came and why." From here, you'll focus on the logs from the time period when the access surged and look for anything suspicious.

① Access source IP

Check if there are frequent accesses from a specific IP address.

# Access count IP address
10143 192.0.2.115
128 192.0.2.22
62 192.0.2.88
17 192.0.2.54

If you check the IP address on an IP lookup site and find that a website intended for Japanese users is receiving a huge number of visits from overseas IP addresses, that is suspicious. It is likely to be a crawler or an attack.
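
A ranking like the one above can be sketched in Python as follows. The function name top_ips and the sample lines are hypothetical. The sketch assumes that the IP address is the first field of each line and that the minutes of the surge are already known from Step 1.

```python
from collections import Counter


def top_ips(lines, minutes):
    """Count requests per source IP, looking only at lines logged in the given minutes."""
    counts = Counter()
    for line in lines:
        if any(f"[{minute}:" in line for minute in minutes):
            # Assumption: the IP address is the first space-separated field.
            counts[line.split(" ", 1)[0]] += 1
    return counts.most_common()


# Hypothetical, shortened log lines for illustration.
sample_lines = [
    '192.0.2.22 - - [29/Oct/2025:12:02:58 +0900] "GET / HTTP/1.1" 200 512 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:04:10 +0900] "GET /xmlrpc.php HTTP/1.1" 404 153 "-" "-"',
    '192.0.2.88 - - [29/Oct/2025:12:04:11 +0900] "GET /index.php HTTP/1.1" 200 1042 "-" "-"',
]

surge = ["29/Oct/2025:12:03", "29/Oct/2025:12:04", "29/Oct/2025:12:05"]
ranking = top_ips(sample_lines, surge)
for ip, count in ranking:
    print(f"{count:>6} {ip}")
```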

② Access path (request URI)

We'll look at which paths on the site are receiving the most traffic.
Focusing on suspicious IP addresses and examining the paths will make it even clearer.

# Access count Path (requests from IP 192.0.2.115)
105 /wp-login.php
98 /xmlrpc.php
42 /?author=1
27 /wp-admin/admin-ajax.php
15 /wp-config.php.bak
15 /wp-admin/profile.php
14 /wp-content/plugins/file-manager/readme.txt

It's quite difficult to distinguish between "suspicious" and "not suspicious" here, so let's look at them one by one. Think of the website as a shop, with legitimate access (simply clicking the URL) as the front entrance and suspicious access (trying to reach the WordPress administration panel, etc.) as the back entrance.

wp-login.php and xmlrpc.php (suspicious)
Purpose: Unauthorized login (brute-force attack)
Example: It's like rattling the doorknob of the store's "employee-only back door" while trying every possible ID and password.

/?author=1 (Suspicious)
Purpose: To identify the username
Example: Asking "Is there an employee with ID number 1?" from the front entrance, trying to steal the employee's name (username for login)

wp-config.php.bak (suspicious)
Purpose: Theft of confidential information
Example: It's like checking whether a backup copy (.bak) of a memo containing the store's "safe combination" (such as a database password) has been left lying around.

wp-admin/admin-ajax.php (usually legitimate)
Purpose: A standard WordPress function (automatic saving of articles, dynamic processing, etc.)
Note: If the number of accesses is unusually high, it may be being exploited for attacks.

wp-admin/profile.php (may not be suspicious, depending on the accessing IP address)
Purpose: Lets logged-in users view their own profile.

plugins/.../readme.txt (possibly reconnaissance)
Purpose: Scanning for installed plugins

The above is just one example, but access to such obviously suspicious paths is quite common.
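
The classification above can be written down as a small lookup table. The following Python sketch only mirrors the examples in this article. The names SUSPICIOUS_PATHS and classify_path are hypothetical, and the table is nowhere near a complete rule set.

```python
# Path fragments and the purpose this article associates with them.
# Assumption: simple substring matching is enough for a first rough look.
SUSPICIOUS_PATHS = {
    "/wp-login.php": "unauthorized login (brute force)",
    "/xmlrpc.php": "unauthorized login (brute force)",
    "author=": "identifying the username",
    "wp-config.php.bak": "theft of confidential information",
    "readme.txt": "plugin scanning (reconnaissance)",
}


def classify_path(uri):
    """Return the suspected purpose for a request URI, or None if no rule matches."""
    for fragment, purpose in SUSPICIOUS_PATHS.items():
        if fragment in uri:
            return purpose
    return None


for uri in ["/wp-login.php", "/?author=1", "/wp-admin/admin-ajax.php", "/index.php"]:
    print(f"{uri:<28} {classify_path(uri) or 'no rule matched'}")
```

Paths that match no rule are not automatically safe. As noted above, admin-ajax.php and profile.php still need to be judged by access count and source IP.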

Supplement to ②: GET communication and POST communication

I've talked at length about paths, but it's also important to consider which HTTP method is used to communicate with those paths. I'd like to touch upon the commonly seen "GET communication" and "POST communication."

GET communication (image: postcard)
Purpose: When you want to ask for information. (Example: display a page, view information with /?author=1)
Features: The information you send is attached to the URL (like ?author=1). It's completely visible, so it is not suitable for sending important information.

POST communication (Image: A letter in an envelope)
Purpose: To ask someone to "please receive this information!" (Examples: logging in, submitting a form)
Features: The information being sent (ID and password) can be hidden by placing it in an envelope. The information does not appear in the URL.

Why is this important?
In this investigation example, attacks on /wp-login.php (the back door) need to hide sensitive information like "ID and password" when sending it, so they are always done using POST requests. If you look at the logs and see a lot of GET requests to wp-login.php, you can judge that "this is probably a scan, not an attack." Conversely, if there are a lot of POST requests, you can judge that "this might be a brute-force attack!"
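
The rule of thumb above can be sketched in Python like this. The names count_methods and guess_intent and the sample lines are hypothetical. The "more POST than GET" comparison is just a simple assumption for illustration, not a fixed criterion.

```python
import re
from collections import Counter

# Matches the quoted request part, e.g. "POST /wp-login.php HTTP/1.1".
REQUEST_PATTERN = re.compile(r'"(?P<method>[A-Z]+) (?P<uri>\S+) HTTP/[\d.]+"')


def count_methods(lines, path):
    """Count HTTP methods used in requests whose URI starts with the given path."""
    methods = Counter()
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if match and match.group("uri").startswith(path):
            methods[match.group("method")] += 1
    return methods


def guess_intent(methods):
    """Mostly POST suggests a brute-force attempt, mostly GET suggests a scan."""
    if not methods:
        return "no requests"
    if methods["POST"] > methods["GET"]:
        return "possible brute-force attack"
    return "probably a scan"


# Hypothetical, shortened log lines for illustration.
sample_lines = [
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:02 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:03 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.54 - - [29/Oct/2025:12:03:04 +0900] "GET /wp-login.php HTTP/1.1" 200 1500 "-" "-"',
    '192.0.2.88 - - [29/Oct/2025:12:03:05 +0900] "GET /index.php HTTP/1.1" 200 1042 "-" "-"',
]

login_methods = count_methods(sample_lines, "/wp-login.php")
print(dict(login_methods), "->", guess_intent(login_methods))
```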

③ Status code

Check how the server responded.

# Access count Status code
6821 404
153 500
112 403
35 200

If most of the responses are 4xx (Not Found, etc.) or 5xx (Internal Server Error, etc.) errors, it indicates that the access attempts are failing.
With 4xx errors the web server itself is still responding normally, but 5xx errors mean the server failed to process the request (i.e., the program is throwing an error), which can slow down the server and lead to increased load.

What if it returns 200 OK (normal) to suspicious access?

It's important to note that 200 OK (request successful) simply means that the server returned what was requested, not that the access was secure. To use the store analogy, 404 is like a dead end, meaning "the safe combination is not here," but 200 means you've handed over the safe combination, saying "here you go."

How to read the login attack (wp-login.php) log

POST /wp-login.php → 200 OK (Login Failed)
Meaning: Incorrect ID or password. The login page will be displayed again.
If this happens repeatedly, you can conclude that a brute-force attack is in progress.

POST /wp-login.php → 302 Found (Login successful)
Meaning: Authentication OK! You will be redirected to the dashboard (/wp-admin/).
If a 302 is returned from the attacking IP address, it is a dangerous sign that your site has been compromised by an attacker.

Therefore, if 200 OK or 302 Found is being returned for access to suspicious paths, it may mean that the attack is starting to succeed, and you should investigate more carefully.
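
As a sketch of how to read the wp-login.php results described above, the following Python code counts failed (200) and successful (302) login POSTs per source IP. The name login_results and the sample lines are hypothetical. It assumes the default WordPress behavior described in this section.

```python
import re
from collections import Counter

# Source IP at the start of the line, then a POST to wp-login.php and its status code.
LOGIN_PATTERN = re.compile(r'^(?P<ip>\S+) .*"POST /wp-login\.php[^"]*" (?P<status>\d{3}) ')


def login_results(lines):
    """Count failed (200) and successful (302) POSTs to wp-login.php per source IP."""
    results = {}
    for line in lines:
        match = LOGIN_PATTERN.search(line)
        if not match:
            continue
        per_ip = results.setdefault(match.group("ip"), Counter())
        if match.group("status") == "200":
            per_ip["failed"] += 1
        elif match.group("status") == "302":
            per_ip["succeeded"] += 1
    return results


# Hypothetical, shortened log lines for illustration.
sample_lines = [
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:02 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:03 +0900] "POST /wp-login.php HTTP/1.1" 302 0 "-" "-"',
    '192.0.2.22 - - [29/Oct/2025:12:03:04 +0900] "GET /index.php HTTP/1.1" 200 1042 "-" "-"',
]

results = login_results(sample_lines)
for ip, counts in results.items():
    warning = "  <- a login succeeded, check immediately" if counts["succeeded"] else ""
    print(f"{ip}: failed={counts['failed']} succeeded={counts['succeeded']}{warning}")
```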

There are many other types of status codes, so please see the article below for more details.

[This is all you need to remember] A quick review of HTTP status code errors

④ User Agent

Finally, look at what was used to access the site.

# Examples of user agents
"Mozilla/5.0 (compatible; DotBot/1.2; ...)"
"Mozilla/5.0 (... compatible; Google-Read-Aloud; ...)"

If the user agent contains terms like "bot" or "crawler," it indicates automated access by a program. While there are beneficial crawlers like Google's, there are also malicious ones. Furthermore, if this field contains strange strings of characters, or if access is heavily skewed towards a specific user agent, it can be a sign of an attack.
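
The two checks above (keywords such as "bot" and skew towards one user agent) can be sketched in Python as follows. The keyword list, the function names, and the sample lines are assumptions for illustration. Malicious programs often do not identify themselves, so this is only a first rough filter.

```python
from collections import Counter

# Assumption: these keywords are only examples; real bots do not always declare themselves.
BOT_KEYWORDS = ("bot", "crawler", "spider")


def user_agent_of(line):
    """Return the last double-quoted field, which is the user agent in the format shown earlier."""
    parts = line.rsplit('"', 2)
    return parts[1] if len(parts) == 3 else ""


def is_declared_bot(user_agent):
    """Return True if the user agent contains one of the bot keywords."""
    lowered = user_agent.lower()
    return any(keyword in lowered for keyword in BOT_KEYWORDS)


def summarize_user_agents(lines):
    """Count requests per user agent and mark the ones that declare themselves as bots."""
    counts = Counter(user_agent_of(line) for line in lines)
    return [(ua, n, is_declared_bot(ua)) for ua, n in counts.most_common()]


# Hypothetical, shortened log lines for illustration.
sample_lines = [
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; DotBot/1.2; ...)"',
    '192.0.2.115 - - [29/Oct/2025:12:03:02 +0900] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; DotBot/1.2; ...)"',
    '192.0.2.22 - - [29/Oct/2025:12:03:03 +0900] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

summary = summarize_user_agents(sample_lines)
for ua, count, declared_bot in summary:
    print(f"{count:>6} {'bot' if declared_bot else '   '} {ua}")
```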

That's a lot of information, it's tiring...
Roughly speaking, based on the information above, we'll determine whether the server load is caused by access and what kind of access can be considered "suspicious."

How to deal with suspicious access?

When actually dealing with this issue, you will need to reach an agreement with the customer, but here are some typical examples of how to deal with this issue.

① Block the IP address (quickest method)

If the investigation shows that an abnormal number of accesses are coming from a specific IP address (e.g., 192.0.2.115), the quickest solution is to block access from that address.

The method of blocking varies depending on the environment.
With Nginx, you can write `deny 192.0.2.115;` in the conf file, and in an Apache environment, you can restrict access using .htaccess. You can also block access at the OS firewall or, if there is a WAF in front of the server, at the WAF.

② Restrict access to specific paths

Attackers will target the "employee-only back doors" (such as wp-login.php). Therefore, it is essential to strictly restrict access to these "back doors."

For example, you would add a setting to the Nginx configuration file that states, "Only your company's IP address is allowed to access wp-login.php." *Note that the syntax may vary depending on the Nginx version!

location = /wp-login.php {
    # Allow only my IP address
    allow 1.2.3.4;
    # Deny all others
    deny all;

    # Don't forget the settings that pass the request to PHP
    # (the socket path depends on your environment):
    include fastcgi_params;
    fastcgi_pass unix:/run/php-fpm/www.sock;
}

For paths that should not be reachable from arbitrary IP addresses (for example, xmlrpc.php, which many sites do not need and which is frequently targeted by attacks), it is effective to use deny all; to block access to them entirely.

Summary

What did you think?

I used to hate access logs, but after looking at them for a year, I gradually started to understand them. I hope this will be of some help to anyone struggling with investigating access logs.

Thank you for reading to the end 🌷

Reference sites:
How to check Linux access logs | A clear explanation of location, how to view, and usage examples
Explaining the commands "grep" and "awk" used to make logs easier to read
[Case study] How we recovered and took countermeasures after a WordPress site was tampered with
Learning the basics of HTTP requests: the difference between GET and POST
Setting access restrictions to the admin screen as a security measure for WordPress
