[For absolute beginners] What is suspicious access?

Introduction

Hello. I'm Paru, a second-year infrastructure engineer in the System Solutions Department, graduating in 2024.

First of all, can you immediately imagine what an access log is?
As an infrastructure engineer, it is one of the basic logs that you see every day when responding to alerts, but in my first year, I was particularly uncomfortable with it and hated it.

So, this article is for people who are not yet familiar with access logs and have questions such as "I don't know what's written in the access log," "I don't know how to investigate the access," and "I can't determine what kind of access constitutes an attack."

*This article is about how to conduct an investigation, so it does not mention any commands for investigation, but there are many excellent articles listed in the reference sites at the bottom of the page, so I hope you will take a look at them!

Learn about access logs

What is the purpose of an access log?

In our daily lives, we have probably experienced websites being slow or displaying errors that make them unviewable.
While server load due to increased access is not the only cause of these issues, investigating access logs can help us determine whether access is having an effect in the first place. Access log output is essential for identifying the cause.

Let's take a look at the access log

Let's take a look.
For example, suppose you access a website (http://example.net).
When you do so, the following line will be output as an access log on the server side:

192.168.100.101 - - [1/Nov/2025:10:20:50 +0900] "GET /index.php HTTP/1.1" 200 1042 "http://example.com" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"

At first glance, it looks like cryptic code, but it's actually just information about "when, who, where, and what they came for" written in a set order.
It's a bit long, but we'll break it down below.

  1. 192.168.100.101 ... the IP address of the original client (the IP address the access came from)
  2. - ... client identifier (usually unused, so it will be -)
  3. - ... the requester's username (the name of the person accessing the page that requires authentication, usually - )
  4. [1/Nov/2025:10:20:50 +0900] ... the date and time of access (+0900 means the time difference from UTC is 9 hours)
  5. "GET ... HTTP method (there are various methods such as GET communication, POST communication, etc.)
  6. /index.php ... Request URI (the path that tells the server "I want this file!")
  7. HTTP/1.1" ... HTTP version
  8. 200 ... Status code (200 means the request was successful!)
  9. 1042 ... Response byte count (size of data returned by the server, in bytes)
  10. "http://example.com" ... referrer URL (the URL of the previous page, from which a link led to this page; here, the page visited before http://example.net/index.php)
  11. "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36" ... User agent (browser and OS information. You can see that I'm using Chrome)
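If you want to pull these 11 items out of a line with a program, the breakdown above can be sketched roughly as follows. This is only a minimal sketch that assumes the log is in exactly the format of the sample line; the field names (ip, ident, user, and so on) are labels chosen here for illustration, not official names.

```python
import re

# Regex for the sample log line above; the group names are illustrative labels.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of the 11 fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


sample = (
    '192.168.100.101 - - [1/Nov/2025:10:20:50 +0900] '
    '"GET /index.php HTTP/1.1" 200 1042 "http://example.com" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"'
)

fields = parse_line(sample)
for name, value in fields.items():
    print(f"{name:10} {value}")
```

Lines that do not fit the pattern (for example, a broken request line) return None, so they can be skipped or inspected separately.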

How to set the output format

There is no single fixed format for the access logs that web servers output; you can freely define "what information to record and in what order" in the configuration file.
These settings are generally located under /etc, but the way they are written varies depending on the middleware you are using (Apache or Nginx).
Below is an example of how to write them.

Apache ver.

# %t already prints the surrounding [ ], so it is written without extra brackets
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Nginx ver.

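# "combined" is a built-in format name in Nginx and cannot be redefined, so another name (here combined_xff) is used
log_format combined_xff '$http_x_forwarded_for - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';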

Let's briefly discuss %{X-Forwarded-For}i in the Apache example and $http_x_forwarded_for in the Nginx example.

This item is not included in the default settings, but by adding it, you will be able to see which IP address the access came from, even in an environment where there is a load balancer or reverse proxy in front of the web server.
Conversely, if this setting is not written, only the IP address of the load balancer or reverse proxy in front will be recorded, making it impossible to conduct a detailed investigation.
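One thing to keep in mind when reading this field: when a request passes through several proxies, the X-Forwarded-For value can contain multiple comma-separated addresses, with the original client usually on the left. Below is a minimal sketch for picking that address out; the function name is hypothetical, and since the header is sent by the client, the value should be treated as a hint rather than proof.

```python
def original_client_ip(x_forwarded_for):
    """Return the left-most address in an X-Forwarded-For value, or None if empty."""
    # Nginx logs "-" when the header is absent.
    if not x_forwarded_for or x_forwarded_for == "-":
        return None
    return x_forwarded_for.split(",")[0].strip()


print(original_client_ip("203.0.113.7, 198.51.100.10"))
print(original_client_ip("-"))
```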

In short, you can freely customize the log format depending on how you write it in the configuration file. You can change the output order by changing the order in which you write the items, and you can also record additional special values (header information) as in this case.

For a detailed explanation of the format, please see the articles below.

[Apache] Easy explanation of how to view access logs! *Updated February 2025

[nginx] Explaining how to view, settings, location, etc. of access logs

Check the access

Below is a very rough investigation method, covering points you might want to check when investigating access logs.
Since access content and investigation methods vary widely, please use this as just one example.

Premise: Your server's load average (LA, roughly the length of the queue of processes waiting to run) is rising! Let's investigate!

Step 1: Check the trend in access numbers

First, to check whether access is the cause of the server load, let's count the number of accesses per minute.

10 [29/Oct/2025:12:00
10 [29/Oct/2025:12:01
11 [29/Oct/2025:12:02
2529 [29/Oct/2025:12:03 <-- ! ?
5107 [29/Oct/2025:12:04 <--! !
2714 [29/Oct/2025:12:05 <--! !
26 [29/Oct/2025:12:06
30 [29/Oct/2025:12:07

The sudden surge in access for only three minutes from 12:03 is quite strange.
This appears to be the reason for the increased CPU load.
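A per-minute tally like the one above could be produced with a short script. The following is a minimal sketch, assuming the timestamp format shown in this article; the sample lines are made up for illustration, and in practice you would pass in the lines of your own log file.

```python
import re
from collections import Counter

# Captures "29/Oct/2025:12:03" (date, hour, minute) from the bracketed timestamp.
MINUTE_PATTERN = re.compile(r'\[(\d{1,2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} ')


def count_per_minute(lines):
    """Count log lines per minute, keyed by 'dd/Mon/yyyy:HH:MM'."""
    counts = Counter()
    for line in lines:
        match = MINUTE_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines for illustration.
sample_lines = [
    '192.0.2.22 - - [29/Oct/2025:12:02:58 +0900] "GET / HTTP/1.1" 200 512 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:02 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
]

# Sorting the keys as plain strings is only reliable within a single day.
for minute, count in sorted(count_per_minute(sample_lines).items()):
    print(count, f"[{minute}")
```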

Step 2: Identify suspicious activity

Once you have determined that there has been a sudden increase in access, the next step is to identify who visited and what they were doing. From here, you can narrow down the logs to the time period when there was a sudden increase and look for anything suspicious.

① Access source IP

Check if there are frequent accesses from a specific IP address.

# Number of accesses IP address
10143 192.0.2.115 <-- Outstandingly high
128 192.0.2.22
62 192.0.2.88
17 192.0.2.54

If you check the IP address on an IP search site and find that a large number of accesses are coming from overseas IP addresses even though the site is aimed at Japanese users, it's suspicious. It's likely a crawler or an attack.
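The per-IP count can be sketched in the same way. This minimal example assumes the client IP is the first field of each line and narrows the count down to the minutes of the spike; the sample lines and the function name are hypothetical.

```python
from collections import Counter


def count_by_ip(lines, minute_prefixes):
    """Count requests per client IP, limited to the given 'dd/Mon/yyyy:HH:MM' minutes."""
    counts = Counter()
    for line in lines:
        if any(f"[{prefix}:" in line for prefix in minute_prefixes):
            # The client IP is the first space-separated field.
            counts[line.split(" ", 1)[0]] += 1
    return counts


# Hypothetical sample lines for illustration.
sample_lines = [
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:04:10 +0900] "POST /xmlrpc.php HTTP/1.1" 200 310 "-" "-"',
    '192.0.2.22 - - [29/Oct/2025:12:04:30 +0900] "GET / HTTP/1.1" 200 512 "-" "-"',
    '192.0.2.88 - - [29/Oct/2025:12:07:00 +0900] "GET / HTTP/1.1" 200 512 "-" "-"',
]

spike = ["29/Oct/2025:12:03", "29/Oct/2025:12:04", "29/Oct/2025:12:05"]
for ip, count in count_by_ip(sample_lines, spike).most_common():
    print(count, ip)
```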

② Access path (request URI)

Look at which paths on your site are getting the most traffic.
It's easier to find the paths by narrowing down to suspicious IP addresses.

# Number of accesses Path (accessed from IP 192.0.2.115)
105 /wp-login.php <-- Suspicious
98 /xmlrpc.php <-- Suspicious
42 /?author=1 <-- Suspicious
27 /wp-admin/admin-ajax.php
15 /wp-config.php.bak <-- Suspicious
15 /wp-admin/profile.php
14 /wp-content/plugins/file-manager/readme.txt

It's quite difficult to distinguish between "suspicious" and "not suspicious," so let's look at each one.
Let's think of the website as a store, with legitimate access (clicking on a URL) as the front entrance and suspicious access (attempting to enter the admin panel of WordPress or similar) as the back door.

wp-login.php and xmlrpc.php (suspicious)
Purpose: Unauthorized login (brute force attack)
Example: Trying every ID and password to unlock the doorknob of a store's "employee-only back door"

/?author=1 (Suspicious)
Purpose: Identifying user names
Example: Asking from the front entrance, "Is the employee with ID number 1 there?", trying to spy on the employee's name (user name for login)

wp-config.php.bak (Suspicious)
Purpose: Stealing confidential information
Example: Checking to see if a backup copy (.bak) of the memo containing the store's "safe code" (such as the database password) has been left lying around.

Paths that are sometimes used as legitimate functions
admin-ajax.php (suspicious depending on the context) - A legitimate WordPress function (auto-saving posts, dynamic processing, etc.). If the number of accesses is abnormally high, it may be being used for attacks.
wp-admin/profile.php (not suspicious depending on the access source IP) - Logged-in users view their own profile.
plugins/.../readme.txt (possible reconnaissance) - Plugin scanning.

The above is just one example, but access to such obviously suspicious paths is quite common.
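As a rough illustration of this step, the paths requested by one IP can be tallied and compared against a small list of markers. This is only a sketch: the marker list is taken from the examples in this article and is far from exhaustive, and the function names and sample lines are hypothetical.

```python
import re
from collections import Counter

# Captures the client IP (first field) and the request path.
REQUEST_PATTERN = re.compile(r'^(\S+) .*?"(?:[A-Z]+) (\S+) HTTP/[\d.]+"')

# Illustrative substrings taken from the examples above; not an exhaustive list.
SUSPICIOUS_MARKERS = ("wp-login.php", "xmlrpc.php", "?author=", ".bak")


def paths_for_ip(lines, ip):
    """Count the request paths accessed by a single client IP."""
    counts = Counter()
    for line in lines:
        match = REQUEST_PATTERN.match(line)
        if match and match.group(1) == ip:
            counts[match.group(2)] += 1
    return counts


def looks_suspicious(path):
    """Return True if the path contains one of the illustrative markers."""
    return any(marker in path for marker in SUSPICIOUS_MARKERS)


# Hypothetical sample lines for illustration.
sample_lines = [
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:02 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:03 +0900] "GET /?author=1 HTTP/1.1" 301 0 "-" "-"',
    '192.0.2.22 - - [29/Oct/2025:12:03:04 +0900] "GET /wp-admin/profile.php HTTP/1.1" 200 900 "-" "-"',
]

for path, count in paths_for_ip(sample_lines, "192.0.2.115").most_common():
    flag = "<-- Suspicious" if looks_suspicious(path) else ""
    print(count, path, flag)
```

A marker match only means "worth a closer look"; as described above, paths such as admin-ajax.php need to be judged by context.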

③ GET communication and POST communication

We've talked a bit about paths, but what kind of communication is used to obtain information is also important.
I'd like to touch on the commonly seen "GET communication" and "POST communication."

GET communication (image: postcard)
Purpose: When asking for information. (Example: Display a page, view information with /?author=1)
Characteristics: The information sent is attached to the URL (like ?author=1). Since it is completely visible, important information should not be sent this way.

POST communication (Image: A letter in an envelope)
Purpose: When requesting someone to receive information (e.g., logging in, submitting a form)
Features: The information to be sent (ID and password) can be hidden in an envelope and sent. The information is not revealed in the URL, so it does not appear in the request line of the access log either.
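Since the method is recorded in every log line, you can tally which methods were used against a given path. Here is a minimal sketch, with a hypothetical function name and made-up sample lines; the query string is stripped so that /wp-login.php?foo=bar is counted together with /wp-login.php.

```python
import re
from collections import Counter

# Captures the HTTP method and the request URI from the quoted request line.
METHOD_PATTERN = re.compile(r'"([A-Z]+) (\S+) HTTP/[\d.]+"')


def methods_for_path(lines, path):
    """Count HTTP methods used against one path, ignoring any query string."""
    counts = Counter()
    for line in lines:
        match = METHOD_PATTERN.search(line)
        if match and match.group(2).split("?", 1)[0] == path:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines for illustration.
sample_lines = [
    '192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.115 - - [29/Oct/2025:12:03:02 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.54 - - [29/Oct/2025:12:03:05 +0900] "GET /wp-login.php?redirect_to=/ HTTP/1.1" 200 2048 "-" "-"',
    '192.0.2.22 - - [29/Oct/2025:12:03:06 +0900] "GET / HTTP/1.1" 200 512 "-" "-"',
]

print(methods_for_path(sample_lines, "/wp-login.php"))
```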

Why is this important?
In this research example, attacks on /wp-login.php (the backdoor) must send the important information, "ID and password," hidden, so they always use POST communications.
If you look at the logs and see GET communications, you can conclude that "this is probably a scan, not an attack." Conversely, if there are a large number of POSTs, you can conclude that "this might be a brute force attack!"

④ Status code

See how the server responds.

# Number of accesses Status code
6821 404 <-- Most accesses to non-existent files
153 500 <-- An error occurred inside the server
112 403 <-- Access denied
35 200 <-- Normal access

If there are many 4xx (e.g. 404 Not Found) or 5xx (e.g. 500 Internal Server Error) responses, it is a sign that access is failing.
4xx means that the web server was still able to return an error for a bad request, but 5xx means that the server failed to process the request (i.e. the program is throwing an error), which slows down the server and causes load.
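The status code tally, and the grouping into 4xx and 5xx, can be sketched as below. This assumes the status code is the three-digit number right after the quoted request line, as in the format shown in this article; the function names are hypothetical.

```python
import re
from collections import Counter

# The status code is the three-digit number right after the closing quote of the request.
STATUS_PATTERN = re.compile(r'" (\d{3}) ')


def count_statuses(lines):
    """Count log lines per status code."""
    counts = Counter()
    for line in lines:
        match = STATUS_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_status_classes(status_counts):
    """Group per-status counts into classes such as '4xx' and '5xx'."""
    classes = Counter()
    for status, count in status_counts.items():
        classes[status[0] + "xx"] += count
    return classes


# Counts from the example table above.
example = Counter({"404": 6821, "500": 153, "403": 112, "200": 35})
for status_class, count in count_status_classes(example).most_common():
    print(count, status_class)
```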

What if it returns 200 OK (normal) to suspicious access?

The thing to note here is 200 OK (request successful).
Success simply means that "the server successfully returned what was requested," and not that "the access was safe."
To use the store analogy, 404 means "the safe's combination is not here," meaning no answer, but 200 means "the safe's combination has been handed over, here you go."

How to read the login attack (wp-login.php) log

POST /wp-login.php → 200 OK (Login failed)
Meaning: Your ID or password is incorrect. The login page will be displayed again.
If this occurs repeatedly, you can conclude that a brute force attack is in progress.

POST /wp-login.php → 302 Found (Login successful)
Meaning: Authentication OK! You will be redirected to the dashboard (/wp-admin/).
If a 302 is returned from the attacking IP address, it is a dangerous sign that your site has been invaded by an attacker.

As such, if you see a 200 OK or 302 Found response when accessing a suspicious path, it may mean an attack is beginning to succeed, so you should investigate more carefully.
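The reading of wp-login.php logs described above can be turned into a simple check: an IP with many 200 responses (failed logins) followed by at least one 302 (possible success) deserves a closer look. This is a minimal sketch based only on that interpretation; the function names and the min_failures threshold are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Captures the client IP and status code of POST requests to /wp-login.php.
LOGIN_PATTERN = re.compile(
    r'^(\S+) .*?"POST /wp-login\.php(?:\?\S*)? HTTP/[\d.]+" (\d{3}) '
)


def login_outcomes(lines):
    """Map each client IP to a Counter of status codes for POST /wp-login.php."""
    outcomes = defaultdict(Counter)
    for line in lines:
        match = LOGIN_PATTERN.match(line)
        if match:
            outcomes[match.group(1)][match.group(2)] += 1
    return outcomes


def ips_to_review(outcomes, min_failures=10):
    """IPs with many 200s (failed logins) and at least one 302 (possible success)."""
    return [
        ip for ip, statuses in outcomes.items()
        if statuses["200"] >= min_failures and statuses["302"] > 0
    ]


# Hypothetical sample lines for illustration.
sample_lines = (
    ['192.0.2.115 - - [29/Oct/2025:12:03:01 +0900] "POST /wp-login.php HTTP/1.1" 200 2048 "-" "-"'] * 12
    + ['192.0.2.115 - - [29/Oct/2025:12:04:00 +0900] "POST /wp-login.php HTTP/1.1" 302 0 "-" "-"']
    + ['192.0.2.22 - - [29/Oct/2025:12:04:30 +0900] "POST /wp-login.php HTTP/1.1" 302 0 "-" "-"']
)

print(ips_to_review(login_outcomes(sample_lines)))
```

A single 302 from an IP with no failures (192.0.2.22 in the sample) is more likely an ordinary login, which is why both conditions are checked together.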

There are many other types of status codes, so please see the article below for more details.

[Things you need to remember] A simple explanation of HTTP status code errors as a refresher and refresher.


⑤ User Agent

Finally, look at what was used to access the site.

# User agent examples
"Mozilla/5.0 (compatible; DotBot/1.2; ...)"
"Mozilla/5.0 (... compatible; Google-Read-Aloud; ...)"

When you see "bot" or "crawler" in the log, the access comes from a crawler, but there are also malicious crawlers. Also, if there are strange characters here, or if access is biased towards a specific user agent, it can be a sign that an attack is occurring.
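To check whether access is biased towards a specific user agent, the last quoted field of each line can be tallied. The following is a minimal sketch; the user agent strings in the sample are made up, and the "bot"/"crawler" check only catches clients that declare themselves, so it says nothing about whether a crawler is malicious.

```python
import re
from collections import Counter

# The user agent is the last quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')
# Self-declared crawlers usually contain "bot" or "crawler" in the user agent.
BOT_HINT = re.compile(r'bot|crawler', re.IGNORECASE)


def count_user_agents(lines):
    """Count log lines per user agent string."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def is_declared_bot(user_agent):
    """Return True if the user agent declares itself as a bot or crawler."""
    return bool(BOT_HINT.search(user_agent))


# Hypothetical sample lines for illustration.
sample_lines = [
    '192.0.2.10 - - [29/Oct/2025:12:03:01 +0900] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ExampleBot/1.0)"',
    '192.0.2.10 - - [29/Oct/2025:12:03:02 +0900] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ExampleBot/1.0)"',
    '192.0.2.22 - - [29/Oct/2025:12:03:03 +0900] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for user_agent, count in count_user_agents(sample_lines).most_common():
    label = "(declared bot)" if is_declared_bot(user_agent) else ""
    print(count, user_agent, label)
```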

That's a lot of information, and it can be tiring...
But roughly speaking, based on the above information, we can determine whether the server load is caused by access, and what kind of access can be considered "suspicious."

How to deal with suspicious access?

When actually dealing with suspicious access, you will need to reach an agreement with the customer, but here are some typical countermeasures.

① Block the IP address (quickest method)

After investigating, we found that an abnormal number of accesses were coming from a specific IP address (e.g., 192.0.2.115), so the quickest solution would be to block access from this address.

The way to block varies depending on the environment.
In Nginx, you can write `deny 192.0.2.115;` in the conf file, and in an Apache environment, you can restrict it with .htaccess. If you have an OS firewall or a WAF in front of it, you can block it there as well.
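If the per-IP counts are already at hand, the deny lines for Nginx could be drafted from them. This is only a sketch under assumptions: the threshold of 1000 is a hypothetical value, the function name is made up, and the output is meant to be reviewed by a person (and agreed with the customer) before anything is applied.

```python
import ipaddress


def nginx_deny_lines(ip_counts, threshold):
    """Return 'deny <ip>;' lines for IPs with at least `threshold` accesses."""
    lines = []
    for ip, count in sorted(ip_counts.items(), key=lambda item: -item[1]):
        if count >= threshold:
            ipaddress.ip_address(ip)  # raises ValueError if the string is not a valid IP
            lines.append(f"deny {ip};")
    return lines


# Counts from the per-IP example earlier in this article; 1000 is a hypothetical threshold.
ip_counts = {"192.0.2.115": 10143, "192.0.2.22": 128, "192.0.2.88": 62, "192.0.2.54": 17}
print("\n".join(nginx_deny_lines(ip_counts, threshold=1000)))
```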

② Restrict access to specific paths

Attackers will target "employee-only back doors" (such as wp-login.php), so it's essential to strictly restrict access to those "back doors."

For example, you can add a setting to your Nginx conf file that says, "Only your company's IP address can access wp-login.php." *Please note that the way to write this will vary depending on the version!

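location = /wp-login.php {
    # Allow only your IP address
    allow 1.2.3.4;
    # Deny all else
    deny all;

    # Don't forget to pass the request on to PHP
    include fastcgi_params;
    # Depending on the distribution, fastcgi_params may not set SCRIPT_FILENAME, so set it explicitly
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/php-fpm/www.sock;
}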

For paths that you do not want to be accessed from an unspecified number of IP addresses (for example, xmlrpc.php, which is a frequent attack target and is not needed on many sites), it is effective to use deny all; to block them completely in the first place.

Summary

What did you think?

I used to hate access logs, but after looking at them for a year, I've gradually become able to read them. I hope this article will be of some help to those who are struggling to investigate access logs.

Thank you for reading to the end 🌷

Reference site:
How to check Linux access logs | An easy-to-understand explanation of where to find them, how to view them, and examples of their use
Explaining the commands "grep" and "awk" used to make logs easier to read
[Case study] Restoring and taking measures after WordPress site tampering
Learn the basics of HTTP requests: the difference between GET and POST
Set access restrictions on the admin panel as a WordPress security measure


About the author

Paru

24th graduate, System Solutions Department
My dream for the future is to rent a slightly larger room and keep a cat.