The Linux OOM killer strikes again

As a frequent reader of my blog, you might have noticed that vcloudnine.de was unavailable from time to time. The reason was that my server ran out of memory at night.

Jan  1 05:22:16 webserver kernel: : httpd invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0

Running out of memory is bad for system uptime. Sometimes you have to sacrifice someone to help others.

It is the job of the linux ‘oom killer’ to sacrifice one or more processes in order to free up memory for the system when all else fails.

Source: OOM Killer - linux-mm.org

The OOM killer selects the process whose termination frees up the most memory and that is least important to the system. Unfortunately, in my case that is Apache or MySQL. On the other hand, killing these processes has never brought the system back to life, but that is another story. Something consumed so much memory at night that the OOM killer had to start its deadly work.
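To see which processes the OOM killer currently considers the best candidates, the kernel exposes a per-process score in /proc/<pid>/oom_score; a higher value means the process is more likely to be chosen. The following is a minimal sketch and not part of the troubleshooting described here. The function name list_oom_scores and its optional proc_root argument are made up for illustration.

```shell
#!/bin/sh
# Print "<oom_score> <pid> <command>" for every process, highest score first.
# proc_root is a hypothetical parameter; it defaults to the real /proc.
list_oom_scores() {
    proc_root=${1:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        # Skip entries without a readable score (e.g. the process just exited).
        [ -r "$dir/oom_score" ] || continue
        pid=${dir##*/}
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        comm=$(cat "$dir/comm" 2>/dev/null) || comm='?'
        printf '%s %s %s\n' "$score" "$pid" "$comm"
    done | sort -rn
}
```

Running `list_oom_scores | head` on the web server would show the most likely victims at the top of the list.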

Checking the logs

The OOM killer started its work at around 5 am and killed httpd (Apache).

Jan  1 05:22:16 webserver kernel: : httpd invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
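To list every OOM killer event instead of a single line, the kernel messages can be filtered for the string "invoked oom-killer". This is only a sketch under assumptions: the helper names read_logs and list_oom_events are hypothetical, the log lines are assumed to use the traditional syslog timestamp shown above, and rotated files are assumed to end in .gz.

```shell
#!/bin/sh
# Write the contents of each log to stdout, decompressing rotated *.gz files.
read_logs() {
    for f in "$@"; do
        case $f in
            *.gz) gzip -cd -- "$f" ;;
            *)    cat -- "$f" ;;
        esac
    done
}

# Print "<month> <day> <time> <process>" for each OOM killer invocation.
list_oom_events() {
    read_logs "$@" |
        awk '/invoked oom-killer/ {
            # The process name is the field right before the word "invoked".
            for (i = 2; i <= NF; i++)
                if ($i == "invoked") { print $1, $2, $3, $(i - 1); break }
        }'
}
```

A call such as `list_oom_events /var/log/messages*` would then print one line per event; the path is an assumption and depends on the syslog configuration.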

While I was checking the Apache error_log, this entry caught my attention.

[Sun Jan 01 03:51:04 2017] [notice] SIGHUP received.  Attempting to restart

The next stop was the Apache access_log. At almost the same time as the SIGHUP entry in the error_log, Apache logged a POST request to wp-login.php.

[01/Jan/2017:03:51:03 +0100] "POST /wp-login.php HTTP/1.1" 200 4168

And there were many more attempts… A quick check of older log files showed that this was not the first OOM killer event, and the log entries were the smoking gun, especially the POST requests to wp-login.php.

[root@webserver httpd]# zgrep 'POST /wp-login.php HTTP/1.1' access_log | wc -l
876
[root@webserver httpd]# zgrep 'POST /wp-login.php HTTP/1.1' access_log-20161218.gz | wc -l
14577
[root@webserver httpd]# zgrep 'POST /wp-login.php HTTP/1.1' access_log-20161225.gz | wc -l
12368
[root@webserver httpd]# zgrep 'POST /wp-login.php HTTP/1.1' access_log-20170101.gz | wc -l
12054
[root@webserver httpd]# zgrep 'POST /wp-login.php HTTP/1.1' access_log-20170108.gz | wc -l
6814

The number below each command is the number of POST requests logged in that access_log. The current access_log starts on Jan 08 2017, and since then there have already been 876 POST requests to wp-login.php. This looks like a brute force attack.
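The per-file counts above can also be produced in one pass, and grouping the requests by client shows whether a few hosts are responsible. This is a rough sketch, not the commands actually used: the function names read_logs, count_login_posts and top_login_clients are hypothetical, and the per-client count assumes the client address is the first field of each access_log line, as in Apache's common and combined log formats.

```shell
#!/bin/sh
# Write the contents of each log to stdout, decompressing rotated *.gz files.
read_logs() {
    for f in "$@"; do
        case $f in
            *.gz) gzip -cd -- "$f" ;;
            *)    cat -- "$f" ;;
        esac
    done
}

# Print "<file> <count>" of wp-login.php POST requests for each log file.
count_login_posts() {
    for log in "$@"; do
        # grep -F matches the string literally, -c prints the match count.
        # grep exits non-zero when nothing matches, hence the "|| true".
        n=$(read_logs "$log" | grep -cF 'POST /wp-login.php HTTP/1.1' || true)
        printf '%s %s\n' "$log" "$n"
    done
}

# Print "<count> <client>" for the ten busiest clients across all given logs.
# Assumption: field 1 of every line is the client address.
top_login_clients() {
    read_logs "$@" |
        awk 'index($0, "POST /wp-login.php HTTP/1.1") { hits[$1]++ }
             END { for (ip in hits) print hits[ip], ip }' |
        sort -rn | head -n 10
}
```

Called from the log directory, `count_login_posts access_log*` would print one line per file, and `top_login_clients access_log*` the ten clients with the most login attempts.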

So there is nothing wrong with the server setup; it simply breaks down during a brute force attack.