16 Mar 2015

How to keep your web app logs clean using nginx

I’m sure that if you take a look at your internet exposed web app production logs, you’ll find lots of entries like this one:

I, [2015-03-02T15:43:38.103037 #13610]  INFO -- : Started GET
"/pma/scripts/setup.php" for at 2015-03-02 15:43:38 -0500
F, [2015-03-02T15:43:38.108401 #13610] FATAL -- : ActionController::RoutingError (No route matches [GET] "/pma/scripts/setup.php"):
    actionpack (4.2.0) lib/action_dispatch/middleware/debug_exceptions.rb:21:in `call'
    actionpack (4.2.0) lib/action_dispatch/middleware/show_exceptions.rb:30:in `call'
    railties (4.2.0) lib/rails/rack/logger.rb:38:in `call_app'
    railties (4.2.0) lib/rails/rack/logger.rb:20:in `block in call'
    activesupport (4.2.0) lib/active_support/tagged_logging.rb:68:in `block in tagged'
    activesupport (4.2.0) lib/active_support/tagged_logging.rb:26:in `tagged'
    activesupport (4.2.0) lib/active_support/tagged_logging.rb:68:in `tagged'
    railties (4.2.0) lib/rails/rack/logger.rb:20:in `call'
    request_store (1.1.0) lib/request_store/middleware.rb:8:in `call'
    actionpack (4.2.0)
    lib/action_dispatch/middleware/request_id.rb:21:in `call'
    rack (1.6.0) lib/rack/methodoverride.rb:22:in `call'
    rack (1.6.0) lib/rack/runtime.rb:18:in `call'
    activesupport (4.2.0)
    lib/active_support/cache/strategy/local_cache_middleware.rb:28:in `call'
    rack (1.6.0) lib/rack/sendfile.rb:113:in `call'
    railties (4.2.0) lib/rails/engine.rb:518:in `call'
    railties (4.2.0)
    lib/rails/application.rb:164:in `call'
    /usr/local/rvm/gems/ruby-2.1.2/gems/passenger-4.0.56/lib/phusion_passenger/rack/thread_handler_extension.rb:74:in `process_request'
    /usr/local/rvm/gems/ruby-2.1.2/gems/passenger-4.0.56/lib/phusion_passenger/request_handler/thread_handler.rb:141:in `accept_and_process_next_request'
    /usr/local/rvm/gems/ruby-2.1.2/gems/passenger-4.0.56/lib/phusion_passenger/request_handler/thread_handler.rb:109:in `main_loop'
    /usr/local/rvm/gems/ruby-2.1.2/gems/passenger-4.0.56/lib/phusion_passenger/request_handler.rb:455:in `block (3 levels) in start_threads'

This is from a Rails app. This is a bot trying to find vulnerabilities in webpages. There are hundreds of attempts like this daily. Usually they don’t cause any harm but they are annoying because they pollute your logs.

Try to run this against your log:

    cat production.log | grep php|cgi | wc -l

It will show you how many requests contains php or cgi (assuming you are not using those technologies). Also, for each one of those requests you get the full call stack.

One way to partially mitigate this entries, is to use nginx location directive.

    location ~ php|cgi { return 444; log_not_found off; }

Put this line on your server section and it will block all requests which contains the strings php or cgi. Returning 444 instead of 404, nginx won’t even respond, it will just drop the connection.

You can reload nginx without stopping the service with:

sudo service nginx reload

You might want to customize the regex to match your scenario or add multiple filters depending on your scenario. I’ve found the following samples on forums to protect a Rails log:

location ~ ^/cgi-bin { return 444; log_not_found off; }
location ~ \.(?:php|aspx|asp)$ { return 444; log_not_found off; }
location ~ php\.cgi$ { return 444; log_not_found off; }
location ~ myadmin { return 444; log_not_found off; }