six demon bag

Wind, fire, all that kind of thing!

2020-06-02

Privacy-friendly Logging With Nginx

IP addresses are considered personal information, so your web server is not supposed to log them, at least not in a way that would allow tracing the address back to a person. The usual way to comply with this requirement is to mask part of the visitor's IP address, e.g. 192.168.23.42 → 192.168.23.0 (IPv4) or fd22:e03e:f88a:ea84::42 → fd22:e03e:: (IPv6).


In Nginx you'd define a custom variable to hold that masked address:

# /etc/nginx/anonymize_remote_addr.conf
map $remote_addr $remote_addr_anon {
  127.0.0.1                 $remote_addr;
  ::1                       $remote_addr;
  ~^(?P<ip>\d+\.\d+\.\d+)\. $ip.0;
  ~^(?P<ip>[^:]+:[^:]+):    $ip::;
  default                   0.0.0.0;
}

However, that only works for Nginx v1.11 or newer. Debian-based systems still ship v1.10, so I had to do it the hard way:

# /etc/nginx/anonymize_remote_addr.conf

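# First part: the leading three octets (IPv4) or two groups (IPv6).
# Loopback addresses are passed through unmasked.
map $remote_addr $ip_anon1 {
  127.0.0.1                 $remote_addr;
  ::1                       $remote_addr;
  ~^(?P<ip>\d+\.\d+\.\d+)\. $ip;
  ~^(?P<ip>[^:]+:[^:]+):    $ip;
  default                   0.0.0;
}

# Second part: the suffix that replaces the masked remainder
# (empty for loopback addresses, which are already complete).
map $remote_addr $ip_anon2 {
  127.0.0.1         "";
  ::1               "";
  ~^\d+\.\d+\.\d+\. .0;
  ~^[^:]+:[^:]+:    ::;
  default           .0;
}

# Concatenate both parts into the final variable.
map $ip_anon1$ip_anon2 $remote_addr_anon {
  ~^(?P<ip>.*) $ip;
  default      0.0.0.0;
}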
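
To check what the map actually produces before relying on it in the logs, a throwaway endpoint can echo the variable back. This is only a sketch: port 8080 and the /anon-test path are made up for illustration, the block has to live inside the http context, and it should be removed again after testing.

```nginx
# Temporary test server; port and path are hypothetical.
server {
  listen 8080;

  location = /anon-test {
    default_type text/plain;
    # Echo the masked address as computed by the map(s) above.
    return 200 "$remote_addr_anon\n";
  }
}
```

Requesting that URL from another machine (e.g. with curl) should return the caller's masked address rather than the full one.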

Define a log format that uses $remote_addr_anon instead of $remote_addr (for good measure you may also want to omit $http_referer and $http_user_agent from the log format)

log_format main '[$time_iso8601] $remote_addr_anon $remote_user "$request" $status $bytes_sent $request_time';

and use that format for your access log(s):

access_log /var/log/nginx/access.log main;
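
Put together, the pieces so far might be wired up roughly like this. It is a sketch under assumptions: the surrounding nginx.conf layout is hypothetical, and the map file needs an explicit include here because it does not live in a directory that the configuration already pulls in.

```nginx
# /etc/nginx/nginx.conf (excerpt; layout is an assumption)
http {
  # Defines $remote_addr_anon; map is only valid in the http context.
  include /etc/nginx/anonymize_remote_addr.conf;

  # The format string may be split across several quoted strings,
  # which Nginx concatenates.
  log_format main '[$time_iso8601] $remote_addr_anon $remote_user '
                  '"$request" $status $bytes_sent $request_time';

  access_log /var/log/nginx/access.log main;
}
```

If the stock configuration already contains an access_log line for the same file without a format name, that line would have to be replaced, otherwise requests keep being logged in the default format as well.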

For error logging the IP address cannot be anonymized, though. If you want to go hardcore you can drop error logging entirely by logging errors to /dev/null:

error_log /dev/null crit;

However, I'd argue that the ability to debug errors is an operational requirement for someone running a web server, so keeping error logs for that purpose should be compliant with the relevant laws, e.g. art. 6 (1) GDPR:

  1. Processing shall be lawful only if and to the extent that at least one of the following applies:
    [...]
    (f) processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child.

If logging is limited to just errors and the error logs are not retained for extended periods of time, that should be fine (in my layman's opinion; I am not a lawyer, so don't take this as legal advice).

I'd also argue that another operational requirement is the ability to protect the server from attackers. One of the protective measures I implement to lock out bots trying to exploit vulnerabilities is fail2ban. But I can't use the masked client addresses with fail2ban, because then I would be blocking entire /24 subnets, which would be throwing out the baby with the bathwater.

Fortunately, Nginx allows for multiple access logs with different formats. So you can create another log format with just the information fail2ban needs:

log_format fail2ban '[$time_iso8601] $remote_addr "$request" $status';

To further limit the logged data, create a variable $client_error that takes the value 1 or 0 depending on whether the request status indicates a client error (4xx) or not.

# /etc/nginx/conf.d/client_error.conf
map $status $client_error {
  ~^4     1;
  default 0;
}

That way logging for fail2ban can be restricted to client errors only:

access_log /var/log/nginx/fail2ban.log fail2ban if=$client_error;
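
One detail worth keeping in mind: access_log directives are inherited from the enclosing level only if the current level defines none of its own. So if a server block declares any access log, it has to declare both. A minimal sketch, with a hypothetical server name:

```nginx
server {
  listen 80;
  server_name example.org;  # hypothetical

  # Declaring any access_log here replaces the ones inherited from
  # the http level, so both logs are listed explicitly.
  access_log /var/log/nginx/access.log   main;
  access_log /var/log/nginx/fail2ban.log fail2ban if=$client_error;
}
```

With this in place, every request goes to the anonymized log, and only requests answered with a 4xx status additionally end up in the fail2ban log with the full address.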

Proper log rotation ensures that logs are not retained longer than they are actually needed:

# /etc/logrotate.d/nginx

# keep access logs 4 weeks
/var/log/nginx/access.log {
  weekly
  missingok
  rotate 4
  compress
  delaycompress
  notifempty
  sharedscripts
  postrotate
    service nginx rotate >/dev/null 2>&1
  endscript
}

# keep error logs 7 days
/var/log/nginx/error.log {
  daily
  missingok
  rotate 7
  compress
  delaycompress
  notifempty
  sharedscripts
  postrotate
    service nginx rotate >/dev/null 2>&1
  endscript
}

# keep logs for fail2ban until next day
/var/log/nginx/fail2ban.log {
  daily
  missingok
  rotate 1
  notifempty
  sharedscripts
  postrotate
    service nginx rotate >/dev/null 2>&1
  endscript
}

Posted 15:27