7

I want to log the $request_body in the access logs.

But some of the requests have some JSON fields that are sensitive like passwords.

Example:

[2019-03-28] 201 - POST /api/user/add HTTP/1.1 - {\x22email\x22:\x22[email protected]\x22,\x22password\x22:\x22myPassword\x22}

Is there a way to obfuscate the password value so the output would look something like this:

[2019-03-28] 201 - POST /api/user/add HTTP/1.1 - {\x22email\x22:\x22[email protected]\x22,\x22password\x22:\x22****\x22}
1
  • 5
    You really don't want to log the request bodies. Even if you figure out how to mask the passwords, you'll have to remember to update the masking code whenever you create a new API with sensitive information or modify an existing API. In my experience (25 years), no one ever remembers to. It also opens you up to other attack vectors--for example, what happens if someone crafts requests with a 1GB parameter (easy and fast enough with HTTP compression) and you're trying to log them all? Commented Apr 3, 2019 at 19:03

2 Answers

9
+25

Here are some regex patterns which can be used to obfuscate request body data in various formats.

Of course, the first thing you need to do is add the obfuscated data to the log line format with the log_format directive:

log_format custom '$remote_addr - $remote_user [$time_local] '
                    '"$request" "$obfuscated_request_body" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';
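
For context, here is a minimal sketch of how this format might be attached to a location. The listen port, upstream address, log path, and buffer size are placeholders, not values from the question. According to the nginx documentation, $request_body is only available in locations processed by proxy_pass, fastcgi_pass, uwsgi_pass, or scgi_pass, and only when the body was read into a memory buffer, so an empty value in the log usually points to one of those two conditions.

```nginx
# http context: the log_format above and the map blocks below go here

server {
    listen 80;

    location /api/ {
        # Hypothetical backend; $request_body is filled only when nginx
        # actually reads the body, e.g. while proxying
        proxy_pass http://127.0.0.1:8080;

        # Bodies larger than this buffer are written to a temporary file,
        # in which case $request_body is empty (size is an assumption)
        client_body_buffer_size 16k;

        # Write this location's access log in the "custom" format
        access_log /var/log/nginx/access.log custom;
    }
}
```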

Let's look at the following post body data formats (assuming the field we need to obfuscate is password).

  • Request body is a JSON string (typical of REST API request)

JSON sample:

{"email":"[email protected]","password":"myPassword"}

Escaped JSON string:

{\x22email\x22:\x22[email protected]\x22,\x22password\x22:\x22myPassword\x22}

nginx map block:

map $request_body $obfuscated_request_body {
    "~(.*[{,]\\x22password\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
    default $request_body;
}
  • Request body is a JSON array of name and value pairs (returned by jQuery serializeArray() function)

JSON sample:

[{"name":"email","value":"[email protected]"},{"name":"password","value":"myPassword"}]

Escaped JSON string:

[{\x22name\x22:\x22email\x22,\x22value\x22:\x22[email protected]\x22},{\x22name\x22:\x22password\x22,\x22value\x22:\x22myPassword\x22}]

nginx map block:

map $request_body $obfuscated_request_body {
    "~(.*[\[,]{\\x22name\\x22:\\x22password\\x22,\\x22value\\x22:\\x22).*?(\\x22}[,\]].*)" $1********$2;
    default $request_body;
}
  • Request body is a urlencoded string (submitted by an HTML form with enctype="application/x-www-form-urlencoded")

POST body sample:

login=test%40test.com&password=myPassword

nginx map block:

map $request_body $obfuscated_request_body {
    ~(^|.*&)(password=)[^&]*(&.*|$) $1$2********$3;
    default $request_body;
}

If you need to obfuscate more than one data field, you can chain several map transformations:

log_format custom '$remote_addr - $remote_user [$time_local] '
                  '"$request" "$obfuscated_request_body_2" $status $body_bytes_sent '
                  '"$http_referer" "$http_user_agent"';

map $request_body $obfuscated_request_body_1 {
    "~(.*[{,]\\x22password\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
    default $request_body;
}

map $obfuscated_request_body_1 $obfuscated_request_body_2 {
    "~(.*[{,]\\x22email\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
    default $obfuscated_request_body_1;
}

All the regexes above work only with the escape=default escaping mode of the log_format nginx directive! If for some reason you need to change this mode to escape=json (available from nginx 1.11.8) or escape=none (available from nginx 1.13.10), I built regexes for these escaping modes too, but for some strange reason couldn't get them to work with nginx until I specified the pcre_jit on; directive (although they pass other PCRE tests). For those who are interested, these regexes are

  • for escape=json escaping mode:
map $request_body $obfuscated_request_body {
    "~(.*[{,]\\\"password\\\":\\\")(?:[^\\]|\\{3}\"|\\{2}[bfnrt]|\\{4})*(\\\"[,}].*)" $1********$2;
    default $request_body;
}

for JSON string, and

map $request_body $obfuscated_request_body {
    "~(.*[\[,]{\\\"name\\\":\\\"password\\\",\\\"value\\\":\\\")(?:[^\\]|\\{3}\"|\\{2}[bfnrt]|\\{4})*(\\\"}[,\]].*)" $1********$2;
    default $request_body;
}

for JSON array of name and value pairs.

  • for escape=none escaping mode:
map $request_body $obfuscated_request_body {
    "~(.*[{,]\"password\":\")(?:[^\\\"]|\\.)*(\"[,}].*)" $1********$2;
    default $request_body;
}

for JSON string, and

map $request_body $obfuscated_request_body {
    "~(.*[\[,]{\"name\":\"password\",\"value\":\")(?:[^\\\"]|\\.)*(\"}[,\]].*)" $1********$2;
    default $request_body;
}

for JSON array of name and value pairs.
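
As a rough sketch of where these settings live: the escaping mode is selected with the escape parameter of log_format, and pcre_jit is set in the main (top-level) context. pcre_jit only has an effect when the PCRE library nginx uses was built with JIT support. The layout below is an assumption about a typical nginx.conf, not the author's actual config.

```nginx
# main (top-level) context
pcre_jit on;

events {}

http {
    # escape=json needs nginx 1.11.8+, escape=none needs 1.13.10+
    log_format custom escape=json
                      '$remote_addr - $remote_user [$time_local] '
                      '"$request" "$obfuscated_request_body" $status $body_bytes_sent '
                      '"$http_referer" "$http_user_agent"';

    # map block for the chosen escaping mode goes here
}
```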

Bonus - obfuscating GET request query parameters

Sometimes people also need to obfuscate data passed as GET request query parameters. To do this while preserving the original nginx access log format, let's look at the default access log format first:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

The nginx built-in $request variable can be represented as the $request_method $request_uri $server_protocol sequence of variables:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request_method $request_uri $server_protocol" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

We need to obfuscate part of $request_uri variable data:

log_format custom '$remote_addr - $remote_user [$time_local] '
                  '"$request_method $obfuscated_request_uri $server_protocol" $status $body_bytes_sent '
                  '"$http_referer" "$http_user_agent"';

map $request_uri $obfuscated_request_uri {
    ~(.+\?)(.*&)?(password=)[^&]*(&.*|$) $1$2$3********$4;
    default $request_uri;
}

To obfuscate several query parameters you can chain several map translations as shown above.
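
A sketch of such a chain for query parameters, assuming a second sensitive parameter named token (a made-up name for illustration). These two maps would be used in place of the single map above; the intermediate variable name is likewise arbitrary.

```nginx
# First pass: mask the password parameter
map $request_uri $obfuscated_request_uri_1 {
    ~(.+\?)(.*&)?(password=)[^&]*(&.*|$) $1$2$3********$4;
    default $request_uri;
}

# Second pass: mask the hypothetical "token" parameter in the output of
# the first pass; the default falls back to the first-pass result so an
# already masked password is not lost
map $obfuscated_request_uri_1 $obfuscated_request_uri {
    ~(.+\?)(.*&)?(token=)[^&]*(&.*|$) $1$2$3********$4;
    default $obfuscated_request_uri_1;
}
```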

Update - safety considerations

Alvin Thompson, commenting on the OP's question, mentioned some attack vectors such as very large compressed requests. It is worth mentioning that nginx will log these requests "as-is" in their compressed form, so log files will not grow in an unpredictable way.

Assuming our log file has the following format:

log_format debug '$remote_addr - $remote_user [$time_local] '
                 '"$request" $request_length $content_length '
                 '"$request_body" $status $body_bytes_sent '
                 '"$http_referer" "$http_user_agent"';

a request with a gzipped body of 5,000 spaces will be logged as

127.0.0.1 - - [09/Feb/2020:05:27:41 +0200] "POST /dump.php HTTP/1.1" 193 41 "\x1F\x8B\x08\x00\x00\x00\x00\x00\x00\x0B\xED\xC11\x01\x00\x00\x00\xC2\xA0*\xEB\x9F\xD2\x14~@\x01\x00\x00\x00\x00o\x03`,\x0B\x87\x88\x13\x00\x00" 200 6881 "-" "curl/7.62.0"

As you can see, the $request_length and $content_length values (193 and 41) reflect the length of the incoming data from the client and not the byte count of the decompressed data stream.

In order to filter abnormally large uncompressed requests, you can additionally filter request bodies by their length:

map $content_length $processed_request_body {
    # Here are some regexes for log filtering by POST body maximum size
    # (only one should be used at a time)

    # Content length value is 4 digits or more ($content_length > 999)
    "~(.*\d{4})" "Too big (request length $1 bytes)";

    # Content length > 499
    # (the trailing $ makes $1 capture the whole value, not just a prefix)
    "~^((?:[5-9]|\d{2,})\d{2})$" "Too big (request length $1 bytes)";

    # Content length > 2999
    "~^((?:[3-9]|\d{2,})\d{3})$" "Too big (request length $1 bytes)";

    default $request_body;
}

map $processed_request_body $obfuscated_request_body {
    ...
    default $processed_request_body;
}
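
To make the two stages concrete, here is one possible arrangement for a urlencoded body. It is only a sketch: the 999-byte threshold and the choice of the urlencoded password regex are assumptions picked for illustration, and any of the other regexes above could be used in the second stage instead.

```nginx
# Stage 1: replace oversized bodies with a short notice
# (4 or more digits, i.e. Content-Length > 999; threshold is illustrative)
map $content_length $processed_request_body {
    "~^(\d{4,})$" "Too big (content length $1 bytes)";
    default $request_body;
}

# Stage 2: mask the password field in whatever stage 1 produced
map $processed_request_body $obfuscated_request_body {
    ~(^|.*&)(password=)[^&]*(&.*|$) $1$2********$3;
    default $processed_request_body;
}
```

Note that $content_length comes from the Content-Length request header, so a request sent without that header falls through to the default branch of the first map.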

-1

Look at this blog which talks about masking user data for logs: https://www.nginx.com/blog/data-masking-user-privacy-nginscript/

1 Comment

From How to answer: Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline.
