Abstract:
Recently I have to work out a solution for logging information in my Phoenix application.
I have some PHI information going in and out of my API Servers written in Elixir, using Phoenix.
I want to log those information for both operational purposes and debugging purposes.
For operational purposes, I have an extremely secured private server to store PHI, with all the necessary encryption/decryption, and only a few authorized people can access. The PHI information can be stored at they are, at-rest encrypted, of course.
For debugging purposes, I use one secured server that allow all the software engineers and qa personnel to access. The PHI information will be masked for this server, so no original information can be found, what soever. For example %User{ first_name: “*****”, last_name: “****”}
There is no way to convert back ***** to real information. So everybody can have a look at those information.
Here are the requirements:
1- Retain as much as possible the Logger calls in the code. We have a lot of Logger calls in the code, so ideally we should have to change only minimum log calls that log the PHI information. The other calls must just work.
2- For each of the call to the Logger, for example Logger.debug(“Call list_user:” … whatever users list from database here), it must do the following things:
- Log to the extremely secured PHI log server, send only information that is PHI, so we don’t redundantly write Non-PHI information there.
- Log to the Non-PHI log server: All non-phi information is stored as they are, and all the phi information is stored with phi data masked as “*”
3- All the current logger backends, for example: :console, LoggerFileBackend should still work with no modification, and they must log non-phi data, phi-data securely, meaning no real PHI data can be printed to those existing logger backends.
4- When people add more Logger backends to log to different services, for example Timber or Spark, it must be incorporated in to PHI – Non PHI tasks seamlessly. If they misconfigure the logger, the PHI information should not go there unmasked. And the existing calls to Logger in the current application code must not be changed.
5- When software engineers call the Logger.info, Logger.debug, Logger.error in the code, the PHI information cannot be accidentally leaked.
For example some existing code or new code like this: Logger.error(“Error updating user: #{inspect(user)}) will automatically print the masked version of data to log, instead of the original.
Considering all those requirements, you can see it is not an easy one. It also explains why in many cases, companies leaked username, password, credit cards and other confidential informations all over the places, even when they encrypted information in the database. They forgot about the network transportations and logging at many levels of complex systems.
It took me quite a few days to complete the whole solution that satisfied all the requirements above.
The solution:
I create a ProxyBackend that acts as the only backend in the system. The application also configures a backend list that contains all the logger backends that ProxyBackend can call.
The Elixir Logger sends log messages only to this ProxyBackend. Then from within the ProxyBackend, it will call the handle_event of each of the logger backend in the backend list to write log messages appropriately.
Configuration:
The ProxyBackend is configured similar to any other logger backends in Elixir. Depends on the environment you want to configure, you can write the configurations to files in the directory “config”.
config :logger,
backends: [{CustomBackendDemo.ProxyBackend, :proxy_log}]
Then a list of backends that we want to send logs from ProxyBackend must be configured:
config :logger, :proxy_log,
backend_list: [
%{
module: LoggerFileBackend,
backend_type: :only_phi,
name: :only_phi_log,
init_params: {LoggerFileBackend, :only_phi_log}
},
%{
module: LoggerFileBackend,
backend_type: :all_no_phi,
name: :all_no_phi_log,
init_params: {LoggerFileBackend, :all_no_phi_log}
},
%{
module: LoggerFileBackend,
backend_type: :all,
name: :all_log,
init_params: {LoggerFileBackend, :all_log}
}
]
In this case, I use LoggerFileBackend logging to different files with different types of information for demo purposes. In reality, the backend list can have any backend, for example Logger.Backends.Console to log to :console, Timber.LoggerBackends.HTTP to log to Timber, or SlackLoggerBackend.Logger to log to Slack.
For demo purpose, this application handles PHI information in 3 different ways:
- :only_phi : Write only phi information, not masked, not encrypted. We assume that any record with first_name, last_name, birthdate, email is PHI. Other information which is not phi, for example “Hello, World”, will not be written to this log.
- :all_no_phi : All information is written to this log, but the PHI information will be masked with “*”, so nobody can read the PHI. The non PHI information is also written here.
- :all : All information is written to this log, the PHI information is written as it is, not masked, not encrypted. The non PHI information is also written here.
In this demo application, I create 3 different types of backends: :only_phi, :all_no_phi and :all.
Of course, depends on the logging purpose, we can modify ProxyBackend to handle more types, or entirely different set of types and make it configurable. But for the simplicity of the demo, let make it handle 3 types.
The name for each of the backend in backend_list is used to configure options for each of the backend later.
The init_params specify how each backend’s init method must be called.
Normally, when we configure backend for an Elixir program like this:
config :logger,
backends: [:console, {LoggerFileBackend, :error_log}, Timber.LoggerBackends.HTTP]
Console backend will be called with Logger.Backends.Console .init(:console)
LoggerFileBackend will be called with LoggerFileBackend.init({__MODULE__, name}), in this case the name is :error_log.
Timber backend will be called with Timber.LoggerBackends.HTTP.init(__MODULE__)
So we know which init will be called for each backend by the way they are written in the config :backends. When we use ProxyBackend, we take those things and put in the :init_params for each backend in the backend_list.
The next step is configure specific options for each of the backends in the backend list. They are written like this:
config :logger, :only_phi_log,
path: "./logs/only_phi.log",
level: :debug
config :logger, :all_no_phi_log,
path: "./logs/all_no_phi.log",
level: :debug
config :logger, :all_log,
path: "./logs/all.log",
level: :debug
The names after the comma of “config :logger,” must match with the names of backends in the backend_list.
When we are done with the configuration, the application will log to multiple backends differently, with different types of information, as we want them to do.
Here is the full demo application:
https://github.com/chauhonglinh/custom_backend_demo
When you run a server, and use a web browser to visit http://localhost:4000/api/users ,
it will write to 3 log files in the directory logs/
Take a look at them to see the differences.