# Enabling self-hosted website analytics with NixOS

Recently, I was adding visitor analytics to the Extreme Wavefront Control Lab blog. We want to know, within the strictures of relevant privacy laws, who's reading our amazing blog posts. (There's some history here.)

However, I have a thing about funnelling visitor information to Google, so Google Analytics was a non-starter.

I use Linux for this server and some others I manage, but I got frustrated with the magic incantations required to configure software and services I wanted to use. (Install this, run this command you haven't heard of, then sacrifice a chicken under a full moon to enable local DNS caching.) To be clear, I didn't have a problem with the opaque nature of the incantations. I just kept forgetting them, and having to look them up again, at which point they've invariably changed slightly. Even if they haven't, I've doubtless forgotten some key intermediate step that the whole thing hinged upon.

That's why I love NixOS so much. It's still Linux, and the incantations are equally opaque (to me), but it has a wonderful property: once you've written down your desired configuration in their Nix configuration language, the description is sufficient to apply that configuration to any NixOS server.

# Preliminaries

This assumes you have a website you want to count visits to at example.com, and you've got NGINX set up to serve requests to both example.com and stats.example.com. (Whether by adding a DNS record for stats.example.com, or just setting a wildcard record for *.example.com pointing to the same server as example.com.)
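If NGINX isn't configured yet, the main site can be declared in configuration.nix as well. The following is only a minimal sketch: the document root `/var/www/example.com` is an assumed path, and it serves plain HTTP with no TLS. As far as I can tell, the stats.example.com virtual host doesn't need to be declared here, because the Matomo module generates it from `services.matomo.nginx`.

```nix
{ config, pkgs, lib, ... }:
{
  services.nginx = {
    enable = true;
    # Main site. The root path is an assumption for this example;
    # point it at wherever your site's files actually live.
    virtualHosts."example.com" = {
      root = "/var/www/example.com";
    };
  };

  # Let HTTP and HTTPS traffic through the NixOS firewall.
  networking.firewall.allowedTCPPorts = [ 80 443 ];
}
```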

# Matomo

Configuring Matomo to serve on stats.example.com is as simple as adding this to your /etc/nixos/configuration.nix:

    services.matomo = {
      enable = true;
      nginx = {
        serverName = "stats.example.com";
      };
    };


When you go to stats.example.com, you'll be prompted to complete the setup process. (Alas, not yet automated by NixOS.) The first thing you need is a MySQL database and username/password pair to connect with. If you don't already have those set up, read on to configure them through configuration.nix.

# MySQL

In your /etc/nixos/configuration.nix, you probably have something like

    { config, pkgs, lib, ... }:
    {
      < your configuration goes here >
    }


The let ... in structure in the Nix language defines some local values. We need to name our database, user, and password, so we add

    { config, pkgs, lib, ... }:
    let
      statsConfig = {
        db = "statsdb";
        user = "stats";
        # Placeholder value: substitute a password of your own.
        password = "choose-a-password";
      };
    in
    {
      < your configuration goes here >
    }


Now, in the { } that enclose your configuration, enable MySQL using its NixOS options:

    # MySQL for WordPress and Matomo/Piwik
    services.mysql = {
      enable = true;
      package = pkgs.mariadb;
      bind = "localhost";
    };


Here we're using pkgs.mariadb to indicate we want the MariaDB fork of MySQL. We've also declared that we want the MySQL server available only to programs running on this same machine (localhost). (The default is to bind to all interfaces, which you probably don't want.)

Now, stateless declarative configuration doesn't go well with a database. However, we can at least tell NixOS to ensure the existence of the database and user account we named in our statsConfig attribute set.

    services.mysql = {
      enable = true;
      package = pkgs.mariadb;
      bind = "localhost";
      ensureDatabases = [
        statsConfig.db
      ];
      ensureUsers = [
        {
          name = "${statsConfig.user}";
          ensurePermissions = {
            "${statsConfig.db}.*" = "ALL PRIVILEGES";
          };
        }
      ];
    };


Recall that "this is ${variableName}" inserts the value of variableName in the string. If variableName = "foo", the resulting string would then be this is foo. We can even use this interpolation in attribute names, like "${statsConfig.db}.*".
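To see interpolation in isolation, here's a small standalone expression. The file name `interpolation.nix` is made up for the example, and the values mirror the statsConfig attribute set from earlier.

```nix
# Evaluate with: nix-instantiate --eval --strict interpolation.nix
let
  variableName = "foo";
  statsConfig = {
    db = "statsdb";
    user = "stats";
  };
in
{
  # Interpolation inside an ordinary string: evaluates to "this is foo".
  plain = "this is ${variableName}";

  # Interpolation inside an attribute name, as used for ensurePermissions.
  permissions = {
    "${statsConfig.db}.*" = "ALL PRIVILEGES";
  };

  # Listing the attribute names shows the key that was built: [ "statsdb.*" ].
  keys = builtins.attrNames {
    "${statsConfig.db}.*" = "ALL PRIVILEGES";
  };
}
```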

Unfortunately, the configuration options under services.mysql don't let you ensure a password is set. We can write a one-shot SystemD service that exists to set the password on each reboot (or activation of a new configuration). After the closing brace of the MySQL configuration block, add:

    systemd.services.setdbpass = {
      description = "MySQL database password setup";
      wants = [ "mysql.service" ];
      # Run only once MySQL has been started.
      after = [ "mysql.service" ];
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        ExecStart = ''
          ${pkgs.mariadb}/bin/mysql -e "grant all privileges on ${statsConfig.db}.* to ${statsConfig.user}@localhost identified by '${statsConfig.password}';" ${statsConfig.db}
        '';
        User = "root";
        PermissionsStartOnly = true;
        RemainAfterExit = true;
      };
    };

This seems like a bit of magic, but ${pkgs.mariadb} is just the path in the Nix store where MariaDB is installed (which is why we must append /bin/mysql to get the actual binary). ExecStart specifies the command we want to run, and once all our statsConfig.whatevers have been substituted in appropriately, it's a command that sets the database password.

Now we can finish setting up Matomo at stats.example.com using the database name, username, and password we wrote in our configuration.nix.

# nginx and GeoIP2

I like knowing where my website visitors are coming from. Each visit is a request to serve a page to a particular IP address, which in turn can give you an approximate location. For instance, most University of Arizona IPv4 addresses begin with 128.196..., so you can guess they're in Tucson, Arizona and be right most of the time. The lookup of geographical location from IP addresses is sometimes called "GeoIP". Unfortunately, there appears to be an incompatibility between Matomo installed through Nix (and hence, served from the read-only Nix store) and the GeoIP features built into Matomo itself.

Fortunately, it's not too hard to make geolocating visitors happen in NGINX through (probably) more performant C code rather than Matomo's PHP. It just requires a bit more configuration. (Or, see the last section if you're impatient.)

The leading provider of GeoIP databases, MaxMind, recently got rid of their old database format and data package in favor of a format called GeoIP2. Most of the documentation online assumes you're using this format. In fact, NGINX doesn't even know how to read MaxMind DB files in the new format without a plugin. So, let's get that plugin!

Back in the let ... in block before the configuration expression, add the following lines:

    ngx_http_geoip2_module = pkgs.stdenv.mkDerivation rec {
      name = "ngx_http_geoip2_module-a28ceff";
      src = pkgs.fetchzip {
        url = "https://github.com/leev/ngx_http_geoip2_module/archive/a28ceffdeffb2b8ec2896bd1192a87f4f2c7a12a.zip";
        sha256 = "0ba1bjnkskd241ix7ax27n3d9klpymigm0wdjgd4yhlsgbxsxvdx";
      };
      installPhase = ''
        mkdir $out
        cp *.c config $out/
      '';
      fixupPhase = "";
    };


Here I've used a particular commit hash in the GitHub URL for reproducibility. (If we used the master.zip file, the sha256 we get for it will change over time.) We also need to replace the NGINX version installed by default with a version that's built with this module included. This involves dark Nix/NixOS/Nixpkgs magic. In the services.nginx = block that defines your website configuration, add

    services.nginx = {
      ...
      package = pkgs.nginxStable.overrideAttrs (oldAttrs: rec {
        configureFlags = oldAttrs.configureFlags ++ [ "--add-module=${ngx_http_geoip2_module}" ];
        buildInputs = oldAttrs.buildInputs ++ [ pkgs.libmaxminddb ];
      });
      ...
    };

Here we've taken the default package configured to provide the nginx service, pkgs.nginxStable, and overridden some of its attributes. Just enough, in fact, to make sure --add-module= is passed to the configure script and pkgs.libmaxminddb is built and available at build time for our GeoIP2 plugin.

## Downloading GeoLite2 databases

We're going to put the database files mapping IP addresses to locations in /var/lib/geoip-databases, so we'd better create it:

    sudo mkdir -p /var/lib/geoip-databases
    sudo chmod 755 /var/lib/geoip-databases

Next you'll need to download the GeoLite2 database files, extract them, and move the .mmdb file from each archive into /var/lib/geoip-databases. Unfortunately, these lookup tables go out of date quickly. We'll see later on how to automatically update them.

## Configuring NGINX

Whenever a request comes in, we want to pull out every relevant value we can from the GeoIP database and make them available to PHP scripts invoked by nginx (i.e. Matomo). NixOS has an attribute services.nginx.appendHttpConfig as a sort of escape hatch for when you really just need to hand-write some part of the NGINX config file, and that's what we'll use. (The syntax of the geoip2 {} blocks is documented at the plugin author's GitHub page for the module, and the fastcgi_param MM_... lines are based on this listing of the default names Matomo looks for.)

    services.nginx.appendHttpConfig = ''
      geoip2 /var/lib/geoip-databases/GeoLite2-Country.mmdb {
        auto_reload 5m;
        $geoip2_metadata_country_build metadata build_epoch;
        $geoip2_data_country_code country iso_code;
        $geoip2_data_country_name country names en;
        $geoip2_data_continent_code continent code;
        $geoip2_data_continent_name continent names en;
      }

      geoip2 /var/lib/geoip-databases/GeoLite2-City.mmdb {
        $geoip2_data_city_name city names en;
        $geoip2_data_lat location latitude;
        $geoip2_data_lon location longitude;
      }

      geoip2 /var/lib/geoip-databases/GeoLite2-ASN.mmdb {
        auto_reload 5m;
        $geoip2_data_asn autonomous_system_number;
        $geoip2_data_asorg autonomous_system_organization;
      }

      fastcgi_param MM_CONTINENT_CODE $geoip2_data_continent_code;
      fastcgi_param MM_CONTINENT_NAME $geoip2_data_continent_name;
      fastcgi_param MM_COUNTRY_CODE $geoip2_data_country_code;
      fastcgi_param MM_COUNTRY_NAME $geoip2_data_country_name;
      fastcgi_param MM_CITY_NAME $geoip2_data_city_name;
      fastcgi_param MM_LATITUDE $geoip2_data_lat;
      fastcgi_param MM_LONGITUDE $geoip2_data_lon;
      fastcgi_param MM_ISP $geoip2_data_asorg;
    '';

# Auto-updating GeoLite2 databases

The GeoLite2 databases update monthly (or weekly, in the case of the ASN mapping). Ideally, that would happen without your intervention. A short shell script is all you need to update your databases in /var/lib/geoip-databases at a regular interval. To create said script within your NixOS configuration, add to the let ... in block before your config block:

    geolite2UpdaterConfig = rec {
      dbBaseUrl = "https://geolite.maxmind.com/download/geoip/database";
      databaseDir = "/var/lib/geoip-databases";
      interval = "weekly";
      geoip-updater = pkgs.writeScriptBin "geoip-updater" ''
        #!${pkgs.runtimeShell}
        skipExisting=0
        for arg in "$@"; do
          case "$arg" in
            --skip-existing)
              skipExisting=1
              echo "Option --skip-existing is set: not updating existing databases"
              ;;
            *)
              echo "Unknown argument: $arg";;
          esac
        done
        cd ${databaseDir}
        for dbName in "GeoLite2-City" "GeoLite2-Country" "GeoLite2-ASN"; do
          if [ "$skipExisting" -eq 1 -a -f $dbName.mmdb ]; then
            echo "Skipping existing file: $dbName.mmdb"
            continue
          fi
          ${pkgs.curl}/bin/curl -OL ${dbBaseUrl}/$dbName.tar.gz
          ${pkgs.gzip}/bin/gzip -cd $dbName.tar.gz | ${pkgs.gnutar}/bin/tar xvf -
          mv $dbName\_*/$dbName.mmdb .
          rm -rfv $dbName\_*/ $dbName.tar.gz
        done
      '';
    };

This creates a script called geoip-updater in the Nix store that updates the City, Country, and ASN databases, along with an attribute set that holds some details of the configuration (e.g. locations). Now, to run it at regular intervals. In the old days, a cron job would have been sufficient. But it's 2019, and we all love SystemD unit files now, so let's do that instead. The following goes in your main configuration block:

    users.groups.srv = {};
    users.users.geoip = {
      isSystemUser = true;
      group = "srv";
      description = "GeoIP database updater";
    };
    systemd.timers.geoip-updater = {
      description = "GeoIP Updater Timer";
      partOf = [ "geoip-updater.service" ];
      wantedBy = [ "timers.target" ];
      timerConfig.OnCalendar = geolite2UpdaterConfig.interval;
      timerConfig.Persistent = "true";
      timerConfig.RandomizedDelaySec = "3600";
    };
    systemd.services.geoip-updater = {
      description = "GeoIP Updater";
      after = [ "network-online.target" "nss-lookup.target" ];
      wants = [ "network-online.target" ];
      preStart = with geolite2UpdaterConfig; ''
        mkdir -p "${databaseDir}"
        chmod 755 "${databaseDir}"
        chown geoip:srv "${databaseDir}"
      '';
      serviceConfig = {
        ExecStart = "${geolite2UpdaterConfig.geoip-updater}/bin/geoip-updater";
        User = "geoip";
        PermissionsStartOnly = true;
      };
    };
    systemd.services.geoip-updater-setup = {
      description = "GeoIP Updater Setup";
      after = [ "network-online.target" "nss-lookup.target" ];
      wants = [ "network-online.target" ];
      wantedBy = [ "multi-user.target" ];
      conflicts = [ "geoip-updater.service" ];
      preStart = with geolite2UpdaterConfig; ''
        mkdir -p "${databaseDir}"
        chmod 755 "${databaseDir}"
        chown geoip:srv "${databaseDir}"
      '';
      serviceConfig = {
        ExecStart = "${geolite2UpdaterConfig.geoip-updater}/bin/geoip-updater --skip-existing";
        User = "geoip";
        PermissionsStartOnly = true;
        # So it won't be (needlessly) restarted:
        RemainAfterExit = true;
      };
    };


Here we've added a user and group for the updater job, ensured the existence of the destination folder, and indicated that we want our updater script called at startup (and nixos-rebuild switch) and at a regular interval set by geolite2UpdaterConfig.interval. (This is largely based on the code from the now-deprecated GeoIP updater in NixOS, used under the terms of the MIT License.)

You may well ask why we did this instead of just adding a line to the system crontab. It's certainly more verbose. The answer is that it was simpler to adapt the existing Nix expressions than write and test crontab additions. It does add some niceties, like the command systemctl status geoip-updater to see a few lines of output from the script and verify that everything's working, or systemctl start geoip-updater to trigger it manually exactly as it would run via the timer.
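For comparison, the crontab route can also be written declaratively. This is a sketch of that alternative, not what this post uses. It goes through the `services.cron.systemCronJobs` option, uses a stand-in script in place of the real geoip-updater, assumes the geoip user declared above exists, and picks an arbitrary schedule of 04:00 every Monday.

```nix
{ config, pkgs, lib, ... }:
let
  # Stand-in for the updater script; in practice you would reference
  # geolite2UpdaterConfig.geoip-updater from the let block instead.
  updater = pkgs.writeScriptBin "geoip-updater" ''
    #!${pkgs.runtimeShell}
    echo "database update would run here"
  '';
in
{
  services.cron = {
    enable = true;
    systemCronJobs = [
      # Fields: minute hour day-of-month month day-of-week user command.
      # The "geoip" user must already be declared in users.users.
      "0 4 * * 1  geoip  ${updater}/bin/geoip-updater"
    ];
  };
}
```

With this approach the script's output ends up wherever cron sends it, rather than in the journal entries that systemctl status shows for the unit.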

## Or you could just...

This section of the blog post nearly didn't get written, because once I'd figured everything out I realized you can just take my geoip.nix, place it in /etc/nixos/, and import it in your configuration.nix like this:

    imports =
      [
        # Include the results of the hardware scan.
        ./hardware-configuration.nix
        # Enable GeoIP2 location plugin for nginx
        ./geoip.nix
      ];
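For orientation, an imported file has the same shape as configuration.nix itself: a function from module arguments to an attribute set of options, which NixOS merges with everything else. The skeleton below is hypothetical and is not the contents of the actual geoip.nix. It only shows where the pieces from this post would go, with the user and group definitions filled in as an example.

```nix
# Hypothetical skeleton of a module such as /etc/nixos/geoip.nix.
{ config, pkgs, lib, ... }:
let
  # Local values (the geoip2 module derivation, the updater script, and
  # so on) would be defined here, as in the let ... in blocks above.
  databaseDir = "/var/lib/geoip-databases";
in
{
  # Anything set here is merged with the options in configuration.nix.
  users.groups.srv = {};
  users.users.geoip = {
    isSystemUser = true;
    group = "srv";
    description = "GeoIP database updater";
  };

  # The services.nginx and systemd.* definitions from this post would
  # follow here, referring to databaseDir where they need the path.
}
```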

