OpenLMIS Reporting Stack DevOps Training

Explaining Docker Compose Files

Installed Services

The following containers are brought up when you run docker-compose up:

  1. nginx
  2. superset
  3. nifi
  4. kafka
  5. scalyr
  6. consul
  7. zookeeper
  8. db
  9. log
  10. config-container
  11. db-config-container


However, the stack should be able to function without the scalyr, kafka, and zookeeper containers. To bring up the stack without them, run:


docker-compose up --build -d --scale scalyr=0 --scale kafka=0 --scale zookeeper=0


The config containers are responsible for copying configuration files into volumes used by the service containers. The config-container also registers the Superset and NiFi services on Consul. It is expected that these containers will exit after some time.
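The service registration that config-container performs can be pictured with Consul's agent HTTP API: a JSON payload like the one below, sent with an HTTP PUT to /v1/agent/service/register on the Consul agent, registers a service. The service name, address, port, and tag here are illustrative, not the exact values the container uses.

```json
{
  "Name": "superset",
  "Address": "superset",
  "Port": 8088,
  "Tags": ["reporting"]
}
```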

Configuring the Reporting Stack

The reporting stack configuration variables are all in the reporting/.env file. The file also documents what each of the variables does.


The NiFi and Superset containers can be accessed through NGINX. This is the preferred method. On your development machine, you will need to add the following line to your /etc/hosts file:


127.0.0.1   nifi.local superset.local


On your development machine, you should then be able to access NiFi at http://nifi.local and Superset at http://superset.local from your browser. You can change these domain names by changing the values of SUPERSET_DOMAIN_NAME and NIFI_DOMAIN_NAME in the reporting/.env file.

NiFi Registry

Though NiFi Registry is part of the reporting stack, it is not deployed using the Docker Compose setup defined in the openlmis/openlmis-ref-distro repository.


The NiFi Registry Docker Compose setup is in the openlmis/openlmis-deployment repository.

The Docker Compose files are in deployment/reporting_env/nifi-registry/

The directory has an environment configuration file deployment/reporting_env/nifi-registry/.env which has non-sensitive variables.


The Docker Compose setup also loads another environment configuration file which is assumed to be located at deployment/reporting_env/nifi-registry/settings.env, but is git-ignored because it holds secrets we don’t want in a public repository. This file needs to be recreated with the following variables:

  1. SCALYR_API_KEY: Scalyr API key
  2. AWS_ACCESS_KEY_ID: AWS access key ID for the user with write access to the S3 bucket configured to store the NiFi Registry backup
  3. AWS_SECRET_ACCESS_KEY: The corresponding AWS secret access key for the backup S3 user
  4. AWS_DEFAULT_REGION: Default region for the S3 backup (should be close to the region where the NiFi Registry server is located)
  5. S3_BACKUP_URI: The S3 URI to back the NiFi Registry flow files and database up to, e.g. s3://bucket/folder
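A recreated settings.env might look like the sketch below. All values are placeholders; the region is just an example.

```
# deployment/reporting_env/nifi-registry/settings.env (values are placeholders)
SCALYR_API_KEY=<your Scalyr API key>
AWS_ACCESS_KEY_ID=<access key ID of the S3 backup user>
AWS_SECRET_ACCESS_KEY=<secret access key of the S3 backup user>
AWS_DEFAULT_REGION=us-east-1
S3_BACKUP_URI=s3://bucket/folder
```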

Persistence

  1. NiFi Registry

The NiFi Registry Docker Compose setup has a backup container which syncs a folder mounted as a Docker volume and shared between the backup container and NiFi Registry.


The shared volume is mounted at the NiFi Registry database and flow files directories. These shared volumes are also accessible from the backup container, which has a cron job that syncs the files in those directories to the configured S3 bucket at the configured interval (default: 1 minute).


The NiFi Registry backup container downloads the database and flow files from S3 the first time the container is brought up, then keeps syncing the files to S3.
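The restore-then-sync behaviour described above can be pictured as a cron entry in the backup container; the local path and bucket URI below are placeholders, not the actual values from the setup:

```
# On container start, restore once from S3 (sketch):
#   aws s3 sync s3://bucket/folder /data
# Then push local changes back to S3 every minute:
* * * * * aws s3 sync /data s3://bucket/folder
```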

  2. PostgreSQL

Initially, we were using a DB container, but shifted last week to using an RDS instance, which is still in testing.


The RDS instance endpoint is: uat-reporting.cpmydgulhchj.us-east-1.rds.amazonaws.com


Database name: openlmis_reporting

Username: reporting

Password: 0b39aca786e3


These settings have been added in the .env file as variables.

Administration

Upgrading the Container Versions

Upgrading the images used should be a straightforward process for most of the containers; the reporting/.env file contains OL_*_VERSION variables that you can change. Here are things to note about changing the NiFi and Superset versions:

  1. The NiFi container is based on https://hub.docker.com/r/onaio/nifi, which currently has just one published version of NiFi (1.7.0). To upgrade to a more recent version of NiFi, switch to https://hub.docker.com/r/apache/nifi. Note, however, that the NiFi OAuth2 controller service will not be available in other versions (apart from onaio/nifi version 1.7.0) until this pull request is merged: https://github.com/apache/nifi/pull/2901.
  2. You can change the PostgreSQL JDBC driver used by NiFi by updating the .jar file in the reporting/config/services/nifi/libs directory.
  3. The Superset container is built from the Dockerfile reporting/superset/Dockerfile, which is tied to Superset version 0.29.0rc7. To upgrade Superset, change the version in that Dockerfile. You may also need to change the Dockerfile further, including updating the versions of the software packages it installs.


NiFi Process Group Preloading

The Docker Compose setup allows you to configure which NiFi process groups are loaded and started when the setup is brought up. The process groups are downloaded from NiFi Registry instances.


Before you can configure which process groups are preloaded, you’ll need to configure the NiFi Registry instances the process groups will be downloaded from. Define each of the NiFi Registry instances in its own file in the reporting/config/services/nifi/scripts/preload/registries directory. Currently, the directory contains a single .properties file configuring how to get to the Shared OpenLMIS NiFi Registry instance http://nifi-registry.openlmis.org:18080/nifi-registry. The .properties file for this instance looks like this:


registryClientName=OpenLMIS

registryClientUrl=http://nifi-registry.openlmis.org:18080

registryClientDesc=Shared OpenLMIS NiFi Registry


You can now configure which process groups to preload in the reporting/config/services/nifi/scripts/preload/process-groups directory. In this directory, create a directory named exactly after the registryClientName configuration of the NiFi Registry instance (configured above) from which the flow files for the process groups will be fetched. Inside this directory, create a .properties file for each process group you want to pull from that NiFi Registry instance. Here’s an example .properties file defining a process group to pull:


bucketIdentifier=9a51b9f4-c902-463c-a440-4bfdac0dea6a

flowIdentifier=77a5e697-b3ae-4f91-8c4b-309bd3a5b82b

flowVersion=1


The bucketIdentifier and flowIdentifier can be obtained from the NiFi Registry webapp. Specify which version of the process group to preload using the flowVersion configuration.
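Putting the two pieces together, the preload layout can be sketched as below. The file names are arbitrary; the identifiers are the example values from above.

```shell
# Sketch of the preload directory layout for a registry named "OpenLMIS".
PRELOAD=reporting/config/services/nifi/scripts/preload

mkdir -p "$PRELOAD/registries" "$PRELOAD/process-groups/OpenLMIS"

# One file per NiFi Registry instance to pull from.
cat > "$PRELOAD/registries/openlmis.properties" <<'EOF'
registryClientName=OpenLMIS
registryClientUrl=http://nifi-registry.openlmis.org:18080
registryClientDesc=Shared OpenLMIS NiFi Registry
EOF

# One file per process group to pull, in a directory named after registryClientName.
cat > "$PRELOAD/process-groups/OpenLMIS/example-flow.properties" <<'EOF'
bucketIdentifier=9a51b9f4-c902-463c-a440-4bfdac0dea6a
flowIdentifier=77a5e697-b3ae-4f91-8c4b-309bd3a5b82b
flowVersion=1
EOF

find "$PRELOAD" -type f
```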

Superset Data Sources Preloading

To make changes to the existing data sources before preloading, modify the database.yaml file. You can also create your own data source file to import, either by running superset export_datasources on your own Superset instance, or through the UI by selecting the data sources you want to export and clicking Export to YAML.

The data sources are imported by the superset import_datasources command in the docker-compose file.

Superset Dashboard Preloading

Like the data sources, the dashboards are imported by the superset import_dashboards command in the docker-compose file.

To modify the dashboards being imported you can edit or replace the openlmis_uat_dashboards.json file.


Server Deployments

Terraform

Terraform allows you to codify your cloud deployments. For instance, for the OpenLMIS deployment, our Terraform definition specifies that the UAT reporting stack deployment includes one AWS EC2 instance with ports 22, 80, and 443 open, and one EC2 classic load balancer pointing to the EC2 instance.


To allow for code reuse, Terraform allows us to define modules that can be reused in different deployments. In the openlmis/openlmis-deployment repository we currently have two modules that can be reused on different deployments:

  • openlmis: Brings up infrastructure that is usable by OpenLMIS deployments.
  • nifi-registry: The name is misleading. This module can be used to bring up infrastructure that is usable by a reporting deployment.


To define a Terraform deployment that uses the nifi-registry module:

  1. Create a directory under the provision/terraform directory in the openlmis/openlmis-deployment repository, let’s say provision/terraform/reporting/demo-environment.
  2. Add these three files in this directory:
    1. variables.tf: This is where you declare the variables to be used with the deployment. Declare any variables here that are also defined in the nifi-registry module and that you wish to override. The full list of variables defined in the nifi-registry module is in this file: https://github.com/OpenLMIS/openlmis-deployment/blob/master/provision/terraform/modules/nifi-registry/variables.tf.
    2. terraform.tfvars: This is where you specify the values of the variables declared in variables.tf.
    3. main.tf: This is where you import the nifi-registry module. Under the module import, you should also define how the variables declared in variables.tf map to the variables in the module. You should also define where to store the Terraform state in this file.
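A minimal main.tf along these lines might look like the sketch below. The state bucket name, state key, and the exact set of module variables are assumptions; check the module's variables.tf for the real list.

```hcl
# provision/terraform/reporting/demo-environment/main.tf (sketch; names are illustrative)

terraform {
  backend "s3" {
    bucket = "my-terraform-state-bucket"
    key    = "reporting/demo-environment/terraform.tfstate"
    region = "us-east-1"
  }
}

module "nifi-registry" {
  source = "../../modules/nifi-registry"

  # Map the variables declared in variables.tf to the module's variables.
  aws_access_key_id     = "${var.aws_access_key_id}"
  aws_secret_access_key = "${var.aws_secret_access_key}"
}
```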


You can run Terraform from your personal machine. The Terraform binary will call the AWS API and create or modify the cloud resources for you. Download the Terraform binary from https://www.terraform.io/downloads.html or install it using your operating system's package manager.


Since Terraform hits the AWS API, it will need to authenticate as you. Export the following environment variables in your terminal before running Terraform:


export AWS_ACCESS_KEY_ID=<your access key ID>

export AWS_SECRET_ACCESS_KEY=<your secret access key>

export TF_VAR_aws_access_key_id=$AWS_ACCESS_KEY_ID

export TF_VAR_aws_secret_access_key=$AWS_SECRET_ACCESS_KEY


The last two variables pass your key ID and secret key into the Terraform variables aws_access_key_id and aws_secret_access_key declared in the variables.tf file. If you named those variables differently in the file, make sure to use your names in the last two exports (the format is TF_VAR_<variable name>).


You can now run Terraform! Inside the deployment directory you have created (e.g provision/terraform/reporting/demo-environment from the previous section), run:


terraform init


This will:

  1. Download the necessary Terraform plugins into the .terraform directory (that’s inside the current directory).
  2. Check if there is already a Terraform state for the deployment in the shared state storage (defined in the main.tf file).


The command will, however, not bring up or modify the deployment. You do not need to run this command every time you update the deployment; only when you delete the .terraform directory in the deployment’s directory, or when you add a new Terraform module or plugin to your deployment.


Before creating or modifying your deployment using Terraform, it’s always good to check which modifications will be made to the setup. Do this by running:


terraform plan


Pro tip: terraform plan will also let you know if you need to run terraform init.


Things to look out for when you run terraform plan are:

  1. Resources that will be recreated
  2. Resources that will be permanently deleted


If you are OK with the changes that Terraform will be making in your deployment, run:


terraform apply


This will apply the changes.


Here are some other useful Terraform commands:


  1. terraform fmt: Use this command to format the Terraform files you are working on.
  2. terraform taint: Use this command to forcefully taint (to force recreation) a resource. Since resource blocks are defined in the nifi-registry Terraform module, you will need to add a -module=nifi-registry flag for the command to run.
  3. terraform destroy: Use this command to tear down all the provisioned resources.


Part of what the nifi-registry Terraform module does is install and configure the Docker daemon in the provisioned EC2 instance. It does this using Ansible, described in the next section.

Ansible

Ansible allows you to codify what a server should look like (which software should be installed on the server, how installed software should be configured, etc.).


For reporting stack deployments, Ansible does the following on the provisioned EC2 instance:

  1. Installs the Docker daemon
  2. Generates TLS certs and keys to be used to access the Docker daemon remotely. The following TLS files are created:
    1. CA key and certificate.
    2. TLS key and certificate to be used by the Docker daemon, signed by the CA certificate created above.
    3. TLS key and certificate to be used by the remote host that controls the Docker daemon, signed by the CA certificate created above.
  3. Configures the Docker daemon to allow it to be controlled remotely over a TCP connection. The configuration authenticates remote hosts using TLS (only hosts that present certificates signed by the same CA that signed the daemon's certificate are allowed to connect).
  4. Copies the generated TLS files to this S3 bucket folder: https://s3.console.aws.amazon.com/s3/buckets/aws-instance-keys/tls/?region=us-east-1&tab=overview
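The resulting daemon configuration is along these lines (a sketch of /etc/docker/daemon.json; the file paths and port are illustrative, though 2376 is the conventional port for the TLS-protected Docker API):

```json
{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"],
  "tlsverify": true,
  "tlscacert": "/etc/docker/ca.pem",
  "tlscert": "/etc/docker/server-cert.pem",
  "tlskey": "/etc/docker/server-key.pem"
}
```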


Ansible, however, does not bring up the reporting stack’s Docker Compose setup.

Jenkins

To deploy NiFi Registry to the shared reporting UAT environment, use Jenkins. The Jenkins build clones the openlmis-deployment and openlmis-config repositories, then copies reporting.env to deployment/reporting_env/nifi-registry/settings.env.


The build then executes deployment/reporting_env/nifi-registry/deploy_nifi_registry.sh, which connects to the shared UAT reporting server's Docker daemon and runs docker-compose up.


To deploy the reporting stack (NiFi, Superset) to the shared UAT environment, the Jenkins build:


  1. Clones OpenLMIS/openlmis-ref-distro, villagereach/openlmis-config, and OpenLMIS/openlmis-deployment
  2. Copies reporting.env from the config repository to ref-distro/settings.env
  3. Executes openlmis-deployment/deployment/reporting_env/services/deploy_services.sh, which connects to the Docker daemon on the shared UAT server to run docker-compose up.
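Connecting a Docker client to a remote daemon over TLS, as these deploy scripts do, can be sketched with environment variables like the ones below. The host name and certificate path are placeholders; use the values for your server.

```shell
# Verify the daemon's certificate and present our client certificate.
# ca.pem, cert.pem, and key.pem are expected inside DOCKER_CERT_PATH.
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH="$HOME/tls/uat-reporting"
export DOCKER_HOST=tcp://uat-reporting.example.org:2376
# docker and docker-compose commands in this shell now target the remote daemon,
# e.g.: docker-compose up --build -d
```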

The Jenkins build configuration has some environment bindings required during the build process:

DOCKER_CERT_PATH: CA certificates created when setting up Docker with Ansible

reporting_ssl: SSL certificates for the reporting stack

reporting_uat_nginx_password: NGINX basic authentication password (used when accessing NiFi)
