DevOps

Docker Compose setup

Running questions

  • Why is the reporting stack incorporated into the ref distro? Why is it not in the main docker-compose file?
  • Note: create a recommended deployment topology for the reporting stack

Reporting stack pieces

  • Nifi
    • Open source, by Apache; handles data ingestion and sophisticated ETL
    • Connects to the OpenLMIS API, pulls data, preprocesses it, then loads it into the reporting stack (see the curl sketch after this list)
    • Can pull from many sources: OpenLMIS, DHIS2, etc.
  • Superset
    • Also by Apache, data visualization
    • Indicator queries, build charts
  • Postgres
  • Data flow: Nifi → Postgres ← Superset (Nifi writes to Postgres; Superset reads from it)
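
For illustration, pulling data from the OpenLMIS API by hand looks roughly like the flow Nifi automates. The token endpoint, client credentials, and resource path below are assumptions based on a typical OpenLMIS v3 demo setup, not the actual Nifi flow configuration:

    # Obtain an OAuth2 token from OpenLMIS (client and user credentials are placeholders)
    TOKEN=$(curl -s -X POST -u "user-client:changeme" \
      "https://uat.openlmis.org/api/oauth/token?grant_type=password&username=administrator&password=password" \
      | jq -r '.access_token')

    # Fetch a resource that a Nifi flow might ingest and preprocess (path is illustrative)
    curl -s -H "Authorization: Bearer $TOKEN" \
      "https://uat.openlmis.org/api/facilities" -o facilities.json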

Nifi registry

  • Nifi works with entities called process groups, which describe the flow of how data is processed
    • Essentially a bunch of XML files
    • The registry is how Nifi flows are version-controlled
  • The Nifi registry's docker-compose file is separate from the reporting stack's
    • One registry serves multiple deployments, which is why it lives in the openlmis-deployment repository

Zookeeper/Kafka

  • Don't use it right now
  • For multiple instances of Nifi
  • Tied in with Kafka
  • Jason Rogena to exclude Kafka and Zookeeper from the documentation touch-ups he's doing

Reporting docker-compose file

  • Zookeeper/Kafka - not used
  • Scalyr - log analyzer
  • Consul - registers Nifi and Superset as running services; provides service discovery
  • Nginx - reverse proxy; uses consul-template to pick up the registered services
    • Potential port conflicts if both the reporting and main docker-compose stacks are running on the same instance
    • Also handles authentication and SSL configuration
  • Assistive containers
    • config-container - config folder, similar to service-configuration in the main ref-distro docker-compose
      • Also hits the Consul API to register Nifi and Superset (see the sketch after this list)
      • Contains a number of hard-coded API statements; if those need to be customized, the docker-compose file has to be modified directly
    • db-config-container - db folder
      • No real reason why it is separate from config-container
  • Some containers use the settings.env file
  • Nifi - the image is under the onaio namespace
    • Initially chosen because we needed OAuth support, but that is no longer the case
    • Can now move back to the official Apache Nifi image
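
For reference, registering a service with the local Consul agent over its HTTP API looks roughly like the calls the config-container makes; the service names, addresses, and ports below are assumptions, not the exact statements baked into the container:

    # Register Nifi and Superset as services with the Consul agent
    curl -s -X PUT http://consul:8500/v1/agent/service/register \
      -d '{"Name": "nifi", "Address": "nifi", "Port": 8080}'
    curl -s -X PUT http://consul:8500/v1/agent/service/register \
      -d '{"Name": "superset", "Address": "superset", "Port": 8088}'
    # consul-template then re-renders the Nginx config from the Consul catalog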

Reporting .env file

  • Nifi passwords are here, but not client ids or usernames
  • The Nginx basic-auth username and password are for clients that need to access Nifi through Nginx (a sketch of the file follows)
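
A sketch of what the reporting .env might contain; the variable names below are illustrative placeholders, not the actual keys in the file:

    # Nifi passwords live here (client ids and usernames do not)
    NIFI_ADMIN_PASSWORD=changeme
    # Basic auth enforced by Nginx for clients that reach Nifi through the proxy
    NGINX_BASIC_AUTH_USER=reporting
    NGINX_BASIC_AUTH_PASSWORD=changeme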

Nifi registry docker-compose file

  • In deployment repository
  • Also has a backup container to back up to s3

Spinning up reporting stack

  • docker-compose up --build --scale scalyr=0 --scale kafka=0 --scale zookeeper=0 - builds the images without bringing up Scalyr, Kafka, or Zookeeper
  • How the Nginx proxy routes: it checks the Host header; requests for the Nifi host are proxied to Nifi, and requests for the Superset host are proxied to Superset (see the sketch below)
  • The Superset image gets built each time we spin up the reporting stack
    • This has not been incorporated into the CI process, but nothing prevents it from being done
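
A quick way to see the Host-header routing in action once the stack is up; the hostnames below are assumptions for a local test:

    # Nginx inspects the Host header and proxies to the matching service
    curl -H "Host: nifi.uat.openlmis.org" http://localhost/       # proxied to Nifi
    curl -H "Host: superset.uat.openlmis.org" http://localhost/   # proxied to Superset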

Reporting Stack Persistence

  • How to load flows into Nifi from a Nifi registry, i.e. how Nifi flows are persisted for a Nifi instance
    • Folder: reporting/config/services/nifi
    • Process groups are versioned, each one separately
    • To configure the load, you need the bucket id and flow id
    • Property file for Nifi registries - preload/registries
      • The registry client name must be unique; it is referenced by the process groups
    • Property files for flows/process groups - preload/process-groups (see the property-file sketch after this list)
    • When a new process group is created, its config is automatically uploaded, then persisted and backed up
  • How to persist a Nifi registry itself
    • Not in the ref distro; lives in the deployment repository, folder deployment/reporting_env
    • Two volumes: one for flow storage, one for the db
    • The registry database has Nifi metadata
    • Flow files are backed up to git for version control
    • The registry database (as well as the flow files) is backed up to S3
    • When the backup container comes up, it pulls the backup files/db from S3, so it must come up before the nifi-registry container
    • Backup cron schedule is in the .env file; the default is once a minute
      • Compares local files vs. the S3 remote files; if a local file is newer, copy it to S3 (see the sync sketch below)
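
A sketch of the preload property files described above; the file names and keys are assumptions meant to show the shape of the configuration, not the exact format used under reporting/config/services/nifi:

    # Registry client definition (the name must be unique; process groups reference it)
    cat > preload/registries/openlmis-registry.properties <<'EOF'
    name=openlmis-registry
    url=http://nifi-registry:18080
    EOF

    # One property file per flow/process group, tying it to a registry bucket and flow
    cat > preload/process-groups/requisitions.properties <<'EOF'
    registry=openlmis-registry
    bucket_id=<bucket id from the Nifi registry>
    flow_id=<flow id from the Nifi registry>
    EOF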
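
The backup loop can be approximated with the AWS CLI; the bucket name and paths below are placeholders. aws s3 sync only uploads files that are newer or different locally, which matches the "if local is newer, copy to S3" behaviour:

    # Invoked by cron (default schedule: once a minute, configured in the .env file)
    aws s3 sync /opt/nifi-registry/flow_storage s3://example-openlmis-backups/nifi-registry/flow_storage
    aws s3 sync /opt/nifi-registry/database     s3://example-openlmis-backups/nifi-registry/database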

2019-05-07

How to do Provisioning / Deployment

  • Shared UAT instance; shared registry, nifi, superset, etc.
  • Not using AWS dashboard
  • EC2 instance; SSL certs are terminated on the load balancer, so we can just create a new server and attach it to the load balancer
  • Guidance on EC2 size?
  • Need to update recommended deployment topology and requirements to include reporting stack
  • Terraform - defines the desired state of the servers and maintains it (creating/destroying resources); "infrastructure as code"
    • These states are stored on S3
    • Can automatically create resources based on the resource definitions
    • Creates servers, load balancers, and other AWS resources
  • Ansible
    • Configures the inside of the server itself: installing Docker and other packages, etc.
    • Kind of like sshing into the server
  • Jenkins
    • Automated service deployment: git clone, docker-compose up
    • Uses certs installed on Jenkins to connect to the Docker daemon on the remote host (see the Docker TLS sketch after this list)
  • Access could be limited in Terraform so that only the Jenkins server can connect
  • The nifi-registry folder actually describes a server that contains the shared registry, Nifi, and Superset
  • The .terraform folder is a locally generated snapshot of the infrastructure and is not checked into source control
    • It is also backed up to an S3 bucket
  • Reference variables in main.tf with prefix "var."
  • Can have multiple tfvars files, scanned and evaluated alphabetically in the folder
  • Can also have default values in variables.tf
  • Start with terraform init (which generates the .terraform folder), then terraform plan, then terraform apply (which executes and then saves the state)
  • How does Terraform connect with AWS to init and generate the plan?
    • Need to install and configure the AWS CLI first (see the Terraform sketch after this list)
  • The modules folder shows how it actually creates the resources
    • The local exec section (in compute.tf) runs the ansible playbook
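
The Terraform workflow from the notes above, end to end; the credential values are placeholders and the folder name stands in for whichever environment folder holds main.tf and variables.tf:

    # One-time setup so Terraform can authenticate against AWS
    aws configure                    # prompts for access key, secret key, and default region

    cd <environment folder>          # the folder containing main.tf, variables.tf, *.tfvars
    terraform init                   # downloads providers/modules into the .terraform folder
    terraform plan                   # shows the resources that would be created or changed
    terraform apply                  # creates/updates the resources and saves the state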
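
A sketch of how Jenkins can drive the remote Docker daemon using the TLS client certs mentioned above; the host name, cert path, and repository layout are assumptions:

    # Point the Docker CLI at the remote daemon, authenticating with client certs
    export DOCKER_HOST=tcp://reporting.example.org:2376
    export DOCKER_TLS_VERIFY=1
    export DOCKER_CERT_PATH=/var/lib/jenkins/.docker   # contains ca.pem, cert.pem, key.pem

    # Automated deployment: git clone, then docker-compose up against the remote host
    git clone https://github.com/OpenLMIS/openlmis-ref-distro.git
    cd openlmis-ref-distro/reporting
    docker-compose up -d --build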

Superset Persistence

  • Superset Dockerfile
    • pip installs a package called superset-patchup, aka "ketchup" (see the sketch after this list)
    • It makes modifications to Superset itself
      • We want users to have role-based permissions in Superset
      • It automatically creates and grants the role
  • Superset config python file
    • Makes changes to the Superset config
    • Imports CustomSecurityManager
    • Uses OAuth, because authentication is against OpenLMIS
    • OLMIS_Gamma role
    • Users can self-register
    • Allow-from uat.openlmis.org - this setting needs to be customized per deployment
  • Superset-patchup
    • Adding OAuth
    • Creating custom roles
    • Redirect URLs
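
A sketch of the build and initialization steps implied above; the config path and command sequence are assumptions about how the image is put together, not the actual Dockerfile contents:

    # Inside the Superset image build, roughly:
    pip install superset-patchup                  # "ketchup": adds OAuth against OpenLMIS and custom roles
    export SUPERSET_CONFIG_PATH=/etc/superset/superset_config.py   # config that imports CustomSecurityManager
    superset db upgrade                           # create/upgrade Superset's metadata database
    superset init                                 # create default roles and permissions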
