DevOps

Docker Compose setup

Running questions

Why is the reporting stack incorporated into the ref distro? Why is it not in the main docker-compose file?
Action: create a recommended topology for the reporting stack

Reporting stack pieces

Nifi
- Open source by Apache, data ingestion, sophisticated ETL
- Connect to OpenLMIS API, get data, preprocess data, then put it into reporting stack
- Could get sources from anywhere, OpenLMIS, DHIS2, etc.
Superset
- Also by Apache, data visualization
- Indicator queries, build charts
Postgres
Nifi → Postgres ← Superset

Nifi registry

Nifi works with entities called process groups, flow of how to process data
- Essentially a bunch of XML files
- This is how to version control Nifi flows
Nifi registry docker compose file separate from reporting stack
- One registry serves multiple deployments, which is why it lives in the openlmis-deployment repository

Zookeeper/Kafka

Not used right now
Zookeeper is for running multiple instances of Nifi
It is tied in with Kafka
Jason Rogena to exclude Kafka and Zookeeper from the documentation touch-ups he is doing

Reporting docker-compose file

Zookeeper/Kafka - not used
Scalyr - log analyzer
Consul - register Nifi and Superset as running services, service discovery
Nginx - reverse proxy, uses consul-template to pick up services
- Potential conflict if both reporting and main docker compose are running on the same instance because of port conflicts
- Also for managing authentication and SSL config
Assistive containers
- config-container - config folder, similar to service-configuration in main ref-distro docker compose
  - Also hits consul API and registers Nifi and Superset
  - Contains a number of API calls; if those need to be customized, the docker-compose file has to be modified directly
- db-config-container - db folder
  - No real reason why it is separate from config-container
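The registration step that config-container performs against the Consul API could look roughly like the sketch below. It only builds the HTTP requests for Consul's agent service-registration endpoint (`PUT /v1/agent/service/register`) and does not send them. The Consul address, the service names, and the ports are assumptions for illustration, not values taken from the reporting docker-compose file.

```python
import json
import urllib.request

# Assumed address of the Consul agent; adjust for the actual deployment.
CONSUL_URL = "http://consul:8500"


def build_registration(name: str, address: str, port: int) -> urllib.request.Request:
    """Build (but do not send) a Consul service-registration request."""
    payload = {"ID": name, "Name": name, "Address": address, "Port": port}
    return urllib.request.Request(
        url=f"{CONSUL_URL}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical service addresses and ports.
requests_to_send = [
    build_registration("nifi", "nifi", 8080),
    build_registration("superset", "superset", 8088),
]

if __name__ == "__main__":
    for req in requests_to_send:
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
        # To actually register the service: urllib.request.urlopen(req)
```

Keeping the payloads in one place like this is one way to avoid editing the docker-compose file directly when a registration needs to change.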
Some services use the settings.env file
Nifi image - published under onaio
- Initially because we needed OAuth support, but that is no longer the case
- Can now move back to the Apache Nifi image

Reporting .env file

Nifi passwords are here, but not client ids or usernames
Nginx basic auth username and password are for clients that need to access Nifi through Nginx

Nifi registry docker-compose file

In deployment repository
Also has a backup container to back up to s3

Spinning up reporting stack

docker-compose up --build --scale scalyr=0 --scale kafka=0 --scale zookeeper=0 - builds the images and brings up the stack without Scalyr, Kafka, or Zookeeper
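For scripted start-up, the same command can be assembled from a list of services to leave out. This is a minimal sketch; the helper name `compose_up_command` is made up for illustration, and it only builds the argument list rather than running anything.

```python
import shlex


def compose_up_command(disabled_services):
    """Return the docker-compose argv that builds images and scales the given services to zero."""
    cmd = ["docker-compose", "up", "--build"]
    for service in disabled_services:
        cmd += ["--scale", f"{service}=0"]
    return cmd


cmd = compose_up_command(["scalyr", "kafka", "zookeeper"])
print(shlex.join(cmd))
# prints: docker-compose up --build --scale scalyr=0 --scale kafka=0 --scale zookeeper=0
```

Passing `cmd` to `subprocess.run(cmd, check=True)` from the reporting folder would run it.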
How the Nginx proxy routes requests: it checks the Host header; if it names nifi, the request is passed to Nifi; if it names superset, the request is passed to Superset
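The Host header check can be modelled as a simple lookup. The sketch below is not the Nginx configuration itself; it assumes the service name is the first DNS label of the host, and the upstream addresses are hypothetical (in the real stack they come from Consul via consul-template).

```python
from typing import Optional

# Hypothetical upstream addresses, for illustration only.
UPSTREAMS = {
    "nifi": "http://nifi:8080",
    "superset": "http://superset:8088",
}


def pick_upstream(host_header: str) -> Optional[str]:
    """Return the upstream for the service named by the first label of the Host header."""
    host = host_header.split(":")[0].lower()  # drop any port suffix
    first_label = host.split(".")[0]
    return UPSTREAMS.get(first_label)


print(pick_upstream("nifi.example.org"))          # http://nifi:8080
print(pick_upstream("superset.example.org:443"))  # http://superset:8088
print(pick_upstream("example.org"))               # None
```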
Superset image gets built each time we spin up reporting stack
- This was not incorporated into CI process, but there is no obstacle that keeps it from being done

Reporting Stack Persistence

How to load data into Nifi from a Nifi registry, persisting nifi flows for a nifi instance
- Folder reporting/config/services/nifi
- Process groups are versioned, each separately
- For configuring load, need bucket id and flow id
- Property file for nifi registries - preload/registries
  - Registry client name must be unique; process groups use it to refer to the registry
- Property files for flows/process groups - preload/process-groups
- When creating a new process group, config automatically uploaded and then persisted and backed up
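As a rough illustration of the preload configuration, the sketch below parses a key=value property file and checks that the values needed to load a flow are present. The key names, the example ids, and the file contents are all hypothetical; the real files under preload/registries and preload/process-groups may use different keys.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props, required=("registry.client.name", "bucket.id", "flow.id")):
    """Return the required keys (hypothetical names) that are absent from props."""
    return [key for key in required if key not in props]


# Made-up example of a process-group property file.
EXAMPLE = """
# which registry client to pull the flow from
registry.client.name=example-registry
bucket.id=00000000-0000-0000-0000-000000000001
flow.id=00000000-0000-0000-0000-000000000002
"""

props = parse_properties(EXAMPLE)
print(missing_keys(props))  # [] when bucket id, flow id, and client name are all set
```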
How to persist a nifi registry itself
- Not in ref distro, but deployment repository, folder deployment/reporting_env
- Two volumes, one for flow storage, one for db
- Registry database has nifi metadata
- Back up flow files to git, version control
- Back up registry database (as well as flow files) to s3
- When the backup container comes up, it pulls the backup files and database from s3, so it must come up before the nifi-registry container
- Backup cron schedule in .env file, default is once a minute
  - Check local files vs. s3 remote files, if local is newer, copy to s3
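The "copy if local is newer" check can be sketched as a pure function over modification times. This is an assumption about the logic, not the backup container's actual script: the paths and timestamps are invented, and treating a file that is missing on s3 as needing upload is an added assumption.

```python
def files_to_upload(local_mtimes: dict, remote_mtimes: dict) -> list:
    """Return paths whose local copy is newer than the remote copy, or absent remotely."""
    to_upload = []
    for path, local_time in sorted(local_mtimes.items()):
        remote_time = remote_mtimes.get(path)
        if remote_time is None or local_time > remote_time:
            to_upload.append(path)
    return to_upload


# Invented example: modification times as Unix timestamps.
local = {"flows/a.json": 200.0, "flows/b.json": 100.0, "db/registry.db": 50.0}
remote = {"flows/a.json": 150.0, "flows/b.json": 100.0}

print(files_to_upload(local, remote))
# prints: ['db/registry.db', 'flows/a.json']
```

A cron job running every minute would call a check like this and then copy only the returned paths.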

2019-05-07

How to do Provisioning / Deployment

Shared UAT instance; shared registry, nifi, superset, etc.
Not using AWS dashboard
EC2 instance, terminate SSL certs on load balancer, so we can just create a new server and attach to load balancer
Guidance on EC2 size?
Need to update recommended deployment topology and requirements to include reporting stack
Terraform, definition of state of servers, its maintenance (creating/destroying resources); "infrastructure as code"
- These states are stored on s3
- Can automatically create resources based on resource definition
- Creates servers, load balancers, bringing up AWS stuff
Ansible
- Configuring inside the server itself; installing docker and other stuff, etc.
- Kind of like sshing into the server
Jenkins
- Automated service deployment, git clone, docker-compose up
- Use certs installed on Jenkins to connect to docker daemon on remote host
Could define in terraform how to limit access from only the Jenkins server
The nifi-registry folder actually defines a server that hosts the shared registry, Nifi, and Superset
The .terraform folder is a snapshot of the infrastructure generated, not checked into source control
- Also backed up to s3 bucket
Reference variables in main.tf with prefix "var."
Can have multiple tfvars files, scanned and evaluated alphabetically in the folder
Can also have default values in variables.tf
Start with terraform init (which generates the .terraform folder), then terraform plan (which previews the changes), then terraform apply (which executes the changes, then saves the state)
How does terraform connect with AWS to init and generate plan?
- Need to install AWS CLI first
The modules folder shows how it actually creates the resources
- The local exec section (in compute.tf) runs the ansible playbook
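The init, plan, apply sequence could be wrapped in a small script like the sketch below. Saving the plan to a file with `-out` and applying that file is an addition here, not something the notes specify; the plan file name `tfplan` and the function names are made up. By default the script only prints the commands.

```python
import subprocess


def terraform_steps(plan_file: str = "tfplan"):
    """Return the terraform commands in the order they should run."""
    return [
        ["terraform", "init"],
        ["terraform", "plan", f"-out={plan_file}"],
        ["terraform", "apply", plan_file],
    ]


def run_workflow(workdir: str, dry_run: bool = True) -> None:
    """Print each step, or execute it in workdir when dry_run is False."""
    for step in terraform_steps():
        if dry_run:
            print(" ".join(step))
        else:
            # check=True stops the sequence if any step fails.
            subprocess.run(step, cwd=workdir, check=True)


if __name__ == "__main__":
    run_workflow(".", dry_run=True)
```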

Superset Persistence

Superset Dockerfile
- pip install a package called superset-patchup aka ketchup
- Made modifications to superset itself
  - Want users to have role-based permissions in superset
  - Automatically create and grant role
Superset config python file
- Making changes to superset config
- Import CustomSecurityManager
- Uses OAuth, because authentication is done against OpenLMIS
- OLMIS_Gamma role
- Users can self register
- Allow-from is set to uat.openlmis.org; this setting needs to be customized per deployment
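A hedged sketch of the authentication-related settings in the Superset config file follows. The setting names are standard Flask-AppBuilder keys, but the values are assumptions: in particular, using OLMIS_Gamma as the role for self-registered users is a guess at how the two notes above relate. The settings are wrapped in a function (a made-up helper) so the sketch runs without Superset installed.

```python
# Numeric value of flask_appbuilder.security.manager.AUTH_OAUTH, inlined so the
# sketch runs without Flask-AppBuilder installed.
AUTH_OAUTH = 4


def auth_settings(default_role: str = "OLMIS_Gamma") -> dict:
    """Return the auth-related settings a superset_config.py might define."""
    return {
        "AUTH_TYPE": AUTH_OAUTH,                      # log in via OAuth (against OpenLMIS)
        "AUTH_USER_REGISTRATION": True,               # users can self-register
        "AUTH_USER_REGISTRATION_ROLE": default_role,  # assumed role for new users
    }


# In a real superset_config.py these would be module-level names, next to
# CUSTOM_SECURITY_MANAGER = CustomSecurityManager (imported from superset-patchup).
for name, value in auth_settings().items():
    print(f"{name} = {value!r}")
```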
Superset-patchup
- Adding OAuth
- Creating custom roles
- Redirect URLs
