DevOps
Docker Compose setup
Running questions
Why is reporting stack incorporated into ref distro? Why is it not in the main docker compose?
Note to create a recommended topology for reporting stack
Reporting stack pieces
Nifi
Open source by Apache, data ingestion, sophisticated ETL
Connect to OpenLMIS API, get data, preprocess data, then put it into reporting stack
Could get sources from anywhere, OpenLMIS, DHIS2, etc.
Superset
Also by Apache, data visualization
Indicator queries, build charts
Postgres
Nifi → Postgres ← Superset
Nifi registry
Nifi works with entities called process groups, flow of how to process data
Essentially a bunch of XML files
This is how to version control Nifi flows
Nifi registry docker compose file separate from reporting stack
One registry serves multiple deployments, which is why it is in the openlmis-deployment repository
Zookeeper/Kafka
Don't use it right now
For multiple instances of Nifi
Tied in with Kafka
@Jason Rogena to exclude Kafka and Zookeeper from the documentation touch-ups he is doing
Reporting docker-compose file
Zookeeper/Kafka - not used
Scalyr - log analyzer
Consul - register Nifi and Superset as running services, service discovery
Nginx - reverse proxy, uses consul-template to pick up services
Also manages authentication and SSL config
Potential port conflicts if both the reporting and main docker compose are running on the same instance
Assistive containers
config-container - config folder, similar to service-configuration in main ref-distro docker compose
Also hits consul API and registers Nifi and Superset
Has a number of API statements; if those need to be customized, the docker-compose file has to be modified directly
db-config-container - db folder
No real reason why it is separate from config-container
Some use settings.env file
Nifi - under onaio
Initially because we needed OAuth support, but that is no longer the case
Can now move back to the Apache Nifi image
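The Consul registration done by config-container could look roughly like the sketch below. It only builds the HTTP requests and does not send them. The Consul address, the service names and the ports are assumptions for illustration; the endpoint is Consul's standard agent service registration API.

```python
import json
import urllib.request

CONSUL_URL = "http://consul:8500"  # assumed address of the Consul agent

# Hypothetical service definitions; names, addresses and ports are assumptions.
SERVICES = [
    {"Name": "nifi", "Address": "nifi", "Port": 8080},
    {"Name": "superset", "Address": "superset", "Port": 8088},
]


def build_register_request(service, consul_url=CONSUL_URL):
    """Build (but do not send) a PUT request that registers one service."""
    return urllib.request.Request(
        url=f"{consul_url}/v1/agent/service/register",
        data=json.dumps(service).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


requests_to_send = [build_register_request(s) for s in SERVICES]
# To actually register, call urllib.request.urlopen(req) for each request.
```

Keeping the service list in one place like this is one way to avoid editing the docker-compose file when the registrations change.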
Reporting .env file
Nifi passwords are here, but not client ids or usernames
Nginx basic auth username and pw are for clients needing to access Nifi through Nginx
Nifi registry docker-compose file
In deployment repository
Also has a backup container to back up to s3
Spinning up reporting stack
docker-compose up --build --scale scalyr=0 --scale kafka=0 --scale zookeeper=0
Builds the images and does not bring up Scalyr, Kafka or Zookeeper
How Nginx proxies requests: it checks the Host header; if the host is for Nifi, the request is passed to Nifi; if it is for Superset, the request is passed to Superset
Superset image gets built each time we spin up reporting stack
This was not incorporated into CI process, but there is no obstacle that keeps it from being done
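The Host-header routing that Nginx performs can be sketched as a small lookup. This is an illustration of the idea only, not the actual Nginx config: the upstream addresses and the rule of matching on the first label of the host name are assumptions.

```python
# Hypothetical upstream table; host labels and ports are assumptions.
UPSTREAMS = {
    "nifi": "http://nifi:8080",
    "superset": "http://superset:8088",
}


def pick_upstream(host_header):
    """Return the upstream for a Host header, or None if nothing matches."""
    hostname = host_header.split(":")[0].lower()  # drop any :port suffix
    first_label = hostname.split(".")[0]          # e.g. "nifi" in nifi.example.org
    return UPSTREAMS.get(first_label)
```

For example, `pick_upstream("nifi.example.org")` selects the Nifi upstream, while an unknown host yields `None`.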
Reporting Stack Persistence
How to load data into Nifi from a Nifi registry, persisting nifi flows for a nifi instance
Folder reporting/config/services/nifi
Process groups are versioned, each separately
For configuring load, need bucket id and flow id
Property file for nifi registries - preload/registries
Registry client name must be unique; it is used by process groups
Property files for flows/process groups - preload/process-groups
When a new process group is created, its config is automatically uploaded, then persisted and backed up
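A minimal sketch of reading one process-group property file and checking that it carries the identifiers needed for loading. The real file format in preload/process-groups is not shown in these notes, so the key=value layout and the key names below are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props, required=("registry_client_name", "bucket_id", "flow_id")):
    """Return the required keys that are absent or empty (key names are assumed)."""
    return [key for key in required if not props.get(key)]


# Hypothetical file content; the ids are placeholders, not real values.
EXAMPLE = """
# hypothetical preload/process-groups entry
registry_client_name=example-registry
bucket_id=00000000-0000-0000-0000-000000000001
flow_id=00000000-0000-0000-0000-000000000002
"""
```

A check like `missing_keys(parse_properties(EXAMPLE))` returning an empty list would indicate the file has the bucket id, flow id and registry client name.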
How to persist a nifi registry itself
Not in ref distro, but deployment repository, folder deployment/reporting_env
Two volumes, one for flow storage, one for db
Registry database has nifi metadata
Back up flow files to git, version control
Back up registry database (as well as flow files) to s3
When backup container comes up, gets backup files/db from s3, so it must come up before nifi-registry container
Backup cron schedule in .env file, default is once a minute
Check local files vs. s3 remote files, if local is newer, copy to s3
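The local-versus-remote check can be sketched as a pure function over modification times. This is not the backup container's actual script: the inputs are plain dictionaries of file name to timestamp, and treating a file that is missing from s3 as needing upload is an assumption.

```python
def files_to_upload(local_mtimes, remote_mtimes):
    """Return names whose local copy is newer than the remote copy.

    Files absent from the remote side are also returned (assumed behavior).
    """
    uploads = []
    for name, local_time in sorted(local_mtimes.items()):
        remote_time = remote_mtimes.get(name)
        if remote_time is None or local_time > remote_time:
            uploads.append(name)
    return uploads
```

With the default schedule, a check like this would run once a minute, so files with equal timestamps are skipped to avoid re-uploading unchanged data.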
2019-05-07
How to do Provisioning / Deployment
Shared UAT instance; shared registry, nifi, superset, etc.
Not using AWS dashboard
EC2 instance, terminate SSL certs on load balancer, so we can just create a new server and attach to load balancer
Guidance on EC2 size?
Need to update recommended deployment topology and requirements to include reporting stack
Terraform, definition of state of servers, its maintenance (creating/destroying resources); "infrastructure as code"
These states are stored on s3
Can automatically create resources based on resource definition
Creates servers, load balancers, bringing up AWS stuff
Ansible
Configuring inside the server itself; installing docker and other stuff, etc.
Kind of like sshing into the server
Jenkins
Automated service deployment, git clone, docker-compose up
Use certs installed on Jenkins to connect to docker daemon on remote host
Could define in terraform how to limit access from only the Jenkins server
The nifi-registry folder actually defines a server that hosts the shared registry, Nifi and Superset
The .terraform folder is a snapshot of the infrastructure generated, not checked into source control
Also backed up to s3 bucket
Reference variables in main.tf with prefix "var."
Can have multiple tfvars files, scanned and evaluated alphabetically in the folder
Can also have default values in variables.tf
Start with terraform init (generates the .terraform folder), then terraform plan, then terraform apply (which executes the plan, then saves the state)
How does terraform connect with AWS to init and generate plan?
Need to install AWS CLI first
The modules folder shows how it actually creates the resources
The local exec section (in compute.tf) runs the ansible playbook
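The init, plan, apply sequence described above could be scripted roughly as follows. This is a sketch, not the project's Jenkins job: the plan file name `tfplan` is hypothetical, and the `runner` parameter exists only so the sequence can be exercised without Terraform installed.

```python
import subprocess

# Standard Terraform CLI steps, in the order described in the notes.
TERRAFORM_STEPS = [
    ["terraform", "init"],
    ["terraform", "plan", "-out=tfplan"],  # "tfplan" is a hypothetical file name
    ["terraform", "apply", "tfplan"],
]


def run_steps(workdir, runner=subprocess.run):
    """Run each step inside workdir; check=True stops at the first failure."""
    for step in TERRAFORM_STEPS:
        runner(step, cwd=workdir, check=True)
```

Saving the plan to a file and applying that file means the apply step executes exactly what was reviewed in the plan step.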
Superset Persistence
Superset Dockerfile
pip install a package called superset-patchup aka ketchup
Made modifications to superset itself
Want users to have role-based permissions in superset
Automatically create and grant role
Superset config python file
Making changes to superset config
Import CustomSecurityManager
Uses OAuth, because authentication goes through OpenLMIS
OLMIS_Gamma role
Users can self register
Allow-from is set to uat.openlmis.org; this setting needs to be customized per deployment
Superset-patchup
Adding OAuth
Creating custom roles
Redirect URLs
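The Superset config changes listed above might translate into settings along these lines. This is a sketch under assumptions, not the project's actual config file: it returns a dictionary instead of setting module-level names so it runs without Superset installed, and it assumes the "Allow-from" note refers to the X-Frame-Options header. AUTH_USER_REGISTRATION, AUTH_USER_REGISTRATION_ROLE and HTTP_HEADERS are existing Flask-AppBuilder / Superset config keys.

```python
def build_superset_overrides(frame_origin="uat.openlmis.org",
                             registration_role="OLMIS_Gamma"):
    """Return values that a Superset config file could assign at module level."""
    return {
        # Let users self-register and land in the custom role.
        "AUTH_USER_REGISTRATION": True,
        "AUTH_USER_REGISTRATION_ROLE": registration_role,
        # Assumed header; frame_origin must be customized per deployment.
        "HTTP_HEADERS": {"X-Frame-Options": f"allow-from {frame_origin}"},
    }


# In a real config file these would sit next to AUTH_TYPE = AUTH_OAUTH
# (from flask_appbuilder.security.manager) and
# CUSTOM_SECURITY_MANAGER = CustomSecurityManager (from superset-patchup).
```

Passing a different `frame_origin` shows the one value an implementer would need to change for a non-UAT host.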