...
- Why is reporting stack incorporated into ref distro? Why is it not in the main docker compose?
- Note to create a recommended topology for reporting stack
Reporting stack pieces
- Nifi
- Open source by Apache, data ingestion, sophisticated ETL
- Connect to OpenLMIS API, get data, preprocess data, then put it into reporting stack
- Could get sources from anywhere, OpenLMIS, DHIS2, etc.
- Superset
- Also by Apache, data visualization
- Indicator queries, build charts
- Postgres
- Nifi → Postgres ← Superset
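The topology above might be sketched in docker-compose roughly like this (a sketch only — service names and images are illustrative, not the actual ref-distro file):

```yaml
services:
  nifi:              # ETL: pulls from the OpenLMIS API, preprocesses
    image: apache/nifi
    depends_on: [db]
  superset:          # visualization: indicator queries, charts
    image: apache/superset
    depends_on: [db]
  db:                # shared reporting database
    image: postgres
```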
...
- How to load flows into Nifi from a Nifi Registry; persisting Nifi flows for a Nifi instance
- Folder reporting/config/services/nifi
- Process groups are versioned, each separately
- For configuring load, need bucket id and flow id
- Property file for nifi registries - preload/registries
- Registry client name must be unique; it is referenced by process groups
- Property files for flows/process groups - preload/process-groups
- When creating a new process group, config automatically uploaded and then persisted and backed up
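The preload property files described above might look roughly like this (all keys, file names, and IDs here are illustrative placeholders, not the real config format):

```
# reporting/config/services/nifi/preload/registries/registry.properties
# registry client name must be unique; process groups refer to it
name=openlmis-registry
url=http://nifi-registry:18080

# reporting/config/services/nifi/preload/process-groups/my-flow.properties
registryName=openlmis-registry
bucketId=<bucket id from the registry>
flowId=<flow id from the registry>
```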
- How to persist a nifi registry itself
- Not in ref distro, but deployment repository, folder deployment/reporting_env
- Two volumes, one for flow storage, one for db
- Registry database has nifi metadata
- Back up flow files to git, version control
- Back up registry database (as well as flow files) to s3
- When backup container comes up, gets backup files/db from s3, so it must come up before nifi-registry container
- Backup cron schedule in .env file, default is once a minute
- Check local files vs. s3 remote files, if local is newer, copy to s3
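The newer-check in the backup job could be sketched in shell like this (paths are illustrative, and `cp` stands in for the real `aws s3 cp`; the actual job runs on the cron schedule from `.env`):

```shell
# Copy a local backup file to its remote location only when the local
# copy is newer than the remote one (sketch: cp stands in for aws s3 cp).
backup_if_newer() {
  local_file="$1"
  remote_file="$2"
  # copy when the remote is missing, or the local file is newer (-nt)
  if [ ! -e "$remote_file" ] || [ "$local_file" -nt "$remote_file" ]; then
    cp "$local_file" "$remote_file"
  fi
}
```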
2019-05-07
How to do Provisioning / Deployment
- Shared UAT instance; shared registry, nifi, superset, etc.
- Not using AWS dashboard
- EC2 instance; terminate SSL on the load balancer, so we can just create a new server and attach it to the load balancer
- Guidance on EC2 size?
- Need to update recommended deployment topology and requirements to include reporting stack
- Terraform: defines the desired state of the servers and maintains it (creating/destroying resources); "infrastructure as code"
- The state is stored on s3
- Can automatically create resources based on resource definition
- Creates servers and load balancers, brings up AWS resources
- Ansible
- Configuring inside the server itself; installing docker and other stuff, etc.
- Kind of like sshing into the server
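A minimal sketch of such a playbook (host group and task details are assumptions, not the actual playbook):

```yaml
# Illustrative playbook: configure the inside of the server
- hosts: reporting
  become: yes
  tasks:
    - name: Install docker    # "installing docker and other stuff"
      apt:
        name: docker.io
        state: present
```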
- Jenkins
- Automated service deployment, git clone, docker-compose up
- Use certs installed on Jenkins to connect to docker daemon on remote host
- Could define in terraform how to limit access from only the Jenkins server
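The remote-deploy step might look roughly like this in shell (host name, port, and cert paths are assumptions):

```shell
# Build the docker-compose invocation that talks to a remote docker
# daemon over TLS, using the certs installed on the Jenkins server.
remote_compose_up_cmd() {
  host="$1"       # remote docker host (assumed daemon port 2376)
  cert_dir="$2"   # directory holding ca.pem, cert.pem, key.pem
  echo "docker-compose -H tcp://${host}:2376 --tlsverify" \
       "--tlscacert ${cert_dir}/ca.pem --tlscert ${cert_dir}/cert.pem" \
       "--tlskey ${cert_dir}/key.pem up -d"
}
```

The Jenkins job would then essentially run `git clone` followed by the command this prints.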
- The nifi-registry folder actually defines a server that hosts the shared registry, nifi and superset
- The .terraform folder is a local snapshot of the generated infrastructure; it is not checked into source control
- Also backed up to s3 bucket
- Reference variables in main.tf with prefix "var."
- Can have multiple tfvars files, scanned and evaluated alphabetically in the folder
- Can also have default values in variables.tf
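The variable setup described above might look roughly like this (variable name and values are made up):

```hcl
# variables.tf — declares the variable with a default;
# referenced elsewhere (e.g. main.tf) as var.instance_type
variable "instance_type" {
  default = "m4.large"
}

# some.tfvars — overrides the default; multiple tfvars files
# in the folder are evaluated alphabetically
instance_type = "m4.xlarge"
```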
- Start with terraform init (which generates the .terraform folder), then terraform plan, then terraform apply (which executes and saves state)
- How does terraform connect with AWS to init and generate plan?
- Need to install AWS CLI first
- The modules folder shows how it actually creates the resources
- The local exec section (in compute.tf) runs the ansible playbook
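The local-exec hookup mentioned above might look roughly like this (resource and playbook names are illustrative):

```hcl
# modules compute.tf — sketch: run the ansible playbook against the new host
resource "aws_instance" "reporting" {
  # ... instance definition ...

  provisioner "local-exec" {
    command = "ansible-playbook -i '${self.public_ip},' provision.yml"
  }
}
```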
Superset Persistence
- Superset Dockerfile
- pip install a package called superset-patchup aka ketchup
- Made modifications to superset itself
- Want users to have role-based permissions in superset
- Automatically create and grant role
- Superset config python file
- Making changes to superset config
- Import CustomSecurityManager
- Using OAuth, because authentication is done against OpenLMIS
- OLMIS_Gamma role
- Users can self register
- Allow-From uat.openlmis.org; the setting needs to be customized
- Superset-patchup
- Adding OAuth
- Creating custom roles
- Redirect URLs
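Based on these notes, the superset config python file changes might look roughly like this (a sketch only; the HTTP_HEADERS line and the exact values are assumptions):

```python
# superset_config.py (fragment) — illustrative, not the actual file
from flask_appbuilder.security.manager import AUTH_OAUTH
from superset_patchup.oauth import CustomSecurityManager

AUTH_TYPE = AUTH_OAUTH                         # authenticate via OpenLMIS OAuth
CUSTOM_SECURITY_MANAGER = CustomSecurityManager
AUTH_USER_REGISTRATION = True                  # users can self-register
AUTH_USER_REGISTRATION_ROLE = "OLMIS_Gamma"    # role created/granted automatically
# assumption: allow embedding from uat.openlmis.org
HTTP_HEADERS = {"X-Frame-Options": "ALLOW-FROM https://uat.openlmis.org"}
```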