Developer Onboarding Guide
Who this is for / How to use this
This guide is for implementation developers who will be deploying, maintaining, and/or improving Casper (e.g. TZ eLMIS developers). This document has five parts.
Everyone should read the overview section. Then, use the following sections based on your goals.
The current Casper demo has code in the following git repositories:
openlmis/openlmis-deployment - Used for provisioning Amazon Web Services (AWS) resources.
openlmis/casper-deployment - Defines jobs in its .gitlab-ci.yml which deploy the five pieces of the Casper architecture
openlmis/casper-elmis - Used for compiling eLMIS and packaging it into a Docker image
openlmis/casper-pipeline - The "real work" of Casper is done by the pipeline, which connects eLMIS v2 to OpenLMIS v3, transforming data along the way
villagereach/openlmis-config - Holds configuration files (i.e. settings.env files) which hold secrets (e.g. API access tokens, passwords)
Deployment is done from the master branch of each repository. Three of the repositories are hosted on GitLab, instead of GitHub, primarily so that we can use GitLab CI instead of Jenkins to run the deployment jobs.
In addition, eLMIS is deployed from the elmis-tzm/open-lmis repository, and OpenLMIS v3 is deployed via the openlmis/openlmis-ref-distro repository.
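To make the GitLab CI setup concrete, a deployment job generally looks like the sketch below. This is a hypothetical illustration, not the actual contents of casper-deployment's .gitlab-ci.yml; the job name, stage, and script lines are all assumptions.

```yaml
# Hypothetical sketch of one deployment job in a .gitlab-ci.yml.
# The real file in openlmis/casper-deployment defines jobs for the five
# Casper components; the names and scripts here are illustrative only.
deploy-pipeline:
  stage: deploy
  only:
    - master                  # deployment is done from the master branch
  script:
    - cd pipeline             # assumed directory holding the component's compose file
    - docker-compose pull     # fetch the latest images
    - docker-compose up -d    # (re)start the component's containers
```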
The five pieces of the Casper architecture are deployed on three AWS EC2 instances, as shown in the following diagram:
casper-elmis.a.openlmis.org casper.a.openlmis.org casper-superset.a.openlmis.org
↓ ↓ ↓
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
| eLMIS v2 | | |---→| v3 Reporting |
|-----------------| | OpenLMIS v3 | |-----------------|
| Casper pipeline |---→| | | NiFi Registry |
└─────────────────┘ └─────────────────┘ └─────────────────┘
↑ ↑
casper-elmis.a.openlmis.org:8080/nifi casper-nifi-registry.a.openlmis.org
The eLMIS instance is deployed on the same machine as the Casper pipeline, which collects data from it. The NiFi registry is deployed on the same machine as the OpenLMIS v3 reporting stack, since the NiFi registry is only used by the reporting stack (the Casper pipeline is built on NiFi, but it doesn't use the registry).
Key Technologies Used
Docker and docker-compose - These are perhaps the most important tools for deploying Casper. I highly recommend reading both of the linked guides. Each of the five components used by the Casper demo is defined by a single docker-compose.yml file. Each docker-compose file in turn lists between 1 and 17 Docker containers used by that component.
eLMIS: docker-compose.yml
Casper pipeline: docker-compose.yml
OpenLMIS v3: docker-compose.yml
v3 reporting stack: docker-compose.yml
NiFi registry: docker-compose.yml
NiFi - This is the primary piece of the Casper pipeline, and it is also used by the v3 reporting stack. The NiFi transforms in the Casper pipeline are what turn data in v2's format into data in v3's format. NiFi (and the rest of the pipeline) does "stream processing": the NiFi process is always waiting and listening for new input data (from v2), which is transformed (into v3 data) and passed forward as soon as it is received, without waiting for more data. I recommend starting with the linked NiFi guide, though it may be more useful to look things up as you need them. More details are below in the "How the Casper Pipeline Works" section.
Terraform (and Ansible) - Terraform is used for creating resources on AWS. Those resources include servers (EC2 instances), domain names, databases, firewalls, and more. These resources are all defined by our Terraform code, an approach known as "infrastructure as code". Though Terraform is used to set up servers for the Casper demo, it would not work with NDC servers.
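In practice, deploying or restarting one of the five components boils down to standard docker-compose commands run in the directory holding its docker-compose.yml. The sketch below is illustrative only: the directory name is an assumption, and on the servers these commands are normally run by the GitLab CI jobs rather than by hand.

```shell
# Illustrative only: the directory name is an assumption, and the secrets
# from the settings.env files must be in place before containers will start.
cd casper-pipeline           # directory containing the component's docker-compose.yml
docker-compose pull          # fetch the latest image for each listed container
docker-compose up -d         # start (or recreate) all containers in the background
docker-compose ps            # check that the containers came up
```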
Additional Technologies Used
TODO: what should we link here?
Kafka - Along with NiFi, Kafka is used to drive the Casper pipeline. Each change to the v2 database becomes a message on a Kafka topic, and NiFi outputs transformed data as messages to other Kafka topics
Debezium - Streams data from eLMIS into Kafka
Superset - The data visualization web app that is the user interface for the OpenLMIS v3 reporting stack
AWS - The Casper demo is entirely hosted on Amazon's cloud services, managed by Terraform
Grafana - We run this application for monitoring the Casper pipeline to view statistics about Kafka and NiFi
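To see the pipeline's messages in flight, you can read a Kafka topic directly with Kafka's console consumer from inside the broker's container. This is a hedged sketch: the container name and topic name below are assumptions, not the demo's actual names.

```shell
# Illustrative: the container name ("kafka") and topic name are assumptions;
# list real topics first with kafka-topics.sh --list if unsure.
docker exec -it kafka \
  kafka-console-consumer.sh \
    --bootstrap-server localhost:9092 \
    --topic elmis.facilities \
    --from-beginning \
    --max-messages 5         # print the first five messages, then exit
```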
The diagram in the "Overview of the Casper Demo" section shows how the five components of the demo are deployed to three servers. The goal of this section is to provide instructions for setting up all of the parts in that diagram.
Provisioning
This part of the guide assumes that you are using AWS to host everything and have an AWS account with IAM credentials (an access key ID and a secret access key). Provisioning AWS resources for the Casper demo is done using our Terraform configuration.
Installation should be done on a machine that will control the targets. This is most likely your development computer.
Setup
Install the Terraform command-line tool (v0.11) on your development computer
Install Ansible on your development computer
Note that installing on OSX has been reported to be tricky: you should use virtualenv, otherwise errors seem to be likely. This guide is useful for OSX users. Use Python 2.x, not 3.x. When using a virtualenv, do not use sudo pip install; instead drop the sudo, which allows pip to install Ansible in the virtualenv. Run mkvirtualenv olmis-deployment if you need a new virtual environment.
Install pip, a package manager for Python
Install the requirements for our Ansible scripts:
$ pip install -r ../ansible/requirements/ansible.pip
Clone the openlmis-deployment git repository:
$ git clone https://github.com/OpenLMIS/openlmis-deployment.git
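Putting the setup steps together, an end-to-end session on a development machine might look like the following. This is a sketch under the assumptions above (virtualenvwrapper installed, Python 2.x available); the requirements path here is relative to the repository root, which is an assumption about where you run the command.

```shell
# Illustrative end-to-end setup; assumes virtualenvwrapper and Python 2.x.
mkvirtualenv olmis-deployment        # isolated Python environment (no sudo needed)
pip install ansible                  # installs Ansible into the virtualenv
git clone https://github.com/OpenLMIS/openlmis-deployment.git
cd openlmis-deployment
pip install -r ansible/requirements/ansible.pip   # dependencies for our Ansible scripts
```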
Running Terraform
You will have to repeat these steps for each of the three machines used for the deployment. (It would be possible to put all the resources in a single terraform environment, but this way they can be managed separately.)
Set up your AWS access keys.
$ export TF_VAR_aws_access_key_id=$AWS_ACCESS_KEY_ID
$ export TF_VAR_aws_secret_access_key=$AWS_SECRET_ACCESS_KEY
Go to one of the three subdirectories of the openlmis-deployment/provision/terraform/casper/ directory (e.g. v2/ for the server with eLMIS and the pipeline).
$ cd openlmis-deployment/provision/terraform/casper/v2/
Prepare terraform (this creates the .terraform directory):
$ terraform init
Start up the resources defined in the current folder. Or, if they are already running (e.g. if you are using the VillageReach AWS account), apply changes from any edited files. This command will ask for confirmation before actually making changes:
$ terraform apply
You should be able to check that the newly created resources are working by pinging them, even though no applications are deployed yet, e.g.:
$ ping casper-elmis.a.openlmis.org