Technical Values & Approach

Project goal: Quickly deliver v3 reports to users of v2 / eLMIS in Tanzania, without a "full" upgrade.



Values

  • Pipelines allow us to break the problem down into structured independent data flows.

  • Everything is repeatable

    • Infrastructure as Code

    • (Re-)Running Pipelines

    • Experimentation

  • Nothing is manual, nothing is a one-off  - Automate

    • From environment setup to deployment, we put everything we can in code.

    • We never "just do a one-time tweak to the database/API/app/etc"

    • Backups

  • Resilient

    • Not a "big bang" migration, it will need to function continuously for weeks, months, perhaps even a year or longer.

    • During lifetime eLMIS and OpenLMIS v3 will have releases, data models will change without notice.

    • Source and destination software releases should slow new data from flowing, however both systems should stay available.

  • Data is usable in near-real time: a user should expect their data, once entered, to be available in seconds to minutes - not tens of minutes, hours, or days.

  • Source system should be unaware and unaffected

    • We must demonstrate that eLMIS availability and performance are not noticeably affected.

    • We must convince eLMIS developers that they do not need to coordinate releases with us, merely notify us of what's changing and by when.



Approach

Tech Stack

  • Deployed via Terraform, Ansible and Docker.

    • Terraform & Ansible to provision environments.

    • Ansible to manage deployed application.

    • Docker as deployment unit and runtime.

  • Streaming architecture centered around Kafka

    • Decouple source and destination systems.

    • Native support for pipelines.

    • Kafka is a leading streaming platform with built-in support for high-performance, repeatable pipelines and a resilient log architecture.
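To illustrate how Kafka's log architecture makes pipelines repeatable, here is a minimal in-memory sketch (not real Kafka; class and record names are made up): producers append immutable records, each consumer tracks its own offset, and re-running a pipeline is just rewinding that offset.

```python
# Illustrative in-memory sketch of the log abstraction Kafka provides.
# Producers append immutable records; each consumer keeps its own offset,
# so a pipeline can be re-run simply by rewinding to offset zero.

class Log:
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)


class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        # Return everything past our offset, then advance the offset.
        batch = self.log.records[self.offset:]
        self.offset = len(self.log.records)
        return batch

    def rewind(self):
        # Re-running a pipeline is just resetting the consumer's offset.
        self.offset = 0


log = Log()
log.append({"table": "facilities", "op": "insert"})
log.append({"table": "orders", "op": "update"})

consumer = Consumer(log)
first_pass = consumer.poll()   # consumes both records
consumer.rewind()
second_pass = consumer.poll()  # same records again - repeatable
```

Because the log is immutable and consumers own their offsets, the source system never needs to replay anything for us - decoupling source from destination exactly as described above.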

  • Change Data Capture using Debezium

    • Debezium uses the same internal PostgreSQL mechanism as replication (the write-ahead log) - the eLMIS database should be unaware of and unaffected by its presence.

    • Change Data Capture allows us to capture changes made directly to a database as they're made.

    • We have direct access to the source data & its schema, including changes to either.

    • Resilient against network outages - if the network is down, Debezium will simply queue changes until it is restored.
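Debezium's documented change-event envelope carries an "op" code ("c" create, "r" snapshot read, "u" update, "d" delete) plus "before"/"after" row images. A minimal sketch of applying such events to an in-memory copy of a table (the table contents below are made up):

```python
# Sketch: applying Debezium-style change events ("op", "before", "after")
# to an in-memory table keyed by primary key. Row data is illustrative only.

def apply_change(state, event):
    """Apply one CDC event to a dict acting as a table keyed by id."""
    op = event["op"]
    if op in ("c", "r", "u"):          # create / snapshot read / update
        row = event["after"]
        state[row["id"]] = row
    elif op == "d":                    # delete - only "before" is populated
        state.pop(event["before"]["id"], None)
    return state


table = {}
apply_change(table, {"op": "c",
                     "after": {"id": 1, "name": "Dodoma Warehouse"}})
apply_change(table, {"op": "u",
                     "before": {"id": 1, "name": "Dodoma Warehouse"},
                     "after": {"id": 1, "name": "Dodoma Regional Warehouse"}})
```

A consumer written this way can rebuild the destination table from scratch at any time by replaying the event stream from the beginning.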

  • Monitoring using Prometheus and Grafana

  • Control using ???

    • Simple (e.g. REST) interface for starting/stopping a pipeline.

    • Clear a pipeline's destination and re-run the pipeline (e.g. clear a table in v3, then re-run the pipeline with a new transformation phase).
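Since the control tool is still undecided, here is a purely illustrative sketch of the operations such an interface would need to expose - start, stop, and clear-and-rerun with a new transformation. All names are hypothetical:

```python
# Hypothetical sketch of the control operations described above; the actual
# control tool is undecided. The "destination" list stands in for a v3 table.

class PipelineController:
    def __init__(self, name):
        self.name = name
        self.running = False
        self.destination = []          # stand-in for a destination v3 table

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def clear_and_rerun(self, source, transform):
        # Clear the pipeline's destination, then re-run the whole pipeline
        # with a (possibly new) transformation phase.
        self.destination.clear()
        self.destination.extend(transform(record) for record in source)


ctl = PipelineController("facilities")
ctl.start()
ctl.clear_and_rerun([{"name": "dodoma"}],
                    lambda r: {"name": r["name"].title()})
```

Whatever tool is chosen, these operations map naturally onto a small REST surface (e.g. POST start/stop, POST rerun).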

  • Simple Transformations using NiFi (or perhaps Java / Python)

    • Simple transformations may be chained together as needed, and are more testable.

    • NiFi is provisional:

      • NiFi is a tool we've been learning.

      • NiFi Processors are a known commodity for building transformation stages.

      • NiFi has been one of the more difficult tools to use, and it is better suited to building entire one-off pipelines than to simple transformations.

      • We don't have a good method for testing individual NiFi processors.
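If plain Java / Python were chosen instead, a transformation stage could be an ordinary function, which makes chaining and unit testing trivial. A sketch (stage and field names are hypothetical):

```python
# Sketch of simple, individually testable transformation stages chained
# together in plain Python. Field names are illustrative only.

def rename_fields(record):
    """Map v2-style field names onto v3-style names."""
    return {"facility_name": record["name"],
            "facility_code": record["code"]}

def normalize_code(record):
    """Normalize facility codes to upper case."""
    return {**record, "facility_code": record["facility_code"].upper()}

def run_pipeline(records, stages):
    """Apply each stage to every record, in order."""
    for stage in stages:
        records = [stage(r) for r in records]
    return records


out = run_pipeline([{"name": "Arusha Clinic", "code": "ar-01"}],
                   [rename_fields, normalize_code])
```

Each stage is a plain function, so it can be tested in isolation - exactly the property that has been hard to get with NiFi processors.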

  • Schema Registry (optional)

    • Reduces the size of database-change messages in Kafka, improving performance.

    • Optional as it's mostly a performance enhancement.
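As a rough illustration of why a schema registry shrinks messages: a plain JSON change message repeats every field name in every record, while a schema-aware binary encoding (as Avro with a registry provides) ships only the values, with the schema stored once in the registry. This sketch uses Python's struct module as a stand-in for a real Avro encoder; the record is made up:

```python
import json
import struct

# Illustrative size comparison only - struct stands in for a real
# schema-aware encoder such as Avro. The record below is made up.
record = {"id": 1234, "quantity": 50, "facility_id": 7}

# JSON repeats every field name in every message.
json_bytes = json.dumps(record).encode()

# With a registered schema, only the values travel (plus a small schema id);
# the field names live in the registry. Three big-endian ints = 12 bytes.
binary_bytes = struct.pack(">iii",
                           record["id"],
                           record["quantity"],
                           record["facility_id"])
```

The saving compounds across millions of change events, which is why the registry is worth having - but since it is only a performance enhancement, it can be deferred.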
