...
- Get v3 reports working in country, where the source system is v2.
- Deliver, in-code (repeatable and testable), an ETL pipeline that can pull data from v2 and get it into the v3 reporting stack.
- Achieve the goal of zero modifications to the v2 source system
- The pipeline has no discernible impact on the source system
- Deliver a working system that has zero bespoke modifications to v3, so that the v3 components can stay on the continual upgrade path
- Deliver a streaming system, so that the [...], and so that the pipeline's transformation stages may be repeated at will.
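
As a rough illustration of what "repeated at will" could mean in practice, the sketch below keeps change events in an append-only log and treats a transformation stage as a pure function over that log, so the stage can be re-run from any offset with the same result. The `ChangeEvent` shape, the `facilities` table, the `facility_code` column and the `to_report_row` mapping are all hypothetical; in the real pipeline the log would be a Kafka topic rather than a Python list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class ChangeEvent:
    offset: int  # position in the log, playing the role of a Kafka offset
    table: str   # hypothetical v2 table name
    row: Dict    # row state after the change


def replay(log: Iterable[ChangeEvent],
           transform: Callable[[ChangeEvent], Dict],
           from_offset: int = 0) -> List[Dict]:
    """Re-run one transformation stage over the retained log."""
    return [transform(event) for event in log if event.offset >= from_offset]


def to_report_row(event: ChangeEvent) -> Dict:
    # Hypothetical mapping from a v2 row to a v3 reporting row.
    return {
        "source_table": event.table,
        "facility": event.row["facility_code"].upper(),
    }


log = [
    ChangeEvent(0, "facilities", {"facility_code": "hc01"}),
    ChangeEvent(1, "facilities", {"facility_code": "hc02"}),
]

first_run = replay(log, to_report_row)
second_run = replay(log, to_report_row)              # same log in, same rows out
tail_only = replay(log, to_report_row, from_offset=1)  # resume from a later offset
```

Because the transform has no side effects and the log is never mutated, fixing a bug in `to_report_row` only requires replaying the log; the v2 source system is not queried again.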
...
- Leverage the "reporting stack" to form the basis of this data migration pipeline
- Dockerized NiFi, Postgres, Kafka, ZooKeeper, Debezium
- To reduce hosting costs, we hope to use the same containers for migration as for reporting. Depending on network topology and IT security, however, we may need to run some containers on separate instances.
- Move the reporting stack back toward a streaming (kappa) architecture (with Kafka, yet again)
- Introduce Debezium for Change Data Capture, both to move back toward a streaming architecture and to (nearly) eliminate load on the source v2 system.
- Enable replication in eLMIS Postgres (load module and change security)
- Debezium needs privileged network access to Postgres replication
- Option 1: Debezium connects straight to the source production eLMIS Postgres
- Option 2: Debezium connects to a replicated eLMIS Postgres
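
To make the two options concrete, here is a minimal sketch of the connector definition that would be submitted to the Kafka Connect REST API (`POST /connectors`). The property names are those of the Debezium PostgreSQL connector. Everything else is an assumption for illustration: the connector name, slot name, topic prefix, hostnames, credentials and table list are hypothetical. `topic.prefix` is the Debezium 2.x name (1.x used `database.server.name`), and `pgoutput` assumes Postgres 10 or later; an older eLMIS database would need a separately installed decoding plugin instead.

```python
import json
from typing import Dict, List


def build_connector_config(host: str, dbname: str, user: str,
                           password: str, tables: List[str]) -> Dict:
    """Build a Debezium PostgreSQL connector definition as a plain dict."""
    return {
        "name": "elmis-v2-cdc",  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "database.hostname": host,
            "database.port": "5432",
            "database.user": user,          # needs replication privileges
            "database.password": password,
            "database.dbname": dbname,
            "plugin.name": "pgoutput",       # assumption: Postgres 10+
            "slot.name": "elmis_v2_slot",    # hypothetical replication slot
            "topic.prefix": "elmis_v2",      # hypothetical; Debezium 2.x property
            "table.include.list": ",".join(tables),
        },
    }


# Option 1 and option 2 differ only in which host the connector points at.
option_1 = build_connector_config(
    "elmis-prod.example.org", "open_lmis", "debezium", "change-me",
    ["public.facilities"])
option_2 = build_connector_config(
    "elmis-replica.example.org", "open_lmis", "debezium", "change-me",
    ["public.facilities"])

payload = json.dumps(option_1, indent=2)  # request body for Kafka Connect
```

On the Postgres side, "load module and change security" typically corresponds to setting `wal_level = logical` in `postgresql.conf` and adding a replication entry for the Debezium user in `pg_hba.conf`; the exact values depend on the eLMIS deployment.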
...
- Reporting stack learning curve
- Access to v2 production data, and access to the people who can explain the semantics behind its structure
- Affecting the availability of the eLMIS source system would be very, very bad
- Re-introduce Kafka (as in learning curve?)
- Introducing Debezium
- Solve for aggregate root problem
- Reporting stack doesn't achieve the level of robustness we'd need
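
The aggregate root problem listed above arises because Change Data Capture emits one event per changed row, while a report usually needs a whole domain object (a parent row plus its child rows). A minimal sketch of reassembly follows, assuming hypothetical `requisitions` and `requisition_line_items` tables linked by `requisition_id`; the real v2 schema and event format may differ.

```python
from collections import defaultdict
from typing import Dict, List


def assemble_aggregates(events: List[Dict]) -> Dict[int, Dict]:
    """Group row-level change events under the id of their aggregate root."""
    aggregates = defaultdict(lambda: {"root": None, "children": []})
    for event in events:
        row = event["row"]
        if event["table"] == "requisitions":
            aggregates[row["id"]]["root"] = row
        elif event["table"] == "requisition_line_items":
            aggregates[row["requisition_id"]]["children"].append(row)
    return dict(aggregates)


# Child rows may arrive before their parent, so grouping is by key, not by order.
events = [
    {"table": "requisition_line_items", "row": {"id": 10, "requisition_id": 1, "qty": 5}},
    {"table": "requisitions", "row": {"id": 1, "status": "SUBMITTED"}},
    {"table": "requisition_line_items", "row": {"id": 11, "requisition_id": 1, "qty": 7}},
]

aggregates = assemble_aggregates(events)
complete = {k: v for k, v in aggregates.items() if v["root"] is not None}
```

This sketch does not address the harder part of the problem, which is knowing when an aggregate is complete; that depends on transaction boundaries in the source and is left open here.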
...