Working notes on reporting database process
@Peter Lubell-Doughtie (Unlicensed) and @Josh Zamor (Deactivated) I want to capture my notes on the current state of the reporting database for discussion:
We want to move forward with PostgreSQL as the primary data storage system.
(There's still an outstanding conversation around reporting for stock card line items at scale that we need to work on with @Josh Zamor (Deactivated))
We need to set up generic data ingestion processes in NiFi.
These processes would collect information from the APIs and convert it to an intermediary schema that is database agnostic.
All information stored in this schema would be written to a "forever" (infinite-retention) Kafka topic for backup and rerun purposes.
The ingestion flow would likely have an output port that pushes records over to a set of flows specific to each data storage processor group.
(This keeps the database schema separate from the ingestion process, allowing implementers to choose which data storage process they want to implement)
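As a sketch of what the intermediary, database-agnostic record might look like, each API payload could be normalized into a flat envelope before being published to the retained Kafka topic. The field names and the function below are illustrative assumptions, not an agreed schema:

```python
import json

# Hypothetical normalizer: wraps an OpenLMIS API payload in a flat,
# database-agnostic envelope. Field names are illustrative assumptions.
def to_intermediary(source_api, entity_type, payload):
    return {
        "source": source_api,        # which service/API the record came from
        "entityType": entity_type,   # e.g. "requisition", "stockCardLineItem"
        "entityId": payload.get("id"),
        "payload": payload,          # original record kept verbatim for reruns
    }

# Example: a made-up record from the requisition API.
record = to_intermediary(
    "requisition-service", "requisition", {"id": "abc-123", "status": "APPROVED"}
)

# The JSON-serialized envelope is what would be written to the "forever"
# Kafka topic (e.g. a topic created with retention.ms=-1 so it is never purged).
message = json.dumps(record)
```

Keeping the original payload verbatim inside the envelope is what makes the topic usable for reruns: a new storage flow can replay the topic from offset zero without re-hitting the APIs.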
We need to set up a database-specific set of flows:
First, create the database schema
Once the schema is set up, start the flows to store the data in that schema
Develop NiFi flows for Superset to import the metrics that are appropriate for that data storage type
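For the PostgreSQL case, the "create schema, then store" sequence above might look like the following. Table and column names are placeholders for discussion, not the agreed schema:

```python
# Hypothetical DDL for the PostgreSQL flavor of the database-specific flows.
# Names are placeholders; the real schema is still to be designed.
CREATE_SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS reporting_events (
    entity_id   TEXT NOT NULL,
    entity_type TEXT NOT NULL,
    payload     JSONB NOT NULL,
    ingested_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""

def build_insert(envelope):
    """Render the parameterized INSERT the storage flow would execute
    for one intermediary envelope (a dict with entityId/entityType/payload)."""
    sql = ("INSERT INTO reporting_events (entity_id, entity_type, payload) "
           "VALUES (%s, %s, %s)")
    params = (envelope["entityId"], envelope["entityType"], envelope["payload"])
    return sql, params
```

In NiFi terms, the one-time schema creation and the per-record inserts would likely map to database processors (for example PutSQL or PutDatabaseRecord against a PostgreSQL connection pool), which is what keeps this set of flows separate and swappable per storage type.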
Jira structure:
EPIC: https://openlmis.atlassian.net/browse/OLMIS-4901
Needs a ticket: Ingest data and set up the intermediate schema