Reporting Stack Vision Notes
Presenter: Josh Zamor
- Presentation: link to be added
Where We Are
- NiFi handles batch ingestion, transform, and sink
- Superset sits on top of a SQL reporting database
- NiFi is central and plays a big role
- What would be required in this setup to surface an extra data field from a requisition in a report? A lot of coordination and knowledge of NiFi (see the sketch after this list)
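For orientation, below is a minimal sketch (in plain Java rather than NiFi) of what the current batch flow effectively does: pull a page of requisitions from a service API, flatten it, and write rows to the reporting database. The endpoint, token handling, JSON field names, and reporting table are illustrative placeholders, not the actual OpenLMIS API or reporting schema. In the real setup this logic is spread across several NiFi processors, which is why exposing one extra field means coordinated changes to the extract, the transform, and the sink.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class BatchRequisitionPull {
    public static void main(String[] args) throws Exception {
        // 1. Extract: pull a page of requisitions from a (placeholder) service API.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://openlmis.example.org/api/requisitions/search?page=0&size=500"))
                .header("Authorization", "Bearer " + System.getenv("OLMIS_TOKEN"))
                .build();
        String body = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString()).body();

        // 2. Transform and 3. Load: flatten each requisition into a reporting row.
        JsonNode page = new ObjectMapper().readTree(body);
        try (Connection db = DriverManager.getConnection(
                     "jdbc:postgresql://reporting-db:5432/reporting", "reports", "secret");
             PreparedStatement insert = db.prepareStatement(
                     "INSERT INTO requisition_report (id, facility, status) VALUES (?, ?, ?)")) {
            for (JsonNode req : page.get("content")) {
                insert.setString(1, req.get("id").asText());
                insert.setString(2, req.get("facilityId").asText());
                insert.setString(3, req.get("status").asText());
                // Adding one more report field means changing this mapping,
                // the reporting table, and every downstream view.
                insert.addBatch();
            }
            insert.executeBatch();
        }
    }
}
```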
Where We Want To Go
Position the Reporting Stack as central for empowering analysis, assisting with integrations, and enabling smart workflows.
- Analysis, integrations, smarter workflows
- Tools, infrastructure, community practices for all things reports
- Open standards for integrations
- Smarter workflows, e.g. resupply that knows the fridge state (via RTM), how much is needed and by when, sourced from an HMIS (e.g. DHIS2)
- Match OpenLMIS for upgradeability, reliability, community contribution
- Upgradeability: out-of-the-box report upgrades, as OpenLMIS does
- Reliability: the stack "just works" when starting, stopping, upgrading, etc.
- An implementation should have clear expectations for how it can build, modify, and share reports with the community
What's In Our Way
- Batch oriented
- API
- Requires correct configuration (user, roles)
- APIs have had to be modified to support mass data ingestion
- Finicky scheduling
- A consequence of working in batch operations
- Write operations need coordination, which will become more complex as we add more data management
- Schema Management
How We're Going To Get There
- More boxes, separate out responsibilities
- NiFi no longer pulls from the reference data and requisition services
- Services stream their changes to Kafka and Kafka Connect, using data pumps, pub/sub, or CDC (e.g. Debezium); see the connector sketch after this list
- NiFi processors transform messages from an input topic to an output topic (see the transform sketch after this list)
- Moving away from authentication
- Moving away from hitting APIs
- A Connect sink writes to the reporting db (which now uses Flyway)
- Schema Registry for schema management (see the Avro producer sketch after this list)
- Monitoring (Prometheus and Grafana): how long the pipeline takes, from data entering the requisition service to data being available for reporting (see the metrics sketch after this list)
- Services may now access the reporting db for smarter workflows (like advanced forecasting)
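A minimal sketch of the CDC piece above: registering a Debezium PostgreSQL connector with a Kafka Connect worker so requisition changes stream into Kafka without going through service APIs. The worker URL (connect:8083), database host, credentials, table list, and topic prefix are placeholders, and exact property names vary by Debezium version.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterRequisitionCdcConnector {
    public static void main(String[] args) throws Exception {
        // Connector definition: Debezium tails the requisition service's
        // Postgres WAL and publishes change events to Kafka topics.
        String connector = """
            {
              "name": "requisition-cdc",
              "config": {
                "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
                "plugin.name": "pgoutput",
                "database.hostname": "requisition-db",
                "database.port": "5432",
                "database.user": "debezium",
                "database.password": "secret",
                "database.dbname": "open_lmis",
                "topic.prefix": "requisition",
                "table.include.list": "requisition.requisitions,requisition.requisition_line_items"
              }
            }
            """;

        // Kafka Connect's REST API accepts new connector definitions via POST /connectors.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://connect:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(connector))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Once registered, Connect manages the connector's lifecycle, so the pipeline no longer needs to authenticate against the service APIs to read data.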
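The transform step keeps a simple topic-in/topic-out contract; whether it runs as a NiFi processor or as plain Kafka client code is an implementation detail. A sketch of that contract using the plain Kafka consumer/producer API, with placeholder topic names and a stubbed-out mapping:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RequisitionTransform {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "kafka:9092");
        consumerProps.put("group.id", "requisition-transform");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "kafka:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(List.of("requisition.raw"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    // Reshape the change event into the reporting model,
                    // then publish to the output topic consumed by the sink.
                    String reportingRow = toReportingShape(record.value());
                    producer.send(new ProducerRecord<>("requisition.reporting", record.key(), reportingRow));
                }
            }
        }
    }

    // Placeholder mapping; in the proposed design this logic lives in a NiFi processor.
    private static String toReportingShape(String changeEvent) {
        return changeEvent;
    }
}
```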
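With a Schema Registry, the message schema becomes the managed contract between producers and the reporting sink. A sketch of a producer publishing Avro records through Confluent's Avro serializer; the registry URL, topic, and record fields are placeholders:

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroRequisitionProducer {
    // Placeholder schema for a flattened requisition reporting event.
    private static final Schema SCHEMA = new Schema.Parser().parse("""
        {"type": "record", "name": "RequisitionReportingEvent",
         "fields": [{"name": "id", "type": "string"},
                    {"name": "status", "type": "string"}]}
        """);

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Confluent's Avro serializer registers/validates the schema with the registry,
        // so schema changes are checked for compatibility instead of breaking the sink.
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://schema-registry:8081");

        GenericRecord event = new GenericData.Record(SCHEMA);
        event.put("id", "req-123");
        event.put("status", "APPROVED");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("requisition.reporting", (String) event.get("id"), event));
        }
    }
}
```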
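For the monitoring point, a sketch of the end-to-end lag metric (time from a change landing in the requisition service to the corresponding row being available for reporting), exposed for Prometheus and graphed in Grafana. It uses the Prometheus Java simpleclient; the port, metric name, and buckets are placeholder choices:

```java
import io.prometheus.client.Histogram;
import io.prometheus.client.exporter.HTTPServer;

public class PipelineLagMetrics {
    // End-to-end lag: time between a change being written in the requisition
    // service and the corresponding row landing in the reporting db.
    private static final Histogram PIPELINE_LAG_SECONDS = Histogram.build()
            .name("reporting_pipeline_lag_seconds")
            .help("Delay from source change event to reporting db availability")
            .buckets(1, 5, 15, 60, 300, 900)
            .register();

    public static void main(String[] args) throws Exception {
        // Expose /metrics for Prometheus to scrape; Grafana dashboards read from Prometheus.
        HTTPServer metricsEndpoint = new HTTPServer(9404);
        // Demo observation: an event whose source timestamp was 42 seconds ago.
        recordLag(System.currentTimeMillis() - 42_000L);
    }

    // Called by the sink step for each record it writes, using the source event timestamp.
    static void recordLag(long sourceTimestampMillis) {
        double lagSeconds = (System.currentTimeMillis() - sourceTimestampMillis) / 1000.0;
        PIPELINE_LAG_SECONDS.observe(lagSeconds);
    }
}
```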
Q&A
- The reporting db currently has materialized views; in this new paradigm, who would be in charge of scheduling aggregations?
- NiFi may still handle the scheduling piece, or it could be driven from a Kafka topic; these are details to work out
- We don't expect to change things wholesale; we will likely take an incremental approach to changing the architecture
- How much will reporting stack be prioritized going forward?
- Likely high on the priority list
- Reporting has always been a point of variability in implementations
- Any changes with Superset? No
- Why Prometheus and Grafana for monitoring and not Scalyr?
- They do different things; Scalyr is more for searching logs in the cloud, while Prometheus and Grafana are more for metrics