Reporting Stack Vision Notes

 

Presenter: Josh Zamor

  • Presentation: link to be added (Josh Zamor)

Where We Are

  • Nifi batch ingestion, transform, sink
  • Superset on SQL
  • Nifi is central and plays a big role
  • Example: what would it take in this setup to show an extra data field from a requisition in a report? A lot of coordination and knowledge of Nifi

Where We Want To Go

Position the Reporting Stack as central for empowering analysis, assisting with integrations, and enabling smart workflows.

  • Analysis, integrations, smarter workflows
    • Tools, infrastructure, community practices for all things reports
    • Open standards for integrations
    • Smarter workflows, e.g. resupply that knows about the fridge (because of RTM) and how much is needed and by when, sourced from an HMIS (e.g. DHIS2)
  • Match OpenLMIS for upgradeability, reliability, community contribution
    • Upgradeability: out-of-the-box report upgrades, as OpenLMIS does
    • Reliability: the stack "just works" when starting, stopping, upgrading, etc.
    • Community contribution: an implementation should have clear expectations for how it can build, modify, and share reports with the community

What's In Our Way

  • Batch oriented
  • API
    • Requires correct configuration (user, roles)
    • APIs have been (and have to be) modified to support mass data ingestion
  • Finicky Scheduling
    • A consequence of working in batch operations
    • Write operations need coordination, which will become more complex as we add more data management
  • Schema Management

How We're Going To Get There

  • More boxes, separate out responsibilities
  • Nifi no longer pulls from reference data and requisition services
  • Services stream their changes to Kafka and Connect, using data pumps, pub/sub, or CDC (Debezium)
  • Nifi processors transform messages from an input topic to an output topic
    • Moving away from authentication
    • Moving away from hitting APIs
  • Connect sink to reporting db (which now has Flyway)
  • Schema Registry for schema management
  • Monitoring (Prometheus and Grafana): how long the pipeline takes, from data going into requisition to data being available for reporting
  • Services may now access the reporting db for smarter workflows (like advanced forecasting)
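
Illustration (not from the presentation): a minimal sketch of the "input topic to output topic" transform step and the pipeline-latency measurement mentioned above. Plain Python lists stand in for Kafka topics, and nothing here calls a real Kafka, Nifi, or Connect API. The names ChangeEvent, transform, pipeline_latency_seconds, and run_pipeline, as well as the event fields, are hypothetical and chosen only for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class ChangeEvent:
    """Hypothetical change event a service might publish to an input topic."""
    entity: str        # e.g. "requisition"
    payload: dict      # raw fields as emitted by the service
    emitted_at: float  # epoch seconds when the service recorded the change


def transform(event: ChangeEvent) -> dict:
    """Flatten a raw change event into a single reporting-friendly row."""
    row = {"entity": event.entity, "emitted_at": event.emitted_at}
    for key, value in event.payload.items():
        if isinstance(value, (dict, list)):
            # Nested values are stored as JSON text so the row stays flat.
            row[key] = json.dumps(value, sort_keys=True)
        else:
            row[key] = value
    return row


def pipeline_latency_seconds(row: dict, available_at: float) -> float:
    """Time from the change being emitted to the row being available."""
    return available_at - row["emitted_at"]


def run_pipeline(input_topic: list, now: float):
    """Transform every event and report the latency of each resulting row."""
    output_topic = [transform(event) for event in input_topic]
    latencies = [pipeline_latency_seconds(row, now) for row in output_topic]
    return output_topic, latencies


# In-memory stand-in for an input topic holding one requisition change.
events = [
    ChangeEvent(
        entity="requisition",
        payload={"id": "r-1", "status": "SUBMITTED", "extraData": {"note": "urgent"}},
        emitted_at=100.0,
    )
]
rows, latencies = run_pipeline(events, now=102.5)
```

In this sketch, a new field in the requisition payload flows through to the output row without any change to the transform. In a real deployment the latency value would be exported as a metric rather than returned from a function.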

Q&A

  • The reporting db currently has materialized views. In this new paradigm, who would be in charge of scheduling aggregations?
    • Nifi may still handle the scheduling piece, or it could be driven by a Kafka topic; that is an implementation detail
    • We don't expect to change things wholesale, will likely have an incremental approach to changing architecture
  • How much will reporting stack be prioritized going forward?
    • Likely high up in priority
    • Reporting has always been a point of variability in implementations
  • Any changes with Superset? No
  • Why Prometheus and Grafana for monitoring and not Scalyr?
    • They do different things; Scalyr is more for searching logs in the cloud, and the others are more for metrics
