NiFi

Improvement: store all NiFi process groups in version control

DB Backup

Appears to run once a day around midnight

Populate PAV stockouts table from SELV

Appears to run once a day around 4am
Provinces and districts are manually mapped to match DHIS2. ARMAZENS INTERMEDIARIOS P and CENTRAL LEVEL are excluded
Bug: every run reads all rows from the facility visits report table and tries to insert them all into the stockouts table, which creates many errors (these are logged). Improvement: only add new rows, as "Populate PAV selv_openlmis table from SELV" does
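One way to add only new rows is an anti-join insert, sketched below. This is a minimal sketch, not the existing NiFi flow: it uses an in-memory SQLite database so it can run on its own, and the table and column names (facility_visits, stockouts, visit_id, district, stocked_out) are hypothetical, since these notes do not give the real schema.

```python
import sqlite3

# Hypothetical schema: the real table and column names are not given in these notes.
SETUP_SQL = """
CREATE TABLE facility_visits (visit_id INTEGER PRIMARY KEY, district TEXT, stocked_out INTEGER);
CREATE TABLE stockouts (visit_id INTEGER PRIMARY KEY, district TEXT, stocked_out INTEGER);
"""

# Copy only the source rows whose key is not yet in the target table.
INSERT_NEW_ROWS_SQL = """
INSERT INTO stockouts (visit_id, district, stocked_out)
SELECT v.visit_id, v.district, v.stocked_out
FROM facility_visits AS v
WHERE NOT EXISTS (SELECT 1 FROM stockouts AS s WHERE s.visit_id = v.visit_id)
"""


def sync_stockouts(conn):
    """Insert new rows only and return how many rows were added."""
    before = conn.execute("SELECT COUNT(*) FROM stockouts").fetchone()[0]
    with conn:  # commits on success, rolls back on error
        conn.execute(INSERT_NEW_ROWS_SQL)
    after = conn.execute("SELECT COUNT(*) FROM stockouts").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.executescript(SETUP_SQL)
conn.executemany(
    "INSERT INTO facility_visits VALUES (?, ?, ?)",
    [(1, "District A", 0), (2, "District B", 1)],
)
first_run = sync_stockouts(conn)   # both rows are new
conn.execute("INSERT INTO facility_visits VALUES (3, 'District C', 1)")
second_run = sync_stockouts(conn)  # only the row added since the first run
```

Because rows already present are filtered out before the insert, repeated runs do not produce duplicate-key errors.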

Populate PAV selv_openlmis table from SELV

Appears to run once a day around 1am
Unclear what this table is used for

Populate coverage, utilization and wastage data from DHIS2

Appears to run once a day around 2am
- There is a manual processor (stopped) that can trigger a run once per minute
Data is only pulled from DHIS2 on the 1st of the month (to get data for the previous month) or the 11th of the month (to get data for the current month)
To load DHIS2 data manually, especially for previous months:
- Create a link from the "Split organisationunitid as attribute" processor to the manual "Update period attribute" processor
- Set the "pe" attribute of the "Update period attribute" processor to the DHIS2 period you want to load
- Start the whole process group
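The 1st/11th rule above can be written as a small function. This is a sketch only: the function name is hypothetical, and it assumes the "pe" value uses the DHIS2 monthly period format YYYYMM (for example 201909 for September 2019).

```python
from datetime import date


def dhis2_period_for_run(run_date):
    """Return the monthly period (YYYYMM) to load on run_date, or None to skip the run."""
    if run_date.day == 1:
        # 1st of the month: load the previous month, rolling back over the year boundary.
        if run_date.month == 1:
            year, month = run_date.year - 1, 12
        else:
            year, month = run_date.year, run_date.month - 1
    elif run_date.day == 11:
        # 11th of the month: load the current month.
        year, month = run_date.year, run_date.month
    else:
        return None
    return f"{year}{month:02d}"


period_on_first = dhis2_period_for_run(date(2019, 10, 1))
period_on_eleventh = dhis2_period_for_run(date(2019, 10, 11))
```

For a manual load of an older month, the same YYYYMM string would be the value to type into the "pe" attribute.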

Populate CE, RED/REC, ESISTAFE from email

Polls the dashboard-pav@openlmis.org email inbox continuously
This processor group seems to be the most error prone
Only continues if the From email address is in an accepted list of senders (which is in "Check if email is in list of senders" processor)
Ignores emails without attachments
Although all attachments are converted to CSV, sending a CSV attachment directly does not seem to work well
For the email and attachment to be processed properly, all of the following must be correct:
- The subject must match an exact format (either "2019-09-Monthly-Report" for September 2019, or "2019-Period-3-Report" for 3rd quarter 2019)
- The filename must be exactly correct (trimestral_reports_mb, trimestral_reports_ce, red_rec, esistafe), for example trimestral_reports_mb.xlsx
- The sheet name must match the filename, except for esistafe, which should have a sheet name of "sheet1"
- Province and district names must be exactly correct (typos or incorrect entries will cause the pipeline to fail)
- For trimestral_reports_ce, the media column should be either radio or palestra (or lecture)
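The subject, filename and sheet name rules above could be checked before sending an email, along these lines. This is a sketch under assumptions: the function names are hypothetical, the period number is assumed to be a quarter from 1 to 4, and matching is assumed to be case-sensitive. It does not reproduce what the NiFi processors actually do.

```python
import re
from pathlib import PurePath

# Subject: "YYYY-MM-Monthly-Report" or "YYYY-Period-N-Report".
# Assumption: N is a quarter number from 1 to 4.
SUBJECT_PATTERN = re.compile(r"\d{4}-(?:(?:0[1-9]|1[0-2])-Monthly|Period-[1-4])-Report")
ALLOWED_STEMS = {"trimestral_reports_mb", "trimestral_reports_ce", "red_rec", "esistafe"}


def expected_sheet_name(filename):
    """Return the sheet name required for filename, or None if the name is not accepted."""
    stem = PurePath(filename).stem
    if stem not in ALLOWED_STEMS:
        return None
    # esistafe is the one exception: its sheet must be called "sheet1".
    return "sheet1" if stem == "esistafe" else stem


def check_submission(subject, filename, sheet_name):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if not SUBJECT_PATTERN.fullmatch(subject):
        problems.append(f"unexpected subject: {subject!r}")
    expected = expected_sheet_name(filename)
    if expected is None:
        problems.append(f"unexpected filename: {filename!r}")
    elif sheet_name != expected:
        problems.append(f"sheet should be named {expected!r}, got {sheet_name!r}")
    return problems


ok = check_submission("2019-09-Monthly-Report", "trimestral_reports_mb.xlsx", "trimestral_reports_mb")
bad = check_submission("September report", "report.xlsx", "Sheet1")
```

Returning a list of specific problems, rather than a single pass/fail flag, makes it easier to tell the sender exactly what to fix.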
There is a wait/notify pattern: the "Get statement type" processor determines whether the data already exists in the database, then chooses INSERT or UPDATE for the PutDatabaseRecord processor in the "Insert records and send email" process group
- Bug: with this wait/notify setup the wrong set of records might be put into the database, because records are released one at a time and the same key, "putRecord", is used for all of them
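A toy model of how a shared key can mix up records is sketched below. It is not NiFi's Wait/Notify implementation: a plain dictionary stands in for the signal cache, and the record ids and function name are hypothetical. It only illustrates one way the decision for one record can end up applied to another when every record uses the key "putRecord".

```python
def release_with_key(records, key_for):
    """Store each record's statement type under its key, then read it back per record."""
    cache = {}
    # "Notify" side: write the chosen statement type under the record's key.
    for record in records:
        cache[key_for(record)] = record["statement_type"]
    # "Wait" side: each record is released with whatever is stored under its key.
    return [(record["id"], cache[key_for(record)]) for record in records]


records = [
    {"id": "a", "statement_type": "INSERT"},
    {"id": "b", "statement_type": "UPDATE"},
]

# Shared key: the second write overwrites the first, so record "a" gets UPDATE.
shared = release_with_key(records, lambda record: "putRecord")

# Per-record key: each record reads back its own decision.
per_record = release_with_key(records, lambda record: record["id"])
```

Under this model, giving each record its own key (for example a per-flowfile identifier) would keep the INSERT/UPDATE decision attached to the right record; whether that maps cleanly onto the current flow still needs to be checked in NiFi.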
The same error email is sent regardless of which error occurred, and the flowfile attributes do not help in diagnosing it. Improvement: send different emails for different errors and include more useful diagnostic information in the error email
A success email is sent if the data is put into the database

Refresh materialized views

Appears to run once a day around 7am
Views refreshed: dhis2_antigens_mv, dhis2_vaccine_breakage_mv, selv_mv, trimestral_reports_mb_mv, trimestral_reports_ce_mv, selv_country_stockout_mv, selv_province_stockout_mv, selv_district_stockout_mv_v4, redrec_mv_v2, esistafe_mv, population_mv, combined_data_mv
Unclear what selv_mv is used for, does not seem to be used in Superset
Unclear if combined_data_mv is really necessary; it is a view combining a bunch of other views, so if a view needs to be modified, this one needs to be dropped and re-added
Improvement: redrec_mv_v2 should replace redrec_mv
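For reference, the refresh statements for the views listed above can be generated as follows. This is a sketch that assumes PostgreSQL syntax (REFRESH MATERIALIZED VIEW) and assumes combined_data_mv should be refreshed last because it is built from the other views; the function name is hypothetical and the notes do not show the statements the NiFi flow actually runs.

```python
import re

# Order taken from the list above, with the combined view last.
VIEWS_IN_REFRESH_ORDER = [
    "dhis2_antigens_mv",
    "dhis2_vaccine_breakage_mv",
    "selv_mv",
    "trimestral_reports_mb_mv",
    "trimestral_reports_ce_mv",
    "selv_country_stockout_mv",
    "selv_province_stockout_mv",
    "selv_district_stockout_mv_v4",
    "redrec_mv_v2",
    "esistafe_mv",
    "population_mv",
    "combined_data_mv",
]

IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def refresh_statements(views):
    """Build one REFRESH statement per view, rejecting names that are not plain identifiers."""
    statements = []
    for name in views:
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain identifier: {name!r}")
        statements.append(f"REFRESH MATERIALIZED VIEW {name};")
    return statements


statements = refresh_statements(VIEWS_IN_REFRESH_ORDER)
```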

Update Province username mappings

Populates the province_username_filters table, which restricts the province data shown in Superset based on username
This needs to be updated whenever users need to see different provinces than they currently do, or new users are added
The table must be cleared before running this process group. Improvement: do this automatically when running the process group
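Clearing and reloading the table in one transaction is one way to automate this. The sketch below uses an in-memory SQLite database so it runs on its own; the column names (province, username), the sample values and the function name are hypothetical, since the real table definition is not given in these notes.

```python
import sqlite3


def replace_province_filters(conn, mappings):
    """Clear province_username_filters and reload it in a single transaction."""
    with conn:  # commits on success, rolls back on error, so the table is never left half-loaded
        conn.execute("DELETE FROM province_username_filters")
        conn.executemany(
            "INSERT INTO province_username_filters (province, username) VALUES (?, ?)",
            mappings,
        )


conn = sqlite3.connect(":memory:")
# Hypothetical columns for illustration only.
conn.execute("CREATE TABLE province_username_filters (province TEXT, username TEXT)")

replace_province_filters(conn, [("Province A", "user_a"), ("Province B", "user_b")])
replace_province_filters(conn, [("Province A", "user_a")])
remaining = conn.execute(
    "SELECT province, username FROM province_username_filters"
).fetchall()
```

Running the load twice leaves only the rows from the latest run, which is the behaviour the manual clear step is meant to guarantee.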

Create initial PAV tables, Populate initial organization levels

These generally can be ignored; they are only necessary if the database needs to be re-created

Superset DB

Need to keep track of the views stored in the PAV database, as they are what Superset uses to build its tables, charts and dashboards
- dhis2_antigens_mv - for Coverage (Cobertura)
- dhis2_vaccine_breakage_mv - for Utilization (Utilização) and Breakage (Quebra)
- trimestral_reports_mb_mv and trimestral_reports_ce_mv - for Community Engagement (Envolvimento Comunitário)
- selv_country_stockout_mv, selv_province_stockout_mv, selv_district_stockout_mv_v4 - for Stockout (Ruptura)
  - Improvement?: might be able to consolidate these views into one, but it might affect performance
- redrec_mv_v2 - for RED/REC
- esistafe_mv - for HSS
- combined_data_mv - this was created to have one view with all of the data, but it may not be necessary
  - Improvement?: might be better to create this combined view in Superset rather than in the db, because whenever an underlying view is dropped and re-created, this one has to be dropped and re-created too
  - Or we might do away with the combined view altogether, and just use individual views
Many views have a usernames field, which allows filtering data by user, and a time_slice field, which allows filtering by quarterly periods

Superset

Each dashboard (Central and Provincial) has its own filter box
The filter box has a periodo filter for looking at historical time slices (last full month, last full quarter, last full semester, last full three quarters, year-to-date); this is what the time_slice field in each view is for
All charts in the Provincial dashboard filter by the current username, so that users only see data for their own province. Some users are supposed to see all provinces, so their usernames are in the usernames field for all rows
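The username filtering described above amounts to the row-level check sketched here. This is an illustration only: it assumes the usernames field holds a comma-separated string, which these notes do not confirm, and the row contents, usernames and function name are hypothetical. It is not the filter Superset itself applies.

```python
def rows_visible_to(rows, username):
    """Keep the rows whose usernames field lists the given user."""
    visible = []
    for row in rows:
        # Assumption: usernames is a comma-separated string such as "user_a,national_user".
        allowed = {name.strip() for name in row["usernames"].split(",")}
        if username in allowed:
            visible.append(row)
    return visible


rows = [
    {"province": "Province A", "usernames": "user_a, national_user"},
    {"province": "Province B", "usernames": "user_b, national_user"},
]

provincial_view = [row["province"] for row in rows_visible_to(rows, "user_a")]
national_view = [row["province"] for row in rows_visible_to(rows, "national_user")]
```

A user who should see every province simply appears in the usernames field of every row, as national_user does here.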
Improvement: there are many unused tables and charts; they need to be removed
Improvement: not much work has been done on the RED/REC tab or on the HSS tab and its separate dashboard, due to lack of test data; these will need to be worked on

PAV Dashboard

Dashboard Pipeline Research Notes
