Dashboard Pipeline Research Notes
NiFi
Improvement: save all NiFi process groups in version control
DB Backup
- Appears to run once a day around midnight
Populate PAV stockouts table from SELV
- Appears to run once a day around 4am
- Provinces and districts are manually mapped to match DHIS2; ARMAZENS INTERMEDIARIOS P and CENTRAL LEVEL are excluded
- Bug: the flow reads every row from the facility visits report table and tries to insert them all into the stockouts table, which produces many errors (these are logged). This process should be changed to add only new rows, as "Populate PAV selv_openlmis table from SELV" does
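One way to add only new rows is to filter the insert against what is already in the target table. The sketch below is illustrative only: it uses an in-memory SQLite database, and the table layout and the visit_id key column are assumptions, not the actual PAV schema or the NiFi flow's SQL.

```python
import sqlite3


def copy_new_rows(conn: sqlite3.Connection) -> int:
    """Copy only visits that are not yet in stockouts; return how many were added."""
    before = conn.execute("SELECT COUNT(*) FROM stockouts").fetchone()[0]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            INSERT INTO stockouts (visit_id, facility)
            SELECT v.visit_id, v.facility
            FROM facility_visits AS v
            WHERE NOT EXISTS (
                SELECT 1 FROM stockouts AS s WHERE s.visit_id = v.visit_id
            )
            """
        )
    after = conn.execute("SELECT COUNT(*) FROM stockouts").fetchone()[0]
    return after - before


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny demo database (hypothetical columns, not the real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE facility_visits (visit_id INTEGER PRIMARY KEY, facility TEXT);
        CREATE TABLE stockouts (visit_id INTEGER PRIMARY KEY, facility TEXT);
        INSERT INTO facility_visits VALUES (1, 'A'), (2, 'B'), (3, 'C');
        INSERT INTO stockouts VALUES (1, 'A');
        """
    )
    return conn
```

Running copy_new_rows a second time adds nothing, so a daily run would no longer fail on rows that were already copied.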
Populate PAV selv_openlmis table from SELV
- Appears to run once a day around 1am
- Unclear what this table is used for
Populate coverage, utilization and wastage data from DHIS2
- Appears to run once a day around 2am
- There is a manual processor (stopped) that can trigger a run once per minute
- Only grabs data from DHIS2 on the 1st of the month (to grab data for the previous month) or on the 11th of the month (to grab data for the current month)
- To grab DHIS2 data manually, especially to load data from previous months: create a link from the "Split organisationunitid as attribute" processor to the manual "Update period attribute" processor, set the latter processor's "pe" attribute to the DHIS2 period you want to load, then start the whole process group
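The date rule above can be sketched as a small function. This is not the expression used in the NiFi flow; the function name is hypothetical, and it assumes the "pe" attribute takes DHIS2's monthly period format (yyyyMM, for example 201909).

```python
from datetime import date
from typing import Optional


def dhis2_period_for_run(today: date) -> Optional[str]:
    """Return the DHIS2 monthly period to load on this date, or None to skip the run."""
    if today.day == 1:
        # 1st of the month: load the previous month (rolling back over January).
        if today.month == 1:
            year, month = today.year - 1, 12
        else:
            year, month = today.year, today.month - 1
    elif today.day == 11:
        # 11th of the month: load the current month.
        year, month = today.year, today.month
    else:
        return None
    return f"{year}{month:02d}"
```

For a manual reload of an older month, the same yyyyMM string is what would be typed into the "pe" attribute, under the format assumption above.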
Populate CE, RED/REC, ESISTAFE from email
- Constantly checks the dashboard-pav@openlmis.org email inbox
- This process group seems to be the most error-prone
- Only continues if the From email address is in an accepted list of senders (which is in "Check if email is in list of senders" processor)
- Ignores emails without attachments
- Even though all attachments are converted to CSV, sending a CSV attachment does not seem to work well
- For the email and attachment to be processed properly, all of the following must be correct:
  - The subject must match an exact format (either "2019-09-Monthly-Report" for September 2019, or "2019-Period-3-Report" for the 3rd quarter of 2019)
  - The filename must be exactly one of trimestral_reports_mb, trimestral_reports_ce, red_rec or esistafe, plus the extension. Example: trimestral_reports_mb.xlsx
  - The sheet name must match the filename, except for esistafe, which should have a sheet name of "sheet1"
  - Province and district names must be exactly correct (typos or incorrect entries will cause the pipeline to fail)
  - For trimestral_reports_ce, the media column should be either radio or palestra (or lecture)
- There is a wait/notify pattern where the "Get statement type" processor determines whether the data already exists in the database, then chooses INSERT or UPDATE for the PutDatabaseRecord processor in the "Insert records and send email" process group
  - Bug: this wait/notify can put the wrong set of records into the database, because one record is released at a time and the same key "putRecord" is used for all records
- The same error email is sent no matter which error happens, and the flowfile attributes do not help in diagnosing the error. Improvement: send different emails and give more useful diagnostic information in the error email
- A success email is sent if the data is put into the database
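The subject, filename and sheet-name rules listed above could be checked before sending a report, for example with a sketch like the following. The function names and regular expressions are illustrative and are not taken from the NiFi flow; the sketch also assumes any file extension is allowed, since the notes only give .xlsx as an example.

```python
import re
from typing import Optional, Tuple

MONTHLY = re.compile(r"(\d{4})-(0[1-9]|1[0-2])-Monthly-Report")
QUARTERLY = re.compile(r"(\d{4})-Period-([1-4])-Report")
ALLOWED_STEMS = {"trimestral_reports_mb", "trimestral_reports_ce", "red_rec", "esistafe"}


def parse_subject(subject: str) -> Optional[Tuple[str, int, int]]:
    """Return (kind, year, month_or_quarter), or None if the subject is not exact."""
    m = MONTHLY.fullmatch(subject)
    if m:
        return ("monthly", int(m.group(1)), int(m.group(2)))
    m = QUARTERLY.fullmatch(subject)
    if m:
        return ("quarterly", int(m.group(1)), int(m.group(2)))
    return None


def expected_sheet_name(filename: str) -> Optional[str]:
    """Return the sheet name the attachment should contain, or None for a bad filename."""
    stem, dot, _extension = filename.rpartition(".")
    if not dot or stem not in ALLOWED_STEMS:
        return None
    # esistafe is the one exception: its sheet must be named "sheet1".
    return "sheet1" if stem == "esistafe" else stem
```

A check like this would only catch formatting mistakes; province and district names would still need to be compared against the names the pipeline expects.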
Refresh materialized views
- Appears to run once a day around 7am
- Views refreshed: dhis2_antigens_mv, dhis2_vaccine_breakage_mv, selv_mv, trimestral_reports_mb_mv, trimestral_reports_ce_mv, selv_country_stockout_mv, selv_province_stockout_mv, selv_district_stockout_mv_v4, redrec_mv_v2, esistafe_mv, population_mv, combined_data_mv
- Unclear what selv_mv is used for; it does not seem to be used in Superset
- Unclear if combined_data_mv is really necessary; it is a view combining a bunch of other views, so if a view needs to be modified, this one needs to be dropped and re-added
- Improvement: redrec_mv_v2 should replace redrec_mv
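If the views ever need to be refreshed by hand, the statements can be generated from the list above. The sketch assumes the PAV database is PostgreSQL (where REFRESH MATERIALIZED VIEW is the standard statement) and keeps the order given in these notes, with combined_data_mv last so that it is rebuilt after the views it combines. Whether the NiFi flow uses this order is not confirmed.

```python
import re
from typing import Iterable, List

# View names copied from the notes, in the order listed there.
VIEWS_IN_REFRESH_ORDER = [
    "dhis2_antigens_mv",
    "dhis2_vaccine_breakage_mv",
    "selv_mv",
    "trimestral_reports_mb_mv",
    "trimestral_reports_ce_mv",
    "selv_country_stockout_mv",
    "selv_province_stockout_mv",
    "selv_district_stockout_mv_v4",
    "redrec_mv_v2",
    "esistafe_mv",
    "population_mv",
    "combined_data_mv",
]


def refresh_statements(views: Iterable[str]) -> List[str]:
    """Build one REFRESH statement per view, rejecting anything but plain identifiers."""
    statements = []
    for name in views:
        if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
            raise ValueError(f"unexpected view name: {name!r}")
        statements.append(f"REFRESH MATERIALIZED VIEW {name};")
    return statements
```

The resulting statements could be pasted into a SQL client one at a time, which also makes it easier to see which view fails if one does.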
Update Province username mappings
- Populates the province_username_filters table, which limits the province data shown in Superset based on username
- This needs to be updated whenever users need to see different provinces than they currently do, or new users are added
- The table must be cleared before running this process group. Improvement: do this automatically when running the process group
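One way to make the clearing automatic is to delete and repopulate the table inside a single transaction, so a failed load leaves the old mappings in place. The sketch uses an in-memory SQLite database for illustration; the username and province columns are assumptions, not the real table definition.

```python
import sqlite3
from typing import Iterable, Tuple


def replace_mappings(conn: sqlite3.Connection,
                     mappings: Iterable[Tuple[str, str]]) -> None:
    """Clear province_username_filters and reload it in one transaction."""
    with conn:  # commits if both statements succeed, rolls back otherwise
        conn.execute("DELETE FROM province_username_filters")
        conn.executemany(
            "INSERT INTO province_username_filters (username, province) VALUES (?, ?)",
            mappings,
        )


def make_demo_db() -> sqlite3.Connection:
    """Create a demo table with one existing mapping (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE province_username_filters "
        "(username TEXT NOT NULL, province TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO province_username_filters VALUES ('old_user', 'Gaza')")
    conn.commit()
    return conn
```

In NiFi the same effect would need the delete and the inserts to run against one connection and transaction, which is a design question for the process group rather than something this sketch settles.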
Create initial PAV tables, Populate initial organization levels
- These generally can be ignored; they are only necessary if the database needs to be re-created
Superset DB
- Need to keep track of the views stored in the PAV database, as they are what Superset uses to build its tables, charts and dashboards
  - dhis2_antigens_mv - for Coverage (Cobertura)
  - dhis2_vaccine_breakage_mv - for Utilization (Utilização) and Breakage (Quebra)
  - trimestral_reports_mb_mv and trimestral_reports_ce_mv - for Community Engagement (Envolvimento Comunitário)
  - selv_country_stockout_mv, selv_province_stockout_mv, selv_district_stockout_mv_v4 - for Stockout (Ruptura)
    - Improvement?: might be able to consolidate these views into one, but it might affect performance
  - redrec_mv_v2 - for RED/REC
  - esistafe_mv - for HSS
  - combined_data_mv - this was created to have one view with all of the data, but it may not be necessary
    - Improvement?: might be better to create this combined view in Superset rather than in the db, because if any underlying view needs to be dropped and re-created, this one must be dropped and re-created too
    - Or we might do away with the combined view altogether, and just use the individual views
- Many views have a usernames field, which allows filtering of data by user, and a time_slice field, which allows filtering by quarterly periods
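The per-user filtering can be pictured as a simple row filter. The sketch below assumes the usernames field holds a comma-separated list of usernames; the notes do not say how the field is actually stored, so treat the format, the function name and the sample usernames as hypothetical.

```python
from typing import Dict, List


def visible_rows(rows: List[Dict[str, str]], username: str) -> List[Dict[str, str]]:
    """Keep only the rows whose usernames field lists the given user."""
    visible = []
    for row in rows:
        # Assumed format: "user_one,user_two" (comma-separated, optional spaces).
        allowed = {name.strip() for name in row["usernames"].split(",")}
        if username in allowed:
            visible.append(row)
    return visible


SAMPLE_ROWS = [
    {"province": "Maputo", "usernames": "national_admin, maputo_user"},
    {"province": "Gaza", "usernames": "national_admin, gaza_user"},
]
```

A user who should see every province simply appears in the usernames field of every row, as national_admin does in the sample data.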
Superset
- Each dashboard (Central and Provincial) has its own filter box
- The filter box has periodo, which allows looking at historical time slices (last full month, last full quarter, last full semester, last full three quarters, year-to-date); this is what the time_slice field in each view is for
- All charts in the Provincial dashboard filter by the current username, so that users only see their own province's data. Some users are supposed to see all provinces, so their usernames are in the usernames field for all rows
- Improvement: there are many unused tables and charts; they need to be removed
- Improvement: not much work has been done on the RED/REC tab or on the HSS tab and separate dashboard, due to a lack of test data; these will need to be worked on