2018-04-03 Meeting notes

Date

7am PST / 4pm CET


Note we're using Zoom now:


Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/737202874

Or iPhone one-tap:
US: +16699006833,,737202874# or +14086380968,,737202874#
Or Telephone:
Dial (for higher quality, dial a number based on your current location):
US: +1 669 900 6833 or +1 408 638 0968 or +1 646 876 9923
Meeting ID: 737 202 874
International numbers available: https://zoom.us/zoomconference?m=F0kj5u6u0wFxr7Xfz5NKZxl0xyLhuTPF


Attendees

Goals

  • See discussion items


Discussion items

Time | Item | Who | Notes
10m | Agenda and action item review | Josh Zamor | Open action item review from https://openlmis.atlassian.net/wiki/x/vAAND ; open action items from last time
 | Database migration speed? | Chongsun Ahn (Unlicensed) |
 | Migration testing?  Integration?  Job? | Chongsun Ahn (Unlicensed) |
(didn't get to) | Cross-service migration | Chongsun Ahn (Unlicensed) | Re-use run-sql; Bespoke; Template for cross-service migration / exemplar?

Notes


Database migration speed


While running the latest code against production data for release testing, we found that the fulfillment service was slow to start: the service did start, but it was stuck in a long-running migration script.


The cause was a migration that updated the schema and copied data on a table with 7+ million rows.  A bit of digging uncovered this recommendation:  https://dba.stackexchange.com/questions/52517/best-way-to-populate-a-new-column-in-a-large-table/52531#52531 , which suggests duplicating the table, making the changes to the duplicate, and then replacing the old table with the new one.
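As a rough illustration of that recommendation, here is a minimal sketch of the duplicate-and-swap pattern. It runs against an in-memory SQLite database purely so it is self-contained; the table and column names (`line_items`, `quantity`, `price`, `total`) are hypothetical and not from our schema. On PostgreSQL the same shape would also need indexes, constraints and grants recreated on the new table, since `CREATE TABLE ... AS` does not copy them.

```python
import sqlite3


def add_column_via_table_swap(conn):
    """Add and populate a `total` column on `line_items` by rebuilding the table."""
    # 1. Build the duplicate with the new column already populated,
    #    in a single pass over the data.
    conn.execute(
        "CREATE TABLE line_items_new AS "
        "SELECT id, quantity, price, quantity * price AS total "
        "FROM line_items"
    )
    # 2. Replace the old table with the new one.
    conn.execute("DROP TABLE line_items")
    conn.execute("ALTER TABLE line_items_new RENAME TO line_items")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE line_items (id INTEGER, quantity INTEGER, price INTEGER)"
    )
    conn.executemany(
        "INSERT INTO line_items VALUES (?, ?, ?)", [(1, 2, 5), (2, 3, 10)]
    )
    add_column_via_table_swap(conn)
    print(conn.execute("SELECT id, total FROM line_items ORDER BY id").fetchall())
```

Whether this is actually faster than an in-place change for our tables is something we would still need to measure against a realistic data volume.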


Question:  How do we ensure that the migrations we write don't take hours to run?

We do know with our implementations that some hours of downtime are:


  1. Monitor the Flyway migration time (we'd need a threshold value and/or those 7 million rows to test against as an example).
  2. Some teams won't allow certain statements to be used.
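For option 1, a minimal sketch of the threshold check. As far as we know, Flyway records each migration's duration in milliseconds in the `execution_time` column of its schema history table (the table name depends on the Flyway version). The sketch assumes those rows have already been read into `(script, execution_time_ms)` pairs; the script names and the threshold value are made up for illustration.

```python
from typing import Iterable, List, Tuple


def slow_migrations(
    history: Iterable[Tuple[str, int]], threshold_ms: int
) -> List[Tuple[str, int]]:
    """Return the (script, execution_time_ms) pairs slower than threshold_ms."""
    return [(script, ms) for script, ms in history if ms > threshold_ms]


if __name__ == "__main__":
    # Hypothetical rows, as they might be read from Flyway's history table.
    history = [
        ("V1__create_orders.sql", 120),
        ("V2__copy_line_items.sql", 5_400_000),
    ]
    # Hypothetical threshold: flag anything over one minute.
    for script, ms in slow_migrations(history, threshold_ms=60_000):
        print(f"{script} took {ms} ms")
```

A check like this is only meaningful when the migrations ran against a data set of realistic size.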


Prefer a mix of 1 and 2.
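The statement-restriction half of that mix could start as a simple text check on migration scripts. This is only a sketch: the deny-list below is an assumption for illustration (we have not agreed on which statements to restrict), and a regex scan is not a SQL parser, so it can be fooled by comments or string literals.

```python
import re

# Hypothetical deny-list; the real list would come out of the dev forum discussion.
BANNED_PATTERNS = {
    # An UPDATE with no WHERE before the end of the statement touches every row.
    "UPDATE without WHERE": re.compile(
        r"\bUPDATE\b(?![^;]*\bWHERE\b)", re.IGNORECASE
    ),
    # Adding a column with a default can force a full table rewrite
    # on older PostgreSQL versions.
    "ADD COLUMN with DEFAULT": re.compile(
        r"\bADD\s+COLUMN\b[^;]*\bDEFAULT\b", re.IGNORECASE
    ),
}


def banned_statements(sql):
    """Return the names of the deny-list rules that the migration text matches."""
    return [name for name, pattern in BANNED_PATTERNS.items() if pattern.search(sql)]


if __name__ == "__main__":
    print(banned_statements("UPDATE pod_line_items SET notes = '';"))
    print(banned_statements("UPDATE pod_line_items SET notes = '' WHERE id = 1;"))
```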

Where do we get the data for item #1?  For things like PoD line items that have a lot of data, we should commit to creating performance data.


Action Items:

  • How do we get the data for testing (when to decide to invest in creating performance data)?
  • What are the exemplars for performant migrations / which types of migrations take a long time (is there a PostgreSQL DBA in the house)?
  • Do we intend to move performance data into demo data, and do we want to do that?  They are separate data sets today; we think we might like to combine them for maintenance and consistency.


Migration testing


The current approach is to use a Jenkins job that:

  • starts the previous version of ref distro
  • stops it
  • starts the latest code which runs the new migrations
  • failure is if a migration error is encountered, success otherwise
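The pass/fail logic of that job can be sketched as follows. This is not the Jenkins job itself: the real job starts Ref Distro, while this stand-in applies migrations to an in-memory SQLite database so the example is self-contained. The function names and the migration contents are hypothetical.

```python
import sqlite3


def run_migrations(conn, migrations):
    """Apply (name, sql) migrations in order; return the first failing name or None."""
    for name, sql in migrations:
        try:
            conn.executescript(sql)
        except sqlite3.Error:
            return name
    return None


def upgrade_test(previous, latest):
    """Return None on success, or the name of the migration that failed."""
    conn = sqlite3.connect(":memory:")
    try:
        # "Start the previous version": bring the database to the old schema.
        failed = run_migrations(conn, previous)
        if failed is None:
            # "Start the latest code": only the new migrations run.
            failed = run_migrations(conn, latest)
        return failed
    finally:
        conn.close()


if __name__ == "__main__":
    previous = [("V1", "CREATE TABLE orders (id INTEGER); INSERT INTO orders VALUES (1);")]
    latest = [("V2", "ALTER TABLE orders ADD COLUMN status TEXT;")]
    print(upgrade_test(previous, latest))
```

As with the real job, a run like this only proves that the migrations apply without error against whatever data the previous version left behind.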


We also have experimented with a pattern of writing integration tests for individual migrations.


Some concerns with it:

  • Migrations don't change; QA could find migration issues
  • Are integration tests written for the edge cases?
  • We'd need more work:  writing a test for each migration


Some concerns with the current approach:

  • If there's no demo data, the tests don't cover much
  • If a service isn't normally included in Ref Distro, it won't get covered.
  • There have been some issues that have given us concern, and while we think we know what they are, we'd still like to revisit the approach.


Josh asks:  Central to any migration testing approach is having and loading data.  In the past I've used DbUnit to set up the database before a test and revert it to its pre-test condition after the test has run.  Is this tool (or one like it) still valuable?
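To make the set-up-then-revert pattern concrete, here is a minimal sketch. It is not DbUnit's API (DbUnit is a Java library); it only illustrates the idea with a Python context manager, using a transaction rollback as one possible way to restore the pre-test state. The helper name `seeded` and the `orders` table are hypothetical, and the connection is assumed to be opened with `isolation_level=None` so the explicit `BEGIN` / `ROLLBACK` are the only transaction control.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def seeded(conn, table, rows):
    """Insert fixture rows for a test, then restore the pre-test state."""
    placeholders = ", ".join("?" for _ in rows[0])
    conn.execute("BEGIN")
    try:
        conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
        yield conn
    finally:
        # Undo the fixture rows and anything the test itself changed.
        conn.execute("ROLLBACK")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
    with seeded(conn, "orders", [(1, "SHIPPED"), (2, "ORDERED")]) as c:
        print(c.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
    print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```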

Action Items

  • Sebastian Brudziński:  Start a discussion on: 
    • Do we intend to move performance data into demo data, and do we want to do that?  They are separate data sets today; we think we might like to combine them for maintenance and consistency.
  • Chongsun Ahn (Unlicensed):  Grab one of the options on migration performance and start a dev forum post
  • Josh Zamor and Chongsun Ahn (Unlicensed):  write a ticket to investigate the CCE migration test issue
  • Josh Zamor:  follow up on the db migration test port (close up 8500 in ref distro before the release)

