2019-10-16 TC Meeting notes

Note:  Oct 1st meeting was cancelled.

Date

7am PST / 4pm CEST

https://zoom.us/j/428462410

Attendees



Discussion items

Time | Item | Who | Notes
5m | Agenda and action item review | Josh Zamor |
45m+ | How to address performance regressions going forward in releasing 3.7 | Wesley Brown |
(next time) | Check-in on release of 3.7 | Josh Zamor |
(next time) | Technical Committee moving forward? | Josh Zamor |
  • We haven't had many topics the past few weeks, and have cancelled a couple calls
  • How are we being effective?
  • Should we revise the format?
10m (next time or separate - checking with Ben Leibert) | Bumping the version of Orderables | Daniel Serkowski |
  • Approach to bumping version of Orderables when using ?
    • When we're updating Orderables but we're not committing any changes
    • Not changing the version if nothing has changed in the UPDATE/PUT/POST action
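
A minimal sketch of the "don't bump the version if nothing has changed" idea above, written in Python for brevity. The `Orderable` class, its two fields, and `apply_update` are hypothetical stand-ins for illustration, not the actual Orderable model or API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Orderable:
    # Hypothetical, heavily simplified stand-in for an Orderable resource.
    product_code: str
    full_product_name: str
    version: int = 1


def apply_update(current: Orderable, product_code: str, full_product_name: str) -> Orderable:
    """Return a new version only if the submitted content differs from the current one."""
    unchanged = (
        current.product_code == product_code
        and current.full_product_name == full_product_name
    )
    if unchanged:
        # Nothing to commit: keep the existing object and its version number.
        return current
    return replace(
        current,
        product_code=product_code,
        full_product_name=full_product_name,
        version=current.version + 1,
    )
```

With this shape, a PUT that resubmits identical content returns the existing object untouched, and only a real change produces version + 1.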

Notes


How to address performance regressions

  • In updating Orderables, performance has regressed.
    • Biggest area:  facility approved products - 10-15x slower than in 3.6.
    • We've improved it to 3-4x slower than in 3.6
    • Our automated tests use 9000 products; that dataset is where our slowdown numbers come from.
    • A cause: change in queries
      • We tried to use native queries, which was perhaps a step in the wrong direction as some information was missing.
      • This introduced Select N+1 problems (5-6 queries per orderable)
    • In Orderables we have had a Select N+1 issue
      • In 3.7 we've made slight improvements: lazy loading with a batch size of 1000
      • 30% improvement (as measured manually with Malawi dataset)
    • Select N+1
      • Tried native queries, HQL, etc
      • Tried following various guides recommending HQL and fetch joins; we haven't found any one thing that fixes it.
      • Are we monitoring how many queries are being used?  Yes, starting to - guide in current ticket (OLMIS-6566)
  • We've been working to address the above
  • What are some strategies we can use to prevent this going forward?
    • Making performance tests less flaky - the tests had found this regression, however it wasn't noticed until we dug into the flaky performance tests.
      • Look into the previous strategy of quarantining performance tests (could be useful in other testing) (Josh Zamor to find previous TC notes and link here)
        • People should feel free to quarantine a test.
        • We should be cautious about the team communication in that judgement call - and how long it takes to fix and get new information.
          • Running the tests continuously in a tight loop is a possible approach for this.
        • Build Status Review - a daily meeting where the team reviews the status of each testing build (Jenkins dashboard). This helps with the team communication and the judgement call. It might be supported with more automation.
      • Moving away from fixed performance criteria (as in p(90) < 500ms), and instead graphing the results to spot outliers as opposed to trends.
        • Enthusiasm behind doing this
        • Philip started down this path - I need to find this again
          • There is non-trivial work to make a good graph by test case
      • Stabilizing test runs
        • Find a way to have these numbers be more stable
        • Increase samples
        • Focus on stabilizing Reference Data first as everything relies on this
  • How should this be prioritized for 3.8?  Fewer features in favor of fixing this?
    • We know which areas are frequently used - so focus on the areas that are used all the time
    • Background:
      • Limited budget
      • Already cut into timeline for 3.8
      • Fewer new things, improve what we already have (e.g. reducing tech debt)
        • New things are features / user-facing things
      • PC is only proposing one new thing: configuration. PC feedback was that it isn't a killer feature for 3.8 - so we're saying that this might be the time to focus solely on tech debt → performance and testing
    • SD team is ready to commit 3 development sprints to testing and back-end improvements (to release 3.8 this year).
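
To make the Select N+1 discussion above concrete, here is a small self-contained sketch using Python's built-in `sqlite3` rather than the Hibernate/HQL setup the notes refer to. The table names, row counts, and batch size are made up for illustration. It counts the SELECT statements issued by a per-row loading strategy and by a batched one, which is also one simple way to monitor how many queries are being used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderable (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE program_orderable (orderable_id INTEGER, program TEXT);
""")
conn.executemany("INSERT INTO orderable VALUES (?, ?)",
                 [(i, f"product-{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO program_orderable VALUES (?, ?)",
                 [(i, "essential-meds") for i in range(1, 51)])

# Record every statement SQLite executes so we can count queries.
executed = []
conn.set_trace_callback(executed.append)


def count_selects(fn):
    """Run fn and return (result, number of SELECT statements it issued)."""
    executed.clear()
    result = fn()
    selects = sum(1 for s in executed if s.lstrip().upper().startswith("SELECT"))
    return result, selects


def load_n_plus_one():
    # One query for the parents, then one more query per parent row.
    rows = conn.execute("SELECT id, name FROM orderable").fetchall()
    return {
        oid: [p for (p,) in conn.execute(
            "SELECT program FROM program_orderable WHERE orderable_id = ?", (oid,))]
        for oid, _ in rows
    }


def load_batched(batch_size=20):
    # One query for the parents, then one query per batch of parent ids.
    ids = [oid for (oid,) in conn.execute("SELECT id FROM orderable").fetchall()]
    programs = {oid: [] for oid in ids}
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        query = ("SELECT orderable_id, program FROM program_orderable "
                 f"WHERE orderable_id IN ({placeholders})")
        for oid, program in conn.execute(query, batch):
            programs[oid].append(program)
    return programs
```

Calling `count_selects(load_n_plus_one)` and `count_selects(load_batched)` returns the same data, but the first issues one query per orderable plus one, while the second issues one query per batch plus one.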
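
As a rough illustration of spotting outliers instead of enforcing a fixed threshold such as p(90) < 500ms, the sketch below flags any run that is much slower than the median of the runs just before it. The function name, window size, and factor are assumptions chosen for the example, not values the team agreed on.

```python
from statistics import median


def flag_outliers(timings_ms, window=5, factor=2.0):
    """Return indexes of runs slower than `factor` times the median of the preceding `window` runs."""
    flagged = []
    for i in range(window, len(timings_ms)):
        baseline = median(timings_ms[i - window:i])
        if timings_ms[i] > factor * baseline:
            flagged.append(i)
    return flagged
```

Because the baseline moves with recent runs, a slow drift would not be flagged by this check; it only catches sudden jumps, so it would complement a graph rather than replace it.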



Notes for next time:

Check-in on release of 3.7


Would we revise the priorities we set last time?

  • There is interest (via Slack) in:
    • Upgrade Spring Boot
    • Upgrade RAML
    • Release Faster epic



Action Items
