2019-10-16 TC Meeting notes

Note:  Oct 1st meeting was cancelled.

Date

Oct 16, 2019

7am PST / 4pm CEST

Meeting Link

https://zoom.us/j/428462410

Attendees

 

  • @Paulina Buzderewicz

  • @Paweł Pinker (Unlicensed)

  • @Sebastian Brudziński

  • @Daniel Serkowski (Unlicensed)

 

Discussion items

Time

Item

Who

Notes

5m

Agenda and action item review

@Josh Zamor (Deactivated)

 

45m+

How to address performance regressions going forward in releasing 3.7

@Wesley Brown

 

(next time)

Check-in on release of 3.7

@Josh Zamor (Deactivated)

(next time)

Technical Committee moving forward?

@Josh Zamor (Deactivated)

  • We haven't had many topics the past few weeks, and have cancelled a couple calls

  • How are we being effective?

  • Should we revise the format?

(next time or separate - checking with @Ben Leibert (Unlicensed))

10m

Bumping the version of Orderables

@Daniel Serkowski (Unlicensed)

  • Approach to bumping version of Orderables when using ?

    • When we're updating Orderables but we're not committing any changes

    • Not changing the version if nothing has changed in the UPDATE/PUT/POST action
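The idea in the item above, leaving the version untouched when an update carries no actual changes, might be sketched as follows. This is only an illustration in Python, not the OpenLMIS implementation: the `Orderable` class, its fields, and `apply_update` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Orderable:
    """Hypothetical, simplified stand-in for a versioned Orderable."""
    product_code: str
    full_product_name: str
    version: int = 1


def apply_update(current: Orderable, **changes) -> Orderable:
    """Return a new version only if the update changes at least one field."""
    candidate = replace(current, **changes)
    if candidate == current:
        # Nothing differs from the stored state, so keep the same version.
        return current
    # Something changed: keep the new field values and bump the version.
    return replace(candidate, version=current.version + 1)


stored = Orderable("C100", "Levora")
same = apply_update(stored, full_product_name="Levora")        # no real change
changed = apply_update(stored, full_product_name="Levora 0.15")  # real change
print(same.version, changed.version)  # prints: 1 2
```

Comparing the candidate against the stored state before saving is what keeps a PUT with an unchanged body from producing a new version.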

Notes

 

How to address performance regressions

  • In updating Orderables, performance has regressed.

    • Biggest area:  facility approved products - 10-15x slower than in 3.6.

    • We've improved it to 3-4x slower than in 3.6

    • Our automated tests use 9000 products; the slowdown numbers above come from that dataset.

    • A cause: change in queries

      • We tried to use native queries, perhaps a step in the wrong direction as some information was missing.

      • Introduced select N+1 problems (5-6 queries per orderable)

    • In Orderables we have had a Select N+1 issue

      • In 3.7 we've made slight improvements: lazy loading with a batch size of 1000

      • 30% improvement (as measured manually with Malawi dataset)

    • Select N+1

      • Tried native queries, HQL, etc

      • Tried various guides recommending HQL and fetch joins; no single approach we've found fixes it.

      • Are we monitoring how many queries are being used?  Yes, starting to - guide in current ticket (OLMIS-6566)

  • We've been working to address the above

  • What are some strategies we can use to prevent this going forward?

    • Making performance tests less flaky - the tests had caught this regression, but it wasn't noticed until we dug into the flaky performance tests.

      • Look into the previous strategy of quarantining of performance tests (could be useful in other testing) (@Josh Zamor (Deactivated) find prev TC notes and link here)

        • People should feel free to quarantine a test.

        • We should be cautious about the team communication in that judgement call - and how long it takes to fix and get new information.

          • Running the tests continuously in a tight loop is a possible approach for this.

        • Build Status Review - daily meeting and the team reviews the status of each testing build (Jenkins dashboard). This helps with the team communication + judgement call.  It might be supported with more automation.

      • Move away from pass/fail performance criteria (as in p(90) < 500ms) and instead graph the results to spot outliers, as opposed to trends.

        • Enthusiasm behind doing this

        • Philip started down this path - I need to find this again

          • there is non-trivial work to make a good graph per test case

      • Stabilizing test runs

        • Find a way to have these numbers be more stable

        • Increase samples

        • Focus on stabilizing Reference Data first as everything relies on this

  • How should this be prioritized for 3.8?  Fewer features in favor of fixing this?

    • We know which areas are frequently used - so focus on the areas that are used all the time

    • Background:

      • Limited budget

      • Already cut into timeline for 3.8

      • Fewer new things, improve what we already have (e.g. reducing tech debt)

        • New things are features / user-facing things

      • PC is only proposing one new thing: configuration. PC feedback was that it isn't a killer feature for 3.8, so this might be the time to focus solely on tech debt → performance and testing

    • SD team is ready to commit 3 development sprints to testing and back-end improvements (to release 3.8 this year).

 

 

Notes for next time:

Check-in on release of 3.7

 

Would we revise the priorities we set last time?

  • There is interest in (via slack):

    • Upgrade Spring Boot

    • Upgrade RAML

    • Release Faster epic

 

 

Action Items
