Approach to bumping the version of Orderables when using ?
When we're updating Orderables but not committing any changes:
Don't change the version if nothing has changed in the UPDATE/PUT/POST action (see the sketch below).
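As a point of reference, here is a minimal sketch of skipping the version bump on a no-op update. This is not the actual OpenLMIS implementation; OrderableUpdateService, Orderable.newInstance, and the repository are hypothetical stand-ins, and the approach assumes equals() covers all user-editable fields.

```java
// Hypothetical update path; assumes equals() compares all user-editable
// fields (and ignores versionNumber itself).
public class OrderableUpdateService {
  private final OrderableRepository repository;

  public OrderableUpdateService(OrderableRepository repository) {
    this.repository = repository;
  }

  public Orderable update(Orderable existing, OrderableDto incoming) {
    Orderable candidate = Orderable.newInstance(incoming);
    candidate.setVersionNumber(existing.getVersionNumber());
    if (candidate.equals(existing)) {
      // Nothing changed in this UPDATE/PUT/POST: keep the current version.
      return existing;
    }
    candidate.setVersionNumber(existing.getVersionNumber() + 1);
    return repository.save(candidate);
  }
}
```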
Notes
How to address performance regressions
Performance has regressed when updating Orderables.
Biggest area: facility approved products, 10-15x slower than in 3.6.
We've improved it to 3-4x slower than in 3.6.
Our automated tests use 9000 products; that is where these slowdown numbers come from.
A cause: change in queries
We tried using native queries, which was perhaps a step in the wrong direction since some information was missing.
This introduced select N+1 problems (5-6 queries per orderable).
In Orderables we have had a select N+1 issue.
In 3.7 we've made slight improvements: lazy loading with a batch size set to 1000 (see the sketch below).
~30% improvement (as measured manually with the Malawi dataset).
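For context, a batch size like this is set with Hibernate's @BatchSize annotation on the lazy collection. A simplified sketch; the real Orderable entity has more fields and mappings than shown here:

```java
import java.util.ArrayList;
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.OneToMany;
import org.hibernate.annotations.BatchSize;

@Entity
public class Orderable {
  @Id
  private Long id;

  // Lazy + batched: instead of one SELECT per orderable (select N+1),
  // Hibernate fetches the collections for up to 1000 orderables at a
  // time with a single IN (...) query.
  @OneToMany(mappedBy = "orderable", fetch = FetchType.LAZY)
  @BatchSize(size = 1000)
  private List<ProgramOrderable> programOrderables = new ArrayList<>();
}
```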
Select N+1
Tried native queries, HQL, etc.
Tried various guides recommending HQL, fetch joins, and the like; no single thing we've found fixes it (sketch below).
Are we monitoring how many queries are being issued? Yes, we're starting to; there is a guide in the current ticket (OLMIS-6566).
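Two illustrative snippets of what the above refers to, with simplified names (not verbatim from the codebase). The first uses a JPQL fetch join to pull an association in the same query; the second uses Hibernate's Statistics API (with hibernate.generate_statistics=true) to count the SQL statements a code path executes, which is one way to catch select N+1 in tests:

```java
import java.util.List;
import javax.persistence.EntityManager;
import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;

public class OrderableQueries {

  // Fetch join: one query loads orderables and their program
  // associations, instead of one extra query per orderable.
  static List<Orderable> loadWithPrograms(EntityManager em) {
    return em.createQuery(
        "SELECT DISTINCT o FROM Orderable o LEFT JOIN FETCH o.programOrderables",
        Orderable.class).getResultList();
  }

  // Statement counting: a test can assert that a request executes a
  // bounded number of JDBC statements, so N+1 regressions fail fast.
  static long statementCount(EntityManager em) {
    Statistics stats = em.getEntityManagerFactory()
        .unwrap(SessionFactory.class).getStatistics();
    return stats.getPrepareStatementCount();
  }
}
```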
We've been working to address the above
What are some strategies we can use to prevent this going forward?
Making performance tests not flaky: the tests had caught this regression, but it wasn't noticed until we dug into the flaky performance tests.
Look into the previous strategy of quarantining performance tests (could be useful in other testing) (Josh Zamor: find previous TC notes and link here).
People should feel free to quarantine a test.
We should be careful about team communication around that judgement call, and about how long it takes to fix the test and gather new information.
Running the quarantined test continuously in a tight loop is one possible approach.
Build Status Review: a daily meeting where the team reviews the status of each testing build (Jenkins dashboard). This helps with the team communication and the judgement call. It might be supported with more automation.
Should we move away from hard performance criteria (as in p(90) < 500ms) and instead graph the results to spot outliers as opposed to trends? (See the sketch below.)
There is enthusiasm for doing this.
Philip started down this path; I need to find this again.
There is non-trivial work in making a good graph per test case.
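To make the idea concrete, a minimal sketch (not tied to our Gatling/Jenkins setup; thresholds and numbers are invented for illustration) of computing p(90) from raw samples and flagging a run as an outlier against a baseline, which is the kind of number a per-test-case graph would plot:

```java
import java.util.Arrays;

public class PerfStats {

  // p(90) via the nearest-rank method: the value below which 90% of
  // samples fall.
  static double percentile(double[] samplesMs, double p) {
    double[] sorted = samplesMs.clone();
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }

  // Rather than a hard pass/fail (p(90) < 500ms), record p(90) per build
  // and flag runs far from the baseline; the 1.5x threshold is arbitrary.
  static boolean isOutlier(double p90, double baselineP90) {
    return p90 > baselineP90 * 1.5;
  }

  public static void main(String[] args) {
    double[] responseTimesMs = {120, 180, 210, 250, 300, 320, 410, 450, 520, 900};
    double p90 = percentile(responseTimesMs, 90);
    System.out.printf("p(90) = %.0f ms; outlier vs 400 ms baseline: %b%n",
        p90, isOutlier(p90, 400.0));
  }
}
```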
Stabilizing test runs
Find a way to make these numbers more stable.
Increase the number of samples.
Focus on stabilizing Reference Data first, as everything else relies on it.
How should this be prioritized for 3.8? Fewer features in favor of fixing this?
We know which areas are frequently used, so focus on the areas that are used all the time.
Background:
Limited budget
Already cut into timeline for 3.8
Fewer new things; improve what we already have (e.g. paying down tech debt).
New things are features / user-facing things.
The PC is only proposing one new thing: configuration. PC feedback was that it isn't a killer feature for 3.8, so we're saying this might be the time to focus solely on tech debt → performance and testing.
The SD team is ready to commit 3 development sprints to testing and back-end improvements (to release 3.8 this year).