Javers log initializer retrieves all domain objects - potential huge performance issue

Description

When the system starts, the javers log initializer retrieves all domain objects (requisitions) and creates history for those that don't have it (executes a query for each one).

https://github.com/OpenLMIS/openlmis-requisition/blob/master/src/main/java/org/openlmis/AuditLogInitializer.java#L63

With Malawi having hundred thousands requisitions in plans, this has potential to blow up.

This seems primarily done to create a new history for records that don't have it yet (demo data/ETL entries from my understanding). Is there a better way to handle this? A migration script perhaps? Or creating them real time?

Activity

Show:
Łukasz Lewczyński
August 8, 2017, 10:38 AM
Edited


without changes (JAVA_OPTS: -server -Xmx512m)
result: java.lang.OutOfMemoryError: GC overhead limit exceeded

with changes (JAVA_OPTS: -server -Xmx512m)
result: no errors

But only because I set page size to 2000 elements when I used Pagination.DEFAULT_PAGE_SIZE( = Integer.MAX_VALUE) then I got the same error.

cc:

Brandon Bowersox-Johnson
August 25, 2017, 10:46 PM

Hey , Josh and Chongsun and I did not understand what we did here or what the next ticket should be. I assume we will cover this briefly at showcase. Please file the next ticket that you suggest needs to happen (after 3.2.0) based on this. It seems like we still have a very slow startup time we need to address in the future ticket.

Also, do we need to document the PAGE SIZE setting issue? Like do we need to warn people the service will crash on boot if they have a bad PAGE SIZE?

Paweł Gesek
August 26, 2017, 9:09 AM
Edited

I've created for the next improvement. I'll admit that the improvement made here is not exactly what I was hoping for, but I know Malawi has more pressing issues now.

The problem is that at startup services retrieve all records for an auditable class (for example all requisitions) and iterate over them, creating history for the records that do not have it. This means that the service fetches hundreds of thousands of requisitions and iterates over them every time at startup.

What was done in this ticket, was making that retrieval use pagination, so that it does not put hundreds of thousands of requisitions into memory before iterating over them. In that sense, it prevents this blowing up server resources because of the memory usage, but it still will do the iterating.

Regarding the page size, thanks for pointing that out, I just realized i misunderstood the comment from . If we are currently using DEFAULT_PAGE_SIZE in our code, we have to change it to something reasonable, so that these changes actually makes any impact. So I am reopening this - please change the page size.

On a side note, the DEFAULT_PAGE_SIZE constant in our code is nonsensical, since it equals to Integer.MAX_VALUE (I pointed it out in a review not so long ago). Just using Integer.MAX_VALUE in places we want to get all the data is way more obvious in terms of code readability. When I see DEFAULT_PAGE_SIZE, I am thinking something like 100. So , please remove that constant while you're at it. (alternatively rename it to NO_PAGINATION)

Łukasz Lewczyński
August 28, 2017, 7:14 AM
Edited

I set the DEFAULT_PAGE_SIZE to 2000.

Paweł Gesek
August 29, 2017, 9:16 AM

^ We went with NO_PAGINATION in the end

Done

Assignee

Paweł Gesek

Reporter

Paweł Gesek

Labels

None

Story Points

1

Time tracking

0m

Time remaining

30m

Sprint

None

Fix versions

Priority

Major