Transformations


WIP - the goal is to think through the transformations we'll likely need most given our values, and not get distracted by the transformations that are common in our toolset, which is usually geared toward analysis rather than migration.


Most Common


Mapper / Cross walk (likely the most common activity)

  • Map field name A → B
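
A minimal sketch of a crosswalk, assuming records are plain dictionaries. The field names and the FIELD_MAP table are hypothetical and only illustrate the A → B renaming; they are not taken from any real schema.

```python
# Hypothetical crosswalk: source field name -> target field name.
FIELD_MAP = {
    "facility_code": "facilityCode",
    "prod_name": "productName",
}


def map_fields(record, field_map):
    """Return a copy of record with keys renamed per field_map.

    Keys that are not in field_map pass through unchanged.
    """
    return {field_map.get(key, key): value for key, value in record.items()}


source_row = {"facility_code": "F001", "prod_name": "Gloves", "quantity": 10}
target_row = map_fields(source_row, FIELD_MAP)
```

Letting unmapped keys pass through is a design choice for this sketch; a stricter mapper could reject them instead.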

Filter

  • Discard field, row, etc
  • Validation - Filter out data that doesn't meet a validation requirement, e.g. v2 may support something that v3 doesn't, in a way that's more nuanced than simply ignoring the data. Or it could be a constraint mismatch such as "v3 supports X length fields, v2 has Y length fields".
  • Normalization - As in database normalization: https://en.wikipedia.org/wiki/Database_normalization
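
The discard and validation cases above could look roughly like the sketch below. The length limit, the field names, and the helper names are all assumptions made for illustration. Rejected rows are kept in a separate list rather than silently dropped, so they can be reviewed later.

```python
MAX_NAME_LENGTH = 5  # hypothetical limit in the target schema


def drop_fields(record, unwanted):
    """Discard fields the target system has no place for."""
    return {key: value for key, value in record.items() if key not in unwanted}


def is_valid(record):
    """A row is valid if its name is a string that fits the target limit."""
    name = record.get("name")
    return isinstance(name, str) and len(name) <= MAX_NAME_LENGTH


def filter_valid(records):
    """Split records into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for record in records:
        (accepted if is_valid(record) else rejected).append(record)
    return accepted, rejected


rows = [
    {"name": "Glove", "legacy_flag": True},
    {"name": "Surgical mask", "legacy_flag": False},
    {"name": None, "legacy_flag": False},
]
cleaned = [drop_fields(row, {"legacy_flag"}) for row in rows]
accepted, rejected = filter_valid(cleaned)
```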

ID Replacement

  • Replace one ID with another.  Usually needed because of some structural difference between the schemas.
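
A sketch of ID replacement, assuming the old-to-new mapping is already available as a lookup table. The ID values and field names here are made up. Failing loudly on an unmapped ID is one possible policy; routing such rows to a reject list would be another.

```python
# Hypothetical lookup: old integer ID -> new string ID.
ID_MAP = {101: "a1f0", 102: "b2e9"}


def replace_id(record, field, id_map):
    """Return a copy of record with record[field] swapped for its new ID."""
    old_id = record[field]
    if old_id not in id_map:
        raise KeyError(f"no replacement ID for {field}={old_id!r}")
    return {**record, field: id_map[old_id]}


old_row = {"facility_id": 101, "quantity": 7}
new_row = replace_id(old_row, "facility_id", ID_MAP)
```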

Aggregator

  • Add known values from source B not present in source A
  • Join data into aggregate roots
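
Both aggregator cases can be sketched together: enrich each row with a value known only to a second source, and nest child rows under their aggregate root. The order, line item, and facility data below are hypothetical.

```python
from collections import defaultdict

# Source A: flat rows, as they might come out of a relational export.
orders = [
    {"order_id": 1, "facility": "F001"},
    {"order_id": 2, "facility": "F002"},
]
line_items = [
    {"order_id": 1, "product": "Gloves", "quantity": 10},
    {"order_id": 1, "product": "Masks", "quantity": 5},
    {"order_id": 2, "product": "Gloves", "quantity": 3},
]
# Source B: values that source A does not carry.
facility_names = {"F001": "Central Clinic", "F002": "North Depot"}


def build_aggregates(orders, line_items, facility_names):
    """Nest line items under their order and add the facility name."""
    items_by_order = defaultdict(list)
    for item in line_items:
        child = {k: v for k, v in item.items() if k != "order_id"}
        items_by_order[item["order_id"]].append(child)
    return [
        {
            **order,
            "facility_name": facility_names.get(order["facility"]),
            "line_items": items_by_order.get(order["order_id"], []),
        }
        for order in orders
    ]


aggregates = build_aggregates(orders, line_items, facility_names)
```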


Less likely to be needed


De-duplication

  • Identify and remove duplicates
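
A sketch of de-duplication, assuming duplicates are defined by a chosen set of key fields and that the first occurrence wins. Both assumptions are illustrative; real data may need fuzzier matching.

```python
def deduplicate(records, key_fields):
    """Keep the first record seen for each combination of key_fields."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


rows = [
    {"code": "F001", "name": "Central Clinic"},
    {"code": "F002", "name": "North Depot"},
    {"code": "F001", "name": "Central Clinic (dup)"},
]
unique_rows = deduplicate(rows, ["code"])
```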

Pivot

  • e.g. pivot table.  Often used to de-normalize.  When migrating from one transactional system to another we're not likely to need this often.
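
For reference, a pivot from long to wide form might look like this sketch. The row shape and column names are hypothetical.

```python
from collections import defaultdict


def pivot(rows, index, column, value):
    """Turn one row per (index, column) pair into one wide row per index."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[index]][row[column]] = row[value]
    return [{index: key, **cells} for key, cells in wide.items()]


long_rows = [
    {"facility": "F001", "product": "Gloves", "quantity": 10},
    {"facility": "F001", "product": "Masks", "quantity": 5},
    {"facility": "F002", "product": "Gloves", "quantity": 3},
]
wide_rows = pivot(long_rows, "facility", "product", "quantity")
```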


NiFi and Transformations?

The key question: we know that NiFi /can/ do these things, but is it the best tool for this job?  The key concern with NiFi has been managing it.


Batch transformation with Streams


