WIP - the goal is to think through the transformations we'll most likely need given our values, and not get confused with the transformations we commonly find with our toolset, which are usually geared toward analysis rather than migration.
Most Common
Mapper / Crosswalk (likely the most common activity)
...
- Discard field, row, etc
- Validation - Filter out data that doesn't meet a validation requirement. E.g. v2 may support something that v3 doesn't, in a way that's more nuanced than simply ignoring the data. Or it could be something like "v3 supports fields of length X, v2 has fields of length Y".
- Normalization - As in the concept of database normalization: https://en.wikipedia.org/wiki/Database_normalization
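To make the crosswalk idea concrete, here is a minimal sketch of a field mapper that discards unmapped fields and filters out rows that fail a validation rule. The field names and the length limit are made-up assumptions, not taken from v2 or v3:

```python
# Hypothetical crosswalk: v2 field name -> v3 field name.
# Fields with no mapping are discarded; rows failing validation are dropped.
V2_TO_V3 = {
    "patron_name": "name",
    "patron_email": "email",
}
V3_MAX_LEN = 50  # assumed: v3 rejects values longer than this

def crosswalk(v2_record):
    """Map a v2 record to a v3 record; return None to discard the row."""
    v3_record = {}
    for v2_field, value in v2_record.items():
        v3_field = V2_TO_V3.get(v2_field)
        if v3_field is None:
            continue  # discard: field has no v3 equivalent
        if len(str(value)) > V3_MAX_LEN:
            return None  # validation failure: discard the whole row
        v3_record[v3_field] = value
    return v3_record

crosswalk({"patron_name": "Ada", "patron_email": "ada@example.org", "legacy_flag": 1})
# -> {'name': 'Ada', 'email': 'ada@example.org'}
```

In practice the mapping table would be driven by a maintained crosswalk document rather than hard-coded, but the shape of the operation (map, discard, validate/filter) is the same.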
ID Replacement
- Replace an ID with another, usually because of some structural difference between the schemas.
...
- e.g. a pivot table; often used to de-normalize.
...
- In migrating from one transactional system to another, we're not likely to need this often.
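A sketch of the ID-replacement case: as rows are created in the target system, we record the new ID against the old one, then rewrite references in later records. The table and field names here are illustrative assumptions:

```python
# Hypothetical lookup built up during migration:
# old (v2) ID -> new (v3) ID, recorded as target rows are created.
id_map = {"v2-loan-17": "v3-9001"}

def replace_id(record, field="loan_id"):
    """Return a copy of record with the v2 ID in `field` swapped for its v3 ID."""
    old_id = record[field]
    new_id = id_map.get(old_id)
    if new_id is None:
        # Fail loudly: a dangling reference means the referenced row
        # hasn't been migrated yet (or was filtered out upstream).
        raise KeyError(f"no v3 ID recorded for {old_id!r}")
    return {**record, field: new_id}
```

Failing loudly on a missing mapping is a design choice: silently passing the old ID through would produce a dangling reference in the target system.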
Nifi and Transformations?
Key question: we know that Nifi /can/ do these things; the question is whether it's the best tool for this job. The key concern with Nifi has been managing it.
Batch transformation with Streams
- On joining streams: https://debezium.io/blog/2018/03/08/creating-ddd-aggregates-with-debezium-and-kafka-streams/
- An alternative that publishes an aggregate (domain root) before it hits the stream: https://debezium.io/blog/2018/09/20/materializing-aggregate-views-with-hibernate-and-debezium/
- Note: this would require modifying the source system, which for this project is less attractive.
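The first Debezium post's approach (joining change-event streams in Kafka Streams, written in Java there) boils down to keeping the latest state of each side per aggregate-root key and emitting a combined document when either side updates. A toy Python stand-in for that stateful join, with illustrative customer/address names not taken from our schemas:

```python
# Toy sketch of the KTable-style join from the Debezium post:
# latest customer row and latest address list, each keyed by customer_id,
# combined into one "aggregate" document whenever either side changes.
customers = {}  # customer_id -> latest customer row
addresses = {}  # customer_id -> latest list of address rows

def on_customer_event(customer_id, row):
    customers[customer_id] = row
    return emit(customer_id)

def on_address_event(customer_id, rows):
    addresses[customer_id] = rows
    return emit(customer_id)

def emit(customer_id):
    """Publish the aggregate once both sides have arrived; else wait."""
    if customer_id in customers and customer_id in addresses:
        return {"customer": customers[customer_id],
                "addresses": addresses[customer_id]}
    return None  # incomplete: the other stream hasn't produced this key yet
```

Kafka Streams handles the state stores, repartitioning, and fault tolerance that this sketch ignores; the point is only that the join itself is simple once both streams are keyed by the aggregate root's ID.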