Basic normalization with Destinations V2? #35840
-
Hello, my question is about Airbyte Basic Normalization and the changes in Destinations V2. I tested this with Source: File Azure Blob Storage => Destination: MSSQL. The JSON used was:
{
"invoicetests": [
{
"age": 30,
"city": "New York",
"email": "[email protected]",
"grades": [95, 85, 90],
"nicknames": ["foo", "bar", "john", "Doe"],
"is_student": false,
"name": "John Doe"
},
{
"email": "[email protected]",
"is_student": false,
"leeftijd": 30,
"naam": "John Doe",
"scores": [
{
"t1": 5,
"t2": 6,
"t3": 7,
"t4": 9
},
{
"t1": 14,
"t2": 123,
"t3": 54,
"t4": 12
}
],
"test": "test",
"town": "New York"
}
]
}
With this setup, a separate table was created for the nested object scores, which is a perfect fit. However, when re-reading the Destinations V2 and Basic Normalization documentation, I saw that basic normalization will be removed in favor of Typing and Deduping. If I understand correctly, this means the separate tables for nested objects will no longer be created by default. My question: is it possible to achieve the same results with Destinations V2, and if not, what are the alternatives? If we use Airbyte, are we stuck on an older version? Can we still use basic normalization out of the box? Do we need to create our own normalization? I hope someone can clarify.
Replies: 6 comments
-
Exactly the same question. I have just upgraded to the latest version and found that the unnested tables are no longer being created. Reading the documentation, there seems to be no guidance on how to achieve a similar outcome with V2 destinations. Has this feature been permanently removed? If so, a guide on how to achieve it ourselves would have been nice; there seem to be many options, and an idea of best practice would have been useful. Perhaps I have missed something, but I agree that some further clarification would help.
-
+1 for this. Any update or guidance on recreating basic normalization in Destinations V2?
-
I had to replicate the same functionality manually using dbt, since it seems the feature is just gone in V2.
-
I did the same in the end, and in hindsight it was probably the right thing to do anyway: more control and less risk of later changes affecting things. It was not too time-consuming, and the time spent probably gave me a better understanding of the data!
-
Destinations V2 will create the final table but won't normalize nested objects. It becomes your responsibility to handle the transformation of complex objects using dbt or downstream processes. The cost of maintaining the normalization module to generate dbt transformations for any schema was too high. Additionally, the models usually weren't performant, and users had to manually edit them. If you are using Airbyte Cloud, you can trigger dbt Cloud jobs to run after your sync is finished. Open-source users need to handle that process with an external orchestrator/scheduler.
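For anyone handling the unnesting in a downstream process rather than dbt, here is a minimal Python sketch of the idea, using the `scores` array from the JSON in the question. It assumes the records have already been loaded as Python dicts. The function names `row_id` and `unnest` and the columns `parent_id` and `position` are hypothetical choices for illustration, not what Airbyte's old basic normalization produced.

```python
import hashlib
import json


def row_id(record: dict) -> str:
    """Hypothetical parent key: a hash of the record's canonical JSON."""
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def unnest(records: list[dict], array_field: str) -> tuple[list[dict], list[dict]]:
    """Split records into parent rows and child rows for one nested array of objects."""
    parents, children = [], []
    for record in records:
        parent_id = row_id(record)
        # Parent row: every field except the nested array, plus the key.
        parent = {k: v for k, v in record.items() if k != array_field}
        parent["parent_id"] = parent_id
        parents.append(parent)
        # Child rows: one per array element, linked back to the parent.
        for index, item in enumerate(record.get(array_field, [])):
            children.append({"parent_id": parent_id, "position": index, **item})
    return parents, children


records = [
    {"naam": "John Doe", "town": "New York",
     "scores": [{"t1": 5, "t2": 6, "t3": 7, "t4": 9},
                {"t1": 14, "t2": 123, "t3": 54, "t4": 12}]},
    {"name": "John Doe", "city": "New York", "grades": [95, 85, 90]},
]

parents, children = unnest(records, "scores")
```

The `children` list can then be written to its own table and joined back to the parent rows on `parent_id`. Records without the array simply produce no child rows. A dbt model would express the same split in SQL using the destination's JSON functions.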
-
I have already created and stored the dbt code in our GitLab repository. Specifically, I would like to know: