Based on a customer case study, an advanced tutorial on using Delta Live Tables to process JSON schema evolution without requiring a restart
Disclaimer: I’m a solutions architect at Databricks. The views and opinions expressed in this article are my own and don’t necessarily reflect those of Databricks.
Schema evolution is a common phenomenon in the world of data engineering. When extracting data from sources and loading it into a destination, changes in the source schema are inevitable. This challenge is amplified when dealing with source systems that include JSON payloads, such as JSON-type columns in PostgreSQL. The likelihood of schema changes within these JSON payloads is high: new fields can be added at any time, often deeply nested at various levels. These frequent changes significantly increase the complexity of building robust data pipelines that detect such changes and evolve the schema seamlessly.
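To make the problem concrete, here is a minimal sketch in plain Python (no Spark or Delta Live Tables involved) of how a newly added, deeply nested field shows up when two versions of a JSON payload are compared. The helper names `field_paths` and `new_fields` and the sample payloads are hypothetical, chosen only for illustration; this is not how Databricks detects schema changes internally.

```python
import json


def field_paths(value, prefix=""):
    """Return the set of dotted field paths found in a parsed JSON value."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= field_paths(child, path)
    elif isinstance(value, list):
        # Array elements share one path segment, marked with "[]".
        for item in value:
            paths |= field_paths(item, prefix + "[]")
    return paths


def new_fields(known_paths, payload):
    """Return field paths present in a JSON string but absent from known_paths."""
    return field_paths(json.loads(payload)) - known_paths


# Hypothetical payloads: the second one adds a nested "address" object.
old = '{"id": 1, "customer": {"name": "Ada"}}'
new = '{"id": 2, "customer": {"name": "Bob", "address": {"city": "Oslo"}}}'

known = field_paths(json.loads(old))
added = new_fields(known, new)
print(sorted(added))  # ['customer.address', 'customer.address.city']
```

A pipeline that hard-codes the first schema would silently drop or reject `customer.address`, which is why automatic schema evolution matters for JSON sources.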
The Databricks Intelligence Platform, powered by the Delta Lake format, provides robust support for schema evolution, ensuring flexibility and resilience when dealing with changes in data structure. Delta Lake can…