Semi-Structured Data

What is semi-structured data?

Unlike flat files such as CSVs, which arrange data in fixed “columns and rows”, semi-structured formats store data as nested structures: a record can contain objects and arrays, which can in turn contain further objects and arrays.
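To make the contrast concrete, here is a minimal sketch in Python. The sample records and field names are invented for illustration. It reads the same kind of customer data once as a flat CSV row and once as a nested JSON document.

```python
import csv
import io
import json

# A flat CSV file: every record has the same fixed columns.
csv_text = "id,name,city\n1,Ada,London\n"
flat_rows = list(csv.DictReader(io.StringIO(csv_text)))

# A semi-structured JSON record (hypothetical sample data):
# values can be objects or arrays, nested to any depth.
json_text = """
{
  "id": 1,
  "name": "Ada",
  "address": {"city": "London", "postcode": "N1"},
  "orders": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}]
}
"""
record = json.loads(json_text)

print(flat_rows[0]["city"])        # a single column lookup
print(record["address"]["city"])   # a value inside a nested object
print(record["orders"][1]["sku"])  # a value inside a nested array
```

The CSV row can only hold one value per column, while the JSON record carries a whole list of orders inside a single customer record.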

Benefits

Semi-structured formats adapt easily to new data: new fields can be added without changing a fixed schema, so data collection isn’t limited by the columns already defined in the data source. These formats have become more common with the rise of web-connected devices, which need an adaptable and lightweight way to exchange data.
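As a rough illustration of this adaptability, the sketch below parses two messages from imaginary devices. The device names and fields are assumptions made up for the example. The second message carries fields the first never had, and nothing needs to be redefined to accept it.

```python
import json

# Hypothetical device messages; the second adds fields the first never had.
messages = [
    '{"device_id": "t-1", "temp_c": 21.5}',
    '{"device_id": "t-2", "temp_c": 19.0, "humidity": 0.41, "firmware": {"version": "2.3"}}',
]

readings = [json.loads(m) for m in messages]

# Consumers read the fields they know about and treat the rest as optional.
humidities = [r.get("humidity") for r in readings]
print(humidities)
```

With a fixed-column file, the same change would typically mean adding columns and back-filling empty values for every older record.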

JSON for Snowflake

Snowflake is unusual in that it natively supports semi-structured formats such as Avro, JSON and XML alongside relational data, whereas many databases and data stores are built around only one of the two models. With Snowflake, users can choose to “flatten” nested objects into a relational table, or store objects and arrays in their native form in the VARIANT data type. JSON and other semi-structured data can then be queried or manipulated directly in Snowflake with ANSI-standard SQL, extended with path notation: a colon after the column name, followed by dot notation for nested keys.
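The two ideas in this paragraph, path-style access and flattening, can be sketched in plain Python. This is only an illustration of the concepts, not Snowflake's implementation or API. The helper names get_path and flatten_orders and the sample document are hypothetical.

```python
from typing import Any, Iterator


def get_path(value: Any, path: str) -> Any:
    """Follow a dot-separated path such as 'address.city'; return None if a key is missing."""
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def flatten_orders(doc: dict) -> Iterator[dict]:
    """Yield one flat, relational-style row per element of the nested 'orders' array."""
    for item in doc.get("orders", []):
        yield {
            "customer_id": doc.get("id"),
            "city": get_path(doc, "address.city"),
            "sku": item.get("sku"),
            "qty": item.get("qty"),
        }


# Hypothetical nested document, as it might be stored in its native form.
doc = {
    "id": 1,
    "address": {"city": "London"},
    "orders": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}],
}

rows = list(flatten_orders(doc))
for row in rows:
    print(row)
```

Each element of the nested array becomes its own row, with the parent's values repeated alongside it. That is the general shape of a flattened result, whichever tool produces it.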