What is semi-structured data? 

In contrast to flat files like CSVs, which organize data in relational “columns and rows”, semi-structured files store data in a nested format, where a value can itself contain other objects and arrays.
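As a rough illustration, the Python sketch below represents the same order once as flat CSV rows and once as a nested JSON document. The order and its field names (order_id, customer, items) are made up for this example.

```python
import csv
import io
import json

# Hypothetical order stored flat: one row per item, customer name repeated.
flat_csv = """order_id,customer_name,item_sku,item_qty
1001,Ada,A-1,2
1001,Ada,B-7,1
"""
flat_rows = list(csv.DictReader(io.StringIO(flat_csv)))

# The same order as a nested document: the items live inside the order.
nested_json = """
{
  "order_id": 1001,
  "customer": {"name": "Ada"},
  "items": [
    {"sku": "A-1", "qty": 2},
    {"sku": "B-7", "qty": 1}
  ]
}
"""
nested = json.loads(nested_json)

print(len(flat_rows), "flat rows")
print(len(nested["items"]), "items nested in one document")
```

In the flat version the customer name is repeated on every row, while the nested version states it once and keeps the items as an array inside the order.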

The benefits of semi-structured data

Semi-structured formats adapt easily to new data, so data collection is not limited to the columns already defined in the data source. Semi-structured data has become more common with the rise of web-connected devices that need an adaptable, lightweight way to exchange data.
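To make the adaptability concrete, here is a minimal Python sketch in which a second device starts reporting an extra field without any change to a predefined set of columns. The device names and fields are hypothetical.

```python
import json

# Hypothetical device readings; the second record adds a field the first lacks.
lines = [
    '{"device": "thermostat-1", "temp_c": 21.5}',
    '{"device": "thermostat-2", "temp_c": 19.0, "battery": {"level": 0.82}}',
]
records = [json.loads(line) for line in lines]

# Each record carries its own keys, so the new field needs no schema change.
all_keys = sorted({key for record in records for key in record})

# Readers decide how to handle fields that a given record does not have.
battery_levels = [r.get("battery", {}).get("level") for r in records]

print(all_keys)
print(battery_levels)
```

The trade-off is that consumers of the data have to handle missing fields explicitly, as the `.get` calls do here.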

Semi-structured data for Snowflake

Snowflake natively supports semi-structured formats such as Avro, JSON and XML alongside relational data, whereas many databases and data stores are built around a single format. With Snowflake, users can choose to “flatten” nested objects into a relational table, or store the objects and arrays in their native form in the VARIANT data type. JSON and other semi-structured data can then be queried or manipulated directly in Snowflake with ANSI-standard SQL, extended with path notation: a colon for the first-level key and dot notation for the keys nested below it.
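In Snowflake these operations are written in SQL. The plain-Python sketch below only mimics the two ideas, looking up a value by a dotted path and flattening a nested array into one row per element. It does not use any Snowflake API, and the helper names get_path and flatten_items, as well as the sample order, are invented for illustration.

```python
import json
from functools import reduce

# Hypothetical nested order document.
doc = json.loads("""
{
  "order_id": 1001,
  "customer": {"name": "Ada", "city": "London"},
  "items": [
    {"sku": "A-1", "qty": 2},
    {"sku": "B-7", "qty": 1}
  ]
}
""")

def get_path(value, path):
    """Follow a dotted path such as 'customer.name' through nested dicts."""
    return reduce(lambda current, key: current[key], path.split("."), value)

def flatten_items(order):
    """Produce one flat, table-like row per element of the 'items' array."""
    return [
        {
            "order_id": order["order_id"],
            "customer_name": get_path(order, "customer.name"),
            "sku": item["sku"],
            "qty": item["qty"],
        }
        for item in order["items"]
    ]

rows = flatten_items(doc)
for row in rows:
    print(row)
```

Keeping the document nested preserves its original shape, while the flattened rows are easier to join with ordinary relational tables. Which one fits better depends on how the data will be queried.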

To learn more about semi-structured data, check out our blog post, “How Snowpipe Streamlines Your Continuous Data Loading and Your Business”.
