What is JSON?

JSON (JavaScript Object Notation) is an open-standard, text-based format for interchanging semi-structured data. Semi-structured data is machine data that originates from a wide variety of sources and devices, including mobile phones, web browsers, servers, and IoT devices. This data is collected as messages called events, which can be organized into batches and fed to a data platform through a data pipeline.

JSON can be used in a multitude of applications, but it is most common as a format for transmitting data between servers and web applications or web-connected devices. This is partly because those applications can often only receive data as text, and JSON is text based. Although JSON originated in JavaScript, most contemporary programming languages can now parse and generate it. JSON is widely used because it is a lightweight data-interchange format that is easy for both humans and machines to read and write. It is often considered an alternative to XML for semi-structured data, since it can represent objects more compactly.
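
As a rough illustration of parsing and generating JSON, the sketch below uses Python's standard json module. The event text and its field names (device, event, position, tags) are invented for this example and do not come from any particular system.

```python
import json

# Hypothetical event, as a browser or device might send it as text.
raw = '{"device": "phone", "event": "click", "position": {"x": 10, "y": 20}, "tags": ["ui", "button"]}'

# Parse: JSON text becomes ordinary Python dicts, lists, strings and numbers.
event = json.loads(raw)
x = event["position"]["x"]

# Generate: Python values become JSON text again.
# The separators argument drops the optional whitespace for a compact payload.
compact = json.dumps(event, separators=(",", ":"))

# Parsing the generated text gives back the same structure.
round_trip = json.loads(compact)
```

Other languages offer equivalent parse and generate calls, so the same text can be produced by one program and consumed by another.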



In comparison to flat files like CSVs, which arrange data in relational “columns and rows”, JSON files store data in nested Objects and Arrays, which themselves contain values. This structure is highly adaptable to the addition of new data: a record can carry a new field without any change to a fixed set of columns, so the collection of data doesn’t need to be limited by the columns within the data source.
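
To make the nested structure concrete, here is a small Python sketch with two invented events. The field names (id, device, readings, battery) are assumptions for illustration only. The second event carries a field the first one lacks, which a fixed set of columns would not accommodate without a change.

```python
import json

# Two hypothetical events. Only the second one has a "battery" field.
events_text = """[
  {"id": 1, "device": {"type": "phone", "os": "android"}, "readings": [20.5, 21.0]},
  {"id": 2, "device": {"type": "sensor", "os": "rtos", "battery": 87}, "readings": [19.8]}
]"""

events = json.loads(events_text)

# Nested Objects are reached key by key, Arrays by index.
first_os = events[0]["device"]["os"]
last_reading = events[0]["readings"][-1]

# A field that a record does not have is simply absent; .get() returns None.
batteries = [e["device"].get("battery") for e in events]
```

Here the new field appears only where it exists, and older records stay valid as they are.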


Analytics for Semi-Structured Data with Snowflake

Snowflake is unusual in that it can natively support JSON (and other semi-structured data) alongside relational data. Most databases and data stores support only one of the two. With Snowflake, users can choose to “flatten” nested objects into a relational table, or store the Objects and Arrays in their native format within the VARIANT data type. Semi-structured data can be manipulated with ANSI-standard SQL, with the addition of dot notation for reaching nested values.
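
The idea of flattening a nested record into relational-style columns can be sketched in plain Python. This is only an illustration of the concept: it is not Snowflake's FLATTEN function, and the dotted column names it produces are a convention chosen for this example rather than Snowflake's path syntax. The helper name flatten and the sample record are hypothetical.

```python
def flatten(value, prefix=""):
    """Return {dotted_path: scalar} for a nested dict/list structure."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # Array elements are addressed by their position.
        items = enumerate(value)
    else:
        # A scalar ends the path.
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


# Hypothetical nested record, as it might be stored in its native form.
record = {"id": 2, "device": {"type": "sensor", "battery": 87}, "readings": [19.8, 20.1]}

# One flat "row": each nested value gets its own dotted column name.
row = flatten(record)
```

Keeping the record nested preserves its original shape, while the flattened row is easier to line up against relational columns. Which one suits a workload better depends on how the data is queried.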