Rethink What You Know About Creating a Data Lake for JSON
Aug 21, 2017
Author: Michael Nixon
How to Use Snowflake, Snowflake for Industry
Over the last 10 years, the notion has been that to quickly and cost-effectively gain insights from a variety of data sources, you need a Hadoop platform. Sources of data could be weblogs, clickstreams, events, IoT and other machine-born JSON or semi-structured data. The proposition with Hadoop-based data processing is having a single repository (a data lake) with the flexibility, capacity and performance to store and analyze an array of data types. In other words, a data lake for JSON.
It shouldn’t be complicated
In reality, analyzing data with a Hadoop-based platform is not simple. Hadoop platforms start you with HDFS, or an equivalent file system. You then must piece together at least a half-dozen software packages just to provide basic enterprise-level functionality, such as provisioning, security, system management, data protection, database management and the interface needed to explore and query data.
Despite the efforts of open-source communities to provide tools that bring Hadoop platforms up to an enterprise-class level, there is a constant need for highly skilled people to keep Hadoop up and running while enabling users to do more than just explore data. This all adds up to unnecessary complexity.
A much simpler proposition for creating a data lake for JSON
Snowflake, which is built for the cloud and delivered as a service, provides you with a different option for handling JSON and semi-structured data. Just point your data pipelines to Snowflake, land the data in our elastic storage repository and you have instant access to a bottomless data lake. You also have access to a full-fledged data warehouse. With Snowflake, you can easily load JSON and query it with robust, relational SQL. You can mix JSON with traditional structured data and data from other sources, all within the same database. You can also run many concurrent analytic workloads and work groups against the same JSON data. Whether it is one workload or 1,000 workloads, Snowflake can handle it all with ease.
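To make the idea of querying JSON with SQL alongside structured data concrete, here is a minimal sketch. It is an illustration under stated assumptions, not Snowflake code: Python's built-in sqlite3 module (with SQLite's JSON functions, which most builds include) stands in for the SQL engine, and the table names, column names and sample records are all hypothetical. Snowflake's own syntax for semi-structured data differs from what is shown here.

```python
import json
import sqlite3

# Hypothetical sample data: raw JSON events plus a structured lookup table.
events = [
    {"device": {"id": "d1"}, "reading": {"temp_c": 21.5}},
    {"device": {"id": "d2"}, "reading": {"temp_c": 30.0}},
    {"device": {"id": "d1"}, "reading": {"temp_c": 22.5}},
]
devices = [("d1", "Berlin"), ("d2", "Austin")]

# SQLite is only a stand-in here; it is not how Snowflake stores JSON.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (doc TEXT)")
conn.execute("CREATE TABLE devices (device_id TEXT PRIMARY KEY, city TEXT)")

# Land each JSON document as-is, one document per row, with no upfront schema.
conn.executemany(
    "INSERT INTO raw_events VALUES (?)",
    [(json.dumps(event),) for event in events],
)
conn.executemany("INSERT INTO devices VALUES (?, ?)", devices)

# Query inside the JSON with SQL and join it to the structured table.
rows = conn.execute(
    """
    SELECT d.city,
           AVG(json_extract(e.doc, '$.reading.temp_c')) AS avg_temp_c
    FROM raw_events AS e
    JOIN devices AS d
      ON d.device_id = json_extract(e.doc, '$.device.id')
    GROUP BY d.city
    ORDER BY d.city
    """
).fetchall()

for city, avg_temp_c in rows:
    print(city, avg_temp_c)
```

The point of the sketch is the shape of the workflow: the JSON is loaded without being flattened first, and a single SQL statement both reaches into the documents and joins them to relational data.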
As a combined data lake and data warehouse platform, Snowflake allows you to do much more. Read more about it in our new eBook, Beyond Hadoop: Modern Cloud Data Warehousing.
Try Snowflake for free. Sign up and receive $400 of free usage credits. You can create a sandbox or launch a production implementation from the same Snowflake account.
Rethink what you’ve been told about the data lake for JSON
In order to gain insights from JSON or other machine data, Hadoop is not a prerequisite
When you need to store, warehouse and analyze JSON and other machine data, rethink what you’ve been told. Snowflake makes it easy to develop insights or uncover relationships that can drive your business forward. You can support all of your structured and semi-structured data warehousing and analytic workloads with a single tool that is built for the cloud and is ACID-compliant. Whereas operating a Hadoop platform often requires special skills, Snowflake is a fully relational SQL environment that uses the familiar semantics and commands known to millions of SQL users and programmers, and supported by thousands of SQL tools.