During the past year, many businesses relied on a curated data set from Starschema for essential information about the impact and spread of COVID-19. On a recent episode of our Rise of the Data Cloud podcast, Starschema Co-founder and CEO Tamas Foldi joined us to discuss the power of data sharing and how his company uses Snowflake Data Marketplace to distribute this valuable information.
Throughout the pandemic, companies were seeking a reliable source of data to help them make key decisions about the welfare of their employees and their businesses. Starschema answered the call in March 2020 with a free-of-charge, curated package of trusted COVID-19 data that is based on public information and made available through Snowflake Data Marketplace.
This was Starschema’s first foray into providing a data set, and it’s something the company takes very seriously. It has continued to maintain, update, and add to the information every day since the data set was released. “When there’s a crisis, you should think about how you could most help others,” Foldi told us. “This was the most straightforward and obvious way for us to do that.”
At the start of the pandemic, Foldi was approached by two friends who were Tableau Community Forum members and wanted his help building a COVID-19 data hub. The friends wanted to set up robust, easily accessible dashboards in Tableau to track COVID-19 cases around the world, and Starschema was able to help them do it.
In the process, Foldi realized that accessing the data for those dashboards was much more difficult than it should have been, and organizations needed an easier way to do it. He immediately thought of Snowflake Data Marketplace as an easy-to-use platform for sharing data among different organizations. “By using Snowflake Data Marketplace, we can be sure that all the data will be there in a clean and ready-to-use format and can be accessed with a single click,” he said.
Starschema set up the COVID-19 data set on a public GitHub repository and distributed it through Snowflake Data Marketplace to hundreds, and eventually thousands, of Snowflake customers.
The data set brings together a wide range of information, including epidemiological data on morbidity and mortality rates from the World Health Organization and smaller, national authorities; testing data from the Centers for Disease Control and Prevention; and, more recently, vaccination data. Other data adds further useful context, such as demographic information and ICU bed capacity.
Having all that data in one place makes it much easier for organizations to consume. “You have a really organized, easy-to-join data set where you can match the different sources and join and blend them together with your own data,” Foldi said. When an organization combines this information with data from its own ERP, CRM, and other systems, it can uncover insights that it would not otherwise be able to find.
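To make the idea of joining and blending concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the field names (`region`, `new_cases`, `headcount`), the helper functions, and every value are hypothetical, and they do not reflect the actual schema or contents of the Starschema data set.

```python
from collections import defaultdict

# Hypothetical rows standing in for a shared COVID-19 table.
# Field names and values are made up for illustration only.
covid_rows = [
    {"region": "HU", "date": "2021-03-01", "new_cases": 120},
    {"region": "HU", "date": "2021-03-02", "new_cases": 150},
    {"region": "US-CA", "date": "2021-03-01", "new_cases": 900},
]

# Hypothetical internal data, such as office locations exported from an HR system.
offices = [
    {"office": "Budapest HQ", "region": "HU", "headcount": 80},
    {"office": "San Francisco", "region": "US-CA", "headcount": 25},
    {"office": "Berlin", "region": "DE", "headcount": 10},
]


def cases_by_region(rows):
    """Sum new cases per region across all dates."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["new_cases"]
    return dict(totals)


def join_offices_with_cases(office_rows, case_rows):
    """Left join: keep every office and attach the regional case total.

    Offices whose region is missing from the case data get None.
    """
    totals = cases_by_region(case_rows)
    return [
        {**office, "total_cases": totals.get(office["region"])}
        for office in office_rows
    ]


joined = join_offices_with_cases(offices, covid_rows)
for row in joined:
    print(row["office"], row["headcount"], row["total_cases"])
```

The shared region key is what makes the blend possible. In a Snowflake setting the same step would more likely be a SQL join between the shared table and an internal one, but the shape of the operation is the same.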
Starschema has seen three types of use cases during the pandemic, Foldi said. At the beginning, when the data set was primarily about the number of COVID-19 cases and mortalities, companies were using it to determine which personnel and facilities to evacuate.
The second wave of use cases focused on joining the COVID-19 data with external data, such as mobility indicators derived from anonymized cell phone data. For example, a large music label wanted to analyze the impact of the pandemic on streaming music consumption, since fewer people were commuting to work and listening to music in their cars.
Today, many companies are using the Starschema data to determine when it’s safe to reopen offices and other facilities as well as when consumer demand for in-person services is likely to pick up. They’re looking particularly closely at vaccination rates and the associated impact on risk.
In terms of lessons learned, Foldi talked about the power of community and the broader use cases for a marketplace like Snowflake’s. A marketplace is useful for sharing data not just externally with partners and customers but also internally at large organizations, he noted.
“Publishing your data inside your organization provides huge value,” he said. “We have started to encourage our own clients to use Snowflake to build internal data exchanges or data marketplaces, where different functions and divisions can exchange data in a really sophisticated way.”
Rise of the Data Cloud is a podcast hosted by award-winning author and journalist Steve Hamm. For each episode, Steve speaks with a data leader to learn how they leverage the cloud to manage, share, and analyze data to drive business growth, fuel innovation, and disrupt their industries. You can listen to more episodes here.