At the Snowflake Healthcare and Life Sciences Forum 2021, Snowflake connected with Blake Valenta from the City of San Francisco and Jason Labonte from Datavant to hear how the Bay Area used data science in its fight against the COVID-19 pandemic.
The COVID-19 pandemic disrupted every aspect of day-to-day life, with businesses of all sizes trying to adapt to new ways of working and serving their customers. The dramatic shifts were also evident in the field of data science, where leaders in IT and analytics scrambled to get ahead of the cascade of novel challenges.
The City of San Francisco offers a great example of how a willingness to adapt can have lasting impacts. When the pandemic started, the city created an interagency hub called the Emergency Operations Center (EOC). One of its biggest early challenges was the critical shortage of personal protective equipment (PPE), including masks, gloves, gowns, and other items, that Bay Area hospitals desperately needed. The city centralized purchasing through the EOC, but different organizations used different ordering and tracking systems, each with its own schema.
“It was difficult to get a bird’s-eye view of how much was on hand, how much was the right medical grade, and how much more we needed to purchase,” said Blake Valenta, an Analyst Strategist at DataSF, the colloquial name for the Office of the Chief Data Officer.
His team was brought in to fix it. They were already looking at the Snowflake Data Cloud, and the outbreak convinced them that it was time to adopt the solution.
“Using the Snowflake Data Cloud, we were able to bring the data together and harmonize it so that decision-makers could move forward and solve San Francisco’s PPE problem. We stood it up in less than a week.”
—Blake Valenta, Analyst Strategist, DataSF
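To make the harmonization step concrete, here is a minimal Python sketch of mapping records from differently structured inventory systems onto one shared schema and then totaling stock on hand. It is an illustration only, not DataSF's actual pipeline: the source names (`hospital_a`, `warehouse_b`), the field names, and the helper functions are all hypothetical.

```python
from __future__ import annotations

from typing import Any

# Hypothetical mappings: for each source system, source field -> shared field.
FIELD_MAPS = {
    "hospital_a": {"item": "item_type", "qty_on_hand": "quantity", "grade": "medical_grade"},
    "warehouse_b": {"ppe_category": "item_type", "units": "quantity", "rating": "medical_grade"},
}


def harmonize(source: str, record: dict[str, Any]) -> dict[str, Any]:
    """Rename a source record's fields to the shared schema and normalize values."""
    mapping = FIELD_MAPS[source]
    row = {shared: record[src] for src, shared in mapping.items()}
    row["item_type"] = str(row["item_type"]).strip().lower()
    row["quantity"] = int(row["quantity"])
    row["source"] = source  # keep provenance so totals can be traced back
    return row


def total_on_hand(rows: list[dict[str, Any]]) -> dict[str, int]:
    """Sum quantities per item type across all harmonized rows."""
    totals: dict[str, int] = {}
    for row in rows:
        totals[row["item_type"]] = totals.get(row["item_type"], 0) + row["quantity"]
    return totals
```

Once every source is expressed in the same fields, the bird's-eye questions (how much is on hand, and of what grade) become simple aggregations over one table rather than separate lookups in each system.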
A Single Medical Research Environment
Just down the street, in San Francisco’s financial district, is Datavant, a health technology company focused on solving data connectivity and data fragmentation problems in healthcare. One of the biggest industry challenges is that every time patients see their doctors, their data is captured in different electronic health record systems. Lab test results are stored in lab systems, prescriptions in pharmacy systems, and so on. The data is there, but it’s siloed.
Jason Labonte, Datavant’s Chief Strategy Officer, knew that Snowflake makes it possible to capture that data, privately exchange it, and link it downstream. Since early 2020, he and his team have been working on a project with Snowflake called the COVID-19 Research Database, a single repository of real-world medical data.
At first, Datavant didn’t have a national data infrastructure that allowed it to pull all this data into a single environment, but it knew where the data was and who owned it. “We brought together a coalition of the willing,” Labonte said, “from organizations that had the data and organizations like Snowflake that had the technology to create that database. It was a tremendous undertaking, and we stood it up in 60 days.”
The database includes data from 13 private sources and 55 public data sets—health records, medical claims, consumer data, global mobility data, healthcare capacity by state, and more—all loaded into a single research environment. It’s governed by a privacy and security layer built with Snowflake components to help ensure that all the data is deidentified according to HIPAA regulations.
“We have a large repository of data that’s very secure and nicely set up in the Snowflake environment, including a wiki with data layouts and descriptions of the data sets. And it’s available pro bono to public health researchers working on the pandemic. So far we have more than 2,000 registered users.”
—Jason Labonte, Chief Strategy Officer, Datavant
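The idea of linking siloed records without exposing who they belong to can be sketched in a few lines of Python. This is an illustration, not Datavant's method or the database's actual privacy layer: the key, field names, and functions are hypothetical, and real de-identification under HIPAA involves far more than dropping a few fields. The sketch replaces direct identifiers with a keyed hash (HMAC-SHA256) so that records for the same person from different systems share a token.

```python
import hashlib
import hmac

# Hypothetical secret; a real system would manage keys securely, not hard-code them.
SECRET_KEY = b"demo-key-not-for-production"

IDENTIFIER_FIELDS = ("first_name", "last_name", "dob")


def make_token(first: str, last: str, dob: str) -> str:
    """Derive a stable token from normalized identifiers using a keyed hash."""
    normalized = "|".join(part.strip().lower() for part in (first, last, dob))
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict) -> dict:
    """Drop the direct identifiers and attach the token in their place."""
    token = make_token(record["first_name"], record["last_name"], record["dob"])
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
    cleaned["patient_token"] = token
    return cleaned


def link(*datasets: list) -> dict:
    """Group de-identified records from any number of sources by token."""
    linked: dict = {}
    for dataset in datasets:
        for record in dataset:
            linked.setdefault(record["patient_token"], []).append(record)
    return linked
```

Because the token is computed the same way in every source, a lab result and a prescription for the same patient end up under one key, while names and birth dates never enter the shared environment.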
A Single Trusted Source of Information
After DataSF solved the city’s PPE problem, it embarked on a mission similar to Datavant’s. Data from fragmented sources needed to be unified and standardized, not only for the city’s internal departments but also for the general public. Community organizations, journalists, and residents needed to be brought into the conversation, and everybody needed to be on the same page.
“We use Snowflake to take data from departments and centralize it into a catalog that analysts from each department can use to build the reports and dashboards they need. We had to come up with data governance standards about what could be released and what couldn’t, but the internal and external views are based on the same data. Residents of San Francisco, community organizations, and journalists can depend on it too. Our entire community is able to look at the same trusted data sources and move forward together as efficiently and safely as possible.”
—Blake Valenta, Analyst Strategist, DataSF
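One way to picture internal and external views built on the same data is a release policy applied at read time. The Python sketch below is a hypothetical illustration, not DataSF's governance implementation: the field names and the `RELEASE_POLICY` table are invented for the example. Fields missing from the policy are withheld from every audience by default.

```python
# Hypothetical policy: which audience may see each field.
RELEASE_POLICY = {
    "date": "public",
    "neighborhood": "public",
    "case_count": "public",
    "facility_name": "internal",
    "notes": "internal",
}


def view(rows: list, audience: str) -> list:
    """Return the rows restricted to the fields the audience may see."""
    if audience not in ("internal", "public"):
        raise ValueError(f"unknown audience: {audience!r}")
    allowed = {
        field
        for field, level in RELEASE_POLICY.items()
        if audience == "internal" or level == "public"
    }
    # Fields absent from the policy are never in `allowed`, so they are dropped.
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]
```

Both views read the same underlying rows, so a public dashboard and an internal report cannot drift apart; only the set of visible fields differs.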
Datavant and the City of San Francisco aren’t directly working together, but it’s no accident that they’re working in parallel and solving some of the same problems in the same way using Snowflake.
“Many people in big data look at artificial intelligence and machine learning as the next big thing, but the low-hanging fruit is the tons of data we already have sitting in inaccessible siloes. If we can solve that problem, we’ll get a lot of early momentum. We just need technology partners like Snowflake that make it easy to move large data sets into accessible environments and to share that data while protecting security and privacy.”
—Jason Labonte, Chief Strategy Officer, Datavant