Ginkgo Bioworks Improves Experiment Success and Scales Biology With Near Real-Time Data

By replicating experiment data into Snowflake, Ginkgo’s scientists spot issues and correct course within minutes, speed analyses without tying up costly equipment, and deliver greater value to customers.



Minutes to identify experiment issues, instead of days


Ginkgo employees with near-instant access to data

Healthcare & Life Sciences
Boston, MA

Data: a key ingredient to scale biology

Ginkgo Bioworks is focused on making biology easier to engineer. As the leading platform for cell programming, Ginkgo provides flexible, end-to-end services that solve challenges for organizations across diverse markets — from food and agriculture to pharmaceuticals to industrial and specialty chemicals.

Experimentation is at the core of Ginkgo’s ability to deliver innovative solutions for customers. Effectively using experiment data is essential for keeping projects on track, but that’s difficult to do when experiments span multiple days and involve hundreds of microbial samples. Snowflake’s Data Cloud makes it easier for Ginkgo to extract, join and use experiment data — ultimately ensuring more successful projects.

“Ginkgo’s mission is to make biology easier to engineer, and we believe that scaling these experiments through automation will help us deliver results faster, cheaper and more reliably. That’s a huge part of the role of Snowflake.”

Aleksey Yeremenko
Director, Data Platform & Engineering, Ginkgo Bioworks

Story Highlights
  • Faster access to data for scientists, fewer bottlenecks: With Snowflake, Ginkgo’s scientists can adapt more quickly to experiment issues and avoid tying up chemistry instruments during their analyses.

  • Stronger value proposition for customers: Near real-time experiment data helps Ginkgo quickly determine if a project is likely to succeed, resulting in greater value for customers.

  • Data-driven decision-making: Ginkgo builds BI dashboards and provides Snowflake access to more than 1,000 team members to improve everything from financial planning and instrument utilization to quality assurance and procurement decisions.

Expediting scientific analysis and freeing up costly equipment

Developing a microorganism that’s capable of benefiting humanity is a complex endeavor, both scientifically and from a data perspective. “Each experiment brings something new and unique to this data structure,” says Aleksey Yeremenko, Director of Data Platform and Engineering at Ginkgo Bioworks.

Prior to Snowflake, scientists accessed experiment data directly from Ginkgo’s analytical chemistry instruments, which led to costly project delays. According to Yeremenko, “Sometimes people were waiting a week for experiment results because these instruments are very expensive and in high demand.” Siloed analysis also inhibited collaboration among Ginkgo’s biology teams.

Ginkgo tested multiple database technologies but struggled to find a flexible and cost-effective solution for its data needs. “Data doesn’t flow constantly in biology. It takes time for an organism to grow and for instruments to start generating data,” Yeremenko says. “When we started, we did not need a warehouse running all the time.” Ginkgo’s high volume of diverse data adds to the complexity. The company’s database, for example, contains billions of biological sequences — some of which are millions of characters long.

Seeking to accelerate access to experiment data and simplify scientific analysis, Ginkgo turned to Snowflake.

“We’re shifting the focus toward solving the challenges of science, not the challenges of software.”

Aleksey Yeremenko
Director, Data Platform & Engineering, Ginkgo Bioworks

Delivering better customer experiences while keeping pace with 10x data growth

Centralizing and joining experiment data with other data sets in Snowflake enhances scientific analysis. Connecting Ginkgo’s visualization and data science tools to Snowflake enables scientists to perform their analyses without wrangling data in CSV files or tying up instruments. Sharing analytical results back to the Data Cloud helps Ginkgo avoid data loss and expedites time to insight. “It’s a huge time savings for us because now the analysis can be completed within a day of the experiment’s completion,” Yeremenko says. 

Near real-time replication of experiment data into Snowflake frees up capacity for Ginkgo’s equipment and feeds scientists with timely data to make adjustments in minutes — instead of days. According to Yeremenko, “If something goes wrong with the experiment, they’re able to stop and rearrange.” For example, detecting contamination on an instrument can save a tremendous amount of time and money.


“Getting data off the instruments quickly, then analyzing and enriching it with metadata is the biggest value of having all this data available in Snowflake.”

Aleksey Yeremenko
Director, Data Platform & Engineering, Ginkgo Bioworks

Since implementing Snowflake, Ginkgo’s data volumes have grown exponentially — increasing by about 10x every three years. Snowflake’s elastic performance engine and optimized storage have made it possible for Ginkgo to keep pace. “I’m really happy with the Snowflake architecture, which allows us to increase flexibility and spend our dollars to power science,” Yeremenko says. “Snowflake’s ability to scale with our needs has tremendous value.”

Achieving a reliable, scalable data environment helps Ginkgo effectively leverage data, which leads to better customer experiences. “Now we have the confidence to know early on if we can deliver on a project,” Yeremenko says. “For customers, it’s a much better value proposition than incrementally spending and learning it didn’t work.”

Fostering healthier business decisions by democratizing data for 1,000+ team members

More than 1,000 people at Ginkgo have access to the Data Cloud, and hundreds regularly use Snowflake to run a variety of queries. According to Yeremenko, “Almost everybody at Ginkgo has access to Snowflake.”

Data stored in Snowflake feeds numerous Tableau dashboards that Ginkgo uses to support resource and financial planning, instrument utilization, quality assurance and procurement decisions. Ginkgo’s biology teams use dashboards to understand demand and prioritize tasks. Lab staff rely on operations dashboards to plan each day’s work and monitor instrument status. Financial teams enjoy greater visibility to inform future investments.

Collaborating around data to unlock future breakthroughs

Snowflake Secure Data Sharing will play an important role in streamlining Ginkgo’s interactions with partners moving forward. For example, Ginkgo used Snowflake Secure Data Sharing to collaborate with an AI partner while developing a vaccine ingredient. “Without data sharing, it would have involved moving huge amounts of data through SFTP,” Yeremenko says. “Instead, we configured a data share so they could query the information and do their own training based on the data — without moving it.”

Data applications built with Streamlit in Snowflake will help Ginkgo’s scientists spend less time running SQL queries and more time on science. According to Yeremenko, “The value of Streamlit in Snowflake is that I don’t need to allocate resources separately or worry about the deployment model, framework, security and access control.” 

Ginkgo is also exploring using the Dynamic Tables feature to simplify data engineering and process data in the time it takes to change a lab coat. Yeremenko says, “The sooner our scientists can access their experiment data in Snowflake, the faster they can iterate and bring value to our customers.”
