Thought leadership
August 15, 2023

A brief history of Snowflake

Here's a brief history of one of the powerhouse organizations bringing change and innovation to the data space: Snowflake.

Liz Elfman

Snowflake Inc., founded in 2012 and based in San Mateo, California, is a cloud-based data warehousing company. Its architecture separates compute from storage, enabling businesses to scale each independently as their data needs change. Companies use Snowflake to optimize cost, performance, and flexibility.

The company has experienced rapid growth since its inception, thanks to its innovative approach to data warehousing. Snowflake went public in 2020 in what was one of the largest software IPOs at the time. Among other factors, the company's success can be attributed to its robust product offering, user-friendly platform, and strategic leadership decisions.

Snowflake's impact on the data industry and data observability has been substantial. By offering a cloud-native, scalable, and easy-to-use data warehousing solution, Snowflake has democratized access to data analytics for businesses of all sizes. Newer features, like real-time data sharing and the ability to process diverse data types, have pushed the boundaries of what's possible in the realm of data analytics. Moreover, Snowflake's success has sparked increased competition in the cloud data warehousing space, leading to faster innovation and better products for customers. Here's a stroll through the company's history and a look at how it has shaped the data sector.

2012: Founding and public unveiling

Snowflake Computing, now known simply as "Snowflake", was founded in 2012 by three data warehousing experts: Benoit Dageville, Thierry Cruanes, and Marcin Zukowski.

Dageville and Cruanes had previously worked at Oracle as data architects, where they witnessed firsthand the limitations of traditional on-premise and cloud data platforms. Zukowski was the co-founder of Vectorwise, a database company that was acquired by Actian. All three recognized the constraints of existing solutions, including scalability issues, complex management, and the inability to effectively handle a growing volume of data generated by businesses.

The founders envisioned a data warehouse built for the cloud from the ground up: one that could leverage the elasticity of cloud compute and storage to provide dynamic, scalable data storage and analytics.

They wanted Snowflake to be accessible and user-friendly, ensuring that businesses of all sizes could analyze their data without needing to manage complex hardware or software configurations. Their goal was to separate compute from storage, allowing users to scale each independently based on their needs, a key differentiator for Snowflake.

Snowflake was kept in stealth mode until June 2015, when it unveiled its cloud data warehousing platform. The platform was met with enthusiasm from businesses seeking to leverage their data for insights, driving Snowflake's growth in the following years.

Separating storage and compute

Snowflake's architecture is unique because it separates storage and compute resources, a major departure from traditional data warehouse design where these are intrinsically linked. This separation is crucial for several reasons:

  1. Scalability: By separating compute from storage, Snowflake allows users to scale each independently. Users can store a virtually unlimited amount of data (bounded only by their cloud provider's capacity) and scale compute resources up or down depending on the power their queries require. Large workloads can be processed quickly by adding more compute, and idle warehouses can be scaled down or turned off to save costs (the Python sketch after this list shows a warehouse being resized and suspended independently of the data it stores).
  2. Performance: The separation also ensures that heavy queries don't slow down the system. Each user or workload gets its own dedicated resources, preventing resource contention. A large, complex query won't slow down a small, simple one, improving overall system performance.
  3. Cost-effectiveness: With Snowflake, you pay for storage and computation separately. If your data is large but your queries are infrequent or simple, you can save money by keeping compute resources low. Conversely, if your data size is small but the complexity and frequency of queries are high, you can opt for more compute power. This flexibility results in cost savings and optimal resource utilization.
  4. Concurrency and accessibility: Traditional data warehouses struggle with many users accessing the system concurrently. In Snowflake's model, multiple compute nodes simultaneously access the same data without performance degradation, supporting high levels of concurrency.
  5. Flexibility and ease of use: Snowflake handles all aspects of operations, like data distribution, data partitioning, and query optimization, in a way that's transparent to the end-user. Users don't need to manage indices, partition data, or perform other administrative tasks common in other systems. They just load their data and start querying.
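
To make the storage/compute split concrete, here is a minimal sketch using the snowflake-connector-python package: it resizes and suspends a virtual warehouse without touching the data stored underneath. The account credentials and warehouse name are placeholders, not values from this article.

```python
# pip install snowflake-connector-python
import snowflake.connector

# Connection details are placeholders; use your own account identifiers.
conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="my_password",
)
cur = conn.cursor()

# Create a small warehouse (compute) that pauses itself when idle.
cur.execute("""
    CREATE WAREHOUSE IF NOT EXISTS reporting_wh
      WAREHOUSE_SIZE = 'XSMALL'
      AUTO_SUSPEND = 60
      AUTO_RESUME = TRUE
""")

# Scale compute up for a heavy workload; stored data is unaffected.
cur.execute("ALTER WAREHOUSE reporting_wh SET WAREHOUSE_SIZE = 'LARGE'")

# Suspend the warehouse to stop paying for compute; storage persists.
cur.execute("ALTER WAREHOUSE reporting_wh SUSPEND")
```

Because a suspended warehouse stops accruing compute charges while its storage remains intact, compute spend tracks query activity rather than data volume.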

Early leadership

In 2014, Bob Muglia was appointed CEO of the company, a position he held until 2019. During his tenure at Snowflake, Muglia played a pivotal role in driving the company's growth and establishing its foothold in the cloud data warehousing market.

One of Muglia's significant contributions was sharpening Snowflake's value proposition: a cloud-native data warehouse that separates storage from compute. Muglia also championed the user-friendliness of Snowflake's platform, pushing for a system that required less maintenance and was easy for data analysts and data scientists to use. The strategy proved successful, and the number of users adopting Snowflake's technology skyrocketed.

Fundraising

In October 2015, shortly after the public unveiling of its product, Snowflake raised $45 million in a Series C funding round. The company continued to attract significant interest from venture capital, securing a $100 million Series D round in 2017.

  1. Series A - 2012: Snowflake raised $5 million in its initial round of funding. The round was led by Sutter Hill Ventures.
  2. Series B - 2014: The company secured $26 million in a Series B round led by Redpoint Ventures, with participation from Sutter Hill Ventures.
  3. Series C - 2015: Snowflake raised $45 million in a Series C round led by Altimeter Capital, with participation from Redpoint Ventures, Sutter Hill Ventures, and Wing Ventures.
  4. Series D - 2017: The company secured $100 million in a Series D round. ICONIQ Capital led the round, with participation from Madrona Venture Group, Redpoint Ventures, Sutter Hill Ventures, Wing Ventures, and Altimeter Capital.
  5. Series E - 2018: Snowflake raised a hefty $263 million in a Series E round, which valued the company at $1.5 billion. This round was led by ICONIQ Capital, Altimeter Capital, and Sequoia Capital.
  6. Series F - 2018: Later the same year, Snowflake raised another $450 million in a Series F round, bringing its post-money valuation to $3.5 billion. The round was led by Sequoia Capital, with participation from existing investors.
  7. Series G - 2020: In its final funding round before its IPO, Snowflake raised $479 million at a valuation of $12.4 billion. Dragoneer Investment Group and Salesforce Ventures led the round, with participation from existing investors.

Product innovation and expansion

In 2016, Snowflake introduced Snowpipe, a service that loads data continuously, in near real time, as it arrives in the cloud. The following year, the company launched Snowflake Data Sharing, enabling direct sharing of live data across Snowflake accounts without ETL processes or data copying. Notably, in 2019, Snowflake introduced the Snowflake Data Exchange, a marketplace for data providers and data consumers.

Current leadership

In 2019, Muglia stepped down as CEO and was succeeded by Frank Slootman. While the reasons for the leadership change were not detailed publicly, it marked a shift in the company's trajectory, as Slootman was known for successfully leading companies through IPOs. Notwithstanding his departure, Bob Muglia's impact on Snowflake's early growth and market positioning remains a significant part of the company's history.

Slootman had previously led ServiceNow and Data Domain to successful IPOs, and the change signaled Snowflake's intention to go public. Under Slootman's leadership, Snowflake has experienced remarkable growth; by 2020, it counted high-profile customers such as Adobe, Sony, and Capital One.

IPO

In February 2020, Snowflake closed its Series G round, raising $479 million at a valuation of $12.4 billion. The round brought in Salesforce Ventures as a new investor, highlighting the company's impressive growth and potential.

On September 16, 2020, Snowflake debuted on the New York Stock Exchange under the ticker "SNOW." Priced at $120 per share, the company's shares began trading at $245, a more than 100% increase, making it the largest software IPO ever at that time.

Between 2015 and the 2020 IPO, Snowflake exhibited steady product innovation, rapid customer growth, and successful fundraising activities, establishing itself as a significant player in the cloud data warehousing space.

Key product expansions and partnerships

Snowpipe

Introduced in 2016, Snowpipe is a service that enables Snowflake users to load data in real time. It allows for automated and continuous loading of data as soon as it arrives in a cloud storage bucket. This makes it ideal for use cases where up-to-date data is critical, such as real-time analytics and monitoring.
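
As an illustration only, the snippet below sketches how a pipe might be defined so that files landing in an external stage are loaded automatically. The stage, table, and pipe names are hypothetical, and it reuses the same Python connector pattern as the earlier example.

```python
import snowflake.connector

# Placeholder connection and context values.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password",
    warehouse="reporting_wh", database="raw", schema="public",
)
cur = conn.cursor()

# Define a pipe that copies newly arrived JSON files from a
# (hypothetical) external stage into a target table. AUTO_INGEST = TRUE
# lets cloud-storage event notifications trigger loads as files land.
cur.execute("""
    CREATE PIPE IF NOT EXISTS events_pipe
      AUTO_INGEST = TRUE
      AS
      COPY INTO events
      FROM @events_stage
      FILE_FORMAT = (TYPE = 'JSON')
""")
```

In practice, AUTO_INGEST also requires wiring up cloud storage event notifications (for example, S3 events routed to Snowflake), which is additional setup not shown here.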

Snowflake Data Sharing

This feature, launched in 2017, revolutionized how companies share data. With Snowflake's unique architecture, data can be shared between different Snowflake accounts instantly, securely, and without having to move or copy the data. This reduces the complexity, cost, and risk associated with traditional data sharing methods.
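
For a rough sense of the provider-side workflow, here is a hedged sketch of the SQL a data provider might run (again via the Python connector) to share a single table with another account; the database, schema, table, share, and account names are placeholders.

```python
import snowflake.connector

# Placeholder credentials for the data provider's account.
conn = snowflake.connector.connect(
    account="provider_account", user="my_user", password="my_password",
)
cur = conn.cursor()

# Create a share, grant it read access to one table, and expose it
# to a consumer account. No data is copied or moved.
for stmt in [
    "CREATE SHARE IF NOT EXISTS sales_share",
    "GRANT USAGE ON DATABASE sales_db TO SHARE sales_share",
    "GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share",
    "GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share",
    "ALTER SHARE sales_share ADD ACCOUNTS = partner_account",
]:
    cur.execute(stmt)
```

The consumer account can then create a database from the share and query it directly, while the provider's data stays in place.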

Snowflake Data Marketplace

Launched in 2020, the Snowflake Data Marketplace takes data sharing a step further. It provides a platform where providers can publish data sets, and consumers can access and analyze this data directly from their Snowflake account. The data remains in its original, live state, ensuring it is always up-to-date.

Snowpark

Announced in 2020, Snowpark allows data engineers, data scientists, and developers to write code in familiar languages (such as Java, Scala, and Python) and execute those workloads directly on Snowflake. This opens the platform to more types of data-intensive applications, expanding its capabilities beyond data warehousing.
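
For example, a Snowpark Python session can express a transformation as DataFrame operations that are translated to SQL and executed inside Snowflake. This is a minimal sketch; the connection parameters, table, and column names are placeholders.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

# Connection parameters are placeholders.
session = Session.builder.configs({
    "account": "my_account",
    "user": "my_user",
    "password": "my_password",
    "warehouse": "reporting_wh",
    "database": "sales_db",
    "schema": "public",
}).create()

# DataFrame operations are lazily translated to SQL and run on Snowflake's
# compute; no rows leave the platform until an action like show() or collect().
orders = session.table("orders")
revenue_by_region = (
    orders.filter(col("amount") > 1000)
          .group_by("region")
          .agg(sum_(col("amount")).alias("total_amount"))
)
revenue_by_region.show()
```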

Partnerships

Strategic partnerships have played a significant role in Snowflake's success. By working closely with cloud providers, Snowflake ensures the platform runs smoothly across multiple cloud environments. Partnerships with data management and BI companies offer customers a comprehensive and seamless data pipeline, from ingestion and transformation to analysis and visualization. Security partners help maintain Snowflake's commitment to robust data security. All these alliances enable Snowflake to provide a comprehensive, flexible, and secure data solution, making it an attractive choice for businesses.

  1. Cloud Providers: Major cloud service providers, including Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure, are foundational partners for Snowflake. These partnerships enable Snowflake to offer its cloud-based data warehousing solution across the different cloud platforms, giving customers the flexibility to choose their preferred cloud provider.
  2. Salesforce: In 2020, Salesforce became a strategic partner to Snowflake. Not only did Salesforce invest in Snowflake, but the two companies also announced integrations between Salesforce's Customer 360 platform and Snowflake. This allows mutual customers to unify and analyze their data in real time, and also provides access to Snowflake's Data Marketplace for additional data insights.
  3. Data management and integration partners: Companies like Informatica, Talend, Matillion, and Fivetran, which provide data integration, transformation, and management solutions, have partnered with Snowflake to facilitate seamless data ingestion and processing within the Snowflake platform.
  4. Business intelligence (BI) partners: BI and analytics tool providers, including Tableau, Looker, and Power BI, have integrations with Snowflake to provide users with a streamlined experience for data analysis and visualization.
  5. Data security partners: Snowflake also collaborates with data security companies, such as Okta and Duo Security, to provide enhanced security measures for user authentication and data protection.
  6. Alation: Snowflake invested in data catalog provider Alation in 2021 and has named Alation a partner of the year at its annual Snowflake Summit multiple years in a row. Their joint customers include large accounts such as Cisco, DocuSign, Expedia, and PepsiCo.

Snowflake and Bigeye

Ensuring data in Snowflake is fresh and high quality can be a challenge at scale. Bigeye automates monitoring of Snowflake data to ensure data engineering and analytics teams discover potential issues before they reach key analytics products like dashboards and reports.

To learn how Bigeye can automate monitoring and alerting for freshness, volume, schema changes, and lineage across your entire Snowflake environment in under 24 hours, request a demo with us here.

