Feb 21, 2023

Snowflake, a cloud-based data repository that combines the best characteristics of data warehouses and data lakes, is fast becoming firms’ go-to choice for data storage as part of their modern data platform. Snowflake offers storage, compute, and global services layers that are physically separated but logically integrated. This separation allows multiple data workloads to scale independently of one another, and Snowflake can automatically scale compute resources up or down based on workload.

Developed in 2012 and officially launched in 2014, Snowflake is cloud-agnostic, meaning it runs on all the well-known public cloud platforms: Amazon Web Services (AWS), Google Cloud, and Microsoft Azure. An interesting dynamic exists between Snowflake and the cloud platforms it runs on. Although Amazon, Google, and Microsoft all offer products that compete with Snowflake, Snowflake drives a large amount of business to their platforms. While these vendors market their own products to clients, they can’t deny the business they get from Snowflake.

Drivers for Using Snowflake

Asset managers and asset owners use Snowflake for its rapid data ingestion, multi-cloud support, and data-sharing capabilities. Data providers, solution providers, and service providers have also developed Snowflake capabilities or plan to create Snowflake offerings.

Whether the availability of data on Snowflake from providers drove investment managers to implement Snowflake instances or the investment management community drove the data provider/service provider industry to offer Snowflake-capable data shares is really a “chicken or the egg” situation. Capabilities seem to have developed in parallel, but there is no denying that the ease of data onboarding drives increasing numbers of investment management firms to implement Snowflake.

For this article, we talked to a number of Cutter Research members to gain insights into their use of Snowflake, their drivers for using Snowflake, and any challenges they associate with the product. The feedback we received around firms’ use of Snowflake was unusually consistent. Firms reported that Snowflake “does what it says on the side of the box,” meaning all the hype around the ease of implementation, onboarding of new datasets, and frictionless data access through the data share is real.

Data Sharing in Snowflake

We found that for investment management firms, the data-sharing capabilities of Snowflake are the number one driver and key differentiator over other cloud data warehouses. Data shares allow firms to quickly onboard datasets from data providers, data platforms, and service providers/custodians. They offer a future vision of moving away from data pipeline maintenance, file transfers, and bulk data updates. Firms can implement a Snowflake data share from a provider and gain access to the latest data without replicating it in their own Snowflake environment. Some firms also envision a world where data replication is replaced by Snowflake data shares internally and they can share raw data externally with their clients.

Multiple providers offer platforms built on Snowflake, including BlackRock Aladdin Data Cloud, Arcesium Data Platform, BNY Data Vault, SimCorp Data Warehouse, and State Street Alpha Data Platform. Members also told us that many of their custodians/service providers are building, or have already built, Snowflake instances where firms can access their own data through a data share, eliminating nightly bulk data loads and giving them access to the most up-to-date data.

The Snowflake Marketplace allows firms to search for and quickly trial or subscribe to new datasets. Snowflake Marketplace offers a rapidly growing number of data providers and available datasets: over 360 third-party data providers and 1,700-plus datasets across multiple industries, including data from FactSet, MSCI, Preqin, Rimes, Qontigo, and S&P Global Market Intelligence. Several members said they are pushing their data providers and Snowflake to add more data providers to the platform.

We found that some members use the marketplace to trial data, but then implement a private Snowflake data share directly with the provider rather than use the marketplace. Firms take this approach for several reasons. First, data vendors may not offer all their available datasets on the marketplace. Second, investment managers can get their firm-specific data through a private data share. In fact, because of the ease of obtaining data through the Snowflake Marketplace, some firms lock it down completely to ensure data consumers go through their firm’s proper data procurement processes.

Other Drivers for Selecting Snowflake

Besides data sharing, firms conveyed several other reasons why they selected Snowflake for data storage as part of modernizing their data architecture, including the following:

  • Expertise: Snowflake has taken the time to understand the financial services market and created the Snowflake Financial Data Cloud, featuring service partners, data providers, and technology partners. BlackRock, BNY Mellon, Fiserv, and State Street Alpha are founding members of the Powered by Snowflake initiative for financial services.
  • Performance: Queries, BI reports, and dashboards run much faster on Snowflake than on on-premises data warehouses. Snowflake’s automatic scaling of compute resources ensures that firms receive the resources to run large, complex queries when necessary, and that compute returns to the normal workload level, without manual intervention, when those resources are no longer needed.
  • Ease of Transition: Snowflake, which is ANSI SQL compliant, provides a similar SQL experience for developers accustomed to SQL Server, making retraining of current resources relatively straightforward. Snowflake also offers extensive documentation and training resources for clients.
  • Reduced Maintenance: Snowflake automatically handles the common maintenance tasks associated with data warehouses, such as the vacuuming, compressing, and partitioning of data. And index tuning is a thing of the past because Snowflake does away with indexes, replacing them with a dynamic query engine that automatically optimizes queries.
  • Affordability: Several members described data storage on Snowflake as “dirt cheap.” Compute resources, based on usage, are more expensive. Firms need to monitor the usage, but those that were able to compare the all-in cost of data management on-premises (including hardware costs, software licenses, and the resources required to manage the hardware and administer the data warehouse) versus on Snowflake reported significant cost savings.
  • Trustworthy: Member firms said they felt that Snowflake salespeople provided them with information and advice that may not necessarily benefit the vendor but did benefit the client. For example, firms reported that Snowflake salespeople often recommended outside consultants over Snowflake consultants because the price might be more advantageous for the client. Firms we spoke with also said they received advice from Snowflake on sizing storage and compute in a way that seemed fair and transparent.

Challenges to Adopting Snowflake

As with most products, Snowflake presents some challenges that firms should be aware of. Not all of these challenges are Snowflake-specific; some are simply issues related to managing data on any cloud data platform. It’s worth noting that none of these challenges became roadblocks for the firms implementing Snowflake.

Member firms cited setting up data security and permissioning as their biggest issue with Snowflake, but this is considered a challenge with any cloud data platform. Firms want to make data available and easily accessible for all consumers, but this can conflict with data licensing, permissions, and data privacy. Finding a model that satisfies both goals can be tricky. Snowflake permissioning requires firms to define user roles. Access to data is defined by those roles, but managing all the roles in a large organization is complex, especially with thousands of datasets.
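To make the role-based idea concrete, here is a minimal sketch in plain Python. It is a toy model rather than Snowflake's actual API: the class, role names, user names, and dataset names are all hypothetical, and it only illustrates how access follows from the roles a user holds.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AccessModel:
    """Toy role-based access model (illustrative only, not Snowflake's API)."""

    # role -> datasets that role may read
    role_grants: dict[str, set[str]] = field(default_factory=dict)
    # user -> roles assigned to that user
    user_roles: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, role: str, dataset: str) -> None:
        """Allow every holder of `role` to read `dataset`."""
        self.role_grants.setdefault(role, set()).add(dataset)

    def assign(self, user: str, role: str) -> None:
        """Give `user` the role `role`."""
        self.user_roles.setdefault(user, set()).add(role)

    def can_read(self, user: str, dataset: str) -> bool:
        """A user can read a dataset if any of their roles has been granted it."""
        return any(
            dataset in self.role_grants.get(role, set())
            for role in self.user_roles.get(user, set())
        )


# Hypothetical roles, users, and datasets.
model = AccessModel()
model.grant("portfolio_analyst", "vendor_prices")
model.grant("portfolio_analyst", "holdings")
model.grant("client_reporting", "holdings")
model.assign("alice", "portfolio_analyst")
model.assign("bob", "client_reporting")

print(model.can_read("alice", "vendor_prices"))  # True
print(model.can_read("bob", "vendor_prices"))    # False
```

Even in this small example, every new dataset needs a grant for each role that should see it, which hints at why the number of role and dataset combinations becomes hard to manage at scale.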

When using a market data vendor with strict licensing, some firms noted that they created multiple internal shares, one for each type of user who needed the data, while others implemented third-party data security and access control software. Recognizing some of these limitations, Snowflake in December 2022 released new security features for public preview that enable more granular access permissions on individual objects in a data share. Data vendors do not want to be in the business of managing permissions. Configuring security is important to get right at the outset, so it’s useful to seek the advice of experts who have done this before.

While data residency is not as much a Snowflake challenge as it is a cloud challenge, it’s worth considering if your firm operates in multiple countries or regions with data residency regulations. To adhere to these regulations, firms may need to store data in a Snowflake data warehouse based on a cloud server within a particular region and replicate that data elsewhere.
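One way a firm might reason about residency is to record which regions each dataset is allowed to live in and check any storage or replication plan against that policy. The sketch below is a plain Python illustration under that assumption; the dataset names, region names, and the `residency_violations` helper are hypothetical and are not part of Snowflake.

```python
# Hypothetical policy: the regions in which each dataset may be stored.
ALLOWED_REGIONS = {
    "eu_client_records": {"eu-west", "eu-central"},
    "global_reference_data": {"eu-west", "us-east", "ap-southeast"},
}


def residency_violations(placements):
    """Return (dataset, region) pairs that fall outside the allowed regions.

    `placements` maps each dataset to the set of regions where it is stored
    or replicated. Datasets with no policy entry are treated as not allowed
    anywhere, so they are flagged rather than silently accepted.
    """
    violations = []
    for dataset, regions in placements.items():
        allowed = ALLOWED_REGIONS.get(dataset, set())
        for region in sorted(regions):
            if region not in allowed:
                violations.append((dataset, region))
    return violations


# Hypothetical plan: primary storage plus replicas for each dataset.
plan = {
    "eu_client_records": {"eu-west", "us-east"},
    "global_reference_data": {"eu-west", "us-east"},
}

print(residency_violations(plan))  # [('eu_client_records', 'us-east')]
```

Flagging unknown datasets by default is a deliberately conservative choice; a firm could relax it if most of its data carries no residency restriction.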

Thinking About Joining the ‘Snowverse’?

Obviously, implementing a Snowflake data warehouse as part of modernizing your data architecture takes planning, realistic expectations, proper training, and commitment. Contact Cutter Associates if you are interested in learning more about Snowflake or other tools that are part of the modern data platform.

To learn more, see our article Modern Data Platforms ─ Because the Old Data Solutions Just Don’t Cut It Anymore.