Snowflake Innovates to Simplify Data Foundation

Snowflake provides a strong data foundation anchored on unified data, optimal TCO and universal governance. The Snowflake platform eliminates silos to enable any architectural pattern, while supporting all data types and workloads. To further embrace open standards, Snowflake is excited to announce both the launch of Polaris Catalog, an open source catalog for Apache Iceberg that allows you to read and write using your engine of choice without lock-in, and the general availability of support for Iceberg tables. Snowflake has built-in governance and discovery of data, apps and models through Snowflake Horizon, which now includes Internal Marketplaces (private preview), allowing you to easily share assets across your organization. Finally, Snowflake continuously improves performance, with query duration improvements of 27% since metric tracking started in August 2022, with the goal of reducing customer cost over time.

Supporting open storage architectures

The AI Data Cloud is a single platform for processing and collaborating on data in a variety of formats, structures and storage locations, including data stored in open file and table formats. Iceberg tables (now generally available), when combined with the capabilities of the Snowflake platform, allow you to build various open architectures, including a data lakehouse and data mesh. To provide further flexibility, including read and write interoperability from multiple engines with centralized access, Snowflake is open sourcing Polaris Catalog in the next 90 days. With Iceberg tables generally available, you can leverage many Snowflake features to power a variety of workloads, all on top of tables in the open Iceberg format. This includes pipelines and transformations with Snowpark, Streams, Tasks and Dynamic Tables (public preview soon); extending AI and ML to Iceberg with Snowflake Cortex AI; performing storage maintenance with capabilities like automatic clustering and compaction; and securely collaborating on live data shares.

If you are not already using Iceberg in your data lakes, Snowflake provides features to help you easily and cost-effectively onboard to Iceberg. Parquet Direct (private preview) allows you to use Iceberg without rewriting or duplicating Parquet files — even as new Parquet files arrive. If you are already using Delta Lake in your lakehouse, Delta Direct (private preview) allows you to continuously and cost-effectively access your Delta Lake tables as Iceberg tables for “bronze” and “silver” layers, without all of the requirements of Universal Format (UniForm).

Built-in governance and discovery, now for internal teams

Snowflake Horizon enables data governors, data stewards and data teams to have a unified way to govern and discover data, apps and models in the AI Data Cloud. These capabilities can even be extended to Iceberg tables created by other engines. Data teams can now collaborate on these types of content not only with external partners and customers, but also internally within the same organization. New AI governance advancements help organizations better understand, share and secure models.  

Governed internal collaboration with better discoverability and AI-powered object metadata

Snowflake is introducing an entirely new way for data teams to easily discover, curate and share data, apps and now also models (private preview soon). Adding to Universal Search (generally available), which allows data teams to use natural language to find relevant data, apps and models across the AI Data Cloud, the Internal Marketplace (private preview) is a directory of all data products specifically curated for use within an organization to boost secure collaboration and value creation. AI-powered Object Descriptions (private preview soon) save data teams time by generating descriptions for tables and views, while the Object Insights Interface (private preview) provides additional context by surfacing relevant insights about the object’s popularity, access, quality and dependencies. Stewards can now also streamline the classification of sensitive data through Sensitive Data Auto-Classification (private preview) and Automatic Tag Propagation (private preview soon).

Security improvements for models and apps

On the AI security front, Snowflake Cortex Guard will soon be generally available. This capability uses Meta’s Llama Guard, an input-output safeguard model, to filter content for violence and hate, self-harm and criminal activities. With the upcoming general availability of the Trust Center, a built-in, easy-to-use interface for discovering security risks and getting recommendations to resolve them, Snowflake Horizon is advancing centralized, cross-cloud threat monitoring. Additional authentication improvements (generally available), networking enhancements (generally available) and private link connectivity (private preview soon) further strengthen security.

Privacy-preserving collaboration with data clean rooms and advanced policies

Innovations in privacy-preserving collaboration allow multiple parties across organizations to realize high-value business outcomes and unlock analytic value from personal, regulated or proprietary data while protecting data privacy. With the introduction of Snowflake Data Clean Rooms (generally available to customers in select regions), Snowflake is giving nontechnical users access to out-of-the-box templates in a simple UI, while developers can use APIs to customize clean rooms and deploy their model of choice. Additional privacy-enhancing innovations allow more users to test or develop with sensitive data, without requiring access to the original data sets, through Synthetic Data Generation (private preview soon). Differential Privacy Policies (public preview soon) allow organizations to unlock value from granular, highly sensitive data through collaboration, while protecting the data against reidentification and privacy attacks by adding noise. Entity-level Privacy (generally available) protects against the exposure of entities whose data points may be spread across multiple rows and columns.
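
To make the idea of "adding noise" concrete, here is a minimal Python sketch of the textbook Laplace mechanism applied to a counting query. It is an illustration only: the announcement does not describe which mechanism Differential Privacy Policies use internally, and the names `laplace_noise`, `noisy_count` and the `epsilon` parameter are assumptions chosen for this example, not Snowflake APIs.

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution centered at zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(values, predicate, epsilon, rng):
    """Return a count of matching values with Laplace noise added.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data: ages of individuals in a sensitive table.
rng = random.Random(42)
ages = [23, 35, 41, 52, 67, 29, 38]
private_over_40 = noisy_count(ages, lambda a: a > 40, epsilon=0.5, rng=rng)
print(f"Noisy count of people over 40: {private_over_40:.2f}")
```

The analyst sees only the noisy result, so the presence or absence of any single individual is hard to infer, while aggregate trends remain usable when the underlying counts are large relative to the noise.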

Improved monitoring of data quality and models

To help data governors and stewards monitor data sensitivity and quality, and pass compliance reviews and audits, Snowflake is expanding the Lineage Visualization Interface to views (public preview soon) and models (private preview), giving data governors a bird’s-eye view of upstream and downstream relationships for more objects. To help you define, automatically measure and monitor both out-of-the-box data quality metrics (such as null count) and custom ones, Snowflake is announcing that Data Quality Monitoring will soon be generally available.
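
As a rough illustration of the difference between an out-of-the-box metric such as null count and a custom metric, the following Python sketch computes both over a small in-memory table. The function names, the `email` column and the sample rows are hypothetical; this is not how Data Quality Monitoring is implemented or invoked in Snowflake.

```python
def null_count(rows, column):
    """Out-of-the-box style metric: number of rows where `column` is missing or None."""
    return sum(1 for row in rows if row.get(column) is None)

def invalid_email_count(rows, column="email"):
    """Custom metric (hypothetical rule): non-null values that lack an '@' sign."""
    return sum(
        1 for row in rows
        if row.get(column) is not None and "@" not in row[column]
    )

# Hypothetical sample of a customer table.
customers = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "not-an-email"},
    {"id": 4},  # column missing entirely
]

metrics = {
    "null_count(email)": null_count(customers, "email"),
    "invalid_email_count": invalid_email_count(customers),
}
print(metrics)
```

Tracking such values on a schedule and alerting when they cross a threshold is the general pattern that a monitoring feature automates.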

Enhanced interoperability for Iceberg tables

With the open Polaris Catalog, any engine that already supports the Iceberg REST API can now create an Iceberg table, and after these tables are synced to Snowflake, Snowflake Horizon’s leading governance and discovery capabilities, from granular access policies to Universal Search, can easily be applied to them as if they were native Snowflake objects.  

Continuously improving performance and built-in cost management 

Snowflake’s success is predicated on our customers’ success — this means that Snowflake continually strives to provide optimal performance at the best price, while equipping you with the tools to help you efficiently manage spend. 

Snowflake continues to reduce query duration, with the latest Snowflake Performance Index (SPI) results showing improvements of 27% between Aug. 25, 2022 and April 30, 2024, and 12% over the last 12 months, for stable customer workloads. The SPI measures the impact of Snowflake’s consistent performance improvements on real customers’ production workloads over time, and many of these performance optimizations are intelligent — occurring automatically with no customer action required.

To make it easier for you to have better visibility, control and optimization of your Snowflake spend, Snowflake recently added new capabilities to the generally available Cost Management Interface that you can learn more about in this blog. Additionally, Snowflake is announcing that per-query cost attribution will soon enter public preview, enabling the attribution of warehouse spend to a given query, thereby enhancing chargeback scenarios. Enhancements to the WAREHOUSE_EVENTS_HISTORY view (generally available) will help you more easily monitor warehouse changes that may affect credit consumption, while budgets will cover Snowpark Container Services compute pools (generally available soon). To help you optimize your spend, Cost Insights is now generally available in the Cost Management Interface to surface opportunities, such as large tables that haven’t been queried recently, where you can take immediate action.

Faster, easier ingest 

To make data ingestion even more cost effective and effortless, Snowflake is announcing performance improvements of up to 25% for loading JSON files and up to 50% for loading Parquet files. With this launch, Snowflake is providing more native connectors to allow you to bring data in more easily. Snowflake’s native connectors, including the existing Snowflake Connector for Kafka and for ServiceNow, are built for scalability, cost efficiency and lower latency. Getting data ingested now only takes a few clicks, and the data is encrypted. Some of the newest connectors that Snowflake now supports include:

  • Snowflake Connector for PostgreSQL (public preview soon) 
  • Snowflake Connector for MySQL (public preview soon) 
  • Snowflake Connector for Google Analytics (generally available)

Global regulated and sovereign markets expansion

Snowflake, which already supports 40+ cloud regions across the three hyperscalers, is now expanding into global regulated and sovereign markets through zonal repositories that prioritize keeping usage data within geographical boundaries, for data residency and data sovereignty purposes, to meet customers’ regulatory requirements. In particular, Snowflake is working on providing data governance and security-enhancing controls to address more stringent data access restrictions in Europe, as part of our Sovereignty Roadmap in Europe.

For U.S. Department of Defense (DoD) customers, Snowflake will soon be offering, in general availability, a separate Impact Level 4 (IL4) authorized environment that includes a networking integration with the Defense Information System Network’s (DISN) Boundary Cloud Access Point (BCAP) to help ensure that DoD customers can satisfy their very specific regulatory requirements.

Advanced analytics

Snowflake continues to empower you to derive more meaningful data insights with the announcement of several new analytics advancements. With Time Series ASOF JOIN (generally available), Time Series RANGE BETWEEN (public preview soon) and Higher-order Functions (generally available soon), you can easily conduct highly performant advanced analytics without the need to write complex queries or resort to custom UDFs. Snowflake’s new token-based search function, Full Text Search (public preview soon), unlocks the ability to find exact characters or text in specific columns or across one or more tables. In addition, Search Optimization is compatible with Full Text Search, enabling needle-in-the-haystack queries across terabytes of data in seconds, with low concurrency. Furthermore, Snowflake is continuing to advance geospatial analytics with support for Discrete Global Grid H3 (generally available).
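
For readers unfamiliar with as-of joins, the following Python sketch shows the general semantics: each row on the left is paired with the most recent row on the right whose timestamp is at or before it. This is a conceptual illustration under simplifying assumptions (integer timestamps, a single match condition, in-memory lists), not Snowflake's implementation or SQL syntax; the names `asof_join`, `trades` and `quotes` are hypothetical.

```python
from bisect import bisect_right

def asof_join(left, right):
    """Pair each left row with the latest right row at or before its timestamp.

    `left` and `right` are lists of (timestamp, value) tuples.
    `right` must be sorted by timestamp. Left rows with no earlier
    right row are kept and matched with None.
    """
    right_ts = [ts for ts, _ in right]
    joined = []
    for ts, value in left:
        # Index of the last right timestamp that is <= ts, or -1 if none.
        idx = bisect_right(right_ts, ts) - 1
        match = right[idx][1] if idx >= 0 else None
        joined.append((ts, value, match))
    return joined

# Hypothetical time series: trades to be matched with the prevailing quote.
trades = [(3, "T1"), (7, "T2"), (1, "T0")]
quotes = [(2, 100.0), (5, 101.5), (7, 102.0)]

for row in asof_join(trades, quotes):
    print(row)
```

Expressing this lookup with ordinary joins typically requires a self-join or window function to pick the closest earlier row, which is the kind of complex query a dedicated as-of join avoids.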

Learn more

Snowflake helps you simplify your data foundation on a single platform that unifies all data, provides built-in governance and discovery, and is continuously improving in price-for-performance for a wide range of workloads.


Forward Looking Statements

This article contains forward-looking statements, including about our future product offerings, which are not commitments to deliver any product offerings. Actual results and offerings may differ and are subject to known and unknown risks and uncertainties. See our latest 10-Q for more information.
