Easier Analysis of Unstructured Data Starts with Document Intelligence

Last year, Snowflake announced an intent to acquire Applica, a company focused on machine learning (ML) solutions for understanding documents. We are working full steam ahead on integrating Applica’s technology with Snowflake. But what does that mean for you? In this blog post, I’ll describe how you can get value from unstructured data and how we’re thinking about making that easier with Snowflake and Applica, and I’ll invite you to share more about your own use cases.

Why unstructured data?

For many years, customers have been using Snowflake to analyze structured and semi-structured data. However, by commonly cited estimates, about 80% of the world’s data is unstructured. People want an easy way to securely use all their data together, regardless of its original structure. With this in mind, Snowflake added support for unstructured data, allowing customers to store, manage, govern, share, and process unstructured data with the same performance, concurrency, and scale as structured and semi-structured data. In less than a year, we have seen hundreds of customers adopt this functionality to solve their business needs. For example, some analyze images of products to identify features and suggest similar products.

With the Data Cloud, Snowflake customers are simplifying their data architectures and enabling new use cases by bringing together all types of data in one system. By bringing Applica’s cutting-edge technology to the Snowflake platform, we aim to make it even easier to derive meaningful insights and get value from unstructured data.

What does Applica provide?

Over the past few years, Applica has produced ML models focused on document intelligence. With zero-shot or few-shot learning (in other words, with no labeled examples or only a handful, rather than extensive upfront training), Applica’s models can intelligently parse documents and extract meaningful data points from them. These data points can then be joined with other data in Snowflake tables, even data acquired from Snowflake Marketplace, to create enriched data sets that can be used for analytics, ML, and applications. We aim to deliver all of this through a low-code/no-code user experience. This approach can be applied to many types of documents, such as invoices, financial statements, insurance documents, research reports, news articles, and more.
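
To make the idea of an enriched data set more concrete, here is a minimal Python sketch. It does not use Applica’s models or any Snowflake API. The extracted invoice fields, the vendor table, and the enrich helper are all hypothetical stand-ins, chosen only to show how data points pulled from documents might be joined with existing structured data and then aggregated.

```python
# Hypothetical data points, shaped the way a document model might return them
# for two invoices. These values are made up for illustration.
extracted = [
    {"file": "invoice_001.pdf", "vendor_id": "V-17", "total": 1250.00},
    {"file": "invoice_002.pdf", "vendor_id": "V-42", "total": 310.50},
]

# Hypothetical structured table that already lives alongside the documents.
vendors = {
    "V-17": {"vendor_name": "Acme Supplies", "region": "EMEA"},
    "V-42": {"vendor_name": "Globex", "region": "AMER"},
}


def enrich(rows, lookup, key):
    """Left-join each extracted row to the lookup table on `key`.

    Rows with no match in the lookup are kept unchanged.
    """
    return [{**row, **lookup.get(row[key], {})} for row in rows]


enriched = enrich(extracted, vendors, "vendor_id")

# Aggregate invoice totals by region, a question the join makes answerable.
totals_by_region = {}
for row in enriched:
    region = row.get("region", "unknown")
    totals_by_region[region] = totals_by_region.get(region, 0.0) + row["total"]
```

In practice the join would more likely be expressed in SQL against tables, but the shape of the operation is the same: the extracted fields become ordinary columns that can be joined and aggregated like any other data.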

Where can we take this?

Although our focus is on bringing Applica’s technology to Snowflake to enable easier document intelligence, we don’t want to limit our ML models to documents. We would like to expand our research into other cognitive services use cases for images, audio, and video. Imagine use cases such as automatically tagging images, searching documents, searching across images, or automatically transcribing audio files. Some customers have already expressed interest in no-code processing of this complex unstructured data to extract meaningful data points, but we want to hear from you, too!

Do you have any use cases for the understanding of unstructured data? Are you running into challenges using ML technology while trying to solve your business needs? If so, let us know! Our research team can leverage their expertise and help build solutions for your specific use cases.