The Cloud for Analytics 101
Unlock Data Sharing with Snowflake Data Marketplace
Getting critical real-time information from your suppliers.
Integrating data from a new company you just acquired.
Creating live data exchanges between departments.
Data sharing is a crucial piece of the modern analytics architecture, as it reduces inefficiencies and increases collaboration between partners.
Traditionally, data sharing involved finding a vendor, paying for a dataset, receiving a CSV file, and then building ETL processes to ingest and transform the data before loading it into a database. Today, there is another way – Snowflake Data Marketplace.
Snowflake Data Marketplace uses Snowflake Secure Data Sharing to connect providers of data with consumers. You can discover and access a variety of third-party data and have those datasets available directly in your Snowflake account, where you can query them without transformation and join them with your own data. The Data Marketplace is available directly from your Snowflake user interface.
There are lots of things we love about the data sharing capabilities within Snowflake, but here is our take on the most important features.
- First, the ability to augment data science and developer workflows with insights from 16 different categories of data (e.g., public health, weather, location, demographics) is huge
- For us, this means the ability to improve our clients’ data with additional powerful datasets like COVID rates within a city – this will assist with predictive models or provide additional insights, not originally in the data, that we can pass on to customers
- We can create and host a data exchange to break down data silos – for example, our clients could share their data with us, internally with other business groups, and externally with other suppliers or partners
- The most important feature of the Data Marketplace to us is the simplicity of accessing over 125 datasets, all using a SQL statement – no complex API code, ETL, or general-purpose programming languages (e.g., C# or Java) required
- Developers can hit the ground running and easily combine Marketplace data with customers’ internal data using a simple join statement
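To make the "simple join statement" point concrete, here is a minimal sketch of what such a query could look like. It uses Python's built-in sqlite3 module with an in-memory database as a local stand-in for a Snowflake account, so it is not Snowflake-specific code. The table names (store_sales, marketplace_covid), columns, and values are all hypothetical and only illustrate the shape of the join.

```python
import sqlite3

# In-memory SQLite stands in for a Snowflake account (assumption for illustration).
# All table names, column names, and values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_sales (store_id TEXT, county TEXT, weekly_sales REAL);
    CREATE TABLE marketplace_covid (county TEXT, cases_per_100k REAL);
    INSERT INTO store_sales VALUES ('S1', 'King', 12000.0), ('S2', 'Pierce', 8000.0);
    INSERT INTO marketplace_covid VALUES ('King', 35.5), ('Pierce', 80.25);
""")

# One join enriches internal sales rows with the third-party dataset.
JOIN_SQL = """
    SELECT s.store_id, s.weekly_sales, c.cases_per_100k
    FROM store_sales AS s
    JOIN marketplace_covid AS c ON c.county = s.county
    ORDER BY s.store_id
"""
rows = conn.execute(JOIN_SQL).fetchall()
for store_id, weekly_sales, cases_per_100k in rows:
    print(store_id, weekly_sales, cases_per_100k)
```

In a real account, the second table would presumably be the shared Marketplace database rather than a local table, but the join itself would have the same shape.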
Provider vs. Consumer Functionality
As a data provider, you can:
- Publish data listings for free-to-use datasets to generate interest and opportunities among your customer base
- Publish data listings for datasets that can be customized by customers, suppliers, etc.
- Share live datasets securely and in real time with partners, without creating copies of the data or imposing data integration tasks on them
- Eliminate the cost of building and maintaining APIs and data pipelines to deliver data
As a data consumer, you can:
- Discover and test third-party data sources
- Receive frictionless access to raw data products from vendors (e.g., COVID rates, stock market data)
- Combine new datasets with your existing data in Snowflake to derive new business insights
- Have datasets available instantly and updated continually for users
- Eliminate the costs of building and maintaining various APIs and data pipelines to load and update data
- Use the BI tools of your choice to visualize the information
A great example of successfully sharing information between businesses can be seen throughout the COVID-19 pandemic. To advance vaccine discovery, large pharmaceutical companies across the world collaborated to decrease the research expense, train their machine learning algorithms on each other’s data, and advance the process as quickly as possible.
Consider a grocery retailer and its supplier. Both would be interested in seeing sales data broken down by store, along with how far each store is from a COVID outbreak. To do this, they could enhance their data with COVID data from Johns Hopkins University.
Within Snowflake, the grocer's users can access the Data Marketplace, search for a dataset in natural language, browse listings within different categories, and read more about the data. Once they find a dataset they want, they can press “Get Data”, which makes the data available to query. The listing points to the original data, so it is always live and up to date.
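As a rough illustration of the "how far is a store from an outbreak" question, the sketch below computes great-circle distance with the haversine formula. It assumes that both the store table and the COVID dataset expose latitude and longitude, which the example above does not confirm. The function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_outbreak_km(store, outbreaks):
    """Distance from one store (lat, lon) to the closest outbreak location."""
    return min(haversine_km(store[0], store[1], lat, lon) for lat, lon in outbreaks)


# Hypothetical coordinates: one store and two outbreak locations.
store = (47.61, -122.33)
outbreaks = [(47.25, -122.44), (45.52, -122.68)]
print(round(nearest_outbreak_km(store, outbreaks), 1), "km to nearest outbreak")
```

The same calculation could likely be pushed into SQL so it runs next to the data; the Python version just makes the arithmetic explicit.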
If the grocer wanted to share their data with the supplier so that the supplier could predict which products will run out of stock, they could share their sales data in two ways:
- Standard listing – providing the same data to all suppliers
- Personalized listing – each brand requests data, and the grocer approves the request so that the brand sees only its own data
The grocer can specify how often the data gets updated, provide sample queries, and securely share a specific table. Only members (admin, provider, consumer) can see the listing, but users can request access to the data and the grocer can approve or deny each request. Approval creates a database and provides a secure share that dynamically filters the views, so each supplier sees only its own information and can use it to replenish stock in stores near a COVID outbreak.
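The "dynamically filters the views" idea can be sketched as a join against an entitlements table that maps each consumer account to the brands it may see. This is a local illustration using Python's sqlite3 module, not Snowflake code: the consumer account is passed in as a query parameter here, whereas a real secure view would presumably resolve the querying account itself. All table names, account names, and rows are hypothetical.

```python
import sqlite3

# In-memory SQLite as a stand-in (assumption); names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (brand TEXT, store_id TEXT, units_sold INTEGER);
    CREATE TABLE entitlements (consumer_account TEXT, brand TEXT);
    INSERT INTO sales VALUES
        ('BrandA', 'S1', 40), ('BrandB', 'S1', 15), ('BrandA', 'S2', 22);
    INSERT INTO entitlements VALUES ('ACCT_A', 'BrandA'), ('ACCT_B', 'BrandB');
""")


def rows_visible_to(consumer_account):
    """Return only the sales rows the given consumer account is entitled to see."""
    sql = """
        SELECT s.brand, s.store_id, s.units_sold
        FROM sales AS s
        JOIN entitlements AS e ON e.brand = s.brand
        WHERE e.consumer_account = ?
        ORDER BY s.store_id
    """
    return conn.execute(sql, (consumer_account,)).fetchall()


print(rows_visible_to("ACCT_A"))  # only BrandA rows
print(rows_visible_to("ACCT_B"))  # only BrandB rows
```

Because the filter lives in the query rather than in per-supplier copies of the table, the grocer maintains one dataset and one mapping table, and an account with no entitlement row simply sees nothing.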
Overall, if you’re finding ETL or APIs a cumbersome way to share data, try testing out Snowflake Data Marketplace to get access to hundreds of datasets that could fit your business. You can get started with Snowflake here for free, and we can help you spin up a data warehouse in minutes.
Not sure what to tackle next in your data modernization journey? We can help you with anything from gathering data from new sources, to organizing it in the cloud, to performing data science and machine learning to unlock new insights. Reach out to us to learn more.
JENNIFER MCNAUGHTON, ANALYTICS ADVISOR