Snorkel AI Introduces New Foundation Model Data Platform for Programmatic Data Development of Generative AI
Snorkel AI enables any company to develop its own large language models using its data and knowledge
San Francisco, June 12, 2023 – Snorkel AI, the data-centric AI company, introduced the Foundation Model Data Platform, powered by its unique programmatic data development approach. With Snorkel AI’s Foundation Model Data Platform, any company can now use its proprietary enterprise data and knowledge to build custom foundation models (FMs) or large language models (LLMs), or to improve the accuracy of leading commercial or open-source models for domain-specific generative AI and predictive AI applications.
Despite the Cambrian explosion of generative AI applications, accuracy and privacy remain the top challenges for enterprise adoption. Nearly 40 percent of enterprises are already considering building enterprise-specific LLMs or adapting existing ones using their proprietary data. The biggest blocker for model development is manually preparing the data that the models are trained on. Snorkel AI brings an unparalleled track record of transforming manual data development processes into programmatic solutions. Some of the world’s largest enterprises, including five of the top ten US banks, Memorial Sloan Kettering, BNY Mellon, and Wayfair, use Snorkel Flow to programmatically label data and train models for mission-critical predictive AI applications with production-grade accuracy.
Snorkel AI’s new Foundation Model Data Platform expands programmatic data development beyond labeling for predictive AI with two core solutions: Snorkel GenFlow for building generative AI applications and Snorkel Foundry for developing custom LLMs with proprietary data. With Snorkel Flow, GenFlow, and Foundry, enterprises can support the critical data development behind every way they want to use FMs and LLMs.
“Wayfair’s partnership with Snorkel AI underscores our commitment to machine learning innovation, continually enhancing our customers’ on-site search experience among our vast array of 40 million products,” said Tulia Plumettaz, Director of Machine Learning at Wayfair. “Snorkel’s programmatic labeling approach helps our data scientists improve catalog content automation and overcome accuracy, consistency, and efficiency challenges. In addition, Snorkel’s data-centric AI platform supports our mission to utilize foundation models in future developments.”
Snorkel AI now offers the full stack of solutions for foundation model programmatic data development, including:
- Snorkel Flow to rapidly build, manage, and deploy predictive AI applications (e.g., classification, information extraction) using programmatic labeling, fine-tuning, and distillation. Enterprises can unlock production-grade accuracy for mission-critical business applications such as financial document analysis, clinical trial analytics, and know-your-customer (KYC) checks. Snorkel AI’s customers have reduced AI development time from months to days and cut costs by hundreds of thousands of dollars per project.
- Snorkel GenFlow to rapidly build, manage, and deploy generative AI applications (e.g., summarization, question answering, chat) by programmatically curating, scoring, filtering, and sampling instructions and responses for instruction tuning with RLHF and other methods. Enterprises can improve performance and reliability on specific tasks using their proprietary data.
- Snorkel Foundry to build custom FMs/LLMs by programmatically sampling, filtering, cleaning, and augmenting proprietary data for domain-specific pre-training. Enterprises can use their data as a differentiator by adapting powerful but generic base models into domain-specific specialist models that can serve as a base for all internal AI applications, both predictive and generative.
“Today, everyone uses nearly the same models, algorithms, and approaches for training FMs and LLMs—but it’s the data that they train on at all stages which is the differentiator, and the secret sauce that AI-first companies are investing in and guarding most heavily,” said Alex Ratner, CEO and co-founder of Snorkel AI. “Our Foundation Model Data Platform enables every enterprise to use their unique, proprietary data and knowledge to build or adapt FMs and LLMs with production-level accuracy on their data and workloads, unlike off-the-shelf FMs. Proprietary data and knowledge is the one durable moat in AI today, and we enable enterprises to own and use this themselves.”
Snorkel AI has collaborated with Microsoft to enable Azure AI customers to use proprietary data to fine-tune and customize machine learning models and applications.
“Microsoft delivers the world’s most capable foundation models and empowers AI developers focused on deeply customized domain-specific use cases to ground these models with their own data. The advancements made by Snorkel AI in this space have the potential to be transformative across the industry,” said John Montgomery, Corporate Vice President, Program Management, AI Platform at Microsoft. “Snorkel AI’s new foundation model platform has the potential to significantly enhance how Azure customers build, fine-tune, and apply large language models across their business. This could fundamentally shift the current paradigm, making AI more accessible and customizable for every enterprise, regardless of size or industry. The power of Snorkel AI’s innovations combined with Microsoft’s AI platform is a game-changer.”