adesso blog

Snowflake has taken a late but big step into generative AI (GenAI). With services such as the Snowpark Container Service, Snowflake Cortex and its own large language model (LLM) 'Arctic', Snowflake wants to secure its place in this field. This second part of the blog post looks at these three services and the opportunities they offer companies.

What opportunities does Snowflake offer?


Figure 1: Overview of Snowflake for GenAI, source: https://www.snowflake.com/blog/fast-easy-secure-llm-app-development-snowflake-cortex/?lang=de

Serverless AI and LLM functions

The Cortex functions are offered alongside the Snowflake ML functions, so the service now covers both the ML functions and the ready-to-use GenAI functions. Cortex was presented at Snowday 2023 and is intended to simplify access to and use of ML, GenAI and LLMs for all users, regardless of their technical knowledge. Although many features are only gradually being released, the service already offers a fully managed environment that gives users immediate access to a growing collection of serverless functions. Using SQL or Python with Snowpark, users can call LLMs specialised for downstream NLP tasks:

  • answer extraction,
  • sentiment analysis,
  • text summarisation,
  • translation and
  • text embedding.
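
As a minimal sketch of how these tasks map to SQL, the following Python snippet assembles a query that calls Cortex functions on a text column. The function names follow Snowflake's documented `SNOWFLAKE.CORTEX` functions; the table name, column name, example question, language codes and embedding model are assumptions chosen for illustration.

```python
import re

# Cortex function calls as SQL expression templates. The question, the
# language codes and the embedding model are illustrative choices.
CORTEX_TEMPLATES = {
    "answer_extraction": "SNOWFLAKE.CORTEX.EXTRACT_ANSWER({col}, 'Which product is mentioned?')",
    "sentiment": "SNOWFLAKE.CORTEX.SENTIMENT({col})",
    "summary": "SNOWFLAKE.CORTEX.SUMMARIZE({col})",
    "translation": "SNOWFLAKE.CORTEX.TRANSLATE({col}, 'de', 'en')",
    "embedding": "SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', {col})",
}

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_cortex_query(table: str, column: str, tasks: list[str]) -> str:
    """Return a SELECT statement that applies the chosen Cortex functions."""
    # Reject anything that is not a plain identifier to avoid SQL injection.
    for name in (table, column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    select_items = [
        f"{CORTEX_TEMPLATES[task].format(col=column)} AS {task}" for task in tasks
    ]
    return (
        f"SELECT {column},\n       "
        + ",\n       ".join(select_items)
        + f"\nFROM {table};"
    )


# Hypothetical table and column names.
query = build_cortex_query("customer_reviews", "review_text", ["sentiment", "summary"])
print(query)
```

The resulting statement could be pasted into a worksheet or, assuming an existing Snowpark session, executed with `session.sql(query)`.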

Figure 2: Overview of Snowpark Container Service, source: https://img1.lemondeinformatique.fr/fichiers/telechargement/snowflake.3.png

To provide even more flexibility for almost all programming languages, frameworks and libraries as well as for the hardware of choice (CPU/GPU), Snowflake has developed the 'Snowpark Container Service'. This service, launched late last year, provides an enhanced Snowpark runtime that enables fast access to GPU infrastructure without additional operational costs. The latter is the result of a partnership with NVIDIA.

The Snowpark Container Service lets developers efficiently and securely deploy containers that support custom AI use cases such as:

  • LLM fine-tuning
  • Open source vector database deployment
  • Distributed embedding processing
  • Speech-to-text processing
  • Hosting for inference processes

Behind the scenes, the container service uses a system based on Kubernetes (K8s for short) to create, manage and resize containers automatically within the Snowflake ecosystem. According to a report by The Stack, this is a managed Kubernetes cluster. Users and organisations can therefore run their applications close to the data and benefit from the platform's security mechanisms without having to worry about the underlying infrastructure.
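
To make the deployment flow more concrete, here is a rough sketch that generates the two SQL statements typically involved: one that creates a compute pool and one that creates a service from a container specification. The statement shapes follow Snowflake's `CREATE COMPUTE POOL` and `CREATE SERVICE` commands as far as we know them; the pool, service, container and image names, the port and the chosen instance family are assumptions for illustration only.

```python
def compute_pool_sql(pool: str, instance_family: str,
                     min_nodes: int = 1, max_nodes: int = 1) -> str:
    """Build a CREATE COMPUTE POOL statement for the given node range."""
    return (
        f"CREATE COMPUTE POOL {pool}\n"
        f"  MIN_NODES = {min_nodes}\n"
        f"  MAX_NODES = {max_nodes}\n"
        f"  INSTANCE_FAMILY = {instance_family};"
    )


def service_sql(service: str, pool: str, container: str,
                image: str, port: int) -> str:
    """Build a CREATE SERVICE statement with an inline YAML specification."""
    spec = (
        "spec:\n"
        "  containers:\n"
        f"  - name: {container}\n"
        f"    image: {image}\n"
        "  endpoints:\n"
        "  - name: api\n"
        f"    port: {port}\n"
    )
    return (
        f"CREATE SERVICE {service}\n"
        f"  IN COMPUTE POOL {pool}\n"
        f"  FROM SPECIFICATION $$\n{spec}$$;"
    )


# All names below are hypothetical placeholders.
pool_stmt = compute_pool_sql("llm_gpu_pool", "GPU_NV_S")
service_stmt = service_sql(
    service="llm_inference_service",
    pool="llm_gpu_pool",
    container="inference",
    image="/my_db/my_schema/my_repo/inference:latest",
    port=8000,
)
print(pool_stmt)
print(service_stmt)
```

Scheduling and scaling of the container then happen inside the compute pool, which is the part the managed Kubernetes layer takes care of.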

Snowflake's own large language model

Snowflake's latest release (April 2024) is the self-developed open source LLM 'Arctic' with 17B active parameters. This model focuses on enterprise intelligence metrics, a collection of capabilities that are critical to enterprise customers. This means that the LLM has been developed for specific use cases, including coding (HumanEval+ and MBPP+), SQL generation (Spider) and Instruction Following (IFEval).


Figure 3: Complete metrics table: Source: https://www.snowflake.com/wp-content/uploads/2024/04/table-3-1-1-2048x808.png

The model is said to be at least on par with Llama 3 8B and Llama 2 70B on these metrics while using less than half the training compute budget (approximately two million USD, or fewer than three thousand GPU weeks). This is achieved with a Dense-MoE Hybrid Transformer architecture, a combination of a dense transformer and a Mixture-of-Experts (MoE) transformer.
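
The idea behind such a hybrid can be sketched in a few lines: a dense path that is always active is combined with an MoE path in which a router selects only a few experts per input. The toy layer below is a simplified illustration under our own assumptions (tiny dimensions, random weights, top-2 routing with a softmax over the selected experts), not Arctic's actual implementation.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) with a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def hybrid_layer(x, dense_w, router_w, expert_ws, top_k=2):
    """Dense path plus a residual MoE path that runs only top_k experts."""
    dense_out = linear(dense_w, x)   # always active
    logits = linear(router_w, x)     # one routing score per expert
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    gates = softmax([logits[i] for i in chosen])  # weights for the chosen experts
    moe_out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = linear(expert_ws[idx], x)
        moe_out = [m + gate * e for m, e in zip(moe_out, expert_out)]
    # Residual connection: input + dense output + sparse MoE output.
    output = [xi + d + m for xi, d, m in zip(x, dense_out, moe_out)]
    return output, chosen


rng = random.Random(0)
dim, num_experts = 4, 8
x = [rng.uniform(-1, 1) for _ in range(dim)]
dense_w = random_matrix(rng, dim, dim)
router_w = random_matrix(rng, num_experts, dim)
expert_ws = [random_matrix(rng, dim, dim) for _ in range(num_experts)]

output, chosen = hybrid_layer(x, dense_w, router_w, expert_ws)
print(f"experts used: {len(chosen)} of {num_experts}")
```

Because only the selected experts run, the number of active parameters per token stays far below the total parameter count. Snowflake's Arctic announcement (not this post) describes a dense model of about 10B parameters combined with 128 experts of about 3.66B parameters each and top-2 gating, which gives roughly 10 + 2 × 3.66 ≈ 17B active parameters.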

To train the model for the use cases mentioned above, Snowflake used a business-oriented data curriculum. The model was first trained on generic skills, analogous to the way humans learn, and then learned the more complex enterprise skills in the final phases.

If you want to learn more about the model architecture or the inference efficiency of the model, see the 'Arctic Cookbook'. It contains detailed articles on modelling, systems, data and fine-tuning. Arctic can also be tested in a free Hugging Face demo.

Snowflake's Embedding Models

Text embedding models also play a central role in the world of GenAI. These models convert text data into numerical vectors that capture the semantic information of the text, so that similar texts have similar vectors. Since machines require numerical input to perform computations, text embedding is a critical component of many downstream NLP applications. Snowflake has released a set of embedding models in five sizes from x-small (xs) to large (l) that achieve top performance in the Massive Text Embedding Benchmark (MTEB).
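
What "similar texts have similar vectors" means in practice can be shown with cosine similarity, a common measure for comparing embeddings. The three-dimensional vectors below are hand-written toy values chosen for illustration; a real embedding model produces vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings, not output of a real model.
embeddings = {
    "The invoice is overdue": [0.9, 0.1, 0.0],
    "Payment has not arrived yet": [0.8, 0.2, 0.1],
    "The weather is sunny today": [0.0, 0.1, 0.9],
}

query = embeddings["The invoice is overdue"]
sim_payment = cosine_similarity(query, embeddings["Payment has not arrived yet"])
sim_weather = cosine_similarity(query, embeddings["The weather is sunny today"])

print(f"invoice vs. payment: {sim_payment:.2f}")
print(f"invoice vs. weather: {sim_weather:.2f}")
```

The two finance-related sentences score much higher than the unrelated pair, which is the property that search and retrieval applications rely on.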


Figure 4: Massive Text Embedding Benchmark (MTEB), source: https://www.snowflake.com/blog/introducing-snowflake-arctic-embed-snowflakes-state-of-the-art-text-embedding-family-of-models/

These models are used in particular in connection with Retrieval Augmented Generation (RAG). RAG offers a way to optimise the output of an LLM with specific and up-to-date information, such as company-specific data, without having to retrain the base model. The implementation requires embedding models that convert this new data into vectors, which are then stored together with the metadata in a vector database.
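
The retrieval step of RAG can be sketched as follows: documents are embedded and stored with their metadata, the question is embedded in the same way, and the closest documents are added to the prompt. In this sketch, `embed` is a hypothetical bag-of-words stand-in for a real embedding model, `VectorStore` is a simple in-memory list instead of a real vector database, and the example documents are made up.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: word counts as a sparse vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[token] * b[token] for token in a)  # Counter yields 0 for missing tokens
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory store that keeps each vector together with its text and metadata."""

    def __init__(self):
        self.rows = []

    def add(self, text, metadata):
        self.rows.append((embed(text), text, metadata))

    def search(self, question, top_k=1):
        question_vector = embed(question)
        ranked = sorted(
            self.rows,
            key=lambda row: cosine_similarity(question_vector, row[0]),
            reverse=True,
        )
        return ranked[:top_k]


def build_prompt(question, hits):
    """Combine the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {text} (source: {meta['source']})" for _, text, meta in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.add("Travel expenses must be submitted within 30 days", {"source": "travel_policy"})
store.add("The cafeteria opens at 8 am", {"source": "facility_guide"})

question = "When must travel expenses be submitted"
hits = store.search(question)
prompt = build_prompt(question, hits)
print(prompt)
```

The prompt would then be sent to the LLM, which answers from the retrieved company data without the base model having been retrained.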

Conclusion

Snowflake's mission is to help its customers use GenAI efficiently and easily to make better decisions, increase productivity and reach more customers by utilising all types of data. To this end, the services offer opportunities for custom workloads with OSS LLMs and serverless GenAI capabilities for advanced NLP tasks. With 'Arctic', an Open LLM is available for enterprises to create SQL Data Copilots, Code Copilots and, with the help of the Arctic Embedding family, RAG Chatbots. This significantly expands Snowflake's comprehensive data platform and AI capabilities, creating a robust infrastructure to help organisations significantly improve their data strategies through the use of GenAI.



Author Sebastian Lang

Sebastian Lang has been working as a Cloud Data Engineer in the D&A Data Platforms department since January 2024 and supports the Snowflake team. His main areas of interest are the efficient construction of data pipelines and the integration and processing of data to optimise ML processes in Snowflake.
