AWS is investing heavily in building tools for LLMops

Amazon Web Services (AWS) made it easy for enterprises to adopt a generic generative AI chatbot with the introduction of its “plug and play” Amazon Q assistant at its re:Invent 2023 conference. But for enterprises that want to build their own generative AI assistant with their own or someone else’s large language model (LLM) instead, things are more complicated.

To help enterprises in that situation, AWS has been investing in building and adding new tools for LLMops—operating and managing LLMs—to Amazon SageMaker, its machine learning and AI service, Ankur Mehrotra, general manager of SageMaker at AWS, told InfoWorld.com.

“We are investing a lot in machine learning operations (MLops) and foundation large language model operations capabilities to help enterprises manage various LLMs and ML models in production. These capabilities help enterprises move fast and swap parts of models or entire models as they become available,” he said.

Mehrotra expects the new capabilities to be added soon. Although he wouldn’t say when, the most logical time would be at this year’s re:Invent. For now, his focus is on helping enterprises maintain, fine-tune, and update the LLMs they use.

Modeling scenarios

There are several scenarios in which enterprises will find these LLMops capabilities useful, he said, and AWS has already delivered tools for some of them.

One such scenario is when a new version of the model in use, or a model that performs better for the use case, becomes available.

“Enterprises need tools to assess the model performance and its infrastructure requirements before it can be safely moved into production. This is where SageMaker tools such as shadow testing and Clarify can help these enterprises,” Mehrotra said.

Shadow testing allows enterprises to assess a model for a particular use before moving it into production; Clarify detects biases in the model’s behavior.
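To make the shadow-testing flow concrete, here is a minimal sketch using boto3’s SageMaker client, which exposes shadow variants through the ShadowProductionVariants field of an endpoint configuration. The model, endpoint, and instance names below are hypothetical placeholders, not details confirmed by AWS for any particular deployment.

```python
# Minimal shadow-testing sketch: live traffic is served by the production
# variant while a copy of each request is mirrored to the shadow variant,
# whose responses are logged for comparison but never returned to callers.
# All names below are hypothetical placeholders.
import boto3

sm = boto3.client("sagemaker")

sm.create_endpoint_config(
    EndpointConfigName="llm-shadow-config",
    ProductionVariants=[{
        "VariantName": "production",
        "ModelName": "llm-v1",            # model currently in production
        "InstanceType": "ml.g5.2xlarge",
        "InitialInstanceCount": 1,
    }],
    ShadowProductionVariants=[{
        "VariantName": "shadow",
        "ModelName": "llm-v2-candidate",  # new model under evaluation
        "InstanceType": "ml.g5.2xlarge",
        "InitialInstanceCount": 1,
    }],
)

sm.create_endpoint(
    EndpointName="llm-assistant",
    EndpointConfigName="llm-shadow-config",
)
```

Clarify is driven similarly from the SageMaker Python SDK, where a SageMakerClarifyProcessor runs bias-analysis jobs against a dataset and model before promotion.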

Another scenario is when a model starts producing different or unwanted answers because user input has shifted over time with the requirements of the use case, the general manager said. In that situation, enterprises need to either fine-tune the model further or use retrieval-augmented generation (RAG).

“SageMaker can help enterprises do both. At one end enterprises can use features inside the service to control how a model responds and at the other end SageMaker has integrations with LangChain for RAG,” Mehrotra explained.  
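As an illustration of the LangChain side, the sketch below wraps a SageMaker-hosted model in LangChain’s SagemakerEndpoint LLM class and prepends retrieved context to the prompt. The endpoint name and the request/response JSON shapes are assumptions that depend on the container serving the model, and LangChain’s module paths shift between releases.

```python
# RAG sketch with a SageMaker-hosted LLM behind LangChain.
# Endpoint name and payload shapes are assumptions; they depend on the
# container serving the model.
import json

from langchain_community.llms import SagemakerEndpoint
from langchain_community.llms.sagemaker_endpoint import LLMContentHandler


class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
        # Serialize the prompt in the format the model container expects.
        return json.dumps({"inputs": prompt, "parameters": model_kwargs}).encode()

    def transform_output(self, output) -> str:
        # Parse the container's JSON response into plain text.
        return json.loads(output.read().decode())[0]["generated_text"]


llm = SagemakerEndpoint(
    endpoint_name="llm-assistant",   # hypothetical endpoint
    region_name="us-east-1",
    content_handler=ContentHandler(),
    model_kwargs={"max_new_tokens": 256, "temperature": 0.2},
)

# In a full RAG pipeline the context would come from a vector-store
# retriever; a hard-coded snippet stands in for retrieved documents here.
context = "SageMaker shadow testing mirrors live traffic to a candidate model."
question = "How can I safely compare a new model against production?"
print(llm.invoke(f"Context: {context}\n\nQuestion: {question}\nAnswer:"))
```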

SageMaker started out as a general machine learning platform, but of late AWS has been adding more capabilities focused on generative AI. Last November it introduced two new offerings, SageMaker HyperPod and SageMaker Inference, to help enterprises train and deploy LLMs efficiently.

In contrast to the manual LLM training process—subject to delays, unnecessary expenditure, and other complications—HyperPod removes the heavy lifting involved in building and optimizing machine learning infrastructure for training models, reducing training time by up to 40%, the company said.
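For a sense of what that looks like in practice, here is a minimal sketch of provisioning a HyperPod cluster through boto3’s create_cluster call; the cluster name, role ARN, lifecycle-script location, and instance choices are all hypothetical placeholders.

```python
# Minimal HyperPod provisioning sketch. Once the cluster is up, HyperPod
# manages its health (e.g. replacing failed nodes); the lifecycle scripts
# bootstrap each node as it joins. All names below are placeholders.
import boto3

sm = boto3.client("sagemaker")

sm.create_cluster(
    ClusterName="llm-training-cluster",
    InstanceGroups=[{
        "InstanceGroupName": "gpu-workers",
        "InstanceType": "ml.p4d.24xlarge",
        "InstanceCount": 4,
        "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecutionRole",
        "LifeCycleConfig": {
            "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",
            "OnCreate": "on_create.sh",
        },
    }],
)
```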

Mehrotra said AWS has seen a huge rise in demand for model training and model inference workloads in the last few months as enterprises look to use generative AI for productivity and code generation.

While he didn’t provide an exact number of enterprises using SageMaker, the general manager said the service has seen approximately 10x growth in just a few months.

“A few months ago, we were saying that SageMaker has tens of thousands of customers and now we are saying that it has hundreds of thousands of customers,” Mehrotra said, adding that some of the growth can be attributed to enterprises moving their generative AI experiments into production.

