LLMOps is Dead! Long Live GenOps!

Matt White
7 min read · Nov 26, 2023

Generative AI adoption requires a holistic approach that will scale to handle all modalities and implementation methods, not just language.

Dall-E 3 generated image of GenOps triumphant over LLMOps. (Yup, Dall-E 3 still hasn’t mastered text in pixels, and OpenAI claims they are close to achieving AGI. Okay.)

What feels like a lifetime ago now, nearly a year ago today, OpenAI triggered mainstream interest in Large Language Models (LLMs) with the launch of their chatbot, ChatGPT. Successive Large Language Model Services (LLSs) launched by Anthropic, Cohere, and others, along with a huge uptick in generative AI startups and investment, have caused organizations to take a serious look at leveraging the power of LLMs and LLSs to take advantage of their vast capabilities in language and code. At the same time, the increasingly opaque nature of black-box models has fueled concern around ethics, transparency, and explainability, for which LLM developers currently have no complete answers or fool-proof solutions.

Big tech companies and academia alike, motivated by a myriad of reasons, from being caught off guard by the success of generative AI to genuinely caring about advancing open science, have developed open-source and open-access models (the former released under OSI-approved licenses, the latter carrying usage restrictions in their licensing). These smaller, more accessible models have begun a counter-movement, offering open alternatives to the large opaque models that have dominated the headlines (readers may not know that OpenAI’s GPT-2 and nearly all predecessor LLMs before GPT-3 were released as open source).

The accessibility of open-source models, together with techniques that reduce the computational cost of fine-tuning (LoRA, QLoRA), has made deploying self-managed models in the enterprise a far more viable option. Concerns over data leakage and exploitation, and a desire to leverage internal company data to enhance the usability of models with retrieval-augmented generation (RAG), have only fueled this trend.
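To make the fine-tuning side of this concrete, here is a minimal sketch of parameter-efficient fine-tuning with Hugging Face’s PEFT library. The base checkpoint and LoRA hyperparameters are illustrative assumptions, not recommendations.

```python
# Minimal LoRA setup with Hugging Face PEFT (illustrative sketch;
# checkpoint and hyperparameters are assumptions, not recommendations).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "EleutherAI/pythia-1.4b"  # any causal LM checkpoint you have access to
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA trains small low-rank adapter matrices instead of the full weights,
# sharply reducing the compute and memory cost of fine-tuning.
config = LoraConfig(
    r=8,                                 # rank of the adapter matrices
    lora_alpha=16,                       # adapter scaling factor
    target_modules=["query_key_value"],  # attention projections to adapt (model-specific)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

From here the adapted model trains with a standard Trainer loop, and only the small adapter weights need to be versioned and deployed; that small artifact is exactly the kind of thing an ops pipeline has to track.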

To manage large language models and their workflows, just like machine learning models, enterprises need tools and processes for model lifecycle management. For machine learning, this is solved with Machine Learning Operations (MLOps) using platforms like Kubeflow and MLflow. For LLMs, Large Language Model Operations (LLMOps) quickly emerged as a solution to address the complexities and demands of managing LLM pipelines, mostly with tooling similar to MLOps.

Large Language Model Operations (LLMOps) was introduced soon after LLMs became the “soup du jour” and promised to cater specifically to the nuances of managing large language models. However, generative AI is much bigger than LLMs: it incorporates a multitude of media formats, including images, video, audio, and 3D in addition to text, whether as inputs (prompts) or outputs (generations). There is also a trend toward decreasing model sizes through new architectures and features that will move models from “large” to more moderately sized. Beyond becoming more accessible to enterprises, deployments are also transitioning from single, large, over-parameterized models to multiple smaller domain-specific models, whether pre-trained or fine-tuned.

Generative AI Operations (GenOps) offers a comprehensive and future-proof methodology for managing the lifecycle of generative models. This approach addresses the complex nuances associated with the preparation of vast amounts of unstructured data, often scaling up to hundreds of terabytes, across diverse formats. It encompasses the entire spectrum of model management, from pre-training and fine-tuning stages to the intricacies of prompt engineering and the operation of multiple models.

At its core, GenOps provides the tools and processes needed to orchestrate and automate every stage and function of the generative model ecosystem. Additionally, GenOps is robust enough to adapt to the dynamic and fluid nature of generative AI, ensuring compatibility, modularity, and sufficient generalization to accommodate the next wave of generative AI innovations.
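There is no canonical GenOps platform today, so the following is a purely hypothetical sketch of what a modality-agnostic orchestration layer could look like in Python. Every class and method name here is invented for illustration.

```python
# Hypothetical sketch of a modality-agnostic GenOps model registry.
# All names are invented for illustration; no real platform is implied.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass
class GenerativeModel:
    name: str
    modality: str                 # "text", "image", "audio", "video", ...
    version: str
    generate: Callable[..., Any]  # uniform entry point regardless of modality
    metadata: Dict[str, str] = field(default_factory=dict)

class ModelRegistry:
    """One registry for every generative model, whatever its modality."""

    def __init__(self) -> None:
        self._models: Dict[str, GenerativeModel] = {}

    def register(self, model: GenerativeModel) -> None:
        self._models[f"{model.name}:{model.version}"] = model

    def generate(self, name: str, version: str, **inputs: Any) -> Any:
        # A single dispatch path means logging, monitoring, and rollout
        # logic is written once rather than per modality.
        return self._models[f"{name}:{version}"].generate(**inputs)

registry = ModelRegistry()
registry.register(GenerativeModel(
    name="support-summarizer", modality="text", version="1.2",
    generate=lambda prompt: f"summary of: {prompt}",  # stand-in for a real model
))
print(registry.generate("support-summarizer", "1.2", prompt="customer ticket text"))
```

The point of the sketch is the shape, not the code: text, image, and audio models all register behind the same interface, so the operational machinery around them does not have to care about modality.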

LLMOps: Definition and Evolution from MLOps

LLMOps focuses on the specific needs of deploying, monitoring, and maintaining LLMs. It evolved from MLOps, which provides a framework for the lifecycle management of machine learning models, to address the unique challenges posed by LLMs. These challenges include managing the substantial computational resources required, ensuring data privacy, and adapting to the rapidly evolving nature of language models.

Contrasting LLMOps with MLOps

MLOps offers a generalized approach for managing various types of machine learning models, emphasizing automation, scalability, and reproducibility across the model lifecycle. LLMOps, in contrast, zeroes in on the peculiarities of large language models: high computational demands, specialized handling of text data, and keeping models current, whether through new pre-trained models, fine-tuning, or reinforcement learning from human feedback (RLHF).

The Emergence of GenOps

GenOps represents an expansion of LLMOps, encompassing not just large language models and text, but generative models for all media types, including image, audio, and video, as well as multi-modal systems. It acknowledges that the principles governing the management of LLMs apply broadly to other generative AI technologies, and that attending only to text limits the scalability of LLMOps. GenOps integrates many of the foundational elements of MLOps with the specialized requirements of managing various generative models, creating a cohesive operational framework that maximizes operational efficiency.

GenOps vs. LLMOps: A Comparative Analysis

While LLMOps is tailored for large language models, GenOps extends its scope to include all generative models. GenOps addresses the broader spectrum of challenges posed by different types of generative AI, such as diverse data types, varying model architectures, and distinct operational requirements. It also anticipates the shift towards smaller, more efficient models in the future, ensuring its relevance in the evolving AI landscape.

Comparison of GenOps with LLMOps across varying criteria.

The Case for GenOps

Organizations currently juggling multiple platforms for different types of AI models can benefit significantly from adopting a single GenOps framework. This approach promotes operational efficiency, reduces complexity, and ensures a standardized process for model management. GenOps allows businesses to manage their generative model lifecycle comprehensively, from deployment to monitoring and updating, under a unified set of technologies and practices, without the need for multiple modality-specific platforms like LLMOps.

Integration and Broad Scope: The adoption of GenOps is a strategic imperative for organizations venturing into the deployment of large language models. Unlike LLMOps, which is narrowly focused on LLMs, GenOps provides a comprehensive framework that encompasses a wide spectrum of generative AI models, including but not limited to language, image, audio, and video. This broad scope allows organizations to integrate various AI functionalities under a single operational umbrella, significantly enhancing operational coherence and reducing the complexity that arises from managing disparate systems.

Future-Proofing and Scalability: In the rapidly evolving landscape of AI, GenOps offers a future-proof solution. As AI technologies advance, the line between different types of generative models is becoming increasingly blurred. GenOps is designed to be adaptable and scalable, ensuring that organizations are not only equipped to handle current AI applications but are also prepared for future developments. This scalability extends to the ability to manage varying sizes and complexities of models, a crucial aspect as organizations grow and their AI needs become more sophisticated.

Efficiency and Cost-Effectiveness: GenOps promotes operational efficiency by streamlining the management of generative models. This consolidation leads to a reduction in overlapping tasks and resources, thus offering a cost-effective solution. The unified approach of GenOps mitigates the need for multiple platforms and the associated overhead, leading to a more economical deployment and maintenance of AI models.

Enhanced Performance Optimization: GenOps facilitates a more nuanced approach to performance optimization across different model types. It allows for fine-tuning models for their specific domains, and for benchmarking and monitoring model performance, whether that means accuracy tested against language model benchmarks like HELM and EleutherAI’s lm-evaluation-harness for LLMs, or resolution and quality assessment for diffusion-based image models. This tailored optimization is key to achieving high-performance outputs in generative AI applications, a capability that is limited in scope under LLMOps.
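As a concrete example on the language side, EleutherAI’s lm-evaluation-harness exposes a Python entry point for running benchmarks; the checkpoint and task choices below are illustrative, and the exact API can differ between harness versions.

```python
# Benchmarking a language model with EleutherAI's lm-evaluation-harness
# (v0.4-style API; checkpoint and tasks are illustrative choices).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-1.4b",  # any local or Hub checkpoint
    tasks=["hellaswag", "arc_easy"],                 # pick benchmarks that fit your domain
    num_fewshot=0,
)
for task, metrics in results["results"].items():
    print(task, metrics)  # per-task accuracy and related metrics
```

An image model would slot into the same monitoring pipeline with different metrics, for example FID or CLIP score, which is precisely the per-modality benchmarking a GenOps framework is meant to unify.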

Resource Allocation and Expertise Utilization: The comprehensive nature of GenOps ensures a more efficient allocation of resources and expertise within a single unified framework. It centralizes the expertise required for managing various generative AI models, thereby optimizing the use of specialized skills and knowledge. This centralization is especially beneficial for organizations, as it negates the need for multiple teams with different skill sets for different types of AI models, leading to better resource utilization and knowledge sharing. It also allows organizations to administer their generative AI systems centrally and to develop APIs that expose generative functionality to other enterprise systems and to developers.
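As a hypothetical illustration of that last point, a thin FastAPI layer could expose every registered generative model behind a single endpoint. The route, payload shape, and the stub registry below are all invented for this sketch.

```python
# Hypothetical unified generation API for a GenOps deployment (FastAPI).
# The route, payload shape, and stub registry are invented for illustration.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

class StubRegistry:
    """Stand-in for a shared GenOps model registry."""
    def generate(self, name: str, version: str, **inputs):
        if name != "support-summarizer":
            raise KeyError(name)
        return f"summary of: {inputs.get('prompt', '')}"

registry = StubRegistry()
app = FastAPI()

class GenerateRequest(BaseModel):
    model: str     # registered model name
    version: str   # pinned model version
    inputs: dict   # prompt text, image parameters, etc., per modality

@app.post("/v1/generate")
def generate(req: GenerateRequest) -> dict:
    try:
        output = registry.generate(req.model, req.version, **req.inputs)
    except KeyError:
        raise HTTPException(status_code=404, detail="unknown model or version")
    return {"model": req.model, "version": req.version, "output": output}
```

Because every modality is served through the same endpoint, access control, rate limiting, and usage logging live in one place instead of being re-implemented per model type.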

As generative AI continues to evolve, and new modalities become not only increasingly viable but more tightly cross-integrated, it is only natural that a more integrated and future-ready approach be taken to manage generative AI operations. Organizations that embrace this change can avoid significant investment in modality-specific implementations like LLMOps and better position themselves for success in the rapidly advancing field of generative AI, ensuring agility, efficiency, and readiness for the next wave of technological innovation.


Matt White

AI Researcher | Educator | Strategist | Author | Consultant | Founder | Linux Foundation, PyTorch Foundation, Generative AI Commons, UC Berkeley