From the course: MLOps and Data Pipeline Orchestration for AI Systems
LLM model deployment and operations
- [Instructor] We've discussed LLM development and evaluation. Let's move on and discuss LLM deployment and operations. This, of course, includes model deployment after training and evaluation. LLMs can be deployed as APIs, as serverless endpoints, or on-device, but serving large models introduces major constraints in latency, cost, and scale, which drives the use of optimization tools. Once you deploy the model, you have to actively monitor it. This involves tracking prompt behavior, toxic output, and hallucinations alongside traditional metrics like latency and resource usage. Monitoring at this level requires custom alerts for misuse, bias, or safety violations. You're likely to be improving your model periodically, which means you'll need to deploy multiple versions. LLM versioning includes tracking not just model weights, but also prompts, adapters, and fine-tuning configurations, which means you need registries that manage base models and all modifications that you make along with the relevant…
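
The versioning idea in the transcript (tracking weights together with prompts, adapters, and fine-tuning configurations) might be sketched roughly as follows. This is a minimal in-memory illustration, not any particular registry product. The class names `LLMVersion` and `ModelRegistry`, the field names, and every value in the usage lines are hypothetical and chosen only for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class LLMVersion:
    """One deployable variant: base model plus every modification (hypothetical schema)."""
    base_model: str                      # identifier of the base weights
    adapter: Optional[str]               # e.g. an adapter checkpoint name, or None
    prompt_template: str                 # the prompt shipped with this version
    finetune_config: dict = field(default_factory=dict)

    def fingerprint(self) -> str:
        # Hash all fields so that changing only the prompt still yields a new version.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


class ModelRegistry:
    """Toy in-memory registry keyed by version fingerprint."""

    def __init__(self) -> None:
        self._versions: Dict[str, LLMVersion] = {}

    def register(self, version: LLMVersion) -> str:
        key = version.fingerprint()
        self._versions.setdefault(key, version)  # re-registering is a no-op
        return key

    def get(self, key: str) -> LLMVersion:
        return self._versions[key]

    def __len__(self) -> int:
        return len(self._versions)


# Usage: same base model and adapter, different prompt -> two distinct versions.
registry = ModelRegistry()
v1 = LLMVersion("base-model-a", "adapter-1", "Answer briefly: {question}", {"lr": 1e-4})
v2 = LLMVersion("base-model-a", "adapter-1", "Answer step by step: {question}", {"lr": 1e-4})
key1 = registry.register(v1)
key2 = registry.register(v2)
key1_again = registry.register(v1)
```

Fingerprinting the whole record, rather than the weights alone, is one way to make a prompt-only change show up as its own version. A production registry would also persist entries and store artifact locations.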