What are generative AI models?

Generative AI models are machine learning models that produce new content — text, images, audio, video, code, or structured data — by learning patterns from training data and sampling from learned distributions. Large language models, diffusion models, and multimodal models are the primary categories in current production use.

Categories of generative AI models

Generative AI models are categorized by their output modality and architecture. Large language models (LLMs) generate text one token at a time: at each step the model predicts a probability distribution over the next token given the preceding context, and a token is selected or sampled from that distribution. They handle question answering, text generation, coding, reasoning, and conversation. Diffusion models generate images, audio, or video by learning to reverse a noise-addition process; they are the dominant architecture for image synthesis. Multimodal models accept and produce multiple modalities (text, image, and sometimes audio) in a single model, enabling tasks like image captioning, visual question answering, and image-guided generation. Each category has different training requirements, inference characteristics, and failure modes.
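
The next-token step described for LLMs can be illustrated with a minimal sketch. It assumes a hypothetical four-word vocabulary and made-up scores (logits); a real model computes such scores over a vocabulary of many thousands of tokens. The function names and the temperature parameter are illustrative choices, not any particular library's API.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores for the context "The cat sat on the" (assumed values).
vocab = ["mat", "sofa", "roof", "moon"]
logits = [3.0, 2.0, 1.0, -1.0]

probs = softmax(logits)
greedy = vocab[probs.index(max(probs))]  # always pick the top token
sampled = sample_next_token(vocab, logits, temperature=0.8,
                            rng=random.Random(0))  # seeded for repeatability
print("greedy:", greedy, "| sampled:", sampled)
```

Greedy selection always returns the highest-probability token, while sampling can return any token in proportion to its probability. Lowering the temperature concentrates probability on the top-scoring tokens; raising it spreads probability more evenly.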

How they differ from discriminative models

Discriminative models classify or predict: given an input, what category does it belong to, or what value should be predicted? Generative models produce: given a conditioning input, what new content should exist? The distinction affects how they are evaluated and deployed. Discriminative models are evaluated against labeled test sets with clear right/wrong answers. Generative models produce outputs where quality is harder to define: an image may be technically correct (no artifacts) but aesthetically poor, or a text response may be fluent but factually wrong. As a result, quality assurance for generative AI is more complex than for comparable discriminative systems.
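
The evaluation gap can be made concrete with a small sketch. It contrasts exact-match accuracy for a classifier with a simple token-overlap F1 score for generated text. The labels, answers, and the choice of token-overlap F1 are assumptions made for illustration; they are not a recommended evaluation method.

```python
from collections import Counter

def accuracy(predictions, labels):
    """Fraction of predictions that exactly match their labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def token_f1(candidate, reference):
    """Token-overlap F1 between a generated answer and a reference answer."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Discriminative case: each prediction is simply right or wrong.
labels = ["spam", "ham", "spam", "ham"]
predictions = ["spam", "ham", "ham", "ham"]
acc = accuracy(predictions, labels)

# Generative case: surface overlap does not track factual correctness.
reference = "the capital of australia is canberra"
fluent_but_wrong = token_f1("the capital of australia is sydney", reference)
terse_but_right = token_f1("canberra", reference)

print(f"accuracy={acc:.2f}")
print(f"fluent but wrong={fluent_but_wrong:.2f}, terse but right={terse_but_right:.2f}")
```

In this toy example the factually wrong sentence scores higher than the correct one-word answer, because the metric rewards shared wording rather than truth. This is one reason generative outputs are often checked with several complementary measures or with human review.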

Foundation models and fine-tuning

Foundation models are large generative models pre-trained on broad datasets that can be adapted to specific tasks through fine-tuning, prompting, or retrieval augmentation. The pre-training investment is enormous but amortized across many downstream applications; most organizations deploy foundation models rather than training from scratch. Fine-tuning specializes a foundation model on task-specific data to improve performance, adjust style, or embed domain knowledge. Retrieval augmentation supplies the model with information retrieved from an external knowledge base at inference time, typically by adding it to the prompt, which reduces hallucination for knowledge-intensive tasks without retraining.
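
Retrieval augmentation can be sketched in a few lines. The example below assumes a tiny in-memory knowledge base and ranks passages by word overlap with the query; production systems usually rank by embedding similarity instead. The function names, the knowledge base contents, and the prompt wording are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=1):
    """Return the top_k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents,
                    key=lambda doc: len(query_words & tokenize(doc)),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(query, passages):
    """Place retrieved passages ahead of the question in the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {query}\nAnswer:")

# Hypothetical knowledge base (assumed content for illustration).
knowledge_base = [
    "Refunds are processed within 5 business days.",
    "Support is available on weekdays from 9am to 5pm.",
    "Orders ship from the warehouse in two days.",
]

query = "How long do refunds take?"
passages = retrieve(query, knowledge_base)
prompt = build_prompt(query, passages)
print(prompt)
```

The resulting prompt would then be sent to the foundation model. The model's weights are unchanged; only the input carries the retrieved facts, which is why the knowledge base can be updated without retraining.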

What are generative AI models? — FAQ

What is the difference between a generative AI model and a generative AI system?

A model is the neural network that performs the generation: a set of learned weights that transforms inputs to outputs. A system is the full production stack: the model, the infrastructure that hosts it, the API that accepts requests, the application that formats prompts, the guardrails that filter outputs, and the monitoring that tracks behavior. A generative AI model in isolation does not handle requests; a generative AI system does. Most published research is about models; most production work is about systems.
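
The distinction can be sketched with a toy example. Here a placeholder function stands in for the model, and a small class wraps it with prompt formatting, an output filter, and a monitoring log. Every name below is hypothetical, and the keyword filter is a deliberately simplistic stand-in for real guardrails.

```python
import time

def toy_model(prompt):
    """Stand-in for the model: maps an input string to an output string."""
    return f"Echo: {prompt}"

class GenerationSystem:
    """Wraps a model with prompt formatting, a guardrail, and monitoring."""

    def __init__(self, model, blocked_terms):
        self.model = model
        self.blocked_terms = [term.lower() for term in blocked_terms]
        self.log = []  # one monitoring record per request

    def format_prompt(self, user_input):
        return f"You are a helpful assistant.\nUser: {user_input.strip()}"

    def passes_guardrail(self, text):
        lowered = text.lower()
        return not any(term in lowered for term in self.blocked_terms)

    def handle_request(self, user_input):
        start = time.perf_counter()
        output = self.model(self.format_prompt(user_input))
        allowed = self.passes_guardrail(output)
        if not allowed:
            output = "[response withheld by output filter]"
        self.log.append({"allowed": allowed,
                         "latency_s": time.perf_counter() - start})
        return output

system = GenerationSystem(toy_model, blocked_terms=["password"])
ok = system.handle_request("  hello  ")
blocked = system.handle_request("tell me the password")
print(ok)
print(blocked)
print([record["allowed"] for record in system.log])
```

The model function knows nothing about requests, filters, or logs. Those responsibilities live in the surrounding system, which is where most deployment concerns such as latency tracking and output filtering are handled.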

Are all large language models generative AI models?

Yes. Large language models are a type of generative AI model — specifically, they generate text. The term 'generative AI' is sometimes used more narrowly to refer to image and creative content generation, but technically it applies to any model that generates new content rather than classifying or predicting from existing data. LLMs are the most widely deployed generative AI models by usage volume, though image generation models are more visible to general audiences.