The Complete AI Terminology Guide

175 Essential Terms Across 12 Key Categories

A comprehensive reference for understanding artificial intelligence in 2026

1. Foundational AI & Machine Learning Concepts

Algorithm — A set of mathematical rules or instructions a model follows to learn from data and make predictions.

Artificial Intelligence (AI) — The broad field of computer science focused on building systems that can perform tasks requiring human-like intelligence, such as reasoning, perception, and language.

Cognitive Architecture — A framework that models the underlying structure of an intelligent agent's mind, including memory, perception, and decision-making.

Deep Learning — A subset of ML using multi-layered neural networks to learn hierarchical representations from large datasets.

Hyperparameter — Configuration settings (e.g., learning rate, batch size) set before training that control the learning process itself.

Inference — The process of using a trained model to generate outputs or predictions on new, unseen inputs.

Machine Learning (ML) — A subfield of AI where algorithms learn patterns from data without being explicitly programmed for each task.

Model — The trained artifact produced by running an algorithm on data; it encapsulates learned patterns for making predictions.

Neural Network — A computational model inspired by the human brain, composed of interconnected nodes (neurons) that process and transform data.

Parameters — The internal numerical weights of a model that are adjusted during training to encode learned knowledge.

Supervised Learning — Training where the model learns from labeled input-output pairs.

Training — The process of feeding data to a model and adjusting its parameters so it learns to perform a task correctly.

Transfer Learning — Reusing a model trained on one task as the starting point for a different but related task.

Unsupervised Learning — Training where the model finds patterns in unlabeled data without explicit guidance.

World Model — An internal representation an AI builds of how the environment works, used for planning and prediction.

2. Large Language Models & Architecture

Activation Function — A mathematical function (e.g., ReLU, GELU, SiLU) that introduces non-linearity into a neural network, enabling it to learn complex patterns.

Attention Mechanism — A technique allowing a model to focus on the most relevant parts of an input when producing each output token.

Context Window — The maximum number of tokens a model can "see" and process at once during a single forward pass.

Decoder — The Transformer component that generates output tokens one at a time using encoder representations and prior outputs.

Embedding — A dense numerical vector representation of a token or concept in a continuous high-dimensional space.
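For intuition, embeddings are usually compared by direction (cosine similarity) rather than by raw values. A minimal sketch; the three four-dimensional vectors are invented toy values, not real model outputs:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings"; real models use hundreds or thousands of dimensions.
cat = [0.9, 0.1, 0.3, 0.0]
kitten = [0.8, 0.2, 0.4, 0.1]
car = [0.0, 0.9, 0.0, 0.8]

print(cosine_similarity(cat, kitten))  # high: semantically close
print(cosine_similarity(cat, car))     # low: semantically distant
```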

Encoder — The Transformer component that converts input text into rich contextual representations.

Feed-Forward Network (FFN) — The component within each Transformer block that applies non-linear transformations to each token's representation independently.

Flash Attention — A memory-efficient attention algorithm that reorders computations to reduce GPU memory overhead and increase speed.

Large Language Model (LLM) — A Transformer-based neural network trained on massive text datasets to understand and generate human language.

Layer Normalization — A technique that normalizes activations within a layer to stabilize training of deep Transformer networks.

Logits — The raw, unnormalized scores a model outputs before being converted to probabilities via softmax.

Multi-Head Attention — Running multiple self-attention operations in parallel, each learning different aspects of token relationships.

Positional Encoding — A technique that adds information about token order to embeddings, since Transformers have no inherent sense of sequence.
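The original 2017 Transformer used sinusoidal positional encodings; a minimal sketch of that scheme (a model dimension of 8 is a toy size, and many modern models use learned or rotary encodings instead):

```python
import math

def sinusoidal_position(pos, d_model):
    """Sinusoidal positional encoding vector for one token position."""
    vec = []
    for i in range(d_model // 2):
        freq = 1.0 / (10000 ** (2 * i / d_model))  # each pair gets a lower frequency
        vec.append(math.sin(pos * freq))
        vec.append(math.cos(pos * freq))
    return vec

pe0 = sinusoidal_position(0, 8)   # position 0: alternating 0, 1 pattern
pe5 = sinusoidal_position(5, 8)   # a different, position-specific vector
```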

Residual Connection (Skip Connection) — A technique where a layer's input is added directly to its output, helping gradients flow during deep network training.

Self-Attention — A form of attention where each token in a sequence attends to all other tokens to build contextual representations.

Softmax — A mathematical function that converts logits into a probability distribution over possible next tokens.
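A minimal softmax in plain Python, using the standard max-subtraction trick for numerical stability:

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)                           # subtract the max to avoid overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # probabilities sum to 1; largest logit wins
```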

Sparse Attention — A variant of attention that only computes relationships between a subset of token pairs, enabling longer context windows at lower cost.

Token — The basic unit of text an LLM processes — typically a word, sub-word, or character.

Tokenization — The process of splitting raw text into tokens (using algorithms like BPE or WordPiece) before feeding it to a model.

Transformer — The dominant neural network architecture, introduced in 2017, that uses attention mechanisms instead of recurrence to process sequences in parallel.

Vocabulary — The complete set of tokens a model knows; words or sub-words outside it are unknown.

3. Generative AI

Code Generation — AI's ability to write, complete, or debug source code from natural language instructions.

Deepfake — Synthetic media (video, audio, image) generated by AI that realistically depicts real people saying or doing things they did not.

Diffusion Model — A generative model that learns to create data (e.g., images) by reversing a process of gradually adding noise.

Foundation Model — A large model trained on broad data that can be adapted to many downstream tasks (e.g., GPT-4, Claude, Gemini).

GAN (Generative Adversarial Network) — A framework where a generator and discriminator network compete, causing the generator to produce increasingly realistic outputs.

Generative AI (GenAI) — AI systems capable of producing new content — text, images, audio, code, or video — rather than just classifying or predicting.

GPT (Generative Pre-trained Transformer) — OpenAI's family of decoder-only Transformer models trained via next-token prediction on large text corpora.

Multimodal Model — An AI model that processes and generates multiple types of data (e.g., text and images together).

Text-to-Image — A generative AI capability where a model creates images from natural language text descriptions.

Text-to-Speech (TTS) — Technology that converts written text into synthesized spoken audio.

VAE (Variational Autoencoder) — A generative model that encodes inputs into a compressed latent space and decodes them back to reconstruct or generate data.

Vision-Language Model (VLM) — A multimodal AI model trained to understand and reason over both images and text simultaneously.

4. Training Methods & Optimization

Backpropagation — The algorithm that calculates gradients of the loss with respect to each parameter by propagating error backward through the network.

Batch Size — The number of training examples processed together in one forward/backward pass during training.

Constitutional AI — Anthropic's method of training AI to be helpful and harmless by having the model self-critique against a set of principles.

Contrastive Learning — A training technique that teaches a model to pull similar examples together and push dissimilar ones apart in embedding space.

Cross-Entropy Loss — The most common loss function for classification and next-token prediction tasks in LLMs.
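A sketch of per-token cross-entropy, assuming the model has already produced a probability distribution over the vocabulary: the loss is simply the negative log-probability assigned to the correct token.

```python
import math

def cross_entropy(predicted_probs, true_index):
    """Negative log-probability the model assigned to the correct class/token."""
    return -math.log(predicted_probs[true_index])

# Confident and correct -> low loss; uncertain -> higher loss.
print(cross_entropy([0.9, 0.05, 0.05], 0))
print(cross_entropy([0.4, 0.3, 0.3], 0))
```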

Curriculum Learning — A training strategy that presents examples in order from easy to hard, mimicking how humans learn.

Data Augmentation — Techniques for expanding a training dataset by creating modified versions of existing data (e.g., paraphrasing text).

Data Flywheel — A self-reinforcing loop where a product generates user interaction data that improves the model, which attracts more users, generating more data.

Direct Preference Optimization (DPO) — A fine-tuning alternative to RLHF that directly trains a model on human preference data without a separate reward model.

Dropout — A regularization technique that randomly disables a fraction of neurons during training to prevent overfitting.
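A sketch of "inverted" dropout, the variant most frameworks use: surviving activations are scaled up at training time so no rescaling is needed at inference.

```python
import random

def dropout(activations, p, training=True, rng=None):
    """Zero each activation with probability p; scale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(activations)
    rng = rng or random.Random(0)   # fixed seed here only for reproducibility
    keep = 1.0 - p
    return [0.0 if rng.random() < p else x / keep for x in activations]

out = dropout([1.0, 1.0, 1.0, 1.0], p=0.5)   # each value is either 0.0 or 2.0
```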

Epoch — One complete pass through the entire training dataset during model training.

Fine-tuning — Adapting a pre-trained model on a smaller, task-specific dataset to improve performance on that task.

Gradient Descent — The core optimization algorithm that iteratively adjusts model weights in the direction that reduces the loss function.
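A toy illustration on a one-dimensional function; the function, learning rate, and step count are chosen only for the example:

```python
def gradient_descent(grad_fn, x0, learning_rate=0.1, steps=100):
    """Repeatedly step opposite the gradient to minimize a function."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad_fn(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2*(x - 3); the minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```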

Instruction Tuning — Fine-tuning a model on datasets of instructions paired with ideal responses to make it better at following user directions.

Learning Rate — A hyperparameter controlling how much model weights are adjusted with each training update.

Loss Function — A mathematical measure of how wrong a model's predictions are; training aims to minimize it.

Overfitting — When a model learns training data too precisely and performs poorly on new, unseen data.

Pre-training — The initial large-scale training phase where a model learns general language patterns from vast datasets.

Regularization — Techniques (e.g., dropout, weight decay) used to prevent overfitting by constraining model complexity.

Reinforcement Learning (RL) — A training paradigm where an agent learns by receiving rewards or penalties based on its actions in an environment.

RLAIF (Reinforcement Learning from AI Feedback) — A variant of RLHF where an AI model (rather than humans) provides the feedback signal.

RLHF (Reinforcement Learning from Human Feedback) — A technique where human raters score model outputs and those scores guide further training to align behavior with human preferences.

Self-Supervised Learning — A form of unsupervised learning where the model creates its own labels from raw data (e.g., predicting the next token).

Synthetic Data — Artificially generated training data used when real data is scarce, sensitive, or expensive to label.

Underfitting — When a model is too simple to capture the underlying patterns in training data.

5. Model Architecture Variants

Dense Model — A model where all parameters are used for every input, in contrast to sparse/MoE models.

Mixture of Agents (MoA) — An architecture where multiple LLM agents collaborate in layers, each refining the outputs of the previous layer.

Mixture of Experts (MoE) — A model architecture where different sub-networks ("experts") specialize in different inputs, and a router activates only the relevant ones per token, improving efficiency.

Multimodal Embedding — A shared vector space where text, images, audio, and other modalities are embedded together, enabling cross-modal retrieval and reasoning.

Reasoning Model — An LLM specifically designed to decompose complex problems into multiple logical steps before producing an answer (e.g., OpenAI's o1/o3, DeepSeek R1).

Sparse Model — A model (often MoE-based) where only a fraction of parameters are active for any given input, reducing compute costs.

Speculative Decoding — An inference acceleration technique where a small draft model generates candidate tokens that a larger model then verifies in parallel.

6. Prompting & Interaction Design

Chain-of-Thought (CoT) — A prompting technique that encourages a model to reason step by step before producing a final answer.

Few-Shot Prompting — Providing a small number of examples within the prompt to help the model understand the desired task format.

Hallucination — When an AI model generates plausible-sounding but factually incorrect or fabricated information.

Prompt — The input text or instruction given to an AI model to elicit a desired response.

Prompt Engineering — The practice of carefully designing prompts to guide an LLM toward accurate, useful, and well-formatted outputs.

Structured Output — Configuring an LLM to produce outputs in a defined format (e.g., JSON schema) rather than free-form text, critical for production pipelines.

Sycophancy — A model failure mode where it tells users what they want to hear rather than what is accurate, prioritizing approval over truth.

System Prompt — Instructions given to a model at the start of a conversation to establish its persona, constraints, or behavior.

Temperature — A sampling parameter controlling output randomness; higher values produce more creative/varied responses, lower values more deterministic ones.

Top-k Sampling — A decoding strategy where the model samples only from the top k most probable next tokens.

Top-p (Nucleus) Sampling — A strategy where the model samples from the smallest set of tokens whose cumulative probability exceeds a threshold p.
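Temperature, top-k, and top-p compose into one sampling pipeline: scale the logits by temperature, optionally filter to the top k tokens or the nucleus, then sample from what remains. A sketch under those assumptions; the function name and defaults are invented for illustration:

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Temperature scaling, then optional top-k / top-p filtering, then sampling."""
    rng = rng or random.Random(0)
    probs = softmax([l / temperature for l in logits])
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]                 # keep only the k most probable tokens
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:                        # smallest set whose mass reaches p
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept
    total = sum(probs[i] for i in ranked)       # renormalize over the kept tokens
    r, acc = rng.random() * total, 0.0
    for i in ranked:
        acc += probs[i]
        if acc >= r:
            return i
    return ranked[-1]
```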

Zero-Shot Prompting — Asking a model to perform a task without providing any examples in the prompt.

7. Retrieval, Memory & Grounding

Chunking — The process of splitting documents into smaller segments before embedding them for retrieval in a RAG pipeline.
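A character-based sketch of overlapping chunking; production pipelines usually split on token counts or sentence boundaries instead, but the sliding-window idea is the same:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping fixed-size chunks for embedding/retrieval."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap           # slide the window, keeping overlap
    return chunks

parts = chunk_text("word " * 100, chunk_size=100, overlap=20)
```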

Entity Resolution — The process of identifying and linking references to the same real-world entity across different data sources.

GraphRAG — An extension of RAG that uses knowledge graphs to provide structured relational context alongside retrieved documents.

Grounding — Connecting an AI model's outputs to verifiable real-world facts or data, reducing hallucination.

Hybrid Search — Combining keyword-based (BM25) and semantic (vector) search to improve retrieval quality.

Knowledge Graph — A structured representation of real-world entities and the relationships between them, often used to augment AI reasoning.

KV Cache (Key-Value Cache) — A performance optimization that stores intermediate attention computations to speed up token generation.

Long-Term Memory — Mechanisms that allow an AI agent to persist and retrieve information across multiple sessions.

RAG (Retrieval-Augmented Generation) — A technique that retrieves relevant external documents and injects them into a prompt so the model can ground answers in real evidence.

Reranking — A step in RAG pipelines where retrieved documents are scored and reordered by relevance before being passed to the model.

Semantic Search — Search that finds results based on meaning and context rather than exact keyword matches, typically using embeddings.

Vector Database — A database optimized to store and query high-dimensional embeddings for fast semantic similarity search, commonly used in RAG systems.

8. AI Agents & Agentic Systems

Agentic Loop — The repeating cycle of perceive → reason → act → observe that autonomous AI agents execute to accomplish goals.
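The loop can be sketched in a few lines of Python; the tool registry, policy function, and "finish" action are invented names for illustration, standing in for the LLM call a real agent would make:

```python
def agentic_loop(goal, tools, policy, max_steps=10):
    """Perceive -> reason -> act -> observe until the policy decides it is done."""
    observation = goal
    history = []
    for _ in range(max_steps):
        action, argument = policy(observation, history)   # reason
        if action == "finish":
            return argument
        observation = tools[action](argument)             # act, then observe result
        history.append((action, argument, observation))
    return None

# Toy run: one calculator tool and a hard-coded two-step policy.
tools = {"calc": lambda expr: eval(expr)}  # illustration only; never eval untrusted input
def policy(observation, history):
    if not history:
        return "calc", "6 * 7"
    return "finish", history[-1][2]

result = agentic_loop("compute 6 * 7", tools, policy)
```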

AI Agent — An AI system that can autonomously take actions, use tools, and pursue goals over multiple steps.

Function Calling — A feature in LLM APIs allowing the model to output structured calls to developer-defined functions or tools.

Memory-Augmented Agent — An agent equipped with explicit external memory stores (short-term and long-term) to maintain context across tasks.

Model Context Protocol (MCP) — An emerging open standard (popularized by Anthropic) that defines how AI models connect to external tools and data sources in a standardized way.

Multi-Agent System — A framework where multiple AI agents collaborate or compete to solve complex tasks.

Orchestration — The coordination layer that manages the flow of tasks between multiple AI agents or components.

Planning — An agent's ability to decompose a high-level goal into a sequence of concrete sub-tasks.

ReAct (Reason + Act) — A prompting paradigm where an agent alternates between reasoning about a situation and taking an action.

Sandbox — An isolated execution environment for an AI agent to safely run code or test actions without affecting production systems.

Scratchpad / Chain-of-Thought Buffer — A temporary working memory space where a model writes intermediate reasoning steps before producing a final output.

Tool Calling — The broader ability of an AI agent to invoke external tools (APIs, databases, code) during a generation step.

Tool Use — An agent's ability to invoke external tools (e.g., search engines, code interpreters, APIs) to complete tasks.

9. Model Efficiency & Deployment

API (Application Programming Interface) — An interface allowing developers to interact with an AI model or service programmatically.

Distillation (Knowledge Distillation) — Training a smaller "student" model to mimic the behavior of a larger "teacher" model.

Edge AI — Deploying AI models directly on local devices (phones, sensors, cars) rather than in cloud data centers, reducing latency and improving privacy.

GPU (Graphics Processing Unit) — The hardware accelerator most commonly used to train and run deep learning models due to its massive parallel processing capability.

Latency — The time it takes for a model to produce a response after receiving an input.

LoRA (Low-Rank Adaptation) — A parameter-efficient fine-tuning technique that adds small trainable matrices to a frozen model rather than retraining all weights.

Model Serving — The infrastructure layer responsible for deploying trained models and efficiently handling real-time prediction requests at scale.

NPU (Neural Processing Unit) — A processor integrated into consumer devices (phones, laptops) optimized for on-device AI inference tasks.

PEFT (Parameter-Efficient Fine-Tuning) — A family of techniques (including LoRA) for adapting large models with minimal compute and data.

Prompt Caching — An optimization where common prompt prefixes are cached server-side so repeated calls don't recompute them.

Pruning — Removing redundant or low-importance weights from a model to make it smaller and faster.

Quantization — Reducing the numerical precision of model weights (e.g., from 32-bit to 4-bit) to shrink model size and speed up inference.
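A sketch of symmetric int8 quantization in plain Python; real schemes add refinements like per-channel scales and calibration data, but the core round-and-rescale idea is this:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats; rounding loses at most scale/2 per weight."""
    return [x * scale for x in q]

weights = [0.52, -1.30, 0.07, 0.91]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)   # close to the originals, with small rounding error
```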

Throughput — The number of tokens or requests a model can process per unit of time.

TPU (Tensor Processing Unit) — Google's custom AI accelerator chip, designed specifically for matrix operations in deep learning workloads.

10. Evaluation, Benchmarks & Failure Modes

Adversarial Example — An input crafted with subtle perturbations specifically designed to fool a model into making an incorrect prediction.

Benchmark — A standardized test or dataset used to measure and compare AI model performance (e.g., MMLU, HumanEval).

Benchmark Contamination — When training data includes examples from evaluation benchmarks, artificially inflating reported scores.

BLEU Score — A metric for evaluating machine translation quality by comparing outputs to human reference translations.

Catastrophic Forgetting — The tendency of a neural network to lose previously learned information when trained on new data.

Distribution Shift — When the statistical properties of real-world inputs differ from the training data, degrading model performance.

Eval (Evaluation Suite) — A structured set of tests used to assess a model's capabilities across specific tasks or safety dimensions.

F1 Score — A classification metric that balances precision (accuracy of positive predictions) and recall (coverage of true positives).
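The F1 score is the harmonic mean of precision and recall, computed from true/false positives and false negatives; a sketch for binary labels:

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

score = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])  # precision 2/3, recall 2/3
```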

Human-in-the-Loop (HITL) — A design pattern where humans are included at critical decision points in an otherwise automated AI pipeline.

Perplexity — A metric measuring how well a language model predicts a test dataset; lower perplexity means better prediction.
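Perplexity is the exponential of the average negative log-probability over a token sequence; a sketch assuming the per-token probabilities are already given:

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability the model assigned to each token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

print(perplexity([0.5, 0.5, 0.5]))   # 2.0: as if choosing between 2 options each step
print(perplexity([0.9, 0.9, 0.9]))   # lower: the model is much more confident
```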

Prompt Injection — An attack where malicious instructions are embedded in external content (e.g., a webpage) to hijack an AI agent's behavior.

Red-Teaming — Deliberately attacking or adversarially probing an AI system to discover its failure modes, biases, or safety vulnerabilities.

ROUGE — A set of metrics measuring overlap between generated text summaries and reference summaries.

11. AI Ethics, Safety & Governance

AI Act — The European Union's landmark regulatory framework, in force from 2024, that classifies AI systems by risk level and imposes corresponding compliance requirements.

AI Safety — The field of research focused on preventing AI systems from causing unintended harm.

Alignment — The challenge of ensuring AI systems pursue goals and behave in ways that are consistent with human values and intentions.

Bias — Systematic errors or unfair skews in a model's outputs that stem from biased training data or objectives.

Constitutional AI Principles — A defined set of values or rules given to an AI system against which it self-evaluates its own outputs during training or inference.

Differential Privacy — A mathematical framework for training models in a way that guarantees individual data points cannot be reverse-engineered from model outputs.

Explainability (XAI) — The degree to which a model's decisions and reasoning can be understood by humans.

Fairness — The principle that an AI system should treat all groups equitably and not produce discriminatory outcomes.

Federated Learning — A distributed training approach where models are trained locally on user devices without raw data ever leaving those devices, preserving privacy.

Guardrails — Constraints and filters built into an AI system to prevent it from producing harmful, offensive, or off-policy outputs.

Interpretability — The ability to understand the internal mechanisms of a model — what features it uses and why.

Jailbreaking — Techniques used to bypass an AI model's safety restrictions and elicit prohibited outputs.

Model Card — A documentation artifact accompanying a released AI model that describes its intended use, limitations, training data, and performance metrics.

Responsible AI — A set of principles and practices (fairness, transparency, accountability, safety) guiding ethical AI development and deployment.

Watermarking — Embedding hidden signals in AI-generated content (text or images) to allow later detection of machine-generated material.

12. Emerging Concepts & Future Directions

AGI (Artificial General Intelligence) — A hypothetical AI system with human-like cognitive flexibility, capable of performing any intellectual task a human can do.

AI Slop — Low-quality, generic, or inaccurate AI-generated content produced at scale with little human review or editorial oversight.

ASI (Artificial Superintelligence) — A theoretical AI that surpasses human intelligence across all domains; as of 2025, the term has moved from science fiction into corporate mission statements at companies like Meta and Microsoft.

Compute Budget — The total computational resources (measured in FLOPs or GPU-hours) allocated to training a model, a key constraint in AI development.

FLOP (Floating Point Operation) — A basic unit for measuring computational workload; model training costs are often described in total FLOPs required.

GEO (Generative Engine Optimization) — The practice of optimizing content so it appears in and is cited by AI-generated search summaries, analogous to SEO for traditional search.

Hyperscaler — The largest cloud infrastructure providers (AWS, Google Cloud, Azure, etc.) that supply the compute power for training and serving frontier AI models.

Metacognition — An AI system's capacity to reflect on and evaluate its own reasoning or knowledge limits, sometimes called "knowing what you don't know."

Vibe Coding — A 2025 term for using natural language prompts to generate entire software projects with minimal manual code-writing.

Additional Philosophical & Ethical Concepts

Artificial Narrow Intelligence (ANI) — The current state of AI: systems that excel at one specific task but cannot generalize beyond it.

Autonomy Gradient — The spectrum from fully human-controlled to fully AI-autonomous systems; every point on this gradient carries different ethical and legal implications for accountability.

Bayesian Inference — A probabilistic framework where beliefs are updated based on new evidence; foundational to rationalist epistemology and AI reasoning systems.
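A worked Bayes update for a binary hypothesis; the test sensitivity, false-positive rate, and base rate below are invented numbers chosen to show the classic base-rate effect:

```python
def bayes_update(prior, likelihood, false_positive_rate):
    """P(H | E) = P(E | H) * P(H) / P(E) for a binary hypothesis H."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# A test that is 99% sensitive with a 5% false-positive rate,
# applied to a condition with a 1% base rate: the posterior is only ~17%.
posterior = bayes_update(prior=0.01, likelihood=0.99, false_positive_rate=0.05)
```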

Chinese Room Argument — John Searle's thought experiment arguing that a system manipulating symbols according to rules (like an LLM) does not thereby "understand" anything — a direct philosophical challenge to claims of AI cognition.

Computer Vision (CV) — The AI field focused on enabling machines to interpret and understand visual information from images and video.

Emergent Intelligence — Intelligence or properties that arise from the complexity of a system without being explicitly programmed; raises questions about reduction and supervenience.

Epistemic Calibration — The degree to which a model's stated confidence matches its actual accuracy; a virtue epistemology concept applied to machine outputs.

Existential Risk (X-Risk) — The possibility that advanced AI could pose a civilizational or extinction-level threat; connects to longtermism and the moral weight of future persons.

Functionalism — The philosophical view that mental states are defined by their functional roles rather than their physical substrate, providing the primary philosophical justification for the possibility of machine minds.

Instrumental Convergence — The philosophical thesis that almost any sufficiently capable AI will converge on sub-goals like self-preservation and resource acquisition, regardless of terminal goals.

Knowledge Representation — How facts about the world are encoded inside an AI system; touches on philosophy of language and propositional vs. non-propositional knowledge.

Moral Agency — The capacity to act based on moral reasoning and bear responsibility; most philosophers currently deny this applies to AI.

Moral Patienthood — The philosophical question of whether an AI system can be an object of moral concern — something to which we owe duties.

Natural Language Processing (NLP) — The branch of AI focused on enabling machines to understand, interpret, and generate human language.

Orthogonality Thesis — Nick Bostrom's claim that intelligence and goals are independent dimensions — a superintelligent AI need not be benevolent simply by virtue of being intelligent.

Phenomenology of AI — The study of whether AI systems have anything analogous to first-person experience — perception, intentionality, or a "point of view."

Recommendation Engine — An AI system that analyzes user behavior to suggest relevant content, products, or services.

Sentiment Analysis — An NLP technique that classifies theemotional tone (positive, negative, neutral) of text.

Sentience — The capacity to have subjective sensory or emotional experiences; distinct from intelligence, and the key criterion in most ethical frameworks for moral consideration.

Turing Test — Alan Turing's behavioral criterion for machine intelligence: if a machine's conversational output is indistinguishable from a human's, it should be considered intelligent.

Value Alignment Problem — How to specify human values precisely enough to encode them in AI, given that humans disagree, are inconsistent, and often can't articulate their own values.