Advanced AI Architectures — 9 Key Terms · WYP.agency

Skip to content

New · The guide “Mastering Claude at work”, free. Download it →

AI APIAn AI API is a technical interface that lets a software application send requests to an AI model hosted by a provider, and retrieve its responses. It is the standard access mode to AI in enterprise, as opposed to local hosting of the model.
DistillationDistillation is a technique that transfers the knowledge of a large AI model (teacher model) to a smaller model (student model), while preserving most of the performance. It enables the deployment of lightweight models with reduced inference cost, viable on more modest infrastructures.
Fine-tuningFine-tuning is an adaptation technique for an already-trained AI model, which consists of continuing its training on a dataset specific to your use case. It modifies the model's internal parameters, in contrast to RAG, which simply injects context at query time.
Function callingFunction calling is the ability of an AI model to invoke predefined functions or tools to execute actions in an external system. The model returns a structured object (JSON) rather than text, allowing the application to call the function and reinject the result into the conversation.
MCP (Model Context Protocol)MCP (Model Context Protocol) is an open standard, introduced by Anthropic in November 2024, that lets an AI model connect to data sources and external tools in a uniform way. It avoids writing specific connectors for every model-application combination.
MoE (Mixture of Experts)Mixture of Experts (MoE) is an AI model architecture that splits the network into specialised sub-models, called experts. For each token processed, a router dynamically selects a few experts, leaving the others inactive. The model has the capacity of a large model but the compute cost of a smaller one.
Open-source modelAn open-source AI model is a foundation model whose weights and architecture are freely downloadable and usable under a permissive licence (Apache 2.0, MIT). It contrasts with the proprietary model (Claude, GPT, Gemini) accessible only via API. The choice engages sovereignty, cost, and long-term flexibility.
RAG (Retrieval-Augmented Generation)RAG (Retrieval-Augmented Generation) is an AI architecture that pairs a search engine across your documents with a generative model. The model answers by relying on citable business data rather than on its training knowledge alone.
Vector databaseA vector database is a database specialised in the storage and retrieval of vectors (embeddings). It allows, for a given query, finding the most semantically close content in a corpus, without exact lexical match. It is the typical search engine of a RAG system.

Other categories

Address copied