Context window - AI Week Radar glossary

The context window is the combined input and output budget for a single LLM call, counted in tokens (one token is roughly 0.75 English words). When a spec says "200k context", that figure covers the system prompt, conversation history, any retrieved chunks, and the room left for the model's response.
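The budget arithmetic can be sketched in a few lines. This is a rough illustration only: it assumes the 0.75 words-per-token rule of thumb quoted above rather than a real tokenizer, and the function names `estimate_tokens` and `remaining_for_output` are hypothetical.

```python
import math

# Heuristic from the entry above: one token is roughly 0.75 English words.
# A real tokenizer will give different counts; treat this as an estimate.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count of text from its whitespace-separated words."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def remaining_for_output(window: int, *parts: str) -> int:
    """Return the tokens left for the response after counting every input part.

    `parts` stands for the system prompt, history, retrieved chunks, and so on.
    A negative result means the inputs alone already exceed the window.
    """
    used = sum(estimate_tokens(p) for p in parts)
    return window - used
```

For example, `remaining_for_output(200_000, system_prompt, history, chunks)` gives an estimate of how much of a 200k window is still free for the model's answer.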

Bigger windows enable longer documents, deeper agent histories, and "fit-your-whole-codebase" workflows. However, recall is not uniform across the window: most models recall content near the start and end better than content in the middle (the "lost-in-the-middle" effect). Above roughly 100k tokens, recall of specific facts often drops sharply.
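One common mitigation, not prescribed by this entry, is to order the context so the most important pieces sit near the edges of the window and the least important ones fall in the middle. A minimal sketch, assuming the chunks arrive already ranked best-first; the function name `order_for_edges` is hypothetical.

```python
def order_for_edges(ranked: list) -> list:
    """Reorder best-first items so the strongest sit at the start and end.

    Items are dealt alternately to the front and the back, so the weakest
    items end up in the middle of the returned list.
    """
    front, back = [], []
    for i, item in enumerate(ranked):
        if i % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    # Reverse the back half so the second-best item is the very last one.
    return front + back[::-1]
```

With five items ranked 1 (best) to 5 (weakest), the result is `[1, 3, 5, 4, 2]`: the two strongest items occupy the first and last positions. Whether this helps a given model is an empirical question.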

Practical answer: RAG is still relevant even at 1M context. Don't replace retrieval with brute-force context-stuffing; combine them.
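Combining retrieval with a large window can be as simple as ranking chunks and packing only the relevant ones into a fixed token budget. The sketch below is a toy under stated assumptions: relevance is plain keyword overlap (a real system would use embeddings or a search index), token cost uses the 0.75 words-per-token heuristic, and the names `overlap_score` and `select_chunks` are hypothetical.

```python
import math


def overlap_score(query: str, chunk: str) -> int:
    """Toy relevance score: number of distinct words shared with the query."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def select_chunks(query: str, chunks: list[str], budget_tokens: int,
                  words_per_token: float = 0.75) -> list[str]:
    """Greedily pick the most relevant chunks that fit in budget_tokens."""
    ranked = sorted(chunks, key=lambda ch: overlap_score(query, ch), reverse=True)
    selected, used = [], 0
    for ch in ranked:
        if overlap_score(query, ch) == 0:
            break  # everything after this point is irrelevant to the query
        cost = math.ceil(len(ch.split()) / words_per_token)  # estimated tokens
        if used + cost > budget_tokens:
            continue  # too big for what is left; a smaller chunk may still fit
        selected.append(ch)
        used += cost
    return selected
```

The point of the design is that the budget, not the window size, decides how much is sent: irrelevant chunks are dropped even when there is room for them.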

See also

RAG

MoE