Context Window
The context window is the amount of text a model can take in at once — the prompt, the conversation so far, and any documents or tool output you include. Everything the model can 'see' for a given response has to fit inside it.
Also known as: context length
The context window is the model’s working memory for a single response, measured in tokens. Everything you want the model to use has to fit inside that window: the instruction, the chat history, retrieved documents, and the output of any tool it just called. Anything that doesn’t fit is either cut off or left out entirely, and the model cannot use it.
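As a rough illustration of "has to fit", here is a minimal sketch of trimming chat history to a token limit. The names `count_tokens` and `fit_to_window` are hypothetical, and counting one token per whitespace-separated word is a crude stand-in for a real tokenizer, which would give different numbers. Dropping the oldest messages first is only one possible policy.

```python
def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # Real tokenizers split text differently; this is only a stand-in.
    return len(text.split())


def fit_to_window(instruction: str, history: list[str], limit: int) -> list[str]:
    """Keep the instruction plus the most recent messages that fit in `limit` tokens."""
    budget = limit - count_tokens(instruction)
    if budget < 0:
        raise ValueError("the instruction alone does not fit in the window")

    kept: list[str] = []
    # Walk from newest to oldest so recent turns survive when space runs out.
    for message in reversed(history):
        cost = count_tokens(message)
        if cost > budget:
            break  # this message and everything older is left out
        kept.append(message)
        budget -= cost

    kept.reverse()  # restore chronological order
    return [instruction] + kept


history = ["one two three", "four five", "six"]
window = fit_to_window("Answer briefly.", history, limit=5)
print(window)
```

With a limit of 5 and a 2-token instruction, only the two most recent messages fit; the oldest one never reaches the model.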
Bigger windows let you pack in more, but more isn’t automatically better: a window stuffed with marginally relevant text can bury the part that matters and degrade the answer. Managing what goes into the window — and what to leave out — is its own discipline, and it’s where a lot of agent reliability is won or lost.
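One way to picture "what to leave out" is a selection step that ranks candidate documents and keeps only the relevant ones that fit a budget. The sketch below is illustrative, not a recommended retrieval method: `relevance` uses naive word overlap as a hypothetical score, `select_context` and `min_score` are made-up names, and token counts again assume one token per word.

```python
def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    return len(text.split())


def relevance(query: str, doc: str) -> int:
    # Hypothetical score: number of distinct words shared with the query.
    return len(set(query.lower().split()) & set(doc.lower().split()))


def select_context(query: str, docs: list[str], budget: int, min_score: int = 1) -> list[str]:
    """Pick the most relevant documents that fit in `budget` tokens."""
    ranked = sorted(docs, key=lambda d: relevance(query, d), reverse=True)
    chosen: list[str] = []
    for doc in ranked:
        if relevance(query, doc) < min_score:
            break  # everything after this is even less relevant: leave it out
        cost = count_tokens(doc)
        if cost <= budget:
            chosen.append(doc)
            budget -= cost
    return chosen


docs = [
    "how to reset your password",
    "office lunch menu for friday",
    "password rules and reset limits",
]
picked = select_context("reset password", docs, budget=10)
print(picked)
```

The lunch menu is dropped even when there is room for it, because text that scores zero only dilutes the part that matters.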