Context Relevance
Context relevance measures whether the documents a RAG system retrieved actually bear on the question. It scores the retrieval step on its own, before the model writes anything, because even the best model can't answer well from the wrong context.
RAG fails in two different places, retrieval and generation, and context relevance isolates the first. It asks whether the chunks the system pulled back are actually about the question, independent of what the model later does with them. Low context relevance means the retriever is feeding the model noise, and no amount of prompt tuning will fix an answer built on irrelevant sources.
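One common way to turn this into a number is the fraction of retrieved chunks judged relevant to the question. The sketch below assumes that definition; it is not the only one in use. The names `context_relevance`, `keyword_overlap_judge`, and `min_shared` are hypothetical, and the keyword-overlap judge is a toy stand-in for whatever relevance judge you actually use (typically an LLM grader or an embedding-similarity threshold).

```python
import re
from typing import Callable, List, Set

# Small illustrative stopword list (an assumption, not a standard set).
STOPWORDS: Set[str] = {"how", "do", "i", "my", "the", "a", "an", "to", "is", "of", "what"}


def _tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap_judge(question: str, chunk: str, min_shared: int = 2) -> bool:
    """Toy judge: a chunk counts as relevant if it shares at least
    `min_shared` non-stopword tokens with the question."""
    question_terms = _tokens(question) - STOPWORDS
    return len(question_terms & _tokens(chunk)) >= min_shared


def context_relevance(
    question: str,
    chunks: List[str],
    judge: Callable[[str, str], bool] = keyword_overlap_judge,
) -> float:
    """Fraction of retrieved chunks the judge considers relevant (0.0 to 1.0).

    Only the question and the retrieved chunks are inspected; the generated
    answer plays no part, which is what keeps this a retrieval-only metric.
    """
    if not chunks:
        return 0.0
    relevant = sum(1 for chunk in chunks if judge(question, chunk))
    return relevant / len(chunks)


question = "How do I reset my router password?"
retrieved = [
    "To reset the router password, hold the reset button for ten seconds.",
    "Our office is closed on public holidays.",
    "Router password resets require admin access.",
    "Quarterly revenue grew in the last fiscal year.",
]
score = context_relevance(question, retrieved)  # 2 of 4 chunks are relevant
```

The judge is passed in as a parameter so the scoring logic stays the same when the toy heuristic is swapped for a stronger one.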
Measuring it separately is the whole point. If your answers are bad, context relevance tells you whether to fix the retriever (low relevance) or the generation step (high relevance, bad answer). It pairs with faithfulness (did the model stick to the retrieved context?) and answer relevance (did the answer address the question?) to cover the full RAG pipeline.
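The triage described above can be sketched as a small decision function. This is an illustration only: the function name `diagnose`, the 0.0 to 1.0 score scale, and the 0.5 threshold are all assumptions, and a real cutoff should be calibrated on your own data.

```python
def diagnose(
    context_relevance: float,
    answer_quality: float,
    threshold: float = 0.5,  # assumed cutoff; tune on your own evaluation set
) -> str:
    """Point at the pipeline stage to fix, given two scores in [0.0, 1.0].

    `answer_quality` is whatever end-to-end answer score you already track;
    `context_relevance` is the retrieval-only score.
    """
    if answer_quality >= threshold:
        return "ok"
    if context_relevance < threshold:
        # Bad answer built on irrelevant context: the retriever is the problem.
        return "fix retriever"
    # Bad answer despite relevant context: the generation step is the problem.
    return "fix generation"


print(diagnose(context_relevance=0.2, answer_quality=0.3))
print(diagnose(context_relevance=0.9, answer_quality=0.3))
```

With these inputs the first call points at the retriever and the second at generation, matching the two failure cases the paragraph distinguishes.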