Does Context7 improve the output of AI tools for developers?


I have been in my current position for six years.

Recently, my work has focused mainly on the independent development of a medium-sized printing e-shop (PHP, Symfony, Doctrine, MySQL, Vue + TypeScript) and taking over, modernizing, and further developing a mobile application (Dart + Flutter). Previously, I worked on a large team project — a REST API for the frontend and integration of third-party services (Node.js, TypeScript, and NestJS).

I have also been involved in smaller and mid-sized projects in PHP + Symfony, debugging and improving mid-sized e-shops, connecting running projects to third-party APIs, and implementing integrations with software lacking standardized APIs.

The idea is roughly this: Tools powered by large language models receive context in the form of relevant documentation. The code they generate should then be higher quality. It should contain fewer outdated usages of the given languages, frameworks, or libraries. Phantom libraries, classes, methods, and functions—often seen in proposed solutions—should disappear. The result should be even better if Context7 is told exactly which documentation to draw from.

It seems obvious. There is no reason it should work otherwise. So let’s just add Context7 to the MCP servers of our favorite AI tools (see the documentation). Or is it not that clear‑cut?
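For reference, registering Context7 as an MCP server usually amounts to adding an entry like the following to the tool's MCP configuration (the shape below follows the Context7 README; the exact file name and location differ per tool, so treat this as a sketch rather than a copy-paste recipe):

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    }
  }
}
```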

Notes on my small test

  • I might have preferred TypeScript for its broad adoption and similarity to other popular C‑like languages. However, I had a PHP + Symfony + Doctrine ORM project at hand with a sufficiently large codebase, and I came up with a simple, easy‑to‑grasp task (for an LLM and for readers). It can be implemented by creating a single file without touching others—otherwise the test results would be hard to present on GitHub and to compare. Successful completion also requires some orientation in the project. The task is close to something one might actually need.
    In my opinion, it has significantly more value than asking an AI to code a simple tic-tac-toe game from scratch.
  • The test is not all‑encompassing. It targets a single use case across JetBrains AI Assistant, Gemini CLI, and Junie, combined with several models. Even in this scope it was time‑consuming. In practice it mainly helped me decide whether I will use Context7.

A few words about the article’s concept

A standard-length article describing the entire process and evaluating every result would not be read. The few exceptions would have their favorite language model summarize it anyway. So I decided to save human effort and compute time and write it directly as a summary.

What did I find?

TL;DR: The assumption did not hold. For current models from major providers, results with Context7 enabled were actually worse. The only improvement appeared with the oldest model tested.

With even smaller, locally hosted models, the positive effect of Context7 could be larger.

Results might differ if, instead of widely used frameworks like Symfony and Doctrine ORM, I targeted something more niche. I have seen some positive reports along those lines.

I am not making a blanket claim that Context7 is useless. It just did not prove helpful under conditions close to how I typically use AI tools.

Prompt

  1. Create a Symfony command (ideally in ShopBundle) that deletes shopping carts (entity Cart) older than a month.
  2. Create a Symfony command (ideally in ShopBundle) that deletes shopping carts (entity Cart) older than a month. use context7
  3. Same as the previous, except Context7 was pointed directly at symfony-docs.

Note: You could argue the prompt could be more detailed, more specific, more precise, etc. However, you can achieve the desired result by generating boilerplate with the non-AI Symfony Maker CLI (or via the Symfony plugin in the IDE) and tweaking it within minutes. So AI generation only adds value if you just type the prompt and run it.
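To make the target concrete: a hand-written solution to the task might look roughly like the sketch below. This is not one of the generated files; the command name, the `App\Command` namespace, and the `updatedAt` field are my assumptions for illustration.

```php
<?php

namespace App\Command;

use App\Entity\Cart; // assumed entity location
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

#[AsCommand(name: 'app:delete-old-carts', description: 'Deletes carts older than one month')]
final class DeleteOldCartsCommand extends Command
{
    public function __construct(private readonly EntityManagerInterface $em)
    {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        // Single bulk DQL DELETE -- no entities are hydrated into memory.
        $deleted = $this->em->createQueryBuilder()
            ->delete(Cart::class, 'c')
            ->where('c.updatedAt < :date')
            ->setParameter('date', new \DateTimeImmutable('-1 month'))
            ->getQuery()
            ->execute();

        $output->writeln(sprintf('Deleted %d abandoned carts.', $deleted));

        return Command::SUCCESS;
    }
}
```

Registered as a service with autowiring, this runs as `php bin/console app:delete-old-carts` and is close to what the Maker-generated boilerplate would become after a few minutes of tweaking.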

Generated solutions

The only result I was happy with came, perhaps surprisingly, from Gemini CLI with the Gemini 2.5 Pro model, but only without Context7. The code matched the current documentation (simpler and clearer syntax), and its core was the following DQL delete query built with the Doctrine QueryBuilder:

$qb->delete(Cart::class, 'c')
    ->where('c.updatedAt < :date')
    ->setParameter('date', $oneMonthAgo);

I saw something similar, though a bit more convoluted, from Junie with Claude 4 Sonnet, again only without Context7.

Other solutions (including 2.5 Pro with Context7) selected ORM entities into an array and then iterated with a foreach loop. With a large number of truly "abandoned" carts, that approach would be significantly heavier.
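For illustration, the heavier pattern looked roughly like this (a paraphrase of the approach, not a verbatim copy of any generated file):

```php
// Hydrates every matching Cart entity into memory...
$carts = $em->getRepository(Cart::class)
    ->createQueryBuilder('c')
    ->where('c.updatedAt < :date')
    ->setParameter('date', new \DateTimeImmutable('-1 month'))
    ->getQuery()
    ->getResult();

// ...then schedules each one for deletion individually.
foreach ($carts as $cart) {
    $em->remove($cart);
}
$em->flush();
```

Each entity is hydrated and tracked by the unit of work, so memory and time grow with the number of rows, whereas the bulk DQL DELETE issues a single statement regardless of how many carts match.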

I would not infer any general ranking of models from this—that is not the point of the test. I often see GPT‑5 go beyond the literal instruction and "think like a developer". Sometimes it is useful to realize that your request does not fully match what you actually need. In this case the unsolicited add‑ons felt more like trying to outsmart me and justify the foreach.

For the same tool with the same model, the Context7 variant performed worse than the baseline. That held for current top models from Google, OpenAI, and Anthropic. Stepping down the stack, only with Gemini 2.0 Flash Lite (an older, heavily trimmed model) did Context7 show a clear improvement. This was the smallest model I could get working with Gemini CLI. I did not go further to install an even smaller local model and wire it into AI Assistant—the effort would outweigh the value of the finding.

Pointing Context7 precisely at symfony-docs further degraded results for the higher‑end models. For 2.0 Flash Lite it flipped the previously positive effect into the opposite.

It is important to note that within JetBrains AI Assistant the Gemini 2.5 Pro model behaved oddly at the time of my testing (September 14, 2025). In PhpStorm it was labeled Beta and the responses were unusually fast and low‑quality compared to typical 2.5 Pro behavior. In Android Studio the Beta label was not present and the model behaved as expected. Today the Beta label is not shown in PhpStorm either and the behavior now matches 2.5 Pro.


All the generated files are available in the GitHub repo (in a transparent way, I hope): https://github.com/petrspevak/context7_test


This article is also available in Czech: Zlepšuje Context7 výstupy AI nástrojů pro vývojáře?