Why Good Context Matters for Efficient AI Assistance in Tools and IDEs

July 3, 2024 | 12 min Read

Recently, AI assistants have revolutionized how users interact with their Integrated Development Environments (IDEs) and tools. By leveraging large language models (LLMs), these assistants can help with coding, creating diagrams and models, and handling complex configurations. At EclipseSource, we specialize in deeply integrating AI into custom tools and IDEs across various domains. In our experience, one of the most important aspects of a great tool integration is its ability to identify the right context, provide it to the LLM, and mold it into a prompting strategy that is optimized for the task at hand. With the right context and the right prompting, the quality and accuracy of responses delivered by state-of-the-art LLMs can be improved significantly. This augmentation and prompting is often even more crucial than the choice of the LLM itself. This blog post explores why identifying the right context is one of the most important aspects of AI assistance in tools and IDEs and how it can significantly enhance performance.

The Importance of Context in AI Assistance

“Context” is the information we provide to an AI in addition to the actual user request. It can be static information, information retrieved from the tool’s state, or information obtained from an additional source (e.g., a knowledge database or some documentation). For IDEs and source code projects, context typically includes:

  • Project files (including the actual code) and structure
  • Documentation and READMEs
  • Current state of the tool/IDE, e.g., open editors

A powerful addition to the context is often the history of what the user did last, which allows predicting the user’s intent and the precise support they need. Similarly, you can augment the context with the requirements that a user is currently working on, whether it is a ticket or a description of the purpose of an artifact they are creating.

When it comes to domain-specific tools, beyond general-purpose coding, the context can often be augmented with domain-specific knowledge. This can be textual information about the domain, documentation about the semantics of a domain-specific language, or the underlying grammar.

Having the right context allows the AI to provide accurate responses, which makes the assistant significantly more useful. Specifically for custom tools, there is huge optimization potential: good context and adequate prompting strategies become key success factors. Context retrieval and the right prompting strategy are highly specific to the respective domain, tool, and use case. However, let’s look at a simple example to illustrate their importance and to highlight the conceptual improvement steps we can take.

A Simple Example

In our example, we just checked out the sources of a new project, in this case, the Eclipse Theia open-source project. We now want to build this project and request help from GitHub Copilot, a popular AI-driven tool directly integrated into the IDE.

Disclaimer: We do not want to criticize GitHub Copilot here; we actually like it a lot. To demonstrate why context matters for tool assistants, however, we had to search for simple examples where it lacks context and therefore does not provide good answers.

So, we ask Copilot:

Build the project.

A bit surprisingly, we get this response:

Sorry, I can’t assist with that.

Apparently, Copilot knows nothing about our context, i.e., the project. This is surprising, because how to build the project is well documented in the README, which is directly available in the workspace that we opened in the IDE.

So maybe we need to be more explicit in telling Copilot to use the workspace context. In Copilot, there is the concept of participants; let’s ask the one which is aware of the project’s workspace.

If we use:

@workspace build my project

Copilot performs some longer-running calculations, apparently to retrieve and summarize information from the workspace. We then get a pretty unusual suggestion to use the ApplicationPackageManager class and write a script to call the function below, which is not really what we would expect.

async function buildProject() {
    const manager = new ApplicationPackageManager();
    const args = []; // Add any webpack arguments you need
    const options: GeneratorOptions = {}; // Customize generator options as needed
    await manager.build(args, options);
}

Copilot obviously obtained the wrong context for this particular question and generated a wrong response as a result.

Now, if we just open the README.md of the workspace in an editor and ask the same question again, we get a good result (see below). This is because we have now implicitly added the README.md to the context of the request. We could have achieved the same by prompting “how to build the project #file README.md” to tell Copilot to retrieve the file and put it into the context. So, obviously, the underlying LLM is very capable of answering this simple question as long as it has the right context.

Copilot’s answer with “README.md” in the context:

Sure, here is a simplified guide to build the Eclipse Theia project:

Steps to Build the Eclipse Theia Project

  1. Clone the Repository

    git clone https://github.com/eclipse-theia/theia.git
    cd theia
    
  2. Install Dependencies

    yarn
    
  3. Build the Project

    yarn build
    

By following these steps, you should be able to build the Eclipse Theia project successfully.

If we do the same exercise with ChatGPT, which is not aware of our workspace and consequently cannot answer the question directly, it asks the right follow-up questions: it asks whether we can provide files such as a README. This indicates that Copilot, at least in our example, prompted the underlying LLM in a way that suppresses these follow-up questions and instead answers solely based on the provided, but unfortunately wrong, context. It also becomes evident that LLMs would, at least in this example, be able to identify which information they need and could retrieve the correct context themselves if we let them. We can even encourage this behavior by adding something like the following to the system prompt:

If you need any files to answer my questions in a better way, please request them.

In an optimized tool integration, we could thus allow the LLM to list the files it requires and then provide them, or let the AI obtain the content of these files autonomously via tool functions. With this addition, we would get the good answer above without any further user interaction.
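To make the idea of letting the LLM request files more concrete, here is a minimal sketch of how a tool integration might handle such requests. It is not how Copilot or ChatGPT work internally. The "REQUEST_FILE:" line convention, the in-memory workspace, and all function names are assumptions made for illustration; a real integration would more likely use the tool-calling mechanism of its LLM provider and the file system API of the IDE.

```typescript
// Hypothetical convention (not a Copilot or ChatGPT feature): the model asks
// for a file by writing a line of the form "REQUEST_FILE: <relative path>".
const FILE_REQUEST_INSTRUCTION =
  "If you need any files to answer my questions in a better way, please request them. " +
  "Write each request on its own line as: REQUEST_FILE: <relative path>";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// The workspace is modeled as an in-memory map from path to file content.
// A real integration would read the files through the IDE's file system API.
type WorkspaceFiles = Record<string, string>;

// Start a conversation whose system prompt invites file requests.
function startConversation(userRequest: string): ChatMessage[] {
  return [
    { role: "system", content: FILE_REQUEST_INSTRUCTION },
    { role: "user", content: userRequest },
  ];
}

// Collect the distinct paths the model asked for in its reply.
function parseFileRequests(reply: string): string[] {
  const paths: string[] = [];
  for (const line of reply.split("\n")) {
    const match = /^\s*REQUEST_FILE:\s*(\S+)\s*$/.exec(line);
    const path = match ? match[1] : undefined;
    if (path !== undefined && paths.indexOf(path) === -1) {
      paths.push(path);
    }
  }
  return paths;
}

// Build the follow-up message that supplies the requested files.
// Returns null when the reply contains no request, i.e., it is a final answer.
function answerFileRequests(reply: string, files: WorkspaceFiles): ChatMessage | null {
  const requested = parseFileRequests(reply);
  if (requested.length === 0) {
    return null;
  }
  const parts = requested.map((path) =>
    Object.prototype.hasOwnProperty.call(files, path)
      ? "File " + path + ":\n" + files[path]
      : "File " + path + " does not exist in the workspace."
  );
  return { role: "user", content: parts.join("\n\n") };
}
```

The integration would send the messages from startConversation to the LLM, pass each reply through answerFileRequests, and append the returned message to the conversation until the function returns null. The user never sees these intermediate rounds.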

User context

If we look at the latest response above on how to build the project, it is actually correct, but it still has a shortcoming. It contains more steps than we actually asked for. In our scenario, the user had already checked out the code and was in the right directory. The tool could actually already know that. This demonstrates that we can often achieve even better answers by providing context about the tool state. For example, we could provide the last user actions like this:

System prompt:
I am in the Theia open source project’s code base with my IDE. The last commands I executed were
– git clone
– opened folder “theia”

User prompt:
Build the project

With this additional context information, we get a perfect answer, one that is even directly executable.

yarn
yarn build
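A tool could assemble a system prompt like the one above from state it already tracks. The following is a minimal sketch under that assumption; the ToolState shape, the function name, and the limit of five actions are hypothetical and only illustrate the idea.

```typescript
// Hypothetical snapshot of what the tool already knows about the session.
interface ToolState {
  projectName: string;
  recentActions: string[]; // oldest first
}

// Turn the tool state into a system prompt. Only the most recent actions are
// kept so that the prompt stays short.
function buildSystemPrompt(state: ToolState, maxActions: number = 5): string {
  // slice(-0) would return the whole array, so a limit of 0 is handled explicitly.
  const recent = maxActions > 0 ? state.recentActions.slice(-maxActions) : [];
  const lines = ["I am in the " + state.projectName + "'s code base with my IDE."];
  if (recent.length > 0) {
    lines.push("The last commands I executed were");
    for (const action of recent) {
      lines.push("- " + action);
    }
  }
  return lines.join("\n");
}
```

Called with the project name “Theia open source project” and the two actions from our scenario, this produces essentially the system prompt shown above.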

This simple example shows how we can improve the quality of responses from an AI assistant by providing the right context and prompt. The example is, of course, made up to illustrate this point; we will discuss restrictions and applicability in the section Another Disclaimer below.

Before we get to this discussion, though, let’s look at another example and other options for narrowing the context to provide better results.

Narrowing down the context

A restriction of the aforementioned solutions is that they target general-purpose IDEs and are supposed to work in all coding projects. Because this scope is so broad, the assistant cannot make too many assumptions, such as the existence of a README. While this increases the potential user base, it also limits the depth of context the assistant can automatically infer. If we narrow the use case and the domain, it is typically much easier to provide the right context and a more streamlined prompting strategy. You can see this in Copilot when looking at the use case of code completion.

When a user wants to auto-complete a line of code, the tool integration knows the file the user is in and has a marker indicating the “line to complete”, so it can ask the underlying AI a pretty precise question. As a result, the output is much better than when asking a generic question in the Chat view.
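As an illustration of such a precise question, the sketch below builds a completion prompt from the open file and the cursor position. We do not know how Copilot actually constructs its completion prompts; the marker token, the EditorPosition shape, and the size of the surrounding window are assumptions.

```typescript
// Hypothetical marker; any token that is unlikely to occur in real code works.
const COMPLETION_MARKER = "<LINE_TO_COMPLETE>";

interface EditorPosition {
  filePath: string;
  lines: string[];    // content of the open file, one entry per line
  cursorLine: number; // zero-based index of the line being edited
}

// Build a prompt that contains the lines around the cursor and marks the line
// the model should complete.
function buildCompletionPrompt(editor: EditorPosition, windowSize: number = 20): string {
  const start = Math.max(0, editor.cursorLine - windowSize);
  const end = Math.min(editor.lines.length, editor.cursorLine + windowSize + 1);
  const excerpt = editor.lines
    .slice(start, end)
    .map((line, offset) =>
      start + offset === editor.cursorLine ? line + COMPLETION_MARKER : line
    );
  return [
    "Complete the line ending with " + COMPLETION_MARKER + " in the file " + editor.filePath + ".",
    "Reply with the completed line only.",
    "",
    excerpt.join("\n"),
  ].join("\n");
}
```

The point of the sketch is that the tool, not the user, supplies the file, the position, and the task, so the model receives a narrow and well-defined request.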

A similar scheme applies if we narrow down not only the editor context but also the domain, e.g., when working with a domain-specific language. In the example video below, we see a graphical editor supporting a flow chart-like syntax. The graphical editor provides its dedicated prompt to an AI. Therefore, we can automatically fill the context with at least the following information:

  • The user works in a diagram editor
  • The “meta-model” of the underlying graphical language, i.e., which nodes and edges exist and how they can be connected
  • The current state of the diagram (empty in the example)
  • The available tool functions, such as “create node”, “layout diagram”, “center element”, etc.
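The list above could translate into a context string roughly as follows. This is a simplified sketch and not the implementation behind the demo: the meta-model shape, the diagram state, and the tool descriptions are hypothetical.

```typescript
// Hypothetical, simplified meta-model of a flow chart language.
interface FlowMetaModel {
  nodeTypes: string[];
  // Allowed connections as [source node type, target node type] pairs.
  allowedEdges: Array<[string, string]>;
}

interface FlowDiagramNode {
  id: string;
  type: string;
  label: string;
}

interface FlowDiagramState {
  nodes: FlowDiagramNode[];
  edges: Array<{ source: string; target: string }>;
}

interface EditorToolFunction {
  name: string;
  description: string;
}

// Combine editor kind, meta-model, diagram state, and tool functions into one
// context string for the LLM.
function buildDiagramContext(
  meta: FlowMetaModel,
  state: FlowDiagramState,
  tools: EditorToolFunction[]
): string {
  const sections = [
    "You assist a user who works in a diagram editor for flow charts.",
    "Node types: " + meta.nodeTypes.join(", "),
    "Allowed edges: " +
      meta.allowedEdges.map(([source, target]) => source + " -> " + target).join(", "),
    state.nodes.length === 0
      ? "The diagram is currently empty."
      : "Current diagram: " + JSON.stringify(state),
    "Available tool functions:\n" +
      tools.map((tool) => "- " + tool.name + ": " + tool.description).join("\n"),
  ];
  return sections.join("\n");
}
```

Because the editor owns all of this information, it can fill the context automatically, without the user having to describe the language or the available operations.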

As you can see in the demo video, based on a very simple user prompt, the assistant can create a complete, semantically correct flow chart and even apply an auto layout.

If we went even further and augmented the context not only with details about the language but also with details about the domain, e.g., coffee machines, and the current user task, we would very likely get even better results.

In our customer projects, we found that investing thought and care in designing context retrieval and the prompting strategy pays off significantly in terms of precision and quality. This effort is often more effective than fine-tuning, at least in terms of how fast you can improve an AI system initially. And of course, especially for providing domain-specific context, an intelligent integration of RAG (Retrieval-Augmented Generation) can lift an assistant to the next level.
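To illustrate the retrieval step of RAG, here is a deliberately naive sketch that selects domain snippets by keyword overlap with the user request and prepends them to the prompt. Production systems typically use embedding-based similarity search instead; the keyword scoring here is a stand-in chosen to keep the example self-contained, and all names and the snippet format are assumptions.

```typescript
interface KnowledgeSnippet {
  title: string;
  text: string;
}

// Split text into lowercase alphanumeric word tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((token) => token.length > 0);
}

// Score a snippet by the number of distinct query terms that occur in it.
function overlapScore(queryTerms: string[], snippet: KnowledgeSnippet): number {
  const snippetTerms = tokenize(snippet.title + " " + snippet.text);
  const distinct = queryTerms.filter((term, index) => queryTerms.indexOf(term) === index);
  return distinct.filter((term) => snippetTerms.indexOf(term) !== -1).length;
}

// Return the topK snippets with the highest non-zero score.
function retrieveSnippets(
  query: string,
  snippets: KnowledgeSnippet[],
  topK: number = 2
): KnowledgeSnippet[] {
  const terms = tokenize(query);
  return snippets
    .map((snippet) => ({ snippet, score: overlapScore(terms, snippet) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.snippet);
}

// Prepend the retrieved snippets to the user request.
function buildAugmentedPrompt(query: string, snippets: KnowledgeSnippet[]): string {
  const context = retrieveSnippets(query, snippets)
    .map((snippet) => snippet.title + ": " + snippet.text)
    .join("\n");
  return context.length > 0 ? "Domain knowledge:\n" + context + "\n\nRequest: " + query : query;
}
```

Even this simple scheme shows the design choice behind RAG: instead of sending the whole knowledge base, the tool sends only the few snippets that are likely to matter for the current request.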

A more detailed discussion of successful strategies in this area would go beyond the scope of this article; we will publish dedicated articles on them soon.

Another Disclaimer

As mentioned before, the examples are specifically designed to show what AI assistants can and cannot do depending on the context we provide. In particular, we do not want to criticize GitHub Copilot. In its prime use case, code completion and writing chunks of code based on user prompts, it obviously performs much better than in our example, precisely because it has better context. However, the example reveals a general issue for general-purpose IDEs that are supposed to work in all projects and domains: they cannot make too many assumptions, such as the existence of a README. While this increases the potential user base, it also limits the depth of context they can automatically infer.

This clearly shows the potential of supporting specific use cases with tailored AI assistance. We often observe surprisingly good results when building specialized AI assistants for our customers, specifically for domain-specific tools and use cases. They often work very efficiently and sometimes even make manual intervention obsolete. There is a huge difference between building a prompting strategy for a general-purpose IDE and building one for a specialized tool, such as one for modeling embedded devices like coffee machines.

Furthermore, general-purpose tools, such as Copilot, also have to deal with questions such as controlling access to sensitive data, managing the context window size, and balancing the number of requests to the LLM. As we can see clearly with Copilot, there are also some guardrails that aim to avoid wrong answers and to keep the assistant from becoming too verbose by asking additional questions. This is a design choice; other AI IDEs such as Cursor IDE actually ask for more information in such use cases.

Again, these restrictions might not apply in a more specific use case. For example, if a company decides to provide its own LLM instance, it can allow its internal users to send all data to it. In this case, the AI integration in a tool would have much more freedom in creating a good context. In this scenario with controlled hosting, the context window size and the number of requests to the AI might also be much less problematic. This would again provide more freedom to create good prompting strategies, e.g., with prompt chaining, and to focus on maximum efficiency for the user rather than on minimizing requests.

One real shortcoming we observe in many existing AI tool integrations, such as Copilot or Cursor IDE, is that the prompting strategies are unfortunately not transparent. Therefore, we cannot look at the details, nor can we ourselves tweak the prompts to obtain more accurate results. We do not even know exactly which information is exchanged with the underlying AI at all. However, this is a topic for another blog post.

Conclusion

Providing the right context is essential for efficient AI assistance in tools and IDEs. By combining the right context with the appropriate prompting strategy, tools and IDEs can leverage AI to significantly enhance their users’ productivity and streamline their workflows. However, it is important to note that the wider the use case, domain, and user base, the more challenging it becomes to create effective context and prompting strategies. This complexity can often lead to suboptimal performance in general-purpose tools.

Building tailored AI integrations can circumvent these challenges, resulting in much more efficient and effective AI assistance. Specialized AI assistants designed for specific domains and use cases can leverage precise context and sophisticated prompting strategies, delivering superior results. At EclipseSource, we have seen the benefits of this approach in our customer projects. By focusing on the design of context retrieval and prompting strategies, we have been able to achieve significant improvements in the precision and quality of AI assistance.

Unfortunately, the prompt strategies of existing AI tool integrations, such as Copilot or Cursor IDE, are not open source. This lack of transparency limits our ability to tweak and optimize their strategies further. However, as the field evolves, we hope to see more open and customizable solutions that will allow developers to fully harness the power of AI in their tools and IDEs.

EclipseSource is at the forefront of technological innovation, ready to guide and support your AI initiatives. Our comprehensive services in AI integration are designed to provide the specialized know-how necessary to develop customized, AI-enhanced solutions that elevate your tools and IDEs to new levels of efficiency and innovation. Explore how we can assist in integrating AI into your tools with our AI technology services. Reach out to begin your AI integration project with us.

Jonas, Maximilian & Philip

Jonas Helming, Maximilian Koegel and Philip Langer co-lead EclipseSource. They work as consultants and software engineers for building web-based and desktop-based tools. …