AI Context Management in Domain-specific Tools

July 26, 2024 | 19 min Read

Integrating AI into custom tools and IDEs is transforming productivity in many industries. However, as outlined in a recent blog post, the true effectiveness of these AI integrations depends not just on the capabilities of Large Language Models (LLMs) but also on how efficiently context is managed within these tools.

In this blog post, we explore how to manage AI context in tools and IDEs, focusing on domain-specific tools and providing practical strategies and real-world examples. You’ll learn about key techniques such as Retrieval Augmented Generation (RAG), workspace maps, and autonomous context requesting strategies. We’ll use a concrete example tool to illustrate how these strategies can be implemented effectively. By understanding and applying these methods, you can significantly enhance the performance and relevance of AI integrations in your domain-specific applications.

Relevant Context in Domain-Specific Tools

In software tools and IDEs, the AI integration needs to handle various types of context effectively. This includes:

  • Workspace files and project configuration: The files that make up the current project, as well as its settings and configurations.
  • Tool state: Information about open editors, selected elements, user interactions, problem markers, and more.

While modern LLMs are often already familiar with general-purpose programming languages and generic IDEs, domain-specific tools add another layer of complexity. As these tools might be unknown to the underlying LLM, they often require additional domain-specific knowledge, including:

  • Domain-Specific Languages and Formats: Understanding the specific languages and data formats used in different domains, including their syntax and semantics.
  • Custom Editors and Commands: Handling unique functionalities and commands provided by custom editors in the tool, such as graphical diagram editors, complex tree-, table- and form-based editors, with their own set of editing operations and capabilities.
  • Specialized Knowledge of the Domain: Incorporating specific knowledge about the domain that might not be covered by off-the-shelf language models, such as specs of hardware components for embedded engineering or other unique rules, workflows, or constraints of the domain.

Context Management and Augmentation

Managing and augmenting this diverse range of context types is crucial. The challenge is to identify the right context with the right level of detail for a specific user request and feed this context to the LLM alongside the user request. While context windows for modern language models are continuously expanding, adding lots of irrelevant data unnecessarily increases costs and, often even more importantly, reduces response quality, as many language models are susceptible to the so-called lost-in-the-middle problem.

The most common techniques to augment context in IDEs and tools include:

  • Retrieval Augmented Generation (RAG): Retrieves relevant content from a data source for a given user request.
  • Workspace Maps: Uses a parser generator like Tree-sitter to create a condensed tree representation of the workspace.
  • Autonomous Context Requests: Sets up a prompt chain that incrementally provides context, allowing the LLM to choose and request context as needed.

Example: Robot Engineering Tool

We will go into more detail about each of these context augmentation strategies based on a concrete example: a robot engineering tool. With this example tool, users can define the behavior of a robot built from modular hardware components.

A project in this tool includes:

  • Configuration File: Specifies the project’s name, purpose, target runtime, and imported libraries.
  • Hardware Component Model: Describes building blocks like sensors, actuators, and communication modules.
  • Program Files: Defines robot behavior using a rule-based domain-specific language (event-condition-action).

In this example, we want to provide an AI assistant that can help users with questions in three categories:

  1. Hardware Components Catalog: For questions about hardware components, we use RAG.
  2. Code: For code-related questions, we use in-context learning, meaning we provide explanations of the DSL as part of the system prompt, alongside workspace maps.
  3. Project Configuration: For project configuration questions, we use autonomous context requesting.

In all other cases, we don’t want the assistant to give a substantive answer, in order to avoid hallucinations and inappropriate responses.

Routing to the Appropriate Context Augmentation Strategy

In more complex AI tool integrations, different user requests may require different context retrieval strategies, such as the ones outlined for our example robot engineering tool above. A common approach is to use a language model to classify the user request and route it to the specific context retrieval strategy. While this approach adds a round-trip and thus extends the response time, we can mitigate this by relying on a less capable and cheaper language model to save on cost and response time. For very narrow use cases, it might even suffice to obtain an embedding as a semantic representation of the user question and classify it by comparing this embedding to the embeddings of prepared example questions for each category.
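To make the embedding-based variant more concrete, here is a minimal sketch, assuming an OpenAI-style embeddings endpoint and a hypothetical set of example questions per category; in practice, the example embeddings would be precomputed and the similarity threshold tuned on real requests.

import OpenAI from "openai";

const openai = new OpenAI();

// Hypothetical example questions per category; a handful per category is usually enough.
const exampleQuestions: Record<string, string[]> = {
  "hardware-components-catalog": ["Which sensors can I add to my robot?"],
  "code": ["Which actions are triggered by an event?"],
  "project-configuration": ["How do I change the target runtime?"],
};

// Computes an embedding as a semantic representation of a text.
const embed = async (text: string): Promise<number[]> => {
  const result = await openai.embeddings.create({ model: "text-embedding-3-small", input: text });
  return result.data[0].embedding;
};

// Cosine similarity between two embedding vectors.
const cosine = (a: number[], b: number[]): number => {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  return dot / (norm(a) * norm(b));
};

// Classifies a user request by comparing its embedding to the embeddings of the example questions.
const classify = async (userRequest: string): Promise<string> => {
  const requestEmbedding = await embed(userRequest);
  let best = { category: "other", similarity: 0 };
  for (const [category, questions] of Object.entries(exampleQuestions)) {
    for (const question of questions) {
      const similarity = cosine(requestEmbedding, await embed(question));
      if (similarity > best.similarity) {
        best = { category, similarity };
      }
    }
  }
  // Fall back to "other" if no example question is sufficiently similar; the threshold needs tuning.
  return best.similarity > 0.5 ? best.category : "other";
};

For our example, however, we stick with the LLM-based classification prompt shown next.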

To route user requests to the appropriate context augmentation strategy in our example, we can use the following classification prompt:

Instructions

You are responsible for classifying user requests in a chat for a robot engineering tool. Each project in this tool includes:

  1. Configuration File: Specifies the project’s name, purpose, target runtime, entry program, and imported libraries.
  2. Hardware Component Model: Describes the building blocks of the robot, such as sensors, actuators, and communication modules. Only components from a predefined catalog can be used.
  3. Program Files: Define the robot’s behavior using a rule-based domain-specific language (event-condition-action) and functions for evaluating conditions and executing actions.

Classify incoming questions into one of the following categories:

  • hardware-components-catalog: Questions about available hardware components, their purposes, capabilities, specifications, etc.
  • code: Questions about the project’s code, including programs, functions, events, and conditions.
  • project-configuration: Questions about configuring a project, including its name, description, target runtime, and imported libraries.
  • other: Any other questions that do not fit into the categories above.

Your response must follow this format:

{ "category": "<category-name>" }

User Request

{{userRequest}}

Next, we fill this prompt with the actual userRequest and send it to an LLM, for instance GPT-3.5 Turbo.

import OpenAI from "openai";

const openai = new OpenAI();

// fillPrompt replaces the {{userRequest}} placeholder in the classification prompt above
const prompt = fillPrompt(classificationPrompt, { userRequest });
const messages = [{ role: "system" as const, content: prompt }];
const response = await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  messages,
});
const result = response.choices[0].message.content || "";
const category = JSON.parse(result).category;

if (category === "hardware-components-catalog") { /* route to the RAG strategy */ }

Now, let’s see what answer we get for different questions.

Which hardware could I use to navigate through rooms in the dark? –> hardware-components-catalog

I need an additional camera module, which ones could I use? –> hardware-components-catalog


How can I convert an int into millimeters? –> code

Which actions are called on event obstacle_detected? –> code


What is this project about? –> project-configuration

What libraries can I add? –> project-configuration


Are robots evil and planning to take over the world? –> other

Great! Now that we have classified these questions, let’s explore how we can actually answer them with distinct strategies.

Use Case 1: Hardware Components Catalog with RAG

For questions about hardware components in our example, we use Retrieval Augmented Generation (RAG) to incorporate knowledge from a large hardware component catalog.

The process for establishing a RAG pipeline typically starts by scraping relevant content from data sources like PDFs, extracting and converting the text into usable data. This text is then divided into chunks, each representing a specific piece of information. For each of these chunks, an embedding is computed, which is a mathematical representation of the semantic meaning of the chunk. These embeddings are then fed into a vector store for later retrieval. When a user request comes in, we compute an embedding for it and query the vector store for the chunks most similar to the request embedding. We then augment the user request with the content of those chunks and feed it into the language model.

Typical RAG pipelines differ in terms of their input data (documents, web pages, databases, etc.), chunking strategy, query mechanism, ranking approaches, etc. In many cases, preprocessing the user request with an LLM can also improve results. For instance, the user request can be generalized before computing the embedding to better match the most relevant topics.

In this blog post, we won’t go into detail about the concrete implementation and just assume that we have a component that can give us relevant content chunks about hardware components from a large catalog for a given user query.
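Just for intuition, a naive version of such a component could look roughly like the following sketch. The catalogChunks store and the embed helper are hypothetical; a real setup would use a dedicated vector database and a more elaborate chunking and ranking strategy.

interface CatalogChunk {
  title: string;        // e.g., "LiDAR X123"
  content: string;      // text of the chunk scraped from the catalog
  embedding: number[];  // precomputed embedding of the chunk content
}

// Hypothetical in-memory "vector store": all catalog chunks with precomputed embeddings.
declare const catalogChunks: CatalogChunk[];
// Hypothetical helper that computes an embedding for a text (see the classification sketch above).
declare function embed(text: string): Promise<number[]>;

// Cosine similarity between two embedding vectors.
const cosine = (a: number[], b: number[]): number => {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  return dot / (norm(a) * norm(b));
};

// Returns the k catalog chunks most similar to the user query.
const retrieveRelevantChunks = async (userQuery: string, k = 3): Promise<CatalogChunk[]> => {
  const queryEmbedding = await embed(userQuery);
  return catalogChunks
    .map((chunk) => ({ chunk, similarity: cosine(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k)
    .map((ranked) => ranked.chunk);
};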

Prompt for Hardware Components Catalog Questions

For answering questions about hardware components, we’ll use the following prompt with three variables listOfCurrentComponents, contentFromRAG, and userQuestion:

Instructions

You are helping users of a robot engineering tool with their questions on hardware components of their robot project.


The following components are already built into the robot:

{{listOfCurrentComponents}}


For a given user request, write an answer just based on the list of components from below and only mention components really needed for the user request. Describe how these components can be used in the robot, also taking the existing components of the robot into account. Keep your answers very short and precise, focusing on the components to be added.

Available additional components

{{contentFromRAG}}

User Request

{{userQuestion}}

Let’s say, we have the following user question:

Which hardware could I use to navigate through rooms in the dark?

First, we retrieve and pre-process the current components in our AI tool integration from our hardware component model in the workspace and put it into the prompt above at listOfCurrentComponents.

  • Motor: VelocityPro X1: Provides precise rotational force to drive the robot’s movements with high efficiency and speed control.
  • Microcontroller: BrainCore M4: Acts as the central processing unit, executing programmed instructions to control the robot’s functions and operations.
  • Power supply: PowerStream 500: Delivers stable and reliable energy to all components of the robot, ensuring uninterrupted performance.
  • Motor driver: TorqueMaster D2: Regulates the power supplied to the motors, allowing for smooth and accurate control of motor speed and direction.
  • Connectivity: LinkWave Z3: Facilitates wireless communication between the robot and external devices, enabling remote control and data exchange.

Then we trigger the RAG pipeline for the user request, which returns the following content based on the similarities of those chunks to the user request:

LiDAR X123

This LiDAR sensor emits pulsed light waves from a laser into the room. These pulses bounce off surrounding objects and return to the sensor. By measuring the time it takes for each pulse to return, the sensor calculates the distance to each object. Repeating this process continuously creates a real-time 3D map of the surroundings.

Ultrasonic US2306

This ultrasonic sensor measures distance by emitting ultrasonic sound waves and detecting the reflected waves. It consists of a transmitter, which sends out the sound waves, and a receiver, which captures the echoes. The sensor calculates the distance to an object based on the time it takes for the sound waves to bounce back. Ultrasonic sensors are commonly used in various applications, including obstacle detection, level measurement, and proximity sensing, due to their accuracy and reliability in different environmental conditions.

GPS Y887

A GPS sensor is a device that uses signals from satellites to determine the precise location of an object on Earth. It calculates latitude, longitude, and altitude by triangulating data from multiple satellites in the Global Positioning System. GPS sensors are commonly used in various tracking applications to provide real-time location information and facilitate accurate mapping, routing, and positioning.

This RAG result includes the match GPS Y887, which is irrelevant for this particular user request. Including a certain amount of irrelevant context with RAG is not uncommon. But as long as the relevant data is included and the irrelevant data stays below a certain threshold, it usually isn’t harmful to the response quality, as we instruct the language model to base its answer only on the additional context it considers relevant.

Resulting Answer for the Hardware Component Question

Given the prompt above, including the existing components, the RAG result, and the user request, the resulting answer would be as follows (ChatGPT 4o in this case):

To navigate through rooms in the dark, you can use the LiDAR X123. It creates a real-time 3D map of the surroundings by emitting pulsed light waves, making it ideal for detecting obstacles and navigating in low-light conditions.

Here’s how it integrates with your existing components:

  • LiDAR X123: Provides accurate 3D mapping and obstacle detection.
  • Motor: VelocityPro X1 and Motor driver: TorqueMaster D2: Control precise movements based on the LiDAR data.
  • Microcontroller: BrainCore M4: Processes data from the LiDAR sensor and sends commands to the motor driver.
  • Power supply: PowerStream 500: Ensures stable power to the LiDAR sensor and other components.
  • Connectivity: LinkWave Z3: Enables remote monitoring and control if needed.

This setup will allow your robot to navigate effectively in the dark.

Excellent, this is exactly the type of answer we were looking for.

Use Case 2: Answering Code Questions based on Workspace Maps

For questions related to the workspace code, we equip the LLM’s context with a workspace map. The workspace map is a condensed representation of the workspace files and their content. To obtain this representation, we need to parse the workspace files and extract relevant information, such as function names, their parameters, and other elements. This can be done using a parser generator like Tree-sitter or by relying on the language support that is likely available in your domain-specific tool anyway.

In addition, we need to teach the LLM how to interpret the concrete syntax of our domain-specific language. This will enable the LLM to provide answers that not only take the abstract syntax from the workspace map into account but also come up with correct code snippets. If the domain-specific language is simple enough, we can rely on in-context learning by adding a well-selected set of explained examples to the prompt.
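For a DSL as simple as the one in our example, even a handful of regular expressions can produce a usable workspace map. The following is a minimal sketch under that assumption; a real implementation would rely on a proper parser, such as a Tree-sitter grammar for the DSL or the tool’s existing language support, and would also extract documentation strings and cross-references.

import { promises as fs } from "fs";
import * as path from "path";

// Simplified sketch: extract events and function signatures from .ecaf files via regular expressions.
const buildWorkspaceMap = async (workspaceRoot: string): Promise<string> => {
  const lines: string[] = [];
  for (const file of await fs.readdir(workspaceRoot)) {
    if (!file.endsWith(".ecaf")) continue;
    lines.push(file);
    const source = await fs.readFile(path.join(workspaceRoot, file), "utf8");
    source.split("\n").forEach((line, index) => {
      const event = line.match(/^event\s+"([^"]+)"/);
      const func = line.match(/^function\s+(\w+)\(([^)]*)\)/);
      if (event) lines.push(`    Event: "${event[1]}": @${index + 1}`);
      if (func) lines.push(`    Function: ${func[1]}(${func[2]}): @${index + 1}`);
    });
    lines.push("");
  }
  return lines.join("\n");
};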

Prompt for Code Questions

The resulting prompt with all variables filled in could look as follows:

You are a coding assistant in a robot engineering tool that supports users in programming robots with a domain-specific language. Answer the questions based on the codemap outlining the files and the content of each file of the current workspace in the tool.

Keep your answers short and precise.

Robot DSL

The DSL in the tool follows an event-condition-action paradigm with function calls. Please consider the following examples to learn the syntax.

event "start" {
    if check_battery()
    then move_forward(100)
}

function check_battery() -> bool
    "Checks if the battery level is above a certain threshold."
    let level = component.battery.battery_level
    return level > 20

function move_forward(distance_cm: int)
    "Moves the robot forward by a specified distance in centimeters."
    component.motor.forward(distance_cm)

Codemap

Use the following condensed abstract syntax tree of the workspace to answer user questions. The @<number> in the syntax tree indicates the line number of the element. The user is unaware of this code map. Refer to the files and line numbers directly instead.

main.ecaf
    Event: "start": @1
        Condition: check_battery(battery_level) -> conditions.ecaf@1
        Action: move_forward(100) -> actions.ecaf@1

    Event: "obstacle_detected": @4
        Condition: is_obstacle_near(sensor_distance) -> conditions.ecaf@4
        Action: stop() -> actions.ecaf@5

    Event: "battery_low": @7
        Condition: check_battery(battery_level) -> conditions.ecaf@1
        Action: return_to_base() -> actions.ecaf@8

conditions.ecaf
    Function: check_battery(battery_level:int): @1
        Doc: Checks if the battery level is above a certain threshold.
    Function: is_obstacle_near(sensor_distance:int): @4
        Doc: Determines if an obstacle is within a specified distance.

actions.ecaf
    Function: move_forward(distance_cm:int): @1 -> calls: convert_cm_to_mm(distance_cm)
        Doc: Moves the robot forward by a specified distance in centimeters.

    Function: stop(): @5
        Doc: Stops all movement of the robot.

    Function: return_to_base(): @8
        Doc: Directs the robot to return to its charging base.

    Function: convert_cm_to_mm(distance_cm:int): @10
        Doc: Converts a distance from centimeters to millimeters.

Tool Context

The user has the file main.ecaf opened in an editor and the cursor is on line 2.

Resulting Answer for the Code Question

Given a user request like How can I convert an int into millimeters?, the response looks as follows with ChatGPT 4o:

To convert a distance from centimeters to millimeters, you can use the function convert_cm_to_mm in actions.ecaf. Here’s how the function is defined:

function convert_cm_to_mm(distance_cm:int)
    "Converts a distance from centimeters to millimeters."

You can call this function by passing the distance in centimeters as an argument. For example:

convert_cm_to_mm(10)

This converts 10 centimeters into millimeters.

Excellent, this is pretty much perfect. Note how the LLM is showing the function signature in the concrete syntax, even though it just learned about it from the abstract workspace map.

Admittedly, this is also a very simple workspace with a very simple DSL. For larger workspaces and more complex DSLs, we may need to manage the context size more actively, e.g., by omitting parts of the workspace map based on a ranking heuristic (see the sketch below), or by splitting this request into two steps: 1. identifying the relevant parts of the workspace, and 2. generating the correct code, using more elaborate in-context learning for the DSL.
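As a rough illustration of such a ranking heuristic, one could, for instance, score the entries of the workspace map by their keyword overlap with the user question and keep only the highest-ranked entries. This is a deliberately naive sketch; real implementations would typically rank with embeddings or the tool’s own index.

// Naive ranking heuristic (illustration only): keep the workspace map entries
// that share the most keywords with the user question, up to a given budget.
const shrinkWorkspaceMap = (mapEntries: string[], userQuestion: string, maxEntries = 50): string[] => {
  const keywords = userQuestion.toLowerCase().split(/\W+/).filter((word) => word.length > 2);
  const score = (entry: string) =>
    keywords.filter((keyword) => entry.toLowerCase().includes(keyword)).length;
  return mapEntries
    .map((entry) => ({ entry, score: score(entry) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxEntries)
    .map((ranked) => ranked.entry);
};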

Use Case 3: Autonomous Context Requests for Project Configuration Questions

For project configuration questions, the AI tool integration first needs to augment the prompt with the current configuration state of the project. It is safe to always include it, as we already know based on the user request classification that the user request deals with the project configuration. However, depending on the specific question, we may also need to provide the LLM with the available libraries that can be imported into the project. As we don’t know upfront whether this is necessary, we’ll take this use case to demonstrate autonomous context requesting.

We therefore put the LLM in an agent-like position and let it request this information autonomously when needed. We also want it to be able to suggest changes to the project configuration. To show these suggestions to the user in a way that can be applied automatically, the AI tool integration needs to be able to process those change suggestions. To this end, we introduce a simple text protocol as the communication channel between the LLM and our AI tool integration.

Prompt for Project Configuration Questions

Here’s an example of how we could put that into a prompt:

Instructions

You are an assistant integrated with a robot engineering tool. Your primary role is to help users with questions about their project and assist them in modifying the project’s configuration. The project configuration file includes the project’s name, purpose, target runtime, entry program, and imported libraries.

Current configuration in the tool:

{
  "project": {
    "name": "WarehouseRobot",
    "purpose": "Automate inventory management and item retrieval in a warehouse",
    "target_runtime": "RoboOS 2.3",
    "entry_program": "main.py",
    "imported_libraries": [
      {
        "name": "robotvision",
        "version": "1.0.0"
      },
      {
        "name": "pathfinder",
        "version": "2.1.4"
      }
    ]
  }
}

Answering Questions:

If you need more information about the imported libraries or which libraries are available, respond with:

{ "command": "request-libraries" }

Suggesting Configuration Changes:

If you need to suggest a change to the configuration file, do not refer to the JSON file directly. Instead, interweave your answers with a command like:

{
   "command": "change-config",
   "property": "<property-to-change>",
   "value": "<the-updated-value>"
}

Your responses should be clear and provide the necessary support while seamlessly integrating commands for additional information or changes.

Resulting Answer for the Project Configuration Question 1

With those instructions, a possible conversation could look as follows. Suppose the user asks the following question:

What is this project about?

Using ChatGPT 4o as an example, we get the following reply, based on the project configuration file already provided:

This project, named WarehouseRobot, is designed to automate inventory management and item retrieval in a warehouse environment. It uses the robotvision library for visual processing and the pathfinder library for navigation to efficiently manage warehouse tasks. It operates on the RoboOS 2.3 runtime and initiates its operations through the main.py entry program.


If you need any further information or assistance with configuration adjustments, feel free to ask!

Resulting Answer for the Project Configuration Question 2

Let’s say the user has a follow-up request:

Rename the project to “WHR”

Now the LLM uses the text protocol to suggest the change:

Certainly! The project’s name will be updated to “WHR.”

{ "command": "change-config", "property": "project.name", "value": "WHR" }

Based on the established text protocol between the LLM and the AI tool integration, we can hide the JSON part of the response above in the UI and render a button labeled “Apply” instead, which performs the change locally in the project configuration when clicked.
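To sketch what this processing could look like, the following snippet scans a chat response for embedded command objects and dispatches them. The command names match the prompt above; the helper functions at the top are hypothetical placeholders for the tool’s UI and chat infrastructure, and the simple regex suffices only because the protocol commands contain no nested braces.

// Hypothetical helpers provided by the AI tool integration.
declare function renderApplyButton(property: string, value: string): void;
declare function continueConversationWith(toolMessage: string): void;
declare function getAvailableLibraries(): object[];
declare function renderChatMessage(text: string): void;

interface ProtocolCommand {
  command: string;
  property?: string;
  value?: string;
}

// Scans a chat response for embedded command objects and dispatches them.
const processResponse = (responseText: string): void => {
  const commandPattern = /\{[^{}]*"command"[^{}]*\}/g;
  for (const match of responseText.match(commandPattern) ?? []) {
    const parsed = JSON.parse(match) as ProtocolCommand;
    if (parsed.command === "change-config" && parsed.property && parsed.value) {
      // Hide the JSON in the UI and render an "Apply" button that performs the change when clicked.
      renderApplyButton(parsed.property, parsed.value);
    } else if (parsed.command === "request-libraries") {
      // Reply in the background with the available libraries and let the LLM continue.
      continueConversationWith(JSON.stringify({ libraries: getAvailableLibraries() }));
    }
  }
  // Show the response with the command JSON stripped out.
  renderChatMessage(responseText.replace(commandPattern, "").trim());
};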

Resulting Answer for the Project Configuration Question 3

Let’s say the user now asks:

Which libraries can I add?

ChatGPT 4o replies with:

To find out which libraries are available for you to add, I will request the list of available libraries.

{ "command": "request-libraries" }

In the AI tool integration, we can again omit showing the JSON part of the response in the UI and just reply directly in the background with the following content:

{
  "libraries": [
    {
      "name": "robotvision",
      "version": "1.0.0",
      "description": "Provides advanced image processing and computer vision capabilities."
    },
    {
      "name": "pathfinder",
      "version": "2.1.4",
      "description": "Offers robust algorithms for pathfinding and navigation."
    },
    {
      "name": "sensorfusion",
      "version": "3.2.0",
      "description": "Combines data from multiple sensors to improve accuracy."
    },
    {
      "name": "taskmanager",
      "version": "4.0.1",
      "description": "Manages and schedules tasks for efficient robot operation."
    },
    {
      "name": "commlink",
      "version": "2.5.3",
      "description": "Enables communication between multiple robots and control systems."
    }
  ]
}

Based on this information, the response of ChatGPT 4o is as follows:

Here are the libraries available for you to add to your project:

  1. robotvision (1.0.0): Provides advanced image processing and computer vision capabilities.
  2. pathfinder (2.1.4): Offers robust algorithms for pathfinding and navigation.
  3. sensorfusion (3.2.0): Combines data from multiple sensors to improve accuracy.
  4. taskmanager (4.0.1): Manages and schedules tasks for efficient robot operation.
  5. commlink (2.5.3): Enables communication between multiple robots and control systems.

If you want to add any of these libraries to your project, please specify which one you would like to add.

Depending on the language model you use, it may be more efficient to use tool functions for such a use case. Many models, especially the capable flagship models, provide a generic way of defining functions, including their parameters, that we want the LLM to be able to use. If the LLM decides a function call is necessary, your AI tool integration can react to those function calls and provide a result. Based on those results, the LLM continues the processing until it eventually provides a final answer.
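As a rough sketch of this alternative with an OpenAI-style tools definition (the request_libraries function as well as the systemPrompt and userQuestion variables are made up for illustration), the request could look as follows:

import OpenAI from "openai";

const openai = new OpenAI();

// Hypothetical prompt and question for this example.
declare const systemPrompt: string;
declare const userQuestion: string;

// Declare a function the LLM may call to obtain the list of available libraries.
const tools = [
  {
    type: "function" as const,
    function: {
      name: "request_libraries",
      description: "Returns the list of libraries that can be imported into the project.",
      parameters: { type: "object", properties: {}, required: [] },
    },
  },
];

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: userQuestion },
  ],
  tools,
});

// If the model decided to call the function, provide the result and continue the conversation.
for (const toolCall of response.choices[0].message.tool_calls ?? []) {
  if (toolCall.type === "function" && toolCall.function.name === "request_libraries") {
    // Append the assistant message and a { role: "tool", ... } message with the
    // library list, then call the API again until the model produces a final answer.
  }
}

Compared to the hand-rolled text protocol, this delegates the plumbing of requesting and returning additional context to the model provider’s function-calling mechanism.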

Conclusion

In this blog post, we explored how to manage context effectively in AI-integrated domain-specific tools. From retrieval augmented generation to workspace maps and autonomous context requests, each technique helps enhance the AI’s ability to provide accurate and relevant responses. While we simplified some aspects for this blog post, the examples above should clarify the main idea behind those strategies.

In real-world applications, further considerations and optimizations are necessary, including more active context size management, automatic prompt evaluation pipelines to guide incremental prompt optimization, handling requests that overlap multiple categories, etc. Please also note that we used ChatGPT 4o here as an example. Other language models, including open-source language models, would likely work just as well. It is important to note, though, that for the autonomous context requesting strategy, a more capable model with advanced reasoning capabilities is usually preferred, as we put the language model in a more agent-like position.

At EclipseSource, we are at the forefront of technological innovation, ready to guide and support your AI initiatives. Our comprehensive services in AI integration are designed to provide the specialized know-how necessary to develop customized, AI-enhanced solutions that elevate your tools and IDEs to new levels of efficiency and innovation. Explore how we can assist in integrating AI into your tools with our AI technology services. Reach out to begin your AI integration project with us.

Jonas, Maximilian & Philip

Jonas Helming, Maximilian Koegel and Philip Langer co-lead EclipseSource. They work as consultants and software engineers for building web-based and desktop-based tools. …