Structured Outputs in OpenAI API: A Game-Changer for AI Integration in Tools and IDEs

August 14, 2024 | 6 min Read

Integrating AI into tools and IDEs has become a critical component in many of our customer projects. In these integrations, we frequently depend on LLMs to generate machine-processible outputs, which are essential for steering user requests to specific context retrieval strategies or translating LLM responses into actionable commands within the tool. However, achieving consistently structured outputs from LLMs has often posed challenges. The variability in responses can result in cumbersome parsing code and inconsistent outcomes, complicating the development process.

The new Structured Outputs feature in OpenAI’s API offers a robust solution to this challenge, enabling developers to integrate AI with greater reliability and ensuring consistent output structures.

In this blog post, we explore the benefits of this new feature and demonstrate how Structured Outputs can be applied in a practical example. We also provide recommendations for leveraging this feature effectively in your AI integrations.

Why Structured Outputs Matter

In AI integrations of tools and IDEs, ensuring that AI responses are both accurate and reliably structured for further processing has been a persistent challenge. This is especially critical when the LLM’s response is not merely displayed to the user but needs to be processed further by the tool, e.g., to:

  • Classify user requests within custom domain-specific tools and route them to a specialized prompt flow.
  • Generate actionable responses enabling the LLM response to be enriched with executable tool commands.
  • Extract structured data, e.g., for auto-completing content in domain-specific, non-textual editors.
  • And many more.

While OpenAI’s models have long supported function calling with JSON schemas for function parameters, this capability was previously limited to function calls and not available for general API responses—until now.

Structured Outputs, introduced with the recent gpt-4o-2024-08-06 model, allows developers to supply a JSON Schema directly via the response_format parameter. This ensures that the AI’s outputs strictly adhere to the defined structure. According to OpenAI’s evaluations, this feature achieves near-perfect reliability, a claim that our experience at EclipseSource supports with high confidence.
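
For illustration, a raw request with a JSON Schema supplied via response_format could look like the following sketch. The schema and the sample user message are simplified placeholders; the remainder of this post uses the SDK’s Zod helper instead of a handwritten schema.

import OpenAI from 'openai';

const openai = new OpenAI();

// Simplified sketch: supply a JSON Schema directly via response_format.
// In strict mode, every property must be listed as required and
// additionalProperties must be set to false.
const response = await openai.chat.completions.create({
    model: 'gpt-4o-2024-08-06',
    messages: [{ role: 'user', content: 'Which modules could I use to navigate in the dark?' }],
    response_format: {
        type: 'json_schema',
        json_schema: {
            name: 'classification',
            strict: true,
            schema: {
                type: 'object',
                properties: {
                    category: { type: 'string' },
                    rationale: { type: 'string' },
                },
                required: ['category', 'rationale'],
                additionalProperties: false,
            },
        },
    },
});

// The message content is guaranteed to be valid JSON conforming to the schema above.
console.log(JSON.parse(response.choices[0].message.content ?? '{}'));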

A Practical Example: Classifying User Requests in a Custom Tool

To demonstrate the power of Structured Outputs, let’s revisit an example from our previous blog post on AI Context Management in Domain-Specific Tools. Suppose you’re developing a robot engineering tool with an AI-driven chat interface. Users may pose questions related to hardware components, code, or project configuration, and you need the AI to automatically categorize these questions. By categorizing user queries into predefined categories, you can route them to specialized agents, prompt flows and context retrieval strategies within the tool.

With the new Structured Outputs feature, you can define a schema that the AI must conform to. Here’s how it works:

System Prompt

In this example, we utilize the LLM to classify user requests, enabling routing to the most appropriate context retrieval strategy and prompt flow. We define the following system prompt:

You are responsible for classifying user requests in a chat for a robot engineering tool.

Classify incoming questions into one of the following categories:

1. **hardware-components-catalog**: Questions about available hardware components, their purposes, capabilities, specifications, etc.
2. **code**: Questions about the project's code, including programs, functions, events, and conditions.
3. **project-configuration**: Questions about configuring a project, including its name, description, target runtime, and imported libraries.
4. **other**: Any other questions that do not fit into the categories above.

Provide a concise rationale for your classification decision.

If you cannot classify the question into any of the above categories or if the question is out of scope for a robot engineering tool:

- Use the 'other' category,
- Set `isErrorStatus` to true,
- Provide a user-targeted status message in the `rationale` property explaining why the request cannot be answered.
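
In code, this prompt can be kept as a simple string constant, which we pass as the system message in the request further below. The following is a shortened sketch; the full prompt text is the one shown above.

// System prompt as defined above (shortened here for brevity).
const systemPrompt = `
You are responsible for classifying user requests in a chat for a robot engineering tool.
Classify incoming questions into one of the following categories:
hardware-components-catalog, code, project-configuration, other.
Provide a concise rationale for your classification decision.
If the question cannot be classified or is out of scope, use the 'other' category,
set isErrorStatus to true, and provide a user-targeted status message in the rationale property.
`;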

Schema Definition in TypeScript

Next, we define the JSON schema for the output we expect the LLM to return for each user query. To streamline the JSON schema definition, we use Zod, a TypeScript-first schema declaration and validation library. This approach ensures type safety and simplifies parsing in TypeScript.

import { z } from 'zod';

// Schema for the expected classification result: the category, a rationale,
// and a flag indicating that the request could not be classified or answered.
const Classification = z.object({
    category: z.enum([
        'hardware-components-catalog',
        'code',
        'project-configuration',
        'other'
    ]),
    rationale: z.string(),
    isErrorStatus: z.boolean(),
});
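
Because the schema is defined with Zod, a matching TypeScript type can be derived directly from it, so the parsed response can be handled in a fully typed way throughout the tool. A small illustration using Zod’s standard type inference:

// Derive a static TypeScript type from the Zod schema; types and values live in
// separate namespaces, so the type can share the schema's name.
type Classification = z.infer<typeof Classification>;

// Example of a typed value conforming to the schema.
const example: Classification = {
    category: 'code',
    rationale: 'The question refers to events in the project code.',
    isErrorStatus: false,
};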

Performing the Request

When making a request to the OpenAI API, you can now specify the schema as the expected response format. Here’s an example of how to classify a user message using the schema defined above:

import OpenAI from 'openai';
import { zodResponseFormat } from 'openai/helpers/zod';

const openai = new OpenAI();

async function classify(userMessage: string) {
    // Request a completion that must conform to the Classification schema defined above.
    const completion = await openai.beta.chat.completions.parse({
        model: 'gpt-4o-2024-08-06',
        temperature: 0.5,
        response_format: zodResponseFormat(Classification, 'classification'),
        messages: [
            { role: 'system', content: systemPrompt },
            { role: 'user', content: userMessage },
        ],
    });

    // The SDK parses the response into a typed object matching the Zod schema (or null on refusal).
    const result = completion.choices[0].message.parsed;

    switch (result?.category) {
        case 'hardware-components-catalog': // process the structured and typed output (see the sketch below)
    }
}
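
For illustration, the processing inside the switch could dispatch the user request to a matching context retrieval strategy. The following is a minimal sketch in which the retrieval functions are hypothetical placeholders for the strategies your tool actually provides:

// Hypothetical placeholders for the tool's context retrieval strategies.
declare function retrieveHardwareComponentsContext(userMessage: string): Promise<string>;
declare function retrieveCodeContext(userMessage: string): Promise<string>;
declare function retrieveProjectConfiguration(userMessage: string): Promise<string>;

async function routeClassification(
    result: z.infer<typeof Classification>,
    userMessage: string
): Promise<string | undefined> {
    if (result.isErrorStatus) {
        // In the error case, the rationale carries a user-targeted status message.
        return result.rationale;
    }
    switch (result.category) {
        case 'hardware-components-catalog':
            return retrieveHardwareComponentsContext(userMessage);
        case 'code':
            return retrieveCodeContext(userMessage);
        case 'project-configuration':
            return retrieveProjectConfiguration(userMessage);
        default:
            return undefined;
    }
}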

Example User Requests and Responses

Let’s explore how gpt-4o-2024-08-06 categorizes different user queries based on the schema and the system prompt defined above:

User Message: Which modules could I use to navigate through rooms in the dark?

{
  "category": "hardware-components-catalog",
  "rationale": "The question pertains to available hardware components for navigation in low-light conditions, which falls under the hardware-components-catalog category.",
  "isErrorStatus": false
}

User Message: Which actions are called on event obstacle_detected?

{
  "category": "code",
  "rationale": "The query involves actions triggered by an event, relevant to the project's codebase and its handling of events and conditions.",
  "isErrorStatus": false
}

User Message: Are robots planning to take over the world?

{
  "category": "other",
  "rationale": "The inquiry is speculative about robot intent, which does not align with the specified categories related to hardware components, code, or project configuration.",
  "isErrorStatus": true
}

As you can see, the AI not only classifies user queries accurately but also consistently adheres to the defined schema, making it straightforward to process the responses programmatically.

Recommendations for Using Structured Outputs

To fully leverage Structured Outputs in your AI integrations, consider the following recommendations:

  1. Simplify Schema Definition with Zod: Utilize tools like Zod to define your data schema, ensuring type safety and facilitating easier parsing in TypeScript.

  2. Constrain Outputs to Your Needs: Tailor the schema to your specific requirements, including an error flag to handle cases where the model cannot produce a valid output.

  3. Utilize a Rationale Field: Incorporate a rationale field to capture the AI’s reasoning, which can provide valuable context for debugging.

Conclusion

While it may seem like a small addition at first glance, OpenAI’s Structured Outputs feature represents a major advancement for developers integrating AI into their tools and IDEs. By guaranteeing reliable, machine-processible outputs, this feature simplifies AI integration, reduces the need for complex error handling, and enables the creation of more precise AI-driven tools. At EclipseSource, we’ve already begun adopting this new API across our projects, with outstanding results.

EclipseSource is committed to leading the way in technological innovation, ready to support your AI initiatives. Our comprehensive AI integration services offer the expertise needed to develop custom AI-enhanced solutions that elevate your tools and IDEs to new heights of efficiency and innovation. Explore how we can assist with your AI integration projects through our AI technology services. Contact us to start your AI integration journey.

Jonas, Maximilian & Philip

Jonas Helming, Maximilian Koegel and Philip Langer co-lead EclipseSource. They work as consultants and software engineers for building web-based and desktop-based tools. …