Structured Output
The get_structured_llm function provides a unified interface for obtaining structured output from different language models, whether they natively support tool calling or JSON mode, or require a separate prompt-based approach. This is particularly useful for ensuring compatibility with a wide range of LLMs, including those without built-in support for structured data generation.
Why is this needed?
Different LLMs have varying capabilities when it comes to generating structured output:
Tool Calling: Modern models (e.g., OpenAI’s GPT series, Anthropic’s Claude) support “tool calling,” where the model can be instructed to call a specific function with a given schema. This is the most reliable method for obtaining structured data.
JSON Mode: Some models offer a “JSON mode,” where they can be constrained to output a JSON object that conforms to a provided schema.
Default Behavior: Many models can automatically determine the best approach for structured output based on their capabilities, using their native .with_structured_output() method without specifying a particular method.
No Native Support: Some open-source or older models do not support any of the above methods. They can only generate raw text, which requires additional processing to be converted into a structured format.
The get_structured_llm function abstracts away these differences, allowing you to work with a consistent interface.
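For example, the call site stays the same whichever mechanism the model supports. Here is a minimal sketch, assuming a hypothetical Answer schema and an llm whose metadata has already been configured (normally by LLMKernelTransformer, described below):

from pydantic import BaseModel
from llm_kernel_tuner.structured_output import get_structured_llm

# Hypothetical schema used only for illustration.
class Answer(BaseModel):
    explanation: str
    block_size: int

structured_llm = get_structured_llm(llm, Answer)
result = structured_llm.invoke("Suggest a block size for this kernel.")
# result is an Answer instance regardless of which mechanism produced it.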
The get_structured_llm function and StructuredOutputType Enum
- class llm_kernel_tuner.structured_output.StructuredOutputType(value)
Enum defining the types of structured output methods available for LLMs.
This enum is used to configure how structured output should be handled for different language models, particularly in the LLM metadata to indicate which approach should be used.
- TOOL_CALLING
Use the native tool calling mechanism for structured output. This is the preferred method for LLMs that support it natively (e.g., OpenAI GPT models, Anthropic Claude). It leverages the model’s built-in function calling capabilities to ensure properly structured responses.
- SEPARATE_REQUEST
Use a separate request with prompt engineering to achieve structured output. This method uses the StructuredOutputEmulator to parse and structure the LLM’s text response into the desired format. This is used as a fallback for models that don’t support native structured output or tool calling.
- HYBRID_JSONIFY
A two-step process where the first LLM call generates raw text, and a second call uses the model's native structured output capability to format the text into JSON. This is useful for LLMs that don't allow thinking when structured output is enabled.
- JSON_SCHEMA
Use the model's native JSON schema support if available. This leverages built-in JSON formatting capabilities that some models provide (e.g., OpenAI's response_format="json_schema"). It ensures the model outputs valid JSON without requiring tool calling or extensive prompt engineering.
- DEFAULT
Use the model’s default structured output behavior. This allows the model to automatically determine the best approach for structured output based on its capabilities. The model will use its native .with_structured_output() method without specifying a particular method, letting the implementation choose the optimal strategy.
- Usage:
This enum is typically used as a parameter in the LLMKernelTransformer constructor:
from llm_kernel_tuner import LLMKernelTransformer
from llm_kernel_tuner.structured_output import StructuredOutputType

# For models with tool calling support
kernel_transformer = LLMKernelTransformer(
    kernel_string,
    llm,
    structured_output_type=StructuredOutputType.TOOL_CALLING
)

# For models with JSON mode support
kernel_transformer = LLMKernelTransformer(
    kernel_string,
    llm,
    structured_output_type=StructuredOutputType.JSON_SCHEMA
)

# For models using default behavior
kernel_transformer = LLMKernelTransformer(
    kernel_string,
    llm,
    structured_output_type=StructuredOutputType.DEFAULT
)

# For models without native structured output
kernel_transformer = LLMKernelTransformer(
    kernel_string,
    llm,
    structured_output_type=StructuredOutputType.SEPARATE_REQUEST
)

# For models that benefit from two-step generation and formatting
kernel_transformer = LLMKernelTransformer(
    kernel_string,
    llm,
    structured_output_type=StructuredOutputType.HYBRID_JSONIFY
)
Note
The LLMKernelTransformer automatically configures the LLM’s metadata based on this parameter. Users should not manually set the structured_output_type in the LLM’s metadata.
- llm_kernel_tuner.structured_output.get_structured_llm(llm: BaseChatModel, pydantic_schema: Type[BaseModel])
Factory function that returns a structured output-capable runnable.
It inspects the LLM's metadata to decide whether to use the native .with_structured_output() method or fall back to the emulation wrapper. The metadata is set by the LLMKernelTransformer constructor.
- Parameters:
llm (BaseChatModel) – The language model instance. It should have a metadata dict.
pydantic_schema (Type[BaseModel]) – The Pydantic model for the desired output.
- Returns:
A LangChain runnable that will produce a structured Pydantic object.
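Because the return value is an ordinary LangChain runnable, it composes with prompts like any other. Here is a minimal sketch, assuming a hypothetical KernelFix schema and an llm already configured by LLMKernelTransformer:

from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel
from llm_kernel_tuner.structured_output import get_structured_llm

class KernelFix(BaseModel):
    # Hypothetical schema used only for illustration.
    description: str
    code: str

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a CUDA tuning assistant."),
    ("human", "{question}"),
])
chain = prompt | get_structured_llm(llm, KernelFix)
fix = chain.invoke({"question": "How can I tile this matrix multiply?"})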
How it works
The get_structured_llm function inspects the metadata attribute of the provided LLM to determine the appropriate method for generating structured output.
- If structured_output_type is set to TOOL_CALLING, it uses the native .with_structured_output() method with function calling.
- If structured_output_type is set to JSON_SCHEMA, it uses the native .with_structured_output() method with JSON mode.
- If structured_output_type is set to DEFAULT (the default), it uses the native .with_structured_output() method without specifying a particular method, allowing the model to choose the optimal strategy.
- If structured_output_type is set to SEPARATE_REQUEST, it falls back to the StructuredOutputEmulator, which uses a separate prompt to format the raw text output into the desired Pydantic object.
- If structured_output_type is set to HYBRID_JSONIFY, it uses a two-step process where the first LLM call generates raw text, and a second call uses the model's native structured output capability to format the text into JSON. This is useful for LLMs that don't allow thinking when structured output is enabled.
The structured_output_type is typically configured through the LLMKernelTransformer constructor parameter, which automatically sets the appropriate metadata on the LLM instance.
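Putting this together, the dispatch can be pictured roughly as follows. This is a simplified sketch, not the library's actual implementation; the metadata key name, the StructuredOutputEmulator import path and constructor, and the method= values are assumptions for illustration:

from typing import Type
from langchain_core.language_models.chat_models import BaseChatModel
from pydantic import BaseModel
from llm_kernel_tuner.structured_output import StructuredOutputType
# Import path assumed for illustration.
from llm_kernel_tuner.structured_output import StructuredOutputEmulator

def sketch_get_structured_llm(llm: BaseChatModel, pydantic_schema: Type[BaseModel]):
    # "structured_output_type" as a metadata key is an assumption;
    # LLMKernelTransformer sets the real metadata internally (see Note above).
    output_type = (llm.metadata or {}).get(
        "structured_output_type", StructuredOutputType.DEFAULT
    )
    if output_type == StructuredOutputType.TOOL_CALLING:
        # Native function/tool calling.
        return llm.with_structured_output(pydantic_schema, method="function_calling")
    if output_type == StructuredOutputType.JSON_SCHEMA:
        # Native JSON schema mode.
        return llm.with_structured_output(pydantic_schema, method="json_schema")
    if output_type == StructuredOutputType.DEFAULT:
        # Let the model implementation choose its own strategy.
        return llm.with_structured_output(pydantic_schema)
    # SEPARATE_REQUEST and HYBRID_JSONIFY fall back to prompt-based
    # emulation; this constructor signature is hypothetical.
    return StructuredOutputEmulator(llm, pydantic_schema)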
Example Usage
from llm_kernel_tuner import LLMKernelTransformer
from llm_kernel_tuner.structured_output import StructuredOutputType
from langchain_openai import ChatOpenAI
kernel_string = """
__global__ void matrixMultiply(float *A, float *B, float *C, int A_width, int A_height, int B_width) {
int col = threadIdx.x + blockDim.x * blockIdx.x;
int row = threadIdx.y + blockDim.y * blockIdx.y;
if (col < B_width && row < A_height) {
float sum = 0;
for (int k = 0; k < A_width; ++k) {
sum += A[row * A_width + k] * B[k * B_width + col];
}
C[row * B_width + col] = sum;
}
}
"""
# Configure LLM to use tool calling for structured output
llm = ChatOpenAI(model="gpt-5")
kernel_transformer = LLMKernelTransformer(
kernel_string,
llm,
structured_output_type=StructuredOutputType.TOOL_CALLING
)
# Configure LLM to use default behavior
kernel_transformer_default = LLMKernelTransformer(
kernel_string,
llm,
structured_output_type=StructuredOutputType.DEFAULT
)
# Configure LLM to use separate request emulation
kernel_transformer_emulated = LLMKernelTransformer(
kernel_string,
llm,
structured_output_type=StructuredOutputType.SEPARATE_REQUEST
)
# Configure LLM to use hybrid jsonify
kernel_transformer_hybrid = LLMKernelTransformer(
kernel_string,
llm,
structured_output_type=StructuredOutputType.HYBRID_JSONIFY
)