Testing strategies

To test the kernel for correctness, tests need to be provided. Testing strategies define how the tests for the kernel are generated. At the moment there is only one testing strategy, the Naive Testing Strategy.

A testing strategy generates tests. A test consists of an input, an expected output, and a problem_size.

Tests

A test consists of an input, an expected output, and a problem_size. To run a test, call TunableKernel.test(self, test: KernelTest, tune_params: Dict[str, Union[int, float]]).

Likewise, TunableKernel.tune(self, test: KernelTest, tune_params: Dict[str, List[Any]]) can be called to tune the kernel.
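For illustration, a minimal sketch of running and tuning a test, assuming a TunableKernel instance named kernel and a KernelTest named test already exist; the tunable parameter name block_size_x is only an example:

# Run one test with a concrete parameter configuration
kernel.test(test, tune_params={"block_size_x": 128})

# Tune the kernel over a search space of candidate values
kernel.tune(test, tune_params={"block_size_x": [32, 64, 128, 256]})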

Naive Testing Strategy

The naive testing strategy simply asks the LLM to generate Python code that produces an input/output pair. Once the Python code has been generated, the naive testing strategy executes it to extract the input and expected output.
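Conceptually, executing the generated code amounts to running it in a fresh namespace and reading back the variables it defines; a minimal sketch, not the library's actual implementation:

# LLM-generated snippet (shortened for illustration)
generated_code = "import numpy as np\nn = np.int32(1000)\ninput = [np.zeros(n, dtype=np.float32)]"

namespace = {}
exec(generated_code, namespace)   # run the generated snippet in an isolated namespace
test_input = namespace["input"]   # read back the variables it defined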

Custom Testing Strategy

To create your own testing strategy, extend the BaseTestingStrategy class found in llm_kernel_tuner.testing_strategies.base_testing_strategy. You will then need to implement the following method in your strategy: def create_graph(self, llm: BaseChatModel) -> CompiledStateGraph.

Your strategy is expected to put the generated tests into state["tests"].

Since the input and the output need to be sufficiently large (on the order of 10,000,000 elements), they cannot be generated directly by the LLM. Instead, the LLM generates Python code that produces the input. Here is an example of such code that an LLM is expected to generate for vector_add:

import numpy as np

n = np.int32(10000000)

# Two random input vectors and an output buffer of the same size
a = np.random.randn(n).astype(np.float32)
b = np.random.randn(n).astype(np.float32)
c = np.zeros_like(a)

# Kernel arguments in the order the kernel expects them
input = [c, a, b, n]
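For vector_add, the expected output is simply the element-wise sum of the inputs; a minimal sketch of computing that reference result with NumPy (how the strategy actually obtains the expected output, e.g. from the same generated code, is not shown here):

# Reference result the kernel's output buffer c should match after execution
expected_c = a + b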

BaseTestingStrategy provides a helper function that executes Python code generated by an LLM.

Here is a full example:

from typing import Optional

from langchain_core.language_models.chat_models import BaseChatModel
from langgraph.graph import END, START, StateGraph
from langgraph.graph.state import CompiledStateGraph

from llm_kernel_tuner.testing_strategies.base_testing_strategy import BaseTestingStrategy, TestingState
# create_retry_wrapper is assumed to be exported alongside the retry policies
from llm_kernel_tuner.retry import RetryPolicy, default_tester_retry_policy, create_retry_wrapper

class NaiveLLMTester(BaseTestingStrategy):
    def __init__(self, retry_policy: Optional[RetryPolicy] = default_tester_retry_policy):
        # The retry policy is passed to the base class, which stores it as self.retry_policy
        super().__init__(retry_policy)
        self.llm: Optional[BaseChatModel] = None


    def create_graph(self, llm: BaseChatModel) -> CompiledStateGraph:
        self.llm = llm
        # self.retry_policy is available here if needed

        graph_builder = StateGraph(TestingState)
        graph_builder.add_node("generate_test", self.generate_test)

        graph_builder.add_edge(START, "generate_test")
        graph_builder.add_edge("generate_test", END)

        # Wrap the compiled graph so failed runs are retried according to the retry policy
        retry_graph = create_retry_wrapper(graph_builder.compile(), self.retry_policy)
        return retry_graph

    def generate_test(self, state: TestingState) -> TestingState:
        kernel = state['kernel']
        answer = self.llm.invoke(...)
        test_code = ...  # sanitize the LLM answer and extract the code, or use structured output
        test = self.get_test_from_code(test_code)
        state['tests'].append(test)
        return state
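The elided llm.invoke call and code extraction could, for instance, look like the following; the prompt wording and the fenced-code regex are illustrative assumptions, not part of llm_kernel_tuner:

import re

def extract_python_code(answer_text: str) -> str:
    """Return the first fenced Python code block, or the raw text as a fallback."""
    match = re.search(r"```(?:python)?\n(.*?)```", answer_text, re.DOTALL)
    return match.group(1) if match else answer_text

# Inside generate_test (sketch):
#   prompt = f"Write Python code that builds the `input` list for this kernel:\n{kernel}"
#   answer = self.llm.invoke(prompt)
#   test_code = extract_python_code(answer.content)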

BaseTestingStrategy provides the following helper functions:

  1. get_test_from_code: Executes the provided Python code and generates a test (a usage sketch follows below)
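For example, inside a graph node the vector_add snippet shown earlier could be passed to this helper as a string; a minimal sketch, showing only the code-string argument and omitting any additional parameters:

test_code = """
import numpy as np
n = np.int32(10000000)
a = np.random.randn(n).astype(np.float32)
b = np.random.randn(n).astype(np.float32)
c = np.zeros_like(a)
input = [c, a, b, n]
"""

test = self.get_test_from_code(test_code)   # execute the code and build a test from it
state['tests'].append(test)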