Agent¶

`Agent` class¶

Agent ¶

Agent(model: str | Any | None = None, tools: list[Tool] | None = None, system_prompt: str | None = None, reflexion: ReflexionConfig | bool | None = None, grounding: GroundingConfig | bool | None = None, max_iterations: int = 20, conversation_manager: Any | None = None, checkpointer: Any | None = None, hooks: list[Any] | None = None, config: AgentConfig | None = None, **kwargs: Any)

Bases: AgentRuntimeMixin, BaseModel

Primary entry point for Locus agents.

Manages the ReAct loop with optional Reflexion and Grounding.

Usage

agent = Agent(
    model="openai:gpt-4o",  # or oci:cohere.command-r-plus
    tools=[search, calculate],
    system_prompt="You are a helpful assistant.",
)

# Async streaming
async for event in agent.run("What is 2+2?"):
    print(event)

# Sync execution
result = agent.run_sync("What is 2+2?")
print(result.message)

Initialize an Agent.

Parameters:

Name	Type	Description	Default
`model`	`str \| Any \| None`	Model string or ModelProtocol instance; falls back to `"openai:gpt-4o"` when omitted	`None`
`tools`	`list[Tool] \| None`	List of tools available to the agent	`None`
`system_prompt`	`str \| None`	System prompt for the agent; falls back to `"You are a helpful AI assistant."` when omitted	`None`
`reflexion`	`ReflexionConfig \| bool \| None`	Reflexion config (True for defaults, False/None to disable)	`None`
`grounding`	`GroundingConfig \| bool \| None`	Grounding config (True for defaults, False/None to disable)	`None`
`max_iterations`	`int`	Maximum iterations before stopping	`20`
`conversation_manager`	`Any \| None`	Conversation manager for message pruning	`None`
`checkpointer`	`Any \| None`	Checkpointer for state persistence	`None`
`hooks`	`list[Any] \| None`	Lifecycle hooks	`None`
`config`	`AgentConfig \| None`	Full AgentConfig; when provided, it is used as-is and all other parameters are ignored	`None`
`**kwargs`	`Any`	Additional config options	`{}`
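The `reflexion` and `grounding` parameters each accept either a boolean or a config object, and the constructor listed on this page resolves them with the same three-way rule. The sketch below isolates that rule. `ToyConfig` and `resolve_flag` are hypothetical names used only for illustration and are not part of Locus.

```python
from dataclasses import dataclass
from typing import Optional, Type, TypeVar, Union

T = TypeVar("T")


@dataclass
class ToyConfig:
    """Hypothetical stand-in for ReflexionConfig / GroundingConfig."""

    threshold: float = 0.5


def resolve_flag(value: Union[T, bool, None], config_cls: Type[T]) -> Optional[T]:
    """Map True to a default config, a config instance to itself, anything else to None."""
    if value is True:
        return config_cls()
    if isinstance(value, config_cls):
        return value
    return None  # False, None, or an unrelated type leaves the feature disabled


custom = ToyConfig(threshold=0.9)
print(resolve_flag(True, ToyConfig))    # default config
print(resolve_flag(custom, ToyConfig))  # the instance that was passed in
print(resolve_flag(False, ToyConfig))   # None
```

Under this rule, `True` is a shorthand for "enable with defaults", while passing an instance lets the caller tune the feature.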

Source code in src/locus/agent/agent.py

def __init__(
    self,
    model: str | Any | None = None,
    tools: list[Tool] | None = None,
    system_prompt: str | None = None,
    reflexion: ReflexionConfig | bool | None = None,
    grounding: GroundingConfig | bool | None = None,
    max_iterations: int = 20,
    conversation_manager: Any | None = None,
    checkpointer: Any | None = None,
    hooks: list[Any] | None = None,
    config: AgentConfig | None = None,
    **kwargs: Any,
):
    """
    Initialize an Agent.

    Args:
        model: Model string or ModelProtocol instance
        tools: List of tools available to the agent
        system_prompt: System prompt for the agent
        reflexion: Reflexion config (True for defaults, False/None to disable)
        grounding: Grounding config (True for defaults, False/None to disable)
        max_iterations: Maximum iterations before stopping
        conversation_manager: Conversation manager for message pruning
        checkpointer: Checkpointer for state persistence
        hooks: Lifecycle hooks
        config: Full AgentConfig (overrides other params)
        **kwargs: Additional config options
    """
    # Build config from params or use provided
    if config is not None:
        agent_config = config
    else:
        # Handle reflexion
        reflexion_config = None
        if reflexion is True:
            reflexion_config = ReflexionConfig()
        elif isinstance(reflexion, ReflexionConfig):
            reflexion_config = reflexion

        # Handle grounding
        grounding_config = None
        if grounding is True:
            grounding_config = GroundingConfig()
        elif isinstance(grounding, GroundingConfig):
            grounding_config = grounding

        agent_config = AgentConfig(
            model=model or "openai:gpt-4o",
            tools=tools or [],
            system_prompt=system_prompt or "You are a helpful AI assistant.",
            reflexion=reflexion_config,
            grounding=grounding_config,
            max_iterations=max_iterations,
            conversation_manager=conversation_manager,
            checkpointer=checkpointer,
            hooks=hooks or [],
            **kwargs,
        )

    super().__init__(config=agent_config)
    self._initialize()

is_cancelled `property` ¶

is_cancelled: bool

Check if cancellation has been requested.

model `property` ¶

model: Any

Get the model instance.

tools `property` ¶

tools: ToolRegistry

Get the tool registry.

system_prompt `property` ¶

system_prompt: str

Get the configured system prompt as a string.

If the config value is a callable (a dynamic prompt), it is coerced to its repr so that this property always returns a str. To invoke the dynamic form, read self.config.system_prompt directly, which holds the raw value (string or callable).
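As a rough illustration of that coercion, the sketch below returns strings unchanged and falls back to `repr` for anything else. The name `prompt_as_str` and the zero-argument callable signature are assumptions made for the example; the page does not show how Locus invokes dynamic prompts.

```python
from typing import Callable, Union

# Assumption: a dynamic prompt is modelled here as a zero-argument callable.
PromptValue = Union[str, Callable[[], str]]


def prompt_as_str(value: PromptValue) -> str:
    """Return a str for display: strings pass through, callables become their repr."""
    return value if isinstance(value, str) else repr(value)


def dynamic_prompt() -> str:
    return "You are a helpful assistant."


print(prompt_as_str("You are helpful."))  # the string itself
print(prompt_as_str(dynamic_prompt))      # a repr such as <function dynamic_prompt at 0x...>
```

The repr is useful for logging but is not the prompt text, which is why the raw config value is the right thing to read when the callable needs to be invoked.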

run_sync ¶

run_sync(prompt: str, *, thread_id: str | None = None, metadata: dict[str, Any] | None = None) -> AgentResult

Run the agent synchronously.

Parameters:

Name	Type	Description	Default
`prompt`	`str`	User prompt to process	required
`thread_id`	`str \| None`	Optional thread ID for checkpointing	`None`
`metadata`	`dict[str, Any] \| None`	Additional metadata for tools	`None`

Returns:

Type	Description
`AgentResult`	AgentResult with final message and state

Source code in src/locus/agent/agent.py

def run_sync(
    self,
    prompt: str,
    *,
    thread_id: str | None = None,
    metadata: dict[str, Any] | None = None,
) -> AgentResult:
    """
    Run the agent synchronously.

    Args:
        prompt: User prompt to process
        thread_id: Optional thread ID for checkpointing
        metadata: Additional metadata for tools

    Returns:
        AgentResult with final message and state
    """

    async def _run() -> AgentResult:
        started_at = datetime.now(UTC)
        stop_reason: StopReason = "complete"
        final_message: str = ""
        tool_errors = 0

        callback = self.config.callback_handler

        async for event in self.run(prompt, thread_id=thread_id, metadata=metadata):
            # Fire callback if set
            if callback is not None:
                callback(event)

            if isinstance(event, TerminateEvent):
                stop_reason = _normalize_stop_reason(event.reason)
                final_message = event.final_message or ""
            elif isinstance(event, ToolCompleteEvent):
                if event.error:
                    tool_errors += 1

        # Use actual final state from run() instead of reconstructing
        state = self._last_run_state
        if state is None:
            state = await self._create_initial_state(prompt, thread_id, metadata)
            if final_message:
                state = state.with_message(Message.assistant(final_message))

        # Structured-output coercion (no-op when output_schema is unset).
        parsed_obj = None
        parse_error_msg = None
        structured_message = final_message
        if self.config.output_schema is not None:
            parsed_obj, parse_error_msg, state = await self._structure_output(
                state, final_message or ""
            )
            if parsed_obj is not None:
                # Replace ``message`` with the canonical JSON form so callers
                # using ``result.message`` still see a schema-valid string.
                structured_message = parsed_obj.model_dump_json()

        # Run GSAR judgment when configured. Single-pass v1: judge
        # the final answer, surface the result on AgentResult.
        # Full Algorithm-1 outer loop (regenerate / replan) lives in
        # locus.reasoning.gsar_evaluator and can be wired
        # explicitly when the caller wants the loop dynamics.
        gsar_judgment, gsar_score_value, gsar_decision = await self._run_gsar_judgment(
            state, structured_message or final_message
        )

        elapsed_ms = (datetime.now(UTC) - started_at).total_seconds() * 1000
        metrics = ExecutionMetrics(
            iterations=state.iteration,
            tool_calls=len(state.tool_executions),
            tool_errors=tool_errors,
            total_tokens=state.total_tokens_used,
            prompt_tokens=state.prompt_tokens_used,
            completion_tokens=state.completion_tokens_used,
            cache_creation_input_tokens=state.cache_creation_tokens_used,
            cache_read_input_tokens=state.cache_read_tokens_used,
            duration_ms=elapsed_ms,
        )

        return AgentResult.from_state(
            state=state,
            stop_reason=stop_reason,
            metrics=metrics,
            started_at=started_at,
            parsed=parsed_obj,
            parse_error=parse_error_msg,
            message=structured_message,
            gsar_judgment=gsar_judgment,
            gsar_score=gsar_score_value,
            gsar_decision=gsar_decision,
        )

    async def _run_and_close_clients() -> AgentResult:
        # Wrap _run() so any model-level httpx client is shut down
        # *inside* this asyncio.run loop. Otherwise the client's
        # connections remain bound to the loop we're about to close;
        # when ``run_sync`` is called again, the next ``asyncio.run``
        # opens a fresh loop and the old client's ``__del__`` tries
        # to ``aclose`` against the now-closed loop, raising
        # ``RuntimeError: Event loop is closed``.
        try:
            return await _run()
        finally:
            close = getattr(self.model, "close", None)
            if close is not None:
                try:
                    await close()
                except Exception:  # noqa: BLE001 — cleanup must never mask a real error from _run()
                    pass

            # Same reasoning for the checkpointer's connection pool.
            # The oracledb thin-mode pool is bound to the asyncio loop
            # that created it. Closing it here drains the connections
            # *inside* this loop. Skipping this step means the next
            # ``run_sync`` opens a fresh loop with the old pool still
            # holding TCP handles from the dead loop — which surfaces
            # as ORA-03146 / ORA-03138 / DPY-4011 on the next save.
            ckpt = getattr(getattr(self, "config", None), "checkpointer", None)
            ckpt_close = getattr(ckpt, "close", None) if ckpt is not None else None
            if ckpt_close is not None:
                try:
                    await ckpt_close()
                except Exception:  # noqa: BLE001 — cleanup must never mask _run() errors
                    pass

            # Drain any background tasks the SDK spawned (httpx's TLS
            # teardown schedules ``loop.call_soon`` callbacks via
            # anyio that fire after ``client.close()`` returns). If
            # we don't await them, the loop closes mid-flight and the
            # callbacks raise "Event loop is closed" on the asyncio
            # default exception handler — visible in stderr as
            # "Task exception was never retrieved".
            try:
                pending = [
                    t
                    for t in asyncio.all_tasks()
                    if t is not asyncio.current_task() and not t.done()
                ]
                if pending:
                    await asyncio.wait(pending, timeout=2.0)
            except Exception:  # noqa: BLE001 — best-effort drain; never block teardown
                pass

    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No running loop, create a new one
        return asyncio.run(_run_and_close_clients())
    else:
        # There's a running loop, run in a thread to avoid nesting
        import concurrent.futures

        with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
            future = executor.submit(asyncio.run, _run_and_close_clients())
            return future.result()
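The last few lines of `run_sync` choose between calling `asyncio.run` directly and delegating to a worker thread, depending on whether an event loop is already running. A minimal, self-contained sketch of that pattern follows. `run_blocking`, `answer`, and `caller_inside_loop` are hypothetical names, and unlike the listing above the sketch takes a coroutine factory so the coroutine is created in the thread that runs it.

```python
import asyncio
import concurrent.futures
from typing import Any, Callable, Coroutine, TypeVar

T = TypeVar("T")


def run_blocking(make_coro: Callable[[], Coroutine[Any, Any, T]]) -> T:
    """Run a coroutine to completion from sync code, with or without a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread, so asyncio.run can own a fresh one.
        return asyncio.run(make_coro())
    # A loop is already running in this thread and asyncio.run would raise.
    # Hand the work to a worker thread that starts its own loop.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(make_coro())).result()


async def answer() -> int:
    await asyncio.sleep(0)
    return 42


async def caller_inside_loop() -> int:
    # Called while a loop is running; this blocks that loop until the worker finishes.
    return run_blocking(answer)


plain = run_blocking(answer)
nested = asyncio.run(caller_inside_loop())
print(plain, nested)  # 42 42
```

Note that the threaded branch blocks the calling loop while it waits, so inside async code it is usually preferable to iterate `run()` directly.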

invoke ¶

invoke(prompt: str, *, thread_id: str | None = None, metadata: dict[str, Any] | None = None) -> AgentResult

Invoke the agent (alias for run_sync).

Parameters:

Name	Type	Description	Default
`prompt`	`str`	User prompt to process	required
`thread_id`	`str \| None`	Optional thread ID for checkpointing	`None`
`metadata`	`dict[str, Any] \| None`	Additional metadata for tools	`None`

Returns:

Type	Description
`AgentResult`	AgentResult with final message and state

Source code in src/locus/agent/agent.py

def invoke(
    self,
    prompt: str,
    *,
    thread_id: str | None = None,
    metadata: dict[str, Any] | None = None,
) -> AgentResult:
    """
    Invoke the agent (alias for run_sync).

    Args:
        prompt: User prompt to process
        thread_id: Optional thread ID for checkpointing
        metadata: Additional metadata for tools

    Returns:
        AgentResult with final message and state
    """
    return self.run_sync(prompt, thread_id=thread_id, metadata=metadata)

cancel ¶

cancel() -> None

Cancel a running agent from an external thread.

Sets a signal that the agent loop checks at each iteration. The agent will stop gracefully with stop_reason="cancelled".

Thread-safe — can be called from any thread while the agent is running.

Example

import threading
import time

def run_agent():
    result = agent.run_sync("Long task...")
    print(result.stop_reason)  # "cancelled"

t = threading.Thread(target=run_agent)
t.start()
time.sleep(5)
agent.cancel()  # Stop from main thread
t.join()
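The cooperative nature of this cancellation can be sketched without Locus at all. In the example below, `ToyWorker` is a hypothetical stand-in for the agent loop: it checks a `threading.Event` once per iteration, so a step already in progress finishes before the loop stops.

```python
import threading
import time


class ToyWorker:
    """Hypothetical stand-in for an agent loop; not the Locus implementation."""

    def __init__(self) -> None:
        self._cancel_signal = threading.Event()

    def cancel(self) -> None:
        # Event.set() is thread-safe, so any thread may call this.
        self._cancel_signal.set()

    def run_sync(self, max_iterations: int = 1000) -> str:
        for _ in range(max_iterations):
            # The signal is only checked between iterations.
            if self._cancel_signal.is_set():
                return "cancelled"
            time.sleep(0.01)  # simulate one unit of work
        return "max_iterations"


worker = ToyWorker()
outcome: list[str] = []
t = threading.Thread(target=lambda: outcome.append(worker.run_sync()))
t.start()
time.sleep(0.05)
worker.cancel()  # request a stop from the main thread
t.join()
print(outcome[0])  # cancelled
```

Because the check happens at iteration boundaries, the delay between `cancel()` and the actual stop is bounded by the duration of one iteration.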

Source code in src/locus/agent/agent.py

def cancel(self) -> None:
    """Cancel a running agent from an external thread.

    Sets a signal that the agent loop checks at each iteration.
    The agent will stop gracefully with stop_reason="cancelled".

    Thread-safe — can be called from any thread while the agent is running.

    Example:
        import threading
        import time

        def run_agent():
            result = agent.run_sync("Long task...")
            print(result.stop_reason)  # "cancelled"

        t = threading.Thread(target=run_agent)
        t.start()
        time.sleep(5)
        agent.cancel()  # Stop from main thread
        t.join()
    """
    if self._cancel_signal is None:
        self._cancel_signal = threading.Event()
    self._cancel_signal.set()

as_tool ¶

as_tool(name: str | None = None, description: str | None = None) -> Tool

Wrap this agent as a Tool for use by another agent.

The returned tool accepts a prompt string and returns the agent's final response. This enables agent delegation — a parent agent can call a sub-agent as if it were any other tool.

Parameters:

Name	Type	Description	Default
`name`	`str \| None`	Tool name (defaults to agent_id or "sub_agent")	`None`
`description`	`str \| None`	Tool description (defaults to a generic sub-agent delegation message)	`None`

Returns:

Type	Description
`Tool`	A Tool that runs this agent when called

Example

researcher = Agent(
    model=model, tools=[search], system_prompt="You research topics."
)
writer = Agent(model=model, tools=[researcher.as_tool("research")])
result = writer.run_sync("Write about quantum computing")
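The delegation pattern boils down to a closure that turns a `run_sync`-like function into a plain `prompt -> str` callable. The sketch below shows that shape. `ToyResult` and `as_callable_tool` are hypothetical, and the rule that `success` means `stop_reason == "complete"` is an assumption made for the example rather than documented Locus behaviour.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyResult:
    """Hypothetical stand-in for AgentResult with only the fields used here."""

    message: str
    stop_reason: str = "complete"

    @property
    def success(self) -> bool:
        # Assumption for this sketch: only "complete" counts as success.
        return self.stop_reason == "complete"


def as_callable_tool(run_sync: Callable[[str], ToyResult]) -> Callable[[str], str]:
    """Wrap a run_sync-like function as a plain prompt -> str tool."""

    def agent_tool(prompt: str) -> str:
        result = run_sync(prompt)
        if result.success:
            return result.message
        # Report the failure to the parent as text rather than raising.
        return f"Sub-agent finished with status '{result.stop_reason}': {result.message}"

    return agent_tool


ok_tool = as_callable_tool(lambda p: ToyResult(message=f"notes on {p}"))
failed_tool = as_callable_tool(
    lambda p: ToyResult(message="partial", stop_reason="max_iterations")
)
print(ok_tool("quantum computing"))
print(failed_tool("quantum computing"))
```

Returning the failure as text keeps the parent agent's loop running and lets the parent model decide how to react to an incomplete sub-task.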

Source code in src/locus/agent/agent.py

def as_tool(
    self,
    name: str | None = None,
    description: str | None = None,
) -> Tool:
    """
    Wrap this agent as a Tool for use by another agent.

    The returned tool accepts a prompt string and returns the agent's
    final response. This enables agent delegation — a parent agent
    can call a sub-agent as if it were any other tool.

    Args:
        name: Tool name (defaults to agent_id or "sub_agent")
        description: Tool description (defaults to a generic sub-agent delegation message)

    Returns:
        A Tool that runs this agent when called

    Example:
        >>> researcher = Agent(
        ...     model=model, tools=[search], system_prompt="You research topics."
        ... )
        >>> writer = Agent(model=model, tools=[researcher.as_tool("research")])
        >>> result = writer.run_sync("Write about quantum computing")
    """
    from locus.tools.decorator import tool as tool_decorator

    agent = self
    tool_name = name or self.config.agent_id or "sub_agent"
    tool_desc = description or (
        "Delegate a task to a sub-agent. "
        "The sub-agent has its own tools and will work independently "
        "to answer your request. Send a clear, specific prompt."
    )

    @tool_decorator(name=tool_name, description=tool_desc)
    def agent_tool(prompt: str) -> str:
        """Run the sub-agent with the given prompt and return its response.

        Args:
            prompt: The task or question to delegate to the sub-agent

        Returns:
            The sub-agent's final response
        """
        result = agent.run_sync(prompt)
        if result.success:
            return result.message
        return f"Sub-agent finished with status '{result.stop_reason}': {result.message}"

    return agent_tool

resume `async` ¶

resume(response: str) -> AsyncIterator[LocusEvent]

Resume agent execution after an interrupt.

When a tool calls ask_user() and the agent yields an InterruptEvent, call this method with the user's response to continue execution.

Parameters:

Name	Type	Description	Default
`response`	`str`	The user's response to the interrupt question	required

Yields:

Type	Description
`AsyncIterator[LocusEvent]`	LocusEvent instances for the remaining execution

Example

async for event in agent.run("Build an app"):
    if isinstance(event, InterruptEvent):
        answer = input(event.question)
        async for event in agent.resume(answer):
            handle(event)
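The interrupt and resume handshake can be exercised end to end with a toy agent. Everything in the sketch below (`ToyAgent`, `Interrupt`, `Done`, `drive`) is a hypothetical stand-in written for illustration; it only mirrors the contract described here, where `run()` yields an interrupt and `resume()` continues with the user's answer or raises if nothing is pending.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class Interrupt:
    question: str


@dataclass
class Done:
    message: str


class ToyAgent:
    """Hypothetical stand-in that pauses once to ask a question."""

    def __init__(self) -> None:
        self._pending_question: Optional[str] = None

    async def run(self, prompt: str) -> AsyncIterator[object]:
        self._pending_question = "Which framework?"
        yield Interrupt(self._pending_question)

    async def resume(self, response: str) -> AsyncIterator[object]:
        if self._pending_question is None:
            raise RuntimeError("No interrupt to resume from. Call run() first.")
        self._pending_question = None  # an interrupt can be resumed only once
        yield Done(f"Building with {response}")


async def drive() -> list[str]:
    agent = ToyAgent()
    seen: list[str] = []
    async for event in agent.run("Build an app"):
        if isinstance(event, Interrupt):
            # A distinct loop variable avoids shadowing the outer event.
            async for resumed in agent.resume("Flask"):
                if isinstance(resumed, Done):
                    seen.append(resumed.message)
    return seen


messages = asyncio.run(drive())
print(messages)  # ['Building with Flask']
```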

Source code in src/locus/agent/agent.py

async def resume(
    self,
    response: str,
) -> AsyncIterator[LocusEvent]:
    """
    Resume agent execution after an interrupt.

    When a tool calls ask_user() and the agent yields an InterruptEvent,
    call this method with the user's response to continue execution.

    Args:
        response: The user's response to the interrupt question

    Yields:
        LocusEvent instances for the remaining execution

    Example:
        >>> async for event in agent.run("Build an app"):
        ...     if isinstance(event, InterruptEvent):
        ...         answer = input(event.question)
        ...         async for event in agent.resume(answer):
        ...             handle(event)
    """
    if self._interrupt_state is None:
        raise RuntimeError("No interrupt to resume from. Call run() first.")

    # Add the user's response to the history as a system message
    state = self._interrupt_state
    state = state.with_message(Message.system(f"[User Response] {response}"))

    # Store for _create_initial_state to pick up
    self._last_run_state = state
    self._interrupt_state = None

    # Re-run — _create_initial_state will load from checkpoint/state
    # We pass the original prompt; the state already has the full history
    prompt = self._interrupt_prompt or ""
    thread_id = self._interrupt_thread_id
    metadata = self._interrupt_metadata

    # Clear interrupt bookkeeping
    self._interrupt_prompt = None
    self._interrupt_thread_id = None
    self._interrupt_metadata = None

    # Continue execution from the interrupted state
    async for event in self._run_from_state(state, prompt, thread_id, metadata):
        yield event

add_tool ¶

add_tool(tool: Tool) -> None

Register a tool on this agent after construction.

Locus compiles config.tools into the runtime ToolRegistry once, inside __init__ (via locus.agent.initializer.initialize_agent). Mutating self.config.tools directly after that point is a silent no-op: the model never sees the added tool because the registry has already been built.

Use this method (or add_tools) when you want to compose a specialist fleet at runtime: build each specialist, wrap it via Agent.as_tool(...), and attach the wrappers to the orchestrator.

The tool is also appended to self.config.tools so that a subsequent re-initialisation (e.g. after a config-driven clone) sees the same shape.

Raises:

Type	Description
`TypeError`	If `tool` is not a `locus.tools.Tool` instance. Plain callables must be wrapped with the `@tool` decorator first.
`ValueError`	If a tool with the same `name` is already registered (propagated from `ToolRegistry.register`).
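The difference between mutating the config list and registering a tool is easier to see in a small model. In the sketch below, `ToyAgent` and `ToyTool` are hypothetical stand-ins: the registry is built once in the constructor, so appending to the list afterwards changes nothing the "model" can see, whereas `add_tool` updates both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTool:
    name: str


class ToyAgent:
    """Hypothetical stand-in; the registry is built once from the config list."""

    def __init__(self, tools: list[ToyTool]) -> None:
        self.config_tools = list(tools)
        # Built once; later changes to config_tools are not picked up.
        self._registry = {t.name: t for t in self.config_tools}

    def add_tool(self, tool: ToyTool) -> None:
        if not isinstance(tool, ToyTool):
            raise TypeError(f"Expected ToyTool instance, got {type(tool)}")
        if tool.name in self._registry:
            raise ValueError(f"Tool '{tool.name}' is already registered")
        self._registry[tool.name] = tool
        self.config_tools.append(tool)  # keep config and registry in step

    def visible_tools(self) -> list[str]:
        return sorted(self._registry)


agent = ToyAgent([ToyTool("search")])
agent.config_tools.append(ToyTool("calculate"))  # silent no-op for the registry
print(agent.visible_tools())  # ['search']
agent.add_tool(ToyTool("summarize"))
print(agent.visible_tools())  # ['search', 'summarize']
```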

Source code in src/locus/agent/agent.py

def add_tool(self, tool: Tool) -> None:
    """Register a tool on this agent after construction.

    Locus compiles ``config.tools`` into the runtime ``ToolRegistry``
    once, inside ``__init__`` (via :func:`locus.agent.initializer.
    initialize_agent`). Mutating ``self.config.tools`` directly after
    that point is a silent no-op — the model never sees the added
    tool because the registry has already been built.

    Use this method (or :meth:`add_tools`) when you want to compose a
    specialist fleet at runtime: build each specialist, wrap it via
    ``Agent.as_tool(...)``, and attach the wrappers to the
    orchestrator.

    The tool is also appended to ``self.config.tools`` so that a
    subsequent re-initialisation (e.g. after a config-driven
    clone) sees the same shape.

    Raises:
        TypeError: if ``tool`` is not a :class:`locus.tools.Tool`
            instance. Callable functions must be wrapped with the
            :func:`@tool` decorator first.
        ValueError: if a tool with the same ``name`` is already
            registered (propagated from
            :meth:`ToolRegistry.register`).
    """
    if not isinstance(tool, Tool):
        raise TypeError(
            f"Expected Tool instance (use @tool to wrap a function), got {type(tool)}"
        )
    self._initialize()
    self._tool_registry.register(tool)
    # Mirror into config so a re-initialisation reconstructs the
    # same surface. ``config.tools`` is a list[Any] by Pydantic
    # declaration, so we mutate in place rather than reassigning.
    self.config.tools.append(tool)

add_tools ¶

add_tools(tools: list[Tool]) -> None

Register multiple tools at once.

Equivalent to calling add_tool for each entry, in order. If any registration fails (wrong type, duplicate name), the exception propagates immediately and nothing is rolled back: tools registered before the failing one remain in the registry, and the entries after it are never attempted. Validate inputs ahead of time when atomic behaviour is required.
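One way to get all-or-nothing behaviour is to check the whole batch before registering anything. The sketch below does this against a hypothetical `ToyRegistry` that rejects duplicate names; `register_all_or_nothing` is likewise an illustrative helper, not a Locus function.

```python
from typing import Iterable


class ToyRegistry:
    """Hypothetical registry that rejects duplicate names."""

    def __init__(self) -> None:
        self.tools: dict[str, object] = {}

    def register(self, name: str, tool: object) -> None:
        if name in self.tools:
            raise ValueError(f"duplicate tool name: {name}")
        self.tools[name] = tool


def register_all_or_nothing(
    registry: ToyRegistry, named_tools: Iterable[tuple[str, object]]
) -> None:
    """Validate the whole batch first, then register every entry."""
    batch = list(named_tools)
    names = [name for name, _ in batch]
    # A clash is a name already registered or repeated within the batch.
    clashes = {n for n in names if n in registry.tools or names.count(n) > 1}
    if clashes:
        raise ValueError(f"duplicate tool names: {sorted(clashes)}")
    for name, tool in batch:
        registry.register(name, tool)


registry = ToyRegistry()
registry.register("search", object())

try:
    register_all_or_nothing(registry, [("calc", 1), ("search", 2)])
except ValueError:
    pass  # the batch was rejected before "calc" was registered

print(sorted(registry.tools))  # ['search']
register_all_or_nothing(registry, [("calc", 1), ("plot", 2)])
print(sorted(registry.tools))  # ['calc', 'plot', 'search']
```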

Source code in src/locus/agent/agent.py

def add_tools(self, tools: list[Tool]) -> None:
    """Register multiple tools at once.

    Equivalent to calling :meth:`add_tool` for each entry. If any
    single registration fails (wrong type, duplicate name), the
    whole call fails: tools registered before the failing one
    remain in the registry. Validate inputs ahead of time when
    atomic behaviour is required.
    """
    for t in tools:
        self.add_tool(t)

run `async` ¶

run(prompt: str, *, thread_id: str | None = None, metadata: dict[str, Any] | None = None) -> AsyncIterator[LocusEvent]

Run the agent with streaming events.

Parameters:

Name	Type	Description	Default
`prompt`	`str`	User prompt to process	required
`thread_id`	`str \| None`	Optional thread ID for checkpointing	`None`
`metadata`	`dict[str, Any] \| None`	Additional metadata for tools	`None`

Yields:

Type	Description
`AsyncIterator[LocusEvent]`	LocusEvent instances for each step
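A caller that consumes the stream typically folds the events into a final outcome, much as the `run_sync` listing on this page does. The sketch below shows that folding against a fake event source. `ToolComplete`, `Terminate`, `fake_run`, and `summarize` are hypothetical stand-ins with only the fields needed here, not the Locus event classes.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional, Union


@dataclass
class ToolComplete:
    tool_name: str
    error: Optional[str] = None


@dataclass
class Terminate:
    reason: str
    final_message: Optional[str] = None


Event = Union[ToolComplete, Terminate]


async def fake_run(prompt: str) -> AsyncIterator[Event]:
    """Hypothetical event source standing in for an agent's run()."""
    yield ToolComplete("search")
    yield ToolComplete("calculate", error="division by zero")
    yield Terminate(reason="complete", final_message="4")


async def summarize(prompt: str) -> tuple[str, str, int]:
    """Fold the stream into (stop_reason, final_message, tool_errors)."""
    stop_reason, final_message, tool_errors = "complete", "", 0
    async for event in fake_run(prompt):
        if isinstance(event, Terminate):
            stop_reason = event.reason
            final_message = event.final_message or ""
        elif isinstance(event, ToolComplete) and event.error:
            tool_errors += 1
    return stop_reason, final_message, tool_errors


summary = asyncio.run(summarize("What is 2+2?"))
print(summary)  # ('complete', '4', 1)
```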

Source code in src/locus/agent/runtime_loop.py

@_bus_bridge
async def run(
    self,
    prompt: str,
    *,
    thread_id: str | None = None,
    metadata: dict[str, Any] | None = None,
) -> AsyncIterator[LocusEvent]:
    """
    Run the agent with streaming events.

    Args:
        prompt: User prompt to process
        thread_id: Optional thread ID for checkpointing
        metadata: Additional metadata for tools

    Yields:
        LocusEvent instances for each step
    """
    self._initialize()

    # Create initial state
    state = await self._create_initial_state(prompt, thread_id, metadata)

    # Track metrics
    started_at = datetime.now(UTC)
    _total_tokens = 0
    _tool_calls_count = 0
    _tool_errors_count = 0
    _reflexion_evals = 0
    _grounding_evals = 0
    _last_assistant_content: str | None = None
    _last_no_tool_calls = False

    # Reset any user-supplied composable termination condition so
    # time-windowed checks (TimeLimit) start their clock at run start.
    if self.config.termination is not None:
        self.config.termination.reset()

    # Run hooks: before_invocation
    state = await self._run_before_invocation_hooks(prompt, state)

    # Inject long-term memories into the system prompt.
    if self._memory_manager is not None:
        state = await self._memory_manager.on_session_start(state)

    try:
        # Main ReAct loop
        while True:
            # Check time budget
            if self.config.time_budget_seconds is not None:
                elapsed = (datetime.now(UTC) - started_at).total_seconds()
                if elapsed >= self.config.time_budget_seconds:
                    yield TerminateEvent(
                        reason="time_budget",
                        iterations_used=state.iteration,
                        final_confidence=state.confidence,
                        total_tool_calls=len(state.tool_executions),
                        final_message=_last_assistant_content,
                    )
                    break

            # Check external cancellation
            if self.is_cancelled:
                yield TerminateEvent(
                    reason="cancelled",
                    iterations_used=state.iteration,
                    final_confidence=state.confidence,
                    total_tool_calls=_tool_calls_count,
                    final_message="Agent cancelled by external signal.",
                )
                break

            # User-supplied composable termination condition runs first
            # so MaxIterations(...) | TextMention("DONE") and friends
            # actually fire before the hard-coded fallbacks.
            if self.config.termination is not None:
                user_stop, user_reason = self.config.termination.check(
                    state,
                    last_message=_last_assistant_content or "",
                    no_tool_calls=_last_no_tool_calls,
                )
                if user_stop:
                    yield TerminateEvent(
                        reason=user_reason or "complete",
                        iterations_used=state.iteration,
                        final_confidence=state.confidence,
                        total_tool_calls=len(state.tool_executions),
                        final_message=_last_assistant_content,
                    )
                    break

            # Check termination conditions
            should_stop, stop_reason = state.should_terminate
            if should_stop and stop_reason:
                if stop_reason == "max_iterations" and state.iteration > 0:
                    # Inject summary request and do one final call WITHOUT tools
                    state = state.with_message(
                        Message.system(
                            "[Iteration Limit Reached]\n"
                            "You have used all available iterations. "
                            "Provide a final summary of your findings and conclusions "
                            "based on the work done so far. Do NOT call any more tools."
                        )
                    )
                    # Call model without tool schemas to force text response.
                    # Use the auxiliary (cheap) model when configured —
                    # this is just a final summary, no need to spend
                    # primary-model budget.
                    messages = list(state.messages)
                    if self._conversation_manager:
                        if hasattr(self._conversation_manager, "async_apply"):
                            messages = await self._conversation_manager.async_apply(messages)
                        else:
                            messages = self._conversation_manager.apply(messages)
                    messages = self._validate_messages(messages)

                    summary_model = self._auxiliary_model or self._model
                    response = await summary_model.complete(
                        messages=messages,
                        tools=None,  # No tools — force text summary
                        temperature=self.config.temperature,
                        max_tokens=self.config.max_tokens,
                    )
                    prompt_toks = response.usage.get("prompt_tokens", 0)
                    completion_toks = response.usage.get("completion_tokens", 0)
                    cache_creation_toks = response.usage.get("cache_creation_input_tokens", 0)
                    cache_read_toks = response.usage.get("cache_read_input_tokens", 0)
                    _total_tokens += prompt_toks + completion_toks
                    state = state.with_token_usage(
                        prompt_toks,
                        completion_toks,
                        cache_creation_tokens=cache_creation_toks,
                        cache_read_tokens=cache_read_toks,
                    )

                    summary = (
                        response.message.content
                        or _last_assistant_content
                        or self._build_fallback_summary(state)
                    )
                    yield TerminateEvent(
                        reason="max_iterations",
                        iterations_used=state.iteration,
                        final_confidence=state.confidence,
                        total_tool_calls=len(state.tool_executions),
                        final_message=summary,
                    )
                    break

                # All other stop reasons: hard stop
                yield TerminateEvent(
                    reason=stop_reason,
                    iterations_used=state.iteration,
                    final_confidence=state.confidence,
                    total_tool_calls=len(state.tool_executions),
                    final_message=_last_assistant_content,
                )
                break

            # Increment iteration
            state = state.next_iteration()

            # Planning: inject plan prompt on first iteration
            if self.config.planning and state.iteration == 1:
                state = state.with_message(
                    Message.system(
                        "[Planning Phase]\n"
                        "Before taking any action, create a step-by-step plan.\n"
                        "Format your plan as a numbered list:\n"
                        "1. First step\n"
                        "2. Second step\n"
                        "...\n\n"
                        "After stating your plan, begin executing step 1.\n"
                        "Do NOT call tools without a plan."
                    )
                )

            # Budget warning in explicit mode — nudge model to complete
            if self.config.completion_mode == "explicit":
                remaining = self.config.max_iterations - state.iteration
                if remaining == 2:
                    state = state.with_message(
                        Message.system(
                            f"[Budget Warning] You have {remaining} iterations left. "
                            "Start wrapping up. Call task_complete(summary='your findings') "
                            "to finish, or you'll hit the iteration limit."
                        )
                    )
                elif remaining == 0:
                    state = state.with_message(
                        Message.system(
                            "[Final Iteration] This is your LAST iteration. "
                            "You MUST call task_complete now with a summary of everything "
                            "you've found. Do NOT call any other tools."
                        )
                    )

            # Get model response
            response, state = await self._get_model_response(state)
            prompt_toks = response.usage.get("prompt_tokens", 0)
            completion_toks = response.usage.get("completion_tokens", 0)
            cache_creation_toks = response.usage.get("cache_creation_input_tokens", 0)
            cache_read_toks = response.usage.get("cache_read_input_tokens", 0)
            _total_tokens += prompt_toks + completion_toks
            state = state.with_token_usage(
                prompt_toks,
                completion_toks,
                cache_creation_tokens=cache_creation_toks,
                cache_read_tokens=cache_read_toks,
            )
            _last_assistant_content = response.message.content
            # Track for the user-supplied termination condition. Updated again
            # below if a Cohere-style text tool call is parsed out of the body.
            _last_no_tool_calls = not response.message.tool_calls

            # Store plan from first iteration if planning enabled
            if self.config.planning and state.iteration == 1 and response.message.content:
                state = state.with_metadata("plan", response.message.content)

            # Emit think event
            yield ThinkEvent(
                iteration=state.iteration,
                reasoning=response.message.content,
                tool_calls=list(response.message.tool_calls),
            )

            # If no structured tool calls, try parsing from text (Cohere fallback)
            if not response.message.tool_calls and response.message.content:
                parsed_calls = self._parse_text_tool_calls(response.message.content)
                if parsed_calls:
                    response = ModelResponse(
                        message=Message(
                            role=response.message.role,
                            content=response.message.content,
                            tool_calls=parsed_calls,
                            tool_call_id=response.message.tool_call_id,
                            name=response.message.name,
                        ),
                        usage=response.usage,
                        stop_reason=response.stop_reason,
                    )
                    # Update the assistant message in state with parsed tool calls
                    messages = list(state.messages)
                    messages[-1] = response.message
                    state = state.model_copy(update={"messages": tuple(messages)})
                    _last_no_tool_calls = False

            # If still no tool calls — in auto mode we're done, in explicit mode we continue
            if not response.message.tool_calls and self.config.completion_mode != "explicit":
                # Apply grounding before final response if enabled
                if (
                    self.config.grounding
                    and self.config.grounding.enabled
                    and self.config.grounding.check_before_final
                    and self._grounding_evaluator
                    and response.message.content
                    and len(state.tool_executions) > 0
                ):
                    grounding_event, state = await self._apply_grounding(
                        state, response.message.content
                    )
                    _grounding_evals += 1
                    yield grounding_event

                    # If grounding fails, inject guidance and continue loop
                    if grounding_event.requires_replan and _grounding_evals <= (
                        self.config.grounding.max_replans
                    ):
                        from locus.reasoning.grounding import GroundingResult

                        replan_guidance = self._grounding_evaluator.get_replan_guidance(
                            GroundingResult(
                                score=grounding_event.score,
                                ungrounded_claims=grounding_event.ungrounded_claims,
                                requires_replan=True,
                            )
                        )
                        state = state.with_message(
                            Message.system(f"[Grounding Check Failed]\n{replan_guidance}")
                        )
                        continue  # Re-enter loop for replanning

                yield TerminateEvent(
                    reason="complete",
                    iterations_used=state.iteration,
                    final_confidence=state.confidence,
                    total_tool_calls=len(state.tool_executions),
                    final_message=response.message.content,
                )
                break

            # Execute tool calls.
            #
            # Three phases so ``tool_execution="concurrent"`` is real
            # (#210): per-call serial hook/cache work is hoisted out
            # of the executor call so the survivors can run as one
            # concurrent batch. Without this split, the earlier
            # per-call loop fed the executor singletons and
            # ``ConcurrentExecutor`` collapsed to ``SequentialExecutor``.
            #
            #   Phase 1: per call, serial. Emit ToolStartEvent, run
            #     before-hooks, and resolve cancel/idempotent-cache
            #     short-circuits into ``slots``. State is not touched
            #     here; same-args duplicates within the batch are
            #     caught by ``batch_idempotent_seen``.
            #   Phase 2: one batched ``executor.execute_streaming(...)``
            #     over the surviving calls.
            #   Phase 3: per slot, in tool_call order. Interrupt
            #     detection, result truncation/offload, state update,
            #     ToolCompleteEvent, after-hook (with retry), write/
            #     verification tracking.
            tool_results: list[ToolResult] = []
            reasoning_step_tools: list[ToolExecution] = []

            # Phase 1 — pre-execute.
            slots: list[dict[str, Any]] = []
            to_execute_indices: list[int] = []
            to_execute_calls: list[ToolCall] = []
            # Within-batch idempotent dedup: when an ``@tool(idempotent=True)``
            # appears twice in the same batch with the same arguments, the
            # second slot becomes a ``batch_cache_ref`` pointing at the first
            # slot. Phase 3 copies the first slot's result back. Without
            # this, the README contract ("idempotent body fires once per
            # run") is silently violated whenever a model fans out
            # duplicates in one response — see #210 follow-up.
            batch_idempotent_seen: dict[tuple[str, str], int] = {}

            for tool_call in response.message.tool_calls:
                _tool_calls_count += 1

                yield ToolStartEvent(
                    tool_name=tool_call.name,
                    tool_call_id=tool_call.id,
                    arguments=tool_call.arguments,
                )

                tool_event = await self._run_before_tool_hooks(
                    tool_call.name, tool_call.id, tool_call.arguments
                )

                if tool_event.cancel:
                    cancel_msg = (
                        tool_event.cancel
                        if isinstance(tool_event.cancel, str)
                        else "Cancelled by hook"
                    )
                    cancel_result = ToolResult(
                        tool_call_id=tool_call.id,
                        name=tool_call.name,
                        content=cancel_msg,
                        error=None,
                        duration_ms=0.0,
                    )
                    cancel_execution = ToolExecution(
                        tool_name=cancel_result.name,
                        tool_call_id=cancel_result.tool_call_id,
                        arguments=tool_call.arguments,
                        result=cancel_result.content,
                    )
                    # NOTE: state is updated in Phase 3 (tool_call order),
                    # not here, so concurrent batches keep deterministic
                    # ordering on ``state.tool_executions``.
                    slots.append(
                        {
                            "tool_call": tool_call,
                            "arguments": tool_call.arguments,
                            "kind": "cancel",
                            "result": cancel_result,
                            "execution": cancel_execution,
                        }
                    )
                    continue

                modified_args = tool_event.arguments

                # Idempotent dedup: if the tool declared idempotent=True
                # and a prior call in this run used the same arguments,
                # reuse the prior result instead of invoking the body.
                # Without this, ``@tool(idempotent=True)`` is silently a
                # no-op for the main Agent.run() path (despite being
                # advertised on the README hero example).
                cached = self._maybe_cached_idempotent_result(
                    state, tool_call.name, modified_args, tool_call.id
                )
                if cached is not None:
                    cache_execution = ToolExecution(
                        tool_name=cached.name,
                        tool_call_id=cached.tool_call_id,
                        arguments=modified_args,
                        result=cached.content if cached.success else None,
                        error=cached.error,
                        duration_ms=cached.duration_ms,
                        idempotent_cache_hit=True,
                    )
                    slots.append(
                        {
                            "tool_call": tool_call,
                            "arguments": modified_args,
                            "kind": "cache",
                            "result": cached,
                            "execution": cache_execution,
                        }
                    )
                    continue

                # Within-batch dedup for idempotent tools whose duplicate
                # appears alongside the original in this same response.
                # Without this, a model emitting ``[submit(X), submit(X)]``
                # in one assistant message would fire the body twice — a
                # regression vs the pre-batching per-call loop, which
                # sneaked the dedup in via serial state updates.
                idempotent_key = self._idempotent_batch_key(tool_call.name, modified_args)
                if idempotent_key is not None:
                    prior_slot = batch_idempotent_seen.get(idempotent_key)
                    if prior_slot is not None:
                        slots.append(
                            {
                                "tool_call": tool_call,
                                "arguments": modified_args,
                                "kind": "batch_cache_ref",
                                "ref_slot": prior_slot,
                                "result": None,
                                "execution": None,
                            }
                        )
                        continue
                    batch_idempotent_seen[idempotent_key] = len(slots)

                slots.append(
                    {
                        "tool_call": tool_call,
                        "arguments": modified_args,
                        "kind": "execute",
                        "result": None,
                        "execution": None,
                    }
                )
                to_execute_indices.append(len(slots) - 1)
                to_execute_calls.append(
                    tool_call.model_copy(update={"arguments": modified_args})
                )

            # Phase 2 — stream the survivors through the executor.
            # The context factory is built after Phase 1. Cancel / cache
            # short-circuits live in ``slots`` at this point and are only
            # folded into ``state`` in Phase 3.
            #
            # Using ``execute_streaming`` (yields ``(input_idx, result)``
            # in completion order for ConcurrentExecutor, input order for
            # SequentialExecutor) gives us two things in one pass:
            #   * Opt-in completion-order ``ToolCompleteEvent`` for UI
            #     streaming (``AgentConfig.tool_event_order='completion'``).
            #   * Active sibling cancellation on interrupt: a ``break``
            #     out of the ``async for`` triggers the executor's
            #     ``finally`` clause, which cancels in-flight sibling
            #     tasks. The pre-streaming ``gather`` impl had no such
            #     hook — every sibling completed and only the fold was
            #     halted.
            if to_execute_calls:
                ctx_factory = ToolContextFactory(
                    run_id=state.run_id,
                    agent_id=state.agent_id,
                    iteration=state.iteration,
                    state=state,
                    invocation_metadata=metadata or {},
                )
                batch_start = time.perf_counter()
                interrupted_slot_idx: int | None = None
                try:
                    async for input_idx, batched_result in self._executor.execute_streaming(
                        to_execute_calls,
                        self._tool_registry,
                        ctx_factory,
                    ):
                        slot_idx = to_execute_indices[input_idx]
                        slots[slot_idx]["result"] = batched_result

                        if self.config.tool_event_order == "completion":
                            # Stream the complete event the moment the
                            # tool finishes. Phase 3 will skip its own
                            # ToolCompleteEvent emission for execute-kind
                            # slots when in completion mode.
                            yield ToolCompleteEvent(
                                tool_name=batched_result.name,
                                tool_call_id=batched_result.tool_call_id,
                                result=(
                                    batched_result.content if batched_result.success else None
                                ),
                                error=batched_result.error,
                                duration_ms=batched_result.duration_ms,
                            )

                        # Interrupt detection — break to trigger the
                        # executor's cancellation of in-flight siblings.
                        # Phase 3 will run the InterruptEvent emission
                        # path on the interrupting slot exactly like the
                        # pre-streaming code did.
                        if (
                            batched_result.content
                            and '"__interrupt__": true' in batched_result.content
                        ):
                            interrupted_slot_idx = slot_idx
                            break
                except Exception as e:  # noqa: BLE001
                    # ``_execute_one`` already traps user-tool exceptions;
                    # this catches a failure of the executor itself
                    # (registry blow-up, ctx_factory error) so a single
                    # bad call doesn't abort the whole turn.
                    batch_duration = (time.perf_counter() - batch_start) * 1000
                    for slot_idx, tc in zip(to_execute_indices, to_execute_calls, strict=True):
                        if slots[slot_idx]["result"] is None:
                            slots[slot_idx]["result"] = ToolResult(
                                tool_call_id=tc.id,
                                name=tc.name,
                                content="",
                                error=str(e),
                                duration_ms=batch_duration,
                            )

                # Synthesize "cancelled by sibling interrupt" results
                # for slots whose tasks were cancelled (the interrupt
                # break happened before they completed). Folded as
                # ``kind="cancel"`` so Phase 3 treats them identically
                # to hook-cancelled calls — Start/Complete pair emits,
                # state records the cancellation, no after-hook fires.
                if interrupted_slot_idx is not None:
                    for slot_idx in to_execute_indices:
                        slot = slots[slot_idx]
                        if slot["result"] is not None:
                            continue
                        tc = slot["tool_call"]
                        cancelled_result = ToolResult(
                            tool_call_id=tc.id,
                            name=tc.name,
                            content="Cancelled: sibling tool requested interrupt",
                            error=None,
                            duration_ms=0.0,
                        )
                        cancelled_execution = ToolExecution(
                            tool_name=tc.name,
                            tool_call_id=tc.id,
                            arguments=slot["arguments"],
                            result=cancelled_result.content,
                        )
                        slot["result"] = cancelled_result
                        slot["execution"] = cancelled_execution
                        slot["kind"] = "cancel"

            # Phase 3 — per-result fold, in tool_call order.
            for slot in slots:
                tool_call = slot["tool_call"]
                modified_args = slot["arguments"]
                kind = slot["kind"]
                result = slot["result"]

                if kind == "cancel":
                    tool_results.append(result)
                    state = state.with_tool_execution(slot["execution"])
                    reasoning_step_tools.append(slot["execution"])
                    yield ToolCompleteEvent(
                        tool_name=result.name,
                        tool_call_id=result.tool_call_id,
                        result=result.content,
                        duration_ms=0.0,
                    )
                    continue
                if kind == "cache":
                    tool_results.append(result)
                    state = state.with_tool_execution(slot["execution"])
                    reasoning_step_tools.append(slot["execution"])
                    yield ToolCompleteEvent(
                        tool_name=result.name,
                        tool_call_id=result.tool_call_id,
                        result=result.content,
                        error=result.error,
                        duration_ms=result.duration_ms,
                    )
                    continue
                if kind == "batch_cache_ref":
                    # Same-args duplicate of an earlier slot in this same
                    # batch. The referenced slot's result is resolved by
                    # now: Phase 2 stored the executor's result for it, or
                    # synthesized an error / cancelled result, before this
                    # loop started. Reuse it and rebrand the
                    # result/execution for this call's id.
                    ref_slot = slots[slot["ref_slot"]]
                    ref_result: ToolResult = ref_slot["result"]
                    result = ToolResult(
                        tool_call_id=tool_call.id,
                        name=tool_call.name,
                        content=ref_result.content,
                        error=ref_result.error,
                        duration_ms=0.0,
                    )
                    batch_cache_execution = ToolExecution(
                        tool_name=result.name,
                        tool_call_id=result.tool_call_id,
                        arguments=modified_args,
                        result=result.content if result.success else None,
                        error=result.error,
                        duration_ms=0.0,
                        idempotent_cache_hit=True,
                    )
                    tool_results.append(result)
                    state = state.with_tool_execution(batch_cache_execution)
                    reasoning_step_tools.append(batch_cache_execution)
                    yield ToolCompleteEvent(
                        tool_name=result.name,
                        tool_call_id=result.tool_call_id,
                        result=result.content,
                        error=result.error,
                        duration_ms=0.0,
                    )
                    continue

                # Interrupt marker from ``ask_user``. Sibling calls in
                # this batch have either finished or been cancelled in
                # Phase 2. We pause here, so slots after this one are
                # left un-folded into state. An interrupt is rare and
                # ``ask_user`` is the only tool that emits it; we trust
                # users not to fan it out alongside high-cost siblings.
                if result.content and '"__interrupt__": true' in result.content:
                    import json as _json

                    try:
                        interrupt_data = _json.loads(result.content)
                        if interrupt_data.get("__interrupt__"):
                            self._last_run_state = state
                            self._interrupt_state = state
                            self._interrupt_prompt = prompt
                            self._interrupt_thread_id = thread_id
                            self._interrupt_metadata = metadata
                            yield InterruptEvent(
                                question=interrupt_data.get("question", ""),
                                options=interrupt_data.get("options"),
                                interrupt_id=result.tool_call_id,
                            )
                            return  # Pause the generator
                    except (ValueError, KeyError):
                        pass

                # Cap oversized tool results so they don't blow the
                # model's context window. When ``tool_result_store`` is
                # configured we offload the full payload through it and
                # inline a recoverable reference key; otherwise we fall
                # back to lossy head-truncation.
                if (
                    self.config.max_tool_result_length > 0
                    and result.content
                    and len(result.content) > self.config.max_tool_result_length
                ):
                    if self.config.tool_result_store is not None:
                        result = self.config.tool_result_store.maybe_offload(
                            result,
                            run_id=state.run_id,
                            iteration=state.iteration,
                        )
                    else:
                        original_len = len(result.content)
                        result = ToolResult(
                            tool_call_id=result.tool_call_id,
                            name=result.name,
                            content=(
                                result.content[: self.config.max_tool_result_length]
                                + f"\n[OUTPUT TRUNCATED — original: {original_len} chars]"
                            ),
                            error=result.error,
                            duration_ms=result.duration_ms,
                        )

                tool_results.append(result)

                execution = ToolExecution(
                    tool_name=result.name,
                    tool_call_id=result.tool_call_id,
                    arguments=modified_args,
                    result=result.content if result.success else None,
                    error=result.error,
                    duration_ms=result.duration_ms,
                )
                state = state.with_tool_execution(execution)
                reasoning_step_tools.append(execution)

                if result.error:
                    _tool_errors_count += 1

                # Skip the per-slot ToolCompleteEvent yield in
                # completion mode — Phase 2 already streamed it the
                # moment this tool finished. Sequential mode (default)
                # still emits here so consumers see events in
                # tool_call order.
                if self.config.tool_event_order == "sequential":
                    yield ToolCompleteEvent(
                        tool_name=result.name,
                        tool_call_id=result.tool_call_id,
                        result=result.content if result.success else None,
                        error=result.error,
                        duration_ms=result.duration_ms,
                    )

                after_tool_event = await self._run_after_tool_hooks(
                    result.name,
                    result.content if result.success else None,
                    result.error,
                    tool_call_id=result.tool_call_id,
                    arguments=modified_args,
                )

                if after_tool_event.retry:
                    try:
                        retry_ctx_factory = ToolContextFactory(
                            run_id=state.run_id,
                            agent_id=state.agent_id,
                            iteration=state.iteration,
                            state=state,
                            invocation_metadata=metadata or {},
                        )
                        [result] = await self._executor.execute(
                            [tool_call.model_copy(update={"arguments": modified_args})],
                            self._tool_registry,
                            retry_ctx_factory,
                        )
                    except Exception as e:  # noqa: BLE001 — user tool bodies can raise anything; surface as ToolResult.error
                        result = ToolResult(
                            tool_call_id=tool_call.id,
                            name=tool_call.name,
                            content="",
                            error=str(e),
                            duration_ms=0.0,
                        )

                if result.name in self.config.verify_tools:
                    self._has_unverified_writes = True
                if result.name in self.config.verification_tools:
                    self._has_unverified_writes = False

            # Add tool results to messages
            for result in tool_results:
                state = state.with_message(Message.tool(result))

            # Inject verification reminder if write-like tools were used
            if self.config.verify_tools:
                tools_used = {e.tool_name for e in reasoning_step_tools}
                wrote = tools_used & self.config.verify_tools
                if wrote:
                    state = state.with_message(
                        Message.system(
                            "[Verification Reminder] You modified files/data. "
                            "Before completing, verify your changes:\n"
                            "- Run tests or checks if available\n"
                            "- Read back modified files to confirm correctness\n"
                            "- Fix any issues found\n"
                            "Do NOT call task_complete until verified."
                        )
                    )

            # Apply Reflexion if enabled
            if (
                self.config.reflexion
                and self.config.reflexion.enabled
                and self._reflector
                and state.iteration % self.config.reflexion.evaluate_every_n_iterations == 0
            ):
                reflect_event, state = await self._apply_reflexion(state, reasoning_step_tools)
                _reflexion_evals += 1
                yield reflect_event

                # Inject guidance when agent is stuck or looping
                if self.config.reflexion.include_guidance and reflect_event.guidance:
                    guidance = f"[Agent Self-Reflection]\n{reflect_event.guidance}"
                    # Add replan suggestion if planning is enabled and agent is stuck
                    if self.config.planning and reflect_event.assessment in (
                        "stuck",
                        "loop_detected",
                    ):
                        guidance += (
                            "\n\n[Replan] Your current approach isn't working. "
                            "Create a NEW plan with a different strategy, then execute it."
                        )
                    state = state.with_message(Message.system(guidance))

            # Record reasoning step
            reasoning_step = ReasoningStep(
                iteration=state.iteration,
                thought=response.message.content,
                tool_calls=list(response.message.tool_calls),
                tool_results=reasoning_step_tools,
                reflection=None,  # Will be updated if reflexion was applied
                confidence_delta=0.0,
            )
            state = state.with_reasoning_step(reasoning_step)

            # Checkpoint if enabled
            if (
                self.config.checkpointer
                and self.config.checkpoint_every_n_iterations > 0
                and state.iteration % self.config.checkpoint_every_n_iterations == 0
            ):
                _cp_thread = thread_id or state.run_id
                await self.config.checkpointer.save(
                    state,
                    _cp_thread,
                )
                from locus.observability.emit import (  # noqa: PLC0415
                    EV_CHECKPOINT_SAVED,
                    emit,
                )

                await emit(
                    EV_CHECKPOINT_SAVED,
                    thread_id=_cp_thread,
                    iteration=state.iteration,
                    backend=type(self.config.checkpointer).__name__,
                    trigger="every_n_iterations",
                )

    except Exception as e:
        # Emit error termination
        state = state.with_error(str(e))
        yield TerminateEvent(
            reason="error",
            iterations_used=state.iteration,
            final_confidence=state.confidence,
            total_tool_calls=len(state.tool_executions),
        )
        raise

    finally:
        # Clear cancel signal
        if self._cancel_signal is not None:
            self._cancel_signal.clear()

        # Save output to state if output_key configured
        if self.config.output_key:
            final_msg = ""
            for msg in reversed(state.messages):
                if msg.role.value == "assistant" and msg.content:
                    final_msg = msg.content
                    break
            if final_msg:
                state = state.with_metadata(self.config.output_key, final_msg)

        # Store final state for run_sync access
        self._last_run_state = state

        # Run hooks: after_invocation
        _duration_ms = (datetime.now(UTC) - started_at).total_seconds() * 1000  # noqa: F841
        await self._run_after_invocation_hooks(state, len(state.errors) == 0)

        # Extract and persist long-term memories from this session.
        if self._memory_manager is not None:
            await self._memory_manager.on_session_end(state)

        # Final checkpoint
        if self.config.checkpointer and thread_id:
            await self.config.checkpointer.save(state, thread_id)
            from locus.observability.emit import (  # noqa: PLC0415
                EV_CHECKPOINT_SAVED,
                emit,
            )

            await emit(
                EV_CHECKPOINT_SAVED,
                thread_id=thread_id,
                iteration=state.iteration,
                backend=type(self.config.checkpointer).__name__,
                trigger="final",
            )
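
The three-phase tool execution in the listing above can be hard to follow inline. The sketch below isolates the slot bookkeeping and the within-batch idempotent dedup. It is a simplified illustration, not the Locus implementation: `run_batch`, `ToolBody`, and the JSON-based dedup key are hypothetical names and choices, and plain `asyncio.gather` stands in for the executor's `execute_streaming`.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable

# Hypothetical stand-in for a tool body: takes (name, arguments), returns a result.
ToolBody = Callable[[str, dict], Awaitable[Any]]


async def run_batch(
    calls: list[tuple[str, dict]],
    idempotent_tools: set[str],
    body: ToolBody,
) -> list[Any]:
    """Classify calls, run the survivors concurrently, fold results in call order."""
    slots: list[dict[str, Any]] = []
    to_execute: list[int] = []
    seen: dict[tuple[str, str], int] = {}

    # Phase 1: serial classification. A same-args duplicate of an idempotent
    # tool becomes a reference to the first slot instead of a second execution.
    for name, args in calls:
        key = None
        if name in idempotent_tools:
            # Assumed key shape: tool name plus canonical JSON of the arguments.
            key = (name, json.dumps(args, sort_keys=True))
        if key is not None and key in seen:
            slots.append({"kind": "batch_cache_ref", "ref_slot": seen[key], "result": None})
            continue
        if key is not None:
            seen[key] = len(slots)
        slots.append({"kind": "execute", "name": name, "args": args, "result": None})
        to_execute.append(len(slots) - 1)

    # Phase 2: one concurrent batch over the surviving calls.
    results = await asyncio.gather(
        *(body(slots[i]["name"], slots[i]["args"]) for i in to_execute)
    )
    for i, result in zip(to_execute, results):
        slots[i]["result"] = result

    # Phase 3: fold in the original call order, copying referenced results.
    folded: list[Any] = []
    for slot in slots:
        if slot["kind"] == "batch_cache_ref":
            folded.append(slots[slot["ref_slot"]]["result"])
        else:
            folded.append(slot["result"])
    return folded
```

Calling `asyncio.run(run_batch(calls, {"submit"}, body))` with two identical `submit` calls should invoke `body` once for that pair and return the same result in both positions, while calls to tools outside `idempotent_tools` always execute.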

AgentConfig¶

AgentConfig ¶

Bases: BaseModel

Configuration for an Agent instance.

All parameters can be validated before agent creation.

validate_model `classmethod` ¶

validate_model(v: Any) -> Any

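Validate that a model string contains a ':' separator, as in 'provider:model'. Non-string values are assumed to be ModelProtocol instances and are returned unchanged.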

Source code in src/locus/agent/config.py

@field_validator("model", mode="before")
@classmethod
def validate_model(cls, v: Any) -> Any:
    """Validate model is a string or ModelProtocol."""
    if isinstance(v, str):
        if ":" not in v:
            raise ValueError(
                f"Model string must be 'provider:model', got: {v}. Example: 'openai:gpt-4o'"
            )
        return v
    # Assume it's a ModelProtocol instance
    return v
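
As a rough illustration of what this validator accepts and rejects, the sketch below mirrors the check in a standalone function. `check_model_spec` is a hypothetical stand-in written for this example, not part of Locus. Note that the check only looks for a colon; it does not verify that the provider or model actually exists.

```python
from typing import Any


def check_model_spec(value: Any) -> Any:
    """Hypothetical stand-in for the validator: strings need a colon."""
    if isinstance(value, str) and ":" not in value:
        raise ValueError(f"Model string must be 'provider:model', got: {value}")
    # Non-string values are assumed to be model instances and pass through.
    return value


accepted = check_model_spec("openai:gpt-4o")

try:
    check_model_spec("gpt-4o")  # no provider prefix
    rejected = False
except ValueError:
    rejected = True
```

Because only the colon is tested, a string such as `":"` would also pass this step, so any further provider lookup has to happen elsewhere.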

validate_tools `classmethod` ¶

validate_tools(v: Any) -> list[Any]

Ensure tools is a list.

Source code in src/locus/agent/config.py

@field_validator("tools", mode="before")
@classmethod
def validate_tools(cls, v: Any) -> list[Any]:
    """Ensure tools is a list."""
    if v is None:
        return []
    if not isinstance(v, list):
        return [v]
    return v
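
The normalization rules above can be sketched in isolation. `normalize_tools`, `search`, and `calculate` are hypothetical names used only for this example. One consequence worth seeing: because the check is `isinstance(v, list)`, a tuple of tools is wrapped as a single item rather than converted element by element.

```python
from typing import Any


def normalize_tools(value: Any) -> list[Any]:
    """Hypothetical stand-in that mirrors the validator's three branches."""
    if value is None:
        return []
    if not isinstance(value, list):
        return [value]
    return value


def search(query: str) -> str:
    return f"results for {query}"


def calculate(expr: str) -> str:
    return expr


none_case = normalize_tools(None)                    # missing tools become []
single_case = normalize_tools(search)                # a bare tool is wrapped
tuple_case = normalize_tools((search, calculate))    # the tuple itself is wrapped
```

Passing a real `list` is the safest way to supply more than one tool.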

with_reflexion ¶

with_reflexion(enabled: bool = True, confidence_threshold: float = 0.85, **kwargs: Any) -> AgentConfig

Return a copy with Reflexion configured.

Source code in src/locus/agent/config.py

def with_reflexion(
    self,
    enabled: bool = True,
    confidence_threshold: float = 0.85,
    **kwargs: Any,
) -> AgentConfig:
    """Return a copy with Reflexion configured."""
    return self.model_copy(
        update={
            "reflexion": ReflexionConfig(
                enabled=enabled,
                confidence_threshold=confidence_threshold,
                **kwargs,
            )
        }
    )

with_grounding ¶

with_grounding(enabled: bool = True, threshold: float = 0.65, **kwargs: Any) -> AgentConfig

Return a copy with Grounding configured.

Source code in src/locus/agent/config.py

def with_grounding(
    self,
    enabled: bool = True,
    threshold: float = 0.65,
    **kwargs: Any,
) -> AgentConfig:
    """Return a copy with Grounding configured."""
    return self.model_copy(
        update={
            "grounding": GroundingConfig(
                enabled=enabled,
                threshold=threshold,
                **kwargs,
            )
        }
    )

with_hooks ¶

with_hooks(*hooks: Any) -> AgentConfig

Return a copy with additional hooks.

Source code in src/locus/agent/config.py

def with_hooks(self, *hooks: Any) -> AgentConfig:
    """Return a copy with additional hooks."""
    return self.model_copy(update={"hooks": [*self.hooks, *hooks]})
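
`with_hooks` appends to the existing hooks instead of replacing them, and it leaves the original config untouched. The sketch below shows that behaviour with a frozen dataclass; `ConfigSketch` and the hook names are hypothetical stand-ins, and `dataclasses.replace` plays the role that `model_copy` plays in the real class.

```python
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class ConfigSketch:
    """Hypothetical stand-in for AgentConfig, reduced to the hooks field."""

    hooks: tuple[Any, ...] = ()

    def with_hooks(self, *hooks: Any) -> "ConfigSketch":
        # Build a new instance whose hooks are the old ones plus the new ones.
        return replace(self, hooks=(*self.hooks, *hooks))


base = ConfigSketch()
logged = base.with_hooks("log_hook")
traced = logged.with_hooks("trace_hook", "metrics_hook")
```

After these calls `base` still has no hooks, `logged` has one, and `traced` has all three in the order they were added.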

Reasoning configs¶

AgentConfig.reflexion, AgentConfig.grounding, and AgentConfig.gsar accept typed configuration objects defined alongside the config itself. The boolean shorthand (reflexion=True) coerces into a default instance. Use the explicit form when you want to tune thresholds, swap models, or override defaults.

ReflexionConfig ¶

Bases: BaseModel

Configuration for Reflexion reasoning pattern.

GroundingConfig ¶

Bases: BaseModel

Configuration for Grounding evaluation.

GSARConfig ¶

Bases: BaseModel

Configuration for the GSAR typed-grounding layer.

Wires the framework from arXiv:2604.23366 onto an Agent. When set on AgentConfig, the agent runs the configured judge over its final assistant message and tool-execution history after the loop completes; the resulting JudgeOutput (locus.reasoning.gsar_judge.JudgeOutput), scalar score S, and decision δ are surfaced on AgentResult (locus.agent.result.AgentResult).

This is a single-pass v1: the agent produces an answer, the judge scores it, and the result is exposed for the caller to act on. The full Algorithm-1 outer loop with regenerate / replan callbacks lives separately in locus.reasoning.gsar_evaluator; wire it explicitly when you want the loop dynamics.

AgentResult¶

AgentResult ¶

Bases: BaseModel

Result from an agent execution.

Contains the final message, state, and execution metrics.

success `property` ¶

success: bool

Whether execution completed successfully.

confidence `property` ¶

confidence: float

Final confidence score.

iterations `property` ¶

iterations: int

Number of iterations used.

text `property` ¶

text: str

Alias for message.

Many AI SDKs surface the final assistant text as .text; Locus's primary field is .message. Both names now work.

messages `property` ¶

messages: tuple[Message, ...]

All messages from the conversation.

tool_executions `property` ¶

tool_executions: tuple[ToolExecution, ...]

All tool executions.

reasoning_steps `property` ¶

reasoning_steps: tuple[ReasoningStep, ...]

All reasoning steps.

last_assistant_message `property` ¶

last_assistant_message: str | None

Get the last assistant message content.

parsed_as ¶

parsed_as(schema: type[T]) -> T

Return parsed cast to schema, with a runtime check.

Use this when you want a typed handle on the structured output without casting yourself:

picks = result.parsed_as(VendorList)
for v in picks.vendors:
    ...

Raises ValueError if parsed is None (parse failed or no schema configured) and TypeError if parsed is the wrong concrete type.

Source code in src/locus/agent/result.py

def parsed_as(self, schema: type[T]) -> T:
    """Return ``parsed`` cast to ``schema``, with a runtime check.

    Use this when you want a typed handle on the structured output without
    casting yourself::

        picks = result.parsed_as(VendorList)
        for v in picks.vendors:
            ...

    Raises ``ValueError`` if ``parsed`` is None (parse failed or no schema
    configured) and ``TypeError`` if ``parsed`` is the wrong concrete type.
    """
    if self.parsed is None:
        if self.parse_error:
            raise ValueError(f"AgentResult has no parsed output: {self.parse_error}")
        raise ValueError("AgentResult has no parsed output (no output_schema was configured)")
    if not isinstance(self.parsed, schema):
        raise TypeError(f"Expected {schema.__name__}, got {type(self.parsed).__name__}")
    return self.parsed
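
The two failure modes are easier to tell apart in a small example. The sketch below re-creates the method on a hypothetical `ResultSketch` dataclass (the real `AgentResult` is a Pydantic model with many more fields); `VendorList` and `vendor_names` are likewise illustrative names.

```python
from dataclasses import dataclass
from typing import Any, Optional, TypeVar

T = TypeVar("T")


@dataclass
class VendorList:
    vendors: list[str]


@dataclass
class ResultSketch:
    """Hypothetical stand-in carrying only the fields parsed_as reads."""

    parsed: Optional[Any] = None
    parse_error: Optional[str] = None

    def parsed_as(self, schema: type[T]) -> T:
        if self.parsed is None:
            raise ValueError(f"no parsed output: {self.parse_error}")
        if not isinstance(self.parsed, schema):
            raise TypeError(f"Expected {schema.__name__}, got {type(self.parsed).__name__}")
        return self.parsed


def vendor_names(result: ResultSketch) -> list[str]:
    """Treat a missing parse as 'no vendors'; let a type mismatch propagate."""
    try:
        return result.parsed_as(VendorList).vendors
    except ValueError:
        return []


found = vendor_names(ResultSketch(parsed=VendorList(vendors=["acme"])))
missing = vendor_names(ResultSketch(parse_error="invalid JSON"))

try:
    ResultSketch(parsed="plain text").parsed_as(VendorList)
    mismatch_raises = False
except TypeError:
    mismatch_raises = True
```

Catching only `ValueError` is a design choice in this sketch: a missing parse is often recoverable, while a `TypeError` usually means the caller asked for the wrong schema.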

to_dict ¶

to_dict() -> dict[str, Any]

Export result to dictionary.

Source code in src/locus/agent/result.py

def to_dict(self) -> dict[str, Any]:
    """Export result to dictionary."""
    return self.model_dump(mode="json")

from_state `classmethod` ¶

from_state(state: AgentState, stop_reason: StopReason, metrics: ExecutionMetrics | None = None, started_at: datetime | None = None, error: str | None = None, grounding_score: float | None = None, ungrounded_claims: list[str] | None = None, parsed: BaseModel | None = None, parse_error: str | None = None, message: str | None = None, gsar_judgment: Any = None, gsar_score: float | None = None, gsar_decision: str | None = None) -> AgentResult

Create a result from final state.

Extracts the final message from the last assistant response unless an explicit message is supplied (used after a structuring re-prompt).

Source code in src/locus/agent/result.py

@classmethod
def from_state(
    cls,
    state: AgentState,
    stop_reason: StopReason,
    metrics: ExecutionMetrics | None = None,
    started_at: datetime | None = None,
    error: str | None = None,
    grounding_score: float | None = None,
    ungrounded_claims: list[str] | None = None,
    parsed: BaseModel | None = None,
    parse_error: str | None = None,
    message: str | None = None,
    gsar_judgment: Any = None,
    gsar_score: float | None = None,
    gsar_decision: str | None = None,
) -> AgentResult:
    """
    Create a result from final state.

    Extracts the final message from the last assistant response unless an
    explicit ``message`` is supplied (used after a structuring re-prompt).
    """
    # Find the last assistant message if not provided
    final_message = message
    if final_message is None:
        final_message = ""
        for msg in reversed(state.messages):
            if msg.role.value == "assistant":
                final_message = msg.content or ""
                break

    return cls(
        message=final_message,
        state=state,
        stop_reason=stop_reason,
        metrics=metrics or ExecutionMetrics(),
        started_at=started_at or state.started_at,
        completed_at=datetime.now(UTC),
        error=error,
        grounding_score=grounding_score,
        ungrounded_claims=ungrounded_claims or [],
        parsed=parsed,
        parse_error=parse_error,
        gsar_judgment=gsar_judgment,
        gsar_score=gsar_score,
        gsar_decision=gsar_decision,
    )
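
One detail in the scan above is easy to miss: `from_state` stops at the last assistant message even when its content is empty, whereas the `output_key` handling in `Agent.run` skips assistant messages without content. The sketch below contrasts the two scans; `Msg` and both helper names are hypothetical, and roles are plain strings here instead of the enum used by the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    role: str
    content: Optional[str]


def last_assistant_any(messages: list[Msg]) -> str:
    """Mirror of the from_state scan: first assistant message from the end."""
    for msg in reversed(messages):
        if msg.role == "assistant":
            return msg.content or ""
    return ""


def last_assistant_nonempty(messages: list[Msg]) -> str:
    """Mirror of the output_key scan: skips assistant messages with no content."""
    for msg in reversed(messages):
        if msg.role == "assistant" and msg.content:
            return msg.content
    return ""


history = [
    Msg("user", "Which vendor is cheapest?"),
    Msg("assistant", "Draft answer: Acme."),
    Msg("assistant", None),  # e.g. a turn that carried no text
]

from_state_text = last_assistant_any(history)
output_key_text = last_assistant_nonempty(history)
```

With this history the first scan yields an empty string and the second yields the draft answer, so the two values can differ for the same run.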

Result sub-types¶

AgentResult.metrics is an ExecutionMetrics, AgentResult.stop_reason is a StopReason literal, and the streaming entry point yields StreamingResult between events.

ExecutionMetrics ¶

Bases: BaseModel

Metrics from agent execution.

tools_success_rate `property` ¶

tools_success_rate: float

Percentage of successful tool calls.

tokens_per_iteration `property` ¶

tokens_per_iteration: float

Average tokens per iteration.

StopReason `module-attribute` ¶

StopReason = Literal['complete', 'terminal_tool', 'confidence_met', 'max_iterations', 'tool_loop', 'no_tools', 'grounding_failed', 'token_budget', 'time_budget', 'interrupted', 'error', 'cancelled']
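
Because `StopReason` is a `Literal`, its allowed values can be read back at runtime with `typing.get_args`. The sketch below uses that to validate a reason string and to bucket it. The `NEEDS_ATTENTION` grouping and the `needs_attention` helper are assumptions made for illustration; the library does not define such a grouping in the text above.

```python
from typing import Literal, get_args

StopReason = Literal[
    "complete", "terminal_tool", "confidence_met", "max_iterations",
    "tool_loop", "no_tools", "grounding_failed", "token_budget",
    "time_budget", "interrupted", "error", "cancelled",
]

# Hypothetical grouping for illustration only; adjust to your own policy.
NEEDS_ATTENTION = frozenset({"error", "cancelled", "interrupted", "grounding_failed"})


def needs_attention(reason: str) -> bool:
    """Reject unknown reasons, then report whether the run needs a follow-up."""
    if reason not in get_args(StopReason):
        raise ValueError(f"Unknown stop reason: {reason!r}")
    return reason in NEEDS_ATTENTION
```

A call such as `needs_attention(result.stop_reason)` would then let calling code branch on the outcome without hard-coding the full list of reasons.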

StreamingResult ¶

Bases: BaseModel

Partial result during streaming.

Used to provide intermediate state during agent execution.

AgentState¶

AgentState ¶

Bases: BaseModel

Immutable state for an agent execution.

All updates return a new state instance (functional updates).
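
In practice, "functional updates" means the return value of every `with_*` call must be kept; calling a method and discarding the result changes nothing. A minimal sketch of the pattern, using a frozen dataclass as a hypothetical stand-in for the Pydantic model (`StateSketch` is not a Locus class):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StateSketch:
    """Hypothetical stand-in with two of the state's fields."""

    messages: tuple[str, ...] = ()
    iteration: int = 0

    def with_message(self, message: str) -> "StateSketch":
        return replace(self, messages=(*self.messages, message))

    def next_iteration(self) -> "StateSketch":
        return replace(self, iteration=self.iteration + 1)


s0 = StateSketch()
s0.with_message("dropped")                      # result discarded: s0 is unchanged
s1 = s0.with_message("hello").next_iteration()  # updates chain because each returns a state
```

Earlier instances stay valid after an update, which is what makes it safe to hold on to a state for checkpointing while the loop continues.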

has_tool_loop `property` ¶

has_tool_loop: bool

Check if agent is stuck in a tool loop across iterations.

Multiple calls to the same tool in one turn (parallel execution) are normal. A loop is the same call signature (name and arguments) repeating across consecutive iterations. The same name with different arguments (paged discovery, sweeping inputs, retrying with a corrected parameter) counts as forward progress and is not a loop.
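
The distinction between a loop and forward progress can be sketched as a comparison of call signatures per iteration. Everything below is an assumption made for illustration: the `window` of three iterations, the JSON-based signature, and the function names are not taken from the Locus implementation.

```python
import json
from typing import Any

Call = tuple[str, dict[str, Any]]  # (tool name, arguments)


def signature(call: Call) -> str:
    """Name plus canonically ordered arguments."""
    name, args = call
    return f"{name}:{json.dumps(args, sort_keys=True)}"


def looks_like_loop(iterations: list[list[Call]], window: int = 3) -> bool:
    """True when the last `window` iterations issued identical call signatures."""
    if len(iterations) < window:
        return False
    # A set per iteration, so duplicate parallel calls in one turn count once.
    recent = [frozenset(signature(c) for c in calls) for calls in iterations[-window:]]
    return bool(recent[0]) and all(sigs == recent[0] for sigs in recent)


paged = [[("list_files", {"page": n})] for n in (1, 2, 3)]
stuck = [[("list_files", {"page": 1})] for _ in range(3)]
parallel = [[("get_item", {"id": 1}), ("get_item", {"id": 1})]]

paged_is_loop = looks_like_loop(paged)
stuck_is_loop = looks_like_loop(stuck)
parallel_is_loop = looks_like_loop(parallel)
```

Here only `stuck` is flagged: `paged` changes its arguments every iteration, and `parallel` repeats a call within a single turn only.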

last_tool_calls `property` ¶

last_tool_calls: list[ToolCall]

Get tool calls from the last assistant message.

called_terminal_tool `property` ¶

called_terminal_tool: bool

Check if a terminal tool was called.

should_terminate `property` ¶

should_terminate: tuple[bool, str | None]

Check if the agent should terminate.

In "auto" mode: stops on confidence, no_tools, tool_loop, or terminal_tool. In "explicit" mode: only stops on terminal_tool, max_iterations, or budgets. Use "explicit" for multi-step tasks that require verification before completion.

Returns:

Type	Description
`tuple[bool, str \| None]`	Tuple of (should_stop, reason)
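
A rough sketch of how the two modes differ, written as a plain function. It is an assumption-laden illustration, not the library's logic: the order of the checks, the flag names, and the treatment of `max_iterations` as applying in both modes are guesses, and the token and time budgets are left out entirely.

```python
from typing import Optional


def should_terminate_sketch(
    mode: str,
    *,
    called_terminal_tool: bool = False,
    confidence_met: bool = False,
    no_tool_calls: bool = False,
    tool_loop: bool = False,
    iteration: int = 0,
    max_iterations: int = 20,
) -> tuple[bool, Optional[str]]:
    """Return (should_stop, reason) under the assumptions stated above."""
    # Checks assumed to apply regardless of mode.
    if called_terminal_tool:
        return True, "terminal_tool"
    if iteration >= max_iterations:
        return True, "max_iterations"
    # Heuristic stops apply only in "auto" mode.
    if mode == "auto":
        if confidence_met:
            return True, "confidence_met"
        if tool_loop:
            return True, "tool_loop"
        if no_tool_calls:
            return True, "no_tools"
    return False, None


auto_stop = should_terminate_sketch("auto", no_tool_calls=True)
explicit_continue = should_terminate_sketch("explicit", no_tool_calls=True, confidence_met=True)
explicit_stop = should_terminate_sketch("explicit", called_terminal_tool=True)
```

The same signals that end an "auto" run leave an "explicit" run going until a terminal tool is called or a hard limit is hit.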

total_tokens `property` ¶

total_tokens: int

Total tokens used. Returns real count if tracked, else char/4 estimate.

with_message ¶

with_message(message: Message) -> AgentState

Add a message to the conversation.

Source code in src/locus/core/state.py

def with_message(self, message: Message) -> AgentState:
    """Add a message to the conversation."""
    return self.model_copy(
        update={
            "messages": (*self.messages, message),
            "updated_at": datetime.now(UTC),
        }
    )

with_messages ¶

with_messages(messages: list[Message]) -> AgentState

Add multiple messages to the conversation.

Source code in src/locus/core/state.py

def with_messages(self, messages: list[Message]) -> AgentState:
    """Add multiple messages to the conversation."""
    return self.model_copy(
        update={
            "messages": (*self.messages, *messages),
            "updated_at": datetime.now(UTC),
        }
    )

with_iteration ¶

with_iteration(iteration: int) -> AgentState

Update the current iteration.

Source code in src/locus/core/state.py

def with_iteration(self, iteration: int) -> AgentState:
    """Update the current iteration."""
    return self.model_copy(
        update={
            "iteration": iteration,
            "updated_at": datetime.now(UTC),
        }
    )

next_iteration ¶

next_iteration() -> AgentState

Increment iteration counter.

Source code in src/locus/core/state.py

def next_iteration(self) -> AgentState:
    """Increment iteration counter."""
    return self.with_iteration(self.iteration + 1)

with_provider_state ¶

with_provider_state(provider_state: dict[str, Any] | None) -> AgentState

Replace the provider continuation state.

Server-stateful transports (e.g. OCIResponsesModel) return a continuation token in ModelResponse.provider_state; the agent calls this to thread the token into the next turn.

Source code in src/locus/core/state.py

def with_provider_state(self, provider_state: dict[str, Any] | None) -> AgentState:
    """Replace the provider continuation state.

    Server-stateful transports (e.g. ``OCIResponsesModel``) return
    a continuation token in ``ModelResponse.provider_state``; the
    agent calls this to thread the token into the next turn.
    """
    return self.model_copy(
        update={
            "provider_state": provider_state,
            "updated_at": datetime.now(UTC),
        }
    )

with_tool_execution ¶

with_tool_execution(execution: ToolExecution) -> AgentState

Record a tool execution.

Source code in src/locus/core/state.py

def with_tool_execution(self, execution: ToolExecution) -> AgentState:
    """Record a tool execution."""
    return self.model_copy(
        update={
            "tool_executions": (*self.tool_executions, execution),
            "tool_history": (*self.tool_history, execution.tool_name),
            "updated_at": datetime.now(UTC),
        }
    )

with_reasoning_step ¶

with_reasoning_step(step: ReasoningStep) -> AgentState

Add a reasoning step to the trace.

Source code in src/locus/core/state.py

def with_reasoning_step(self, step: ReasoningStep) -> AgentState:
    """Add a reasoning step to the trace."""
    return self.model_copy(
        update={
            "reasoning_steps": (*self.reasoning_steps, step),
            "updated_at": datetime.now(UTC),
        }
    )

with_confidence ¶

with_confidence(confidence: float) -> AgentState

Update confidence score.

Source code in src/locus/core/state.py

def with_confidence(self, confidence: float) -> AgentState:
    """Update confidence score."""
    clamped = max(0.0, min(1.0, confidence))
    return self.model_copy(
        update={
            "confidence": clamped,
            "confidence_history": (*self.confidence_history, clamped),
            "updated_at": datetime.now(UTC),
        }
    )

adjust_confidence ¶

adjust_confidence(delta: float, diminishing: bool = True) -> AgentState

Adjust confidence with optional diminishing returns.

Parameters:

Name	Type	Description	Default
`delta`	`float`	Raw confidence adjustment (-1.0 to 1.0)	required
`diminishing`	`bool`	If True, positive deltas are scaled by (1 - current_confidence)	`True`

Source code in src/locus/core/state.py

def adjust_confidence(self, delta: float, diminishing: bool = True) -> AgentState:
    """
    Adjust confidence with optional diminishing returns.

    Args:
        delta: Raw confidence adjustment (-1.0 to 1.0)
        diminishing: If True, positive deltas are scaled by (1 - current_confidence)
    """
    if diminishing and delta > 0:
        # Diminishing returns: harder to increase confidence as it gets higher
        effective_delta = delta * (1.0 - self.confidence)
    else:
        effective_delta = delta

    return self.with_confidence(self.confidence + effective_delta)
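
A short worked example makes the diminishing-returns rule concrete. The helper below reproduces the arithmetic of `adjust_confidence` and the clamping in `with_confidence` as free functions; the function names are illustrative.

```python
def clamp(value: float) -> float:
    """Keep confidence inside [0.0, 1.0], as with_confidence does."""
    return max(0.0, min(1.0, value))


def adjust(current: float, delta: float, diminishing: bool = True) -> float:
    """Positive deltas are scaled by the remaining headroom (1 - current)."""
    if diminishing and delta > 0:
        effective = delta * (1.0 - current)
    else:
        effective = delta
    return clamp(current + effective)


# Three identical +0.4 adjustments starting from 0.5:
#   0.5   + 0.4 * 0.5  = 0.7
#   0.7   + 0.4 * 0.3  = 0.82
#   0.82  + 0.4 * 0.18 = 0.892
trace = [0.5]
for _ in range(3):
    trace.append(adjust(trace[-1], 0.4))

drop = adjust(0.9, -0.3)                       # negative deltas apply in full: about 0.6
floor = adjust(0.2, -0.5)                      # clamped at 0.0
linear = adjust(0.9, 0.4, diminishing=False)   # clamped at 1.0
```

Each repeated positive adjustment buys less than the one before, so confidence approaches 1.0 without reaching it, while a negative delta takes effect at full size.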

with_error ¶

with_error(error: str) -> AgentState

Record an error.

Source code in src/locus/core/state.py

def with_error(self, error: str) -> AgentState:
    """Record an error."""
    return self.model_copy(
        update={
            "errors": (*self.errors, error),
            "updated_at": datetime.now(UTC),
        }
    )

with_metadata ¶

with_metadata(key: str, value: Any) -> AgentState

Set a metadata value.

Source code in src/locus/core/state.py

def with_metadata(self, key: str, value: Any) -> AgentState:
    """Set a metadata value."""
    return self.model_copy(
        update={
            "metadata": {**self.metadata, key: value},
            "updated_at": datetime.now(UTC),
        }
    )

with_token_usage ¶

with_token_usage(prompt_tokens: int, completion_tokens: int, cache_creation_tokens: int = 0, cache_read_tokens: int = 0) -> AgentState

Record token usage from a model response.

cache_creation_tokens and cache_read_tokens are populated only when Anthropic returns prompt-cache stats on the response usage (i.e., the AnthropicModel was configured with prompt_cache=True). Default 0 for other providers.

Source code in src/locus/core/state.py

def with_token_usage(
    self,
    prompt_tokens: int,
    completion_tokens: int,
    cache_creation_tokens: int = 0,
    cache_read_tokens: int = 0,
) -> AgentState:
    """Record token usage from a model response.

    ``cache_creation_tokens`` and ``cache_read_tokens`` are populated
    only when Anthropic returns prompt-cache stats on the response
    usage (i.e., the AnthropicModel was configured with
    ``prompt_cache=True``). Default 0 for other providers.
    """
    return self.model_copy(
        update={
            "total_tokens_used": self.total_tokens_used + prompt_tokens + completion_tokens,
            "prompt_tokens_used": self.prompt_tokens_used + prompt_tokens,
            "completion_tokens_used": self.completion_tokens_used + completion_tokens,
            "cache_creation_tokens_used": (
                self.cache_creation_tokens_used + cache_creation_tokens
            ),
            "cache_read_tokens_used": self.cache_read_tokens_used + cache_read_tokens,
            "updated_at": datetime.now(UTC),
        }
    )
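
As the update dictionary shows, the running total is the sum of prompt and completion tokens; the cache counters are tracked in their own fields and are not added to it. A small sketch of the same bookkeeping over several responses, with a hypothetical `Usage` record and `fold_usage` helper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Hypothetical per-response usage record."""

    prompt: int
    completion: int
    cache_creation: int = 0
    cache_read: int = 0


def fold_usage(usages: list[Usage]) -> dict[str, int]:
    """Accumulate counters the way with_token_usage does, one response at a time."""
    totals = {"total": 0, "prompt": 0, "completion": 0, "cache_creation": 0, "cache_read": 0}
    for usage in usages:
        totals["total"] += usage.prompt + usage.completion  # cache counts excluded
        totals["prompt"] += usage.prompt
        totals["completion"] += usage.completion
        totals["cache_creation"] += usage.cache_creation
        totals["cache_read"] += usage.cache_read
    return totals


totals = fold_usage([
    Usage(prompt=1200, completion=300, cache_creation=1000),
    Usage(prompt=1250, completion=150, cache_read=1000),
])
```

For these two responses the total is 2900 tokens, with 1000 tokens recorded separately under each cache counter.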

to_checkpoint ¶

to_checkpoint() -> dict[str, Any]

Serialize state for checkpointing.

Source code in src/locus/core/state.py

def to_checkpoint(self) -> dict[str, Any]:
    """Serialize state for checkpointing."""
    return self.model_dump(mode="json")

from_checkpoint `classmethod` ¶

from_checkpoint(data: dict[str, Any]) -> AgentState

Restore state from checkpoint.

Source code in src/locus/core/state.py

@classmethod
def from_checkpoint(cls, data: dict[str, Any]) -> AgentState:
    """Restore state from checkpoint."""
    return cls.model_validate(data)
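
The checkpoint pair is a plain serialize and validate round trip. The sketch below imitates it with the standard library only, so it is an approximation: `StateSketch` is a hypothetical stand-in, and the manual tuple conversion in `from_checkpoint` stands in for the coercion that Pydantic's `model_validate` performs on the real model.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class StateSketch:
    """Hypothetical stand-in with two of the state's fields."""

    messages: tuple[str, ...] = ()
    iteration: int = 0

    def to_checkpoint(self) -> dict[str, Any]:
        # Passing through JSON leaves only JSON-safe types (tuples become lists).
        return json.loads(json.dumps(asdict(self)))

    @classmethod
    def from_checkpoint(cls, data: dict[str, Any]) -> "StateSketch":
        return cls(messages=tuple(data["messages"]), iteration=data["iteration"])


original = StateSketch(messages=("hello", "world"), iteration=2)
payload = original.to_checkpoint()
restored = StateSketch.from_checkpoint(payload)
```

The payload holds a list where the state holds a tuple, yet the restored state compares equal to the original, which is the property a checkpointer backend relies on.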

State sub-types¶

AgentState.tool_executions is a tuple of ToolExecution, AgentState.reasoning_steps is a tuple of ReasoningStep. Both are surfaced on AgentResult via the matching properties.

ToolExecution ¶

Bases: BaseModel

Record of a single tool execution.

success `property` ¶

success: bool

Whether the execution succeeded.

ReasoningStep ¶

Bases: BaseModel

A single step in the agent's reasoning trace.

Composition¶

The composition helpers chain or fan-out multiple agents while keeping the same AgentResult shape at the boundary. See the Multi-agent composition page for the full pipeline classes; the functional builders below are the ergonomic entry points.

PipelineResult ¶

Bases: BaseModel

Result from a pipeline execution.

sequential ¶

sequential(*agents: Any, prompt_template: str | None = None) -> SequentialPipeline

Create a sequential pipeline from agents.

Parameters:

Name	Type	Description	Default
`*agents`	`Any`	Agents to run in order	`()`
`prompt_template`	`str \| None`	Optional template for passing output between agents	`None`

Source code in src/locus/agent/composition.py

def sequential(*agents: Any, prompt_template: str | None = None) -> SequentialPipeline:
    """Create a sequential pipeline from agents.

    Args:
        *agents: Agents to run in order
        prompt_template: Optional template for passing output between agents
    """
    kwargs: dict[str, Any] = {"agents": list(agents)}
    if prompt_template:
        kwargs["prompt_template"] = prompt_template
    return SequentialPipeline(**kwargs)

parallel ¶

parallel(*agents: Any, merge_strategy: str = 'concatenate') -> ParallelPipeline

Create a parallel pipeline from agents.

Parameters:

Name	Type	Description	Default
`*agents`	`Any`	Agents to run concurrently	`()`
`merge_strategy`	`str`	How to merge results ('concatenate' or 'last')	`'concatenate'`

Source code in src/locus/agent/composition.py

def parallel(*agents: Any, merge_strategy: str = "concatenate") -> ParallelPipeline:
    """Create a parallel pipeline from agents.

    Args:
        *agents: Agents to run concurrently
        merge_strategy: How to merge results ('concatenate' or 'last')
    """
    return ParallelPipeline(agents=list(agents), merge_strategy=merge_strategy)
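
To show the difference in data flow between `sequential` and `parallel`, the sketch below models agents as async functions from text to text. This is an illustration under several assumptions: the real pipelines take Agent instances, the way `'concatenate'` joins outputs is not specified above (a newline is assumed here), and `prompt_template` is omitted. All names in the block are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

AgentFn = Callable[[str], Awaitable[str]]


async def run_sequential(agents: list[AgentFn], prompt: str) -> str:
    """Each agent receives the previous agent's output."""
    text = prompt
    for agent in agents:
        text = await agent(text)
    return text


async def run_parallel(agents: list[AgentFn], prompt: str, merge_strategy: str = "concatenate") -> str:
    """Every agent receives the same prompt; outputs are merged afterwards."""
    outputs = await asyncio.gather(*(agent(prompt) for agent in agents))
    if merge_strategy == "last":
        return outputs[-1]
    return "\n".join(outputs)  # assumed separator


async def upper(text: str) -> str:
    return text.upper()


async def exclaim(text: str) -> str:
    return text + "!"


chained = asyncio.run(run_sequential([upper, exclaim], "hi"))
fanned = asyncio.run(run_parallel([upper, exclaim], "hi"))
last_only = asyncio.run(run_parallel([upper, exclaim], "hi", merge_strategy="last"))
```

In the chained run the second agent sees the first agent's output, while in the fanned run both agents see the original prompt.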

loop ¶

loop(agent: Any, condition: Callable[[str], bool], max_loops: int = 5, loop_prompt: str | None = None) -> LoopAgent

Create a loop agent.

Parameters:

Name	Type	Description	Default
`agent`	`Any`	Agent to run repeatedly	required
`condition`	`Callable[[str], bool]`	Function returning True when loop should stop	required
`max_loops`	`int`	Maximum iterations	`5`
`loop_prompt`	`str \| None`	Template for loop iteration prompts	`None`

Source code in src/locus/agent/composition.py

def loop(
    agent: Any,
    condition: Callable[[str], bool],
    max_loops: int = 5,
    loop_prompt: str | None = None,
) -> LoopAgent:
    """Create a loop agent.

    Args:
        agent: Agent to run repeatedly
        condition: Function returning True when loop should stop
        max_loops: Maximum iterations
        loop_prompt: Template for loop iteration prompts
    """
    kwargs: dict[str, Any] = {
        "agent": agent,
        "condition": condition,
        "max_loops": max_loops,
    }
    if loop_prompt:
        kwargs["loop_prompt"] = loop_prompt
    return LoopAgent(**kwargs)
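
Note the direction of `condition`: it returns True when the loop should stop, not when it should continue. A minimal sketch of that control flow with plain synchronous callables follows; `run_loop` is a hypothetical helper, and feeding each output back in as the next input is an assumption, since the role of `loop_prompt` is not spelled out above.

```python
from typing import Callable


def run_loop(
    agent: Callable[[str], str],
    prompt: str,
    condition: Callable[[str], bool],
    max_loops: int = 5,
) -> tuple[str, int]:
    """Run the agent until condition(output) is True or max_loops is reached."""
    output = prompt
    rounds = 0
    for rounds in range(1, max_loops + 1):
        output = agent(output)
        if condition(output):  # True means "done"
            break
    return output, rounds


def append_x(text: str) -> str:
    return text + "x"


stopped_early = run_loop(append_x, "", lambda out: len(out) >= 3)
hit_cap = run_loop(append_x, "", lambda out: False)
```

The first call stops after three rounds because the condition is met; the second never satisfies the condition and stops at the `max_loops` cap of five.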
