Agent Facade API

The cognitia.agent module provides a high-level API for building AI agents in 3-5 lines of code.

Overview

from cognitia import Agent, AgentConfig, tool

agent = Agent(AgentConfig(runtime="thin"))
result = await agent.query("Hello!")
print(result.text)

AgentConfig

Frozen dataclass with all agent configuration:

from cognitia import AgentConfig

config = AgentConfig(
    runtime="thin",                    # "thin" | "claude_sdk" | "deepagents" | "cli"
    model="sonnet",                    # model alias or full ID
    system_prompt="You are helpful.",  # system prompt
    tools=(my_tool,),                  # tuple of @tool-decorated functions
    middleware=(tracker, guard),        # middleware chain
    max_turns=10,                      # max conversation turns
    permission_mode="bypassPermissions",  # SDK permission mode
    cwd="/path/to/project",           # working directory for tools
    output_format={"type": "object"},  # JSON Schema for structured output
    mcp_servers={"my_server": config}, # MCP server configs
)

All fields have sensible defaults. Only runtime is typically required.

Agent

query(prompt) -> Result

One-shot request. Applies middleware chain, executes through runtime, collects result.

result = await agent.query("What is 2+2?")
print(result.text)           # "4"
print(result.ok)             # True
print(result.total_cost_usd) # 0.001
print(result.usage)          # {"input_tokens": 10, "output_tokens": 5}

stream(prompt) -> AsyncIterator

Streaming mode. Yields events as they arrive from the runtime.

async for event in agent.stream("Write a poem"):
    if event.type == "text_delta":
        print(event.text, end="", flush=True)
    elif event.type == "tool_use_start":
        print(f"\n[Using tool: {event.tool_name}]")
    elif event.type == "done":
        print("\n[Done]")

Event types: text_delta, tool_use_start, tool_use_result, done, error.

conversation(session_id=None) -> Conversation

Create a multi-turn conversation with persistent context.

conv = agent.conversation()

# Or as async context manager (auto-cleanup)
async with agent.conversation() as conv:
    r1 = await conv.say("My name is Alice")
    r2 = await conv.say("What's my name?")
    print(r2.text)  # "Your name is Alice."
    print(conv.history)  # list of Message objects

cleanup()

Release resources (runtime subprocess, adapters).

await agent.cleanup()

# Or use as context manager
async with Agent(config) as agent:
    result = await agent.query("Hello")
# cleanup called automatically

Result

Frozen dataclass returned by query() and conversation.say():

@dataclass(frozen=True)
class Result:
    text: str = ""
    session_id: str | None = None
    total_cost_usd: float | None = None
    usage: dict[str, Any] | None = None
    structured_output: Any = None
    error: str | None = None

    @property
    def ok(self) -> bool:
        return self.error is None

@tool Decorator

Define tools with automatic JSON Schema inference from type hints:

from cognitia import tool

@tool(name="weather", description="Get current weather")
async def get_weather(city: str, units: str = "celsius") -> str:
    # city is required (str -> {"type": "string"})
    # units is optional (has default)
    return f"Weather in {city}: 22 {units}"

Auto-inferred types

Python Type   JSON Schema Type
str           "string"
int           "integer"
float         "number"
bool          "boolean"

Parameters without defaults go into required. Parameters with a default value (as with units above) are excluded from required.
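The inference rules above can be sketched in plain Python. This is a hypothetical re-implementation for illustration (the real cognitia logic may differ in details such as handling of unknown annotations):

```python
import inspect
from typing import Any

# Assumed mapping, mirroring the table above.
_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def infer_schema(fn) -> dict[str, Any]:
    """Build a JSON Schema object from a function's signature and type hints."""
    props: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": _TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default -> required
    return {"type": "object", "properties": props, "required": required}

async def get_weather(city: str, units: str = "celsius") -> str:
    return f"Weather in {city}: 22 {units}"

schema = infer_schema(get_weather)
# schema["required"] is ["city"]; both parameters map to {"type": "string"}
```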

Explicit schema

Override auto-inference with a custom schema:

custom_schema = {
    "type": "object",
    "properties": {"query": {"type": "string", "maxLength": 200}},
    "required": ["query"],
}

@tool(name="search", description="Search", schema=custom_schema)
async def search(query: str) -> str:
    return "results"

ToolDefinition

The @tool decorator attaches a ToolDefinition to the function:

td = get_weather.__tool_definition__
td.name          # "weather"
td.description   # "Get current weather"
td.parameters    # {"type": "object", "properties": {...}, "required": [...]}
td.handler       # reference to the original async function
td.to_tool_spec()  # convert to cognitia ToolSpec for runtime

Middleware

Middleware intercepts the request/response lifecycle:

from cognitia.agent import Middleware

class LoggingMiddleware(Middleware):
    async def before_query(self, prompt: str, config) -> str:
        print(f"Query: {prompt}")
        return prompt  # can modify prompt

    async def after_result(self, result) -> Result:
        print(f"Result: {result.text[:50]}")
        return result  # can modify result

Built-in: CostTracker

Tracks cumulative cost and blocks queries when budget exceeded:

from cognitia.agent import CostTracker

tracker = CostTracker(budget_usd=5.0)
agent = Agent(AgentConfig(middleware=(tracker,)))

result = await agent.query("Hello")
print(tracker.total_cost_usd)  # 0.002

If a request pushes the cumulative spend above the configured budget, CostTracker raises BudgetExceededError.
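The budget check can be sketched as follows. This is a hypothetical, self-contained model of the behavior described above (the class name CostTrackerSketch and the exact error message are illustrative; only CostTracker and BudgetExceededError are names from the docs):

```python
class BudgetExceededError(RuntimeError):
    """Raised when cumulative spend exceeds the configured budget."""

class CostTrackerSketch:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.total_cost_usd = 0.0

    def record(self, cost_usd: float) -> None:
        """Accumulate a request's cost; raise once the budget is exceeded."""
        self.total_cost_usd += cost_usd
        if self.total_cost_usd > self.budget_usd:
            raise BudgetExceededError(
                f"spent {self.total_cost_usd:.4f} USD, budget {self.budget_usd:.4f}"
            )

tracker = CostTrackerSketch(budget_usd=0.005)
tracker.record(0.002)  # ok, total 0.002
tracker.record(0.002)  # ok, total 0.004
# tracker.record(0.002) would push the total to 0.006 and raise
```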

Built-in: SecurityGuard

Blocks prompts containing sensitive patterns:

from cognitia.agent import SecurityGuard

guard = SecurityGuard(
    block_patterns=["password", "api_key", "secret"],
)
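The pattern check amounts to substring matching against the prompt. A minimal sketch, assuming case-insensitive matching and a raised error on a hit (both assumptions; the docs do not specify how a blocked prompt is reported, and PromptBlockedError is a hypothetical name):

```python
class PromptBlockedError(ValueError):
    """Hypothetical error for a prompt that matches a blocked pattern."""

def check_prompt(prompt: str, block_patterns: list[str]) -> str:
    """Reject the prompt if any blocked substring appears; otherwise pass it through."""
    lowered = prompt.lower()
    for pattern in block_patterns:
        if pattern.lower() in lowered:
            raise PromptBlockedError(f"blocked pattern: {pattern!r}")
    return prompt

check_prompt("What's the weather?", ["password", "api_key", "secret"])  # passes
```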

Built-in: ToolOutputCompressor

Compresses large tool outputs between turns. Content-type aware: JSON arrays are truncated, HTML tags are stripped, plain text uses head+tail strategy.

from cognitia.agent import ToolOutputCompressor

compressor = ToolOutputCompressor(max_result_chars=10000)
agent = Agent(AgentConfig(middleware=(compressor,)))

Integrates with HookRegistry via on_post_tool_use callback.
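The head+tail strategy for plain text can be sketched as below. This is an illustrative model only; the real compressor is content-type aware and also handles JSON arrays and HTML, and the exact truncation marker is an assumption:

```python
def compress_text(text: str, max_chars: int = 10_000) -> str:
    """Keep the start and end of an oversized tool result, eliding the middle."""
    marker = "\n...[truncated]...\n"
    if len(text) <= max_chars:
        return text  # already within budget
    half = (max_chars - len(marker)) // 2
    return text[:half] + marker + text[-half:]
```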

build_middleware_stack()

Factory function for common middleware combinations:

from cognitia.agent import build_middleware_stack

stack = build_middleware_stack(
    cost_tracker=True,
    tool_compressor=True,
    security_guard=True,
    budget_usd=5.0,
    blocked_patterns=["rm -rf"],
    max_result_chars=10000,
)

agent = Agent(AgentConfig(middleware=stack))

Parameter         Type        Default  Description
cost_tracker      bool        False    Enable CostTracker
tool_compressor   bool        True     Enable ToolOutputCompressor
security_guard    bool        False    Enable SecurityGuard
max_result_chars  int         10000    Max chars for tool output
budget_usd        float       0.0      Cost budget limit
blocked_patterns  list[str]   []       Patterns to block

Middleware chain order

Middleware executes in declaration order for both before_query and after_result:

# before_query: mw1 -> mw2 -> mw3
# after_result: mw1 -> mw2 -> mw3
config = AgentConfig(middleware=(mw1, mw2, mw3))
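The declaration-order dispatch can be sketched with a standalone chain runner. This is a simplified model (the config argument from the real before_query signature is dropped for brevity, and the dispatcher normally lives inside Agent); each middleware appends its label so the order is visible:

```python
import asyncio

class Tag:
    """Toy middleware that tags the prompt with its label."""
    def __init__(self, label: str):
        self.label = label

    async def before_query(self, prompt: str) -> str:
        return f"{prompt}|{self.label}"

async def run_before_query(middleware, prompt: str) -> str:
    for mw in middleware:  # declaration order: mw1 -> mw2 -> mw3
        prompt = await mw.before_query(prompt)
    return prompt

result = asyncio.run(run_before_query((Tag("mw1"), Tag("mw2"), Tag("mw3")), "hi"))
# result is "hi|mw1|mw2|mw3"
```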

Conversation

Multi-turn dialog management with history tracking.

say(message) -> Result

Send a message and get a response:

async with agent.conversation() as conv:
    r = await conv.say("Hello")
    print(r.text)

stream(message) -> AsyncIterator

Stream a response in a conversation:

async with agent.conversation() as conv:
    async for event in conv.stream("Tell me a story"):
        if event.type == "text_delta":
            print(event.text, end="")

Properties

conv.session_id  # unique session identifier
conv.history     # list[Message] - accumulated messages

Runtime behavior

  • claude_sdk: warm subprocess, continues conversation natively
  • thin/deepagents: accumulated messages sent each turn via AgentRuntime.run()