Getting Started¶
Installation¶
Core (protocols, types, in-memory providers)¶
pip install cognitia
With a runtime¶
pip install cognitia[thin] # Built-in lightweight multi-provider runtime
pip install cognitia[claude] # Claude Agent SDK runtime (subprocess + MCP)
pip install cognitia[deepagents] # DeepAgents runtime baseline (native graph + Anthropic path)
cognitia[thin] bundles the Anthropic, OpenAI-compatible, and Google SDK paths used by ThinRuntime.
DeepAgents provider overrides are installed separately:
pip install cognitia[deepagents] langchain-openai openai
pip install cognitia[deepagents] langchain-google-genai
With storage¶
pip install cognitia[postgres] # PostgreSQL memory provider
pip install cognitia[sqlite] # SQLite memory provider
With web tools¶
pip install cognitia[web] # Base web fetch (httpx)
pip install cognitia[web-duckduckgo] # DuckDuckGo search (no API key)
pip install cognitia[web-tavily] # Tavily AI search
pip install cognitia[web-jina] # Jina Reader (URL → markdown)
pip install cognitia[web-crawl4ai] # Crawl4AI (Playwright-based)
With sandbox¶
Everything (for development)¶
Quick Start: cognitia init (recommended)¶
The fastest way to start a new project — scaffold a full agent in 10 seconds:
pip install cognitia[cli]
cognitia init my-agent
cd my-agent
cp .env.example .env # add your ANTHROPIC_API_KEY
pip install -e .
python agent.py "Hello!"
Options:
cognitia init my-agent # minimal (thin runtime, in-memory)
cognitia init my-agent --runtime claude # Claude Agent SDK
cognitia init my-agent --memory sqlite # persistent SQLite memory
cognitia init my-agent --full # all features + Docker setup
cognitia init my-agent --output ./projects # custom output directory
Generated structure:
my-agent/
├── agent.py          ← main entry point (runnable immediately)
├── config.yaml       ← agent configuration (runtime, memory, tools)
├── tests/
│   └── test_agent.py ← starter test
├── .env.example      ← API key template
├── pyproject.toml    ← project metadata
└── README.md         ← usage instructions
# (--full adds: Dockerfile, docker-compose.yml, skills/)
Quick Start: Agent Facade (simplest)¶
The fastest way to get started without scaffolding — 3 lines of code:
from cognitia import Agent, AgentConfig
agent = Agent(AgentConfig(system_prompt="You are a helpful assistant.", runtime="thin"))
result = await agent.query("What is the capital of France?")
print(result.text) # "The capital of France is Paris."
That's it. No config files, no project structure — just an agent that works.
Credentials and Environment Variables¶
Before using a live provider, decide which runtime/provider path you want and set credentials accordingly:
- thin reads provider credentials from the current shell environment
- claude_sdk can use either local Claude login state or an explicit ANTHROPIC_API_KEY
- deepagents uses provider-specific LangChain credentials
- cli forwards credentials to the wrapped CLI via shell env or CliConfig.env
Fast examples:
# Thin + Anthropic
export ANTHROPIC_API_KEY=sk-ant-...
# Thin + OpenRouter
export OPENAI_API_KEY=sk-or-...
# DeepAgents + OpenRouter (OpenAI-compatible path)
export OPENAI_API_KEY=sk-or-...
export OPENAI_BASE_URL=https://openrouter.ai/api/v1
If you use the high-level AgentConfig facade, note that the portable runtimes (thin, deepagents) currently read credentials from the process environment; AgentConfig.env is primarily for claude_sdk.
Step-by-Step Guide¶
1. Custom Tools¶
Define tools as async Python functions. Cognitia auto-infers JSON Schema from type hints:
from cognitia import Agent, AgentConfig, tool
@tool(name="weather", description="Get current weather for a city")
async def get_weather(city: str, units: str = "celsius") -> str:
    # In production, call a real weather API here
    return f"Weather in {city}: 22 {units}"

agent = Agent(AgentConfig(
    system_prompt="You are a weather assistant.",
    runtime="thin",
    tools=(get_weather,),
))
result = await agent.query("What's the weather in Paris?")
print(result.text) # "The weather in Paris is 22 celsius."
Type mapping: str → "string", int → "integer", float → "number", bool → "boolean". Parameters with defaults are optional in the schema.
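For intuition, the inference described above can be reproduced with a small standalone sketch (illustrative only, not the actual cognitia internals):

```python
import inspect

# Illustrative type-to-JSON-Schema mapping, mirroring the table above
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def infer_schema(func):
    """Build a JSON Schema parameters object from a function's type hints."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value => required parameter
    return {"type": "object", "properties": properties, "required": required}

async def get_weather(city: str, units: str = "celsius") -> str:
    return f"Weather in {city}: 22 {units}"

print(infer_schema(get_weather))
# {'type': 'object', 'properties': {'city': {'type': 'string'},
#  'units': {'type': 'string'}}, 'required': ['city']}
```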
2. Streaming¶
Get tokens as they arrive from the model:
agent = Agent(AgentConfig(system_prompt="You are a writer.", runtime="thin"))
async for event in agent.stream("Write a haiku about Python"):
    if event.type == "text_delta":
        print(event.text, end="", flush=True)
    elif event.type == "tool_use_start":
        print(f"\n[Tool: {event.tool_name}]")
Event types: text_delta, tool_use_start, tool_use_result, done, error. Use attributes like event.type, event.text, event.tool_name.
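You can exercise the same dispatch loop without a live model by feeding it a stubbed event stream (the Event class here is a stand-in for illustration, not the cognitia event type):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Event:
    # Stand-in for the runtime's event object; only the fields used below
    type: str
    text: str = ""
    tool_name: str = ""

async def fake_stream():
    # Stand-in for agent.stream(...): yields the documented event types
    yield Event(type="text_delta", text="Hello, ")
    yield Event(type="tool_use_start", tool_name="weather")
    yield Event(type="text_delta", text="world")
    yield Event(type="done")

async def collect() -> str:
    chunks = []
    async for event in fake_stream():
        if event.type == "text_delta":
            chunks.append(event.text)
        elif event.type == "tool_use_start":
            chunks.append(f"[Tool: {event.tool_name}]")
    return "".join(chunks)

print(asyncio.run(collect()))  # Hello, [Tool: weather]world
```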
3. Multi-Turn Conversation¶
Maintain context across turns:
async with agent.conversation() as conv:
    r1 = await conv.say("My name is Alice")
    r2 = await conv.say("What's my name?")
    print(r2.text)  # "Your name is Alice."

    # Streaming in conversation
    async for event in conv.stream("Tell me a joke"):
        if event.type == "text_delta":
            print(event.text, end="", flush=True)
4. Structured Output¶
Force the model to return validated data using a Pydantic model:
from pydantic import BaseModel
class UserInfo(BaseModel):
    name: str
    age: int

from cognitia.runtime.structured_output import extract_pydantic_schema

agent = Agent(AgentConfig(
    system_prompt="Extract user info from text.",
    runtime="thin",
    output_format=extract_pydantic_schema(UserInfo),
))
result = await agent.query("John is 30 years old")
print(result.structured_output) # UserInfo(name='John', age=30)
You can also use a raw JSON Schema dict via output_format= for simpler cases without Pydantic.
See Structured Output for nested models, retry logic, and low-level API.
5. Middleware¶
Intercept requests and responses for cost tracking, security, logging:
from cognitia.agent import CostTracker, SecurityGuard
tracker = CostTracker(budget_usd=5.0)
guard = SecurityGuard(block_patterns=["password", "secret", "api_key"])
agent = Agent(AgentConfig(
    system_prompt="You are a helpful assistant.",
    runtime="thin",
    middleware=(tracker, guard),
))
result = await agent.query("Hello!")
print(tracker.total_cost_usd) # 0.002
If a turn pushes the cumulative spend above the configured budget, CostTracker raises BudgetExceededError.
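The enforcement rule (raise once cumulative spend crosses the budget) can be sketched standalone; the class and error names mirror the docs, but the implementation is illustrative, not cognitia's:

```python
class BudgetExceededError(Exception):
    pass

class MiniCostTracker:
    """Illustrative cumulative-spend tracker, not the cognitia implementation."""
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.total_cost_usd = 0.0

    def record(self, turn_cost_usd: float) -> None:
        # Accumulate first, then check: the turn that crosses the line raises
        self.total_cost_usd += turn_cost_usd
        if self.total_cost_usd > self.budget_usd:
            raise BudgetExceededError(
                f"spent ${self.total_cost_usd:.4f} of ${self.budget_usd:.3f} budget"
            )

tracker = MiniCostTracker(budget_usd=0.005)
tracker.record(0.002)  # within budget
try:
    tracker.record(0.004)  # pushes the total past 0.005
except BudgetExceededError as e:
    print("blocked:", e)
```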
You can write custom middleware by extending the Middleware base class:
from cognitia.agent import Middleware
class LoggingMiddleware(Middleware):
    async def before_query(self, prompt: str, config) -> str:
        print(f"→ {prompt}")
        return prompt

    async def after_result(self, result) -> "Result":
        print(f"← {result.text[:50]}")
        return result
6. Switching Runtimes¶
Same code, different execution engines. Switch with one config change:
# Development: fast, no subprocess
agent = Agent(AgentConfig(system_prompt="...", runtime="thin"))
# Production: full Claude ecosystem with MCP
agent = Agent(AgentConfig(system_prompt="...", runtime="claude_sdk"))
# Experiments: DeepAgents graph runtime
agent = Agent(AgentConfig(system_prompt="...", runtime="deepagents"))
Or via environment variable:
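For example (the variable name below is an assumption, not confirmed by this page; check the Configuration reference for the exact name):

```shell
# Hypothetical variable name -- verify against the Configuration docs
export COGNITIA_RUNTIME=deepagents
```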
6.1 DeepAgents: portable first¶
If you want the smallest migration gap between claude_sdk and deepagents, start with portable mode:
agent = Agent(AgentConfig(
    system_prompt="You are a helpful assistant.",
    runtime="deepagents",
    feature_mode="portable",
))
result = await agent.query("What is 2+2?")
print(result.text)
Feature modes:
- portable — tested parity baseline for query(), stream(), conversation()
- hybrid — portable core + DeepAgents native built-ins/store seams
- native_first — prefer DeepAgents native built-ins and graph behavior
Practical note: the baseline cognitia[deepagents] extra is Anthropic-ready. For OpenAI or Google provider paths, install the provider bridge package separately. If you enable native built-ins, also pass an explicit native_config["backend"]; Cognitia now fails fast instead of silently falling back to DeepAgents StateBackend. For tool-heavy Gemini built-ins, prefer portable mode unless you are explicitly testing native provider behavior.
7. Model Selection¶
Use human-friendly aliases for any supported provider:
# Anthropic
agent = Agent(AgentConfig(runtime="thin", model="sonnet")) # Claude Sonnet 4
agent = Agent(AgentConfig(runtime="thin", model="opus")) # Claude Opus 4
agent = Agent(AgentConfig(runtime="thin", model="haiku")) # Claude Haiku 3
# OpenAI (via base_url or thin runtime)
agent = Agent(AgentConfig(runtime="thin", model="gpt-4o"))
# Google
agent = Agent(AgentConfig(runtime="thin", model="gemini"))
# DeepSeek
agent = Agent(AgentConfig(runtime="thin", model="r1"))
8. Resource Cleanup¶
Always clean up when done:
# Option 1: async context manager (recommended)
async with Agent(config) as agent:
result = await agent.query("Hello")
# cleanup called automatically
# Option 2: explicit cleanup
agent = Agent(config)
try:
result = await agent.query("Hello")
finally:
await agent.cleanup()
Advanced: CognitiaStack¶
For production applications that need memory, sandbox, web tools, planning, and MCP skills — use CognitiaStack:
Project Structure¶
your_app/
├── prompts/
│   ├── identity.md       # Agent personality
│   ├── guardrails.md     # Security constraints
│   ├── role_router.yaml  # Auto role-switching rules
│   ├── role_skills.yaml  # Role → tools/skills mapping
│   └── roles/
│       └── assistant.md  # Per-role prompts
├── skills/               # MCP skills (optional)
│   └── my_skill/
│       ├── skill.yaml
│       └── INSTRUCTION.md
└── main.py
Minimal Stack¶
from pathlib import Path
from cognitia.bootstrap.stack import CognitiaStack
from cognitia.runtime.types import RuntimeConfig
from cognitia.todo.inmemory_provider import InMemoryTodoProvider
stack = CognitiaStack.create(
    prompts_dir=Path("prompts"),
    skills_dir=Path("skills"),
    project_root=Path("."),
    runtime_config=RuntimeConfig(runtime_name="thin", model="sonnet"),
    todo_provider=InMemoryTodoProvider(user_id="user-1", topic_id="general"),
    thinking_enabled=True,
)
Full-Featured Stack¶
from pathlib import Path
from cognitia.bootstrap.stack import CognitiaStack
from cognitia.runtime.types import RuntimeConfig
from cognitia.tools.sandbox_local import LocalSandboxProvider
from cognitia.tools.types import SandboxConfig
from cognitia.tools.web_httpx import HttpxWebProvider
from cognitia.todo.inmemory_provider import InMemoryTodoProvider
from cognitia.memory_bank.fs_provider import FilesystemMemoryBankProvider
from cognitia.memory_bank.types import MemoryBankConfig
sandbox = LocalSandboxProvider(SandboxConfig(
    root_path="/data/sandbox",
    user_id="user-1",
    topic_id="project-1",
    timeout_seconds=30,
    denied_commands=frozenset({"rm", "sudo"}),
))

memory = FilesystemMemoryBankProvider(
    MemoryBankConfig(enabled=True, root_path=Path("/data/memory")),
    user_id="user-1",
    topic_id="project-1",
)

stack = CognitiaStack.create(
    prompts_dir=Path("prompts"),
    skills_dir=Path("skills"),
    project_root=Path("."),
    runtime_config=RuntimeConfig(runtime_name="thin", model="sonnet"),
    sandbox_provider=sandbox,
    web_provider=HttpxWebProvider(timeout=30),
    todo_provider=InMemoryTodoProvider(user_id="user-1", topic_id="project-1"),
    memory_bank_provider=memory,
    thinking_enabled=True,
    allowed_system_tools={"bash", "read", "write", "edit"},
)
Running the Stack¶
from cognitia.runtime.types import Message
# Create runtime
runtime = stack.runtime_factory.create(
    runtime_name="thin",
    config=stack.runtime_config,
)

# Run a query
messages = [Message(role="user", content="Help me analyze this project")]
async for event in runtime.run(
    messages=messages,
    system_prompt="You are a helpful assistant.",
    active_tools=list(stack.capability_specs.values()),
):
    if event.type == "assistant_delta":
        print(event.data["text"], end="")
    elif event.type == "tool_call_started":
        print(f"\n[Tool: {event.data['name']}]")
    elif event.type == "final":
        new_messages = event.data["new_messages"]
Note: When using runtime.run() directly, events have raw RuntimeEvent types (assistant_delta, tool_call_started, tool_call_finished, final). When using Agent.stream(), these are adapted to text_delta, tool_use_start, tool_use_result, done — see the Streaming section above.
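That adaptation amounts to a fixed rename of event types; a sketch of the mapping (type names taken from the note above, implementation illustrative):

```python
# Raw RuntimeEvent type -> Agent.stream() event type, per the note above
RUNTIME_TO_FACADE = {
    "assistant_delta": "text_delta",
    "tool_call_started": "tool_use_start",
    "tool_call_finished": "tool_use_result",
    "final": "done",
}

def adapt_event_type(raw_type: str) -> str:
    """Translate a raw runtime event type to its facade equivalent."""
    return RUNTIME_TO_FACADE.get(raw_type, raw_type)

print(adapt_event_type("assistant_delta"))  # text_delta
print(adapt_event_type("error"))            # error (passed through unchanged)
```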
9. Cost Budget¶
Track LLM spending and enforce limits using the middleware API:
from cognitia.agent import Agent, AgentConfig, CostTracker
tracker = CostTracker(budget_usd=5.0)
agent = Agent(AgentConfig(
    system_prompt="You are a helpful assistant.",
    runtime="thin",
    middleware=(tracker,),
))
result = await agent.query("Hello!")
print(f"Total cost: ${tracker.total_cost_usd:.4f}")
For lower-level control, see CostBudget and CostTracker in Production Safety.
10. Guardrails¶
Pre- and post-LLM content checks via RuntimeConfig:
from cognitia.guardrails import ContentLengthGuardrail, RegexGuardrail
# Guardrails are applied at the RuntimeConfig level
length_guard = ContentLengthGuardrail(max_length=8000)
regex_guard = RegexGuardrail(patterns=[r"ignore previous instructions"])
# Check content before sending to LLM
result = await length_guard.check("Some user input here")
print(result.passed) # True if within limits
11. Sessions¶
Persist session state across restarts:
from cognitia.session.backends import SqliteSessionBackend, MemoryScope, scoped_key
backend = SqliteSessionBackend(db_path="sessions.db")
key = scoped_key(MemoryScope.AGENT, "user:42:session:abc")
await backend.save(key, {"turn": 7, "role": "coach"})
See Sessions for InMemorySessionBackend, custom backends, and SessionManager integration.
12. Observability¶
Event bus and tracing for runtime instrumentation:
from cognitia.observability.event_bus import InMemoryEventBus
from cognitia.observability.tracer import ConsoleTracer, TracingSubscriber
bus = InMemoryEventBus()
tracer = ConsoleTracer()
subscriber = TracingSubscriber(bus, tracer)
subscriber.attach()
# Subscribe to specific events
await bus.subscribe("llm_call_end", lambda data: print(f"LLM call: {data}"))
# Fire events (ThinRuntime does this automatically)
await bus.publish("llm_call_end", {"model": "sonnet", "tokens": 150})
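Conceptually the bus is a topic-to-subscribers map; a minimal in-process sketch (not the cognitia implementation) shows the publish/subscribe flow:

```python
import asyncio
from collections import defaultdict

class MiniEventBus:
    """Illustrative pub/sub bus: a topic -> list-of-handlers map."""
    def __init__(self):
        self._subs = defaultdict(list)

    async def subscribe(self, topic, handler):
        self._subs[topic].append(handler)

    async def publish(self, topic, data):
        # Invoke every handler for the topic; await async handlers
        for handler in self._subs[topic]:
            result = handler(data)
            if asyncio.iscoroutine(result):
                await result

async def demo():
    bus = MiniEventBus()
    seen = []
    await bus.subscribe("llm_call_end", seen.append)
    await bus.publish("llm_call_end", {"model": "sonnet", "tokens": 150})
    return seen

print(asyncio.run(demo()))  # [{'model': 'sonnet', 'tokens': 150}]
```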
See Observability for custom tracers and event subscriptions.
13. UI Projection¶
Convert RuntimeEvent streams into UI-friendly state for frontends:
from cognitia.ui.projection import ChatProjection, project_stream
from cognitia.runtime.types import RuntimeEvent
projection = ChatProjection()

# project_stream wraps an async event iterator into UIState updates
async def demo_events():
    yield RuntimeEvent.assistant_delta(text="Hello, ")
    yield RuntimeEvent.assistant_delta(text="world!")
    yield RuntimeEvent.final(text="Hello, world!", new_messages=[])

async for ui_state in project_stream(demo_events(), projection):
    for msg in ui_state.messages:
        print(msg.blocks)  # [TextBlock(text="Hello, world!")]
See UI Projection for custom projections and UIState.to_dict() serialization.
14. RAG¶
Inject relevant documents into LLM context using RagInputFilter:
from cognitia.rag import Document, SimpleRetriever, RagInputFilter
from cognitia.runtime.types import Message
docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Python was created by Guido van Rossum."),
]
retriever = SimpleRetriever(documents=docs)
rag_filter = RagInputFilter(retriever=retriever, top_k=2)
messages = [Message(role="user", content="What is the capital of France?")]
filtered_msgs, enriched_prompt = await rag_filter.filter(messages, "You are helpful.")
print(enriched_prompt) # System prompt with relevant docs injected
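The retrieval step itself can be illustrated with a standalone keyword scorer; SimpleRetriever's actual ranking may well differ:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    # Stand-in for cognitia's Document, for illustration only
    content: str

def retrieve(query: str, docs: list[Doc], top_k: int = 2) -> list[Doc]:
    """Rank documents by the count of lowercase words shared with the query."""
    q_words = set(query.lower().split())
    scored = [(len(q_words & set(d.content.lower().split())), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep the top_k documents that matched at least one word
    return [d for score, d in scored[:top_k] if score > 0]

docs = [
    Doc("Paris is the capital of France."),
    Doc("Python was created by Guido van Rossum."),
]
hits = retrieve("What is the capital of France?", docs, top_k=1)
print(hits[0].content)  # Paris is the capital of France.
```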
See RAG for custom retrievers (Pinecone, pgvector) and filter chain integration.
What's New in v1.0.0¶
- CLI Runtime — subprocess-based runtime with NDJSON protocol (CliAgentRuntime, see example19_cli_runtime.py)
- Multi-Agent — agent-as-tool composition, priority task queues, agent registry with lifecycle management (examples 21-23)
- Workflow Graphs — declarative graphs with conditions, loops, parallel branches, and human-in-the-loop interrupts (WorkflowGraph, example20_workflow_graph.py)
- RAG — retrieval-augmented generation with pluggable retrievers and RagInputFilter (example08_rag.py)
- 27 runnable examples — from basics to complex multi-agent scenarios, see Examples
Next Steps¶
- Agent Facade API — full reference for Agent, AgentConfig, @tool, Result, Conversation, Middleware
- Runtimes — Claude SDK vs ThinRuntime vs DeepAgents vs CLI: comparison, switching, capabilities
- Capabilities — sandbox, web, todo, memory bank, planning, thinking
- Memory Providers — InMemory, PostgreSQL, SQLite: 8 protocols, summarization
- Tools & Skills — @tool decorator, MCP skills (YAML), tool policy
- Web Tools — search providers (DuckDuckGo, Brave, Tavily, SearXNG), fetch providers
- Configuration — CognitiaStack, RuntimeConfig, ToolPolicy, environment variables
- Orchestration — planning mode, subagents, team mode, agent-as-tool, task queues, workflow graphs
- Structured Output — Pydantic validation, retry on failure, nested models
- Production Safety — cost budgets, guardrails, input filters, retry/fallback
- Sessions — session backends, memory scopes, persistence
- Observability — event bus, tracing, custom tracers
- UI Projection — RuntimeEvent to UIState for frontends
- RAG — retrieval-augmented generation, custom retrievers, filter chains
- Runtime Registry — custom runtimes, entry point plugins
- Architecture — Clean Architecture layers, protocols, design principles
- Examples — 27 runnable examples from basics to complex scenarios