# Letta Adapter

Stateful agents with persistent memory, inner monologue, and self-editing capabilities.
## Installation

```bash
pip install openharness-letta
```

This installs both `openharness` (the Python SDK) and `letta-client` (the official Letta SDK).
## Quick Start

```python
from openharness_letta import LettaAdapter
from openharness_letta.types import LettaAgentConfig, MemoryBlock
from openharness.types import ExecuteRequest

# Connect to Letta Cloud
adapter = LettaAdapter(api_key="letta-...")

# Or connect to a local Letta server
adapter = LettaAdapter(base_url="http://localhost:8283")

# Create an agent with memory
agent_id = await adapter.create_agent(LettaAgentConfig(
    name="my-assistant",
    model="openai/gpt-4o-mini",
    memory_blocks=[
        MemoryBlock(label="human", value="Name: Alice"),
        MemoryBlock(label="persona", value="I am a helpful assistant."),
    ],
))

# Execute a prompt
result = await adapter.execute(
    ExecuteRequest(message="Hello!", agent_id=agent_id)
)
print(result.output)
```
## Capabilities

### Memory-First Architecture

Letta's architecture centers on persistent memory blocks that survive across conversations. Agents can read and modify their own memory, enabling long-term learning and personalization.

### Supported
| Domain | Capability | Notes |
|---|---|---|
| Agents | create_agent() | Full lifecycle with memory blocks |
| Agents | get_agent() | Retrieve agent details |
| Agents | list_agents() | List all agents |
| Agents | delete_agent() | Delete agent and memory |
| Execution | execute() | Sync execution with memory |
| Execution | execute_stream() | Streaming with events |
| Memory | memory.get_blocks() | List all memory blocks |
| Memory | memory.update_block() | Update block content |
| Memory | memory.add_block() | Add new memory block |
| Memory | memory.delete_block() | Remove memory block |
| Tools | list_tools() | List registered tools |
| Tools | register_tool() | Register custom tools |
### Not Supported
| Domain | Reason | Workaround |
|---|---|---|
| Sessions | Agents maintain state | Use agents for persistent conversations |
| MCP | Not in Letta | Register tools directly |
| Skills | Not in Letta | Register tools |
| Subagents | Not supported | Create separate agents |
| Files | Limited support | Use archival memory |
| Hooks | Not supported | Wrap adapter methods |
## Memory Management

Letta's memory system is organized into "blocks": persistent storage that the agent can read from and write to during conversations.

### Standard Memory Blocks

- `human` - Information about the user (name, preferences, history)
- `persona` - The agent's personality, instructions, and self-description
- `system` - System-level instructions (optional)
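Block values are plain strings, so the `Name: Alice` convention from the Quick Start is just a formatting choice. A small hypothetical helper (not part of the SDK) for building a `human` block value from structured facts:

```python
def render_human_block(facts: dict[str, str]) -> str:
    """Render user facts as 'Key: value' lines for a `human` memory block."""
    return "\n".join(f"{key}: {value}" for key, value in facts.items())

value = render_human_block({"Name": "Alice", "Preferences": "Concise responses"})
print(value)
# Name: Alice
# Preferences: Concise responses
```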
### Working with Memory

```python
# Get the memory manager
memory = adapter.memory

# View current memory blocks
blocks = await memory.get_blocks(agent_id)
for block in blocks:
    print(f"{block.label}: {block.value[:50]}...")

# Update a memory block
await memory.update_block(
    agent_id,
    "human",
    "Name: Alice\nPreferences: Concise responses\nExpertise: Python"
)

# Add a custom memory block
await memory.add_block(
    agent_id,
    MemoryBlock(
        label="project",
        value="Working on OpenHarness - a universal API for AI harnesses"
    )
)
```
### Self-Editing Memory
Letta agents can modify their own memory during conversations. When the agent learns something about the user or updates its own understanding, it can persist this to memory blocks automatically.
## Streaming

```python
from openharness.types import ExecuteRequest

async for event in adapter.execute_stream(
    ExecuteRequest(message="Explain recursion step by step", agent_id=agent_id)
):
    if event.type == "text":
        print(event.content, end="")
    elif event.type == "thinking":
        print(f"[Thinking: {event.thinking}]")
    elif event.type == "tool_call_start":
        print(f"\n[Tool: {event.name}]")
    elif event.type == "done":
        print(f"\n\nTokens: {event.usage.total_tokens}")
```
### Event Types

| Event | Description |
|---|---|
| `text` | Text content chunk |
| `thinking` | Agent's inner monologue |
| `tool_call_start` | Tool invocation beginning |
| `tool_result` | Tool execution result |
| `tool_call_end` | Tool invocation complete |
| `error` | Error occurred |
| `done` | Execution complete with usage stats |
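When a stream handles many event types, a dispatch table can be tidier than an `if`/`elif` chain. A sketch with a stand-in `Event` class (field names mirror the table above; the real stream events come from `execute_stream`):

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Event:
    # Stand-in for the adapter's stream events.
    type: str
    payload: dict[str, Any] = field(default_factory=dict)

def make_dispatcher(
    handlers: dict[str, Callable[[Event], None]]
) -> Callable[[Event], None]:
    """Route each event to its handler; unknown types are ignored."""
    def dispatch(event: Event) -> None:
        handler = handlers.get(event.type)
        if handler:
            handler(event)
    return dispatch

seen: list[str] = []
dispatch = make_dispatcher({
    "text": lambda e: seen.append(e.payload["content"]),
    "error": lambda e: seen.append(f"error: {e.payload['message']}"),
})
for ev in [
    Event("text", {"content": "Hello"}),
    Event("thinking", {"thinking": "..."}),  # no handler registered: dropped
    Event("error", {"message": "timeout"}),
]:
    dispatch(ev)
print(seen)  # ['Hello', 'error: timeout']
```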
## Agent Lifecycle

```python
from openharness_letta.types import LettaAgentConfig, MemoryBlock

# Create an agent
config = LettaAgentConfig(
    name="coding-assistant",
    model="openai/gpt-4o",
    memory_blocks=[
        MemoryBlock(label="human", value="Developer working on Python projects"),
        MemoryBlock(label="persona", value="Expert Python developer assistant"),
    ],
    tools=["web_search"],
    include_base_tools=True,
)
agent_id = await adapter.create_agent(config)

# Get agent details
agent = await adapter.get_agent(agent_id)
print(agent["name"], agent["model"])

# List all agents
agents = await adapter.list_agents()
for a in agents:
    print(a["id"], a["name"])

# Delete agent when done
await adapter.delete_agent(agent_id)
```
## Configuration

### Adapter Options

```python
adapter = LettaAdapter(
    api_key="letta-...",    # Letta cloud API key
    base_url="http://...",  # Local server URL (alternative)
    timeout=60.0,           # Request timeout in seconds
)
```
### Agent Configuration

```python
class LettaAgentConfig:
    name: str | None = None
    model: str = "openai/gpt-4o-mini"
    embedding_model: str = "openai/text-embedding-ada-002"
    memory_blocks: list[MemoryBlock] = []
    tools: list[str] = []
    system_prompt: str | None = None
    include_base_tools: bool = True
    metadata: dict[str, Any] = {}
```
### Environment Variables

- `LETTA_API_KEY` - Letta cloud API key
## Letta-Specific Features

### Inner Monologue

Letta agents have an "inner monologue": their thinking process is visible. It is included in streaming events by default:
```python
async for event in adapter.execute_stream(request, include_thinking=True):
    if event.type == "thinking":
        print(f"Agent thinking: {event.thinking}")
```
### Archival Memory

For long-term storage beyond the context window, Letta provides archival memory with semantic search. This allows agents to store and retrieve information that doesn't fit in the active memory blocks.
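To illustrate the retrieval idea only (Letta's actual archival memory uses embeddings, not this), here is a toy scorer that ranks stored passages by keyword overlap with a query:

```python
def score(query: str, passage: str) -> float:
    """Crude relevance: fraction of query words that appear in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words) / len(query_words) if query_words else 0.0

archive = [
    "Alice prefers concise answers with code samples",
    "The project uses Python 3.11 and asyncio",
    "Deployment runs on a local Letta server",
]
query = "what python version does the project use"
best = max(archive, key=lambda passage: score(query, passage))
print(best)  # The project uses Python 3.11 and asyncio
```

Real semantic search matches on meaning rather than exact words, which is why archival memory is configured with an `embedding_model`.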
### Multi-Model Support

Letta supports multiple model providers:

- `openai/gpt-4o`, `openai/gpt-4o-mini`
- `anthropic/claude-3-opus`, `anthropic/claude-3-sonnet`
- Many more via Letta's provider integrations
## Running Letta Locally

```bash
# Install Letta server
pip install letta

# Start the server
letta server

# Server starts at http://localhost:8283
```