Installation

pip install openharness-letta

This will install both openharness (the Python SDK) and letta-client (the official Letta SDK).

Quick Start

from openharness_letta import LettaAdapter
from openharness_letta.types import LettaAgentConfig, MemoryBlock
from openharness.types import ExecuteRequest

# Connect to Letta cloud
adapter = LettaAdapter(api_key="letta-...")

# Or connect to local Letta server
adapter = LettaAdapter(base_url="http://localhost:8283")

# Create an agent with memory
agent_id = await adapter.create_agent(LettaAgentConfig(
    name="my-assistant",
    model="openai/gpt-4o-mini",
    memory_blocks=[
        MemoryBlock(label="human", value="Name: Alice"),
        MemoryBlock(label="persona", value="I am a helpful assistant."),
    ],
))

# Execute a prompt
result = await adapter.execute(
    ExecuteRequest(message="Hello!", agent_id=agent_id)
)
print(result.output)

Capabilities

Memory-First Architecture

Letta's architecture centers on persistent memory blocks that survive across conversations. Agents can read and modify their own memory, enabling long-term learning and personalization.
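To make the idea concrete, here is a toy, self-contained sketch of what "memory blocks that survive across conversations" means: labeled values that persist between sessions. This is purely illustrative (a JSON file stands in for Letta's server-side storage); it is not the adapter's implementation.

```python
import json
import os
import tempfile

class BlockStore:
    """Toy illustration of Letta-style memory blocks: labeled values
    that persist across sessions (here, persisted to a JSON file)."""

    def __init__(self, path):
        self.path = path
        self.blocks = {}
        if os.path.exists(path):
            with open(path) as f:
                self.blocks = json.load(f)

    def update(self, label, value):
        self.blocks[label] = value
        with open(self.path, "w") as f:
            json.dump(self.blocks, f)

# One "conversation" writes a fact about the user...
path = os.path.join(tempfile.mkdtemp(), "blocks.json")
store = BlockStore(path)
store.update("human", "Name: Alice")

# ...and a later session reloads it from disk.
store2 = BlockStore(path)
print(store2.blocks["human"])  # → Name: Alice
```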

Supported

| Domain | Capability | Notes |
| --- | --- | --- |
| Agents | create_agent() | Full lifecycle with memory blocks |
| Agents | get_agent() | Retrieve agent details |
| Agents | list_agents() | List all agents |
| Agents | delete_agent() | Delete agent and memory |
| Execution | execute() | Sync execution with memory |
| Execution | execute_stream() | Streaming with events |
| Memory | memory.get_blocks() | List all memory blocks |
| Memory | memory.update_block() | Update block content |
| Memory | memory.add_block() | Add new memory block |
| Memory | memory.delete_block() | Remove memory block |
| Tools | list_tools() | List registered tools |
| Tools | register_tool() | Register custom tools |

Not Supported

| Domain | Reason | Workaround |
| --- | --- | --- |
| Sessions | Agents maintain state | Use agents for persistent conversations |
| MCP | Not in Letta | Register tools directly |
| Skills | Not in Letta | Register tools |
| Subagents | Not supported | Create separate agents |
| Files | Limited support | Use archival memory |
| Hooks | Not supported | Wrap adapter methods |
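The "wrap adapter methods" workaround for hooks can be sketched as a small async wrapper. The example below is a generic pattern, not part of the openharness API; a stand-in coroutine plays the role of adapter.execute so the snippet is self-contained.

```python
import asyncio
import functools

def with_hooks(fn, before=None, after=None):
    """Wrap an async method with optional pre- and post-hooks."""
    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        if before:
            before(*args, **kwargs)
        result = await fn(*args, **kwargs)
        if after:
            after(result)
        return result
    return wrapper

# Stand-in for adapter.execute, just for illustration.
async def fake_execute(request):
    return f"echo: {request}"

calls = []
hooked = with_hooks(
    fake_execute,
    before=lambda req: calls.append(("before", req)),
    after=lambda res: calls.append(("after", res)),
)

result = asyncio.run(hooked("Hello!"))
print(result)  # → echo: Hello!
print(calls)   # → [('before', 'Hello!'), ('after', 'echo: Hello!')]
```

The same wrapper would apply unchanged to a real adapter method, e.g. `with_hooks(adapter.execute, before=log_request)`.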

Memory Management

Letta's memory system is organized into "blocks": persistent storage that the agent can read from and write to during conversations.

Standard Memory Blocks

  • human - Information about the user (name, preferences, history)
  • persona - The agent's personality, instructions, and self-description
  • system - System-level instructions (optional)

Working with Memory

# Get the memory manager
memory = adapter.memory

# View current memory blocks
blocks = await memory.get_blocks(agent_id)
for block in blocks:
    print(f"{block.label}: {block.value[:50]}...")

# Update a memory block
await memory.update_block(
    agent_id,
    "human",
    "Name: Alice\nPreferences: Concise responses\nExpertise: Python"
)

# Add a custom memory block
await memory.add_block(
    agent_id,
    MemoryBlock(
        label="project",
        value="Working on OpenHarness - a universal API for AI harnesses"
    )
)

Self-Editing Memory

Letta agents can modify their own memory during conversations. When the agent learns something about the user or updates its own understanding, it can persist this to memory blocks automatically.
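As a rough illustration of what a self-editing memory operation looks like, here is a toy, string-level version of a memory-edit tool. The function name and behavior are illustrative only; the real mechanism lives inside Letta's agent runtime, not in this adapter.

```python
def core_memory_replace(blocks, label, old, new):
    """Toy self-editing memory tool: swap `old` for `new`
    inside one of the agent's own memory blocks."""
    if old not in blocks[label]:
        raise ValueError(f"'{old}' not found in block '{label}'")
    blocks[label] = blocks[label].replace(old, new)
    return blocks

# The agent learns the user's preference mid-conversation
# and persists it into the "human" block.
blocks = {"human": "Name: Alice\nPreferences: unknown"}
core_memory_replace(
    blocks, "human",
    "Preferences: unknown",
    "Preferences: Concise responses",
)
print(blocks["human"])
```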

Streaming

from openharness.types import ExecuteRequest

async for event in adapter.execute_stream(
    ExecuteRequest(message="Explain recursion step by step", agent_id=agent_id)
):
    if event.type == "text":
        print(event.content, end="")
    elif event.type == "thinking":
        print(f"[Thinking: {event.thinking}]")
    elif event.type == "tool_call_start":
        print(f"\n[Tool: {event.name}]")
    elif event.type == "done":
        print(f"\n\nTokens: {event.usage.total_tokens}")

Event Types

| Event | Description |
| --- | --- |
| text | Text content chunk |
| thinking | Agent's inner monologue |
| tool_call_start | Tool invocation beginning |
| tool_result | Tool execution result |
| tool_call_end | Tool invocation complete |
| error | Error occurred |
| done | Execution complete with usage stats |

Agent Lifecycle

from openharness_letta.types import LettaAgentConfig, MemoryBlock

# Create an agent
config = LettaAgentConfig(
    name="coding-assistant",
    model="openai/gpt-4o",
    memory_blocks=[
        MemoryBlock(label="human", value="Developer working on Python projects"),
        MemoryBlock(label="persona", value="Expert Python developer assistant"),
    ],
    tools=["web_search"],
    include_base_tools=True,
)

agent_id = await adapter.create_agent(config)

# Get agent details
agent = await adapter.get_agent(agent_id)
print(agent["name"], agent["model"])

# List all agents
agents = await adapter.list_agents()
for a in agents:
    print(a["id"], a["name"])

# Delete agent when done
await adapter.delete_agent(agent_id)

Configuration

Adapter Options

adapter = LettaAdapter(
    api_key="letta-...",        # Letta cloud API key
    base_url="http://...",      # Local server URL (alternative)
    timeout=60.0,                # Request timeout in seconds
)

Agent Configuration

class LettaAgentConfig:
    name: str | None = None
    model: str = "openai/gpt-4o-mini"
    embedding_model: str = "openai/text-embedding-ada-002"
    memory_blocks: list[MemoryBlock] = []
    tools: list[str] = []
    system_prompt: str | None = None
    include_base_tools: bool = True
    metadata: dict[str, Any] = {}

Environment Variables

  • LETTA_API_KEY - Letta cloud API key

Letta-Specific Features

Inner Monologue

Letta agents have an "inner monologue": their reasoning process is surfaced alongside the response. It is included in streaming events by default:

async for event in adapter.execute_stream(request, include_thinking=True):
    if event.type == "thinking":
        print(f"Agent thinking: {event.thinking}")

Archival Memory

For long-term storage beyond the context window, Letta provides archival memory with semantic search. This allows agents to store and retrieve information that doesn't fit in the active memory blocks.
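To show the retrieval idea without a running server, here is a toy stand-in for archival search that ranks stored passages by word overlap with the query. Real archival memory uses embedding-based semantic search; this keyword version only illustrates the store-then-retrieve shape.

```python
def archival_search(entries, query, top_k=2):
    """Toy stand-in for semantic search over archival memory:
    rank stored passages by word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(
        entries,
        key=lambda e: len(q & set(e.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

entries = [
    "Alice prefers concise responses",
    "The project deadline is Friday",
    "Alice is working on a Python SDK",
]
print(archival_search(entries, "what is Alice working on", top_k=1))
# → ['Alice is working on a Python SDK']
```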

Multi-Model Support

Letta supports multiple model providers:

  • openai/gpt-4o, openai/gpt-4o-mini
  • anthropic/claude-3-opus, anthropic/claude-3-sonnet
  • Many more via Letta's provider integrations

Running Letta Locally

# Install Letta server
pip install letta

# Start the server
letta server

# Server starts at http://localhost:8283