Automated verification of adapter behavior across all supported harnesses
Coverage: how each adapter performs against the complete test suite (85 tests). Tests for features an adapter has not implemented count against it, so this measures overall capability and completeness.

| Adapter | Coverage | Passed | Not Implemented |
|---|---|---|---|
| Claude Code | 73% | 62 / 85 | 21 |
| Deep Agents | 48% | 41 / 85 | 38 |
| Goose | 46% | 39 / 85 | 42 |
| Letta | 41% | 35 / 85 | 43 |

Reliability: how well each adapter implements the features it claims to support. Only tests for declared capabilities are counted.

| Adapter | Reliability | Passed | Failed |
|---|---|---|---|
| Claude Code | 97% | 62 / 64 | 2 |
| Letta | 92% | 35 / 38 | 3 |
| Goose | 91% | 39 / 43 | 4 |
| Deep Agents | 87% | 41 / 47 | 6 |
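
The two percentages above appear to follow from simple ratios: coverage divides passes by the full suite of 85 tests, while reliability divides passes by only the tests for declared capabilities. Below is a minimal sketch, assuming that reading and ordinary rounding to the nearest percent. The function names are illustrative and not part of the test harness.

```python
TOTAL_TESTS = 85  # size of the complete suite, as stated above


def coverage(passed: int, total: int = TOTAL_TESTS) -> int:
    """Percent of the whole suite that passed (hypothetical helper)."""
    return round(100 * passed / total)


def reliability(passed: int, declared: int) -> int:
    """Percent of tests for declared capabilities that passed (hypothetical helper)."""
    return round(100 * passed / declared)


# Claude Code: 62 passes, 64 tests for declared capabilities
print(coverage(62), reliability(62, 64))  # 73 97
```

Under these assumptions the sketch reproduces the published percentages for all four adapters, for example `reliability(41, 47)` gives 87 for Deep Agents.
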
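
Test results grouped by capability area. Each cell shows passed / total tests for that category; `--` marks a category with no results for that adapter.
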
| Category | Tests | Claude Code | Letta | Goose | Deep Agents |
|---|---|---|---|---|---|
| Execution | 6 | 6/6 | 5/6 | 5/6 | 4/6 |
| Streaming | 6 | 6/6 | 6/6 | 5/6 | 6/6 |
| Tool Events | 5 | 5/5 | 5/5 | 5/5 | 5/5 |
| Sessions | 7 | -- | -- | -- | -- |
| Agents | 7 | -- | 7/7 | -- | -- |
| Memory | 7 | -- | 7/7 | -- | -- |
| Subagents | 6 | 6/6 | -- | -- | 6/6 |
| MCP | 6 | 6/6 | -- | 6/6 | -- |
| Files | 8 | 8/8 | -- | 8/8 | 8/8 |
| Planning | 7 | 7/7 | -- | -- | 7/7 |
| Hooks | 6 | 6/6 | -- | -- | -- |
| Skills | 7 | 7/7 | -- | 7/7 | -- |
| Tools API | 7 | 5/7 | 5/7 | 3/7 | 5/7 |
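
The per-category cells can be rolled up into an adapter's totals. The sketch below assumes each cell is either `passed/total` or `--` (no result recorded) and uses the Claude Code column as transcribed from the table. `roll_up` is a hypothetical helper name, not part of the test harness.

```python
# Claude Code column, copied from the category table above.
CLAUDE_CODE = {
    "Execution": "6/6", "Streaming": "6/6", "Tool Events": "5/5",
    "Sessions": "--", "Agents": "--", "Memory": "--",
    "Subagents": "6/6", "MCP": "6/6", "Files": "8/8",
    "Planning": "7/7", "Hooks": "6/6", "Skills": "7/7",
    "Tools API": "5/7",
}


def roll_up(cells: dict[str, str]) -> tuple[int, int]:
    """Sum passed and total counts, skipping '--' cells (hypothetical helper)."""
    passed = total = 0
    for cell in cells.values():
        if cell.strip() == "--":
            continue  # no result recorded for this category
        p, t = cell.split("/")
        passed += int(p)
        total += int(t)
    return passed, total


print(roll_up(CLAUDE_CODE))  # (62, 64)
```

For Claude Code this gives 62 of 64, which agrees with the Passed figure reported for declared capabilities.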