Getting Started
This guide walks through installing OpenPaw, creating your first agent workspace, and running it. Examples use Telegram; Discord is also supported (see Channels for Discord setup and multi-channel configuration).
Prerequisites
- Python 3.11+
- Poetry 2.0+ for dependency management (installation guide)
- At least one channel bot token:
  - Telegram (create via BotFather)
  - Discord (create via Developer Portal)
- At least one model provider credential:
  - Anthropic API key (get one here)
  - OpenAI API key (get one here)
  - AWS credentials for Bedrock (configure AWS CLI)
Installation
1. Clone the Repository
2. Install Dependencies
# Core installation (includes Docling + Playwright)
poetry install
# Install Playwright browser
poetry run playwright install chromium
3. Optional Extras
Install additional builtins based on your needs:
# Voice capabilities (Whisper transcription + ElevenLabs TTS)
poetry install -E voice
# Web search (Brave Search API)
poetry install -E web
# Memory search (semantic search over past conversations)
poetry install -E memory
# Install everything
poetry install -E all-builtins
Extra descriptions:
| Extra | Provides | Requires |
|---|---|---|
| voice | Whisper audio transcription, ElevenLabs text-to-speech | OPENAI_API_KEY, ELEVENLABS_API_KEY |
| web | Brave Search web search | BRAVE_API_KEY |
| memory | Semantic search over conversation archives | sqlite-vec package |
| all-builtins | All of the above | All API keys above |
Note: Docling (document conversion), Playwright (browser automation), and all LLM providers (Anthropic, OpenAI, AWS Bedrock, xAI) are core dependencies installed automatically with poetry install.
4. Set Up Environment Variables
Create a .env file or export variables:
# Required: Channel (at least one)
export TELEGRAM_BOT_TOKEN="your-telegram-bot-token"
export DISCORD_BOT_TOKEN="your-discord-bot-token"
# Required: Model provider (choose at least one)
export ANTHROPIC_API_KEY="your-anthropic-key" # Anthropic Claude
export OPENAI_API_KEY="your-openai-key" # OpenAI GPT (also used by Whisper)
# AWS Bedrock (Kimi K2, Claude, Mistral, Nova, etc.)
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"
# Optional: Builtin API keys
export BRAVE_API_KEY="your-brave-key" # Web search
export ELEVENLABS_API_KEY="your-elevenlabs-key" # Text-to-speech
Per-workspace secrets: You can also create a .env file in any workspace directory (agent_workspaces/<name>/config/.env) for workspace-specific environment variables. These are automatically loaded at workspace startup.
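As a rough illustration of what loading a per-workspace .env file involves, the sketch below parses simple KEY=value lines and merges them into the process environment. The helper names (parse_env_file, load_workspace_env) are hypothetical and not part of OpenPaw; whether workspace values override global ones is also an assumption here, and inline comments after a value are not handled.

```python
import os
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Tolerate shell-style "export KEY=value" lines.
        line = line.removeprefix("export ").strip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_workspace_env(workspace_dir: Path) -> dict[str, str]:
    """Hypothetical helper: merge <workspace>/config/.env into os.environ."""
    env_path = workspace_dir / "config" / ".env"
    if not env_path.exists():
        return {}
    values = parse_env_file(env_path)
    os.environ.update(values)  # assumption: workspace values win over globals
    return values
```

Calling load_workspace_env(Path("agent_workspaces/my_agent")) would return the parsed values and make them visible through os.environ.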
Initial Configuration
1. Copy the Example Configuration
2. Edit config.yaml
The default configuration works for most use cases. Key sections:
# Path to agent workspaces
workspaces_path: agent_workspaces
# Queue behavior
queue:
  mode: collect # collect, steer, followup, interrupt
  debounce_ms: 1000 # wait 1 second before processing collected messages
# Agent defaults
agent:
  model: anthropic:claude-sonnet-4-20250514 # or openai:gpt-4o, bedrock_converse:moonshot.kimi-k2-thinking
  max_turns: 50
  temperature: 0.7
Note: Channel configuration is no longer in global config. Each workspace configures its own channel in agent.yaml.
See configuration.md for detailed reference.
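To make the collect mode and debounce_ms setting more concrete, here is a minimal sketch of debounced batching. It assumes the debounce timer restarts with every incoming message, so a batch closes only after debounce_ms of silence; the function name collect_batches is hypothetical and OpenPaw's actual queue may behave differently.

```python
def collect_batches(
    messages: list[tuple[int, str]], debounce_ms: int = 1000
) -> list[list[str]]:
    """Group (timestamp_ms, text) pairs into batches separated by quiet gaps."""
    batches: list[list[str]] = []
    last_ts: int | None = None
    for ts, text in sorted(messages):
        if last_ts is None or ts - last_ts > debounce_ms:
            batches.append([])  # quiet period elapsed: start a new batch
        batches[-1].append(text)
        last_ts = ts
    return batches
```

With the default of 1000 ms, messages arriving at 0 ms and 400 ms would be processed together, while a message at 2000 ms would start a new batch.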
Creating Your First Workspace
Each workspace represents an isolated agent with its own personality, tools, and conversation state. The fastest way to create one is with openpaw init:
1. Scaffold a Workspace
# Basic scaffold
poetry run openpaw init my_agent
# With model and channel pre-configured
poetry run openpaw init my_agent --model anthropic:claude-sonnet-4-20250514 --channel telegram
This creates agent_workspaces/my_agent/ with all required files:
| File | Purpose |
|---|---|
| agent/AGENT.md | Capabilities, behavior guidelines |
| agent/USER.md | User context and preferences |
| agent/SOUL.md | Core personality and values |
| agent/HEARTBEAT.md | Session state scratchpad |
| config/agent.yaml | Model, channel, and queue config |
| config/.env | API key placeholders |
Each file includes TODO markers to guide customization.
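The --model value combines a provider and a model name in a single provider:model string, such as anthropic:claude-sonnet-4-20250514. The sketch below shows one way such a string could be split into its two parts; split_model_id is a hypothetical helper, not an OpenPaw function.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider:model' at the first colon, rejecting malformed input."""
    provider, sep, model = model_id.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {model_id!r}")
    return provider, model
```

Splitting at the first colon only keeps model names that contain dots or further punctuation intact, for example bedrock_converse:moonshot.kimi-k2-thinking.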
2. Configure Your Workspace
Edit config/agent.yaml with your model and channel settings. If you used --model and --channel flags, the relevant sections are already populated:
name: my_agent
description: ""
model:
  provider: anthropic
  model: claude-sonnet-4-20250514
  api_key: ${ANTHROPIC_API_KEY}
  temperature: 0.7
channel:
  type: telegram
  token: ${TELEGRAM_BOT_TOKEN}
  allowed_users: []
queue:
  mode: collect
  debounce_ms: 1000
Add your API keys to config/.env:
Timezone: Add a timezone field (IANA identifier, e.g., America/New_York) to control cron timing, heartbeat hours, and display timestamps. Defaults to UTC.
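To see why the timezone field matters for cron timing, the sketch below converts a wall-clock time in a given IANA zone to UTC using the standard library zoneinfo module. The helper to_utc is illustrative only and assumes time zone data is available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(local_naive: datetime, tz_name: str) -> datetime:
    """Interpret a naive datetime as wall-clock time in tz_name, then convert to UTC."""
    return local_naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)
```

A 9:00 AM job in America/New_York corresponds to 14:00 UTC in January and 13:00 UTC in July, which is why a schedule that leaves the timezone at the UTC default can fire at an unexpected local hour.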
3. Customize Personality
Edit the markdown files to define your agent's identity:
- AGENT.md — What the agent can do and how it should behave
- USER.md — Context about the user(s) who will interact with it
- SOUL.md — Core personality, values, and communication style
- HEARTBEAT.md — Can start empty; the agent updates it to track session state
See workspaces.md for detailed examples of each file.
List existing workspaces: Use poetry run openpaw list to see all valid workspaces in agent_workspaces/.
4. Optional: Custom Tools
Drop LangChain @tool functions into agent/tools/ directory:
Example agent/tools/weather.py:
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city.

    Args:
        city: The city name

    Returns:
        Current weather description
    """
    # Your implementation here
    return f"Weather for {city}: Sunny, 72°F"
Add tool-specific dependencies to agent/tools/requirements.txt:
Dependencies are auto-installed at workspace startup.
5. Optional: Scheduled Tasks
Create cron jobs in config/crons/ directory:
Example config/crons/daily-summary.yaml:
name: daily-summary
schedule: "0 9 * * *" # Every day at 9:00 AM (workspace timezone)
enabled: true
prompt: |
  Review active tasks and workspace state.
  Provide a brief daily summary of pending work.
output:
  channel: telegram
  chat_id: 123456789 # Your Telegram user ID
See scheduling.md for detailed configuration.
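For readers unfamiliar with the five-field cron syntax used in schedule, the sketch below checks whether a given moment matches an expression such as "0 9 * * *". It is deliberately simplified: only * and comma-separated integers are supported (no ranges or steps), Sunday must be written as 0, and all fields are combined with AND. The function names are hypothetical and this is not OpenPaw's scheduler.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Support only '*' and comma-separated integers."""
    return field == "*" or value in {int(part) for part in field.split(",")}


def cron_matches(expression: str, moment: datetime) -> bool:
    """Check a moment (assumed to be in the workspace timezone) against the expression."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )
```

Under these assumptions, "0 9 * * *" matches 09:00 on any day and nothing else.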
Running Your Agent
1. Single Workspace
2. Multiple Workspaces
3. All Workspaces
# Either syntax works
poetry run openpaw -c config.yaml --all
poetry run openpaw -c config.yaml -w "*"
4. Verbose Logging
Testing Your Agent
- Find your bot in Telegram: search for the bot username you configured with BotFather
- Send a message, for example "Hello! What can you do?"
- The agent should respond based on its personality files (AGENT.md, USER.md, SOUL.md)
Try Built-in Commands
Commands are intercepted by the framework before reaching the agent:
- /help - List available commands
- /status - Show model, conversation stats, active tasks, token usage
- /new - Archive current conversation and start fresh
- /compact - Summarize conversation, archive it, start new with summary
- /queue collect - Change queue mode (collect, steer, interrupt, followup)
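The idea of intercepting commands before they reach the agent can be sketched as a small router. Everything here is hypothetical (route_message, the handler signature, and the choice to pass unknown slash commands through to the agent); it illustrates the pattern rather than OpenPaw's implementation.

```python
from collections.abc import Callable, Mapping

CommandHandler = Callable[[str], str]


def route_message(
    text: str,
    commands: Mapping[str, CommandHandler],
    agent: Callable[[str], str],
) -> str:
    """Dispatch '/name args' to a registered handler; otherwise call the agent."""
    if text.startswith("/"):
        name, _, args = text[1:].partition(" ")
        handler = commands.get(name)
        if handler is not None:
            return handler(args.strip())
    # Assumption: plain messages and unknown commands go to the agent.
    return agent(text)
```

Because the router runs first, a command such as /queue collect never consumes model tokens.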
Upload a File
Send a PDF, DOCX, or image to test document conversion:
- PDFs/DOCX/PPTX are converted to markdown via Docling with OCR
- Voice messages are transcribed via Whisper (if OPENAI_API_KEY is set)
- Files are saved to uploads/{YYYY-MM-DD}/ with sibling output files (e.g., report.pdf → report.md)
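The dated upload folder and sibling output file naming can be expressed with pathlib. In this sketch, upload_paths and its base parameter are hypothetical; the layout follows the uploads/{YYYY-MM-DD}/ pattern described above.

```python
from datetime import date
from pathlib import Path


def upload_paths(base: Path, filename: str, day: date) -> tuple[Path, Path]:
    """Return (saved file, sibling markdown output) under uploads/<YYYY-MM-DD>/."""
    folder = base / "uploads" / day.isoformat()
    original = folder / filename
    return original, original.with_suffix(".md")
```

For report.pdf uploaded on 2025-01-15, this yields uploads/2025-01-15/report.pdf and uploads/2025-01-15/report.md relative to base.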
Next Steps
Now that you have a working agent, explore advanced features:
- Configuration - Deep-dive into global and workspace config options
- Queue System - Understand queue modes (collect, steer, interrupt, followup)
- Scheduling - Set up scheduled tasks and heartbeats
- Builtins - Enable web search, voice, browser automation, sub-agents
- Workspaces - Advanced workspace organization and custom tools
- Channels - Channel system details and access control
- Architecture - System design and component interactions
Troubleshooting
Bot Doesn't Respond
Check environment variables:
# Verify variables are set
echo $TELEGRAM_BOT_TOKEN # or DISCORD_BOT_TOKEN
echo $ANTHROPIC_API_KEY # or OPENAI_API_KEY
Check allowed_users list:
If you configured allowed_users in agent.yaml, ensure your user ID is in the list. To find your user ID, temporarily set allowed_users: [] and check the logs when you send a message.
Check logs:
Look for errors like Unauthorized message from user 123456 or Invalid API key.
"Module not found" Errors¶
Ensure you're using poetry run prefix:
Verify installation:
"No API key" Errors¶
For global environment variables:
For workspace-specific variables:
Ensure .env file exists in agent_workspaces/<name>/config/.env and contains the required keys.
Test API key directly:
Playwright Browser Errors
If browser automation fails:
# Ensure Playwright is installed
poetry run playwright install chromium
# Show install options and supported browser names
poetry run playwright install --help
Docling OCR Issues
If scanned PDFs produce <!-- image --> instead of text:
- macOS: Docling uses native OCR (no additional setup needed)
- Linux: Docling falls back to EasyOCR (auto-installed)
Check logs for OCR-related errors when processing PDFs.
Agent Responds Incorrectly
Review workspace markdown files:
- Check for typos or inconsistent instructions in agent/AGENT.md, agent/USER.md, agent/SOUL.md
- Ensure AGENT.md clearly defines capabilities and communication style
Adjust temperature:
Check conversation context:
Use /new to start a fresh conversation if the agent seems confused by prior context.
Performance Issues
Reduce concurrency:
# In config.yaml
lanes:
  main_concurrency: 2 # Reduce from default 4
  subagent_concurrency: 4 # Reduce from default 8
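A concurrency limit like main_concurrency is commonly enforced with a semaphore. The sketch below shows the general technique with asyncio; run_lane is a hypothetical name and this is not a description of how OpenPaw implements its lanes.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def run_lane(
    jobs: list[Callable[[], Awaitable[object]]], concurrency: int
) -> list[object]:
    """Run all jobs, allowing at most `concurrency` of them to execute at once."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(job: Callable[[], Awaitable[object]]) -> object:
        async with semaphore:  # waits here while the lane is full
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Lowering the limit trades throughput for fewer simultaneous model calls and less contention on shared resources.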
Heartbeat pre-flight skip: Heartbeats skip LLM calls when HEARTBEAT.md is empty and no active tasks exist. This behavior is enabled by default, so keeping HEARTBEAT.md empty while the agent is idle avoids unnecessary calls.
Monitor token usage:
Database Locked Errors
If you see database is locked errors, this indicates concurrent access to the SQLite conversation database. This should be rare but can happen with aggressive concurrency settings.
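For background, two general SQLite techniques reduce lock contention: a busy timeout, which makes a connection wait for a lock instead of failing immediately, and write-ahead logging (WAL), which lets readers proceed while a writer is active. The sketch below shows both with the standard sqlite3 module. It is a generic illustration, not OpenPaw's documented fix, and open_conversation_db is a hypothetical name.

```python
import sqlite3


def open_conversation_db(path: str, busy_timeout_s: float = 30.0) -> sqlite3.Connection:
    """Open a SQLite database that waits on locks and uses write-ahead logging."""
    # timeout: seconds to wait for a lock before raising "database is locked".
    conn = sqlite3.connect(path, timeout=busy_timeout_s)
    # WAL mode is stored in the database file and persists across connections.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```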
Temporary fix:
Still Having Issues?
- Check the GitHub Issues for similar problems
- Review the full logs in logs/<workspace>_YYYY-MM-DD.log
- Ensure your Poetry environment is up to date: poetry update