Builtins

OpenPaw Built-in Tools

Builtins are optional capabilities conditionally loaded based on API key availability and installed packages. They come in two types: tools (agent-invokable functions) and processors (message transformers).

Overview

OpenPaw ships with 16 built-in tools and 4 message processors. Builtins are discovered at runtime: if a builtin's prerequisites (API keys, packages) are missing, that builtin is unavailable. The allow/deny system provides fine-grained control over which capabilities are active in each workspace.
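
As a rough illustration of the discovery step, the sketch below checks prerequisites in the way the paragraph describes. The function name `prerequisites_met` and its parameters are hypothetical and are not taken from OpenPaw's code; it also assumes the package name matches its importable module name, which is not always true.

```python
import importlib.util
import os


def prerequisites_met(env_vars: list[str], packages: list[str]) -> bool:
    """Return True only if every env var is set and every package is importable."""
    # An unset or empty variable counts as missing.
    if any(not os.environ.get(name) for name in env_vars):
        return False
    # find_spec returns None when a top-level module cannot be located.
    return all(importlib.util.find_spec(pkg) is not None for pkg in packages)
```

A registry built this way would simply skip any builtin for which the check returns False.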

Architecture:

BuiltinRegistry
├─ Tools (16)
│  ├─ browser          Web automation via Playwright
│  ├─ brave_search     Web search
│  ├─ spawn            Sub-agent spawning
│  ├─ cron             Agent self-scheduling
│  ├─ cron_manager     Persistent YAML cron management
│  ├─ acknowledge      Silent system event acknowledgment
│  ├─ task_tracker     Persistent task management
│  ├─ send_message     Mid-execution messaging
│  ├─ send_file        Send workspace files to users
│  ├─ followup         Self-continuation
│  ├─ plan             Session-scoped planning
│  ├─ channel_history  Channel history browsing
│  ├─ memory_search    Semantic conversation search
│  ├─ shell            Local command execution
│  ├─ md2pdf           Markdown-to-PDF conversion
│  └─ elevenlabs       Text-to-speech
└─ Processors (4)
   ├─ file_persistence Universal file upload handling
   ├─ whisper          Audio transcription
   ├─ timestamp        Message timestamp injection
   └─ docling          Document-to-markdown conversion

Processor Pipeline Order: file_persistence → whisper → timestamp → docling

The order matters — file_persistence runs first to save uploaded files, then downstream processors (whisper, docling) can read from disk.

Tools

browser

Group: browser Type: Tool (11 functions) Prerequisites: playwright (core dependency), chromium browser installed

Web automation via Playwright with accessibility tree navigation. Agents interact with pages via numeric element references instead of writing CSS selectors.

Available Functions:

  • browser_navigate: Navigate to a URL (respects domain allowlist/blocklist)
  • browser_snapshot: Get current page state as a numbered accessibility tree
  • browser_click: Click an element by numeric reference
  • browser_type: Type text into an input field by numeric reference
  • browser_select: Select a dropdown option by numeric reference
  • browser_scroll: Scroll the page (up/down/top/bottom)
  • browser_back: Navigate back in browser history
  • browser_screenshot: Capture a page screenshot (saved to workspace/screenshots/)
  • browser_close: Close the current page/tab
  • browser_tabs: List all open tabs
  • browser_switch_tab: Switch to a different tab by index

Security Model:

Domain allowlisting and blocklisting prevent unauthorized navigation. If allowed_domains is non-empty, only those domains (and subdomains with *. prefix) are permitted. The blocked_domains list takes precedence and denies specific domains even if allowed.
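
The precedence rules above can be sketched as follows. This is a minimal illustration, not the browser tool's implementation: `is_allowed` and `_matches` are hypothetical names, and the sketch assumes that a `*.` pattern matches subdomains only, not the bare domain itself.

```python
from urllib.parse import urlparse


def _matches(host: str, pattern: str) -> bool:
    """Match an exact domain, or any subdomain when the pattern starts with '*.'."""
    if pattern.startswith("*."):
        return host.endswith(pattern[1:])  # "*.google.com" -> ".google.com"
    return host == pattern


def is_allowed(url: str, allowed: list[str], blocked: list[str]) -> bool:
    host = (urlparse(url).hostname or "").lower()
    # The blocklist wins, even when the allowlist also matches.
    if any(_matches(host, p) for p in blocked):
        return False
    # An empty allowlist permits everything that is not blocked.
    return not allowed or any(_matches(host, p) for p in allowed)
```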

Configuration:

builtins:
  browser:
    enabled: true
    config:
      headless: true                # Run browser without GUI
      allowed_domains:              # Allowlist (empty = allow all)
        - "calendly.com"
        - "*.google.com"            # Subdomain wildcard
      blocked_domains: []           # Blocklist (takes precedence)
      timeout_seconds: 30           # Default timeout for operations
      persist_cookies: false        # Persist cookies across agent runs
      downloads_dir: "downloads"    # Where to save downloaded files
      screenshots_dir: "screenshots"  # Where to save screenshots

Installation:

poetry install  # playwright is a core dependency
poetry run playwright install chromium

Usage Example:

User: "Book a meeting on my Calendly for tomorrow at 2pm"
Agent: [Calls browser_navigate("https://calendly.com/myaccount")]
Agent: [Calls browser_snapshot() to see page elements]
Agent: [Calls browser_click(42) to click the "Schedule" button (element #42)]
Agent: [Fills in meeting details and confirms booking]

Lifecycle:

Browser instances are lazily initialized (no browser created until first use). Each session gets its own browser context. Browsers are automatically cleaned up on /new, /compact, and workspace shutdown.

Cookie Persistence:

When persist_cookies: true, authentication state and cookies survive across agent runs within the same session. Cookies are cleared on conversation reset.

Downloads and Screenshots:

Files downloaded by the browser are saved to {workspace}/workspace/downloads/ with sanitized filenames. Page screenshots are saved to {workspace}/workspace/screenshots/ and returned as relative paths for agent reference.


brave_search

Group: web Type: Tool Prerequisites: BRAVE_API_KEY, poetry install -E web

Web search capability using the Brave Search API.

Configuration:

builtins:
  brave_search:
    enabled: true
    config:
      count: 5  # Number of search results

Usage Example:

User: "What's the latest news about Python 3.13?"
Agent: [Uses brave_search tool to find recent articles]
Agent: "According to recent sources, Python 3.13 introduces..."

spawn

Group: agent Type: Tool (4 functions) Prerequisites: None (always available)

Sub-agent spawning for concurrent background tasks. Sub-agents run in isolated contexts with filtered tools to prevent recursion and unsolicited messaging.

Available Functions:

  • spawn_agent: Spawn a background sub-agent with a task prompt and label
  • list_subagents: List all sub-agents (active and recently completed)
  • get_subagent_result: Retrieve the result of a completed sub-agent by ID
  • cancel_subagent: Cancel a running sub-agent

Configuration:

builtins:
  spawn:
    enabled: true
    config:
      max_concurrent: 8  # Maximum simultaneous sub-agents (default: 8)

Tool Exclusions:

Sub-agents cannot spawn sub-agents (no spawn_agent), send unsolicited messages (no send_message/send_file), self-continue (no request_followup), or schedule tasks (no cron tools). This prevents recursion and ensures sub-agents are single-purpose workers.

Lifecycle:

pending → running → completed/failed/cancelled/timed_out. Running sub-agents exceeding their timeout are marked as timed_out during cleanup.
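
One way to picture the timeout step of that cleanup is the sketch below. The `SubAgent` dataclass, its field names, and `mark_timed_out` are hypothetical stand-ins chosen for illustration; only the status strings and the 30-minute default come from this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SubAgent:
    status: str               # "pending", "running", "completed", ...
    started_at: datetime
    timeout_minutes: int = 30


def mark_timed_out(agents: list[SubAgent], now: datetime) -> int:
    """Flip overdue running sub-agents to 'timed_out'; return how many changed."""
    changed = 0
    for agent in agents:
        deadline = agent.started_at + timedelta(minutes=agent.timeout_minutes)
        if agent.status == "running" and now > deadline:
            agent.status = "timed_out"
            changed += 1
    return changed
```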

Notifications:

When notify: true (default), sub-agent completion results are injected into the message queue, triggering a new agent turn to process the [SYSTEM] notification.

Usage Example:

User: "Research topic X in the background while I work on Y"
Agent: [Calls spawn_agent(task="Research topic X...", label="research-x")]
Sub-agent: [Runs concurrently, main agent continues working on Y]
System: [When complete, user receives notification with result summary]

Limits:

Maximum 8 concurrent sub-agents (configurable). The timeout defaults to 30 minutes (range: 1-120 minutes). Results are truncated at 50K characters, matching the read_file safety-valve pattern.

Storage:

Sub-agent state persists to {workspace}/data/subagents.yaml and survives restarts. Completed/failed/cancelled requests older than 24 hours are automatically cleaned up on initialization.


cron

Group: agent Type: Tool (4 functions) Prerequisites: None (always available)

Agent self-scheduling for one-time and recurring tasks. Enables autonomous workflows like "remind me in 20 minutes" or "check on this PR every hour".

Available Functions:

  • schedule_at: Schedule a one-time action at a specific timestamp
  • schedule_every: Schedule a recurring action at fixed intervals
  • list_scheduled: List all pending scheduled tasks
  • cancel_scheduled: Cancel a scheduled task by ID

Configuration:

builtins:
  cron:
    enabled: true
    config:
      min_interval_seconds: 300  # Minimum interval for recurring tasks (default: 5 min)
      max_tasks: 50              # Maximum pending tasks per workspace
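
To show how the two limits in this configuration might interact, here is a small sketch. The helper `check_recurring_task` is hypothetical, and whether the real tool raises an exception or returns an error message to the agent is not stated on this page.

```python
def check_recurring_task(interval_seconds: int, pending_tasks: int,
                         min_interval_seconds: int = 300,
                         max_tasks: int = 50) -> None:
    """Raise ValueError if a new recurring task would violate either limit."""
    if interval_seconds < min_interval_seconds:
        raise ValueError(
            f"interval {interval_seconds}s is below the minimum of {min_interval_seconds}s"
        )
    if pending_tasks >= max_tasks:
        raise ValueError(f"workspace already has {pending_tasks} pending tasks")
```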

Storage:

Tasks persist to {workspace}/data/dynamic_crons.json and survive restarts. One-time tasks are automatically cleaned up after execution or if expired on startup.

Routing:

Responses are sent back to the first allowed user in the workspace's channel config.

Usage Example:

User: "Ping me in 10 minutes to check on the deploy"
Agent: [Calls schedule_at with timestamp 10 minutes from now]
System: [Task fires, agent sends reminder to user's chat]

cron_manager

Group: automation Type: Tool (4 functions) Prerequisites: None (always available)

Persistent cron management — create, list, update, and delete YAML cron jobs that survive restarts. Unlike dynamic scheduling (schedule_at/schedule_every), cron_manager writes YAML files to config/crons/ that are loaded by the cron scheduler at startup alongside any hand-authored cron files. Changes are also applied to the live scheduler immediately — no workspace restart required.

Available Functions:

  • create_cron: Create a new persistent cron job (validates expression, writes YAML, hot-adds to scheduler)
  • list_crons: List all YAML crons with name, schedule, enabled status, and next run time
  • update_cron: Update fields on an existing cron job (hot-reloads in scheduler)
  • delete_cron: Remove a cron job file and unregister it from the scheduler

Configuration:

builtins:
  cron_manager:
    enabled: true

Comparison with Dynamic Scheduling:

| Feature | Dynamic (cron) | Persistent (cron_manager) |
|---|---|---|
| Storage | data/dynamic_crons.json | config/crons/{name}.yaml |
| Scheduling | One-time or interval-based | Standard cron expressions |
| Lifecycle | Auto-cleaned after execution (one-time) | Permanent until deleted |
| Restarts | Loaded from JSON on restart | Loaded from YAML on restart |
| Use case | "Remind me in 10 minutes" | "Daily summary at 9am" |

Name Validation:

Cron names must be lowercase alphanumeric with hyphens only (^[a-z0-9][a-z0-9-]*$). Names become filenames ({name}.yaml).
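
The pattern quoted above can be tried directly with Python's `re` module. The helper name `cron_filename` is hypothetical; only the regular expression and the `{name}.yaml` convention come from this page.

```python
import re

CRON_NAME = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def cron_filename(name: str) -> str:
    """Validate a cron name and return the YAML filename it maps to."""
    # fullmatch also rejects a trailing newline, which '$' alone would tolerate.
    if not CRON_NAME.fullmatch(name):
        raise ValueError(f"invalid cron name: {name!r}")
    return f"{name}.yaml"
```

With this sketch, "daily-summary" maps to daily-summary.yaml, while names such as "Daily", "-x", or "a_b" are rejected.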

Usage Example:

User: "Set up a daily summary cron at 9am"
Agent: [Calls create_cron(name="daily-summary", schedule="0 9 * * *", prompt="Generate a daily summary...", delivery="channel")]
Agent: "Done — 'daily-summary' will run every day at 9:00 AM and send results to this chat."

task_tracker

Group: agent Type: Tool (4 functions) Prerequisites: None (always available)

Task management via TASKS.yaml for tracking long-running operations across heartbeats and sessions.

Available Functions:

  • create_task: Create a new tracked task
  • update_task: Update task status or notes
  • list_tasks: List all tasks (optionally filtered by status)
  • get_task: Retrieve a specific task by ID

Configuration:

builtins:
  task_tracker:
    enabled: true

Storage:

Tasks persist to {workspace}/data/TASKS.yaml. Thread-safe with atomic writes.

Integration with Heartbeat:

When active tasks exist, a compact summary is injected into the heartbeat prompt as <active_tasks> XML tags. This avoids an extra LLM tool call to list_tasks().

Usage Example:

Agent: [Calls create_task(title="Monitor deploy", status="in_progress")]
Agent: [Works on the task]
Agent: [Calls update_task(task_id="task-001", status="completed")]

send_message

Group: agent Type: Tool Prerequisites: None (always available)

Mid-execution messaging to keep users informed during long operations. Agents can send progress updates while continuing to work.

Configuration:

builtins:
  send_message:
    enabled: true

Implementation:

Uses shared _channel_context for session-safe state access to the active channel.

Usage Example:

User: "Process this large dataset"
Agent: [Calls send_message("Starting analysis of 10,000 rows...")]
Agent: [Continues processing]
Agent: [Calls send_message("Halfway done, found 3 anomalies...")]
Agent: [Finishes and responds with full results]

send_file

Group: agent Type: Tool Prerequisites: None (always available)

Send workspace files to users via the active channel. Validates that files are inside the sandbox, infers the MIME type, and enforces a 50 MB limit.

Configuration:

builtins:
  send_file:
    enabled: true
    config:
      max_file_size: 52428800  # 50 MB default

Implementation:

Uses shared _channel_context for session-safe state. Validates paths with resolve_sandboxed_path() for security.
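
The three checks (sandbox containment, size limit, MIME inference) could look roughly like the sketch below. `check_sendable` is a hypothetical helper written for illustration; it is not the signature of resolve_sandboxed_path(), which this page does not document.

```python
import mimetypes
from pathlib import Path

MAX_FILE_SIZE = 52_428_800  # 50 MB, the documented default


def check_sendable(workspace: Path, relative: str) -> tuple[Path, str]:
    """Return (resolved path, MIME type) or raise ValueError."""
    root = workspace.resolve()
    target = (root / relative).resolve()
    # Reject paths that escape the workspace, e.g. via "../".
    if not target.is_relative_to(root):
        raise ValueError("path escapes the workspace")
    if not target.is_file():
        raise ValueError("not a file")
    if target.stat().st_size > MAX_FILE_SIZE:
        raise ValueError("file exceeds the size limit")
    mime, _ = mimetypes.guess_type(target.name)
    # Fall back to a generic type when the extension is unknown.
    return target, mime or "application/octet-stream"
```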

Usage Example:

Agent: [Generates a report.pdf in workspace]
Agent: [Calls send_file("report.pdf", caption="Monthly report")]
User: [Receives file via Telegram]

followup

Group: agent Type: Tool Prerequisites: None (always available)

Self-continuation for multi-step autonomous workflows with depth limiting. Agents request re-invocation after responding.

Configuration:

builtins:
  followup:
    enabled: true

Usage Example:

Agent: "I've completed step 1 of 3. [Calls request_followup()]"
System: [Re-invokes agent]
Agent: "Now completing step 2..."

Depth Limiting:

Prevents infinite loops via configurable depth limits in the message processing loop.
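
A depth-limited continuation loop can be sketched as below. The function `run_with_followups`, the `agent_turn` callback, and the default cap of 5 are all assumptions for illustration; this page does not state the actual limit or where it is configured.

```python
from typing import Callable


def run_with_followups(agent_turn: Callable[[int], bool], max_depth: int = 5) -> int:
    """Run turns until the agent stops asking for follow-ups or the cap is hit.

    agent_turn receives the current depth and returns True to request
    another invocation. The final depth reached is returned.
    """
    depth = 0
    while True:
        wants_followup = agent_turn(depth)
        if not wants_followup or depth >= max_depth:
            return depth
        depth += 1
```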


memory_search

Group: memory Type: Tool Prerequisites: sqlite-vec, poetry install -E memory

Semantic search over past conversations using vector embeddings.

Configuration:

builtins:
  memory_search:
    enabled: true

Usage Example:

User: "What did we discuss about the deployment last week?"
Agent: [Calls memory_search("deployment last week")]
Agent: "Last Tuesday we discussed rolling back the deployment due to..."

shell

Group: system Type: Tool Prerequisites: None (core dependency)

Execute shell commands on the host system with configurable security controls. Disabled by default — must explicitly enable.

Security:

  • Disabled by default
  • Default blocked commands list prevents dangerous operations (rm -rf, sudo, etc.)
  • Optional command allowlist for strict control
  • Optional working directory constraint

Configuration:

builtins:
  shell:
    enabled: true  # Must explicitly enable
    config:
      allowed_commands:  # Optional allowlist
        - ls
        - cat
        - grep
      blocked_commands:  # Optional override of defaults
        - rm -rf
        - sudo
      working_directory: /home/user/sandbox  # Optional constraint

Default Blocked Commands:

rm -rf, sudo, chmod 777, chown, wget, curl, dd if=, mkfs, fork bombs
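
As an illustration of how a blocklist and an optional allowlist might combine, consider the sketch below. It is an assumption-laden simplification, not the shell tool's logic: `is_command_permitted` is a hypothetical name, the blocklist uses plain substring matching, and the allowlist only inspects the first word, so pipes and subshells are not handled.

```python
import shlex

# Taken from the documented defaults (fork-bomb patterns omitted here).
DEFAULT_BLOCKED = ["rm -rf", "sudo", "chmod 777", "chown", "wget", "curl", "dd if=", "mkfs"]


def is_command_permitted(command: str, allowed=None, blocked=DEFAULT_BLOCKED) -> bool:
    # Blocked patterns are rejected first, regardless of the allowlist.
    if any(pattern in command for pattern in blocked):
        return False
    if allowed:
        try:
            argv = shlex.split(command)
        except ValueError:  # e.g. unbalanced quotes
            return False
        # With an allowlist, the executable name must be listed.
        return bool(argv) and argv[0] in allowed
    return True
```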

Usage Example:

User: "What files are in the current directory?"
Agent: [Calls shell with command "ls -la"]
Agent: "Here are the files in the directory..."

md2pdf

Group: document Type: Tool Prerequisites: weasyprint, markdown, pygments (core dependencies)

Convert workspace markdown files to polished PDF documents with CSS theming, Mermaid diagram rendering, and AI self-healing for broken diagrams.

Themes:

| Theme | Style |
|---|---|
| minimal | Clean serif font, light styling, academic feel |
| professional | Indigo accents, sans-serif, business report look |
| technical | Dark code blocks, monospace-heavy, engineering docs |

Features:

  • Mermaid diagrams rendered via mermaid.ink API (no local dependencies)
  • SVG auto-scaling to fit page width
  • AI self-healing for broken Mermaid syntax (configurable LLM, default: gpt-4o-mini)
  • Syntax-highlighted code blocks via Pygments
  • Tables, table of contents, and standard markdown extensions

Configuration:

builtins:
  md2pdf:
    theme: professional           # minimal, professional, or technical
    max_diagram_width: 6.5        # Max diagram width in inches
    self_heal: true               # AI repair for broken Mermaid diagrams
    self_heal_model: "openai:gpt-4o-mini"  # Any LangChain model spec
    max_heal_iterations: 3        # Max repair attempts per diagram

Self-Healing:

When a Mermaid diagram fails to render, the tool can optionally invoke a LangGraph subgraph that:

  1. Sends the broken source + error to a configurable LLM
  2. Validates the repair by re-rendering via mermaid.ink
  3. Loops up to max_heal_iterations times
  4. Marks repaired diagrams with a visual indicator in the PDF

Self-healing requires an API key for the configured model (e.g., OPENAI_API_KEY for gpt-4o-mini). If unavailable, the tool degrades gracefully — broken diagrams get an error placeholder instead.
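
The repair loop described in the steps above can be sketched in plain Python. The real tool is described as a LangGraph subgraph calling an LLM and mermaid.ink; here both are replaced by injected callables, and `heal_diagram`, `render`, and `repair` are hypothetical names.

```python
from typing import Callable, Optional


def heal_diagram(source: str,
                 render: Callable[[str], Optional[str]],
                 repair: Callable[[str, str], str],
                 max_heal_iterations: int = 3) -> tuple[str, bool]:
    """Try to repair a diagram until it renders or the attempt budget runs out.

    render returns None on success or an error message on failure.
    repair takes (source, error) and returns a candidate fix.
    """
    error = render(source)
    for _ in range(max_heal_iterations):
        if error is None:
            break
        source = repair(source, error)
        error = render(source)  # validate the repair by re-rendering
    return source, error is None
```

When the second element of the result is False, the caller would fall back to the error placeholder mentioned above.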

Usage Example:

User: "Convert my research notes to a PDF"
Agent: [Calls markdown_to_pdf(source_path="reports/notes.md", theme="professional")]
Agent: "PDF created: reports/notes.pdf (3 Mermaid diagrams rendered, 1 repaired by AI)"

plan

Group: system Type: Tool Prerequisites: None (always available)

Session-scoped planning tool for multi-step work. Agents use write_plan to externalize their thinking into a structured plan and read_plan to retrieve it later. Plans persist for the current session and reset on /new or /compact.

Available tools:

  • write_plan(plan) — Write or overwrite the session plan
  • read_plan() — Retrieve the current plan

When agents use this:

Agents create plans when tackling complex, multi-step tasks — especially when the work involves multiple tool calls, file operations, or research phases. The plan serves as working memory that survives across tool calls within a single session.

Usage Example:

User: "Research the latest AI safety papers and write a summary report"
Agent: [Calls write_plan("1. Search for recent AI safety papers\n2. Read top 5 results\n3. Synthesize findings\n4. Write summary to reports/ai-safety.md")]
Agent: [Proceeds to execute each step, updating the plan as steps complete]

acknowledge

Group: automation Type: Tool Prerequisites: None (always available)

Silent acknowledgment for system events. When the agent receives a [SYSTEM] event (cron result, heartbeat injection, sub-agent completion) and determines there is nothing the user needs to know, it calls acknowledge_event to suppress channel delivery. Everything is still logged — conversation history, token usage, and the acknowledgment reason. Silence means "don't message the user," not "don't record."

When to use:

The agent receives a [SYSTEM] notification with routine information — a cron ran successfully with no notable output, a heartbeat check found nothing actionable, or a background sub-agent completed a task the user doesn't need to hear about. Instead of sending a noisy "nothing to report" message, the agent calls acknowledge_event with a brief reason.

Key Behaviors:

  • Only suppresses channel delivery for system-originated events. Has no effect on user messages.
  • One acknowledge_event call per agent invocation — duplicate calls return an error.
  • The agent's text response is still written to conversation history; it just isn't delivered to the channel.

Configuration:

builtins:
  acknowledge:
    enabled: true

Usage Example:

System: [SYSTEM] Cron 'daily-check' completed. Session log: memory/sessions/cron/daily-check_2026-03-25T09-00-00.jsonl
Agent: [Reads session log — routine status, no anomalies found]
Agent: [Calls acknowledge_event(reason="daily-check ran clean, no anomalies to report")]
Agent: "Checked the daily-check cron result — all systems nominal." (not delivered to user)

elevenlabs

Group: voice Type: Tool Prerequisites: ELEVENLABS_API_KEY, poetry install -E voice

Text-to-speech for voice responses using ElevenLabs API.

Configuration:

builtins:
  elevenlabs:
    enabled: true
    config:
      voice_id: 21m00Tcm4TlvDq8ikWAM  # ElevenLabs voice ID
      model_id: eleven_turbo_v2_5

Usage Example:

User: "Read me the summary"
Agent: [Calls elevenlabs to generate audio]
Agent: [Sends voice message via Telegram]

To find voice IDs, visit the ElevenLabs Voice Library.


Processors

Processor Pipeline

file_persistence

Group: None Type: Processor Prerequisites: None (always available)

Universal file upload handling with date partitioning. First processor in the pipeline — saves all uploaded files to {workspace}/data/uploads/{YYYY-MM-DD}/.

Configuration:

builtins:
  file_persistence:
    enabled: true
    config:
      max_file_size: 52428800  # 50 MB default
      clear_data_after_save: false  # Free memory after saving

Behavior:

Sets attachment.saved_path (relative to workspace root) so downstream processors can read from disk. Enriches message content with file receipt notifications:

[File received: report.pdf (2.3 MB, application/pdf)]
[Saved to: data/uploads/2026-02-07/report.pdf]

Filename Handling:

sanitize_filename() normalizes filenames (lowercases, removes special chars, replaces spaces with underscores). deduplicate_path() appends counters (1), (2), etc. to prevent overwrites.
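
The two helpers could behave roughly as in the sketch below. The function names come from this page, but the bodies are assumptions: the exact set of characters kept, the fallback name "file", and the counter format without a leading space are illustrative choices.

```python
import re
from pathlib import Path


def sanitize_filename(name: str) -> str:
    """Lowercase, turn spaces into underscores, and drop special characters."""
    name = name.lower().replace(" ", "_")
    # Keep letters, digits, underscores, dots and hyphens; drop the rest.
    return re.sub(r"[^a-z0-9_.-]", "", name) or "file"


def deduplicate_path(path: Path) -> Path:
    """Append (1), (2), ... before the suffix until the path is unused."""
    candidate, counter = path, 1
    while candidate.exists():
        candidate = path.with_name(f"{path.stem}({counter}){path.suffix}")
        counter += 1
    return candidate
```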


whisper

Group: voice Type: Processor Prerequisites: OPENAI_API_KEY, poetry install -E voice

Audio transcription for voice and audio messages using OpenAI's Whisper API.

Configuration:

builtins:
  whisper:
    enabled: true
    config:
      model: whisper-1
      language: en  # Optional: auto-detect if omitted

Behavior:

Transcribes audio/voice messages and saves the transcript as a .txt sibling to the audio file (e.g., voice_123.ogg → voice_123.txt). Appends the transcript inline to message content.

Usage Example:

User: [Sends voice message]
Channel: [Downloads audio file]
Whisper: [Transcribes to text, saves voice_123.txt]
Agent: [Processes transcribed text as normal message]

timestamp

Group: context Type: Processor Prerequisites: None (always available)

Prepends current date/time context to inbound messages, helping agents understand the current time in the user's timezone.

Configuration:

builtins:
  timestamp:
    enabled: true
    config:
      format: "%Y-%m-%d %H:%M %Z"  # Optional datetime format (strftime)
      template: "[Current time: {datetime}]"  # Optional prefix template

Behavior:

Automatically adds timestamp context to every message:

User: "What's the weather today?"
[Timestamp processor adds: "[Current time: 2026-02-17 14:30 PST]"]
Agent sees: "[Current time: 2026-02-17 14:30 PST]\n\nWhat's the weather today?"

Format Examples:

# ISO 8601
format: "%Y-%m-%d %H:%M:%S %Z"
# Output: [Current time: 2026-02-17 14:30:00 PST]

# Human-readable
format: "%A, %B %d, %Y at %I:%M %p %Z"
# Output: [Current time: Tuesday, February 17, 2026 at 02:30 PM PST]

Note: Timestamp formatting uses the workspace timezone (configurable in agent.yaml).
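
The format and template options map directly onto Python's strftime and str.format. The sketch below assumes that mapping; `prepend_timestamp` is a hypothetical helper, and a fixed UTC-8 offset labelled "PST" stands in for the workspace timezone so the example does not depend on system timezone data.

```python
from datetime import datetime, timedelta, timezone


def prepend_timestamp(content: str, now: datetime,
                      fmt: str = "%Y-%m-%d %H:%M %Z",
                      template: str = "[Current time: {datetime}]") -> str:
    """Format 'now' with fmt, wrap it in template, and prefix the message."""
    prefix = template.format(datetime=now.strftime(fmt))
    return f"{prefix}\n\n{content}"


# Fixed-offset zone used only for this illustration.
pst = timezone(timedelta(hours=-8), "PST")
example = prepend_timestamp("What's the weather today?",
                            datetime(2026, 2, 17, 14, 30, tzinfo=pst))
```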


docling

Group: None Type: Processor Prerequisites: docling (core dependency)

Document conversion (PDF, DOCX, PPTX, etc.) to markdown with OCR support.

Configuration:

builtins:
  docling:
    enabled: true

Behavior:

Converts documents to markdown and saves the result as a .md sibling file (e.g., report.pdf → report.md). Appends the converted markdown inline to message content.

OCR Support:

  • macOS: Uses OcrMacOptions(force_full_page_ocr=True) for scanned PDFs
  • Linux: Uses EasyOcrOptions for OCR

Usage Example:

User: [Uploads report.pdf]
FilePersistence: [Saves to data/uploads/2026-02-17/report.pdf]
Docling: [Converts to markdown, saves report.md]
Agent: [Processes markdown content]

Configuration

Global Configuration

Configure builtins in config.yaml:

builtins:
  # Allow/deny lists
  allow: []  # Empty = allow all available
  deny:
    - group:voice  # Deny all voice-related builtins

  # Individual builtin configs
  browser:
    enabled: true
    config:
      headless: true
      allowed_domains: ["calendly.com"]

  brave_search:
    enabled: true
    config:
      count: 5

  whisper:
    enabled: true
    config:
      model: whisper-1

  spawn:
    enabled: true
    config:
      max_concurrent: 8

  cron:
    enabled: true
    config:
      max_tasks: 50
      min_interval_seconds: 300

  file_persistence:
    enabled: true
    config:
      max_file_size: 52428800

Per-Workspace Configuration

Override builtin settings per workspace in agent.yaml:

# workspace1/agent.yaml - Enable web search only
builtins:
  allow:
    - brave_search
  deny:
    - elevenlabs

# workspace2/agent.yaml - Enable voice features
builtins:
  allow:
    - group:voice  # Allow all voice builtins
  deny:
    - brave_search

Allow/Deny Behavior

Empty allow list — Allow all available builtins (default)

builtins:
  allow: []  # Allow everything

Specific allow list — Only enable listed builtins

builtins:
  allow:
    - brave_search
    - whisper
  # elevenlabs is denied (not in allow list)

Group allow — Enable all builtins in a group

builtins:
  allow:
    - group:voice  # Allows whisper, elevenlabs

Deny list — Block specific builtins or groups

builtins:
  allow: []  # Allow all
  deny:
    - elevenlabs  # Except this one

Group deny — Block all builtins in a group

builtins:
  deny:
    - group:voice  # Blocks whisper, elevenlabs

Priority: Deny takes precedence over allow.
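
Putting the rules in this section together, resolution might work like the sketch below. `is_active` and `_expand` are hypothetical names, and the group map is a partial copy of the groups documented on this page.

```python
GROUPS = {
    "voice": {"whisper", "elevenlabs"},
    "web": {"brave_search"},
    "system": {"shell"},
}


def _expand(entries, groups):
    """Resolve 'group:<name>' entries into the builtin names they contain."""
    names = set()
    for entry in entries:
        if entry.startswith("group:"):
            names |= groups.get(entry[len("group:"):], set())
        else:
            names.add(entry)
    return names


def is_active(name, allow, deny, groups=GROUPS):
    # Deny takes precedence over allow.
    if name in _expand(deny, groups):
        return False
    # An empty allow list means "allow all available builtins".
    return not allow or name in _expand(allow, groups)
```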


Builtin Groups

| Group | Members |
|---|---|
| voice | whisper, elevenlabs |
| web | brave_search |
| system | shell |
| context | timestamp |
| agent | spawn, cron, task_tracker, send_message, followup, send_file |
| automation | cron_manager, acknowledge |
| browser | browser |
| memory | memory_search |
| document | md2pdf |

Usage:

builtins:
  allow:
    - group:web  # Allow all web builtins
  deny:
    - group:voice  # Deny all voice builtins

Installation

Most builtins are included in the core install. A few require optional extras:

Voice capabilities:

poetry install -E voice
Installs: openai, elevenlabs

Web capabilities:

poetry install -E web
Installs: langchain-community

Memory search:

poetry install -E memory
Installs: sqlite-vec

All optional builtins:

poetry install -E all-builtins

Core dependencies (included in base poetry install):

  • docling, easyocr, opencv-python-headless: Document conversion and OCR
  • playwright: Browser automation
  • langchain-anthropic, langchain-openai, langchain-aws, langchain-xai: LLM providers
  • weasyprint, markdown, pygments: Markdown-to-PDF conversion
  • Shell tool: No extra dependencies required


Adding Custom Builtins

You can extend OpenPaw with custom tools and processors.

Creating a Custom Tool

  1. Create tool file: openpaw/builtins/tools/my_tool.py
import os

from langchain_core.tools import StructuredTool
from openpaw.builtins.base import (
    BaseBuiltinTool,
    BuiltinMetadata,
    BuiltinType,
    BuiltinPrerequisite,
)


class MyCustomTool(BaseBuiltinTool):
    """Custom tool implementation."""

    metadata = BuiltinMetadata(
        name="my_custom_tool",
        display_name="My Custom Tool",
        description="Custom functionality for X",
        builtin_type=BuiltinType.TOOL,
        group="custom",
        prerequisites=BuiltinPrerequisite(
            env_vars=["MY_API_KEY"],
            packages=["my-package"],
        ),
    )

    def get_langchain_tool(self) -> list:
        """Return LangChain tool instances."""

        def my_tool_func(query: str) -> str:
            """Execute the tool."""
            api_key = self.config.get("api_key") or os.getenv("MY_API_KEY")
            # Implementation here: call your service with api_key and query
            result = f"Processed: {query}"  # Placeholder result
            return result

        return [
            StructuredTool.from_function(
                func=my_tool_func,
                name="my_custom_tool",
                description="What this tool does",
            )
        ]

Key Points:

  • Extend BaseBuiltinTool, not BaseTool from LangChain
  • Use StructuredTool.from_function() factory pattern
  • get_langchain_tool() returns a list (can contain multiple tools)
  • Access config via self.config

  2. Register in registry: openpaw/builtins/registry.py

try:
    from openpaw.builtins.tools.my_tool import MyCustomTool
    self.register_tool(MyCustomTool)
except ImportError as e:
    logger.debug(f"My custom tool not available: {e}")
  3. Configure in config.yaml:
builtins:
  my_custom_tool:
    enabled: true
    config:
      option1: value1
  4. Set environment variable:
export MY_API_KEY="your-key"

Creating a Custom Processor

  1. Create processor file: openpaw/builtins/processors/my_processor.py
from openpaw.builtins.base import (
    BaseBuiltinProcessor,
    BuiltinMetadata,
    BuiltinType,
    BuiltinPrerequisite,
    ProcessorResult,
)
from openpaw.domain.message import Message


class MyCustomProcessor(BaseBuiltinProcessor):
    """Custom message processor."""

    metadata = BuiltinMetadata(
        name="my_processor",
        display_name="My Processor",
        description="Processes messages before agent sees them",
        builtin_type=BuiltinType.PROCESSOR,
        group="custom",
        prerequisites=BuiltinPrerequisite(
            env_vars=["MY_API_KEY"],
        ),
    )

    async def process_inbound(self, message: Message) -> ProcessorResult:
        """Transform the message."""
        # Access config
        option = self.config.get("option1", "default")

        # Transform message content
        message.content = f"[Processed] {message.content}"

        return ProcessorResult(message=message)
  2. Register in registry: openpaw/builtins/registry.py
try:
    from openpaw.builtins.processors.my_processor import MyCustomProcessor
    self.register_processor(MyCustomProcessor)
except ImportError as e:
    logger.debug(f"My processor not available: {e}")
  3. Configure and use:
builtins:
  my_processor:
    enabled: true
    config:
      option1: value1

Processors run automatically on all messages in the channel layer.


Best Practices

1. Use Environment Variables for Secrets

Never hardcode API keys:

# Bad
builtins:
  brave_search:
    config:
      api_key: "actual-key-here"  # Don't do this

# Good — relies on BRAVE_API_KEY environment variable
builtins:
  brave_search:
    enabled: true

2. Install Only Needed Extras

Minimize dependencies:

# Only need voice features
poetry install -E voice

# Don't install all if you only need some

3. Deny Unused Builtins

Reduce attack surface:

builtins:
  deny:
    - elevenlabs  # Don't need TTS in this workspace
    - group:system  # Disable shell for security

Security Note: The shell tool should be denied unless explicitly needed. It is disabled by default and requires explicit enablement.

4. Use Groups for Bulk Operations

Simplify configuration:

# Instead of denying individual tools:
builtins:
  deny:
    - whisper
    - elevenlabs

# Use group deny:
builtins:
  deny:
    - group:voice

5. Test Prerequisites

Verify API keys before deploying:

# Test OpenAI key
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Test ElevenLabs key
curl https://api.elevenlabs.io/v1/voices \
  -H "xi-api-key: $ELEVENLABS_API_KEY"

Troubleshooting

Builtin not available:

  • Check environment variable is set: echo $OPENAI_API_KEY
  • Verify extras are installed: poetry install -E voice
  • Check allow/deny lists in config
  • Enable verbose logging: poetry run openpaw -w agent -v

API key errors:

  • Verify key is valid and active
  • Check API quota/billing status
  • Test key with curl (see examples above)

Import errors:

  • Missing optional dependency
  • Run poetry install -E <extra-name>
  • Check pyproject.toml for correct package versions

Processor not running:

  • Verify enabled: true in config
  • Check processor isn't denied
  • Ensure processor is registered in registry
  • Check logs for initialization errors

Tool not available to agent:

  • Verify tool prerequisites are met
  • Check allow/deny lists
  • Tool must be properly registered
  • Agent must have permission to use tools (model capability)


Security Considerations

Shell Tool

The shell tool provides powerful system access and requires careful configuration:

  • Disabled by default — must explicitly enable in config
  • Use allowed_commands for strict allowlisting when possible
  • Default blocked_commands list prevents common dangerous operations
  • Consider constraining working_directory to a sandbox
  • Never enable in untrusted environments

Best Practices:

  1. Enable the shell tool only in workspaces that need it
  2. Use a group:system deny rule in untrusted workspaces
  3. Configure minimal allowed_commands
  4. Monitor logs for blocked command attempts

Browser Tool

Domain Security:

  • Use an allowed_domains allowlist for production workspaces
  • blocked_domains takes precedence over the allowlist
  • Wildcard subdomain support with *.example.com
  • Consider persist_cookies: false to avoid session leakage

Best Practices:

  1. Configure a domain allowlist for untrusted agents
  2. Monitor the workspace/downloads/ directory for unexpected files
  3. Set reasonable timeout values
  4. Review screenshot captures for sensitive data