Claude Agent SDK App
Long-running agent server with Claude Agent SDK
Status: PLANNED
Priority: MEDIUM
Target Start: Iteration 151+
Overview
A container-based, long-running agent server built on the Claude Agent SDK. It is triggered by Tier 1 agents via Workflows and provides:
- Full filesystem access (code operations)
- Shell tools (bash, git, gh CLI)
- Long-running tasks (minutes to hours)
- Heavy compute operations
Architecture Design
System Architecture
Communication Flow
Implementation Plan
Phase 1: Foundation (Iteration 151-160)
Target: Basic agent server with Claude Agent SDK
Tasks
- Set up Node.js/Bun project structure
- Install Claude Agent SDK (@anthropic-ai/sdk-agent)
- Create agent server with Hono/Fastify
- Implement basic tool system
- Add filesystem tools (read, write, list, delete)
- Add bash tool for shell execution
- Add git tool (clone, status, commit, push)
- Implement session management
- Add health check endpoints
- Write basic tests
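The "basic tool system" task above could start from a small registry that maps tool names to handlers. The sketch below is only an illustration: `Tool`, `ToolResult`, and `ToolRegistry` are hypothetical names, and the Claude Agent SDK defines its own tool interface, which this does not attempt to reproduce.

```typescript
// Hypothetical shapes for illustration; not the Claude Agent SDK's tool API.
type ToolResult = { ok: true; output: string } | { ok: false; error: string };

interface Tool {
  name: string;
  description: string;
  run(input: Record<string, unknown>): ToolResult;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  list(): string[] {
    return Array.from(this.tools.keys());
  }

  // Dispatch by name; unknown tools and thrown errors become error results
  call(name: string, input: Record<string, unknown>): ToolResult {
    const tool = this.tools.get(name);
    if (!tool) return { ok: false, error: `Unknown tool: ${name}` };
    try {
      return tool.run(input);
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}

// Example registration with a trivial tool
const registry = new ToolRegistry();
registry.register({
  name: "echo",
  description: "Returns its `text` input unchanged",
  run: (input) => ({ ok: true, output: String(input.text ?? "") }),
});
```

Returning a result object instead of throwing keeps a failing tool from crashing the agent loop; the caller decides whether to retry or report the error.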
File Structure
Phase 2: Tool Implementation (Iteration 161-170)
Target: Full tool suite for code operations
Bash Tool
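A minimal sketch of a bash tool, assuming commands run through `bash -c` in a task's working directory with a hard timeout. `runBash` and `BashResult` are hypothetical names, and the default limits are placeholders rather than values from this plan.

```typescript
import { spawnSync } from "node:child_process";

interface BashResult {
  exitCode: number | null; // null when the process was killed or never started
  stdout: string;
  stderr: string;
  timedOut: boolean;
}

// Hypothetical helper: run one command via `bash -c` with a timeout and output cap.
function runBash(command: string, cwd: string = process.cwd(), timeoutMs: number = 60000): BashResult {
  const proc = spawnSync("bash", ["-c", command], {
    cwd,
    timeout: timeoutMs,          // child is killed once this elapses
    encoding: "utf8",
    maxBuffer: 10 * 1024 * 1024, // cap captured output at 10 MiB
  });
  const code = (proc.error as NodeJS.ErrnoException | undefined)?.code;
  return {
    exitCode: proc.status,
    stdout: proc.stdout ?? "",
    stderr: proc.stderr ?? "",
    timedOut: code === "ETIMEDOUT",
  };
}
```

`spawnSync` blocks the event loop, which keeps the sketch short but is unsuitable for tasks that run for minutes to hours; a server would more likely use the asynchronous `spawn` and stream output.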
Git Tool
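One possible shape for the git tool covers the four operations named in Phase 1 (clone, status, commit, push) and builds an argument array instead of a shell string, so user-supplied values are never interpreted by a shell. `GitOp`, `gitArgs`, and `runGit` are hypothetical names.

```typescript
import { execFileSync } from "node:child_process";

// The four operations named in the Phase 1 task list
type GitOp =
  | { op: "clone"; url: string; dir: string }
  | { op: "status" }
  | { op: "commit"; message: string }
  | { op: "push"; remote?: string; branch?: string };

// Translate a typed operation into an argv array (never a shell string)
function gitArgs(req: GitOp): string[] {
  switch (req.op) {
    case "clone":
      // "--" stops git from treating a URL that starts with "-" as an option
      return ["clone", "--", req.url, req.dir];
    case "status":
      return ["status", "--porcelain"];
    case "commit":
      return ["commit", "-m", req.message];
    case "push":
      return ["push", req.remote ?? "origin", ...(req.branch ? [req.branch] : [])];
  }
}

// Run git in the given working directory and return its stdout
function runGit(req: GitOp, cwd: string): string {
  return execFileSync("git", gitArgs(req), { cwd, encoding: "utf8" });
}
```

Restricting the tool to a closed set of operations also gives audit logs a small, predictable vocabulary.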
Filesystem Tool
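The read, write, list, and delete operations from Phase 1 could be scoped to a single workspace directory, as in the sketch below. `WorkspaceFs` is a hypothetical class; the path check is lexical only, so a symlink inside the workspace that points elsewhere would still need separate handling (for example with `fs.realpathSync`).

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

class WorkspaceFs {
  private readonly root: string;

  constructor(root: string) {
    this.root = path.resolve(root);
  }

  // Resolve a relative path and reject anything that escapes the workspace root
  private resolve(rel: string): string {
    const abs = path.resolve(this.root, rel);
    if (abs !== this.root && !abs.startsWith(this.root + path.sep)) {
      throw new Error(`Path escapes workspace: ${rel}`);
    }
    return abs;
  }

  read(rel: string): string {
    return fs.readFileSync(this.resolve(rel), "utf8");
  }

  write(rel: string, content: string): void {
    const abs = this.resolve(rel);
    fs.mkdirSync(path.dirname(abs), { recursive: true }); // create parent dirs as needed
    fs.writeFileSync(abs, content, "utf8");
  }

  list(rel: string = "."): string[] {
    return fs.readdirSync(this.resolve(rel)).sort();
  }

  remove(rel: string): void {
    fs.rmSync(this.resolve(rel), { recursive: true, force: true });
  }
}

// Example: a throwaway workspace under the OS temp directory
const ws = new WorkspaceFs(fs.mkdtempSync(path.join(os.tmpdir(), "task-")));
```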
Phase 3: Workflow Integration (Iteration 171-180)
Target: Cloudflare Workers integration
Workflow API
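Because tasks may run for minutes to hours, a Workflow would probably submit a task and poll for its status rather than hold a request open. The sketch below models that with an in-memory store; `TaskStore`, the status names, and the `prompt` field are all assumptions, not a defined API.

```typescript
import { randomUUID } from "node:crypto";

type TaskStatus = "queued" | "running" | "succeeded" | "failed";

interface Task {
  id: string;
  prompt: string;
  status: TaskStatus;
  result?: string;
  createdAt: number;
}

// Allowed forward transitions; terminal states have none
const NEXT: Record<TaskStatus, TaskStatus[]> = {
  queued: ["running"],
  running: ["succeeded", "failed"],
  succeeded: [],
  failed: [],
};

class TaskStore {
  private tasks = new Map<string, Task>();

  // Called when a Workflow submits work; returns immediately with a task id
  submit(prompt: string): Task {
    const task: Task = { id: randomUUID(), prompt, status: "queued", createdAt: Date.now() };
    this.tasks.set(task.id, task);
    return task;
  }

  // Called when a Workflow polls for progress
  get(id: string): Task | undefined {
    return this.tasks.get(id);
  }

  transition(id: string, to: TaskStatus, result?: string): Task {
    const task = this.tasks.get(id);
    if (!task) throw new Error(`Unknown task: ${id}`);
    if (!NEXT[task.status].includes(to)) {
      throw new Error(`Invalid transition ${task.status} -> ${to}`);
    }
    task.status = to;
    if (result !== undefined) task.result = result;
    return task;
  }
}
```

An in-memory map loses tasks on restart, so a deployed server would need durable storage behind the same interface.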
Phase 4: Deployment (Iteration 181-190)
Target: Production-ready deployment
Deployment Options
- Fly.io (Recommended)
  - Simple deployment
  - Built-in secrets management
  - Auto-scaling
  - Volume storage for workspaces
- Railway
  - GitHub integration
  - Built-in Postgres
  - Easy scaling
- Self-hosted VPS
  - Full control
  - Custom domain
  - Manual setup
Deployment Configuration
Security Considerations
Isolation
- Sandboxed execution: Use containers/chroot for file operations
- Resource limits: CPU, memory, disk quotas
- Network restrictions: Limit outbound connections
- Timeout enforcement: Maximum execution time per task
Authentication
- API keys: Secure secret management
- Request signing: Verify requests from Workers
- Rate limiting: Prevent abuse
- Audit logging: Track all operations
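Request signing between Workers and the agent server could use an HMAC over a timestamp and the request body. This is a sketch under assumptions: the `timestamp.body` message layout, the hex encoding, and the five-minute skew window are illustrative choices, not a decided protocol.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sender side: HMAC-SHA256 over "<timestamp>.<body>", hex-encoded
function sign(secret: string, timestamp: string, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

// Receiver side: reject stale timestamps, then compare signatures in constant time
function verify(
  secret: string,
  timestamp: string,
  body: string,
  signature: string,
  nowMs: number = Date.now(),
  maxSkewMs: number = 5 * 60 * 1000,
): boolean {
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowMs - ts) > maxSkewMs) return false;
  const expected = Buffer.from(sign(secret, timestamp, body), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Including the timestamp in the signed message limits how long a captured request can be replayed.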
Filesystem Safety
- Workspace isolation: Each task in separate directory
- Path validation: Prevent directory traversal
- File size limits: Prevent disk exhaustion
- Cleanup jobs: Remove old workspaces
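The cleanup job could be as simple as deleting workspace directories that have not been modified within a retention window. In this sketch `cleanupWorkspaces` and `WORKSPACE_ROOT` are hypothetical, and the check relies on directory mtime, which only changes when entries directly inside the directory are added or removed.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical location; a deployment would point this at its workspace volume
const WORKSPACE_ROOT = path.join(os.tmpdir(), "agent-workspaces");

// Remove directories under `root` whose mtime is older than maxAgeMs.
// Returns the names of the removed directories, sorted.
function cleanupWorkspaces(root: string, maxAgeMs: number, nowMs: number = Date.now()): string[] {
  const removed: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue; // leave stray files alone
    const dir = path.join(root, entry.name);
    if (nowMs - fs.statSync(dir).mtimeMs > maxAgeMs) {
      fs.rmSync(dir, { recursive: true, force: true });
      removed.push(entry.name);
    }
  }
  return removed.sort();
}
```

A server might call `cleanupWorkspaces(WORKSPACE_ROOT, retentionMs)` on a timer and record the returned names in the audit log.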
Monitoring & Observability
Metrics
- Task queue depth
- Task execution time (P50, P95, P99)
- Success/error rates
- Resource usage (CPU, memory, disk)
- Tool usage statistics
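For the execution-time metric, P50/P95/P99 can be computed from recorded durations. The sketch below uses the nearest-rank definition, which is one of several common percentile definitions; a metrics backend may interpolate instead. `percentile` and `summarize` are hypothetical names.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b); // copy so the input is not mutated
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Summary of task execution times in milliseconds
function summarize(durationsMs: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(durationsMs, 50),
    p95: percentile(durationsMs, 95),
    p99: percentile(durationsMs, 99),
  };
}
```

With few samples the upper percentiles collapse to the maximum, so P99 only becomes informative once at least a hundred tasks have been recorded.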
Logging
- Structured JSON logs
- Log levels: error, warn, info, debug
- Request/response logging
- Error stack traces
- Workflow state changes
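The logging requirements above (JSON lines, four levels, stack traces for errors) fit in a small logger factory. This is an illustrative sketch: `createLogger` and the `time`, `level`, and `message` field names are assumptions, and an existing logging library would serve equally well.

```typescript
type Level = "error" | "warn" | "info" | "debug";

// Lower number means more severe
const SEVERITY: Record<Level, number> = { error: 0, warn: 1, info: 2, debug: 3 };

// Returns a log function that emits one JSON object per line.
// Messages less severe than `minLevel` are dropped.
function createLogger(minLevel: Level, write: (line: string) => void = console.log) {
  return (level: Level, message: string, fields: Record<string, unknown> = {}): void => {
    if (SEVERITY[level] > SEVERITY[minLevel]) return;
    const err = fields.error;
    const record = {
      ...fields,
      // Error objects do not serialize to JSON usefully, so flatten them
      ...(err instanceof Error ? { error: err.message, stack: err.stack } : {}),
      // Fixed keys come last so caller-supplied fields cannot override them
      time: new Date().toISOString(),
      level,
      message,
    };
    write(JSON.stringify(record));
  };
}
```

Passing the `write` function in makes it easy to redirect output to a file or collector, and to capture lines in tests.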
Alerts
- Task failures
- High queue depth
- Resource exhaustion
- Service unavailability
Usage Examples
Example 1: Repository Refactoring
Example 2: Long-Running Test Suite
Next Steps
Immediate (Iteration 126-150)
- ✅ Complete skeleton screens
- ✅ Complete MCP server tests
- ✅ Add local MCP server implementations
- Continue with queued TODO.md items
- Implement API security enhancements
- Add performance optimizations
For Agent Server (Iteration 151+)
- Finalize architecture design
- Set up project structure
- Implement Claude Agent SDK integration
- Add tool implementations
- Deploy to staging
- Test with Telegram bot
- Production deployment
Last Updated: 2025-12-30
Iteration: 126
Status: Planning complete, ready for implementation