AI generated first iteration
# ws-sanctum-chronicler

# The Sanctum Chronicler

A Dockerized Python MVP for an AI stream assistant that monitors Twitch chat, gently guides conversation, stores stream events, and exports a post-stream markdown ledger.

## Overview

**The Sanctum Chronicler** is an intelligent assistant designed to enhance live streaming by:

- Monitoring real-time chat and stream activity
- Maintaining a warm, non-intrusive presence during streams
- Flagging suspicious content and spam patterns
- Archiving discussion highlights and clip candidates
- Generating post-stream ledgers and blog ideas

The system uses a multi-mode agent architecture, where different "personas" handle different aspects of stream management:

- **Hearthkeeper** - Gently prompts chat when it's quiet
- **Steward** - Responds thoughtfully to engagement
- **Warden** - Detects suspicious content and spam
- **Librarian** - Archives important discussion
- **Scribe** - Compiles post-stream ledgers

## Architecture

### Project Structure

```
sanctum-agent/
├── app/
│   ├── main.py              # FastAPI application
│   ├── config.py            # Configuration (pydantic-settings)
│   ├── twitch/
│   │   ├── eventsub.py      # Twitch EventSub client (stub)
│   │   └── chat.py          # Chat message handling (stub)
│   ├── agent/
│   │   ├── orchestrator.py  # Main agent orchestrator
│   │   ├── policies.py      # Behavior policies
│   │   └── modes/
│   │       ├── hearthkeeper.py
│   │       ├── steward.py
│   │       ├── warden.py
│   │       ├── librarian.py
│   │       └── scribe.py
│   ├── memory/
│   │   ├── database.py      # Async SQLAlchemy setup
│   │   ├── models.py        # Database models
│   │   └── repository.py    # Data access layer
│   ├── llm/
│   │   ├── client.py        # Pluggable LLM client
│   │   └── prompts.py       # Prompt templates
│   └── exports/
│       └── markdown.py      # Markdown ledger generation
├── exports/                 # Generated ledgers
├── data/                    # Local data storage
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── .env.example
└── README.md
```

### Tech Stack

- **Python 3.12** - Core language
- **FastAPI** - REST API framework
- **SQLAlchemy + asyncpg** - Async database ORM
- **PostgreSQL** - Primary data store
- **Docker & Docker Compose** - Containerization
- **Pydantic** - Configuration and validation

### Key Design Patterns

**Agent Modes:** Each mode operates independently but shares access to:

- The LLM client for text generation
- The database repository for persistence
- Shared policies for behavior control

**Policies:** Encapsulate decision logic:

- `ChatActivityPolicy` - Tracks inactivity periods
- `ResponseSuppression` - Avoids speaking during active chat
- `SuspiciousContentPolicy` - Pattern matching for spam/scams

**Async Architecture:** All I/O operations are non-blocking:

- Database queries use `asyncpg`
- FastAPI endpoints handle concurrent requests
- LLM calls are async, ready for real provider APIs to be integrated later

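The mode pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app/agent/`: the names `ModeContext`, `AgentMode`, `EchoLLM`, and `dispatch` are hypothetical, the shared repository is reduced to an in-memory list, and the trigger rules for Warden and Steward are simplified assumptions.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Protocol


class LLMClient(Protocol):
    """Anything with an async generate() can serve as the shared LLM client."""

    async def generate(self, prompt: str) -> str: ...


class EchoLLM:
    """Hypothetical stand-in for the pluggable LLM client."""

    async def generate(self, prompt: str) -> str:
        return f"[mock] {prompt}"


@dataclass
class ModeContext:
    """Dependencies shared by every mode (illustrative names)."""

    llm: LLMClient
    actions: list[str] = field(default_factory=list)  # stands in for the repository


class AgentMode:
    name = "base"

    def __init__(self, ctx: ModeContext) -> None:
        self.ctx = ctx

    async def handle(self, message: str) -> None:
        raise NotImplementedError


class Warden(AgentMode):
    name = "warden"

    async def handle(self, message: str) -> None:
        # Assumed rule: flag one known spam marker.
        if "discord.gg" in message.lower():
            self.ctx.actions.append(f"{self.name}: flagged {message!r}")


class Steward(AgentMode):
    name = "steward"

    async def handle(self, message: str) -> None:
        # Assumed rule: treat questions as engagement worth a reply.
        if message.rstrip().endswith("?"):
            reply = await self.ctx.llm.generate(message)
            self.ctx.actions.append(f"{self.name}: {reply}")


async def dispatch(modes: list[AgentMode], message: str) -> None:
    # Modes run concurrently and never call each other directly.
    await asyncio.gather(*(mode.handle(message) for mode in modes))
```

Each mode only touches the shared context, so a mode can be added or removed without changing the others.
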
## Setup & Quick Start

### Prerequisites

- Docker & Docker Compose
- Python 3.12 (for local development)
- PostgreSQL 16 (or use Docker)

### 1. Clone Repository

```bash
git clone <REPOSITORY_URL>
cd ws-sanctum-chronicler
```

### 2. Configure Environment

```bash
cp .env.example .env
# Edit .env with your settings (Twitch tokens, LLM provider, etc.)
```

### 3. Start Services

```bash
docker-compose up --build
```

The API will be available at `http://localhost:8000`.

### 4. Test the API

**Health Check:**

```bash
curl http://localhost:8000/health
```

**Start Session:**

```bash
curl -X POST http://localhost:8000/admin/session/start \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "channel_name=example_channel"
```

**Send Test Message:**

```bash
# --data-urlencode encodes the space and "!" in the message text
curl -X POST http://localhost:8000/admin/test-message \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "session_id=<SESSION_ID>" \
  -d "username=test_user" \
  --data-urlencode "message=Hello stream!"
```

**Get Ledger:**

```bash
# Quote the URL so the shell does not interpret "?" or the angle brackets
curl "http://localhost:8000/admin/ledger?session_id=<SESSION_ID>"
```

**End Session:**

```bash
curl -X POST http://localhost:8000/admin/session/end \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "session_id=<SESSION_ID>"
```

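The same form-encoded requests can be built from Python with only the standard library. The sketch below assumes the endpoints accept `application/x-www-form-urlencoded` bodies exactly as in the curl examples above; `form_request` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # address used in the Quick Start


def form_request(path: str, **fields: str) -> Request:
    """Build (but do not send) a POST request with a URL-encoded form body."""
    body = urlencode(fields).encode("utf-8")
    return Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = form_request(
    "/admin/test-message",
    session_id="SESSION_ID",  # placeholder: use the id returned by session/start
    username="test_user",
    message="Hello stream!",
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` sends it, which requires the services from step 3 to be running.
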
## API Endpoints

### Health & Status

- `GET /health` - Application health check

### Session Management

- `POST /admin/session/start?channel_name=<name>` - Start stream session
- `POST /admin/session/end?session_id=<id>` - End stream session

### Testing & Admin

- `POST /admin/test-message?session_id=<id>&username=<user>&message=<msg>` - Send test message
- `GET /admin/ledger?session_id=<id>` - Retrieve markdown ledger

## Configuration

All settings are loaded from environment variables (see `.env.example`):

### Application

- `APP_NAME` - Application display name
- `APP_ENV` - Environment (development/production)
- `DEBUG` - Enable debug logging

### Database

- `DATABASE_URL` - PostgreSQL connection string
- `DB_PASSWORD` - Database password (for docker-compose)

### Twitch (Optional - Stubs Present)

- `TWITCH_CLIENT_ID` - Twitch OAuth client ID
- `TWITCH_CLIENT_SECRET` - Twitch OAuth secret
- `TWITCH_BOT_USERNAME` - Bot username
- `TWITCH_CHANNEL_NAME` - Channel to monitor

### LLM

- `LLM_PROVIDER` - Provider: `openai`, `ollama`, `lm_studio`, or empty for mock
- `LLM_BASE_URL` - API endpoint (for local providers)
- `LLM_API_KEY` - API key (if needed)
- `LLM_MODEL` - Model identifier (default: `gpt-3.5-turbo`)

### Export

- `EXPORT_PATH` - Directory for ledger exports

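To illustrate how these variables map onto typed settings, here is a standard-library stand-in. The project itself uses pydantic-settings in `app/config.py`, so this is only a sketch: the `Settings` class, the `from_env` helper, and every default except the documented `LLM_MODEL` default are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings used in .env files."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    app_name: str
    app_env: str
    debug: bool
    database_url: str
    llm_provider: str
    llm_model: str
    export_path: str

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        return cls(
            app_name=env.get("APP_NAME", "The Sanctum Chronicler"),  # assumed default
            app_env=env.get("APP_ENV", "development"),  # assumed default
            debug=_as_bool(env.get("DEBUG", "false")),  # assumed default
            database_url=env["DATABASE_URL"],  # treated as required in this sketch
            llm_provider=env.get("LLM_PROVIDER", ""),  # empty selects mock mode
            llm_model=env.get("LLM_MODEL", "gpt-3.5-turbo"),  # documented default
            export_path=env.get("EXPORT_PATH", "./exports"),  # assumed default
        )
```

Accepting a mapping rather than reading `os.environ` directly keeps the loader easy to exercise in tests with a plain dictionary.
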
## Agent Policies

### Chat Activity Policy

- **Inactivity Threshold:** 15 minutes
- **Hearthkeeper Activation:** Sends gentle prompt when no messages for 15+ minutes
- **Human Override:** Hearthkeeper stays silent if chat is active (5+ messages/minute)

### Response Suppression

- **Active Chat Threshold:** 5 messages per minute
- **Behavior:** Agent suppresses responses when humans are actively talking
- **Rationale:** Respects human conversation and avoids noise

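The two thresholds above can be expressed with a small sliding-window tracker. This is a sketch under assumptions: `ChatActivityTracker` and its method names are hypothetical, and the inactivity clock is assumed to start at session start when nobody has spoken yet.

```python
from collections import deque
from datetime import datetime, timedelta

INACTIVITY_THRESHOLD = timedelta(minutes=15)  # from the Chat Activity Policy
ACTIVE_CHAT_THRESHOLD = 5  # messages per minute, from Response Suppression


class ChatActivityTracker:
    def __init__(self, started_at: datetime) -> None:
        self._recent: deque[datetime] = deque()
        self._last_activity = started_at

    def record(self, at: datetime) -> None:
        """Register one human chat message."""
        self._recent.append(at)
        self._last_activity = at

    def messages_last_minute(self, now: datetime) -> int:
        # Drop timestamps that fell out of the one-minute window.
        cutoff = now - timedelta(minutes=1)
        while self._recent and self._recent[0] <= cutoff:
            self._recent.popleft()
        return len(self._recent)

    def should_suppress(self, now: datetime) -> bool:
        """True while humans are actively talking."""
        return self.messages_last_minute(now) >= ACTIVE_CHAT_THRESHOLD

    def should_prompt(self, now: datetime) -> bool:
        """True once chat has been silent for the inactivity threshold."""
        return now - self._last_activity >= INACTIVITY_THRESHOLD
```

Passing `now` explicitly instead of calling `datetime.now()` inside the class makes the timing rules deterministic to test.
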
### Suspicious Content Detection

**Patterns Detected:**

- "join our discord", "discord.gg" (growth spam)
- "grow your channel", "easy money" (scams)
- Multiple URLs (spam)
- Common scam keywords

**Actions:** Warden flags suspicious messages for review; it does not delete them automatically.

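A minimal sketch of this kind of pattern matching follows. The phrases come from the list above; the function name `suspicious_reasons`, the URL regex, and the reading of "multiple URLs" as more than one link are assumptions rather than the Warden's actual rules.

```python
import re

SUSPICIOUS_PHRASES = (
    "join our discord",
    "discord.gg",
    "grow your channel",
    "easy money",
)
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
MAX_URLS = 1  # assumption: more than one link counts as "multiple URLs"


def suspicious_reasons(message: str) -> list[str]:
    """Return the reasons a message looks suspicious (empty list if none)."""
    lowered = message.lower()
    reasons = [f"phrase: {p}" for p in SUSPICIOUS_PHRASES if p in lowered]
    if len(URL_PATTERN.findall(message)) > MAX_URLS:
        reasons.append("multiple URLs")
    return reasons
```

Returning reasons instead of a bare boolean fits the flag-only behavior: the reasons can be stored with the flagged message for a human moderator to review.
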
## Database Schema

### StreamSession
- `id` (UUID) - Primary key
- `channel_name` - Twitch channel
- `started_at` - Session start time
- `ended_at` - Session end time (null if active)
- `theme` - Stream theme
- `is_active` - Boolean flag

### ChatMessage
- `id` (UUID)
- `session_id` - Reference to session
- `username` - Message author
- `content` - Message text
- `timestamp` - Message time
- `is_bot`, `is_moderator` - Flags

### AgentAction
- `id` (UUID)
- `session_id` - Reference to session
- `action_type` - RESPONSE, FLAG_SUSPICIOUS, ARCHIVE_CLIP, etc.
- `mode` - Which agent mode took action
- `triggered_by_message_id` - Message that triggered action
- `description` - Action details

### ClipCandidate
- `id` (UUID)
- `session_id`, `message_id`
- `reason` - Why it's clip-worthy

### BlogSeed
- `id` (UUID)
- `session_id`
- `topic`, `description`
- `related_messages` - JSON array

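To make the relationship between sessions and messages concrete, the first two entities can be mirrored as plain dataclasses. This is only a shape sketch: the real models live in `app/memory/models.py` as SQLAlchemy classes, and the defaults and the `end()` helper here are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class StreamSession:
    channel_name: str
    theme: Optional[str] = None
    id: UUID = field(default_factory=uuid4)
    started_at: datetime = field(default_factory=_now)
    ended_at: Optional[datetime] = None  # stays None while the session is active
    is_active: bool = True

    def end(self) -> None:
        """Close the session (hypothetical helper)."""
        self.ended_at = _now()
        self.is_active = False


@dataclass
class ChatMessage:
    session_id: UUID  # reference to StreamSession.id
    username: str
    content: str
    id: UUID = field(default_factory=uuid4)
    timestamp: datetime = field(default_factory=_now)
    is_bot: bool = False
    is_moderator: bool = False
```

`AgentAction`, `ClipCandidate`, and `BlogSeed` would follow the same pattern, each carrying a `session_id` that points back to its session.
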
## LLM Integration

The system includes a pluggable LLM client that currently:

- Generates mock responses when no provider is configured
- Prepares for OpenAI, Ollama, and LM Studio integration

**Current Mock Behavior:**

- Returns deterministic responses based on keywords
- Useful for testing without API costs

**Implementing Real Providers:**

See `app/llm/client.py` for TODO comments marking where to integrate:

- `_generate_openai()` - OpenAI API calls
- `_generate_ollama()` - Ollama local API
- `_generate_lm_studio()` - LM Studio API

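The provider dispatch with a mock fallback could look roughly like this. The method names `_generate_openai`, `_generate_ollama`, and `_generate_lm_studio` come from the list above; everything else (the class layout, `MOCK_REPLIES`, `DEFAULT_REPLY`, and the choice to reject unknown providers) is an assumption and may differ from `app/llm/client.py`.

```python
import asyncio

# Hypothetical keyword table for the deterministic mock mode.
MOCK_REPLIES = {
    "hello": "Welcome to the sanctum!",
    "quiet": "What has everyone been reading lately?",
}
DEFAULT_REPLY = "Noted."


class LLMClient:
    def __init__(self, provider: str = "") -> None:
        self.provider = provider.strip().lower()

    async def generate(self, prompt: str) -> str:
        handlers = {
            "": self._generate_mock,  # empty provider selects mock mode
            "openai": self._generate_openai,
            "ollama": self._generate_ollama,
            "lm_studio": self._generate_lm_studio,
        }
        if self.provider not in handlers:
            raise ValueError(f"Unknown LLM provider: {self.provider!r}")
        return await handlers[self.provider](prompt)

    async def _generate_mock(self, prompt: str) -> str:
        # Deterministic: the first matching keyword decides the reply.
        lowered = prompt.lower()
        for keyword, reply in MOCK_REPLIES.items():
            if keyword in lowered:
                return reply
        return DEFAULT_REPLY

    async def _generate_openai(self, prompt: str) -> str:
        raise NotImplementedError("TODO: call the OpenAI API")

    async def _generate_ollama(self, prompt: str) -> str:
        raise NotImplementedError("TODO: call the local Ollama API")

    async def _generate_lm_studio(self, prompt: str) -> str:
        raise NotImplementedError("TODO: call the LM Studio API")
```

Because every provider sits behind the same `generate()` coroutine, the agent modes do not need to change when a real provider replaces the mock.
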
## Current Limitations

This is **scaffolding, not production code**:

- Twitch EventSub connection is a stub (see TODO comments)
- Chat sending is not implemented
- LLM providers are not integrated yet (mock mode works)
- No real OAuth flow for Twitch
- Database tables are created automatically at startup; there are no versioned migrations
- No rate limiting on endpoints
- No authentication/authorization
- Ledger export is basic markdown (no formatting options)

## Next Implementation Steps

### Phase 1: Core Features (Recommended)

1. **Implement real Twitch integration:**
   - Implement EventSub WebSocket connection in `app/twitch/eventsub.py`
   - Implement send chat message API in `app/twitch/chat.py`
   - Add OAuth token exchange flow

2. **Integrate real LLM provider:**
   - Choose provider (e.g., Ollama for self-hosted)
   - Implement `_generate_ollama()` or `_generate_openai()`
   - Test with actual model

3. **Enhance agent modes:**
   - Refine Hearthkeeper timing logic
   - Implement Steward mention detection
   - Expand Warden pattern library
   - Complete Librarian topic extraction

### Phase 2: User Experience

1. **Add UI/Dashboard:**
   - Stream monitoring view
   - Ledger generation UI
   - Settings panel

2. **Improve exports:**
   - Configurable markdown templates
   - JSON export option
   - Email distribution

3. **Add persistence:**
   - Session history
   - Settings storage per channel
   - Analytics dashboard

### Phase 3: Production Readiness

1. **Testing:**
   - Unit tests for policies
   - Integration tests for agent modes
   - E2E tests for full flows

2. **DevOps:**
   - Database migrations (Alembic)
   - Logging aggregation
   - Monitoring/alerting

3. **Performance:**
   - Rate limiting
   - Caching for repeated LLM calls
   - Message deduplication

## Development

### Local Setup (Without Docker)

```bash
# Create virtual environment
python3.12 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
cp .env.example .env

# No separate migration step: tables are created on app startup
# Start app
python -m uvicorn app.main:app --reload
```

### Database Access

```bash
# Connect to running PostgreSQL
docker-compose exec sanctum-db psql -U sanctum -d sanctum

# Inside psql: view tables
\dt

# Inside psql: query sessions
SELECT id, channel_name, started_at FROM stream_sessions;
```

### Logging

The application uses Python's standard logging. Configure in `app/main.py`:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## Contributing

When adding features:

- Maintain async/await patterns throughout
- Add type hints to all functions
- Include docstrings that state the purpose, plus TODO comments for future work
- Keep modes independent of one another; share functionality through the common LLM client, repository, and policies

## License

(Add your license here)

## Support

For issues or questions:

1. Check TODO comments in relevant files
2. Review the architecture overview
3. File an issue with reproduction steps