Goondex/docs/ARCHITECTURE.md
Team Goon f7d82cd207 v0.1.0-dev2: Full TPDB integration with auto-fetch and comprehensive docs
Major Features:
- Complete TPDB scraper implementation with real API calls
- Auto-fetch on cache miss: search commands now automatically import from TPDB when not found locally
- Comprehensive documentation (5 markdown files for Bookstack)
- Import commands for performers, studios, and scenes
- Fixed JSON type mismatches (aliases array, studio numeric IDs)

Changes:
1. TPDB Scraper (internal/scraper/tpdb/):
   - types.go: Full API response structures with correct types
     - PerformerResponse.Aliases: string → []string (TPDB returns array)
     - StudioResponse.ID: string → int (TPDB returns numeric IDs)
     - SiteInfo.ID: string → int (scenes reference studios by number)
   - mapper.go: Maps TPDB responses to internal models
     - Converts aliases array to comma-separated string
     - Converts numeric studio IDs to strings using strconv.Itoa()
   - scraper.go: Real HTTP client with Bearer token auth
     - SearchPerformers, SearchStudios, SearchScenes implemented
     - GetPerformerByID, GetStudioByID, GetSceneByID implemented

2. CLI Auto-Fetch (cmd/goondex/main.go):
   - performer-search: Auto-fetches from TPDB if local DB empty
   - studio-search: Auto-fetches from TPDB if local DB empty
   - scene-search: Auto-fetches basic metadata (no relationships)
   - Graceful handling of missing TPDB_API_KEY
   - Import → search again to get local IDs

3. Documentation (docs/):
   - INDEX.md: Documentation overview and navigation
   - ARCHITECTURE.md: System design, data flow, component diagrams
   - DATABASE_SCHEMA.md: Complete schema with relationships and indexes
   - CLI_REFERENCE.md: All commands with examples
   - TPDB_INTEGRATION.md: API guide, data mapping, best practices

4. Fixes:
   - .gitignore: Fixed pattern to allow cmd/goondex/* and cmd/goondexd/*
   - README: Updated to reflect TPDB integration and auto-fetch

Testing:
- performer-search "Riley Reid" - auto-fetched 2 performers, cached
- studio-search "Brazzers" - auto-fetched 12 studios, cached
- Aliases now display correctly as comma-separated list
- Studio IDs properly converted from numeric to string

API Integration:
- Base URL: https://api.theporndb.net
- Authentication: Bearer token via TPDB_API_KEY env var
- Endpoints: /performers, /sites, /scenes
- Rate limiting handled with warnings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 22:04:23 -05:00


# Architecture Overview
## High-Level Architecture
```
┌────────────────────────────────────────────────────────────┐
│                       User Interface                       │
│                     (CLI / TUI / Web)                      │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│                     Application Layer                      │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│   │   Commands   │   │   Metadata   │   │    Images    │   │
│   │   (Import/   │   │   Resolver   │   │   (Cache/    │   │
│   │   Search)    │   │              │   │    Fetch)    │   │
│   └──────────────┘   └──────────────┘   └──────────────┘   │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│                       Scraper Layer                        │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│   │     TPDB     │   │      AE      │   │    Future    │   │
│   │   Scraper    │   │   Scraper    │   │   Sources    │   │
│   └──────────────┘   └──────────────┘   └──────────────┘   │
│                      Scraper Registry                      │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│                      Data Layer (DB)                       │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│   │  Performers  │   │   Studios    │   │    Scenes    │   │
│   │    Store     │   │    Store     │   │    Store     │   │
│   └──────────────┘   └──────────────┘   └──────────────┘   │
│   ┌──────────────┐   ┌─────────────────────────────────┐   │
│   │     Tags     │   │         Junction Tables         │   │
│   │    Store     │   │ (scene_performers, scene_tags)  │   │
│   └──────────────┘   └─────────────────────────────────┘   │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│                      SQLite Database                       │
│              (WAL mode, Foreign Keys enabled)              │
└────────────────────────────────────────────────────────────┘
```
## Component Breakdown
### 1. User Interface Layer
#### CLI (`cmd/goondex`)
- Command-line interface built with Cobra
- Commands:
  - `performer-search`, `studio-search`, `scene-search` - Search the local database
  - `import` - Fetch from external sources
  - `version` - Show version
- Environment-based configuration (`TPDB_API_KEY`)
#### TUI (Future: `internal/ui/tui`)
- Terminal UI built with Bubble Tea
- Features:
  - Browse performers/scenes/studios
  - View images (kitty/sixel protocol)
  - Interactive search
#### Daemon (Future: `cmd/goondexd`)
- Background service
- Scheduled scraping
- Cache warming
- Image pre-fetching
### 2. Application Layer
#### Commands (`cmd/goondex/*_cmd.go`)
- Implements business logic for CLI commands
- Orchestrates scraper + database operations
- Handles error reporting and user feedback
#### Metadata Resolver (Future: `internal/metadata`)
- Stash-inspired field merge strategies
- Conflict resolution (IGNORE/MERGE/OVERWRITE)
- De-duplication logic
- Create-missing entity handling
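
Since the resolver is still planned, the following is only a minimal sketch of how the IGNORE/MERGE/OVERWRITE strategies listed above could behave for a single field. The type and function names (`Strategy`, `resolveField`, `mergeList`) are hypothetical, and the exact semantics of each strategy are an assumption: here MERGE fills a scalar field only when the stored value is empty and unions list values.

```go
package main

import "fmt"

// Strategy mirrors the IGNORE/MERGE/OVERWRITE options; the names are hypothetical.
type Strategy int

const (
	Ignore    Strategy = iota // keep the stored value, discard the incoming one
	Merge                     // fill the field only when the stored value is empty
	Overwrite                 // take the incoming value whenever it is non-empty
)

// resolveField applies one strategy to a single string field.
// The per-strategy behavior is an assumption for illustration.
func resolveField(s Strategy, existing, incoming string) string {
	switch s {
	case Overwrite:
		if incoming != "" {
			return incoming
		}
		return existing
	case Merge:
		if existing == "" {
			return incoming
		}
		return existing
	default: // Ignore
		return existing
	}
}

// mergeList is the multi-value counterpart of Merge: it appends incoming
// values that are not already present, preserving order (de-duplication).
func mergeList(existing, incoming []string) []string {
	seen := make(map[string]bool, len(existing)+len(incoming))
	out := make([]string, 0, len(existing)+len(incoming))
	for _, list := range [][]string{existing, incoming} {
		for _, v := range list {
			if !seen[v] {
				seen[v] = true
				out = append(out, v)
			}
		}
	}
	return out
}

func main() {
	fmt.Println(resolveField(Merge, "", "incoming"))
	fmt.Println(resolveField(Merge, "stored", "incoming"))
	fmt.Println(resolveField(Overwrite, "stored", "incoming"))
	fmt.Println(mergeList([]string{"a", "b"}, []string{"b", "c"}))
}
```

Keeping the strategy a per-field decision makes it possible to overwrite volatile fields while protecting hand-edited ones, which is one plausible reading of "field merge strategies".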
#### Image Management (Future: `internal/images`)
- Fetch images from URLs
- Cache locally
- Generate thumbnails/sprites
- Byte-compare to avoid re-downloads
### 3. Scraper Layer
#### Scraper Interface (`internal/scraper/interface.go`)
```go
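// Scraper is the contract every metadata source implements.
// Parameter types are not spelled out in the original sketch; context.Context
// and string are assumed here, and Performer, Studio, and Scene refer to the
// structs in internal/model.
type Scraper interface {
	Name() string

	SearchPerformers(ctx context.Context, query string) ([]model.Performer, error)
	SearchStudios(ctx context.Context, query string) ([]model.Studio, error)
	SearchScenes(ctx context.Context, query string) ([]model.Scene, error)

	GetPerformerByID(ctx context.Context, id string) (*model.Performer, error)
	GetStudioByID(ctx context.Context, id string) (*model.Studio, error)
	GetSceneByID(ctx context.Context, id string) (*model.Scene, error)
}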
```
#### TPDB Scraper (`internal/scraper/tpdb`)
- HTTP client with Bearer auth
- JSON response parsing
- Mapping TPDB structs → internal models
- Rate limiting (future)
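
The three responsibilities above (Bearer auth, JSON parsing, mapping) can be sketched as follows. This is not the project's actual code: the struct names, the `q` query parameter, the JSON keys, and the `data` envelope are assumptions made for illustration. Only the Bearer header, the `/performers` endpoint, the aliases-array-to-string conversion, and the `strconv.Itoa` conversion of numeric studio IDs come from the commit notes. A local `httptest` server stands in for the real API so the example runs offline.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strconv"
	"strings"
)

// Response structs carry only the fields discussed here; JSON keys are assumed.
type performerResponse struct {
	Name    string   `json:"name"`
	Aliases []string `json:"aliases"` // TPDB returns an array
}

type studioResponse struct {
	ID   int    `json:"id"` // TPDB returns numeric IDs
	Name string `json:"name"`
}

// Performer and Studio stand in for the internal models (hypothetical fields).
type Performer struct {
	Name    string
	Aliases string // comma-separated
}

type Studio struct {
	SourceID string
	Name     string
}

func mapPerformer(r performerResponse) Performer {
	return Performer{Name: r.Name, Aliases: strings.Join(r.Aliases, ", ")}
}

func mapStudio(r studioResponse) Studio {
	return Studio{SourceID: strconv.Itoa(r.ID), Name: r.Name}
}

// searchPerformers sends GET {baseURL}/performers?q=... with a Bearer token
// and maps the decoded payload to Performer values.
func searchPerformers(ctx context.Context, c *http.Client, baseURL, apiKey, query string) ([]Performer, error) {
	u := baseURL + "/performers?q=" + url.QueryEscape(query)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Accept", "application/json")

	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("tpdb: unexpected status %s", resp.Status)
	}

	var body struct {
		Data []performerResponse `json:"data"` // envelope shape is an assumption
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, fmt.Errorf("tpdb: decode response: %w", err)
	}
	out := make([]Performer, 0, len(body.Data))
	for _, r := range body.Data {
		out = append(out, mapPerformer(r))
	}
	return out, nil
}

func main() {
	// Fake API: rejects requests without the expected Bearer token.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer test-key" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"data":[{"name":"Example","aliases":["A","B"]}]}`)
	}))
	defer srv.Close()

	got, err := searchPerformers(context.Background(), srv.Client(), srv.URL, "test-key", "example")
	fmt.Println(got, err)
	fmt.Println(mapStudio(studioResponse{ID: 42, Name: "Example Studio"}))
}
```

Passing the base URL in as a parameter keeps the client testable against a local server.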
#### Scraper Registry (`internal/scraper/registry.go`)
- Manages available scrapers
- Priority ordering
- Fallback logic
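
Priority ordering and fallback could look like the sketch below. The `Searcher` interface is a reduced, hypothetical slice of the `Scraper` interface, and the rule "lower number is tried first; fall through on error or empty result" is an assumption rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Searcher is a reduced, hypothetical slice of the Scraper interface.
type Searcher interface {
	Name() string
	SearchPerformers(query string) ([]string, error)
}

type entry struct {
	priority int // lower value is tried first (assumed convention)
	s        Searcher
}

// Registry keeps registered scrapers sorted by priority.
type Registry struct {
	entries []entry
}

func (r *Registry) Register(s Searcher, priority int) {
	r.entries = append(r.entries, entry{priority: priority, s: s})
	sort.SliceStable(r.entries, func(i, j int) bool {
		return r.entries[i].priority < r.entries[j].priority
	})
}

// SearchPerformers tries each scraper in priority order and falls back to the
// next one when a source fails or returns nothing. It reports which source
// answered; if none did, the collected errors are returned joined.
func (r *Registry) SearchPerformers(query string) (string, []string, error) {
	var errs []error
	for _, e := range r.entries {
		res, err := e.s.SearchPerformers(query)
		if err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", e.s.Name(), err))
			continue
		}
		if len(res) > 0 {
			return e.s.Name(), res, nil
		}
	}
	return "", nil, errors.Join(errs...)
}

// fake is a canned scraper used only to exercise the registry.
type fake struct {
	name string
	res  []string
	err  error
}

func (f fake) Name() string                              { return f.name }
func (f fake) SearchPerformers(string) ([]string, error) { return f.res, f.err }

var errUnavailable = errors.New("source unavailable")

func main() {
	var r Registry
	r.Register(fake{name: "ae", res: []string{"result from ae"}}, 2)
	r.Register(fake{name: "tpdb", err: errUnavailable}, 1)

	src, res, err := r.SearchPerformers("example")
	fmt.Println(src, res, err)
}
```

In this run the higher-priority source fails, so the registry falls back to the second one and reports it as the origin of the results.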
### 4. Data Layer
#### Stores (`internal/db/*_store.go`)
- Thin abstraction over SQL
- CRUD operations
- Search/filter methods
- Transaction support (future)
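
A store in this style might look like the sketch below: a struct holding a `*sql.DB` and a few methods containing plain SQL. The table and column names (`performers`, `id`, `name`) are assumptions, not taken from the actual schema, and the `likePattern` helper is hypothetical. No driver is imported, so `main` only exercises the pure helper.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// Performer is a trimmed stand-in for the real model.
type Performer struct {
	ID   int64
	Name string
}

// PerformerStore is a thin wrapper: a database handle plus visible SQL.
type PerformerStore struct {
	db *sql.DB
}

// likePattern escapes LIKE wildcards in the user's term and wraps it in %...%.
func likePattern(term string) string {
	r := strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)
	return "%" + r.Replace(term) + "%"
}

// Table and column names below are assumptions for illustration.
const (
	insertSQL = `INSERT INTO performers (name) VALUES (?)`
	searchSQL = `SELECT id, name FROM performers WHERE name LIKE ? ESCAPE '\' ORDER BY name`
)

// Create inserts the performer and stores the generated row ID on p.
func (s *PerformerStore) Create(ctx context.Context, p *Performer) error {
	res, err := s.db.ExecContext(ctx, insertSQL, p.Name)
	if err != nil {
		return fmt.Errorf("create performer: %w", err)
	}
	p.ID, err = res.LastInsertId()
	return err
}

// Search runs the LIKE query and scans each row into a struct.
func (s *PerformerStore) Search(ctx context.Context, term string) ([]Performer, error) {
	rows, err := s.db.QueryContext(ctx, searchSQL, likePattern(term))
	if err != nil {
		return nil, fmt.Errorf("search performers: %w", err)
	}
	defer rows.Close()

	var out []Performer
	for rows.Next() {
		var p Performer
		if err := rows.Scan(&p.ID, &p.Name); err != nil {
			return nil, err
		}
		out = append(out, p)
	}
	return out, rows.Err()
}

func main() {
	// Only the pattern helper runs here because no SQL driver is linked in.
	fmt.Println(likePattern("50%_off"))
}
```

Binding the search term as a parameter, rather than concatenating it into the SQL text, keeps the query safe while leaving the SQL readable in one place.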
#### Models (`internal/model/*.go`)
- Pure Go structs
- JSON serialization
- No database coupling
- Relationship fields
### 5. Storage Layer
#### SQLite Database
- **Mode**: WAL (Write-Ahead Logging)
- **Foreign Keys**: Enabled
- **Indexes**: On name/title/code fields
- **Location**: Configurable (default: `./goondex.db`)
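
One way to get WAL mode and foreign keys on every pooled connection is to put the PRAGMAs into the DSN. The sketch below assumes the `_pragma` query parameter accepted by the `modernc.org/sqlite` driver; the helper names `sqliteDSN` and `checkPragmas` are hypothetical, and the project may configure the database differently. The driver itself is not imported, so only the DSN builder runs.

```go
package main

import (
	"database/sql"
	"fmt"
)

// sqliteDSN builds a DSN that asks the driver to run both PRAGMAs on each
// new connection (the _pragma parameter is a modernc.org/sqlite convention).
func sqliteDSN(path string) string {
	return "file:" + path + "?_pragma=journal_mode(WAL)&_pragma=foreign_keys(1)"
}

// checkPragmas reads the settings back so a misconfigured handle fails early.
func checkPragmas(db *sql.DB) error {
	var mode string
	if err := db.QueryRow("PRAGMA journal_mode").Scan(&mode); err != nil {
		return err
	}
	var fk int
	if err := db.QueryRow("PRAGMA foreign_keys").Scan(&fk); err != nil {
		return err
	}
	if mode != "wal" || fk != 1 {
		return fmt.Errorf("unexpected settings: journal_mode=%s foreign_keys=%d", mode, fk)
	}
	return nil
}

func main() {
	fmt.Println(sqliteDSN("goondex.db"))
	// With the driver registered (import _ "modernc.org/sqlite"), the handle
	// would be opened with sql.Open("sqlite", sqliteDSN(path)) and then
	// passed to checkPragmas.
}
```

Foreign key enforcement in SQLite is a per-connection setting, which is why applying it through the DSN is preferable to running the PRAGMA once after opening the pool.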
## Data Flow Examples
### Import Flow
```
User runs: goondex import performer "Riley Reid"
┌─────────────────────────────────┐
│ 1. Parse command arguments      │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 2. Load TPDB_API_KEY from env   │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 3. Create TPDB scraper          │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 4. Call SearchPerformers()      │
│    - Build HTTP request         │
│    - Add Bearer token           │
│    - Parse JSON response        │
│    - Map to model.Performer     │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 5. Open local database          │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 6. For each performer:          │
│    - PerformerStore.Create()    │
│    - Insert into DB             │
│    - Report success/failure     │
└─────────────────────────────────┘
```
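
The steps above can be sketched in code as follows. The interfaces are narrowed to what the flow needs, and all names (`performerSearcher`, `performerCreator`, `apiKeyFromEnv`, `importPerformers`) are hypothetical; in-memory fakes replace the TPDB scraper and the SQLite store so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

type Performer struct{ Name string }

// Narrow, hypothetical interfaces covering only what the import flow needs.
type performerSearcher interface {
	SearchPerformers(query string) ([]Performer, error)
}

type performerCreator interface {
	Create(p *Performer) error
}

var errNoAPIKey = errors.New("TPDB_API_KEY is not set")

// apiKeyFromEnv covers step 2; getenv is injected so it can be tested.
func apiKeyFromEnv(getenv func(string) string) (string, error) {
	key := getenv("TPDB_API_KEY")
	if key == "" {
		return "", errNoAPIKey
	}
	return key, nil
}

// importPerformers covers steps 4 and 6: search the source, then create each
// result, collecting per-item failures instead of stopping at the first one.
func importPerformers(src performerSearcher, store performerCreator, query string) (int, error) {
	found, err := src.SearchPerformers(query)
	if err != nil {
		return 0, fmt.Errorf("search: %w", err)
	}
	var imported int
	var errs []error
	for i := range found {
		if err := store.Create(&found[i]); err != nil {
			errs = append(errs, fmt.Errorf("%q: %w", found[i].Name, err))
			continue
		}
		imported++
	}
	return imported, errors.Join(errs...)
}

// staticSource and memStore are in-memory stand-ins for scraper and store.
type staticSource []Performer

func (s staticSource) SearchPerformers(string) ([]Performer, error) { return s, nil }

type memStore struct{ rows []Performer }

func (m *memStore) Create(p *Performer) error {
	if p.Name == "" {
		return errors.New("empty name")
	}
	m.rows = append(m.rows, *p)
	return nil
}

func main() {
	if _, err := apiKeyFromEnv(os.Getenv); err != nil {
		fmt.Println("warning:", err) // a real command would stop here
	}
	store := &memStore{}
	src := staticSource{{Name: "Example One"}, {Name: "Example Two"}}
	n, err := importPerformers(src, store, "example")
	fmt.Println(n, err)
}
```

Collecting errors per performer matches the "Report success/failure" step: one bad record does not abort the rest of the import.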
### Search Flow
```
User runs: goondex performer-search "Riley"
┌─────────────────────────────────┐
│ 1. Parse command arguments      │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 2. Open local database          │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 3. PerformerStore.Search()      │
│    - Execute SQL LIKE query     │
│    - Parse rows into structs    │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ 4. Format and display results   │
└─────────────────────────────────┘
```
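
The commit notes for this release also describe auto-fetch: when the local search finds nothing, the command imports from TPDB and searches again to get local IDs. A minimal sketch of that variant of the flow follows. It assumes an in-memory store and a function-typed remote source; `searchWithAutoFetch`, `memStore`, and `remoteFunc` are hypothetical names, and a nil remote models a missing `TPDB_API_KEY`.

```go
package main

import (
	"fmt"
	"strings"
)

type Performer struct {
	ID   int
	Name string
}

// memStore stands in for the SQLite-backed store.
type memStore struct{ rows []Performer }

// Search imitates a case-insensitive LIKE '%term%' match on the name.
func (m *memStore) Search(term string) []Performer {
	var out []Performer
	for _, p := range m.rows {
		if strings.Contains(strings.ToLower(p.Name), strings.ToLower(term)) {
			out = append(out, p)
		}
	}
	return out
}

// Create assigns a local ID, as an INSERT would.
func (m *memStore) Create(p Performer) {
	p.ID = len(m.rows) + 1
	m.rows = append(m.rows, p)
}

// remoteFunc abstracts the scraper call; nil means no API key is configured.
type remoteFunc func(query string) ([]Performer, error)

// searchWithAutoFetch answers from the local store when it can. On a miss it
// imports remote results and searches again so callers receive local IDs.
func searchWithAutoFetch(store *memStore, remote remoteFunc, term string) ([]Performer, error) {
	if local := store.Search(term); len(local) > 0 || remote == nil {
		return local, nil
	}
	found, err := remote(term)
	if err != nil {
		return nil, fmt.Errorf("auto-fetch: %w", err)
	}
	for _, p := range found {
		store.Create(p)
	}
	return store.Search(term), nil
}

func main() {
	calls := 0
	remote := func(q string) ([]Performer, error) {
		calls++
		return []Performer{{Name: "Example Performer"}}, nil
	}
	store := &memStore{}

	first, _ := searchWithAutoFetch(store, remote, "example")  // miss: fetches
	second, _ := searchWithAutoFetch(store, remote, "example") // hit: local only
	fmt.Println(first, second, calls)
}
```

The second call never reaches the remote source, which is the caching behavior the commit notes report for `performer-search` and `studio-search`.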
## Design Principles
### 1. **Local-First**
- All data stored locally in SQLite
- Search works offline against the local database
- External APIs are contacted only for imports, updates, and auto-fetch when a search finds nothing locally
### 2. **Pluggable Scrapers**
- Standard interface for all sources
- Easy to add new scrapers
- Registry-based discovery
### 3. **Thin Database Layer**
- Stores are lightweight wrappers
- SQL is visible and auditable
- No heavy ORM
### 4. **Clean Models**
- Domain models are pure Go structs
- No tags/annotations beyond JSON
- Clear separation from DB layer
### 5. **Idempotent Operations**
- Re-running imports is safe
- Merge strategies prevent data loss
- Duplicate detection
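
As a rough illustration of why re-running an import can be safe, the sketch below keys each record by its source and source-side ID and merges into the existing row instead of inserting a duplicate. The key choice, the field names, and the fill-only-empty-fields merge rule are assumptions, not the project's documented behavior.

```go
package main

import "fmt"

// sourceKey identifies a record by where it came from (assumed identity rule).
type sourceKey struct{ Source, SourceID string }

type Performer struct {
	Source   string
	SourceID string
	Name     string
	Aliases  string
}

// Store is an in-memory stand-in for a table with a unique (source, source_id) key.
type Store struct {
	rows map[sourceKey]Performer
}

func NewStore() *Store { return &Store{rows: make(map[sourceKey]Performer)} }

// Upsert inserts p, or merges it into the existing row with the same key.
// Existing non-empty fields are kept so a re-import cannot erase local data.
// It reports whether a new row was created.
func (s *Store) Upsert(p Performer) bool {
	k := sourceKey{Source: p.Source, SourceID: p.SourceID}
	old, exists := s.rows[k]
	if !exists {
		s.rows[k] = p
		return true
	}
	if old.Name == "" {
		old.Name = p.Name
	}
	if old.Aliases == "" {
		old.Aliases = p.Aliases
	}
	s.rows[k] = old
	return false
}

func main() {
	s := NewStore()
	p := Performer{Source: "tpdb", SourceID: "123", Name: "Example"}

	fmt.Println(s.Upsert(p)) // first import creates the row
	p.Aliases = "Ex"
	fmt.Println(s.Upsert(p)) // second import merges into it
	fmt.Println(len(s.rows), s.rows[sourceKey{Source: "tpdb", SourceID: "123"}].Aliases)
}
```

In SQLite the same effect is typically achieved with a unique index on the identifying columns and an `INSERT ... ON CONFLICT ... DO UPDATE` statement.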
## Technology Stack
- **Language**: Go 1.25+
- **Database**: SQLite (modernc.org/sqlite)
- **CLI Framework**: Cobra
- **HTTP Client**: net/http (stdlib)
- **JSON**: encoding/json (stdlib)
## Directory Structure
```
Goondex/
├── cmd/
│   ├── goondex/        # CLI application
│   └── goondexd/       # Daemon (future)
├── internal/
│   ├── db/             # Database layer
│   ├── images/         # Image management
│   ├── metadata/       # Resolver/strategies
│   ├── model/          # Data models
│   ├── scraper/        # Scraper system
│   ├── ui/             # TUI/GUI
│   └── util/           # Utilities
├── config/             # YAML configs
├── docs/               # Documentation
├── scripts/            # Dev scripts
└── assets/             # Static assets
```
## Future Enhancements
### v0.2.x
- Metadata resolver with field strategies
- Image caching system
- Full-text search (FTS5)
- TUI browser
### v0.3.x
- Background daemon
- Scheduled imports
- Duplicate detection
- Preview sprites
### v0.4.x
- Web UI
- REST API
- Multi-source priority
- Plugin system