Goondex/docs/ARCHITECTURE.md
Team Goon f7d82cd207 v0.1.0-dev2: Full TPDB integration with auto-fetch and comprehensive docs
2025-11-14 22:04:23 -05:00

Architecture Overview

High-Level Architecture

┌─────────────────────────────────────────────────────────────┐
│                         User Interface                       │
│                     (CLI / TUI / Web)                        │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Application Layer                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │   Commands   │  │   Metadata   │  │    Images    │      │
│  │   (Import/   │  │   Resolver   │  │   (Cache/    │      │
│  │    Search)   │  │              │  │   Fetch)     │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                        Scraper Layer                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │     TPDB     │  │      AE      │  │   Future     │      │
│  │   Scraper    │  │   Scraper    │  │   Sources    │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
│                    Scraper Registry                          │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Data Layer (DB)                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │  Performers  │  │    Studios   │  │    Scenes    │      │
│  │    Store     │  │     Store    │  │     Store    │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
│  ┌──────────────┐  ┌──────────────────────────────────┐    │
│  │     Tags     │  │      Junction Tables              │    │
│  │    Store     │  │  (scene_performers, scene_tags)   │    │
│  └──────────────┘  └──────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    SQLite Database                           │
│              (WAL mode, Foreign Keys enabled)                │
└─────────────────────────────────────────────────────────────┘

Component Breakdown

1. User Interface Layer

CLI (cmd/goondex)

  • Command-line interface built with Cobra
  • Commands:
    • performer-search, studio-search, scene-search - Search the local database
    • import - Fetch from external sources (for example: import performer "Riley Reid")
    • version - Show version
  • Environment-based configuration (TPDB_API_KEY)
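
A minimal sketch of the environment-based configuration, assuming the key is read once at startup. Only the TPDB_API_KEY variable name comes from this document; apiKeyFrom, loadAPIKey, and errMissingKey are hypothetical names used for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// errMissingKey is a hypothetical sentinel error for this sketch.
var errMissingKey = errors.New("TPDB_API_KEY is not set")

// apiKeyFrom validates a raw key value. It is split out from the environment
// lookup so it can be exercised without touching the process environment.
func apiKeyFrom(raw string) (string, error) {
	key := strings.TrimSpace(raw)
	if key == "" {
		return "", errMissingKey
	}
	return key, nil
}

// loadAPIKey reads the key from the TPDB_API_KEY environment variable.
func loadAPIKey() (string, error) {
	return apiKeyFrom(os.Getenv("TPDB_API_KEY"))
}

func main() {
	if _, err := loadAPIKey(); err != nil {
		// Local search does not need the key, so this is a warning, not a crash.
		fmt.Fprintln(os.Stderr, "warning:", err)
		return
	}
	fmt.Println("TPDB key loaded")
}
```

Returning an error instead of exiting lets each command decide whether a missing key is fatal.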

TUI (Future: internal/ui/tui)

  • Terminal UI built with Bubble Tea
  • Features:
    • Browse performers/scenes/studios
    • View images (kitty/sixel protocol)
    • Interactive search

Daemon (Future: cmd/goondexd)

  • Background service
  • Scheduled scraping
  • Cache warming
  • Image pre-fetching

2. Application Layer

Commands (cmd/goondex/*_cmd.go)

  • Implements business logic for CLI commands
  • Orchestrates scraper + database operations
  • Handles error reporting and user feedback

Metadata Resolver (Future: internal/metadata)

  • Stash-inspired field merge strategies
  • Conflict resolution (IGNORE/MERGE/OVERWRITE)
  • De-duplication logic
  • Create-missing entity handling
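
The resolver is not implemented yet, so the following is only a sketch of how the three conflict-resolution strategies could behave for a single string field. The strategy names come from the list above; the exact semantics (in particular, MERGE filling only empty fields) are assumptions.

```go
package main

import "fmt"

// Strategy mirrors the IGNORE/MERGE/OVERWRITE options; the behaviour attached
// to each value below is an assumption made for illustration.
type Strategy int

const (
	Ignore    Strategy = iota // keep the stored value, drop the incoming one
	Merge                     // fill the field only when the stored value is empty
	Overwrite                 // take the incoming value whenever it is non-empty
)

// mergeField resolves one string field according to the chosen strategy.
func mergeField(s Strategy, stored, incoming string) string {
	switch s {
	case Merge:
		if stored == "" {
			return incoming
		}
		return stored
	case Overwrite:
		if incoming != "" {
			return incoming
		}
		return stored
	default: // Ignore
		return stored
	}
}

func main() {
	fmt.Println(mergeField(Merge, "", "incoming value"))
	fmt.Println(mergeField(Overwrite, "old bio", "new bio"))
	fmt.Println(mergeField(Ignore, "kept", "dropped"))
}
```

None of the strategies replaces a stored value with an empty one, which is one way to keep re-imports from losing data.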

Image Management (Future: internal/images)

  • Fetch images from URLs
  • Cache locally
  • Generate thumbnails/sprites
  • Byte-compare to avoid re-downloads
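
Image management is also planned rather than built. As a sketch of the byte-compare idea, the hypothetical helper below skips the cache write when the fetched bytes match what is already on disk. Note that this only avoids rewriting the cache and redoing follow-up work such as thumbnails; skipping the download itself would need something extra, such as conditional HTTP requests.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// writeIfChanged stores data at path unless the cached file already holds
// identical bytes. It reports whether a write happened.
func writeIfChanged(path string, data []byte) (bool, error) {
	cached, err := os.ReadFile(path)
	if err == nil && bytes.Equal(cached, data) {
		return false, nil // cache is already up to date
	}
	if err != nil && !os.IsNotExist(err) {
		return false, err
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return false, err
	}
	return true, os.WriteFile(path, data, 0o644)
}

// demo caches the same bytes twice in a temporary directory and returns
// whether each call wrote to disk.
func demo() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "goondex-img")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "performers", "1.jpg") // hypothetical cache layout
	img := []byte{0xFF, 0xD8, 0xFF}
	if first, err = writeIfChanged(path, img); err != nil {
		return first, false, err
	}
	second, err = writeIfChanged(path, img)
	return first, second, err
}

func main() {
	first, second, err := demo()
	fmt.Println(first, second, err)
}
```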

3. Scraper Layer

Scraper Interface (internal/scraper/interface.go)

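// Parameter types are assumed here: context.Context, string queries, string IDs.
// Requires the "context" package and the internal/model package.
type Scraper interface {
    // Name returns the scraper's identifier.
    Name() string

    // Search methods return every match for a free-text query.
    SearchPerformers(ctx context.Context, query string) ([]model.Performer, error)
    SearchStudios(ctx context.Context, query string) ([]model.Studio, error)
    SearchScenes(ctx context.Context, query string) ([]model.Scene, error)

    // Get methods fetch a single record by its source-side ID.
    GetPerformerByID(ctx context.Context, id string) (*model.Performer, error)
    GetStudioByID(ctx context.Context, id string) (*model.Studio, error)
    GetSceneByID(ctx context.Context, id string) (*model.Scene, error)
}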

TPDB Scraper (internal/scraper/tpdb)

  • HTTP client with Bearer auth
  • JSON response parsing
  • Mapping TPDB structs → internal models
  • Rate limiting (future)
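
The request path listed above (Bearer auth, JSON parsing, mapping) can be sketched as below. This is not the actual scraper. The "q" query parameter, the {"data": [...]} envelope, the response field names, and joining aliases into one string are all assumptions. The demo talks to a local stub server rather than the real API.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// performerResponse holds only the fields this sketch needs. The field names
// and the envelope are assumptions, not the documented TPDB schema.
type performerResponse struct {
	ID      string   `json:"id"`
	Name    string   `json:"name"`
	Aliases []string `json:"aliases"`
}

// Performer stands in for the internal model type.
type Performer struct {
	SourceID, Name, Aliases string
}

type client struct {
	baseURL, apiKey string
	hc              *http.Client
}

func (c *client) SearchPerformers(ctx context.Context, query string) ([]Performer, error) {
	u := c.baseURL + "/performers?q=" + url.QueryEscape(query) // "q" is an assumed parameter
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+c.apiKey)

	resp, err := c.hc.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("tpdb: unexpected status %s", resp.Status)
	}

	var body struct {
		Data []performerResponse `json:"data"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}

	// Map the API structs to the internal model.
	out := make([]Performer, 0, len(body.Data))
	for _, p := range body.Data {
		out = append(out, Performer{
			SourceID: p.ID,
			Name:     p.Name,
			Aliases:  strings.Join(p.Aliases, ", "),
		})
	}
	return out, nil
}

// demo runs the client against a local stub server instead of the real API.
func demo() ([]Performer, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer test-key" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"data":[{"id":"p1","name":"Example","aliases":["Ex","Sample"]}]}`)
	}))
	defer srv.Close()

	c := &client{baseURL: srv.URL, apiKey: "test-key", hc: srv.Client()}
	return c.SearchPerformers(context.Background(), "example")
}

func main() {
	res, err := demo()
	fmt.Println(res, err)
}
```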

Scraper Registry (internal/scraper/registry.go)

  • Manages available scrapers
  • Priority ordering
  • Fallback logic
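
One possible shape for priority ordering with fallback is sketched below. The Registry type, its methods, and the "lower number runs first" convention are assumptions. The Scraper interface is cut down to a single method to keep the example short.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
)

// Performer stands in for the internal model type.
type Performer struct{ Name string }

// Scraper is a cut-down version of the full scraper interface.
type Scraper interface {
	Name() string
	SearchPerformers(ctx context.Context, query string) ([]Performer, error)
}

type entry struct {
	priority int // lower runs first (assumed convention)
	scraper  Scraper
}

// Registry keeps scrapers sorted by priority.
type Registry struct{ entries []entry }

func (r *Registry) Register(s Scraper, priority int) {
	r.entries = append(r.entries, entry{priority: priority, scraper: s})
	sort.SliceStable(r.entries, func(i, j int) bool {
		return r.entries[i].priority < r.entries[j].priority
	})
}

// SearchPerformers tries each scraper in priority order and returns the first
// non-empty result together with the name of the source that produced it.
func (r *Registry) SearchPerformers(ctx context.Context, q string) ([]Performer, string, error) {
	var errs []error
	for _, e := range r.entries {
		res, err := e.scraper.SearchPerformers(ctx, q)
		if err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", e.scraper.Name(), err))
			continue // fall back to the next source
		}
		if len(res) > 0 {
			return res, e.scraper.Name(), nil
		}
	}
	return nil, "", errors.Join(errs...) // nil when every source was simply empty
}

// fake is a stub scraper used only by the demo.
type fake struct {
	name string
	res  []Performer
	err  error
}

func (f fake) Name() string { return f.name }
func (f fake) SearchPerformers(context.Context, string) ([]Performer, error) {
	return f.res, f.err
}

// demo registers a failing high-priority source and a working fallback.
func demo() (string, error) {
	var r Registry
	r.Register(fake{name: "ae", res: []Performer{{Name: "Example"}}}, 2)
	r.Register(fake{name: "tpdb", err: errors.New("rate limited")}, 1)
	_, source, err := r.SearchPerformers(context.Background(), "example")
	return source, err
}

func main() {
	source, err := demo()
	fmt.Println(source, err)
}
```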

4. Data Layer

Stores (internal/db/*_store.go)

  • Thin abstraction over SQL
  • CRUD operations
  • Search/filter methods
  • Transaction support (future)
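
To illustrate the "thin abstraction over SQL" idea, here is a sketch of a store search method. The table name, column names, and the likePattern helper are assumptions. main only prints the generated pattern, because running the query needs a registered SQLite driver, which is left out here.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// Performer stands in for the internal model type.
type Performer struct {
	ID   int64
	Name string
}

// PerformerStore wraps a *sql.DB; the schema used below is assumed.
type PerformerStore struct{ db *sql.DB }

// likePattern escapes LIKE wildcards in user input and wraps the result for a
// substring match.
func likePattern(q string) string {
	r := strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)
	return "%" + r.Replace(q) + "%"
}

// Search runs a plain, visible SQL query and scans the rows into structs.
func (s *PerformerStore) Search(ctx context.Context, q string) ([]Performer, error) {
	rows, err := s.db.QueryContext(ctx,
		`SELECT id, name FROM performers WHERE name LIKE ? ESCAPE '\' ORDER BY name`,
		likePattern(q))
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []Performer
	for rows.Next() {
		var p Performer
		if err := rows.Scan(&p.ID, &p.Name); err != nil {
			return nil, err
		}
		out = append(out, p)
	}
	return out, rows.Err()
}

func main() {
	fmt.Println(likePattern("100%_riley"))
}
```

Escaping the wildcards means a query such as "100%" matches that literal text instead of every row.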

Models (internal/model/*.go)

  • Pure Go structs
  • JSON serialization
  • No database coupling
  • Relationship fields
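
A sketch of what such models might look like, assuming relationship fields are plain nested structs and slices. All field names here are illustrative, not the actual definitions in internal/model.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The structs carry JSON tags only: no database tags, no ORM annotations.
type Performer struct {
	ID   int64  `json:"id"`
	Name string `json:"name"`
}

type Studio struct {
	ID   int64  `json:"id"`
	Name string `json:"name"`
}

type Scene struct {
	ID    int64  `json:"id"`
	Title string `json:"title"`
	Code  string `json:"code,omitempty"`

	// Relationship fields, filled in by the data layer when requested.
	Studio     *Studio     `json:"studio,omitempty"`
	Performers []Performer `json:"performers,omitempty"`
}

// sample builds a scene with both relationships populated.
func sample() Scene {
	return Scene{
		ID:         1,
		Title:      "Example",
		Studio:     &Studio{ID: 2, Name: "Example Studio"},
		Performers: []Performer{{ID: 3, Name: "Example Performer"}},
	}
}

// encode serializes any model to JSON for display.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(sample()))
}
```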

5. Storage Layer

SQLite Database

  • Mode: WAL (Write-Ahead Logging)
  • Foreign Keys: Enabled
  • Indexes: On name/title/code fields
  • Location: Configurable (default: ./goondex.db)
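
With the modernc.org/sqlite driver, these settings can usually be requested through _pragma parameters in the connection string, so they apply to every pooled connection. The helper below is a hypothetical sketch that only builds the string. Check the exact DSN syntax against the driver documentation before relying on it.

```go
package main

import "fmt"

// defaultDBPath matches the default location listed above.
const defaultDBPath = "./goondex.db"

// sqliteDSN builds a connection string that asks for WAL journaling and
// foreign-key enforcement. The _pragma form is the driver's, as far as this
// sketch assumes; it is not verified here.
func sqliteDSN(path string) string {
	if path == "" {
		path = defaultDBPath
	}
	return "file:" + path + "?_pragma=journal_mode(WAL)&_pragma=foreign_keys(1)"
}

func main() {
	// With the driver imported (import _ "modernc.org/sqlite"), this string
	// would be passed to sql.Open("sqlite", dsn).
	fmt.Println(sqliteDSN(""))
}
```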

Data Flow Examples

Import Flow

User runs: goondex import performer "Riley Reid"
              │
              ▼
┌─────────────────────────────────┐
│  1. Parse command arguments     │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  2. Load TPDB_API_KEY from env  │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  3. Create TPDB scraper         │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  4. Call SearchPerformers()     │
│     - Build HTTP request        │
│     - Add Bearer token          │
│     - Parse JSON response       │
│     - Map to model.Performer    │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  5. Open local database         │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  6. For each performer:         │
│     - PerformerStore.Create()   │
│     - Insert into DB            │
│     - Report success/failure    │
└─────────────────────────────────┘
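
Steps 4 and 6 of the flow above can be sketched as a single function that depends only on two small interfaces. The interface names, importPerformers, and the in-memory stubs are hypothetical. The real commands wire in the TPDB scraper and PerformerStore.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Performer stands in for the internal model type.
type Performer struct{ Name string }

type performerSearcher interface {
	SearchPerformers(ctx context.Context, query string) ([]Performer, error)
}

type performerCreator interface {
	Create(ctx context.Context, p *Performer) error
}

// importPerformers searches the remote source, then stores each hit. A failed
// insert is recorded and reported, but does not abort the rest of the run.
func importPerformers(ctx context.Context, src performerSearcher, dst performerCreator, q string) (imported int, failures []error, err error) {
	found, err := src.SearchPerformers(ctx, q)
	if err != nil {
		return 0, nil, fmt.Errorf("search %q: %w", q, err)
	}
	for i := range found {
		if err := dst.Create(ctx, &found[i]); err != nil {
			failures = append(failures, fmt.Errorf("%s: %w", found[i].Name, err))
			continue
		}
		imported++
	}
	return imported, failures, nil
}

// stubSource returns a fixed result set in place of a real scraper.
type stubSource []Performer

func (s stubSource) SearchPerformers(context.Context, string) ([]Performer, error) {
	return s, nil
}

// memStore is an in-memory stand-in for the database store.
type memStore struct{ names map[string]bool }

func (m *memStore) Create(_ context.Context, p *Performer) error {
	if m.names[p.Name] {
		return errors.New("already exists")
	}
	m.names[p.Name] = true
	return nil
}

// demo imports three hits, one of which is a duplicate.
func demo() (imported, failed int) {
	src := stubSource{{Name: "A"}, {Name: "B"}, {Name: "A"}}
	dst := &memStore{names: map[string]bool{}}
	n, failures, _ := importPerformers(context.Background(), src, dst, "a")
	return n, len(failures)
}

func main() {
	imported, failed := demo()
	fmt.Printf("imported=%d failed=%d\n", imported, failed)
}
```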

Search Flow

User runs: goondex performer-search "Riley"
              │
              ▼
┌─────────────────────────────────┐
│  1. Parse command arguments     │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  2. Open local database         │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  3. PerformerStore.Search()     │
│     - Execute SQL LIKE query    │
│     - Parse rows into structs   │
└─────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────┐
│  4. Format and display results  │
└─────────────────────────────────┘

Design Principles

1. Local-First

  • All data stored locally in SQLite
  • Search requires no internet connection
  • External APIs only for imports/updates

2. Pluggable Scrapers

  • Standard interface for all sources
  • Easy to add new scrapers
  • Registry-based discovery

3. Thin Database Layer

  • Stores are lightweight wrappers
  • SQL is visible and auditable
  • No heavy ORM

4. Clean Models

  • Domain models are pure Go structs
  • No tags/annotations beyond JSON
  • Clear separation from DB layer

5. Idempotent Operations

  • Re-running imports is safe
  • Merge strategies prevent data loss
  • Duplicate detection

Technology Stack

  • Language: Go 1.25+
  • Database: SQLite (modernc.org/sqlite)
  • CLI Framework: Cobra
  • HTTP Client: net/http (stdlib)
  • JSON: encoding/json (stdlib)

Directory Structure

Goondex/
├── cmd/
│   ├── goondex/        # CLI application
│   └── goondexd/       # Daemon (future)
├── internal/
│   ├── db/             # Database layer
│   ├── images/         # Image management
│   ├── metadata/       # Resolver/strategies
│   ├── model/          # Data models
│   ├── scraper/        # Scraper system
│   ├── ui/             # TUI/GUI
│   └── util/           # Utilities
├── config/             # YAML configs
├── docs/               # Documentation
├── scripts/            # Dev scripts
└── assets/             # Static assets

Future Enhancements

v0.2.x

  • Metadata resolver with field strategies
  • Image caching system
  • Full-text search (FTS5)
  • TUI browser

v0.3.x

  • Background daemon
  • Scheduled imports
  • Duplicate detection
  • Preview sprites

v0.4.x

  • Web UI
  • REST API
  • Multi-source priority
  • Plugin system