
Database Schema

Overview

Goondex uses SQLite as its database engine with the following characteristics:

  • Journal Mode: WAL (Write-Ahead Logging)
  • Foreign Keys: Enabled (SQLite enforces them per connection, via PRAGMA foreign_keys = ON)
  • Date Format: RFC3339 (ISO 8601)
  • Location: Configurable (default: ./goondex.db)
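
The settings above can be sketched as a small connection helper. This is an illustration in Python's standard sqlite3 module, not Goondex's actual code; the function name open_db is hypothetical.

```python
import sqlite3


def open_db(path):
    """Open a SQLite database with WAL journaling and foreign keys enforced."""
    conn = sqlite3.connect(path)
    # WAL is persistent: once set, the mode is stored in the database file.
    conn.execute("PRAGMA journal_mode = WAL")
    # Foreign key enforcement is off by default and must be enabled
    # on every new connection.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn
```

Calling open_db("./goondex.db") would open the default location listed above. Note that an in-memory database cannot use WAL, so the helper only makes sense with a file path.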

Schema Diagram

┌──────────────┐       ┌──────────────────┐       ┌──────────────┐
│  performers  │       │ scene_performers │       │    scenes    │
├──────────────┤       ├──────────────────┤       ├──────────────┤
│ id (PK)      │       │ scene_id (FK)    │       │ id (PK)      │
│ name         │◄──────┤ performer_id(FK) ├──────►│ title        │
│ aliases      │       └──────────────────┘       │ code         │
│ nationality  │                                  │ date         │
│ country      │                                  │ studio_id(FK)│
│ gender       │                                  │ description  │
│ image_path   │                                  │ image_path   │
│ image_url    │                                  │ image_url    │
│ bio          │                                  │ director     │
│ source       │                                  │ url          │
│ source_id    │                                  │ source       │
│ created_at   │                                  │ source_id    │
│ updated_at   │                                  │ created_at   │
└──────────────┘                                  │ updated_at   │
                                                  └──────────────┘
                                                         │
                                                         │
                  ┌──────────────────┐                  │
                  │   scene_tags     │                  │
                  ├──────────────────┤                  │
                  │ scene_id (FK)    │◄─────────────────┘
                  │ tag_id (FK)      ├──────┐
                  └──────────────────┘      │
                                            │
┌──────────────┐                            │
│   studios    │                            │
├──────────────┤                            │
│ id (PK)      │                            │
│ name         │                            │
│ parent_id(FK)│────┐                       │
│ image_path   │    │(self-ref)             │
│ image_url    │    │                       │
│ description  │    │                       │
│ source       │    │                       ▼
│ source_id    │    │              ┌──────────────┐
│ created_at   │    │              │     tags     │
│ updated_at   │    │              ├──────────────┤
└──────────────┘    │              │ id (PK)      │
        ▲           │              │ name (UNIQUE)│
        │           │              │ source       │
        │           │              │ source_id    │
        └───────────┘              │ created_at   │
                                   │ updated_at   │
                                   └──────────────┘

Table Definitions

performers

Stores performer/actor information.

Column       Type     Constraints                Description
id           INTEGER  PRIMARY KEY AUTOINCREMENT  Local database ID
name         TEXT     NOT NULL                   Performer's name
aliases      TEXT                                Comma-separated aliases
nationality  TEXT                                ISO country code (e.g., "US")
country      TEXT                                Full country name
gender       TEXT                                male/female/trans/other
image_path   TEXT                                Local file path to image
image_url    TEXT                                Remote URL to image
bio          TEXT                                Biography/description
source       TEXT                                Source name (e.g., "tpdb")
source_id    TEXT                                ID at the source
created_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp
updated_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp

Indexes:

  • idx_performers_name on name
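
As a rough illustration, the performers table could be created as follows. The column names and the index come from the definition above; the strftime expression standing in for "DEFAULT NOW" is an assumption, since the actual default expression is not shown in this document.

```python
import sqlite3

# Hypothetical DDL: the default expression for the timestamps is assumed,
# chosen so that new rows get an RFC3339 UTC timestamp.
PERFORMERS_DDL = """
CREATE TABLE performers (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    name        TEXT NOT NULL,
    aliases     TEXT,
    nationality TEXT,
    country     TEXT,
    gender      TEXT,
    image_path  TEXT,
    image_url   TEXT,
    bio         TEXT,
    source      TEXT,
    source_id   TEXT,
    created_at  TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    updated_at  TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))
);
CREATE INDEX idx_performers_name ON performers(name);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(PERFORMERS_DDL)
# Only name is required; the timestamps are filled in by the defaults.
conn.execute("INSERT INTO performers (name) VALUES (?)", ("Example Performer",))
created_at = conn.execute("SELECT created_at FROM performers").fetchone()[0]
```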

studios

Stores production companies and networks.

Column       Type     Constraints                Description
id           INTEGER  PRIMARY KEY AUTOINCREMENT  Local database ID
name         TEXT     NOT NULL                   Studio/site name
parent_id    INTEGER  FK → studios(id)           Parent studio/network ID
image_path   TEXT                                Local file path to logo
image_url    TEXT                                Remote URL to logo
description  TEXT                                Studio description
source       TEXT                                Source name
source_id    TEXT                                ID at the source
created_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp
updated_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp

Indexes:

  • idx_studios_name on name

Foreign Keys:

  • parent_id references studios(id) ON DELETE SET NULL
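
The effect of ON DELETE SET NULL on the self-referencing parent_id can be seen in a small sketch. The table here is a trimmed, hypothetical version of studios with only the columns needed for the demonstration, and the studio names are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE actions to fire
conn.execute("""
    CREATE TABLE studios (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES studios(id) ON DELETE SET NULL
    )
""")
conn.execute("INSERT INTO studios (id, name) VALUES (1, 'Example Network')")
conn.execute("INSERT INTO studios (id, name, parent_id) VALUES (2, 'Example Site', 1)")

# Deleting the parent keeps the child row and clears its parent_id.
conn.execute("DELETE FROM studios WHERE id = 1")
remaining = conn.execute("SELECT id, name, parent_id FROM studios").fetchall()
```

After the delete, the child studio survives as a top-level studio rather than being removed along with its network.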

tags

Stores content tags and categories.

Column       Type     Constraints                Description
id           INTEGER  PRIMARY KEY AUTOINCREMENT  Local database ID
name         TEXT     NOT NULL, UNIQUE           Tag name
source       TEXT                                Source name
source_id    TEXT                                ID at the source
created_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp
updated_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp

Indexes:

  • idx_tags_name on name

scenes

Stores video scenes/titles.

Column       Type     Constraints                Description
id           INTEGER  PRIMARY KEY AUTOINCREMENT  Local database ID
title        TEXT     NOT NULL                   Scene title
code         TEXT                                DVD code, scene ID, etc.
date         TEXT                                Release date (ISO 8601)
studio_id    INTEGER  FK → studios(id)           Studio ID
description  TEXT                                Scene description/synopsis
image_path   TEXT                                Local file path to cover
image_url    TEXT                                Remote URL to cover
director     TEXT                                Director name
url          TEXT                                Scene URL
source       TEXT                                Source name
source_id    TEXT                                ID at the source
created_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp
updated_at   TEXT     NOT NULL, DEFAULT NOW      RFC3339 timestamp

Indexes:

  • idx_scenes_title on title
  • idx_scenes_code on code

Foreign Keys:

  • studio_id references studios(id) ON DELETE SET NULL

scene_performers

Junction table for many-to-many relationship between scenes and performers.

Column        Type     Constraints          Description
scene_id      INTEGER  FK → scenes(id)      Scene ID
performer_id  INTEGER  FK → performers(id)  Performer ID

Primary Key: (scene_id, performer_id)

Foreign Keys:

  • scene_id references scenes(id) ON DELETE CASCADE
  • performer_id references performers(id) ON DELETE CASCADE

scene_tags

Junction table for many-to-many relationship between scenes and tags.

Column    Type     Constraints      Description
scene_id  INTEGER  FK → scenes(id)  Scene ID
tag_id    INTEGER  FK → tags(id)    Tag ID

Primary Key: (scene_id, tag_id)

Foreign Keys:

  • scene_id references scenes(id) ON DELETE CASCADE
  • tag_id references tags(id) ON DELETE CASCADE
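
The junction tables combine a composite primary key with ON DELETE CASCADE. The sketch below shows both behaviors for scene_performers, using trimmed, hypothetical versions of the tables and placeholder data; scene_tags would behave the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE scenes     (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE performers (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE scene_performers (
        scene_id     INTEGER NOT NULL REFERENCES scenes(id)     ON DELETE CASCADE,
        performer_id INTEGER NOT NULL REFERENCES performers(id) ON DELETE CASCADE,
        PRIMARY KEY (scene_id, performer_id)
    );
    INSERT INTO scenes     VALUES (1, 'Example Scene');
    INSERT INTO performers VALUES (1, 'Performer A'), (2, 'Performer B');
    INSERT INTO scene_performers VALUES (1, 1), (1, 2);
""")

# The composite primary key rejects a duplicate scene/performer link.
try:
    conn.execute("INSERT INTO scene_performers VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the scene removes its links but leaves the performers in place.
conn.execute("DELETE FROM scenes WHERE id = 1")
links_left = conn.execute("SELECT COUNT(*) FROM scene_performers").fetchone()[0]
performers_left = conn.execute("SELECT COUNT(*) FROM performers").fetchone()[0]
```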

Common Queries

Search Performers

SELECT * FROM performers
WHERE name LIKE '%Riley%' OR aliases LIKE '%Riley%'
ORDER BY name;
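
In application code the search term would normally be bound as a parameter rather than written into the SQL. A minimal sketch follows; the function name search_performers and the sample rows are hypothetical.

```python
import sqlite3


def search_performers(conn, term):
    """Substring search over name and aliases using bound parameters."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT id, name, aliases FROM performers "
        "WHERE name LIKE ? OR aliases LIKE ? "
        "ORDER BY name",
        (pattern, pattern),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, aliases TEXT)"
)
conn.executemany(
    "INSERT INTO performers (name, aliases) VALUES (?, ?)",
    [
        ("Alex Example", "A. Example, Lex"),
        ("Sam Sample", None),
        ("Jo Placeholder", "Sammy P"),
    ],
)
```

SQLite's LIKE is case-insensitive for ASCII characters by default, so search_performers(conn, "sam") matches both "Sam Sample" (by name) and "Jo Placeholder" (by alias). A pattern with a leading % generally cannot use idx_performers_name, so this query scans the table.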

Get Scene Details

-- Get scene
SELECT * FROM scenes WHERE id = ?;

-- Get performers for scene
SELECT p.* FROM performers p
JOIN scene_performers sp ON p.id = sp.performer_id
WHERE sp.scene_id = ?;

-- Get tags for scene
SELECT t.* FROM tags t
JOIN scene_tags st ON t.id = st.tag_id
WHERE st.scene_id = ?;

-- Get studio for scene
SELECT s.* FROM studios s
JOIN scenes sc ON s.id = sc.studio_id
WHERE sc.id = ?;
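
The four lookups above can be combined into one helper that returns a scene together with its related records. This is a sketch against a trimmed, hypothetical schema with placeholder data; get_scene_details and make_demo_db are illustrative names, not part of Goondex.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with only the columns used below."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows can be indexed by column name
    conn.executescript("""
        CREATE TABLE studios    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE scenes     (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                                 studio_id INTEGER REFERENCES studios(id));
        CREATE TABLE performers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE tags       (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE scene_performers (scene_id INTEGER, performer_id INTEGER);
        CREATE TABLE scene_tags       (scene_id INTEGER, tag_id INTEGER);
        INSERT INTO studios    VALUES (1, 'Example Studio');
        INSERT INTO scenes     VALUES (1, 'Example Scene', 1);
        INSERT INTO performers VALUES (1, 'Performer A'), (2, 'Performer B');
        INSERT INTO tags       VALUES (1, 'example-tag');
        INSERT INTO scene_performers VALUES (1, 1), (1, 2);
        INSERT INTO scene_tags       VALUES (1, 1);
    """)
    return conn


def get_scene_details(conn, scene_id):
    """Run the four lookups and merge them; returns None for an unknown scene."""
    scene = conn.execute("SELECT * FROM scenes WHERE id = ?", (scene_id,)).fetchone()
    if scene is None:
        return None
    performers = conn.execute(
        "SELECT p.name FROM performers p "
        "JOIN scene_performers sp ON p.id = sp.performer_id "
        "WHERE sp.scene_id = ? ORDER BY p.name", (scene_id,)).fetchall()
    tags = conn.execute(
        "SELECT t.name FROM tags t "
        "JOIN scene_tags st ON t.id = st.tag_id "
        "WHERE st.scene_id = ? ORDER BY t.name", (scene_id,)).fetchall()
    studio = conn.execute(
        "SELECT s.name FROM studios s "
        "JOIN scenes sc ON s.id = sc.studio_id "
        "WHERE sc.id = ?", (scene_id,)).fetchone()
    return {
        "title": scene["title"],
        "studio": studio["name"] if studio else None,  # studio_id may be NULL
        "performers": [row["name"] for row in performers],
        "tags": [row["name"] for row in tags],
    }
```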

Find Scenes by Performer

SELECT s.* FROM scenes s
JOIN scene_performers sp ON s.id = sp.scene_id
JOIN performers p ON sp.performer_id = p.id
WHERE p.name LIKE '%Riley Reid%'
ORDER BY s.date DESC;

Get Studio Hierarchy

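-- Get a studio and all of its child studios, level by level
-- (to walk up to parent studios instead, join ON s.id = st.parent_id)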
WITH RECURSIVE studio_tree AS (
  SELECT id, name, parent_id, 0 AS level
  FROM studios
  WHERE id = ?

  UNION ALL

  SELECT s.id, s.name, s.parent_id, st.level + 1
  FROM studios s
  JOIN studio_tree st ON s.parent_id = st.id
)
SELECT * FROM studio_tree;
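
Because the recursive step joins on s.parent_id = st.id, the query walks from the starting studio down to its children. The sketch below runs that query next to a variant that walks up to the ancestors, on a small hypothetical hierarchy with placeholder names.

```python
import sqlite3

# Walks down: each step finds studios whose parent is already in the tree.
DESCENDANTS = """
WITH RECURSIVE studio_tree AS (
  SELECT id, name, parent_id, 0 AS level FROM studios WHERE id = ?
  UNION ALL
  SELECT s.id, s.name, s.parent_id, st.level + 1
  FROM studios s JOIN studio_tree st ON s.parent_id = st.id
)
SELECT name, level FROM studio_tree ORDER BY level, name
"""

# Walks up: each step finds the parent of a studio already in the tree.
ANCESTORS = """
WITH RECURSIVE studio_tree AS (
  SELECT id, name, parent_id, 0 AS level FROM studios WHERE id = ?
  UNION ALL
  SELECT s.id, s.name, s.parent_id, st.level + 1
  FROM studios s JOIN studio_tree st ON s.id = st.parent_id
)
SELECT name, level FROM studio_tree ORDER BY level, name
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE studios (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                          parent_id INTEGER REFERENCES studios(id));
    INSERT INTO studios VALUES (1, 'Network', NULL);
    INSERT INTO studios VALUES (2, 'Site A', 1);
    INSERT INTO studios VALUES (3, 'Site B', 1);
    INSERT INTO studios VALUES (4, 'Sub-site', 2);
""")

descendants = conn.execute(DESCENDANTS, (1,)).fetchall()
ancestors = conn.execute(ANCESTORS, (4,)).fetchall()
```

Here descendants holds the whole tree under 'Network', while ancestors holds the chain from 'Sub-site' up to 'Network'; level counts the distance from the starting studio in both cases.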

Migration Strategy

For future schema changes:

  1. Version tracking in a schema_version table
  2. Migration scripts in internal/db/migrations/
  3. Up/down migration support
  4. Automatic migration on startup
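
One way the strategy above might look in practice is sketched below. Only the schema_version table name comes from the list; the migration contents, the single version column, and the function names are assumptions, and down migrations are left out for brevity.

```python
import sqlite3

# Hypothetical migrations as (version, SQL) pairs. Real ones would be loaded
# from scripts in internal/db/migrations/.
MIGRATIONS = [
    (1, "CREATE TABLE tags (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL UNIQUE)"),
    (2, "ALTER TABLE tags ADD COLUMN source TEXT"),
]


def current_version(conn):
    """Return the highest applied version, or 0 for a fresh database."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply every migration newer than the recorded version, in order."""
    conn.isolation_level = None  # manage transactions explicitly
    current = current_version(conn)
    applied = []
    for version, sql in sorted(MIGRATIONS):
        if version <= current:
            continue
        # Run the schema change and the version bump in one transaction,
        # so a failed migration leaves no partial state behind.
        conn.execute("BEGIN")
        try:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            conn.execute("COMMIT")
        except sqlite3.Error:
            conn.execute("ROLLBACK")
            raise
        applied.append(version)
    return applied
```

Calling migrate(conn) on startup would bring a fresh database to the latest version, and calling it again would apply nothing.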

Performance Considerations

Current Indexes

  • Name fields: performers.name, studios.name, tags.name
  • Title and code fields: scenes.title, scenes.code

Future Optimizations (v0.2+)

  • Full-text search indexes (FTS5)
  • Composite indexes for common queries
  • Materialized views for complex joins
  • Separate alias table for normalized storage
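
To illustrate the last item, a normalized alias table could be filled by splitting the existing comma-separated aliases column. The performer_aliases table and its index are hypothetical and not part of the current schema; the performer data is placeholder text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE performers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, aliases TEXT);
    -- Hypothetical normalized table: one row per performer/alias pair.
    CREATE TABLE performer_aliases (
        performer_id INTEGER NOT NULL REFERENCES performers(id) ON DELETE CASCADE,
        alias        TEXT NOT NULL,
        PRIMARY KEY (performer_id, alias)
    );
    CREATE INDEX idx_performer_aliases_alias ON performer_aliases(alias);
    INSERT INTO performers VALUES (1, 'Alex Example', 'A. Example, Lex');
    INSERT INTO performers VALUES (2, 'Sam Sample', NULL);
""")

# Split each comma-separated value into one trimmed row per alias.
rows = conn.execute("SELECT id, aliases FROM performers").fetchall()
for performer_id, aliases in rows:
    for alias in (part.strip() for part in (aliases or "").split(",")):
        if alias:
            conn.execute(
                "INSERT OR IGNORE INTO performer_aliases VALUES (?, ?)",
                (performer_id, alias),
            )

# Exact-match lookup by alias, which can use the alias index.
match = conn.execute(
    "SELECT p.name FROM performers p "
    "JOIN performer_aliases a ON a.performer_id = p.id "
    "WHERE a.alias = ?",
    ("Lex",),
).fetchall()
alias_count = conn.execute("SELECT COUNT(*) FROM performer_aliases").fetchone()[0]
```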

Backup and Restore

Backup

# SQLite backup
sqlite3 goondex.db ".backup goondex-backup.db"

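# Or a plain file copy, but only while Goondex is stopped: in WAL mode,
# recent writes may still sit in goondex.db-wal, so checkpoint first
sqlite3 goondex.db "PRAGMA wal_checkpoint(TRUNCATE);"
cp goondex.db goondex-backup.db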
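
The same online backup mechanism behind the .backup command is also available from code. Below is a minimal sketch using Python's sqlite3.Connection.backup; the function name backup_database is hypothetical.

```python
import sqlite3


def backup_database(src_path, dest_path):
    """Copy a SQLite database with the online backup API.

    Unlike a plain file copy, this is safe while other connections are
    using the database, and it includes committed data still in the WAL.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

For example, backup_database("goondex.db", "goondex-backup.db") would produce the same kind of snapshot as the first command above.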

Restore

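# Stop Goondex, then copy the backup over the current database
# and remove any stale WAL/SHM files left from the old database
cp goondex-backup.db goondex.db
rm -f goondex.db-wal goondex.db-shm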

Database Maintenance

Analyze Statistics

ANALYZE;

Vacuum (Rebuild)

VACUUM;

Check Integrity

PRAGMA integrity_check;

View Database Info

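PRAGMA database_list;      -- attached database files
PRAGMA table_list;         -- tables and views (SQLite 3.37.0+)
PRAGMA foreign_key_check;  -- rows that violate foreign key constraints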
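
The integrity and foreign key checks can also be run from code, for instance as a periodic health check. The sketch below uses a hypothetical check_database helper and a deliberately broken demo database, with enforcement left off so that an orphaned row can be inserted.

```python
import sqlite3


def check_database(conn):
    """Return (integrity_messages, foreign_key_violations)."""
    # integrity_check yields a single row containing 'ok' when nothing is wrong.
    integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    # foreign_key_check yields one row per violating row:
    # (table, rowid, referenced table, foreign key index).
    violations = conn.execute("PRAGMA foreign_key_check").fetchall()
    return integrity, violations


# Demo: foreign key enforcement is off by default, so this orphaned
# studio_id is accepted on insert and only shows up in the check.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE studios (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scenes  (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                          studio_id INTEGER REFERENCES studios(id));
    INSERT INTO scenes VALUES (1, 'Orphaned Scene', 42);
""")
integrity, violations = check_database(conn)
```

In this demo the file structure itself is sound, so integrity reports 'ok', while violations contains one entry pointing at the scenes row whose studio does not exist.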