Major Features:
- ✅ Complete TPDB scraper implementation with real API calls
- ✅ Auto-fetch on cache miss: search commands now automatically import from TPDB when not found locally
- ✅ Comprehensive documentation (5 markdown files for Bookstack)
- ✅ Import commands for performers, studios, and scenes
- ✅ Fixed JSON type mismatches (aliases array, studio numeric IDs)

Changes:
1. TPDB Scraper (internal/scraper/tpdb/):
   - types.go: Full API response structures with correct types
     - PerformerResponse.Aliases: string → []string (TPDB returns array)
     - StudioResponse.ID: string → int (TPDB returns numeric IDs)
     - SiteInfo.ID: string → int (scenes reference studios by number)
   - mapper.go: Maps TPDB responses to internal models
     - Converts aliases array to comma-separated string
     - Converts numeric studio IDs to strings using strconv.Itoa()
   - scraper.go: Real HTTP client with Bearer token auth
     - SearchPerformers, SearchStudios, SearchScenes implemented
     - GetPerformerByID, GetStudioByID, GetSceneByID implemented
2. CLI Auto-Fetch (cmd/goondex/main.go):
   - performer-search: Auto-fetches from TPDB if local DB empty
   - studio-search: Auto-fetches from TPDB if local DB empty
   - scene-search: Auto-fetches basic metadata (no relationships)
   - Graceful handling of missing TPDB_API_KEY
   - Import → search again to get local IDs
3. Documentation (docs/):
   - INDEX.md: Documentation overview and navigation
   - ARCHITECTURE.md: System design, data flow, component diagrams
   - DATABASE_SCHEMA.md: Complete schema with relationships and indexes
   - CLI_REFERENCE.md: All commands with examples
   - TPDB_INTEGRATION.md: API guide, data mapping, best practices
4. Fixes:
   - .gitignore: Fixed pattern to allow cmd/goondex/* and cmd/goondexd/*
   - README: Updated to reflect TPDB integration and auto-fetch

Testing:
- ✅ performer-search "Riley Reid" - auto-fetched 2 performers, cached
- ✅ studio-search "Brazzers" - auto-fetched 12 studios, cached
- ✅ Aliases now display correctly as comma-separated list
- ✅ Studio IDs properly converted from numeric to string

API Integration:
- Base URL: https://api.theporndb.net
- Authentication: Bearer token via TPDB_API_KEY env var
- Endpoints: /performers, /sites, /scenes
- Rate limiting handled with warnings

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Database Schema
Overview
Goondex uses SQLite as its database engine with the following characteristics:
- Journal Mode: WAL (Write-Ahead Logging)
- Foreign Keys: Enabled
- Date Format: RFC3339 (ISO 8601)
- Location: Configurable (default: ./goondex.db)
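The connection settings above can be tried out with a short standalone sketch. It uses Python's built-in sqlite3 module purely for illustration (Goondex itself is written in Go), and the `open_db` helper is a hypothetical name, not part of the project.

```python
import os
import sqlite3
import tempfile


def open_db(path):
    """Open a SQLite database with WAL journaling and foreign keys enabled."""
    conn = sqlite3.connect(path)
    # WAL mode is persistent: once set, it is recorded in the database file.
    conn.execute("PRAGMA journal_mode = WAL")
    # Foreign key enforcement is per connection and off by default in SQLite.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


# A temporary file is used because in-memory databases cannot use WAL.
db_path = os.path.join(tempfile.mkdtemp(), "goondex.db")
conn = open_db(db_path)
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
print(journal_mode, foreign_keys)
conn.close()
```

Because foreign key enforcement is a per-connection setting, any tool that opens the database directly needs to enable it again.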
Schema Diagram
┌──────────────┐       ┌──────────────────┐       ┌──────────────┐
│  performers  │       │ scene_performers │       │    scenes    │
├──────────────┤       ├──────────────────┤       ├──────────────┤
│ id (PK)      │       │ scene_id (FK)    │       │ id (PK)      │
│ name         │◄──────┤ performer_id(FK) ├──────►│ title        │
│ aliases      │       └──────────────────┘       │ code         │
│ nationality  │                                  │ date         │
│ country      │                                  │ studio_id(FK)│
│ gender       │                                  │ description  │
│ image_path   │                                  │ image_path   │
│ image_url    │                                  │ image_url    │
│ bio          │                                  │ director     │
│ source       │                                  │ url          │
│ source_id    │                                  │ source       │
│ created_at   │                                  │ source_id    │
│ updated_at   │                                  │ created_at   │
└──────────────┘                                  │ updated_at   │
                                                  └──────────────┘
                                                             │
                                                             │
                       ┌──────────────────┐                  │
                       │    scene_tags    │                  │
                       ├──────────────────┤                  │
                       │ scene_id (FK)    │◄─────────────────┘
                       │ tag_id (FK)      ├──────┐
                       └──────────────────┘      │
                                                 │
┌──────────────┐                                 │
│   studios    │                                 │
├──────────────┤                                 │
│ id (PK)      │                                 │
│ name         │                                 │
│ parent_id(FK)│────┐                            │
│ image_path   │    │(self-ref)                  │
│ image_url    │    │                            │
│ description  │    │                            │
│ source       │    │                            ▼
│ source_id    │    │                     ┌──────────────┐
│ created_at   │    │                     │     tags     │
│ updated_at   │    │                     ├──────────────┤
└──────────────┘    │                     │ id (PK)      │
        ▲           │                     │ name (UNIQUE)│
        │           │                     │ source       │
        │           │                     │ source_id    │
        └───────────┘                     │ created_at   │
                                          │ updated_at   │
                                          └──────────────┘
Table Definitions
performers
Stores performer/actor information.
| Column | Type | Constraints | Description |
|---|---|---|---|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| name | TEXT | NOT NULL | Performer's name |
| aliases | TEXT | | Comma-separated aliases |
| nationality | TEXT | | ISO country code (e.g., "US") |
| country | TEXT | | Full country name |
| gender | TEXT | | male/female/trans/other |
| image_path | TEXT | | Local file path to image |
| image_url | TEXT | | Remote URL to image |
| bio | TEXT | | Biography/description |
| source | TEXT | | Source name (e.g., "tpdb") |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
Indexes:
- idx_performers_name on name
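For orientation, the performers table could be created with DDL along the following lines. This is a sketch reconstructed from the column table, not the project's actual migration: in particular, the `strftime` default is only one plausible reading of "DEFAULT NOW", and the sample row is made up.

```python
import sqlite3
from datetime import datetime

# Assumed DDL: the timestamp default is a guess at what "DEFAULT NOW" stands for.
PERFORMERS_DDL = """
CREATE TABLE performers (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    name        TEXT NOT NULL,
    aliases     TEXT,
    nationality TEXT,
    country     TEXT,
    gender      TEXT,
    image_path  TEXT,
    image_url   TEXT,
    bio         TEXT,
    source      TEXT,
    source_id   TEXT,
    created_at  TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    updated_at  TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))
);
CREATE INDEX idx_performers_name ON performers(name);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(PERFORMERS_DDL)
conn.execute(
    "INSERT INTO performers (name, aliases, source, source_id) VALUES (?, ?, ?, ?)",
    ("Example Performer", "Alias One, Alias Two", "tpdb", "12345"),
)
row = conn.execute("SELECT id, name, created_at FROM performers").fetchone()
# The default timestamp is stored as text and parses as an RFC3339 UTC time.
created = datetime.strptime(row[2], "%Y-%m-%dT%H:%M:%SZ")
print(row[0], row[1], created.isoformat())
```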
studios
Stores production companies and networks.
| Column | Type | Constraints | Description |
|---|---|---|---|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| name | TEXT | NOT NULL | Studio/site name |
| parent_id | INTEGER | FK → studios(id) | Parent studio/network ID |
| image_path | TEXT | | Local file path to logo |
| image_url | TEXT | | Remote URL to logo |
| description | TEXT | | Studio description |
| source | TEXT | | Source name |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
Indexes:
- idx_studios_name on name
Foreign Keys:
- parent_id references studios(id) ON DELETE SET NULL
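The effect of ON DELETE SET NULL on the self-referencing parent_id can be seen in a small experiment. The table below is a trimmed-down stand-in for studios (only three columns), and the studio names are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE actions to fire
conn.execute(
    """
    CREATE TABLE studios (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES studios(id) ON DELETE SET NULL
    )
    """
)
conn.execute("INSERT INTO studios (id, name) VALUES (1, 'Example Network')")
conn.execute("INSERT INTO studios (id, name, parent_id) VALUES (2, 'Example Site', 1)")
conn.commit()

# Deleting the parent keeps the child row and clears its parent_id.
conn.execute("DELETE FROM studios WHERE id = 1")
conn.commit()
remaining = conn.execute("SELECT id, name, parent_id FROM studios").fetchall()
print(remaining)  # [(2, 'Example Site', None)]
```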
tags
Stores content tags and categories.
| Column | Type | Constraints | Description |
|---|---|---|---|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| name | TEXT | NOT NULL, UNIQUE | Tag name |
| source | TEXT | | Source name |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
Indexes:
- idx_tags_name on name
scenes
Stores video scenes/titles.
| Column | Type | Constraints | Description |
|---|---|---|---|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| title | TEXT | NOT NULL | Scene title |
| code | TEXT | | DVD code, scene ID, etc. |
| date | TEXT | | Release date (ISO 8601) |
| studio_id | INTEGER | FK → studios(id) | Studio ID |
| description | TEXT | | Scene description/synopsis |
| image_path | TEXT | | Local file path to cover |
| image_url | TEXT | | Remote URL to cover |
| director | TEXT | | Director name |
| url | TEXT | | Scene URL |
| source | TEXT | | Source name |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
Indexes:
- idx_scenes_title on title
- idx_scenes_code on code
Foreign Keys:
- studio_id references studios(id) ON DELETE SET NULL
scene_performers
Junction table for many-to-many relationship between scenes and performers.
| Column | Type | Constraints | Description |
|---|---|---|---|
| scene_id | INTEGER | FK → scenes(id) | Scene ID |
| performer_id | INTEGER | FK → performers(id) | Performer ID |
Primary Key: (scene_id, performer_id)
Foreign Keys:
- scene_id references scenes(id) ON DELETE CASCADE
- performer_id references performers(id) ON DELETE CASCADE
scene_tags
Junction table for many-to-many relationship between scenes and tags.
| Column | Type | Constraints | Description |
|---|---|---|---|
| scene_id | INTEGER | FK → scenes(id) | Scene ID |
| tag_id | INTEGER | FK → tags(id) | Tag ID |
Primary Key: (scene_id, tag_id)
Foreign Keys:
- scene_id references scenes(id) ON DELETE CASCADE
- tag_id references tags(id) ON DELETE CASCADE
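To illustrate how the junction tables behave, the sketch below builds a reduced scene_performers table and shows both the cascade on delete and the composite primary key rejecting a duplicate link. The parent tables are cut down to two columns and the data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE scenes (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE performers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scene_performers (
        scene_id     INTEGER REFERENCES scenes(id) ON DELETE CASCADE,
        performer_id INTEGER REFERENCES performers(id) ON DELETE CASCADE,
        PRIMARY KEY (scene_id, performer_id)
    );
    INSERT INTO scenes VALUES (1, 'Scene A'), (2, 'Scene B');
    INSERT INTO performers VALUES (10, 'Performer X');
    INSERT INTO scene_performers VALUES (1, 10), (2, 10);
    """
)

# Deleting a scene removes its links but leaves the performer row in place.
conn.execute("DELETE FROM scenes WHERE id = 1")
links = conn.execute("SELECT scene_id, performer_id FROM scene_performers").fetchall()
performer_count = conn.execute("SELECT COUNT(*) FROM performers").fetchone()[0]

# The composite primary key rejects a second identical link.
try:
    conn.execute("INSERT INTO scene_performers VALUES (2, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(links, performer_count, duplicate_rejected)
```

The same pattern applies to scene_tags, with tags in place of performers.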
Common Queries
Search Performers
SELECT * FROM performers
WHERE name LIKE '%Riley%' OR aliases LIKE '%Riley%'
ORDER BY name;
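When the search term comes from user input, it is safer to bind it as a parameter than to splice it into the SQL text. The following sketch shows one way to do that; `search_performers` is a hypothetical helper and the rows are sample data.

```python
import sqlite3


def search_performers(conn, term):
    """Substring match on name or aliases, with the term bound as a parameter."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT id, name, aliases FROM performers "
        "WHERE name LIKE ? OR aliases LIKE ? ORDER BY name",
        (pattern, pattern),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, aliases TEXT)"
)
conn.executemany(
    "INSERT INTO performers (name, aliases) VALUES (?, ?)",
    [
        ("Alice Example", "Ally E, A. Example"),
        ("Bob Sample", "Robert Sample"),
        ("Carol Demo", "Alice D"),
    ],
)

results = search_performers(conn, "alice")
names = [row[1] for row in results]
print(names)  # ['Alice Example', 'Carol Demo']
```

SQLite's LIKE is case-insensitive for ASCII characters by default, which is why the lowercase term matches. Any `%` or `_` in the term still acts as a wildcard unless it is escaped.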
Get Scene with All Related Data
-- Get scene
SELECT * FROM scenes WHERE id = ?;
-- Get performers for scene
SELECT p.* FROM performers p
JOIN scene_performers sp ON p.id = sp.performer_id
WHERE sp.scene_id = ?;
-- Get tags for scene
SELECT t.* FROM tags t
JOIN scene_tags st ON t.id = st.tag_id
WHERE st.scene_id = ?;
-- Get studio for scene
SELECT s.* FROM studios s
JOIN scenes sc ON s.id = sc.studio_id
WHERE sc.id = ?;
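The four queries above are typically combined into one lookup in application code. Below is a minimal sketch of that assembly step, assuming a reduced schema and a hypothetical `load_scene` helper; it is not the project's actual Go implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # access columns by name
conn.executescript(
    """
    CREATE TABLE studios (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scenes (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                         studio_id INTEGER REFERENCES studios(id));
    CREATE TABLE performers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE scene_performers (scene_id INTEGER, performer_id INTEGER,
                                   PRIMARY KEY (scene_id, performer_id));
    CREATE TABLE scene_tags (scene_id INTEGER, tag_id INTEGER,
                             PRIMARY KEY (scene_id, tag_id));
    INSERT INTO studios VALUES (1, 'Example Studio');
    INSERT INTO scenes VALUES (1, 'Example Scene', 1);
    INSERT INTO performers VALUES (1, 'Performer X'), (2, 'Performer Y');
    INSERT INTO tags VALUES (1, 'tag-a'), (2, 'tag-b');
    INSERT INTO scene_performers VALUES (1, 1), (1, 2);
    INSERT INTO scene_tags VALUES (1, 2);
    """
)


def load_scene(conn, scene_id):
    """Return one scene as a dict with its studio, performers, and tags."""
    scene = conn.execute("SELECT * FROM scenes WHERE id = ?", (scene_id,)).fetchone()
    if scene is None:
        return None
    result = dict(scene)
    studio = conn.execute(
        "SELECT s.name FROM studios s JOIN scenes sc ON s.id = sc.studio_id "
        "WHERE sc.id = ?",
        (scene_id,),
    ).fetchone()
    result["studio"] = studio["name"] if studio else None
    result["performers"] = [
        row["name"]
        for row in conn.execute(
            "SELECT p.name FROM performers p "
            "JOIN scene_performers sp ON p.id = sp.performer_id "
            "WHERE sp.scene_id = ? ORDER BY p.name",
            (scene_id,),
        )
    ]
    result["tags"] = [
        row["name"]
        for row in conn.execute(
            "SELECT t.name FROM tags t "
            "JOIN scene_tags st ON t.id = st.tag_id "
            "WHERE st.scene_id = ? ORDER BY t.name",
            (scene_id,),
        )
    ]
    return result


scene = load_scene(conn, 1)
missing = load_scene(conn, 99)
print(scene)
```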
Find Scenes by Performer
SELECT s.* FROM scenes s
JOIN scene_performers sp ON s.id = sp.scene_id
JOIN performers p ON sp.performer_id = p.id
WHERE p.name LIKE '%Riley Reid%'
ORDER BY s.date DESC;
Get Studio Hierarchy
-- Get a studio and all of its descendants (child studios, their children, ...)
WITH RECURSIVE studio_tree AS (
SELECT id, name, parent_id, 0 AS level
FROM studios
WHERE id = ?
UNION ALL
SELECT s.id, s.name, s.parent_id, st.level + 1
FROM studios s
JOIN studio_tree st ON s.parent_id = st.id
)
SELECT * FROM studio_tree;
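The direction of the join decides which way the recursion walks. Joining on s.parent_id = st.id, as above, descends from a studio to its children; swapping the condition to s.id = st.parent_id climbs from a studio to its parents. The sketch below runs both variants on a made-up four-studio tree.

```python
import sqlite3

DESCENDANTS = """
WITH RECURSIVE studio_tree AS (
    SELECT id, name, parent_id, 0 AS level FROM studios WHERE id = ?
    UNION ALL
    SELECT s.id, s.name, s.parent_id, st.level + 1
    FROM studios s JOIN studio_tree st ON s.parent_id = st.id
)
SELECT name, level FROM studio_tree ORDER BY level, name
"""

ANCESTORS = """
WITH RECURSIVE studio_chain AS (
    SELECT id, name, parent_id, 0 AS level FROM studios WHERE id = ?
    UNION ALL
    SELECT s.id, s.name, s.parent_id, sc.level + 1
    FROM studios s JOIN studio_chain sc ON s.id = sc.parent_id
)
SELECT name, level FROM studio_chain ORDER BY level
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE studios (id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "parent_id INTEGER REFERENCES studios(id))"
)
conn.executemany(
    "INSERT INTO studios VALUES (?, ?, ?)",
    [(1, "Network", None), (2, "Site A", 1), (3, "Site B", 1), (4, "Sub-site", 2)],
)

descendants = conn.execute(DESCENDANTS, (1,)).fetchall()
ancestors = conn.execute(ANCESTORS, (4,)).fetchall()
print(descendants)  # [('Network', 0), ('Site A', 1), ('Site B', 1), ('Sub-site', 2)]
print(ancestors)    # [('Sub-site', 0), ('Site A', 1), ('Network', 2)]
```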
Migration Strategy
For future schema changes:
- Version tracking in a schema_version table
- Migration scripts in internal/db/migrations/
- Up/down migration support
- Automatic migration on startup
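Since this strategy is described as future work, the following is only a rough sketch of how version tracking and automatic upgrades might fit together. The `MIGRATIONS` list, the `schema_version` column layout, and the function names are all assumptions, and down migrations are left out for brevity.

```python
import sqlite3

# Hypothetical migrations as (version, SQL) pairs; real ones would live in files.
MIGRATIONS = [
    (1, "CREATE TABLE tags (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT NOT NULL UNIQUE)"),
    (2, "ALTER TABLE tags ADD COLUMN source TEXT"),
]


def current_version(conn):
    """Return the highest applied version, or 0 for a fresh database."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply pending migrations in order and return the versions applied."""
    applied = []
    start = current_version(conn)
    for version, sql in MIGRATIONS:
        if version <= start:
            continue
        # Each migration and its version record share one explicit transaction.
        conn.execute("BEGIN")
        try:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
        applied.append(version)
    return applied


# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT statements above fully control the transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
first_run = migrate(conn)
second_run = migrate(conn)
print(first_run, second_run, current_version(conn))  # [1, 2] [] 2
```

Running `migrate` on every startup is safe under this design because versions that are already recorded are skipped.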
Performance Considerations
Current Indexes
- Name fields (performers, studios, tags) and the title field (scenes)
- Code field (scenes)
Future Optimizations (v0.2+)
- Full-text search indexes (FTS5)
- Composite indexes for common queries
- Materialized views for complex joins
- Separate alias table for normalized storage
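As an illustration of the last item, a separate alias table could look roughly like this. The `performer_aliases` table, its index, and the `split_aliases` helper are hypothetical names chosen for the sketch; the actual v0.2+ design may differ.

```python
import sqlite3


def split_aliases(aliases):
    """Split a comma-separated alias string into trimmed, non-empty names."""
    if not aliases:
        return []
    return [part.strip() for part in aliases.split(",") if part.strip()]


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE performers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, aliases TEXT);
    -- Hypothetical normalized table; the name and columns are assumptions.
    CREATE TABLE performer_aliases (
        performer_id INTEGER NOT NULL REFERENCES performers(id) ON DELETE CASCADE,
        alias        TEXT NOT NULL,
        PRIMARY KEY (performer_id, alias)
    );
    CREATE INDEX idx_performer_aliases_alias ON performer_aliases(alias);
    INSERT INTO performers VALUES (1, 'Alice Example', 'Ally E, A. Example');
    INSERT INTO performers VALUES (2, 'Bob Sample', NULL);
    """
)

# Backfill the new table from the existing comma-separated column.
for performer_id, aliases in conn.execute("SELECT id, aliases FROM performers").fetchall():
    conn.executemany(
        "INSERT OR IGNORE INTO performer_aliases (performer_id, alias) VALUES (?, ?)",
        [(performer_id, alias) for alias in split_aliases(aliases)],
    )

# An exact alias lookup can now use the index instead of a LIKE scan.
match = conn.execute(
    "SELECT p.name FROM performers p "
    "JOIN performer_aliases pa ON pa.performer_id = p.id "
    "WHERE pa.alias = ?",
    ("Ally E",),
).fetchall()
alias_count = conn.execute("SELECT COUNT(*) FROM performer_aliases").fetchone()[0]
print(match, alias_count)  # [('Alice Example',)] 2
```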
Backup and Restore
Backup
# SQLite backup
sqlite3 goondex.db ".backup goondex-backup.db"
# Or a plain file copy (only safe while no process has the database open;
# in WAL mode, recent writes may still live in goondex.db-wal)
cp goondex.db goondex-backup.db
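The same online backup that the sqlite3 shell's .backup command performs is also available programmatically. The sketch below uses `Connection.backup` from Python's standard library (available since Python 3.7) on throwaway files; `backup_database` is a hypothetical helper name.

```python
import os
import sqlite3
import tempfile


def backup_database(src_path, dest_path):
    """Copy a live SQLite database using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # also picks up changes still held in the WAL file
    finally:
        dest.close()
        src.close()


workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "goondex.db")
backup_path = os.path.join(workdir, "goondex-backup.db")

live = sqlite3.connect(db_path)
live.execute("PRAGMA journal_mode = WAL")
live.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
live.execute("INSERT INTO tags (name) VALUES ('example')")
live.commit()

# The backup runs while the original connection is still open.
backup_database(db_path, backup_path)

restored = sqlite3.connect(backup_path)
tag_names = [row[0] for row in restored.execute("SELECT name FROM tags")]
restored.close()
live.close()
print(tag_names)  # ['example']
```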
Restore
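# Stop Goondex first, then copy the backup over the current database
cp goondex-backup.db goondex.db
# Remove stale WAL files left over from the replaced database
rm -f goondex.db-wal goondex.db-shm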
Database Maintenance
Analyze Statistics
ANALYZE;
Vacuum (Rebuild)
VACUUM;
Check Integrity
PRAGMA integrity_check;
View Database Info
PRAGMA database_list;
PRAGMA table_list;
PRAGMA foreign_key_check;
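These checks can also be scripted, for example as part of a periodic health check. The sketch below wraps two of the pragmas in a hypothetical `check_database` helper and deliberately creates an orphaned row to show what a foreign key violation report looks like.

```python
import sqlite3


def check_database(conn):
    """Return (integrity messages, foreign key violations) for a connection."""
    integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    fk_violations = conn.execute("PRAGMA foreign_key_check").fetchall()
    return integrity, fk_violations


conn = sqlite3.connect(":memory:")
# Enforcement is switched off so that an orphaned row can be created for the demo.
conn.execute("PRAGMA foreign_keys = OFF")
conn.executescript(
    """
    CREATE TABLE studios (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scenes (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        studio_id INTEGER REFERENCES studios(id) ON DELETE SET NULL
    );
    INSERT INTO scenes VALUES (1, 'Orphaned Scene', 42);
    """
)

integrity, fk_violations = check_database(conn)
print(integrity)      # ['ok']
print(fk_violations)  # one row: (table, rowid, referenced table, constraint index)
```

Note that integrity_check does not report foreign key problems, which is why foreign_key_check is run as a separate step.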