Goondex/docs/DATABASE_SCHEMA.md
Team Goon f7d82cd207 v0.1.0-dev2: Full TPDB integration with auto-fetch and comprehensive docs
Major Features:
-  Complete TPDB scraper implementation with real API calls
-  Auto-fetch on cache miss: search commands now automatically import from TPDB when not found locally
-  Comprehensive documentation (5 markdown files for Bookstack)
-  Import commands for performers, studios, and scenes
-  Fixed JSON type mismatches (aliases array, studio numeric IDs)

Changes:
1. TPDB Scraper (internal/scraper/tpdb/):
   - types.go: Full API response structures with correct types
     - PerformerResponse.Aliases: string → []string (TPDB returns array)
     - StudioResponse.ID: string → int (TPDB returns numeric IDs)
     - SiteInfo.ID: string → int (scenes reference studios by number)
   - mapper.go: Maps TPDB responses to internal models
     - Converts aliases array to comma-separated string
     - Converts numeric studio IDs to strings using strconv.Itoa()
   - scraper.go: Real HTTP client with Bearer token auth
     - SearchPerformers, SearchStudios, SearchScenes implemented
     - GetPerformerByID, GetStudioByID, GetSceneByID implemented

2. CLI Auto-Fetch (cmd/goondex/main.go):
   - performer-search: Auto-fetches from TPDB if local DB empty
   - studio-search: Auto-fetches from TPDB if local DB empty
   - scene-search: Auto-fetches basic metadata (no relationships)
   - Graceful handling of missing TPDB_API_KEY
   - Import → search again to get local IDs

3. Documentation (docs/):
   - INDEX.md: Documentation overview and navigation
   - ARCHITECTURE.md: System design, data flow, component diagrams
   - DATABASE_SCHEMA.md: Complete schema with relationships and indexes
   - CLI_REFERENCE.md: All commands with examples
   - TPDB_INTEGRATION.md: API guide, data mapping, best practices

4. Fixes:
   - .gitignore: Fixed pattern to allow cmd/goondex/* and cmd/goondexd/*
   - README: Updated to reflect TPDB integration and auto-fetch

Testing:
-  performer-search "Riley Reid" - auto-fetched 2 performers, cached
-  studio-search "Brazzers" - auto-fetched 12 studios, cached
-  Aliases now display correctly as comma-separated list
-  Studio IDs properly converted from numeric to string

API Integration:
- Base URL: https://api.theporndb.net
- Authentication: Bearer token via TPDB_API_KEY env var
- Endpoints: /performers, /sites, /scenes
- Rate limiting handled with warnings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 22:04:23 -05:00

305 lines
12 KiB
Markdown

# Database Schema
## Overview
Goondex uses SQLite as its database engine with the following characteristics:
- **Journal Mode**: WAL (Write-Ahead Logging)
- **Foreign Keys**: Enabled
- **Date Format**: RFC3339 (ISO 8601)
- **Location**: Configurable (default: `./goondex.db`)
## Schema Diagram
```
┌──────────────┐ ┌──────────────────┐ ┌──────────────┐
│ performers │ │ scene_performers │ │ scenes │
├──────────────┤ ├──────────────────┤ ├──────────────┤
│ id (PK) │ │ scene_id (FK) │ │ id (PK) │
│ name │◄──────┤ performer_id(FK) ├──────►│ title │
│ aliases │ └──────────────────┘ │ code │
│ nationality │ │ date │
│ country │ │ studio_id(FK)│
│ gender │ │ description │
│ image_path │ │ image_path │
│ image_url │ │ image_url │
│ bio │ │ director │
│ source │ │ url │
│ source_id │ │ source │
│ created_at │ │ source_id │
│ updated_at │ │ created_at │
└──────────────┘ │ updated_at │
└──────────────┘
┌──────────────────┐ │
│ scene_tags │ │
├──────────────────┤ │
│ scene_id (FK) │◄─────────────────┘
│ tag_id (FK) ├──────┐
└──────────────────┘ │
┌──────────────┐ │
│ studios │ │
├──────────────┤ │
│ id (PK) │ │
│ name │ │
│ parent_id(FK)│────┐ │
│ image_path │ │(self-ref) │
│ image_url │ │ │
│ description │ │ │
│ source │ │ ▼
│ source_id │ │ ┌──────────────┐
│ created_at │ │ │ tags │
│ updated_at │ │ ├──────────────┤
└──────────────┘ │ │ id (PK) │
▲ │ │ name (UNIQUE)│
│ │ │ source │
│ │ │ source_id │
└───────────┘ │ created_at │
│ updated_at │
└──────────────┘
```
## Table Definitions
### `performers`
Stores performer/actor information.
| Column | Type | Constraints | Description |
|--------------|---------|-----------------------|--------------------------------|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| name | TEXT | NOT NULL | Performer's name |
| aliases | TEXT | | Comma-separated aliases |
| nationality | TEXT | | ISO country code (e.g., "US") |
| country | TEXT | | Full country name |
| gender | TEXT | | male/female/trans/other |
| image_path | TEXT | | Local file path to image |
| image_url | TEXT | | Remote URL to image |
| bio | TEXT | | Biography/description |
| source | TEXT | | Source name (e.g., "tpdb") |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
**Indexes**:
- `idx_performers_name` on `name`
### `studios`
Stores production companies and networks.
| Column | Type | Constraints | Description |
|--------------|---------|-----------------------|--------------------------------|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| name | TEXT | NOT NULL | Studio/site name |
| parent_id | INTEGER | FK → studios(id) | Parent studio/network ID |
| image_path | TEXT | | Local file path to logo |
| image_url | TEXT | | Remote URL to logo |
| description | TEXT | | Studio description |
| source | TEXT | | Source name |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
**Indexes**:
- `idx_studios_name` on `name`
**Foreign Keys**:
- `parent_id` references `studios(id)` ON DELETE SET NULL
### `tags`
Stores content tags and categories.
| Column | Type | Constraints | Description |
|--------------|---------|-----------------------|--------------------------------|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| name | TEXT | NOT NULL, UNIQUE | Tag name |
| source | TEXT | | Source name |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
**Indexes**:
- `idx_tags_name` on `name`
### `scenes`
Stores video scenes/titles.
| Column | Type | Constraints | Description |
|--------------|---------|-----------------------|--------------------------------|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
| title | TEXT | NOT NULL | Scene title |
| code | TEXT | | DVD code, scene ID, etc. |
| date | TEXT | | Release date (ISO 8601) |
| studio_id | INTEGER | FK → studios(id) | Studio ID |
| description | TEXT | | Scene description/synopsis |
| image_path | TEXT | | Local file path to cover |
| image_url | TEXT | | Remote URL to cover |
| director | TEXT | | Director name |
| url | TEXT | | Scene URL |
| source | TEXT | | Source name |
| source_id | TEXT | | ID at the source |
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
**Indexes**:
- `idx_scenes_title` on `title`
- `idx_scenes_code` on `code`
**Foreign Keys**:
- `studio_id` references `studios(id)` ON DELETE SET NULL
### `scene_performers`
Junction table for many-to-many relationship between scenes and performers.
| Column | Type | Constraints | Description |
|---------------|---------|----------------------------|----------------------|
| scene_id | INTEGER | FK → scenes(id) | Scene ID |
| performer_id | INTEGER | FK → performers(id) | Performer ID |
**Primary Key**: `(scene_id, performer_id)`
**Foreign Keys**:
- `scene_id` references `scenes(id)` ON DELETE CASCADE
- `performer_id` references `performers(id)` ON DELETE CASCADE
### `scene_tags`
Junction table for many-to-many relationship between scenes and tags.
| Column | Type | Constraints | Description |
|------------|---------|----------------------------|----------------------|
| scene_id | INTEGER | FK → scenes(id) | Scene ID |
| tag_id | INTEGER | FK → tags(id) | Tag ID |
**Primary Key**: `(scene_id, tag_id)`
**Foreign Keys**:
- `scene_id` references `scenes(id)` ON DELETE CASCADE
- `tag_id` references `tags(id)` ON DELETE CASCADE
## Common Queries
### Search Performers
```sql
SELECT * FROM performers
WHERE name LIKE '%Riley%' OR aliases LIKE '%Riley%'
ORDER BY name;
```
### Get Scene with All Related Data
```sql
-- Get scene
SELECT * FROM scenes WHERE id = ?;
-- Get performers for scene
SELECT p.* FROM performers p
JOIN scene_performers sp ON p.id = sp.performer_id
WHERE sp.scene_id = ?;
-- Get tags for scene
SELECT t.* FROM tags t
JOIN scene_tags st ON t.id = st.tag_id
WHERE st.scene_id = ?;
-- Get studio for scene
SELECT s.* FROM studios s
JOIN scenes sc ON s.id = sc.studio_id
WHERE sc.id = ?;
```
### Find Scenes by Performer
```sql
SELECT s.* FROM scenes s
JOIN scene_performers sp ON s.id = sp.scene_id
JOIN performers p ON sp.performer_id = p.id
WHERE p.name LIKE '%Riley Reid%'
ORDER BY s.date DESC;
```
### Get Studio Hierarchy
```sql
-- Get parent studios
WITH RECURSIVE studio_tree AS (
SELECT id, name, parent_id, 0 AS level
FROM studios
WHERE id = ?
UNION ALL
SELECT s.id, s.name, s.parent_id, st.level + 1
FROM studios s
JOIN studio_tree st ON s.parent_id = st.id
)
SELECT * FROM studio_tree;
```
## Migration Strategy
For future schema changes:
1. Version tracking in a `schema_version` table
2. Migration scripts in `internal/db/migrations/`
3. Up/down migration support
4. Automatic migration on startup
## Performance Considerations
### Current Indexes
- Name fields (performers, studios, tags, scenes)
- Code field (scenes)
### Future Optimizations (v0.2+)
- Full-text search indexes (FTS5)
- Composite indexes for common queries
- Materialized views for complex joins
- Separate alias table for normalized storage
## Backup and Restore
### Backup
```bash
# SQLite backup
sqlite3 goondex.db ".backup goondex-backup.db"
# Or simple copy (safe with WAL mode)
cp goondex.db goondex-backup.db
```
### Restore
```bash
# Copy backup over current
cp goondex-backup.db goondex.db
```
## Database Maintenance
### Analyze Statistics
```sql
ANALYZE;
```
### Vacuum (Rebuild)
```sql
VACUUM;
```
### Check Integrity
```sql
PRAGMA integrity_check;
```
### View Database Info
```sql
PRAGMA database_list;
PRAGMA table_list;
PRAGMA foreign_key_check;
```