Major Features: - ✅ Complete TPDB scraper implementation with real API calls - ✅ Auto-fetch on cache miss: search commands now automatically import from TPDB when not found locally - ✅ Comprehensive documentation (5 markdown files for Bookstack) - ✅ Import commands for performers, studios, and scenes - ✅ Fixed JSON type mismatches (aliases array, studio numeric IDs) Changes: 1. TPDB Scraper (internal/scraper/tpdb/): - types.go: Full API response structures with correct types - PerformerResponse.Aliases: string → []string (TPDB returns array) - StudioResponse.ID: string → int (TPDB returns numeric IDs) - SiteInfo.ID: string → int (scenes reference studios by number) - mapper.go: Maps TPDB responses to internal models - Converts aliases array to comma-separated string - Converts numeric studio IDs to strings using strconv.Itoa() - scraper.go: Real HTTP client with Bearer token auth - SearchPerformers, SearchStudios, SearchScenes implemented - GetPerformerByID, GetStudioByID, GetSceneByID implemented 2. CLI Auto-Fetch (cmd/goondex/main.go): - performer-search: Auto-fetches from TPDB if local DB empty - studio-search: Auto-fetches from TPDB if local DB empty - scene-search: Auto-fetches basic metadata (no relationships) - Graceful handling of missing TPDB_API_KEY - Import → search again to get local IDs 3. Documentation (docs/): - INDEX.md: Documentation overview and navigation - ARCHITECTURE.md: System design, data flow, component diagrams - DATABASE_SCHEMA.md: Complete schema with relationships and indexes - CLI_REFERENCE.md: All commands with examples - TPDB_INTEGRATION.md: API guide, data mapping, best practices 4. Fixes: - .gitignore: Fixed pattern to allow cmd/goondex/* and cmd/goondexd/* - README: Updated to reflect TPDB integration and auto-fetch Testing: - ✅ performer-search "Riley Reid" - auto-fetched 2 performers, cached - ✅ studio-search "Brazzers" - auto-fetched 12 studios, cached - ✅ Aliases now display correctly as comma-separated list - ✅ Studio IDs properly converted from numeric to string API Integration: - Base URL: https://api.theporndb.net - Authentication: Bearer token via TPDB_API_KEY env var - Endpoints: /performers, /sites, /scenes - Rate limiting handled with warnings 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
305 lines
12 KiB
Markdown
305 lines
12 KiB
Markdown
# Database Schema
|
|
|
|
## Overview
|
|
|
|
Goondex uses SQLite as its database engine with the following characteristics:
|
|
- **Journal Mode**: WAL (Write-Ahead Logging)
|
|
- **Foreign Keys**: Enabled
|
|
- **Date Format**: RFC3339 (ISO 8601)
|
|
- **Location**: Configurable (default: `./goondex.db`)
|
|
|
|
## Schema Diagram
|
|
|
|
```
|
|
┌──────────────┐ ┌──────────────────┐ ┌──────────────┐
|
|
│ performers │ │ scene_performers │ │ scenes │
|
|
├──────────────┤ ├──────────────────┤ ├──────────────┤
|
|
│ id (PK) │ │ scene_id (FK) │ │ id (PK) │
|
|
│ name │◄──────┤ performer_id(FK) ├──────►│ title │
|
|
│ aliases │ └──────────────────┘ │ code │
|
|
│ nationality │ │ date │
|
|
│ country │ │ studio_id(FK)│
|
|
│ gender │ │ description │
|
|
│ image_path │ │ image_path │
|
|
│ image_url │ │ image_url │
|
|
│ bio │ │ director │
|
|
│ source │ │ url │
|
|
│ source_id │ │ source │
|
|
│ created_at │ │ source_id │
|
|
│ updated_at │ │ created_at │
|
|
└──────────────┘ │ updated_at │
|
|
└──────────────┘
|
|
│
|
|
│
|
|
┌──────────────────┐ │
|
|
│ scene_tags │ │
|
|
├──────────────────┤ │
|
|
│ scene_id (FK) │◄─────────────────┘
|
|
│ tag_id (FK) ├──────┐
|
|
└──────────────────┘ │
|
|
│
|
|
┌──────────────┐ │
|
|
│ studios │ │
|
|
├──────────────┤ │
|
|
│ id (PK) │ │
|
|
│ name │ │
|
|
│ parent_id(FK)│────┐ │
|
|
│ image_path │ │(self-ref) │
|
|
│ image_url │ │ │
|
|
│ description │ │ │
|
|
│ source │ │ ▼
|
|
│ source_id │ │ ┌──────────────┐
|
|
│ created_at │ │ │ tags │
|
|
│ updated_at │ │ ├──────────────┤
|
|
└──────────────┘ │ │ id (PK) │
|
|
▲ │ │ name (UNIQUE)│
|
|
│ │ │ source │
|
|
│ │ │ source_id │
|
|
└───────────┘ │ created_at │
|
|
│ updated_at │
|
|
└──────────────┘
|
|
```
|
|
|
|
## Table Definitions
|
|
|
|
### `performers`
|
|
|
|
Stores performer/actor information.
|
|
|
|
| Column | Type | Constraints | Description |
|
|
|--------------|---------|-----------------------|--------------------------------|
|
|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
|
|
| name | TEXT | NOT NULL | Performer's name |
|
|
| aliases | TEXT | | Comma-separated aliases |
|
|
| nationality | TEXT | | ISO country code (e.g., "US") |
|
|
| country | TEXT | | Full country name |
|
|
| gender | TEXT | | male/female/trans/other |
|
|
| image_path | TEXT | | Local file path to image |
|
|
| image_url | TEXT | | Remote URL to image |
|
|
| bio | TEXT | | Biography/description |
|
|
| source | TEXT | | Source name (e.g., "tpdb") |
|
|
| source_id | TEXT | | ID at the source |
|
|
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
|
|
**Indexes**:
|
|
- `idx_performers_name` on `name`
|
|
|
|
### `studios`
|
|
|
|
Stores production companies and networks.
|
|
|
|
| Column | Type | Constraints | Description |
|
|
|--------------|---------|-----------------------|--------------------------------|
|
|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
|
|
| name | TEXT | NOT NULL | Studio/site name |
|
|
| parent_id | INTEGER | FK → studios(id) | Parent studio/network ID |
|
|
| image_path | TEXT | | Local file path to logo |
|
|
| image_url | TEXT | | Remote URL to logo |
|
|
| description | TEXT | | Studio description |
|
|
| source | TEXT | | Source name |
|
|
| source_id | TEXT | | ID at the source |
|
|
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
|
|
**Indexes**:
|
|
- `idx_studios_name` on `name`
|
|
|
|
**Foreign Keys**:
|
|
- `parent_id` references `studios(id)` ON DELETE SET NULL
|
|
|
|
### `tags`
|
|
|
|
Stores content tags and categories.
|
|
|
|
| Column | Type | Constraints | Description |
|
|
|--------------|---------|-----------------------|--------------------------------|
|
|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
|
|
| name | TEXT | NOT NULL, UNIQUE | Tag name |
|
|
| source | TEXT | | Source name |
|
|
| source_id | TEXT | | ID at the source |
|
|
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
|
|
**Indexes**:
|
|
- `idx_tags_name` on `name`
|
|
|
|
### `scenes`
|
|
|
|
Stores video scenes/titles.
|
|
|
|
| Column | Type | Constraints | Description |
|
|
|--------------|---------|-----------------------|--------------------------------|
|
|
| id | INTEGER | PRIMARY KEY AUTOINCR | Local database ID |
|
|
| title | TEXT | NOT NULL | Scene title |
|
|
| code | TEXT | | DVD code, scene ID, etc. |
|
|
| date | TEXT | | Release date (ISO 8601) |
|
|
| studio_id | INTEGER | FK → studios(id) | Studio ID |
|
|
| description | TEXT | | Scene description/synopsis |
|
|
| image_path | TEXT | | Local file path to cover |
|
|
| image_url | TEXT | | Remote URL to cover |
|
|
| director | TEXT | | Director name |
|
|
| url | TEXT | | Scene URL |
|
|
| source | TEXT | | Source name |
|
|
| source_id | TEXT | | ID at the source |
|
|
| created_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
| updated_at | TEXT | NOT NULL, DEFAULT NOW | RFC3339 timestamp |
|
|
|
|
**Indexes**:
|
|
- `idx_scenes_title` on `title`
|
|
- `idx_scenes_code` on `code`
|
|
|
|
**Foreign Keys**:
|
|
- `studio_id` references `studios(id)` ON DELETE SET NULL
|
|
|
|
### `scene_performers`
|
|
|
|
Junction table for many-to-many relationship between scenes and performers.
|
|
|
|
| Column | Type | Constraints | Description |
|
|
|---------------|---------|----------------------------|----------------------|
|
|
| scene_id | INTEGER | FK → scenes(id) | Scene ID |
|
|
| performer_id | INTEGER | FK → performers(id) | Performer ID |
|
|
|
|
**Primary Key**: `(scene_id, performer_id)`
|
|
|
|
**Foreign Keys**:
|
|
- `scene_id` references `scenes(id)` ON DELETE CASCADE
|
|
- `performer_id` references `performers(id)` ON DELETE CASCADE
|
|
|
|
### `scene_tags`
|
|
|
|
Junction table for many-to-many relationship between scenes and tags.
|
|
|
|
| Column | Type | Constraints | Description |
|
|
|------------|---------|----------------------------|----------------------|
|
|
| scene_id | INTEGER | FK → scenes(id) | Scene ID |
|
|
| tag_id | INTEGER | FK → tags(id) | Tag ID |
|
|
|
|
**Primary Key**: `(scene_id, tag_id)`
|
|
|
|
**Foreign Keys**:
|
|
- `scene_id` references `scenes(id)` ON DELETE CASCADE
|
|
- `tag_id` references `tags(id)` ON DELETE CASCADE
|
|
|
|
## Common Queries
|
|
|
|
### Search Performers
|
|
|
|
```sql
|
|
SELECT * FROM performers
|
|
WHERE name LIKE '%Riley%' OR aliases LIKE '%Riley%'
|
|
ORDER BY name;
|
|
```
|
|
|
|
### Get Scene with All Related Data
|
|
|
|
```sql
|
|
-- Get scene
|
|
SELECT * FROM scenes WHERE id = ?;
|
|
|
|
-- Get performers for scene
|
|
SELECT p.* FROM performers p
|
|
JOIN scene_performers sp ON p.id = sp.performer_id
|
|
WHERE sp.scene_id = ?;
|
|
|
|
-- Get tags for scene
|
|
SELECT t.* FROM tags t
|
|
JOIN scene_tags st ON t.id = st.tag_id
|
|
WHERE st.scene_id = ?;
|
|
|
|
-- Get studio for scene
|
|
SELECT s.* FROM studios s
|
|
JOIN scenes sc ON s.id = sc.studio_id
|
|
WHERE sc.id = ?;
|
|
```
|
|
|
|
### Find Scenes by Performer
|
|
|
|
```sql
|
|
SELECT s.* FROM scenes s
|
|
JOIN scene_performers sp ON s.id = sp.scene_id
|
|
JOIN performers p ON sp.performer_id = p.id
|
|
WHERE p.name LIKE '%Riley Reid%'
|
|
ORDER BY s.date DESC;
|
|
```
|
|
|
|
### Get Studio Hierarchy
|
|
|
|
```sql
|
|
-- Get parent studios
|
|
WITH RECURSIVE studio_tree AS (
|
|
SELECT id, name, parent_id, 0 AS level
|
|
FROM studios
|
|
WHERE id = ?
|
|
|
|
UNION ALL
|
|
|
|
SELECT s.id, s.name, s.parent_id, st.level + 1
|
|
FROM studios s
|
|
JOIN studio_tree st ON s.parent_id = st.id
|
|
)
|
|
SELECT * FROM studio_tree;
|
|
```
|
|
|
|
## Migration Strategy
|
|
|
|
For future schema changes:
|
|
1. Version tracking in a `schema_version` table
|
|
2. Migration scripts in `internal/db/migrations/`
|
|
3. Up/down migration support
|
|
4. Automatic migration on startup
|
|
|
|
## Performance Considerations
|
|
|
|
### Current Indexes
|
|
- Name fields (performers, studios, tags, scenes)
|
|
- Code field (scenes)
|
|
|
|
### Future Optimizations (v0.2+)
|
|
- Full-text search indexes (FTS5)
|
|
- Composite indexes for common queries
|
|
- Materialized views for complex joins
|
|
- Separate alias table for normalized storage
|
|
|
|
## Backup and Restore
|
|
|
|
### Backup
|
|
```bash
|
|
# SQLite backup
|
|
sqlite3 goondex.db ".backup goondex-backup.db"
|
|
|
|
# Or simple copy (safe with WAL mode)
|
|
cp goondex.db goondex-backup.db
|
|
```
|
|
|
|
### Restore
|
|
```bash
|
|
# Copy backup over current
|
|
cp goondex-backup.db goondex.db
|
|
```
|
|
|
|
## Database Maintenance
|
|
|
|
### Analyze Statistics
|
|
```sql
|
|
ANALYZE;
|
|
```
|
|
|
|
### Vacuum (Rebuild)
|
|
```sql
|
|
VACUUM;
|
|
```
|
|
|
|
### Check Integrity
|
|
```sql
|
|
PRAGMA integrity_check;
|
|
```
|
|
|
|
### View Database Info
|
|
```sql
|
|
PRAGMA database_list;
|
|
PRAGMA table_list;
|
|
PRAGMA foreign_key_check;
|
|
```
|