Adult Empire Scraper Integration
Version: v0.1.0-dev4
Last Updated: 2025-11-16
Overview
Goondex now includes an Adult Empire scraper modeled on the Stash app's scraping architecture. It fetches scene and performer metadata, cover art, and profile images directly from Adult Empire (adultdvdempire.com).
Features
✅ Scene Scraping
- Extract scene title, description, release date
- Download cover art/thumbnails
- Retrieve studio information
- Get performer lists
- Extract tags/categories
- Scene code/SKU
- Director information
✅ Performer Scraping
- Extract performer name, aliases
- Download profile images
- Retrieve birthdate, ethnicity, nationality
- Physical attributes (height, measurements, hair/eye color)
- Biography text
✅ Search Functionality
- Search scenes by title
- Search performers by name
- Get search results with thumbnails
Architecture
The Adult Empire scraper is implemented in /internal/scraper/adultemp/ with the following components:
Files
- types.go - Data structures for scraped content
- client.go - HTTP client with cookie/session management
- xpath.go - XPath parsing utilities for HTML extraction
- scraper.go - Main scraper implementation
Components
┌─────────────────┐
│  Scraper API    │  - ScrapeSceneByURL()
│                 │  - ScrapePerformerByURL()
│                 │  - SearchScenesByName()
│                 │  - SearchPerformersByName()
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  HTTP Client    │  - Cookie jar for sessions
│                 │  - Age verification
│                 │  - Auth token support
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  XPath Parser   │  - Extract data from HTML
│                 │  - Parse dates, heights
│                 │  - Clean text content
└─────────────────┘
Usage
Authentication (Optional)
For full access to Adult Empire content, you can set an authentication token:
scraper, err := adultemp.NewScraper()
if err != nil {
    log.Fatal(err)
}
// Optional: Set your Adult Empire session token
scraper.SetAuthToken("your-etoken-here")
Getting your etoken:
- Log into adultdvdempire.com
- Open browser DevTools (F12)
- Go to Application → Cookies → adultdvdempire.com
- Copy the value of the etoken cookie
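The etoken is an ordinary session cookie, so setting it amounts to placing it in the HTTP client's cookie jar before any request is made. The sketch below shows one way this could work using only Go's standard library. The names newClientWithToken and tokenFor are hypothetical and exist only for illustration; they are not the actual internals behind SetAuthToken.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"
)

// baseURL is the site root used throughout this document.
const baseURL = "https://www.adultdvdempire.com"

// newClientWithToken returns an HTTP client whose cookie jar already holds
// the etoken cookie. Hypothetical helper, not the scraper's real code.
func newClientWithToken(etoken string) (*http.Client, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	u, err := url.Parse(baseURL)
	if err != nil {
		return nil, err
	}
	// Every later request to baseURL made through this client sends the cookie.
	jar.SetCookies(u, []*http.Cookie{{Name: "etoken", Value: etoken, Path: "/"}})
	return &http.Client{Jar: jar}, nil
}

// tokenFor reports the etoken value the client would send to baseURL,
// or "" if the jar holds none.
func tokenFor(c *http.Client) string {
	u, err := url.Parse(baseURL)
	if err != nil {
		return ""
	}
	for _, ck := range c.Jar.Cookies(u) {
		if ck.Name == "etoken" {
			return ck.Value
		}
	}
	return ""
}

func main() {
	c, err := newClientWithToken("your-etoken-here")
	if err != nil {
		panic(err)
	}
	fmt.Println("token stored:", tokenFor(c) != "")
}
```

Keeping the token in the jar rather than in a per-request header means redirects and follow-up requests carry it automatically.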
Scrape a Scene by URL
ctx := context.Background()
sceneData, err := scraper.ScrapeSceneByURL(ctx, "https://www.adultdvdempire.com/12345/scene-name")
if err != nil {
    log.Fatal(err)
}
// Convert to Goondex model
scene := scraper.ConvertSceneToModel(sceneData)
// Save to database
// db.Scenes.Create(scene)
Search for Scenes
results, err := scraper.SearchScenesByName(ctx, "scene title")
if err != nil {
    log.Fatal(err)
}
for _, result := range results {
    fmt.Printf("Title: %s\n", result.Title)
    fmt.Printf("URL: %s\n", result.URL)
    fmt.Printf("Image: %s\n", result.Image)
}
Scrape a Performer
performerData, err := scraper.ScrapePerformerByURL(ctx, "https://www.adultdvdempire.com/performer/12345/name")
if err != nil {
    log.Fatal(err)
}
// Convert to Goondex model
performer := scraper.ConvertPerformerToModel(performerData)
Search for Performers
results, err := scraper.SearchPerformersByName(ctx, "performer name")
if err != nil {
    log.Fatal(err)
}
for _, result := range results {
    fmt.Printf("Name: %s\n", result.Title)
    fmt.Printf("URL: %s\n", result.URL)
}
Data Structures
SceneData
type SceneData struct {
    Title       string   // Scene title
    URL         string   // Adult Empire URL
    Date        string   // Release date
    Studio      string   // Studio name
    Image       string   // Cover image URL
    Description string   // Synopsis/description
    Performers  []string // List of performer names
    Tags        []string // Categories/tags
    Code        string   // Scene code/SKU
    Director    string   // Director name
}
PerformerData
type PerformerData struct {
    Name         string   // Performer name
    URL          string   // Adult Empire URL
    Image        string   // Profile image URL
    Birthdate    string   // Date of birth
    Ethnicity    string   // Ethnicity
    Country      string   // Country of origin
    Height       string   // Height (converted to cm)
    Measurements string   // Body measurements
    HairColor    string   // Hair color
    EyeColor     string   // Eye color
    Biography    string   // Bio text
    Aliases      []string // Alternative names
}
XPath Selectors
The scraper uses XPath to extract data from Adult Empire pages. Key selectors include:
Scene Selectors
- Title: //h1[@class='title']
- Date: //div[@class='release-date']/text()
- Studio: //a[contains(@href, '/studio/')]/text()
- Image: //div[@class='item-image']//img/@src
- Description: //div[@class='synopsis']
- Performers: //a[contains(@href, '/performer/')]/text()
- Tags: //a[contains(@href, '/category/')]/text()
Performer Selectors
- Name: //h1[@class='performer-name']
- Image: //div[@class='performer-image']//img/@src
- Birthdate: //span[@class='birthdate']/text()
- Height: //span[@class='height']/text()
- Bio: //div[@class='bio']
Note: Adult Empire may change their HTML structure. If scraping fails, XPath selectors in scraper.go may need updates.
Utilities
Date Parsing
dateStr := ParseDate("Jan 15, 2024") // Handles various formats
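One common way to handle several date formats in Go is to try a list of layouts in order and keep the first that parses. The sketch below illustrates that idea. The layout list and the YYYY-MM-DD output are assumptions, and parseDateSketch is a hypothetical stand-in, not the actual ParseDate implementation.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// dateLayouts lists the input formats this sketch accepts. The set is an
// assumption; the formats ParseDate really handles are not documented here.
var dateLayouts = []string{
	"Jan 2, 2006",     // Jan 15, 2024
	"January 2, 2006", // January 15, 2024
	"01/02/2006",      // 01/15/2024 (US month/day order assumed)
	"2006-01-02",      // 2024-01-15
}

// parseDateSketch returns the date as YYYY-MM-DD, or "" if no layout matches.
func parseDateSketch(s string) string {
	s = strings.TrimSpace(s)
	for _, layout := range dateLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t.Format("2006-01-02")
		}
	}
	return ""
}

func main() {
	fmt.Println(parseDateSketch("Jan 15, 2024")) // prints 2024-01-15
}
```

Returning an empty string for unparseable input lets the caller leave the Date field blank instead of failing the whole scrape.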
Height Conversion
heightCm := ParseHeight("5'6\"") // Converts feet/inches to cm (168)
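The conversion itself is simple arithmetic: total inches = feet × 12 + inches, and centimetres = total inches × 2.54, rounded to the nearest whole number. For 5'6" that is 66 × 2.54 = 167.64, which rounds to 168. A minimal sketch follows, assuming the input is in feet and inches notation; parseHeightSketch and its (int, bool) return are hypothetical and may differ from the real ParseHeight signature.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strconv"
)

// feetInchesRe matches strings such as 5'6", 5' 6 or 6'.
var feetInchesRe = regexp.MustCompile(`^\s*(\d+)\s*'\s*(\d+)?\s*"?\s*$`)

// parseHeightSketch converts a feet/inches string to whole centimetres.
// The boolean is false when the input does not look like feet/inches.
func parseHeightSketch(s string) (int, bool) {
	m := feetInchesRe.FindStringSubmatch(s)
	if m == nil {
		return 0, false
	}
	feet, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, false
	}
	inches := 0
	if m[2] != "" {
		if inches, err = strconv.Atoi(m[2]); err != nil {
			return 0, false
		}
	}
	// 1 inch is exactly 2.54 cm.
	return int(math.Round(float64(feet*12+inches) * 2.54)), true
}

func main() {
	cm, ok := parseHeightSketch(`5'6"`)
	fmt.Println(cm, ok) // prints 168 true
}
```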
Text Cleaning
cleanedText := CleanText(rawHTML) // Removes "Show More/Less" and extra whitespace
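As an illustration of the two steps named above, the sketch below removes "Show More" and "Show Less" toggle labels and then collapses runs of whitespace. It is a hedged approximation: cleanTextSketch is a hypothetical name, the case-insensitive match is an assumption, and the sketch does not strip HTML tags.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// showToggleRe matches the "Show More" / "Show Less" labels that expandable
// text blocks leave behind. Case-insensitive matching is an assumption.
var showToggleRe = regexp.MustCompile(`(?i)show\s+(more|less)`)

// cleanTextSketch drops the toggle labels, then collapses all whitespace
// runs (spaces, tabs, newlines) into single spaces and trims the ends.
func cleanTextSketch(s string) string {
	s = showToggleRe.ReplaceAllString(s, " ")
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	raw := "  A great\n\n scene.   Show More "
	fmt.Printf("%q\n", cleanTextSketch(raw)) // prints "A great scene."
}
```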
URL Normalization
fullURL := ExtractURL("/path/to/scene", "https://www.adultdvdempire.com")
// Returns: "https://www.adultdvdempire.com/path/to/scene"
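Relative links can be resolved against a base with the standard net/url package, which also handles links that are already absolute or protocol-relative. The sketch below shows that approach. extractURLSketch is a hypothetical name, and its (string, error) return is an assumption that may not match the real ExtractURL.

```go
package main

import (
	"fmt"
	"net/url"
)

// extractURLSketch resolves ref against base. Absolute refs are returned
// unchanged; relative and protocol-relative refs inherit from base.
func extractURLSketch(ref, base string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	r, err := url.Parse(ref)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(r).String(), nil
}

func main() {
	full, err := extractURLSketch("/path/to/scene", "https://www.adultdvdempire.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(full) // prints https://www.adultdvdempire.com/path/to/scene
}
```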
Integration with Goondex
The Adult Empire scraper fits into the existing Goondex architecture in four steps:
- Scrape data from Adult Empire using the scraper
- Convert to Goondex models using converter functions
- Save to the database using existing stores
- Display in the web UI with cover art and metadata
Example Workflow
// 1. Search for a scene
results, err := scraper.SearchScenesByName(ctx, "scene name")
if err != nil {
    log.Fatal(err)
}
if len(results) == 0 {
    log.Fatal("no results found")
}
// 2. Pick the first result and scrape full details
sceneData, err := scraper.ScrapeSceneByURL(ctx, results[0].URL)
if err != nil {
    log.Fatal(err)
}
// 3. Convert to Goondex model
scene := scraper.ConvertSceneToModel(sceneData)
// 4. Save to database
sceneStore := db.NewSceneStore(database)
sceneStore.Create(scene)
// 5. The scene now appears in the web UI
Future Enhancements
Planned improvements for the Adult Empire scraper:
- ⏳ Bulk Import - Import entire studios or series
- ⏳ Auto-Update - Periodically refresh metadata
- ⏳ Image Caching - Download and cache cover art locally
- ⏳ Duplicate Detection - Avoid importing the same scene twice
- ⏳ Advanced Search - Filter by studio, date range, tags
- ⏳ Web UI Integration - Search and import from the dashboard
Troubleshooting
"Failed to parse HTML"
- The Adult Empire page structure may have changed
- Update XPath selectors in scraper.go
"Request failed: 403 Forbidden"
- You may need to set an auth token
- Adult Empire may be blocking automated requests
- Try setting a valid etoken cookie
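To make a 403 easier to act on, the response status can be mapped to an error that already carries the hints listed above. The following is only a sketch of that idea. classifyStatus and errForbidden are hypothetical names, and the scraper's real error handling may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errForbidden carries the troubleshooting hint for 403 responses.
var errForbidden = errors.New("request failed: 403 Forbidden (set a valid etoken; the site may be blocking automated requests)")

// classifyStatus turns an HTTP status code into nil for 2xx responses,
// errForbidden for 403, and a generic error for everything else.
func classifyStatus(code int) error {
	switch {
	case code == http.StatusForbidden:
		return errForbidden
	case code < 200 || code > 299:
		return fmt.Errorf("request failed: %d %s", code, http.StatusText(code))
	default:
		return nil
	}
}

func main() {
	if err := classifyStatus(http.StatusForbidden); errors.Is(err, errForbidden) {
		fmt.Println(err)
	}
}
```

A caller can test for the sentinel with errors.Is and prompt for a token instead of treating every failure the same way.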
"No results found"
- Check that the search query is correct
- Adult Empire may list the title or name under a different spelling
- Try broader search terms
Scene/Performer data incomplete
- Some fields may not be present on all pages
- XPath selectors may need adjustment
- Check the raw HTML to verify field availability
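When diagnosing incomplete results, it can help to list exactly which fields came back empty before inspecting the raw HTML. The sketch below does this for the SceneData struct shown under Data Structures. missingSceneFields is a hypothetical helper and not part of the scraper.

```go
package main

import "fmt"

// SceneData mirrors the struct shown under "Data Structures".
type SceneData struct {
	Title       string
	URL         string
	Date        string
	Studio      string
	Image       string
	Description string
	Performers  []string
	Tags        []string
	Code        string
	Director    string
}

// missingSceneFields returns the names of fields that are empty, in struct
// order. An empty result means every field was populated.
func missingSceneFields(s SceneData) []string {
	var missing []string
	check := func(name string, empty bool) {
		if empty {
			missing = append(missing, name)
		}
	}
	check("Title", s.Title == "")
	check("URL", s.URL == "")
	check("Date", s.Date == "")
	check("Studio", s.Studio == "")
	check("Image", s.Image == "")
	check("Description", s.Description == "")
	check("Performers", len(s.Performers) == 0)
	check("Tags", len(s.Tags) == 0)
	check("Code", s.Code == "")
	check("Director", s.Director == "")
	return missing
}

func main() {
	s := SceneData{
		Title: "Example",
		URL:   "https://www.adultdvdempire.com/12345/scene-name",
	}
	fmt.Println("empty fields:", missingSceneFields(s))
}
```

Fields that are empty on every page point to a stale selector, while fields empty on only some pages are more likely absent from the source.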
Comparison with TPDB Scraper
| Feature | TPDB | Adult Empire |
|---|---|---|
| API | ✅ Official JSON API | ❌ HTML scraping |
| Auth | ✅ API key | ⚠️ Session cookie |
| Rate Limits | ✅ Documented | ⚠️ Unknown |
| Stability | ✅ Stable schema | ⚠️ May change |
| Coverage | ✅ Comprehensive | ✅ Comprehensive |
| Images | ✅ High quality | ✅ High quality |
Recommendation: Use TPDB as the primary source and Adult Empire as a fallback or supplemental source.
Contributing
To improve Adult Empire scraping:
- Update XPath selectors if Adult Empire changes their HTML
- Add support for additional fields
- Improve date/height parsing
- Add more robust error handling
Version History
- v0.1.0-dev4 (2025-11-16): Initial Adult Empire scraper implementation
  - HTTP client with cookie support
  - XPath parsing utilities
  - Scene and performer scraping
  - Search functionality
  - Model conversion utilities
Maintainer: Goondex Team