File: docs/CLI_USAGE.md
Version: v0.4.2
Last updated: November 2025
Maintainer: Leak Technologies
Project: Goondex

------------------------------------------------------------
Goondex CLI Usage Guide
------------------------------------------------------------

Purpose:
Provide a full command reference for importing, tagging, validating, and searching PornPics galleries using the Goondex command-line interface.

------------------------------------------------------------
Overview
------------------------------------------------------------

The Goondex CLI provides a unified workflow to:

1. Import and refresh PornPics galleries
2. Automatically tag galleries using YAML dictionaries
3. Manage sources and metadata through a single command entrypoint
4. Generate statistics and validation reports
5. Build and search machine learning datasets (hybrid text + image)

Project root:
~/Projects/PD/PornPics_Importer/Porndex_PornpicsImporter/

------------------------------------------------------------
1. Importing Galleries
------------------------------------------------------------

Quick Import (preferred):
goondex import "https://www.pornpics.com/galleries/<gallery-id>/"

Process:
- Creates a new folder in Galleries/<timestamp>_<models>_<title>/
- Downloads all images (threaded)
- Saves metadata.json
- Auto-tags the gallery using refresh-one
- Rebuilds the global index

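The folder naming scheme in the process list can be illustrated with a short sketch. This is not the importer's actual code: the function names `slugify` and `gallery_folder_name`, the timestamp layout, and the slug rules are all assumptions made for illustration.

```python
import re
from datetime import datetime


def slugify(text: str) -> str:
    """Collapse every run of non-alphanumeric characters into one hyphen."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")


def gallery_folder_name(when: datetime, models: list[str], title: str) -> str:
    """Build a <timestamp>_<models>_<title> folder name (hypothetical format)."""
    timestamp = when.strftime("%Y%m%d-%H%M%S")  # assumed timestamp layout
    model_part = "-".join(slugify(m) for m in models) or "unknown"
    return f"{timestamp}_{model_part}_{slugify(title)}"


name = gallery_folder_name(
    datetime(2025, 11, 3, 14, 30, 0), ["Jane Doe"], "Sample Gallery #1"
)
print(name)  # 20251103-143000_Jane-Doe_Sample-Gallery-1
```

Stripping punctuation from the model and title parts keeps the folder name safe to quote on the command line, for example when passing it to refresh-one.
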
Legacy method:
python src/importer/gallery_importer.py "https://www.pornpics.com/galleries/<gallery-id>/"

------------------------------------------------------------
2. Refreshing Metadata
------------------------------------------------------------

Refresh all galleries:
python src/importer/gallery_importer.py --refresh-all

Function:
- Re-fetches metadata for every gallery that has a source_url
- Merges new fields without overwriting local tags
- Automatically re-applies tag inference
- Rebuilds Galleries/index.json

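One way to picture the "merge without overwriting local tags" step is the sketch below. The function name `merge_metadata` and the rule that existing local tags always win are assumptions; the guide does not state how the importer resolves conflicts.

```python
def merge_metadata(local: dict, remote: dict) -> dict:
    """Return a copy of local metadata updated with freshly fetched fields."""
    merged = dict(local)
    for key, value in remote.items():
        if key == "tags" and local.get("tags"):
            continue  # assumption: tags already stored locally are kept as-is
        merged[key] = value
    return merged


local = {"title": "Old title", "tags": ["custom-tag"]}
remote = {"title": "New title", "tags": ["site-tag"], "categories": ["example"]}
merged = merge_metadata(local, remote)
print(merged)
# {'title': 'New title', 'tags': ['custom-tag'], 'categories': ['example']}
```
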
------------------------------------------------------------
3. Tag Management
------------------------------------------------------------

Unified syntax:
goondex <command> [args...]

Common operations:
refresh-all                       → refresh tags for all galleries
refresh-one "<folder>"            → refresh tags for a single gallery
validate-tags                     → validate YAML tag dictionaries
tag-stats                         → generate a tag frequency report (saved to src/importer/reports/)
list                              → list all galleries
list-tags "<folder>"              → show tags for one gallery
add "<folder>" "Tag"              → add a tag manually
remove "<folder>" "Tag"           → remove a tag manually
add-multi "<folder>" "Tag1,Tag2"  → add multiple tags at once
show-metadata "<folder>"          → view metadata.json content
source "<folder>" set "Source"    → set the source for a single gallery
source bulk set "Source"          → set the same source for all galleries

Tag inference uses YAML dictionaries stored under:
src/importer/tag_dictionaries/

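Dictionary-based tag inference can be sketched as keyword matching. The dictionary schema is not documented here, so the sketch assumes a mapping from a canonical tag to a list of trigger keywords, written as a plain Python dict rather than loaded from YAML. `TAG_DICTIONARY` and `infer_tags` are hypothetical names.

```python
import re

# Hypothetical dictionary: canonical tag -> keywords that trigger it.
# In the project this data would live in a YAML file; the schema is assumed.
TAG_DICTIONARY = {
    "Outdoor": ["outdoor", "outside", "beach"],
    "Redhead": ["redhead", "red hair"],
}


def infer_tags(text: str, dictionary: dict[str, list[str]]) -> list[str]:
    """Return tags whose keywords occur as whole words or phrases in text."""
    lowered = text.lower()
    found = []
    for tag, keywords in dictionary.items():
        # \b anchors prevent partial hits such as "outside" inside "outsider".
        if any(re.search(rf"\b{re.escape(kw)}\b", lowered) for kw in keywords):
            found.append(tag)
    return sorted(found)


print(infer_tags("Redhead model poses on the beach", TAG_DICTIONARY))
# ['Outdoor', 'Redhead']
```
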
------------------------------------------------------------
4. TPDB Performer Bridge (optional)
------------------------------------------------------------

Command:
python -m performers.tpdb_bridge <cmd> [flags]

Common commands:
check-key, fetch, fill-index, enrich, sync-all
list-sources, add-source, delete-source
verify-enrichment --export-json

Database:
src/importer/db/performers.db

Reports:
src/importer/reports/

------------------------------------------------------------
5. Example Workflow
------------------------------------------------------------

Import a gallery:
goondex import "https://www.pornpics.com/galleries/<id>/"

Refresh tags for one folder:
goondex refresh-one "<folder-name>"

Validate YAML dictionaries:
goondex validate-tags

Generate tag statistics:
goondex tag-stats

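The kind of frequency report produced by the tag statistics step can be sketched with `collections.Counter`. This assumes each gallery record carries a `tags` list, as the ML dataset entries in this guide do. The function name `tag_frequencies` and the tab-separated output are illustrative and do not reflect the actual report format.

```python
from collections import Counter


def tag_frequencies(galleries: list[dict]) -> list[tuple[str, int]]:
    """Count tag occurrences across galleries, most frequent first."""
    counts = Counter(tag for g in galleries for tag in g.get("tags", []))
    # Sort by descending count, then alphabetically, for a stable report.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = [
    {"title": "A", "tags": ["Outdoor", "Redhead"]},
    {"title": "B", "tags": ["Outdoor"]},
    {"title": "C"},  # galleries without tags are simply skipped
]
for tag, count in tag_frequencies(sample):
    print(f"{tag}\t{count}")
```
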
------------------------------------------------------------
6. Machine Learning (ML) Pipeline
------------------------------------------------------------

Dataset builder:
python -m ml.ml_dataset_builder

Creates file:
ML/porndex_dataset.jsonl

Example entry:
{
  "gallery_id": "...",
  "title": "...",
  "models": ["..."],
  "tags": ["..."],
  "categories": ["..."],
  "image_paths": [".../Galleries/.../001.jpg"]
}

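Because the dataset is JSONL (one JSON object per line), it can be read with the standard library alone. The sketch below is a minimal reader that assumes the field names from the example entry. The helper name `read_dataset` and the sample values are made up, and the demo writes to a temporary file instead of touching ML/porndex_dataset.jsonl.

```python
import json
import tempfile
from pathlib import Path


def read_dataset(path: Path) -> list[dict]:
    """Parse a JSONL file: one JSON object per non-empty line."""
    entries = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                entries.append(json.loads(line))
    return entries


# Demo with a temporary file standing in for the real dataset.
sample_entry = {
    "gallery_id": "g1",
    "title": "Sample",
    "models": [],
    "tags": ["Outdoor"],
    "categories": [],
    "image_paths": ["Galleries/g1/001.jpg"],
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "dataset.jsonl"
    demo_path.write_text(json.dumps(sample_entry) + "\n", encoding="utf-8")
    entries = read_dataset(demo_path)

print(entries[0]["gallery_id"], len(entries[0]["image_paths"]))  # g1 1
```

Reading line by line keeps memory use flat even when the dataset grows to many galleries.
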
Build hybrid embeddings:
python -m ml.ml_embeddings build --img-samples 8 --device auto

Outputs:
ML/embeddings/<gallery_id>.npz
ML/embeddings_index.jsonl

Search modes:
python -m ml.ml_embeddings search "japanese redhead creampie"
python -m ml.ml_embeddings search "japanese redhead creampie" --index text
python -m ml.ml_embeddings search "interracial bbc" --mode strict

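This guide does not describe how search scores galleries. Embedding-based search is commonly implemented as a cosine-similarity ranking of a query vector against stored gallery vectors, and the sketch below shows that general technique only. The toy two-dimensional vectors and the names `cosine` and `rank` are assumptions, and nothing here reflects what `--index` or `--mode` actually do.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: list[float], index: dict[str, list[float]], top_k: int = 3) -> list[str]:
    """Return the gallery ids most similar to the query, best match first."""
    ordered = sorted(index, key=lambda gid: cosine(query, index[gid]), reverse=True)
    return ordered[:top_k]


# Toy index: gallery id -> embedding vector (real embeddings are much larger).
toy_index = {"g1": [1.0, 0.0], "g2": [0.6, 0.8], "g3": [0.0, 1.0]}
print(rank([0.9, 0.1], toy_index, top_k=2))  # ['g1', 'g2']
```
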
Verify embedding integrity:
python -m ml.ml_embeddings verify

------------------------------------------------------------
7. Data Locations
------------------------------------------------------------

Galleries/                 → imported galleries and images
Galleries/index.json       → master index of all galleries
src/importer/reports/      → YAML validation and statistics reports
ML/porndex_dataset.jsonl   → ML dataset definition
ML/embeddings/             → embedding vector files
ML/embeddings_index.jsonl  → search index for semantic lookups

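A quick way to check a checkout against the locations listed above is to test each path relative to the project root. The helper `missing_paths` is hypothetical and not part of the CLI. The demo runs against a temporary directory, and the `[]` written to index.json is only a placeholder, not the real index format.

```python
import tempfile
from pathlib import Path

# Paths from the Data Locations list, relative to the project root.
EXPECTED_PATHS = [
    "Galleries",
    "Galleries/index.json",
    "src/importer/reports",
    "ML/porndex_dataset.jsonl",
    "ML/embeddings",
    "ML/embeddings_index.jsonl",
]


def missing_paths(root: Path) -> list[str]:
    """Return the expected paths that do not exist under root."""
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


# Demo: a temporary root that only contains Galleries/index.json.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Galleries").mkdir()
    (root / "Galleries" / "index.json").write_text("[]", encoding="utf-8")
    missing = missing_paths(root)

print(missing)
```
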
------------------------------------------------------------
8. Roadmap (post-v0.4.2)
------------------------------------------------------------

- Integrate GroundingDINO + Grounded-SAM for localized object detection
- Add attribute heads for gender, ethnicity, and clothing
- Develop an active-learning loop to refine weakly labeled data
- Introduce an interactive tag editor for review and correction

------------------------------------------------------------
Notes
------------------------------------------------------------

Commands that fetch remote data (such as gallery import and metadata refresh) need network access; all other commands operate locally and offline.
Rebuilding datasets and embeddings is safe and idempotent.
The importer auto-tags new galleries using the YAML dictionaries by default.
All modules follow the modular design outlined in ARCHITECTURE.md.
This documentation is versioned so that each guide can be matched to the CLI and code release it describes.

------------------------------------------------------------
End of File
------------------------------------------------------------