File: docs/CLI_USAGE.md
Version: v0.3.4
Last updated: November 2025
Maintainer: Leak Technologies
Project: Goondex
------------------------------------------------------------
Goondex CLI Usage Guide
------------------------------------------------------------
Purpose:
Provide a full command reference for importing, tagging, validating, and searching PornPics galleries using the Goondex command-line interface.
------------------------------------------------------------
Overview
------------------------------------------------------------
The Goondex CLI provides a unified workflow to:
1. Import and refresh PornPics galleries
2. Automatically tag galleries using YAML dictionaries
3. Manage sources and metadata through a single command entrypoint
4. Generate statistics and validation reports
5. Build and search machine learning datasets (hybrid text + image)
Project root:
~/Projects/PD/Goondex/
------------------------------------------------------------
1. Importing Galleries
------------------------------------------------------------
Quick Import (preferred):
goondex import "https://www.pornpics.com/galleries/<gallery-id>/"
Process:
- Creates a new folder under Galleries/<timestamp>_<models>_<title>/
- Downloads all images (threaded)
- Saves metadata.json
- Auto-tags the gallery using refresh-one
- Rebuilds the global index
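The folder naming pattern above can be sketched as follows. This is an illustration only, not the importer's actual code: build_folder_name is a hypothetical helper, and the formatting rules (YYYYMMDD_HHMM timestamp, spaces kept inside model names, whitespace in the title replaced by underscores) are assumptions inferred from the example folder name shown in the Debug Mode section.

```python
from datetime import datetime


def build_folder_name(imported_at: datetime, models: list[str], title: str) -> str:
    """Compose '<timestamp>_<models>_<title>' (hypothetical helper)."""
    # Assumption: the timestamp is YYYYMMDD_HHMM, as in the example folder name.
    timestamp = imported_at.strftime("%Y%m%d_%H%M")
    # Assumption: several models are joined with underscores; spaces inside
    # a single model name are kept as-is.
    models_part = "_".join(models)
    # Assumption: runs of whitespace in the title become single underscores.
    title_part = "_".join(title.split())
    return f"{timestamp}_{models_part}_{title_part}"


print(build_folder_name(datetime(2025, 11, 6, 20, 41), ["Jane Doe"], "Sample Gallery Title"))
# prints: 20251106_2041_Jane Doe_Sample_Gallery_Title
```

Because the result can contain spaces, quote the folder name when passing it to commands such as refresh-one.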
Legacy Python call (deprecated):
python src/importer/gallery_importer.py "https://www.pornpics.com/galleries/<gallery-id>/"
------------------------------------------------------------
2. Refreshing Metadata and Tags
------------------------------------------------------------
Refresh all galleries:
goondex refresh-all
Refresh a single gallery:
goondex refresh-one "<folder>"
Function:
- Re-infers tags for one or all galleries
- Updates metadata.json in place
- Logs summary and statistics to src/importer/reports/
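An in-place update of metadata.json could look roughly like the sketch below. It is not the refresh code itself: update_tags_in_place is a hypothetical helper, and the "tags" key is assumed to match the field of the same name in the ML dataset entries. The write goes through a temporary file so that an interrupted refresh cannot leave a half-written metadata.json behind.

```python
import json
import os
import tempfile
from pathlib import Path


def update_tags_in_place(metadata_path: Path, new_tags: list[str]) -> dict:
    """Replace the 'tags' field of a metadata.json file, keeping all other fields."""
    metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    # De-duplicate and sort so repeated refreshes give the same file content.
    metadata["tags"] = sorted(set(new_tags))
    # Write to a temp file in the same directory, then swap it into place.
    fd, tmp_name = tempfile.mkstemp(dir=metadata_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2, ensure_ascii=False)
    os.replace(tmp_name, metadata_path)
    return metadata
```

Calling the helper twice with the same tags leaves the file unchanged, which is the behavior a repeated refresh would want.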
------------------------------------------------------------
3. Tag Management
------------------------------------------------------------
Unified syntax:
goondex <command> [args...]
Common operations:
refresh-all                       → Refresh tags for all galleries
refresh-one "<folder>"            → Refresh tags for a single gallery
validate-tags                     → Validate YAML tag dictionaries
tag-stats                         → Generate frequency report (saved to src/importer/reports/)
list                              → List all galleries
list-tags "<folder>"              → Show tags for one gallery
add "<folder>" "Tag"              → Add a tag manually
remove "<folder>" "Tag"           → Remove a tag manually
add-multi "<folder>" "Tag1,Tag2"  → Add multiple tags at once
show-metadata "<folder>"          → View full metadata.json
source "<folder>" set "Source"    → Set a single source
source bulk set "Source"          → Set the same source for all galleries
debug "<folder>"                  → Run detailed tag inference diagnostics
Tag inference uses YAML dictionaries stored under:
src/importer/tag_dictionaries/
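Dictionary-based tag inference can be pictured with the minimal sketch below. The real YAML schema is not documented here, so TAG_DICTIONARY is a plain Python stand-in for one loaded dictionary (tag name mapped to a list of keywords), and infer_tags with its whole-word matching rule is an assumption rather than Goondex's actual algorithm.

```python
import re

# Hypothetical stand-in for one loaded YAML dictionary: tag -> keywords.
TAG_DICTIONARY = {
    "Office": ["office", "desk"],
    "Outdoor": ["outdoor", "garden"],
}


def infer_tags(text: str, dictionary: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (keyword, tag) pairs for each keyword found as a whole word in text."""
    matches = []
    lowered = text.lower()
    for tag, keywords in dictionary.items():
        for keyword in keywords:
            # \b keeps 'office' from matching inside 'officer'.
            if re.search(rf"\b{re.escape(keyword.lower())}\b", lowered):
                matches.append((keyword, tag))
    return matches


print(infer_tags("Office desk photo shoot", TAG_DICTIONARY))
# prints: [('office', 'Office'), ('desk', 'Office')]
```

Returning the matched keyword alongside the tag makes it easy to see why a tag was assigned, which is the kind of information the debug command reports.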
------------------------------------------------------------
4. Debug Mode
------------------------------------------------------------
The debug command runs detailed inference diagnostics for a single gallery.
Example:
goondex debug "20251106_2041_Madison Young_ATK_Archives_starring_Madison_Young_Porn_XXX"
Output:
- Shows every keyword-to-tag match in the format: [DEBUG] matched 'keyword' → Tag
- Displays the total inferred tags
- Prints colorized output using ANSI escape codes
- Useful for verifying YAML dictionary behavior and tuning inference
Typical output:
[i] Running debug inference for 20251106_2041_Madison Young_...
[DEBUG] matched 'hairy' → Hairy
[DEBUG] matched 'office' → Office
...
[✓] Debug complete.
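When the debug output is captured to a file or pipe, the matches can be pulled out with a small parser. The sketch below assumes the lines look exactly like the typical output above and that colors are emitted as standard ANSI SGR sequences; parse_debug_output is a hypothetical helper, not part of the CLI.

```python
import re

# Matches ANSI SGR color sequences such as "\x1b[36m" and "\x1b[0m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")
# Assumed line shape: [DEBUG] matched 'keyword' → Tag
DEBUG_LINE = re.compile(r"^\[DEBUG\] matched '(?P<keyword>.+)' → (?P<tag>.+)$")


def parse_debug_output(output: str) -> list[tuple[str, str]]:
    """Extract (keyword, tag) pairs from captured debug output."""
    pairs = []
    for raw_line in output.splitlines():
        # Strip color codes first so the pattern sees plain text.
        line = ANSI_ESCAPE.sub("", raw_line).strip()
        match = DEBUG_LINE.match(line)
        if match:
            pairs.append((match.group("keyword"), match.group("tag")))
    return pairs


sample = "\x1b[36m[DEBUG] matched 'office' → Office\x1b[0m\n[✓] Debug complete."
print(parse_debug_output(sample))
# prints: [('office', 'Office')]
```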
------------------------------------------------------------
5. Example Workflow
------------------------------------------------------------
Import a gallery:
goondex import "https://www.pornpics.com/galleries/<id>/"
Refresh tags for one folder:
goondex refresh-one "<folder-name>"
Validate YAML dictionaries:
goondex validate-tags
Generate tag statistics:
goondex tag-stats
Run a tag inference debug session:
goondex debug "<folder>"
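The workflow above can also be scripted. The following sketch simply runs the same five commands in order through subprocess; it assumes goondex is on PATH and exits with a non-zero status on failure, which this guide does not state. workflow_commands and run_workflow are hypothetical names. Since import already auto-tags the new gallery, the refresh-one step is kept only to mirror the listed workflow.

```python
import subprocess
from typing import Callable, Optional, Sequence


def workflow_commands(url: str, folder: str) -> list[list[str]]:
    """Return the example workflow as argument lists (no shell quoting needed)."""
    return [
        ["goondex", "import", url],
        ["goondex", "refresh-one", folder],
        ["goondex", "validate-tags"],
        ["goondex", "tag-stats"],
        ["goondex", "debug", folder],
    ]


def run_workflow(
    url: str,
    folder: str,
    runner: Optional[Callable[[Sequence[str]], object]] = None,
) -> None:
    """Run each step in order; the default runner stops at the first failure."""
    if runner is None:
        # check=True raises CalledProcessError on a non-zero exit status.
        runner = lambda cmd: subprocess.run(cmd, check=True)
    for command in workflow_commands(url, folder):
        runner(command)
```

Passing a custom runner (for example, one that only prints each command) gives a dry run without touching any gallery.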
------------------------------------------------------------
6. Machine Learning (ML) Pipeline
------------------------------------------------------------
Dataset builder:
goondex ml build-dataset
Creates file:
ML/porndex_dataset.jsonl
Example entry:
{
  "gallery_id": "...",
  "title": "...",
  "models": ["..."],
  "tags": ["..."],
  "categories": ["..."],
  "image_paths": [".../Galleries/.../001.jpg"]
}
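Since the dataset is JSON Lines (one JSON object per line), it can be read with the standard library alone. The sketch below uses only the field names shown in the example entry; iter_dataset and galleries_with_tag are hypothetical helpers for illustration.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_dataset(path: Path) -> Iterator[dict]:
    """Yield one dataset entry per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


def galleries_with_tag(path: Path, tag: str) -> list[str]:
    """Return the gallery_id of every entry whose 'tags' list contains tag."""
    return [
        entry["gallery_id"]
        for entry in iter_dataset(path)
        if tag in entry.get("tags", [])
    ]
```

For example, galleries_with_tag(Path("ML/porndex_dataset.jsonl"), "Office") would list the IDs of all galleries tagged Office.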
Build hybrid embeddings:
goondex ml build-embeddings --img-samples 8 --device auto
Outputs:
ML/embeddings/<gallery_id>.npz
ML/embeddings_index.jsonl
Search examples:
goondex ml search "redhead teen"
goondex ml search "interracial bbc" --mode strict
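This guide does not describe how search scores results or what the modes change. As general background only, embedding search is commonly done by ranking stored vectors by cosine similarity to a query vector. The sketch below shows that technique in plain Python; it is an assumption for illustration, not Goondex's implementation, and rank_galleries and its inputs are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_galleries(
    query_vec: list[float],
    gallery_vecs: dict[str, list[float]],
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Return up to top_k (gallery_id, score) pairs, best match first."""
    scored = [
        (gallery_id, cosine_similarity(query_vec, vec))
        for gallery_id, vec in gallery_vecs.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```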
------------------------------------------------------------
7. Data Locations
------------------------------------------------------------
Galleries/                 → Imported galleries and images
Galleries/index.json       → Master index of all galleries
src/importer/reports/      → YAML validation and statistics reports
ML/porndex_dataset.jsonl   → ML dataset definition
ML/embeddings/             → Embedding vector files
ML/embeddings_index.jsonl  → Search index for semantic lookups
------------------------------------------------------------
8. Roadmap (post-v0.3.4)
------------------------------------------------------------
- Integrate GroundingDINO + Grounded-SAM for localized object detection
- Add attribute heads for gender, ethnicity, and clothing
- Develop an active-learning loop to refine weakly-labeled data
- Introduce interactive tag editor for review and correction
------------------------------------------------------------
Notes
------------------------------------------------------------
All commands operate locally and offline, except import, which needs network access to download the gallery.
Rebuilding datasets and embeddings is safe and idempotent.
Importer auto-tags new galleries using YAML dictionaries by default.
All modules adhere to the clean modular design outlined in ARCHITECTURE.md.
This guide is versioned (see the header) so that it can be matched to the CLI release it documents.
------------------------------------------------------------
End of File
------------------------------------------------------------