File: docs/ROADMAP.md
Version: v0.3.4
Last updated: November 2025
Maintainer: Leak Technologies
Project: Goondex

------------------------------------------------------------
Goondex Development Roadmap
------------------------------------------------------------

Purpose:
Outline upcoming milestones, version objectives, and long-term development goals for the Goondex ecosystem.
This roadmap provides an overview of feature direction, architectural priorities, and research-driven enhancements.

------------------------------------------------------------
1. Project Vision
------------------------------------------------------------

Goondex is designed as an automated, privacy-respecting adult content cataloguer, focused on:
- Intelligent tagging and metadata curation
- Machine-learning assisted gallery organization
- Local-first, offline-friendly operation
- Open, modular, and human-readable data formats

The system evolves through iterative versioning, with a strong emphasis on stability, transparency, and reproducibility.

------------------------------------------------------------
2. Version Milestones
------------------------------------------------------------

v0.3.x — Consolidation Phase
------------------------------------
Status: Active
Goals:
- Finalize CLI alias and stable import structure
- Standardize metadata.json schema and YAML dictionaries
- Document all core systems (CLI, Galleries, Tagging, Branding)
- Implement validation tools for dictionaries and index integrity
- Ensure consistency across all module imports (src.importer.*)
- Establish internal branding and developer documentation standards
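
As an illustration of the validation goal above, here is a minimal sketch of a metadata.json record check. The field names in `REQUIRED_FIELDS` and the function `validate_metadata` are assumptions made for this example; the actual schema is not defined in this roadmap.

```python
from typing import Any

# Hypothetical required fields and their expected types; the real
# metadata.json schema is not specified in this document.
REQUIRED_FIELDS: dict[str, type] = {
    "title": str,
    "source": str,
    "tags": list,
}


def validate_metadata(record: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    # Assumption: every tag must be a non-empty string.
    tags = record.get("tags")
    if isinstance(tags, list):
        for tag in tags:
            if not isinstance(tag, str) or not tag.strip():
                problems.append(f"invalid tag: {tag!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a validation tool report every issue in a collection in a single pass.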

v0.4.x — Machine Learning Integration
------------------------------------
Planned Start: December 2025
Goals:
- Introduce ML dataset builder and embedding engine
- Add hybrid (text + image) search support
- Implement GroundingDINO + Grounded-SAM detection pipeline
- Build attribute heads for ethnicity, gender, and clothing
- Introduce semantic tag inference based on contextual cues
- Develop auto-thumbnail generator for galleries
- Establish foundation for future “Goondex ML Core”
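
One possible shape for hybrid (text + image) search is a weighted blend of two cosine similarities. The sketch below assumes embeddings are already available as plain lists of floats; the names `hybrid_score`, `rank`, and `text_weight` are hypothetical and do not describe the planned embedding engine.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_score(query_text, query_image, item_text, item_image, text_weight=0.5):
    """Weighted average of the text and image cosine similarities."""
    return (text_weight * cosine(query_text, item_text)
            + (1.0 - text_weight) * cosine(query_image, item_image))


def rank(query_text, query_image, items, text_weight=0.5):
    """Return gallery ids sorted from most to least similar.

    `items` maps a gallery id to a (text_embedding, image_embedding) pair.
    """
    scored = {
        gid: hybrid_score(query_text, query_image, text_emb, image_emb, text_weight)
        for gid, (text_emb, image_emb) in items.items()
    }
    return sorted(scored, key=scored.get, reverse=True)
```

A single `text_weight` parameter keeps the blend easy to tune; a real implementation would likely use a vector library instead of pure-Python loops.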

v0.5.x — Visual Intelligence and Automation
------------------------------------
Planned Start: Q1 2026
Goals:
- Expand ML integration to support local fine-tuning
- Train local model for visual tagging (SAM, CLIP, BLIP-2)
- Enable partial face and body region detection
- Add scene clustering (e.g., “bathroom scenes”, “studio sets”)
- Improve NLP-based title parsing for better model recognition
- Integrate hybrid similarity search (image-to-gallery)

v0.6.x — Web Interface & UX
------------------------------------
Planned Start: Q2 2026
Goals:
- Create lightweight local Web UI for browsing and search
- Add thumbnail preview grid for galleries
- Support filtering by tag, performer, or source
- Allow tag editing via UI (writes to metadata.json)
- Visualize ML embeddings as clusters or heatmaps
- Introduce color-coded category icons based on tag domains

v0.7.x — Multi-Source Expansion
------------------------------------
Planned Start: Q3 2026
Goals:
- Add support for multiple import sources (e.g., TheHun, Fapello)
- Normalize cross-site metadata into unified schema
- Introduce per-site tag mappings for source-specific categories
- Develop rate-limiting, retries, and error resilience for scraping
- Expand YAML dictionaries to include new tag categories
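
The per-site tag mapping and unified schema goals above could look roughly like the following sketch. The site keys, the mapping entries, and the output fields are placeholders chosen for illustration; in practice the mappings would presumably live in the YAML dictionaries.

```python
# Hypothetical per-site mappings from source-specific labels to canonical tags.
SITE_TAG_MAPS = {
    "site_a": {"bath": "bathroom", "studio set": "studio"},
    "site_b": {"Bathroom Scenes": "bathroom"},
}


def normalize_record(site: str, raw: dict) -> dict:
    """Convert one source-specific record into a unified record.

    Tags without a mapping are kept after stripping and lowercasing;
    duplicates are dropped while preserving first-seen order.
    """
    mapping = SITE_TAG_MAPS.get(site, {})
    tags = []
    for tag in raw.get("tags", []):
        canonical = mapping.get(tag, tag.strip().lower())
        if canonical and canonical not in tags:
            tags.append(canonical)
    return {"source": site, "title": raw.get("title", "").strip(), "tags": tags}
```

For example, `normalize_record("site_b", {"title": " Example ", "tags": ["Bathroom Scenes", "bathroom"]})` collapses both labels into the single canonical tag `bathroom`.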

v0.8.x — Semantic Intelligence & AI Curation
------------------------------------
Planned Start: Q4 2026
Goals:
- Train in-house multimodal model for semantic gallery tagging
- Support “smart tagging” with probabilistic tag confidence
- Implement user feedback learning loop for refinement
- Add multilingual tag inference (English, French, German)
- Develop automatic duplicate detection and merge logic
- Add story-based inference (scene context across images)
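
For the duplicate detection goal above, the simplest starting point is exact, byte-level matching by content hash. The sketch below covers only that case, using hypothetical function names; near-duplicate detection (resized or re-encoded images) would need perceptual hashing or embeddings and is not shown.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's bytes, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-identical."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # Only groups with more than one file are duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each returned group is a candidate for merge logic; deciding which copy to keep is left to the caller.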

v0.9.x — Optimization & Deployment
------------------------------------
Planned Start: 2027
Goals:
- Package as standalone application with installer
- Implement database indexing for instant search
- Optimize YAML and JSON parsing for large collections
- Introduce CLI subcommands for advanced maintenance tasks
- Add backup, restore, and migration tools
- Begin Linux packaging (PKGBUILD, Flatpak manifest)

v1.0.0 — Stable Release
------------------------------------
Planned Start: 2027
Goals:
- Fully modular architecture with plugin-based importers
- Complete Web UI parity with CLI functionality
- Documented API endpoints for local integrations
- Export system for JSONL / CSV / ML dataset sync
- Full automated test coverage and build pipeline
- Public release of “Goondex ML Core” dataset format
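
A JSONL / CSV export could be sketched with the standard library alone, as below. The record fields and the choice of `|` as the separator for list values are assumptions made for this example, not part of any defined Goondex format.

```python
import csv
import io
import json


def to_jsonl(records: list[dict]) -> str:
    """One JSON object per line; keys are sorted for reproducible output."""
    return "".join(
        json.dumps(r, sort_keys=True, ensure_ascii=False) + "\n" for r in records
    )


def to_csv(records: list[dict], fields: list[str]) -> str:
    """CSV with a header row; list values are joined with '|' (an arbitrary choice)."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for r in records:
        row = {
            k: "|".join(map(str, v)) if isinstance(v, list) else v
            for k, v in r.items()
        }
        writer.writerow(row)
    return buf.getvalue()
```

Sorting keys in the JSONL output keeps exports diff-friendly, which fits the roadmap's emphasis on reproducibility.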

------------------------------------------------------------
3. Research & Experimental Branches
------------------------------------------------------------

ML-Research Branch:
- Embedding fusion experiments (text–image hybrid)
- Visual attribute detection fine-tuning using CLIP variants
- Performance benchmark on local consumer GPUs

Tag-Lab Branch:
- Dynamic tag clustering using sentence-transformers
- Contextual tagging prototype (scene recognition)
- Human-assisted tag correction feedback loop

Web-UI Branch:
- Minimalist grid-based gallery explorer
- Tag filters with real-time search
- RESTful interface backed by FastAPI

------------------------------------------------------------
4. Long-Term Goals
------------------------------------------------------------

- Local inference pipeline fully independent from cloud APIs
- Optional privacy layer for encrypted gallery indexing
- On-device fine-tuning for user-specific preferences
- Extend beyond adult content into broader visual media indexing
- Formalize Goondex Metadata Specification (GMS 1.0) for interoperability

------------------------------------------------------------
5. Development Philosophy
------------------------------------------------------------

- Local-first: all functions must work offline
- Transparent: all data stored in readable YAML/JSON
- Modular: each subsystem must be independently testable
- Ethical: prioritizes privacy and non-exploitative content handling
- Accessible: written with clear documentation and open interfaces

------------------------------------------------------------
End of File
------------------------------------------------------------