File: docs/ROADMAP.md
Version: v0.3.4
Last updated: November 2025
Maintainer: Leak Technologies
Project: Goondex
------------------------------------------------------------
Goondex Development Roadmap
------------------------------------------------------------
Purpose:
Outline upcoming milestones, version objectives, and long-term development goals for the Goondex ecosystem.
This roadmap provides an overview of feature direction, architectural priorities, and research-driven enhancements.
------------------------------------------------------------
1. Project Vision
------------------------------------------------------------
Goondex is designed as an automated, privacy-respecting adult content cataloguer, focused on:
- Intelligent tagging and metadata curation
- Machine-learning assisted gallery organization
- Local-first, offline-friendly operation
- Open, modular, and human-readable data formats
The system evolves through iterative versioning, with a strong emphasis on stability, transparency, and reproducibility.
------------------------------------------------------------
2. Version Milestones
------------------------------------------------------------
v0.3.x — Consolidation Phase
------------------------------------
Status: Active
Goals:
- Finalize CLI alias and stable import structure
- Standardize metadata.json schema and YAML dictionaries
- Document all core systems (CLI, Galleries, Tagging, Branding)
- Implement validation tools for dictionaries and index integrity
- Ensure consistency across all module imports (src.importer.*)
- Establish internal branding and developer documentation standards
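
The validation goals above could begin with something like the following minimal sketch. The field names (`title`, `source`, `tags`) and the `validate_metadata` helper are assumptions made for illustration; this roadmap does not define the actual metadata.json schema.

```python
import json

# Hypothetical required fields and their expected types; the real
# metadata.json schema is not specified in this roadmap.
REQUIRED_FIELDS = {"title": str, "source": str, "tags": list}


def validate_metadata(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    # Each tag should be a non-empty string so dictionary lookups stay reliable.
    tags = record.get("tags")
    if isinstance(tags, list):
        for tag in tags:
            if not isinstance(tag, str) or not tag.strip():
                problems.append(f"invalid tag: {tag!r}")
    return problems


if __name__ == "__main__":
    sample = json.loads('{"title": "Example", "source": "local", "tags": ["studio", ""]}')
    print(validate_metadata(sample))
```

Returning a list of problems, rather than raising on the first one, lets a CLI report every issue in a gallery in a single pass.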
v0.4.x — Machine Learning Integration
------------------------------------
Planned Start: December 2025
Goals:
- Introduce ML dataset builder and embedding engine
- Add hybrid (text + image) search support
- Implement GroundingDINO + Grounded-SAM detection pipeline
- Build attribute heads for ethnicity, gender, and clothing
- Introduce semantic tag inference based on contextual cues
- Develop auto-thumbnail generator for galleries
- Establish foundation for future “Goondex ML Core”
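
One possible shape for the hybrid (text + image) search goal is late fusion: score each modality separately, then combine the scores with a weight. The sketch below assumes embeddings are already available as plain float lists; `cosine`, `hybrid_score`, and `text_weight` are hypothetical names, and the roadmap does not say which fusion method will be used.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_score(query_text, query_image, item_text, item_image, text_weight=0.5):
    """Weighted sum of text and image cosine similarities (late fusion).

    text_weight is an assumed tuning knob: 1.0 means text only, 0.0 image only.
    """
    text_sim = cosine(query_text, item_text)
    image_sim = cosine(query_image, item_image)
    return text_weight * text_sim + (1.0 - text_weight) * image_sim
```

Keeping the two similarities separate until the final step makes it easy to adjust the weighting without recomputing embeddings.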
v0.5.x — Visual Intelligence and Automation
------------------------------------
Planned Start: Q1 2026
Goals:
- Expand ML integration to support local fine-tuning
- Train local model for visual tagging (SAM, CLIP, BLIP2)
- Enable partial face and body region detection
- Add scene clustering (e.g., “bathroom scenes”, “studio sets”)
- Improve NLP-based title parsing for better performer recognition
- Integrate hybrid similarity search (image-to-gallery)
v0.6.x — Web Interface & UX
------------------------------------
Planned Start: Q2 2026
Goals:
- Create lightweight local Web UI for browsing and search
- Add thumbnail preview grid for galleries
- Support filtering by tag, performer, or source
- Allow tag editing via UI (writes to metadata.json)
- Visualize ML embeddings as clusters or heatmaps
- Introduce color-coded category icons based on tag domains
v0.7.x — Multi-Source Expansion
------------------------------------
Planned Start: Q3 2026
Goals:
- Add support for multiple import sources (e.g., TheHun, Fapello)
- Normalize cross-site metadata into unified schema
- Introduce per-site tag mappings for source-specific categories
- Develop rate-limiting, retries, and error resilience for scraping
- Expand YAML dictionaries to include new tag categories
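
The retry and error-resilience goal above might look roughly like this sketch of exponential backoff. The `retry_with_backoff` helper and its parameters are hypothetical, and `fetch` stands in for whatever callable a given importer uses to retrieve a page.

```python
import time


def retry_with_backoff(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure.

    The sleep function is injectable so tests can run without real delays.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            # A real importer should catch narrower, network-specific errors.
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

With the defaults, a fetch that keeps failing waits 1 s and then 2 s before the third and final attempt raises the original error.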
v0.8.x — Semantic Intelligence & AI Curation
------------------------------------
Planned Start: Q4 2026
Goals:
- Train in-house multimodal model for semantic gallery tagging
- Support “smart tagging” with probabilistic tag confidence
- Implement user feedback learning loop for refinement
- Add multilingual tag inference (English, French, German)
- Develop automatic duplicate detection and merge logic
- Add story-based inference (scene context across images)
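
As a starting point for the duplicate-detection goal above, exact duplicates can be found by hashing file contents. This sketch only catches byte-identical files; near-duplicates such as resized or re-encoded images would need a different technique. The `find_exact_duplicates` helper and its in-memory input are assumptions for illustration.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose byte content is identical (same SHA-256 digest)."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by more than one file count as duplicate groups.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]
```

Each returned group could then be handed to merge logic that keeps one copy and unions the tags.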
v0.9.x — Optimization & Deployment
------------------------------------
Planned Start: 2027
Goals:
- Package as standalone application with installer
- Implement database indexing for instant search
- Optimize YAML and JSON parsing for large collections
- Introduce CLI subcommands for advanced maintenance tasks
- Add backup, restore, and migration tools
- Begin Linux packaging (PKGBUILD, Flatpak manifest)
v1.0.0 — Stable Release
------------------------------------
Planned Start: 2027
Goals:
- Fully modular architecture with plugin-based importers
- Complete Web UI parity with CLI functionality
- Documented API endpoints for local integrations
- Export system for JSONL / CSV / ML dataset sync
- Full automated test coverage and build pipeline
- Public release of “Goondex ML Core” dataset format
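
The JSONL / CSV export goal above could be sketched with the standard library alone. The record layout (`title`, `tags`) and the `;` separator for list values are assumptions; the roadmap does not fix an export format.

```python
import csv
import io
import json


def to_jsonl(records: list[dict]) -> str:
    """Serialize one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def to_csv(records: list[dict], fields: list[str]) -> str:
    """Serialize records as CSV; list values are joined with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # Flatten lists of strings (such as tags) into a single cell.
        row = {k: ";".join(v) if isinstance(v, list) else v for k, v in record.items()}
        writer.writerow(row)
    return buffer.getvalue()
```

JSONL keeps nested structures such as tag lists intact, while the CSV form flattens them for spreadsheet tools.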
------------------------------------------------------------
3. Research & Experimental Branches
------------------------------------------------------------
ML-Research Branch:
- Embedding fusion experiments (text + image hybrid)
- Visual attribute detection fine-tuning using CLIP variants
- Performance benchmark on local consumer GPUs
Tag-Lab Branch:
- Dynamic tag clustering using sentence-transformers
- Contextual tagging prototype (scene recognition)
- Human-assisted tag correction feedback loop
Web-UI Branch:
- Minimalist grid-based gallery explorer
- Tag filters with real-time search
- RESTful interface backed by FastAPI
------------------------------------------------------------
4. Long-Term Goals
------------------------------------------------------------
- Local inference pipeline fully independent from cloud APIs
- Optional privacy layer for encrypted gallery indexing
- On-device fine-tuning for user-specific preferences
- Extend beyond adult content into broader visual media indexing
- Formalize Goondex Metadata Specification (GMS 1.0) for interoperability
------------------------------------------------------------
5. Development Philosophy
------------------------------------------------------------
- Local-first: all functions must work offline
- Transparent: all data stored in readable YAML/JSON
- Modular: each subsystem must be independently testable
- Ethical: prioritizes privacy and non-exploitative content handling
- Accessible: written with clear documentation and open interfaces
------------------------------------------------------------
End of File
------------------------------------------------------------