File: docs/ROADMAP.md
Version: v0.3.4
Last updated: November 2025
Maintainer: Leak Technologies
Project: Goondex


Goondex Development Roadmap

Purpose:
Outline upcoming milestones, version objectives, and long-term development goals for the Goondex ecosystem.
This roadmap provides an overview of feature direction, architectural priorities, and research-driven enhancements.


  1. Project Vision

Goondex is designed as an automated, privacy-respecting adult content cataloguer, focused on:

  • Intelligent tagging and metadata curation
  • Machine-learning assisted gallery organization
  • Local-first, offline-friendly operation
  • Open, modular, and human-readable data formats

The system evolves through iterative versioning with strong emphasis on stability, transparency, and reproducibility.


  2. Version Milestones

v0.3.x — Consolidation Phase

Status: Active
Goals:

  • Finalize CLI alias and stable import structure
  • Standardize metadata.json schema and YAML dictionaries
  • Document all core systems (CLI, Galleries, Tagging, Branding)
  • Implement validation tools for dictionaries and index integrity
  • Ensure consistency across all module imports (src.importer.*)
  • Establish internal branding and developer documentation standards
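
A validation tool for metadata.json records could start out as small as the sketch below. The field names (title, source, tags) are hypothetical placeholders, since this roadmap does not define the schema; treat it as an illustration of the idea rather than the planned implementation.

```python
# Hypothetical required fields and types; the real metadata.json schema
# is not specified in this roadmap.
REQUIRED_FIELDS = {"title": str, "source": str, "tags": list}


def validate_metadata(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )

    # Tags are assumed to be unique, non-empty strings.
    tags = record.get("tags")
    if isinstance(tags, list):
        if any(not isinstance(t, str) or not t.strip() for t in tags):
            problems.append("tags must be non-empty strings")
        elif len(set(tags)) != len(tags):
            problems.append("duplicate tags")
    return problems
```

Returning a list of problems instead of raising on the first one lets a CLI report every issue in a file in a single pass.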

v0.4.x — Machine Learning Integration

Planned Start: December 2025
Goals:

  • Introduce ML dataset builder and embedding engine
  • Add hybrid (text + image) search support
  • Implement GroundingDINO + Grounded-SAM detection pipeline
  • Build attribute heads for ethnicity, gender, and clothing
  • Introduce semantic tag inference based on contextual cues
  • Develop auto-thumbnail generator for galleries
  • Establish foundation for future “Goondex ML Core”
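
One common way to combine text and image signals in a hybrid search is a weighted sum of cosine similarities. The sketch below assumes each gallery already has a text embedding and an image embedding; the function names and the text_weight parameter are illustrative, not part of any existing Goondex API.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_score(text_query, image_query, text_vec, image_vec, text_weight=0.5):
    """Blend text and image similarity; text_weight=1.0 ignores the image side."""
    text_sim = cosine(text_query, text_vec)
    image_sim = cosine(image_query, image_vec)
    return text_weight * text_sim + (1.0 - text_weight) * image_sim
```

Ranking galleries by hybrid_score and tuning text_weight against a small labelled set would be one way to evaluate whether the hybrid approach improves on either signal alone.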

v0.5.x — Visual Intelligence and Automation

Planned Start: Q1 2026
Goals:

  • Expand ML integration to support local fine-tuning
  • Train local model for visual tagging (SAM, CLIP, BLIP2)
  • Enable partial face and body region detection
  • Add scene clustering (e.g., “bathroom scenes”, “studio sets”)
  • Improve NLP-based title parsing for better model recognition
  • Integrate hybrid similarity search (image-to-gallery)

v0.6.x — Web Interface & UX

Planned Start: Q2 2026
Goals:

  • Create lightweight local Web UI for browsing and search
  • Add thumbnail preview grid for galleries
  • Support filtering by tag, performer, or source
  • Allow tag editing via UI (writes to metadata.json)
  • Visualize ML embeddings as clusters or heatmaps
  • Introduce color-coded category icons based on tag domains

v0.7.x — Multi-Source Expansion

Planned Start: Q3 2026
Goals:

  • Add support for multiple import sources (e.g., TheHun, Fapello)
  • Normalize cross-site metadata into unified schema
  • Introduce per-site tag mappings for source-specific categories
  • Develop rate-limiting, retries, and error resilience for scraping
  • Expand YAML dictionaries to include new tag categories
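
The retry and error-resilience goal above could be approached with exponential backoff. This is a minimal sketch under stated assumptions: fetch is any caller-supplied zero-argument function (hypothetical here), network failures surface as OSError subclasses, and the delay values are placeholders rather than tuned settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except OSError:
            # ConnectionError and TimeoutError are OSError subclasses.
            # The final failure is re-raised so callers can log or skip it.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the helper testable offline, which fits the local-first and independently testable goals stated elsewhere in this roadmap.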

v0.8.x — Semantic Intelligence & AI Curation

Planned Start: Q4 2026
Goals:

  • Train in-house multimodal model for semantic gallery tagging
  • Support “smart tagging” with probabilistic tag confidence
  • Implement user feedback learning loop for refinement
  • Add multilingual tag inference (English, French, German)
  • Develop automatic duplicate detection and merge logic
  • Add story-based inference (scene context across images)
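
As a starting point for duplicate detection, byte-identical files can be grouped by content hash. The sketch below covers only exact duplicates; near-duplicates (resized or re-encoded images) would need a perceptual approach that is not shown here. The function name is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(paths):
    """Group files by the SHA-256 of their bytes; return groups of 2+ files."""
    groups = defaultdict(list)
    for path in paths:
        # Reads the whole file into memory; fine for images, but very
        # large files would be better hashed in chunks.
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Each returned group is a candidate for the merge step; deciding which copy's metadata wins is a separate policy question.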

v0.9.x — Optimization & Deployment

Planned Start: 2027
Goals:

  • Package as standalone application with installer
  • Implement database indexing for instant search
  • Optimize YAML and JSON parsing for large collections
  • Introduce CLI subcommands for advanced maintenance tasks
  • Add backup, restore, and migration tools
  • Begin Linux packaging (PKGBUILD, Flatpak manifest)

v1.0.0 — Stable Release

Planned Start: 2027
Goals:

  • Fully modular architecture with plugin-based importers
  • Complete Web UI parity with CLI functionality
  • Documented API endpoints for local integrations
  • Export system for JSONL / CSV / ML dataset sync
  • Full automated test coverage and build pipeline
  • Public release of “Goondex ML Core” dataset format
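
The JSONL and CSV export targets could be served by two small serializers like the ones sketched here. The record fields used in the usage example are hypothetical, and joining list values with ";" is one possible convention for CSV, not a decided format.

```python
import csv
import io
import json


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(
        json.dumps(record, ensure_ascii=False, sort_keys=True) + "\n"
        for record in records
    )


def to_csv(records, fields):
    """Serialize the selected fields as CSV, joining list values with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        row = {
            key: ";".join(map(str, value)) if isinstance(value, list) else value
            for key, value in record.items()
        }
        writer.writerow(row)
    return buffer.getvalue()
```

For example, `to_csv([{"title": "a", "tags": ["x", "y"]}], ["title", "tags"])` produces a header row followed by `a,x;y`.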

  3. Research & Experimental Branches

ML-Research Branch:

  • Embedding fusion experiments (text + image hybrid)
  • Visual attribute detection fine-tuning using CLIP variants
  • Performance benchmark on local consumer GPUs

Tag-Lab Branch:

  • Dynamic tag clustering using sentence-transformers
  • Contextual tagging prototype (scene recognition)
  • Human-assisted tag correction feedback loop

Web-UI Branch:

  • Minimalist grid-based gallery explorer
  • Tag filters with real-time search
  • RESTful interface backed by FastAPI

  4. Long-Term Goals

  • Local inference pipeline fully independent from cloud APIs
  • Optional privacy layer for encrypted gallery indexing
  • On-device fine-tuning for user-specific preferences
  • Extend beyond adult content into broader visual media indexing
  • Formalize Goondex Metadata Specification (GMS 1.0) for interoperability

  5. Development Philosophy

  • Local-first: all functions must work offline
  • Transparent: all data stored in readable YAML/JSON
  • Modular: each subsystem must be independently testable
  • Ethical: prioritizes privacy and non-exploitative content handling
  • Accessible: written with clear documentation and open interfaces

End of File