VideoTools/TODO.md
Stu Leak 5d07d5bb61 feat(docs): create dedicated Windows installation guide
2026-01-01 22:55:00 -05:00

VideoTools TODO (v0.1.0-dev22+ plan)

This file tracks upcoming features, improvements, and known issues.

Documentation: Fix Structural Errors

Priority: High

  • Audit All Docs for Broken Links:
    • Systematically check all 46 .md files for internal links that point to non-existent files or sections.
    • Create placeholder stubs for missing documents that are essential (e.g., CONTRIBUTING.md), or remove the links if the documents are not essential.
    • This ensures a professional and navigable documentation experience.
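
A minimal sketch of how the link audit could be automated in Go. The names `localTargets` and `brokenLinks` are hypothetical, and the `exists` callback is an assumption: in a real audit it would likely wrap `os.Stat` relative to each document's directory. The sketch only checks that target files exist; verifying section anchors would need a separate heading scan.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// linkRe matches inline Markdown links and captures the link target.
var linkRe = regexp.MustCompile(`\[[^\]]*\]\(([^)\s]+)[^)]*\)`)

// localTargets returns the file part of every relative link found in md.
func localTargets(md string) []string {
	var out []string
	for _, m := range linkRe.FindAllStringSubmatch(md, -1) {
		target := m[1]
		if strings.Contains(target, "://") || strings.HasPrefix(target, "mailto:") {
			continue // external link, out of scope for this audit
		}
		if i := strings.IndexByte(target, '#'); i >= 0 {
			target = target[:i] // drop the section anchor
		}
		if target != "" {
			out = append(out, target)
		}
	}
	return out
}

// brokenLinks reports every local target for which exists returns false.
func brokenLinks(md string, exists func(string) bool) []string {
	var broken []string
	for _, t := range localTargets(md) {
		if !exists(t) {
			broken = append(broken, t)
		}
	}
	return broken
}

func main() {
	doc := "See [install](INSTALLATION.md#linux) and [contrib](CONTRIBUTING.md)."
	known := map[string]bool{"INSTALLATION.md": true}
	fmt.Println(brokenLinks(doc, func(p string) bool { return known[p] }))
}
```

Passing the existence check as a function keeps the scanner testable without touching the filesystem.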

Critical Priority: dev22

VIDEO PLAYER IMPLEMENTATION

CRITICAL BLOCKER: All advanced features (enhancement, trim, advanced filters) depend on a stable player foundation.

Current Player Issues (from PLAYER_PERFORMANCE_ISSUES.md):

  1. Separate A/V Processes (lines 10184-10185 in main.go)

    • Video and audio run in completely separate FFmpeg processes
    • No synchronization mechanism between them
    • They will inevitably drift apart, causing A/V desync and stuttering
    • FIX: Implement unified FFmpeg process with multiplexed output
  2. Audio Buffer Too Small (lines 8960, 9274 in main.go)

    • Currently 8192 samples ≈ 170 ms of audio (assuming a 48 kHz sample rate)
    • Modern systems need 100-200ms buffers for smooth playback
    • FIX: Increase to 16384-32768 samples (≈ 340-680 ms at 48 kHz)
  3. Volume Processing in Hot Path (lines 9294-9318 in main.go)

    • Processes volume on EVERY audio sample in real-time
    • CPU-intensive and blocks audio read loop
    • FIX: Move volume processing to FFmpeg filters
  4. Video Frame Pacing Issues (lines 9200-9203 in main.go)

    • time.Sleep() is not precise, so timing errors accumulate
    • No correction mechanism if we fall behind
    • FIX: Implement adaptive timing with drift correction
  5. UI Thread Blocking (lines 9207-9215 in main.go)

    • Frame updates queue up if UI thread is busy
    • No frame dropping mechanism
    • FIX: Implement proper frame buffer management
  6. No Frame-Accurate Seeking (lines 10018-10028 in main.go)

    • Seeking kills and restarts both FFmpeg processes
    • 100-500ms gap during seek operations
    • No keyframe awareness
    • FIX: Implement frame-level seeking without process restart
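
The buffer figures in item 2 can be checked with a short calculation. This is a sketch under two assumptions not stated in the list: a 48 kHz sample rate, and "samples" meaning per-channel sample frames. The helper names are hypothetical. Since 170 ms already falls inside the 100-200 ms range quoted in item 2, it may be worth confirming whether the 8192 figure in main.go counts samples or bytes before resizing.

```go
package main

import (
	"fmt"
	"time"
)

// bufferDuration returns how long `frames` sample frames last at sampleRate Hz.
func bufferDuration(frames, sampleRate int) time.Duration {
	return time.Duration(frames) * time.Second / time.Duration(sampleRate)
}

// bufferBytes returns the raw PCM size of `frames` sample frames.
func bufferBytes(frames, channels, bytesPerSample int) int {
	return frames * channels * bytesPerSample
}

func main() {
	const sampleRate = 48000 // assumed; adjust to the player's actual rate
	for _, n := range []int{8192, 16384, 32768} {
		fmt.Printf("%5d frames = %v (%d bytes as stereo 16-bit PCM)\n",
			n,
			bufferDuration(n, sampleRate).Round(time.Millisecond),
			bufferBytes(n, 2, 2))
	}
}
```

At 44.1 kHz the same frame counts last slightly longer, so the rate has to be fixed before choosing a size.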

Player Implementation Plan:

Phase 1: Foundation (Week 1-2)

  • Unified FFmpeg Architecture

    • Single process with multiplexed A/V output using pipes
    • Master clock reference for synchronization
    • PTS-based drift correction mechanisms
    • Ring buffers for audio and video
  • Hardware Acceleration Integration

    • Auto-detect available backends (CUDA, VA-API, VideoToolbox)
    • FFmpeg hardware acceleration through native flags
    • Fallback to software decoding when hardware is unavailable
  • Frame Extraction System

    • Frame extraction without restarting playback
    • Keyframe detection and indexing
    • Frame buffer pooling to reduce GC pressure
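
One possible shape for the backend auto-detection is to parse the list printed by `ffmpeg -hwaccels` and pick the first preferred method. The function names and the preference order below are assumptions for illustration, and the sample output is made up rather than captured from a real system.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHWAccels extracts method names from the text printed by `ffmpeg -hwaccels`.
func parseHWAccels(out string) []string {
	var methods []string
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasSuffix(line, ":") {
			continue // skip blank lines and the "Hardware acceleration methods:" header
		}
		methods = append(methods, line)
	}
	return methods
}

// pickBackend returns the first preferred method that is available,
// or "" to signal that the player should fall back to software decoding.
func pickBackend(available, preferred []string) string {
	have := make(map[string]bool, len(available))
	for _, m := range available {
		have[m] = true
	}
	for _, p := range preferred {
		if have[p] {
			return p
		}
	}
	return ""
}

func main() {
	// Illustrative output only; a real run would capture it via os/exec.
	sample := "Hardware acceleration methods:\nvdpau\ncuda\nvaapi\n"
	backend := pickBackend(parseHWAccels(sample), []string{"cuda", "vaapi", "videotoolbox"})
	if backend == "" {
		fmt.Println("software decoding")
	} else {
		fmt.Println("-hwaccel", backend)
	}
}
```

A method appearing in this list only means it was compiled into the FFmpeg build. Whether it works at runtime depends on the hardware and drivers, so a failed hardware decode should still fall back to software.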

Phase 2: Core Features (Week 3-4)

  • Frame-Accurate Seeking

    • Seek to specific frames without restarts
    • Keyframe-aware seeking for performance
    • Frame extraction at seek points for preview
  • Chapter System Integration

    • Port scene detection from Author module
    • Manual chapter support with keyframing
    • Chapter navigation (next/previous)
    • Chapter display in UI
  • Performance Optimization

    • Adaptive frame timing with drift correction
    • Frame dropping when UI thread can't keep up
    • Memory pool management for frame buffers
    • CPU usage optimization
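
The adaptive timing and frame-dropping items could be sketched as follows. Instead of sleeping a fixed interval per frame, each deadline is derived from the frame index, so sleep inaccuracies do not accumulate. The function names and the one-frame drop threshold are assumptions, not decisions taken from the plan.

```go
package main

import (
	"fmt"
	"time"
)

// frameDeadline returns when frame n is due, measured from playback start.
// Deriving it from the index avoids summing per-frame sleep errors.
func frameDeadline(n int, fps float64) time.Duration {
	return time.Duration(float64(n) * float64(time.Second) / fps)
}

// schedule decides what to do with frame n given the time elapsed on the
// master clock: wait before showing it, show it now, or drop it.
func schedule(elapsed time.Duration, n int, fps float64) (wait time.Duration, drop bool) {
	frameDur := frameDeadline(1, fps)
	late := elapsed - frameDeadline(n, fps)
	switch {
	case late > frameDur:
		return 0, true // more than one frame behind: skip it to catch up
	case late < 0:
		return -late, false // early: sleep only for the remaining time
	default:
		return 0, false // slightly late: show immediately
	}
}

func main() {
	// Simulated clock readings for frames 1..3 of a 25 fps stream.
	elapsed := []time.Duration{10 * time.Millisecond, 85 * time.Millisecond, 250 * time.Millisecond}
	for i, e := range elapsed {
		wait, drop := schedule(e, i+1, 25)
		fmt.Printf("frame %d: wait=%v drop=%v\n", i+1, wait, drop)
	}
}
```

In the player, `elapsed` would presumably come from `time.Since(start)` or from the audio clock if audio is chosen as the master reference.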

Phase 3: Advanced Features (Week 5-6)

  • Preview System

    • Real-time frame extraction
    • Thumbnail generation from keyframes
    • Frame buffer caching for previews
  • Error Recovery

    • Graceful failure handling
    • Resume capability after crashes
    • Smart fallback mechanisms

ENHANCEMENT MODULE FOUNDATION

DEPENDS ON PLAYER COMPLETION

Current State:

  • Basic filters module with color correction, sharpening, transforms
  • Stylistic effects (8mm, 16mm, B&W Film, Silent Film, VHS, Webcam)
  • AI upscaling with Real-ESRGAN integration
  • Basic AI model management
  • No content-aware processing
  • No multi-pass enhancement pipeline
  • No before/after preview system

Enhancement Module Plan:

Phase 1: Architecture (Week 1-2 - POST PLAYER)

  • Model Registry System

    • Abstract AI model interface for easy extension
    • Dynamic model discovery and registration
    • Model requirements validation
    • Configuration management for different model types
  • Content Detection Pipeline

    • Automatic content type detection (general/anime/film)
    • Quality assessment algorithms
    • Progressive vs interlaced detection
    • Artifact analysis (compression noise, film grain)
  • Unified Enhancement Workflow

    • Combine Filters + Upscale into single module
    • Content-aware model selection logic
    • Multi-pass processing framework
    • Quality preservation controls
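
A rough sketch of what the model registry and content-aware selection might look like in Go. Every type and method name here is hypothetical, and the content types assigned to each model in `main` are placeholders rather than measured recommendations.

```go
package main

import (
	"fmt"
	"sort"
)

// ContentType mirrors the categories named in the plan.
type ContentType string

const (
	General ContentType = "general"
	Anime   ContentType = "anime"
	Film    ContentType = "film"
)

// Model is a hypothetical interface every enhancement backend would satisfy.
type Model interface {
	Name() string
	Supports(ContentType) bool
}

// Registry holds models keyed by name.
type Registry struct{ models map[string]Model }

func NewRegistry() *Registry { return &Registry{models: make(map[string]Model)} }

// Register adds a model and rejects duplicate names.
func (r *Registry) Register(m Model) error {
	if _, dup := r.models[m.Name()]; dup {
		return fmt.Errorf("model %q already registered", m.Name())
	}
	r.models[m.Name()] = m
	return nil
}

// For returns the names of models that support ct, sorted for stable output.
func (r *Registry) For(ct ContentType) []string {
	var names []string
	for name, m := range r.models {
		if m.Supports(ct) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// staticModel is a stand-in used only for this example.
type staticModel struct {
	name  string
	types []ContentType
}

func (s staticModel) Name() string { return s.name }

func (s staticModel) Supports(ct ContentType) bool {
	for _, t := range s.types {
		if t == ct {
			return true
		}
	}
	return false
}

func main() {
	reg := NewRegistry()
	// Placeholder capability lists, for illustration only.
	_ = reg.Register(staticModel{"realesrgan", []ContentType{General, Film}})
	_ = reg.Register(staticModel{"realcugan", []ContentType{Anime}})
	fmt.Println(reg.For(Anime))
}
```

Keeping `Model` small means new backends only need to describe themselves; requirements validation and configuration could be added as further methods once their shape is known.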

Phase 2: Model Integration (Week 3-4)

  • Open-Source AI Model Expansion

    • BasicVSR integration (video-specific super-resolution)
    • RIFE models for frame interpolation
    • Real-CUGAN for anime/cartoon enhancement
    • Model selection based on content type
  • Advanced Processing Features

    • Sequential model application capabilities
    • Custom enhancement pipeline creation
    • Parameter fine-tuning for different models
    • Quality vs Speed presets

TRIM MODULE ENHANCEMENT

DEPENDS ON PLAYER COMPLETION

Current State:

  • Basic planning completed
  • No timeline interface
  • No frame-accurate cutting
  • No chapter integration from Author module

Trim Module Plan:

Phase 1: Foundation (Week 1-2 - POST PLAYER)

  • Timeline Interface

    • Frame-accurate timeline visualization
    • Zoom capabilities for precise editing
    • Scrubbing with real-time preview
    • Time/frame dual display modes
  • Chapter Integration

    • Import scene detection from Author module
    • Manual chapter marker creation
    • Chapter navigation controls
    • Visual chapter markers on timeline
  • Frame-Accurate Cutting

    • Exact frame selection for in/out points
    • Preview before/after trim points
    • Multiple segment trimming support
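
For the time/frame dual display, the conversion between frame indices and timestamps could look like the sketch below. It assumes a constant frame rate expressed as a rational (for example 30000/1001), and the function names are hypothetical.

```go
package main

import "fmt"

// frameToTimecode formats frame index n as HH:MM:SS.mmm for a constant
// frame rate of num/den frames per second.
func frameToTimecode(n, num, den int64) string {
	ms := (n*den*1000 + num/2) / num // rounded to the nearest millisecond
	return fmt.Sprintf("%02d:%02d:%02d.%03d", ms/3600000, ms/60000%60, ms/1000%60, ms%1000)
}

// timeToFrame returns the index of the frame on screen at ms milliseconds.
func timeToFrame(ms, num, den int64) int64 {
	return ms * num / (den * 1000)
}

func main() {
	fmt.Println(frameToTimecode(30, 30000, 1001))
	fmt.Println(timeToFrame(1000, 30000, 1001))
}
```

Because millisecond rounding loses precision, converting a frame to a timecode and back does not always return the same frame. For frame-accurate cutting it is probably safer to store in/out points as frame indices and treat the timecode as display only.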

Phase 2: Advanced Features (Week 3-4)

  • Smart Export System
    • Lossless vs re-encode decision logic
    • Format preservation when possible
    • Quality-aware encoding settings
    • Batch trimming operations
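
The lossless vs re-encode decision could start from a keyframe index, as in this simplified sketch. It assumes a sorted slice of keyframe frame numbers is available and only looks at the in point and the codec. A real implementation would also need to consider the out point, audio streams and container constraints. All names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// keyframeAtOrBefore returns the last keyframe at or before frame.
// keyframes must be sorted ascending; ok is false if no such keyframe exists.
func keyframeAtOrBefore(keyframes []int, frame int) (kf int, ok bool) {
	i := sort.SearchInts(keyframes, frame+1) // index of first keyframe > frame
	if i == 0 {
		return 0, false
	}
	return keyframes[i-1], true
}

// canStreamCopy reports whether a cut starting at inFrame can skip re-encoding:
// the in point must land on a keyframe and the codec must stay the same.
func canStreamCopy(keyframes []int, inFrame int, srcCodec, dstCodec string) bool {
	kf, ok := keyframeAtOrBefore(keyframes, inFrame)
	return ok && kf == inFrame && srcCodec == dstCodec
}

func main() {
	keyframes := []int{0, 48, 96, 144} // illustrative index, not real data
	for _, in := range []int{48, 50} {
		fmt.Printf("in=%d stream copy=%v\n", in, canStreamCopy(keyframes, in, "h264", "h264"))
	}
}
```

The same `keyframeAtOrBefore` lookup could serve keyframe-aware seeking in the player: decode from the returned keyframe and discard frames until the requested one is reached.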

DOCUMENTATION UPDATES

  • Create PLAYER_MODULE.md - Comprehensive player architecture documentation
  • Update MODULES.md - Player and enhancement integration details
  • Update ROADMAP.md - Player-first development strategy
  • Create enhancement integration guide - How modules work together
  • API documentation - Player interface for module developers

Future Enhancements (dev23+)

AI Model Expansion

  • Diffusion-based models - SeedVR2, SVFR integration
  • Advanced restoration - Scratch repair, dust removal, color fading
  • Face enhancement - GFPGAN integration for portrait content
  • Specialized models - Content-specific models (sports, archival, etc.)

Professional Features

  • Batch enhancement queue - Process multiple videos with enhancement pipeline
  • Hardware optimization - Multi-GPU support, memory management
  • Export system - Professional format support (ProRes, DNxHD, etc.)
  • Plugin architecture - Extensible system for community contributions

Integration Improvements

  • Module communication - Seamless data flow between modules
  • Unified settings - Shared configuration across modules
  • Performance monitoring - Resource usage tracking and optimization
  • Cross-platform testing - Linux, Windows, macOS parity

Technical Debt Addressed

Player Architecture

  • Identified root causes of instability
  • Planned Go-based unified solution
  • Hardware acceleration strategy defined
  • Frame-accurate seeking approach designed

Enhancement Strategy

  • Open-source model ecosystem researched
  • Scalable architecture designed
  • Content-aware processing planned
  • Future-proof model integration system

Notes

  • Player stability is a BLOCKER: Cannot proceed with enhancement features until player is stable
  • Go implementation preferred: Maintains single codebase, excellent testing ecosystem
  • Open-source focus: No commercial dependencies, community-driven model ecosystem
  • Modular design: Each enhancement system can be developed and tested independently