VideoTools TODO (v0.1.0-dev22+ plan)
This file tracks upcoming features, improvements, and known issues.
Documentation: Fix Structural Errors
Priority: High
- Audit All Docs for Broken Links:
  - Systematically check all 46 .md files for internal links that point to non-existent files or sections.
  - Create placeholder stubs for missing documents that are essential (e.g., CONTRIBUTING.md) or remove the links if they are not.
  - This ensures a professional and navigable documentation experience.
Critical Priority: dev22
VIDEO PLAYER IMPLEMENTATION
CRITICAL BLOCKER: All advanced features (enhancement, trim, advanced filters) depend on a stable player foundation.
Current Player Issues (from PLAYER_PERFORMANCE_ISSUES.md):
- Separate A/V Processes (lines 10184-10185 in main.go)
  - Video and audio run in completely separate FFmpeg processes
  - No synchronization mechanism between them
  - They will inevitably drift apart, causing A/V desync and stuttering
  - FIX: Implement unified FFmpeg process with multiplexed output
- Audio Buffer Too Small (lines 8960, 9274 in main.go)
  - Currently 8192 samples = 170ms buffer
  - Modern systems need 100-200ms buffers for smooth playback
  - FIX: Increase to 16384-32768 samples (340-680ms)
- Volume Processing in Hot Path (lines 9294-9318 in main.go)
  - Processes volume on EVERY audio sample in real-time
  - CPU-intensive and blocks audio read loop
  - FIX: Move volume processing to FFmpeg filters
- Video Frame Pacing Issues (lines 9200-9203 in main.go)
  - time.Sleep() is not precise, so timing errors accumulate
  - No correction mechanism if we fall behind
  - FIX: Implement adaptive timing with drift correction
- UI Thread Blocking (lines 9207-9215 in main.go)
  - Frame updates queue up if UI thread is busy
  - No frame dropping mechanism
  - FIX: Implement proper frame buffer management
- No Frame-Accurate Seeking (lines 10018-10028 in main.go)
  - Seeking kills and restarts both FFmpeg processes
  - 100-500ms gap during seek operations
  - No keyframe awareness
  - FIX: Implement frame-level seeking without process restart
Player Implementation Plan:
Phase 1: Foundation (Week 1-2)
- Unified FFmpeg Architecture
  - Single process with multiplexed A/V output using pipes
  - Master clock reference for synchronization
  - PTS-based drift correction mechanisms
  - Ring buffers for audio and video
- Hardware Acceleration Integration
  - Auto-detect available backends (CUDA, VA-API, VideoToolbox)
  - FFmpeg hardware acceleration through native flags
  - Fallback to software decoding when hardware is unavailable
- Frame Extraction System
  - Frame extraction without restarting playback
  - Keyframe detection and indexing
  - Frame buffer pooling to reduce GC pressure
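One possible shape for the backend auto-detection and software fallback in Phase 1 is sketched below. It parses the text printed by `ffmpeg -hwaccels` and picks the first preferred method that is listed. The function names and the preference order are assumptions made for illustration, not existing VideoTools code.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"strings"
)

// parseHWAccels extracts method names from the output of
// `ffmpeg -hide_banner -hwaccels`: a header line followed by one method per line.
func parseHWAccels(out string) map[string]bool {
	methods := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasSuffix(line, ":") {
			continue // skip blank lines and the "Hardware acceleration methods:" header
		}
		methods[line] = true
	}
	return methods
}

// pickBackend returns the first preferred method that is available,
// or "" to signal that software decoding should be used.
func pickBackend(available map[string]bool, preferred []string) string {
	for _, p := range preferred {
		if available[p] {
			return p
		}
	}
	return ""
}

func main() {
	out, err := exec.Command("ffmpeg", "-hide_banner", "-hwaccels").Output()
	if err != nil {
		fmt.Println("ffmpeg not usable, falling back to software decoding:", err)
		return
	}
	// The preference order here is an assumption for illustration.
	backend := pickBackend(parseHWAccels(string(out)), []string{"cuda", "vaapi", "videotoolbox"})
	if backend == "" {
		fmt.Println("no preferred hardware backend listed, using software decoding")
		return
	}
	fmt.Println("would pass to ffmpeg: -hwaccel", backend)
}
```

A method appearing in this list only means the FFmpeg build includes it. A real implementation would likely still need a runtime probe and a fallback path for when the device itself is missing or fails to initialize.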
Phase 2: Core Features (Week 3-4)
- Frame-Accurate Seeking
  - Seek to specific frames without restarts
  - Keyframe-aware seeking for performance
  - Frame extraction at seek points for preview
- Chapter System Integration
  - Port scene detection from Author module
  - Manual chapter support with keyframing
  - Chapter navigation (next/previous)
  - Chapter display in UI
- Performance Optimization
  - Adaptive frame timing with drift correction
  - Frame dropping when UI thread can't keep up
  - Memory pool management for frame buffers
  - CPU usage optimization
Phase 3: Advanced Features (Week 5-6)
- Preview System
  - Real-time frame extraction
  - Thumbnail generation from keyframes
  - Frame buffer caching for previews
- Error Recovery
  - Graceful failure handling
  - Resume capability after crashes
  - Smart fallback mechanisms
ENHANCEMENT MODULE FOUNDATION
DEPENDS ON PLAYER COMPLETION
Current State:
- Basic filters module with color correction, sharpening, transforms
- Stylistic effects (8mm, 16mm, B&W Film, Silent Film, VHS, Webcam)
- AI upscaling with Real-ESRGAN integration
- Basic AI model management
- No content-aware processing
- No multi-pass enhancement pipeline
- No before/after preview system
Enhancement Module Plan:
Phase 1: Architecture (Week 1-2 - POST PLAYER)
- Model Registry System
  - Abstract AI model interface for easy extension
  - Dynamic model discovery and registration
  - Model requirements validation
  - Configuration management for different model types
- Content Detection Pipeline
  - Automatic content type detection (general/anime/film)
  - Quality assessment algorithms
  - Progressive vs interlaced detection
  - Artifact analysis (compression noise, film grain)
- Unified Enhancement Workflow
  - Combine Filters + Upscale into single module
  - Content-aware model selection logic
  - Multi-pass processing framework
  - Quality preservation controls
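As a rough illustration of the model registry and content-aware selection ideas, the sketch below defines a small model interface, a registry that rejects duplicate names, and a lookup by content type. All type names, method names, and model names here are hypothetical; the real interface would also need to cover requirements validation and configuration.

```go
package main

import (
	"fmt"
	"sort"
)

// ContentType mirrors the categories named in the plan (general/anime/film).
type ContentType string

const (
	General ContentType = "general"
	Anime   ContentType = "anime"
	Film    ContentType = "film"
)

// Model is a hypothetical minimal interface for an enhancement model.
type Model interface {
	Name() string
	Supports(ct ContentType) bool
}

// Registry maps model names to registered models.
type Registry struct{ models map[string]Model }

func NewRegistry() *Registry { return &Registry{models: make(map[string]Model)} }

// Register adds a model and rejects duplicate names.
func (r *Registry) Register(m Model) error {
	if _, dup := r.models[m.Name()]; dup {
		return fmt.Errorf("model %q already registered", m.Name())
	}
	r.models[m.Name()] = m
	return nil
}

// Select returns the names of all models that support ct, sorted for stable output.
func (r *Registry) Select(ct ContentType) []string {
	var names []string
	for name, m := range r.models {
		if m.Supports(ct) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// staticModel is a toy implementation with a fixed list of supported types.
type staticModel struct {
	name  string
	types []ContentType
}

func (s staticModel) Name() string { return s.name }

func (s staticModel) Supports(ct ContentType) bool {
	for _, t := range s.types {
		if t == ct {
			return true
		}
	}
	return false
}

func main() {
	r := NewRegistry()
	// Placeholder names and capabilities, not real model metadata.
	for _, m := range []Model{
		staticModel{name: "upscaler-general", types: []ContentType{General, Film}},
		staticModel{name: "upscaler-anime", types: []ContentType{Anime}},
	} {
		if err := r.Register(m); err != nil {
			fmt.Println("register failed:", err)
		}
	}
	fmt.Println("anime candidates:", r.Select(Anime))
}
```

Keeping selection as a query over registered capabilities means a new model only has to implement the interface and register itself; the selection logic does not need to change.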
Phase 2: Model Integration (Week 3-4)
- Open-Source AI Model Expansion
  - BasicVSR integration (video-specific super-resolution)
  - RIFE models for frame interpolation
  - Real-CUGAN for anime/cartoon enhancement
  - Model selection based on content type
- Advanced Processing Features
  - Sequential model application capabilities
  - Custom enhancement pipeline creation
  - Parameter fine-tuning for different models
  - Quality vs Speed presets
TRIM MODULE ENHANCEMENT
DEPENDS ON PLAYER COMPLETION
Current State:
- Basic planning completed
- No timeline interface
- No frame-accurate cutting
- No chapter integration from Author module
Trim Module Plan:
Phase 1: Foundation (Week 1-2 - POST PLAYER)
- Timeline Interface
  - Frame-accurate timeline visualization
  - Zoom capabilities for precise editing
  - Scrubbing with real-time preview
  - Time/frame dual display modes
- Chapter Integration
  - Import scene detection from Author module
  - Manual chapter marker creation
  - Chapter navigation controls
  - Visual chapter markers on timeline
- Frame-Accurate Cutting
  - Exact frame selection for in/out points
  - Preview before/after trim points
  - Multiple segment trimming support
Phase 2: Advanced Features (Week 3-4)
- Smart Export System
  - Lossless vs re-encode decision logic
  - Format preservation when possible
  - Quality-aware encoding settings
  - Batch trimming operations
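A very simplified version of the lossless vs re-encode decision is sketched below. It assumes a sorted keyframe index already exists and applies one rule: a cut can be stream-copied only if its in-point lands on a keyframe. Real codecs add further constraints (B-frames, open GOPs, audio alignment), so this is a starting point rather than the intended logic, and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// isKeyframe reports whether frame appears in the sorted keyframe index.
func isKeyframe(keyframes []int, frame int) bool {
	i := sort.SearchInts(keyframes, frame)
	return i < len(keyframes) && keyframes[i] == frame
}

// prevKeyframe returns the last keyframe at or before frame, or -1 if none.
func prevKeyframe(keyframes []int, frame int) int {
	i := sort.SearchInts(keyframes, frame+1) // first index whose value is > frame
	if i == 0 {
		return -1
	}
	return keyframes[i-1]
}

// exportMode applies the simplified rule: stream copy when the in-point is a
// keyframe, otherwise re-encode.
func exportMode(keyframes []int, inPoint int) string {
	if isKeyframe(keyframes, inPoint) {
		return "copy"
	}
	return "re-encode"
}

func main() {
	keyframes := []int{0, 48, 96, 144} // hypothetical index, one keyframe every 48 frames
	for _, in := range []int{48, 50} {
		fmt.Printf("in-point %d: %s (nearest earlier keyframe: %d)\n",
			in, exportMode(keyframes, in), prevKeyframe(keyframes, in))
	}
}
```

The same `prevKeyframe` lookup could also serve keyframe-aware seeking: jump to the nearest earlier keyframe, then decode forward to the requested frame.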
DOCUMENTATION UPDATES
- Create PLAYER_MODULE.md - Comprehensive player architecture documentation
- Update MODULES.md - Player and enhancement integration details
- Update ROADMAP.md - Player-first development strategy
- Create enhancement integration guide - How modules work together
- API documentation - Player interface for module developers
Future Enhancements (dev23+)
AI Model Expansion
- Diffusion-based models - SeedVR2, SVFR integration
- Advanced restoration - Scratch repair, dust removal, color fading
- Face enhancement - GFPGAN integration for portrait content
- Specialized models - Content-specific models (sports, archival, etc.)
Professional Features
- Batch enhancement queue - Process multiple videos with enhancement pipeline
- Hardware optimization - Multi-GPU support, memory management
- Export system - Professional format support (ProRes, DNxHD, etc.)
- Plugin architecture - Extensible system for community contributions
Integration Improvements
- Module communication - Seamless data flow between modules
- Unified settings - Shared configuration across modules
- Performance monitoring - Resource usage tracking and optimization
- Cross-platform testing - Linux, Windows, macOS parity
Technical Debt Addressed
Player Architecture
- Identified root causes of instability
- Planned Go-based unified solution
- Hardware acceleration strategy defined
- Frame-accurate seeking approach designed
Enhancement Strategy
- Open-source model ecosystem researched
- Scalable architecture designed
- Content-aware processing planned
- Future-proof model integration system
Notes
- Player stability is a BLOCKER: Cannot proceed with enhancement features until player is stable
- Go implementation preferred: Maintains a single codebase, excellent testing ecosystem
- Open-source focus: No commercial dependencies, community-driven model ecosystem
- Modular design: Each enhancement system can be developed and tested independently