VideoTools TODO (v0.1.0-dev22+ plan)
This file tracks upcoming features, improvements, and known issues.
Documentation Alignment
Priority: High
- Audit and Tag Planned Features:
- Go through all .md files in the docs/ directory and the root.
- For any feature that is described but not yet implemented, add a clear and consistent marker (e.g., [PLANNED]).
- This will help manage user expectations and provide a more honest representation of the project's capabilities.
Critical Priority: dev22
VIDEO PLAYER IMPLEMENTATION
CRITICAL BLOCKER: All advanced features (enhancement, trim, advanced filters) depend on stable player foundation.
Current Player Issues (from PLAYER_PERFORMANCE_ISSUES.md):
- Separate A/V Processes (lines 10184-10185 in main.go)
- Video and audio run in completely separate FFmpeg processes
- No synchronization mechanism between them
- They will inevitably drift apart, causing A/V desync and stuttering
- FIX: Implement unified FFmpeg process with multiplexed output
- Audio Buffer Too Small (lines 8960, 9274 in main.go)
- Currently 8192 samples ≈ 170 ms of buffered audio at 48 kHz
- Modern systems need 100-200ms buffers for smooth playback
- FIX: Increase to 16384-32768 samples (340-680ms)
- Volume Processing in Hot Path (lines 9294-9318 in main.go)
- Processes volume on EVERY audio sample in real-time
- CPU-intensive and blocks audio read loop
- FIX: Move volume processing to FFmpeg filters
- Video Frame Pacing Issues (lines 9200-9203 in main.go)
- time.Sleep() is not precise, so per-frame timing errors accumulate
- No correction mechanism if we fall behind
- FIX: Implement adaptive timing with drift correction
- UI Thread Blocking (lines 9207-9215 in main.go)
- Frame updates queue up if UI thread is busy
- No frame dropping mechanism
- FIX: Implement proper frame buffer management
- No Frame-Accurate Seeking (lines 10018-10028 in main.go)
- Seeking kills and restarts both FFmpeg processes
- 100-500ms gap during seek operations
- No keyframe awareness
- FIX: Implement frame-level seeking without process restart
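The buffer figures under "Audio Buffer Too Small" can be checked with a few lines of arithmetic. This is a minimal sketch: `BufferDuration` is a hypothetical helper, and the 48 kHz sample rate is an assumption inferred from the quoted numbers, not something stated in main.go.

```go
package main

import (
	"fmt"
	"time"
)

// BufferDuration returns how much audio a buffer holds.
// frames is the number of sample frames (one sample per channel) and
// sampleRate is in Hz. Both names are illustrative, not taken from main.go.
func BufferDuration(frames, sampleRate int) time.Duration {
	return time.Duration(frames) * time.Second / time.Duration(sampleRate)
}

func main() {
	const sampleRate = 48000 // assumed; the TODO does not state the rate
	for _, frames := range []int{8192, 16384, 32768} {
		d := BufferDuration(frames, sampleRate).Round(time.Millisecond)
		fmt.Printf("%5d frames at %d Hz = %v\n", frames, sampleRate, d)
	}
}
```

At 48 kHz this gives roughly 171 ms, 341 ms and 683 ms, in line with the figures above. If the 8192 count refers to interleaved stereo samples rather than frames, each duration halves.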
Player Implementation Plan:
Phase 1: Foundation (Week 1-2)
- Unified FFmpeg Architecture
- Single process with multiplexed A/V output using pipes
- Master clock reference for synchronization
- PTS-based drift correction mechanisms
- Ring buffers for audio and video
- Hardware Acceleration Integration
- Auto-detect available backends (CUDA, VA-API, VideoToolbox)
- FFmpeg hardware acceleration through native flags
- Fallback to software decoding when hardware acceleration is unavailable
- Frame Extraction System
- Frame extraction without restarting playback
- Keyframe detection and indexing
- Frame buffer pooling to reduce GC pressure
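One possible shape for the keyframe index mentioned above is a sorted slice of timestamps with a binary search. This is only a sketch: the `KeyframeIndex` type, its millisecond unit, and the `Before` method are hypothetical, and how the timestamps get collected from FFmpeg is left open.

```go
package main

import (
	"fmt"
	"sort"
)

// KeyframeIndex holds keyframe presentation timestamps in ascending order.
// The type and its unit (milliseconds) are assumptions for illustration.
type KeyframeIndex []int64

// Before returns the timestamp of the last keyframe at or before target.
// It reports false when target precedes the first keyframe.
func (k KeyframeIndex) Before(target int64) (int64, bool) {
	// sort.Search finds the first keyframe strictly after target.
	i := sort.Search(len(k), func(i int) bool { return k[i] > target })
	if i == 0 {
		return 0, false
	}
	return k[i-1], true
}

func main() {
	idx := KeyframeIndex{0, 2000, 4000, 6000}
	kf, ok := idx.Before(4500)
	fmt.Println(kf, ok) // 4000 true
}
```

A seek would decode from the returned keyframe and discard frames until the target is reached, which is what makes the index useful for frame-accurate seeking.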
Phase 2: Core Features (Week 3-4)
- Frame-Accurate Seeking
- Seek to specific frames without restarts
- Keyframe-aware seeking for performance
- Frame extraction at seek points for preview
- Chapter System Integration
- Port scene detection from Author module
- Manual chapter support with keyframing
- Chapter navigation (next/previous)
- Chapter display in UI
- Performance Optimization
- Adaptive frame timing with drift correction
- Frame dropping when UI thread can't keep up
- Memory pool management for frame buffers
- CPU usage optimization
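Adaptive frame timing and frame dropping can be sketched by scheduling each frame against an absolute deadline instead of sleeping a fixed interval per frame, so sleep inaccuracy does not accumulate. The `Pacer` type, its fields, and the drop threshold below are hypothetical and do not describe the current main.go code.

```go
package main

import (
	"fmt"
	"time"
)

// Pacer schedules frames against absolute deadlines derived from a start
// time. All names here are hypothetical.
type Pacer struct {
	Start     time.Time
	FrameDur  time.Duration
	DropAfter time.Duration // lateness beyond which a frame is dropped
}

// Next reports how long to wait before showing frame n, or drop=true when
// the frame is already too late to be worth rendering.
func (p Pacer) Next(n int, now time.Time) (wait time.Duration, drop bool) {
	deadline := p.Start.Add(time.Duration(n) * p.FrameDur)
	late := now.Sub(deadline)
	if late > p.DropAfter {
		return 0, true
	}
	if late >= 0 {
		return 0, false // slightly late: show immediately, no sleep
	}
	return -late, false
}

func main() {
	start := time.Now()
	p := Pacer{Start: start, FrameDur: 40 * time.Millisecond, DropAfter: 40 * time.Millisecond}
	// Simulated clock readings keep the demo deterministic.
	fmt.Println(p.Next(1, start.Add(10*time.Millisecond)))  // early: wait 30ms
	fmt.Println(p.Next(2, start.Add(95*time.Millisecond)))  // 15ms late: show now
	fmt.Println(p.Next(3, start.Add(200*time.Millisecond))) // 80ms late: drop
}
```

Because the deadline for frame n is always computed from `Start`, a late frame shortens the next wait rather than shifting every later frame. A real player would more likely derive deadlines from stream PTS values and the master clock than from a frame counter.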
Phase 3: Advanced Features (Week 5-6)
- Preview System
- Real-time frame extraction
- Thumbnail generation from keyframes
- Frame buffer caching for previews
- Error Recovery
- Graceful failure handling
- Resume capability after crashes
- Smart fallback mechanisms
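Frame buffer caching for previews could be as simple as a bounded least-recently-used cache keyed by frame number. The sketch below assumes thumbnails are stored as encoded bytes; `PreviewCache` and its methods are hypothetical names, and the right capacity would need measuring.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry pairs a frame number with its thumbnail bytes.
type entry struct {
	frame int
	data  []byte
}

// PreviewCache is a small LRU cache keyed by frame number (hypothetical type).
type PreviewCache struct {
	capacity int
	order    *list.List            // front = most recently used
	items    map[int]*list.Element // frame number -> element in order
}

// NewPreviewCache creates a cache holding at least one entry.
func NewPreviewCache(capacity int) *PreviewCache {
	if capacity < 1 {
		capacity = 1
	}
	return &PreviewCache{capacity: capacity, order: list.New(), items: make(map[int]*list.Element)}
}

// Get returns the cached thumbnail and marks it as recently used.
func (c *PreviewCache) Get(frame int) ([]byte, bool) {
	el, ok := c.items[frame]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// Put stores a thumbnail, evicting the least recently used one when full.
func (c *PreviewCache) Put(frame int, data []byte) {
	if el, ok := c.items[frame]; ok {
		el.Value.(*entry).data = data
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).frame)
	}
	c.items[frame] = c.order.PushFront(&entry{frame: frame, data: data})
}

func main() {
	c := NewPreviewCache(2)
	c.Put(10, []byte("a"))
	c.Put(20, []byte("b"))
	c.Get(10)              // frame 10 becomes most recently used
	c.Put(30, []byte("c")) // evicts frame 20
	_, ok := c.Get(20)
	fmt.Println(ok) // false
}
```

This version is not safe for concurrent use; a player with separate decode and UI goroutines would need a mutex around `Get` and `Put`.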
ENHANCEMENT MODULE FOUNDATION
DEPENDS ON PLAYER COMPLETION
Current State:
- Basic filters module with color correction, sharpening, transforms
- Stylistic effects (8mm, 16mm, B&W Film, Silent Film, VHS, Webcam)
- AI upscaling with Real-ESRGAN integration
- Basic AI model management
- No content-aware processing
- No multi-pass enhancement pipeline
- No before/after preview system
Enhancement Module Plan:
Phase 1: Architecture (Week 1-2 - POST PLAYER)
- Model Registry System
- Abstract AI model interface for easy extension
- Dynamic model discovery and registration
- Model requirements validation
- Configuration management for different model types
- Content Detection Pipeline
- Automatic content type detection (general/anime/film)
- Quality assessment algorithms
- Progressive vs interlaced detection
- Artifact analysis (compression noise, film grain)
- Unified Enhancement Workflow
- Combine Filters + Upscale into single module
- Content-aware model selection logic
- Multi-pass processing framework
- Quality preservation controls
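To make the registry and content-aware selection ideas concrete, here is one possible shape for the abstract model interface. Everything in it is an assumption for discussion: the `Model` interface, `Registry`, `ContentType` values, and the `staticModel` stand-in are hypothetical, and the model-to-content mapping in `main` is only an example.

```go
package main

import (
	"fmt"
	"sort"
)

// ContentType is a hypothetical label produced by content detection.
type ContentType string

const (
	General ContentType = "general"
	Anime   ContentType = "anime"
	Film    ContentType = "film"
)

// Model is a hypothetical minimal interface for an enhancement model.
type Model interface {
	Name() string
	Supports(ContentType) bool
}

// Registry maps model names to implementations.
type Registry struct{ models map[string]Model }

// NewRegistry returns an empty registry.
func NewRegistry() *Registry { return &Registry{models: make(map[string]Model)} }

// Register adds a model and rejects duplicate names.
func (r *Registry) Register(m Model) error {
	if _, dup := r.models[m.Name()]; dup {
		return fmt.Errorf("model %q already registered", m.Name())
	}
	r.models[m.Name()] = m
	return nil
}

// For returns the sorted names of models that support the content type.
func (r *Registry) For(ct ContentType) []string {
	var names []string
	for name, m := range r.models {
		if m.Supports(ct) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// staticModel is a stand-in; a real model would wrap an inference backend.
type staticModel struct {
	name  string
	types []ContentType
}

func (s staticModel) Name() string { return s.name }

func (s staticModel) Supports(ct ContentType) bool {
	for _, t := range s.types {
		if t == ct {
			return true
		}
	}
	return false
}

func main() {
	r := NewRegistry()
	r.Register(staticModel{"real-esrgan", []ContentType{General, Film}})
	r.Register(staticModel{"real-cugan", []ContentType{Anime}})
	fmt.Println(r.For(Anime)) // [real-cugan]
}
```

Keeping the interface this small leaves requirements validation and per-model configuration as separate concerns that can be layered on later.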
Phase 2: Model Integration (Week 3-4)
- Open-Source AI Model Expansion
- BasicVSR integration (video-specific super-resolution)
- RIFE models for frame interpolation
- Real-CUGAN for anime/cartoon enhancement
- Model selection based on content type
- Advanced Processing Features
- Sequential model application capabilities
- Custom enhancement pipeline creation
- Parameter fine-tuning for different models
- Quality vs Speed presets
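Sequential model application might be expressed as an ordered list of stages, each consuming the previous stage's output. The sketch below tracks only frame dimensions to stay short; `Frame`, `Stage`, `Run`, and `Upscale` are hypothetical names, and real stages would call into a model backend.

```go
package main

import (
	"errors"
	"fmt"
)

// Frame is a placeholder for decoded image data; only dimensions are tracked.
type Frame struct{ Width, Height int }

// Stage is one hypothetical enhancement step.
type Stage struct {
	Name  string
	Apply func(Frame) (Frame, error)
}

// Run applies stages in order and stops at the first failure, returning the
// last good frame and an error that names the failing stage.
func Run(f Frame, stages ...Stage) (Frame, error) {
	for _, s := range stages {
		out, err := s.Apply(f)
		if err != nil {
			return f, fmt.Errorf("stage %s: %w", s.Name, err)
		}
		f = out
	}
	return f, nil
}

// Upscale returns a stage that multiplies both dimensions by factor.
func Upscale(factor int) Stage {
	return Stage{
		Name: fmt.Sprintf("upscale-x%d", factor),
		Apply: func(f Frame) (Frame, error) {
			if factor < 1 {
				return f, errors.New("factor must be >= 1")
			}
			return Frame{f.Width * factor, f.Height * factor}, nil
		},
	}
}

func main() {
	out, err := Run(Frame{640, 360}, Upscale(2), Upscale(2))
	fmt.Println(out, err) // {2560 1440} <nil>
}
```

A custom pipeline is then just a different stage list, and a Quality vs Speed preset could map to a predefined list.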
TRIM MODULE ENHANCEMENT
DEPENDS ON PLAYER COMPLETION
Current State:
- Basic planning completed
- No timeline interface
- No frame-accurate cutting
- No chapter integration from Author module
Trim Module Plan:
Phase 1: Foundation (Week 1-2 - POST PLAYER)
- Timeline Interface
- Frame-accurate timeline visualization
- Zoom capabilities for precise editing
- Scrubbing with real-time preview
- Time/frame dual display modes
- Chapter Integration
- Import scene detection from Author module
- Manual chapter marker creation
- Chapter navigation controls
- Visual chapter markers on timeline
- Frame-Accurate Cutting
- Exact frame selection for in/out points
- Preview before/after trim points
- Multiple segment trimming support
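The time/frame dual display comes down to converting between frame numbers and timecodes. A minimal sketch, assuming an integer frame rate: `Timecode` and `FrameAt` are hypothetical helpers, and fractional rates such as 29.97 fps would need drop-frame handling that is omitted here.

```go
package main

import "fmt"

// Timecode formats a zero-based frame number as HH:MM:SS:FF for an integer
// frame rate. Drop-frame timecode is deliberately not handled.
func Timecode(frame, fps int) string {
	ff := frame % fps
	secs := frame / fps
	return fmt.Sprintf("%02d:%02d:%02d:%02d", secs/3600, secs/60%60, secs%60, ff)
}

// FrameAt returns the zero-based frame containing the given time in
// milliseconds, again assuming an integer frame rate.
func FrameAt(ms int64, fps int) int {
	return int(ms * int64(fps) / 1000)
}

func main() {
	fmt.Println(Timecode(1512, 25)) // 00:01:00:12
	fmt.Println(FrameAt(60000, 25)) // 1500
}
```

Storing in/out points as frame numbers and formatting them only for display avoids rounding drift when the user switches between the two modes.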
Phase 2: Advanced Features (Week 3-4)
- Smart Export System
- Lossless vs re-encode decision logic
- Format preservation when possible
- Quality-aware encoding settings
- Batch trimming operations
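The lossless vs re-encode decision could start from a few simple rules: stream copy is only clean when the cut begins on a keyframe and nothing requires decoding. The sketch below encodes that rule of thumb; the `Segment` fields, `Mode` values, and `Decide` function are hypothetical, and a real implementation would likely consider container and audio constraints as well.

```go
package main

import "fmt"

// Segment describes one trim request. All fields are hypothetical.
type Segment struct {
	StartsOnKeyframe bool   // in-point lands exactly on a keyframe
	SourceCodec      string // codec of the source video stream
	TargetCodec      string // requested output codec ("" keeps the source codec)
	HasFilters       bool   // any filter forces a decode/encode cycle
}

// Mode is the export strategy chosen for a segment.
type Mode string

const (
	StreamCopy Mode = "copy"
	Reencode   Mode = "re-encode"
)

// Decide picks stream copy only when no condition forces re-encoding.
func Decide(s Segment) Mode {
	if s.HasFilters {
		return Reencode
	}
	if s.TargetCodec != "" && s.TargetCodec != s.SourceCodec {
		return Reencode
	}
	if !s.StartsOnKeyframe {
		return Reencode
	}
	return StreamCopy
}

func main() {
	fmt.Println(Decide(Segment{StartsOnKeyframe: true, SourceCodec: "h264"}))  // copy
	fmt.Println(Decide(Segment{StartsOnKeyframe: false, SourceCodec: "h264"})) // re-encode
}
```

A refinement worth considering is re-encoding only the span up to the next keyframe and stream-copying the remainder, which keeps most of the output lossless.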
DOCUMENTATION UPDATES
- Create PLAYER_MODULE.md - Comprehensive player architecture documentation
- Update MODULES.md - Player and enhancement integration details
- Update ROADMAP.md - Player-first development strategy
- Create enhancement integration guide - How modules work together
- API documentation - Player interface for module developers
Future Enhancements (dev23+)
AI Model Expansion
- Diffusion-based models - SeedVR2, SVFR integration
- Advanced restoration - Scratch repair, dust removal, color fading
- Face enhancement - GFPGAN integration for portrait content
- Specialized models - Content-specific models (sports, archival, etc.)
Professional Features
- Batch enhancement queue - Process multiple videos with enhancement pipeline
- Hardware optimization - Multi-GPU support, memory management
- Export system - Professional format support (ProRes, DNxHD, etc.)
- Plugin architecture - Extensible system for community contributions
Integration Improvements
- Module communication - Seamless data flow between modules
- Unified settings - Shared configuration across modules
- Performance monitoring - Resource usage tracking and optimization
- Cross-platform testing - Linux, Windows, macOS parity
Technical Debt Addressed
Player Architecture
- Identified root causes of instability
- Planned Go-based unified solution
- Hardware acceleration strategy defined
- Frame-accurate seeking approach designed
Enhancement Strategy
- Open-source model ecosystem researched
- Scalable architecture designed
- Content-aware processing planned
- Future-proof model integration system
Notes
- Player stability is a BLOCKER: Cannot proceed with enhancement features until player is stable
- Go implementation preferred: Maintains single codebase, excellent testing ecosystem
- Open-source focus: No commercial dependencies, community-driven model ecosystem
- Modular design: Each enhancement system can be developed and tested independently