# VideoTools TODO (v0.1.0-dev22+ plan)

This file tracks upcoming features, improvements, and known issues.

## Documentation: Fix Structural Errors

**Priority:** High

- [ ] **Audit All Docs for Broken Links:**
  - Systematically check all 46 `.md` files for internal links that point to non-existent files or sections.
  - Create placeholder stubs for missing documents that are essential (e.g., `CONTRIBUTING.md`) or remove the links if they are not.
  - This ensures a professional and navigable documentation experience.

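A minimal sketch of how the link audit above could be automated. It only checks that linked files exist (section anchors are not validated), and the `docs/*.md` glob and all function names are assumptions for illustration, not existing project code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
	"strings"
)

// linkRe matches inline Markdown links such as [text](target).
var linkRe = regexp.MustCompile(`\[[^\]]*\]\(([^)\s]+)\)`)

// localTargets returns the relative file targets linked from a Markdown
// document, skipping external URLs and pure in-page anchors.
func localTargets(markdown string) []string {
	var targets []string
	for _, m := range linkRe.FindAllStringSubmatch(markdown, -1) {
		target := m[1]
		if strings.Contains(target, "://") || strings.HasPrefix(target, "mailto:") || strings.HasPrefix(target, "#") {
			continue
		}
		// Drop a trailing "#section" so only the file path is checked.
		if i := strings.Index(target, "#"); i >= 0 {
			target = target[:i]
		}
		targets = append(targets, target)
	}
	return targets
}

// brokenLinks reports link targets in the file at path that do not exist on disk.
func brokenLinks(path string) ([]string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var broken []string
	for _, t := range localTargets(string(data)) {
		if _, err := os.Stat(filepath.Join(filepath.Dir(path), t)); err != nil {
			broken = append(broken, t)
		}
	}
	return broken, nil
}

func main() {
	// Hypothetical docs directory; adjust to the repository layout.
	files, _ := filepath.Glob("docs/*.md")
	for _, f := range files {
		broken, err := brokenLinks(f)
		if err != nil {
			fmt.Println("skip:", f, err)
			continue
		}
		for _, b := range broken {
			fmt.Printf("%s: broken link to %s\n", f, b)
		}
	}
}
```
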
## Critical Priority: dev22

### VIDEO PLAYER IMPLEMENTATION

**CRITICAL BLOCKER:** All advanced features (enhancement, trim, advanced filters) depend on a stable player foundation.

#### Current Player Issues (from PLAYER_PERFORMANCE_ISSUES.md):

1. **Separate A/V Processes** (lines 10184-10185 in main.go)
   - Video and audio run in completely separate FFmpeg processes
   - No synchronization mechanism between them
   - They will inevitably drift apart, causing A/V desync and stuttering
   - **FIX:** Implement unified FFmpeg process with multiplexed output

2. **Audio Buffer Too Small** (lines 8960, 9274 in main.go)
   - Currently 8192 samples ≈ 170 ms buffer (at 48 kHz)
   - Modern systems need 100-200 ms buffers for smooth playback
   - **FIX:** Increase to 16384-32768 samples (340-680 ms)

3. **Volume Processing in Hot Path** (lines 9294-9318 in main.go)
   - Processes volume on EVERY audio sample in real time
   - CPU-intensive and blocks the audio read loop
   - **FIX:** Move volume processing to FFmpeg filters

4. **Video Frame Pacing Issues** (lines 9200-9203 in main.go)
   - `time.Sleep()` is imprecise, so timing errors accumulate
   - No correction mechanism if we fall behind
   - **FIX:** Implement adaptive timing with drift correction

5. **UI Thread Blocking** (lines 9207-9215 in main.go)
   - Frame updates queue up if the UI thread is busy
   - No frame dropping mechanism
   - **FIX:** Implement proper frame buffer management

6. **No Frame-Accurate Seeking** (lines 10018-10028 in main.go)
   - Seeking kills and restarts both FFmpeg processes
   - 100-500 ms gap during seek operations
   - No keyframe awareness
   - **FIX:** Implement frame-level seeking without process restart

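The buffer figures in issue 2 follow from simple arithmetic: duration = samples / sample rate. The sketch below reproduces them, assuming a 48 kHz sample rate (the rate is not stated above, but it is the one that matches the quoted numbers); `bufferDuration` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// bufferDuration returns how much audio a buffer of the given number of
// sample frames holds at the given sample rate.
func bufferDuration(frames, sampleRate int) time.Duration {
	return time.Duration(frames) * time.Second / time.Duration(sampleRate)
}

func main() {
	const sampleRate = 48000 // assumed; the TODO does not state the rate
	for _, n := range []int{8192, 16384, 32768} {
		fmt.Printf("%5d samples = %v\n", n, bufferDuration(n, sampleRate).Round(time.Millisecond))
	}
}
```
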
#### Player Implementation Plan:

**Phase 1: Foundation (Week 1-2)**
- [ ] **Unified FFmpeg Architecture**
  - Single process with multiplexed A/V output using pipes
  - Master clock reference for synchronization
  - PTS-based drift correction mechanisms
  - Ring buffers for audio and video

- [ ] **Hardware Acceleration Integration**
  - Auto-detect available backends (CUDA, VA-API, VideoToolbox)
  - FFmpeg hardware acceleration through native flags
  - Fall back to software decoding when hardware acceleration is unavailable

- [ ] **Frame Extraction System**
  - Frame extraction without restarting playback
  - Keyframe detection and indexing
  - Frame buffer pooling to reduce GC pressure

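One possible shape for the frame buffer pooling item above, sketched with the standard library's `sync.Pool`. `FramePool` and its methods are hypothetical names, and RGB24 frames are assumed only for the usage example.

```go
package main

import (
	"fmt"
	"sync"
)

// FramePool hands out reusable byte slices sized for one raw video frame.
type FramePool struct {
	size int
	pool sync.Pool
}

// NewFramePool creates a pool for frames of width x height pixels with
// bytesPerPixel bytes each (for example 3 for RGB24).
func NewFramePool(width, height, bytesPerPixel int) *FramePool {
	fp := &FramePool{size: width * height * bytesPerPixel}
	fp.pool.New = func() any {
		buf := make([]byte, fp.size)
		return &buf
	}
	return fp
}

// Get returns a buffer holding exactly one frame; its contents are not zeroed.
func (fp *FramePool) Get() *[]byte { return fp.pool.Get().(*[]byte) }

// Put returns a buffer to the pool once the frame has been displayed.
// Buffers of the wrong size are discarded instead of being reused.
func (fp *FramePool) Put(buf *[]byte) {
	if buf != nil && len(*buf) == fp.size {
		fp.pool.Put(buf)
	}
}

func main() {
	pool := NewFramePool(1920, 1080, 3)
	buf := pool.Get()
	fmt.Println("frame bytes:", len(*buf))
	pool.Put(buf)
}
```

Storing `*[]byte` rather than `[]byte` avoids an extra allocation each time a slice header is boxed into the pool's interface value.
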
**Phase 2: Core Features (Week 3-4)**
- [ ] **Frame-Accurate Seeking**
  - Seek to specific frames without restarts
  - Keyframe-aware seeking for performance
  - Frame extraction at seek points for preview

- [ ] **Chapter System Integration**
  - Port scene detection from Author module
  - Manual chapter support with keyframing
  - Chapter navigation (next/previous)
  - Chapter display in UI

- [ ] **Performance Optimization**
  - Adaptive frame timing with drift correction
  - Frame dropping when UI thread can't keep up
  - Memory pool management for frame buffers
  - CPU usage optimization

**Phase 3: Advanced Features (Week 5-6)**
- [ ] **Preview System**
  - Real-time frame extraction
  - Thumbnail generation from keyframes
  - Frame buffer caching for previews

- [ ] **Error Recovery**
  - Graceful failure handling
  - Resume capability after crashes
  - Smart fallback mechanisms

### ENHANCEMENT MODULE FOUNDATION

**DEPENDS ON PLAYER COMPLETION**

#### Current State:
- [X] Basic filters module with color correction, sharpening, transforms
- [X] Stylistic effects (8mm, 16mm, B&W Film, Silent Film, VHS, Webcam)
- [X] AI upscaling with Real-ESRGAN integration
- [X] Basic AI model management
- [ ] No content-aware processing
- [ ] No multi-pass enhancement pipeline
- [ ] No before/after preview system

#### Enhancement Module Plan:

**Phase 1: Architecture (Week 1-2 - POST PLAYER)**
- [ ] **Model Registry System**
  - Abstract AI model interface for easy extension
  - Dynamic model discovery and registration
  - Model requirements validation
  - Configuration management for different model types

- [ ] **Content Detection Pipeline**
  - Automatic content type detection (general/anime/film)
  - Quality assessment algorithms
  - Progressive vs interlaced detection
  - Artifact analysis (compression noise, film grain)

- [ ] **Unified Enhancement Workflow**
  - Combine Filters + Upscale into single module
  - Content-aware model selection logic
  - Multi-pass processing framework
  - Quality preservation controls

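To make the Model Registry and content-aware selection items above more concrete, here is a minimal sketch. The `Model` interface, `Registry` type, and the content-type assignments in `main` are hypothetical; the real interface would likely also cover requirements validation and configuration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Model is a hypothetical interface every enhancement backend would satisfy.
type Model interface {
	Name() string
	// Supports reports whether the model suits a content type such as
	// "general", "anime", or "film".
	Supports(contentType string) bool
}

// Registry stores models by name and is safe for concurrent use.
type Registry struct {
	mu     sync.RWMutex
	models map[string]Model
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry { return &Registry{models: make(map[string]Model)} }

// Register adds a model and rejects duplicate names.
func (r *Registry) Register(m Model) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.models[m.Name()]; ok {
		return fmt.Errorf("model %q already registered", m.Name())
	}
	r.models[m.Name()] = m
	return nil
}

// For returns the sorted names of all models supporting contentType.
func (r *Registry) For(contentType string) []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	var names []string
	for name, m := range r.models {
		if m.Supports(contentType) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// staticModel is a stand-in implementation used only for this example.
type staticModel struct {
	name  string
	types []string
}

func (m staticModel) Name() string { return m.name }

func (m staticModel) Supports(ct string) bool {
	for _, t := range m.types {
		if t == ct {
			return true
		}
	}
	return false
}

func main() {
	r := NewRegistry()
	// Content-type assignments here are illustrative only.
	_ = r.Register(staticModel{name: "realesrgan", types: []string{"general", "film"}})
	_ = r.Register(staticModel{name: "realcugan", types: []string{"anime"}})
	fmt.Println(r.For("anime"))
}
```
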
**Phase 2: Model Integration (Week 3-4)**
- [ ] **Open-Source AI Model Expansion**
  - BasicVSR integration (video-specific super-resolution)
  - RIFE models for frame interpolation
  - Real-CUGAN for anime/cartoon enhancement
  - Model selection based on content type

- [ ] **Advanced Processing Features**
  - Sequential model application capabilities
  - Custom enhancement pipeline creation
  - Parameter fine-tuning for different models
  - Quality vs Speed presets

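Sequential model application, as listed above, could be expressed as an ordered list of stages where each stage consumes the previous stage's output. This is only a sketch: `Stage`, `RunPipeline`, and `tagStage` are hypothetical, and real stages would wrap model inference rather than append bytes.

```go
package main

import "fmt"

// Stage is one step of a hypothetical enhancement pipeline.
type Stage struct {
	Name  string
	Apply func(frame []byte) ([]byte, error)
}

// RunPipeline applies the stages in order and stops at the first error,
// reporting which stage failed.
func RunPipeline(stages []Stage, frame []byte) ([]byte, error) {
	for _, s := range stages {
		out, err := s.Apply(frame)
		if err != nil {
			return nil, fmt.Errorf("stage %q: %w", s.Name, err)
		}
		frame = out
	}
	return frame, nil
}

// tagStage is a stand-in for a real model: it appends its name to the data
// so the order of application is visible in the output.
func tagStage(name string) Stage {
	return Stage{Name: name, Apply: func(frame []byte) ([]byte, error) {
		return append(frame, []byte("+"+name)...), nil
	}}
}

func main() {
	pipeline := []Stage{tagStage("denoise"), tagStage("upscale"), tagStage("interpolate")}
	out, err := RunPipeline(pipeline, []byte("frame"))
	if err != nil {
		fmt.Println("pipeline failed:", err)
		return
	}
	fmt.Println(string(out))
}
```
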
### TRIM MODULE ENHANCEMENT

**DEPENDS ON PLAYER COMPLETION**

#### Current State:
- [X] Basic planning completed
- [ ] No timeline interface
- [ ] No frame-accurate cutting
- [ ] No chapter integration from Author module

#### Trim Module Plan:

**Phase 1: Foundation (Week 1-2 - POST PLAYER)**
- [ ] **Timeline Interface**
  - Frame-accurate timeline visualization
  - Zoom capabilities for precise editing
  - Scrubbing with real-time preview
  - Time/frame dual display modes

- [ ] **Chapter Integration**
  - Import scene detection from Author module
  - Manual chapter marker creation
  - Chapter navigation controls
  - Visual chapter markers on timeline

- [ ] **Frame-Accurate Cutting**
  - Exact frame selection for in/out points
  - Preview before/after trim points
  - Multiple segment trimming support

**Phase 2: Advanced Features (Week 3-4)**
- [ ] **Smart Export System**
  - Lossless vs re-encode decision logic
  - Format preservation when possible
  - Quality-aware encoding settings
  - Batch trimming operations

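One way the lossless vs re-encode decision above might look. The sketch relies on the fact that a stream copy (FFmpeg's `-c copy`) can only start cleanly on a keyframe, and it deliberately ignores out-point and container constraints. `TrimRequest`, `ExportMode`, and `ChooseExportMode` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// ExportMode says whether a trim can be done without re-encoding.
type ExportMode string

const (
	StreamCopy ExportMode = "copy"     // lossless, e.g. ffmpeg -c copy
	Reencode   ExportMode = "reencode" // required for off-keyframe cuts
)

// TrimRequest describes one cut. Keyframes must be sorted ascending.
type TrimRequest struct {
	InFrame   int64
	Keyframes []int64
	SameCodec bool // output codec and container match the source
}

// ChooseExportMode picks stream copy only when the format is preserved and
// the in-point lands exactly on a keyframe.
func ChooseExportMode(r TrimRequest) ExportMode {
	if !r.SameCodec {
		return Reencode
	}
	i := sort.Search(len(r.Keyframes), func(i int) bool { return r.Keyframes[i] >= r.InFrame })
	if i < len(r.Keyframes) && r.Keyframes[i] == r.InFrame {
		return StreamCopy
	}
	return Reencode
}

func main() {
	keyframes := []int64{0, 48, 96, 144}
	for _, in := range []int64{48, 50} {
		mode := ChooseExportMode(TrimRequest{InFrame: in, Keyframes: keyframes, SameCodec: true})
		fmt.Printf("in-point %d -> %s\n", in, mode)
	}
}
```
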
### DOCUMENTATION UPDATES

- [X] **Create PLAYER_MODULE.md** - Comprehensive player architecture documentation
- [X] **Update MODULES.md** - Player and enhancement integration details
- [X] **Update ROADMAP.md** - Player-first development strategy
- [ ] **Create enhancement integration guide** - How modules work together
- [ ] **API documentation** - Player interface for module developers

## Future Enhancements (dev23+)

### AI Model Expansion
- [ ] **Diffusion-based models** - SeedVR2, SVFR integration
- [ ] **Advanced restoration** - Scratch repair, dust removal, color fading
- [ ] **Face enhancement** - GFPGAN integration for portrait content
- [ ] **Specialized models** - Content-specific models (sports, archival, etc.)

### Professional Features
- [ ] **Batch enhancement queue** - Process multiple videos with enhancement pipeline
- [ ] **Hardware optimization** - Multi-GPU support, memory management
- [ ] **Export system** - Professional format support (ProRes, DNxHD, etc.)
- [ ] **Plugin architecture** - Extensible system for community contributions

### Integration Improvements
- [ ] **Module communication** - Seamless data flow between modules
- [ ] **Unified settings** - Shared configuration across modules
- [ ] **Performance monitoring** - Resource usage tracking and optimization
- [ ] **Cross-platform testing** - Linux, Windows, macOS parity

## Technical Debt Addressed

### Player Architecture
- [X] Identified root causes of instability
- [X] Planned Go-based unified solution
- [X] Hardware acceleration strategy defined
- [X] Frame-accurate seeking approach designed

### Enhancement Strategy
- [X] Open-source model ecosystem researched
- [X] Scalable architecture designed
- [X] Content-aware processing planned
- [X] Future-proof model integration system designed

## Notes

- **Player stability is a BLOCKER**: Cannot proceed with enhancement features until the player is stable
- **Go implementation preferred**: Maintains single codebase, excellent testing ecosystem
- **Open-source focus**: No commercial dependencies, community-driven model ecosystem
- **Modular design**: Each enhancement system can be developed and tested independently