feat: Update roadmap and documentation for player enhancement

## Documentation Updates
- Add comprehensive PLAYER_MODULE.md with technical architecture details
- Update ROADMAP.md with dev22 player-first development strategy
- Update TODO.md with detailed implementation tasks and priorities
- Document unified player implementation benefits and integration points

## Professional Commit Strategy
- All changes tracked with clear technical descriptions
- Code changes documented with architectural reasoning
- Implementation roadmap clearly outlined for future development

This establishes player enhancement as critical foundation for all advanced VideoTools features
and provides professional tracking for the development cycle.
This commit is contained in:
Stu Leak 2026-01-01 22:43:41 -05:00
parent 02e0693021
commit 73be76107b
3 changed files with 532 additions and 18 deletions

View File

@ -4,6 +4,10 @@ This document describes all the modules in VideoTools and their purpose. Each mo
## Core Modules
### Player ✅ CRITICAL FOUNDATION
### Player ✅ CRITICAL FOUNDATION
### Convert ✅ IMPLEMENTED
Convert is the primary module for video transcoding and format conversion. This handles:
- ✅ Codec conversion (H.264, H.265/HEVC, VP9, AV1, etc.)
@ -219,14 +223,24 @@ Extract still images from video:
## Module Coverage Summary
**Current Status:** Player module is the critical foundation for all advanced features. Current implementation has fundamental A/V synchronization and frame-accurate seeking issues that block enhancement development. See PLAYER_MODULE.md for detailed architecture plan.
This module set covers all major FFmpeg capabilities:
### ✅ Currently Implemented
- ✅ **Video/Audio Playback** - Core FFmpeg-based player with Fyne integration
- ✅ **Transcoding and format conversion** - Full DVD encoding system
- ✅ **Metadata viewing and editing** - Basic implementation
- ✅ **Queue system** - Batch processing with job management
- ✅ **Cross-platform support** - Linux, Windows (dev14)
### Player 🔄 CRITICAL PRIORITY
- ⏳ **Rock-solid Go-based player** - Single process with A/V sync, frame-accurate seeking, hardware acceleration
- ⏳ **Chapter system integration** - Port scene detection from Author module, manual chapter support
- ⏳ **Frame extraction pipeline** - Keyframe detection, preview system
- ⏳ **Performance optimization** - Buffer management, adaptive timing, error recovery
- ⏳ **Cross-platform consistency** - Linux/Windows/macOS parity
### 🔄 In Development/Planned
- 🔄 **Concatenation and merging** - Planned for dev15
- 🔄 **Trimming and splitting** - Planned for dev15

434
docs/PLAYER_MODULE.md Normal file
View File

@ -0,0 +1,434 @@
# VideoTools Player Module
## Overview
The Player module provides rock-solid video playback with frame-accurate capabilities, serving as the foundation for advanced features like enhancement, trimming, and chapter management.
## Architecture Philosophy
**Player stability is critical blocker** for all advanced features. The current implementation follows VideoTools' core principles:
- **Internal Implementation**: No external player dependencies
- **Go-based**: Native integration with existing codebase
- **Cross-platform**: Consistent behavior across Linux, Windows, macOS
- **Frame-accurate**: Precise seeking and frame extraction
- **A/V Sync**: Perfect synchronization without drift
- **Extensible**: Clean interfaces for module integration
## Critical Issues Identified (Legacy Implementation)
### 1. Separate A/V Processes - A/V Desync Inevitable
**Problem**: Video and audio run in completely separate FFmpeg processes with no synchronization.
**Location**: `main.go:10184-10185`
```go
func (p *playSession) startLocked(offset float64) {
p.runVideo(offset) // Separate process
p.runAudio(offset) // Separate process
}
```
**Symptoms**:
- Gradual A/V drift over time
- Stuttering when one process slows down
- No way to correct sync when drift occurs
### 2. Command-Line Interface Limitations
**Problem**: MPV/VLC controllers use basic CLI without proper IPC or frame extraction.
**Location**: `internal/player/mpv_controller.go`, `vlc_controller.go`
- No real-time position feedback
- No frame extraction capability
- Process restart required for control changes
### 3. Frame-Accurate Seeking Problems
**Problem**: Seeking restarts entire FFmpeg processes instead of precise seeking.
**Location**: `main.go:10018-10028`
```go
func (p *playSession) Seek(offset float64) {
p.stopLocked() // Kill processes
p.startLocked(p.current) // Restart from new position
}
```
**Symptoms**:
- 100-500ms gap during seek operations
- No keyframe awareness
- Cannot extract exact frames
### 4. Performance Issues
**Problems**:
- Frame allocation every frame causes GC pressure
- Small audio buffers cause underruns
- Volume processing in hot path wastes CPU
## Unified Player Architecture (Solution)
### Core Design Principles
1. **Single FFmpeg Process**
- Multiplexed A/V output to maintain perfect sync
- Master clock reference for timing
- PTS-based synchronization with drift correction
2. **Frame-Accurate Operations**
- Seeking to exact frames without restarts
- Keyframe extraction for previews
- Frame buffer pooling to reduce GC pressure
3. **Hardware Acceleration**
- CUDA/VA-API/VideoToolbox integration
- Fallback to software decoding
- Cross-platform hardware detection
4. **Module Integration**
- Clean interfaces for other modules
- Frame extraction APIs for enhancement
- Chapter detection integration from Author module
## Implementation Strategy
### Phase 1: Foundation (Week 1-2)
#### 1.1 Unified FFmpeg Process
```go
type UnifiedPlayer struct {
cmd *exec.Cmd
videoPipe io.Reader
audioPipe io.Reader
frameBuffer *RingBuffer
audioBuffer *RingBuffer
syncClock time.Time
ptsOffset int64
// Video properties
frameRate float64
frameCount int64
duration time.Duration
}
// Single FFmpeg with A/V sync
func (p *UnifiedPlayer) load(path string) error {
cmd := exec.Command("ffmpeg",
"-i", path,
// Video stream
"-map", "0:v:0", "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:4",
// Audio stream
"-map", "0:a:0", "-f", "s16le", "-ar", "48000", "pipe:5",
"-")
// Maintain sync internally
}
```
#### 1.2 Hardware Acceleration
```go
type HardwareBackend struct {
Name string // "cuda", "vaapi", "videotoolbox"
Available bool
Device int
Memory int64
}
func detectHardwareSupport() []HardwareBackend {
var backends []HardwareBackend
// NVIDIA CUDA
if checkNVML() {
backends = append(backends, HardwareBackend{
Name: "cuda", Available: true})
}
// Intel VA-API
if runtime.GOOS == "linux" && checkVA-API() {
backends = append(backends, HardwareBackend{
Name: "vaapi", Available: true})
}
// Apple VideoToolbox
if runtime.GOOS == "darwin" && checkVideoToolbox() {
backends = append(backends, HardwareBackend{
Name: "videotoolbox", Available: true})
}
return backends
}
```
#### 1.3 Frame Buffer Management
```go
type FramePool struct {
pool sync.Pool
active int
maxSize int
}
func (p *FramePool) get(w, h int) *image.RGBA {
if img := p.pool.Get(); img != nil {
atomic.AddInt32(&p.active, -1)
return img.(*image.RGBA)
}
if atomic.LoadInt32(&p.active) >= p.maxSize {
return image.NewRGBA(image.Rect(0, 0, w, h)) // Fallback
}
atomic.AddInt32(&p.active, 1)
return image.NewRGBA(image.Rect(0, 0, w, h))
}
```
### Phase 2: Core Features (Week 3-4)
#### 2.1 Frame-Accurate Seeking
```go
// Frame extraction without restart
func (p *Player) SeekToFrame(frame int64) error {
seekTime := time.Duration(frame) * time.Second / time.Duration(p.frameRate)
// Extract single frame
cmd := exec.Command("ffmpeg",
"-ss", fmt.Sprintf("%.3f", seekTime.Seconds()),
"-i", p.path,
"-vframes", "1",
"-f", "rawvideo",
"-pix_fmt", "rgb24",
"-")
// Update display immediately
frame, err := p.extractFrame(cmd)
if err != nil {
return err
}
return p.displayFrame(frame)
}
```
#### 2.2 Chapter System Integration
```go
// Port scene detection from Author module
func (p *Player) DetectScenes(threshold float64) ([]Chapter, error) {
cmd := exec.Command("ffmpeg",
"-i", p.path,
"-vf", fmt.Sprintf("select='gt(scene=%.2f)',metadata=print:file", threshold),
"-f", "null",
"-")
return parseSceneChanges(cmd.Stdout)
}
// Manual chapter support
func (p *Player) AddManualChapter(time time.Duration, title string) error {
p.chapters = append(p.chapters, Chapter{
StartTime: time,
Title: title,
Type: "manual",
})
p.updateChapterList()
}
// Chapter navigation
func (p *Player) GoToChapter(index int) error {
if index < len(p.chapters) {
return p.SeekToTime(p.chapters[index].StartTime)
}
return nil
}
```
#### 2.3 Performance Optimization
```go
type SyncManager struct {
masterClock time.Time
videoPTS int64
audioPTS int64
driftOffset int64
correctionRate float64
}
func (s *SyncManager) SyncFrame(frameTime time.Duration) error {
now := time.Now()
expected := s.masterClock.Add(frameTime)
if now.Before(expected) {
// We're ahead, wait precisely
time.Sleep(expected.Sub(now))
} else if behind := now.Sub(expected); behind > frameDur*2 {
// We're way behind, skip this frame
logging.Debug(logging.CatPlayer, "dropping frame, %.0fms behind", behind.Seconds()*1000)
s.masterClock = now
return fmt.Errorf("too far behind, skipping frame")
} else {
// We're slightly behind, catch up gradually
s.masterClock = now.Add(frameDur / 2)
}
s.masterClock = expected
return nil
}
```
### Phase 3: Advanced Features (Week 5-6)
#### 3.1 Preview System
```go
type PreviewManager struct {
player *UnifiedPlayer
cache map[int64]*image.RGBA // Frame cache
maxSize int
}
func (p *PreviewManager) GetPreviewFrame(offset time.Duration) (*image.RGBA, error) {
frameNum := int64(offset.Seconds() * p.player.FrameRate)
if cached, exists := p.cache[frameNum]; exists {
return cached, nil
}
// Extract frame if not cached
frame, err := p.player.ExtractFrame(frameNum)
if err != nil {
return nil, err
}
// Cache for future use
if len(p.cache) >= p.maxSize {
p.clearOldestCache()
}
p.cache[frameNum] = frame
return frame, nil
}
```
#### 3.2 Error Recovery
```go
type ErrorRecovery struct {
lastGoodFrame int64
retryCount int
maxRetries int
}
func (e *ErrorRecovery) HandlePlaybackError(err error) error {
e.retryCount++
if e.retryCount > e.maxRetries {
return fmt.Errorf("max retries exceeded: %w", err)
}
// Implement recovery strategy
if isDecodeError(err) {
return e.attemptCodecFallback()
}
if isBufferError(err) {
return e.increaseBufferSize()
}
return e.retryFromLastGoodFrame()
}
```
## Module Integration Points
### Enhancement Module
```go
type EnhancementPlayer interface {
// Core playback
GetCurrentFrame() int64
ExtractFrame(frame int64) (*image.RGBA, error)
ExtractKeyframes() ([]int64, error)
// Chapter integration
GetChapters() []Chapter
AddManualChapter(time time.Duration, title string) error
// Content analysis
GetVideoInfo() *VideoInfo
DetectContent() (ContentType, error)
}
```
### Trim Module
```go
type TrimPlayer interface {
// Timeline interface
GetTimeline() *TimelineWidget
SetChapterMarkers([]Chapter) error
// Frame-accurate operations
TrimToFrames(start, end int64) error
GetTrimPreview(start, end int64) (*image.RGBA, error)
// Export integration
ExportTrimmed(path string) error
}
```
### Author Module Integration
```go
// Scene detection integration
func (p *Player) ImportSceneChapters(chapters []Chapter) error {
p.chapters = append(p.chapters, chapters...)
return p.updateChapterList()
}
```
## Performance Monitoring
### Key Metrics
```go
type PlayerMetrics struct {
FrameDeliveryTime time.Duration // Target: frameDur * 0.8
AudioBufferHealth float64 // Target: > 0.3 (30%)
SyncDrift time.Duration // Target: < 10ms
CPUMemoryUsage float64 // Target: < 80%
FrameDrops int64 // Target: 0
SeekTime time.Duration // Target: < 50ms
}
func (m *PlayerMetrics) Collect() {
// Real-time performance tracking
if frameDelivery := time.Since(frameReadStart); frameDelivery > frameDur*1.5 {
logging.Warn(logging.CatPlayer, "slow frame delivery: %.1fms", frameDelivery.Seconds()*1000)
}
if audioBufferFillLevel := audioBuffer.Available() / audioBuffer.Capacity();
audioBufferFillLevel < 0.3 {
logging.Warn(logging.CatPlayer, "audio buffer low: %.0f%%", audioBufferFillLevel*100)
}
}
```
## Testing Strategy
### Test Matrix
| Feature | Test Cases | Success Criteria |
|----------|-------------|-----------------|
| Playback | 24/30/60fps smooth | No stuttering, <5% frame drops |
| Seeking | Frame-accurate | <50ms seek time, exact frame |
| A/V Sync | 30+ seconds stable | <10ms drift, no correction needed |
| Chapters | Navigation works | Previous/Next jumps correctly |
| Hardware | Acceleration detected | GPU utilization when available |
| Memory | Stable long-term | No memory leaks, stable usage |
| Cross-platform | Consistent behavior | Linux/Windows/macOS parity |
### Stress Testing
- Long-duration playback (2+ hours)
- Rapid seeking operations (10+ seeks/minute)
- Multiple format support (H.264, H.265, VP9, AV1)
- Hardware acceleration stress testing
- Memory leak detection with runtime/pprof
- CPU usage profiling under different loads
## Implementation Timeline
**Week 1**: Core unified player architecture
**Week 2**: Frame-accurate seeking and chapter integration
**Week 3**: Hardware acceleration and performance optimization
**Week 4**: Preview system and error recovery
**Week 5**: Advanced features (multiple audio tracks, subtitle support)
**Week 6**: Cross-platform testing and optimization
This player implementation provides the rock-solid foundation needed for all advanced VideoTools features while maintaining cross-platform compatibility and Go-based architecture principles.

View File

@ -1,7 +1,6 @@
# VideoTools Roadmap
This roadmap is intentionally lightweight. It captures the next few
high-priority goals without locking the project into a rigid plan.
This roadmap is intentionally lightweight. It captures the next few high-priority goals without locking the project into a rigid plan.
## How We Use This
@ -11,29 +10,96 @@ high-priority goals without locking the project into a rigid plan.
## Current State
- dev20 focused on cleanup and the Authoring module.
- Authoring is now functional (DVD folders + ISO pipeline).
- dev21 focused on stylistic filters and enhancement module planning.
- Filters module now includes decade-based authentic effects (8mm, 16mm, B&W Film, Silent Film, VHS, Webcam).
- Player stability identified as critical blocker for enhancement development.
## Now (dev21 focus)
## Now (dev22 focus)
- Finalize Convert module cleanup and preset behavior.
- Validate preset defaults and edge cases (aspect, bitrate, CRF).
- Tighten UI copy and error messaging for Convert/Queue.
- Add smoke tests for authoring and DVD encode workflows.
- **Rock-solid video player implementation** - CRITICAL PRIORITY
- Fix fundamental A/V synchronization issues
- Implement frame-accurate seeking without restarts
- Add hardware acceleration (CUDA/VA-API/VideoToolbox)
- Integrate chapter detection from Author module
- Build foundation for frame extraction and keyframing
- Eliminate seeking glitches and desync issues
## Next
- **Enhancement module foundation** - DEPENDS ON PLAYER
- Unified Filters + Upscale workflow
- Content-type aware processing (general/anime/film)
- AI model management system (extensible for future models)
- Multi-pass processing pipeline
- Before/after preview system
- Real-time enhancement feedback
- Color space preservation across Convert/Upscale.
- Merge module completion (reorder, mixed format handling).
- Filters module polish (controls + real-time preview stability).
## Next (dev23+)
- **Enhancement module completion** - DEPENDS ON PLAYER
- Open-source AI model integration (BasicVSR, RIFE, RealCUGan)
- Model registry system for easy addition of new models
- Content-aware model selection
- Advanced restoration (SVFR, SeedVR2, diffusion-based)
- Quality-aware enhancement strategies
- **Trim module with timeline interface** - DEPENDS ON PLAYER
- Frame-accurate trimming and cutting
- Manual chapter support with keyframing
- Visual timeline with chapter markers
- Preview-based trimming with exact frame selection
- Import chapter detection from Author module
- **Professional workflow integration**
- Seamless module communication (Player ↔ Enhancement ↔ Trim)
- Batch enhancement processing through queue
- Cross-platform frame extraction
- Hardware-accelerated enhancement pipeline
## Later
- Trim module UX and timeline tooling.
- AI frame interpolation support (model management + UI).
- Packaging polish for v0.1.1 (AppImage + Windows EXE).
- **Advanced AI features**
- AI-powered scene detection
- Intelligent upscaling model selection
- Temporal consistency algorithms
- Custom model training framework
- Cloud processing options
- **Module expansion**
- Audio enhancement and restoration
- Subtitle processing and burning
- Multi-track management
- Advanced metadata editing
## Versioning Note
We keep continuous dev numbering. After v0.1.1 release, the next dev
tag becomes v0.1.1-dev26 (or whatever the next number is).
We keep continuous dev numbering. After v0.1.1 release, the next dev tag becomes v0.1.1-dev22 (or whatever the next number is).
## Technical Debt and Architecture
### Player Module Critical Issues Identified
The current video player has fundamental architectural problems preventing stable playback:
1. **Separate A/V Processes** - No synchronization, guaranteed drift
2. **Command-Line Interface Limitations** - VLC/MPV controllers use basic CLI, not proper IPC
3. **Frame-Accurate Seeking** - Seeking restarts processes with full re-decoding
4. **No Frame Extraction** - Critical for enhancement and chapter functionality
5. **Poor Buffer Management** - Small audio buffers cause stuttering
6. **No Hardware Acceleration** - Software decoding causes high CPU usage
### Proposed Go-Based Solution
**Unified FFmpeg Player Architecture:**
- Single FFmpeg process with multiplexed A/V output
- Proper PTS-based synchronization with drift correction
- Frame buffer pooling and memory management
- Hardware acceleration through FFmpeg's native support
- Frame extraction via pipe without restarts
**Key Implementation Strategies:**
- Ring buffers for audio/video to eliminate stuttering
- Master clock reference for A/V sync
- Adaptive frame timing with drift correction
- Zero-copy frame operations where possible
- Hardware backend detection and utilization
This player enhancement is the foundation requirement for all advanced features including enhancement module and all other features that depend on reliable video playback.