VT_Player Development Specification: Frame-Accurate Playback & Lossless Cutting
Project: VT_Player
Target: Lightweight frame-accurate video player with keyframe navigation for lossless cutting
Goal: Provide LosslessCut-style functionality using the FFmpeg suite exclusively
Performance: Competitive with existing tools, optimized for lightweight operation
Status: Foundation exists, keyframe features need implementation
Design Philosophy
- Lightweight First: Minimize memory footprint, avoid bloat
- FFmpeg Suite Only: Use ffplay, ffmpeg, ffprobe - no external players
- Hardware Accelerated: Leverage GPU when available, graceful fallback to CPU
- Responsive UI: All operations feel instant (<100ms response time)
- Smart Caching: Cache intelligently, clean up aggressively
Current State Analysis
What's Working ✓
- Basic Player Controller (internal/player/controller_linux.go)
  - ffplay integration with stdin control
  - Play/pause/seek/volume controls
  - Window embedding via xdotool/SDL
  - ~50MB memory footprint for playback
- Player UI (main.go:555-800)
  - Video loading and playlist
  - Basic controls and time display
  - Slider-based seeking
- Video Metadata (videoSource struct)
  - FFprobe metadata extraction
  - Duration/resolution/framerate parsing
What's Missing ✗
- No keyframe detection - Cannot identify I-frames
- No frame-by-frame navigation - Only time-based seeking
- No timeline visualization - No keyframe markers
- No in/out point marking - Cannot mark cut points
- No lossless cut functionality - No stream copy cutting
- No frame counter - Only shows time
Implementation Plan
Phase 1: Lightweight Keyframe Detection
Goal: <5s detection time for 1-hour video, <10MB memory overhead
Create internal/keyframe/detector.go
Strategy: Sparse Keyframe Index
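package keyframe
import (
	"bufio"
	"bytes"
	"fmt"
	"os/exec"
	"sort"
	"strconv"
	"strings"
)
// Keyframe represents an I-frame position
type Keyframe struct {
	FrameNum  int     // Frame number (left at 0 here; fill in once FrameRate is known)
	Timestamp float64 // Time in seconds
}
// Index holds keyframe positions (I-frames only, not all frames)
type Index struct {
	Keyframes   []Keyframe // Only I-frames (~1KB per minute of video), sorted by Timestamp
	TotalFrames int
	Duration    float64
	FrameRate   float64
}
// DetectKeyframes uses FFprobe to find I-frames only.
// TotalFrames, Duration and FrameRate are left for the caller to fill from existing metadata.
func DetectKeyframes(videoPath string) (*Index, error) {
	// ffprobe -v error -skip_frame nokey -select_streams v:0 \
	//   -show_entries frame=best_effort_timestamp_time -of csv=p=0 video.mp4
	// -skip_frame nokey = Only I-frames (5-10x faster than scanning all frames)
	// best_effort_timestamp_time is used because recent FFmpeg releases no longer report pkt_pts_time.
	// Returns one timestamp per line: 0.000000, 2.002000, 4.004000, ...
	var stderr bytes.Buffer
	cmd := exec.Command("ffprobe",
		"-v", "error",
		"-skip_frame", "nokey", // KEY OPTIMIZATION: Only I-frames
		"-select_streams", "v:0",
		"-show_entries", "frame=best_effort_timestamp_time",
		"-of", "csv=p=0",
		videoPath,
	)
	cmd.Stderr = &stderr
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("ffprobe failed: %w: %s", err, strings.TrimSpace(stderr.String()))
	}
	// Parse output, build Keyframe array
	// Memory: ~100 bytes per keyframe
	// 1-hour video @ 2s GOP = ~1800 keyframes = ~180KB
	idx := &Index{}
	sc := bufio.NewScanner(bytes.NewReader(out))
	for sc.Scan() {
		ts, err := strconv.ParseFloat(strings.TrimSpace(strings.Split(sc.Text(), ",")[0]), 64)
		if err != nil {
			continue // Skip blank lines and "N/A" entries
		}
		idx.Keyframes = append(idx.Keyframes, Keyframe{Timestamp: ts})
	}
	sort.Slice(idx.Keyframes, func(a, b int) bool { return idx.Keyframes[a].Timestamp < idx.Keyframes[b].Timestamp })
	return idx, sc.Err()
}
// FindNearestKeyframe returns closest I-frame to timestamp, or nil if there is none.
// direction: "before" (at or before), "after" (at or after), "nearest"
func (idx *Index) FindNearestKeyframe(timestamp float64, direction string) *Keyframe {
	kfs := idx.Keyframes
	// Binary search (O(log n)) for the first keyframe at or after timestamp
	i := sort.Search(len(kfs), func(j int) bool { return kfs[j].Timestamp >= timestamp })
	var before, after *Keyframe
	if i < len(kfs) {
		after = &kfs[i]
	}
	if after != nil && after.Timestamp == timestamp {
		before = after
	} else if i > 0 {
		before = &kfs[i-1]
	}
	switch direction {
	case "before":
		return before
	case "after":
		return after
	}
	if after == nil || (before != nil && timestamp-before.Timestamp <= after.Timestamp-timestamp) {
		return before
	}
	return after
}
// EstimateFrameNumber calculates frame # from timestamp (rounded to the nearest frame)
func (idx *Index) EstimateFrameNumber(timestamp float64) int {
	return int(timestamp*idx.FrameRate + 0.5)
}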
Performance Targets:
- 1-hour video detection: <5 seconds
- Memory usage: <1MB for index
- Cache size: <100KB per video
Cache Strategy:
// Cache in memory during playback, persist to disk
// Location: ~/.cache/vt_player/<video-hash>.kf
// Format: Binary (timestamp as float64, 8 bytes per keyframe)
// Invalidate if: video modified time changes
Phase 2: Frame-Accurate Seeking with ffplay
Goal: Precise navigation using existing ffplay controller
Extend internal/player/controller_linux.go
Position Tracking:
type ffplayController struct {
	// ... existing fields ...
	// NEW: Position tracking
	lastKnownPos  float64   // Last seek position
	lastKnownTime time.Time // When position was updated
	playState     bool      // true = playing, false = paused
	frameRate     float64   // From video metadata; needed for frame stepping
}
// GetCurrentPosition estimates current position
func (c *ffplayController) GetCurrentPosition() float64 {
c.mu.Lock()
defer c.mu.Unlock()
if !c.playState {
// Paused: return last position
return c.lastKnownPos
}
// Playing: estimate based on elapsed time
elapsed := time.Since(c.lastKnownTime).Seconds()
return c.lastKnownPos + elapsed
}
// SeekToFrame seeks to specific frame number
func (c *ffplayController) SeekToFrame(frameNum int, frameRate float64) error {
timestamp := float64(frameNum) / frameRate
return c.Seek(timestamp)
}
Frame Stepping Strategy:
// For single-frame steps: Use ffplay's built-in frame step
// ffplay keyboard command: 's' = step to next frame
var (
keyStepForward = []byte{'s'} // Frame step
)
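func (c *ffplayController) StepFrame(direction int) error {
	if c.frameRate <= 0 {
		return fmt.Errorf("frame rate unknown, cannot step frames")
	}
	// Ensure paused
	if !c.paused {
		c.Pause()
	}
	frameDur := 1.0 / c.frameRate // frameRate is stored from metadata
	if direction > 0 {
		// Step forward: Use 's' key, then advance the tracked position by one frame
		if err := c.send(keyStepForward); err != nil {
			return err
		}
		c.mu.Lock()
		c.lastKnownPos += frameDur
		c.lastKnownTime = time.Now()
		c.mu.Unlock()
		return nil
	}
	// Step backward: ffplay has no reverse step, so seek back 1 frame
	return c.Seek(math.Max(0, c.GetCurrentPosition()-frameDur))
}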
Memory Impact: +40 bytes per controller instance
Phase 3: Custom Timeline Widget with Keyframe Markers
Goal: Visual timeline, smooth interaction, minimal redraw overhead
Create internal/ui/timeline.go
Custom Fyne Widget:
package ui
import (
"image/color"
"fyne.io/fyne/v2"
"fyne.io/fyne/v2/canvas"
"fyne.io/fyne/v2/widget"
)
// TimelineWidget shows video timeline with keyframe markers
type TimelineWidget struct {
widget.BaseWidget
duration float64 // Total duration
position float64 // Current position
keyframes []float64 // Keyframe timestamps
inPoint *float64 // In-point marker
outPoint *float64 // Out-point marker
onChange func(float64) // Callback on seek
// Rendering cache (updated only on resize/data change)
cachedBackground *canvas.Rectangle
cachedKeyframes []*canvas.Line
cachedScrubber *canvas.Line
}
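// CreateRenderer implements fyne.Widget
// Placeholder: returns only the background until the marker layout is implemented.
func (t *TimelineWidget) CreateRenderer() fyne.WidgetRenderer {
	// Planned: draw once, update only scrubber position on drag
	// Keyframe markers: 1px vertical yellow lines
	// In-point: 2px blue line
	// Out-point: 2px red line
	// Scrubber: 3px white line
	t.cachedBackground = canvas.NewRectangle(color.Black)
	return widget.NewSimpleRenderer(t.cachedBackground)
}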
// Lightweight: Only redraw scrubber on position change
// Full redraw only on resize or keyframe data change
Memory Impact: ~2KB per timeline widget
Rendering: <5ms for 1000 keyframes
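Placing keyframe markers and handling scrubber drags both reduce to a linear mapping between timestamps and horizontal pixels. A minimal sketch of that mapping, kept free of Fyne types so it can be tested on its own; `timeToX` and `xToTime` are hypothetical helper names, not part of the planned widget API.

```go
package main

import "fmt"

// clamp limits v to the range [lo, hi].
func clamp(v, lo, hi float64) float64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// timeToX maps a timestamp to a pixel offset within a timeline of the given width.
func timeToX(t, duration, width float64) float64 {
	if duration <= 0 {
		return 0
	}
	return clamp(t, 0, duration) / duration * width
}

// xToTime maps a click or drag position back to a timestamp.
func xToTime(x, duration, width float64) float64 {
	if width <= 0 {
		return 0
	}
	return clamp(x, 0, width) / width * duration
}

func main() {
	const duration, width = 60.0, 960.0
	for _, kf := range []float64{0, 2, 30, 58} {
		fmt.Printf("keyframe at %.1fs -> x=%.1f\n", kf, timeToX(kf, duration, width))
	}
	fmt.Printf("click at x=480 -> %.1fs\n", xToTime(480, duration, width))
}
```

Because marker positions depend only on duration and width, they need recomputing only on resize or when the keyframe data changes, which matches the caching approach described for the widget.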
Phase 4: Enhanced Player UI
Goal: LosslessCut-style controls, keyboard-driven workflow
Update main.go showPlayerView (lines 555-800)
Layout:
┌──────────────────────────────────────────────┐
│ Video Display (ffplay window) 960x540 │
└──────────────────────────────────────────────┘
┌──────────────────────────────────────────────┐
│ [Timeline with keyframe markers] │
│ [====|==|====I====|==O====|===] │
└──────────────────────────────────────────────┘
┌──────────────────────────────────────────────┐
│ Frame: 1234 / 15000 Time: 0:41.400 │
│ [<<KF] [<Frame] [Play] [Frame>] [KF>>] │
└──────────────────────────────────────────────┘
┌──────────────────────────────────────────────┐
│ [Set In] [Set Out] [Clear] [Export Cut] │
└──────────────────────────────────────────────┘
Components:
// Frame counter (updates every 100ms when playing)
frameLabel := widget.NewLabel("Frame: 0 / 0")
timeLabel := widget.NewLabel("Time: 0:00.000")
// Frame navigation buttons
btnPrevKF := widget.NewButton("<<", func() {
	// Jump to previous keyframe (FindNearestKeyframe returns nil when there is none)
	if kf := s.keyframeIndex.FindNearestKeyframe(s.player.GetCurrentPosition(), "before"); kf != nil {
		s.player.Seek(kf.Timestamp)
	}
})
btnPrevFrame := widget.NewButton("<", func() {
	// Step back 1 frame
	s.player.StepFrame(-1)
})
btnNextFrame := widget.NewButton(">", func() {
	// Step forward 1 frame
	s.player.StepFrame(1)
})
btnNextKF := widget.NewButton(">>", func() {
	// Jump to next keyframe
	if kf := s.keyframeIndex.FindNearestKeyframe(s.player.GetCurrentPosition(), "after"); kf != nil {
		s.player.Seek(kf.Timestamp)
	}
})
// In/Out controls (cutInPoint and cutOutPoint are *float64; nil means "not set")
btnSetIn := widget.NewButton("Set In [I]", func() {
	pos := s.player.GetCurrentPosition()
	s.cutInPoint = &pos
	timelineWidget.SetInPoint(s.cutInPoint)
})
btnSetOut := widget.NewButton("Set Out [O]", func() {
	pos := s.player.GetCurrentPosition()
	s.cutOutPoint = &pos
	timelineWidget.SetOutPoint(s.cutOutPoint)
})
btnClear := widget.NewButton("Clear [X]", func() {
	s.cutInPoint = nil
	s.cutOutPoint = nil
	timelineWidget.ClearPoints()
})
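The frame and time labels need consistent formatting as the position changes. The sketch below produces the "Time: 0:41.400" style shown in the layout; `formatTime` and `frameLabelText` are hypothetical helpers, and a constant frame rate is assumed for the frame number.

```go
package main

import (
	"fmt"
	"math"
)

// formatTime renders seconds as M:SS.mmm. It works in whole milliseconds
// so that values such as 41.4 do not print as 0:41.399.
func formatTime(sec float64) string {
	if sec < 0 {
		sec = 0
	}
	ms := int64(math.Round(sec * 1000))
	return fmt.Sprintf("%d:%02d.%03d", ms/60000, ms/1000%60, ms%1000)
}

// frameLabelText renders the "Frame: N / total" label for a position in seconds.
func frameLabelText(pos, fps float64, total int) string {
	return fmt.Sprintf("Frame: %d / %d", int(pos*fps+0.5), total)
}

func main() {
	fmt.Println("Time:", formatTime(41.4))
	fmt.Println(frameLabelText(41.4, 30, 15000))
}
```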
Keyboard Shortcuts:
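// fyne.KeyEvent carries no modifier state, so Shift+Arrow is registered as a
// desktop shortcut (import "fyne.io/fyne/v2/driver/desktop") rather than tested in the handler.
c := s.window.Canvas()
c.SetOnTypedKey(func(ke *fyne.KeyEvent) {
	switch ke.Name {
	case fyne.KeySpace:
		togglePlayPause()
	case fyne.KeyLeft:
		stepBackOneFrame()
	case fyne.KeyRight:
		stepForwardOneFrame()
	case fyne.KeyI:
		setInPoint()
	case fyne.KeyO:
		setOutPoint()
	case fyne.KeyX:
		clearInOutPoints()
	case fyne.KeyE:
		exportCut()
	}
})
c.AddShortcut(&desktop.CustomShortcut{KeyName: fyne.KeyLeft, Modifier: fyne.KeyModifierShift},
	func(fyne.Shortcut) { jumpToPreviousKeyframe() })
c.AddShortcut(&desktop.CustomShortcut{KeyName: fyne.KeyRight, Modifier: fyne.KeyModifierShift},
	func(fyne.Shortcut) { jumpToNextKeyframe() })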
Memory Impact: +2KB for UI components
Phase 5: Lossless Cut Export
Goal: Fast, zero-quality-loss cutting using FFmpeg stream copy
Create internal/cut/export.go
Core Functionality:
package cut
import (
	"fmt"
	"math" // Needed for math.Abs below
	"os/exec"
	"git.leaktechnologies.dev/stu/VT_Player/internal/keyframe"
)
// ExportOptions configures export
type ExportOptions struct {
InputPath string
OutputPath string
InTime float64
OutTime float64
AutoSnap bool // Snap in-point to nearest keyframe
}
// Export performs lossless cut
func Export(opts ExportOptions, idx *keyframe.Index,
progress func(float64)) error {
inTime := opts.InTime
outTime := opts.OutTime
// Validate/snap in-point to keyframe
if opts.AutoSnap {
kf := idx.FindNearestKeyframe(inTime, "before")
if kf != nil && math.Abs(kf.Timestamp-inTime) > 0.1 {
inTime = kf.Timestamp
}
}
// FFmpeg stream copy (no re-encoding)
args := []string{
"-hide_banner",
"-loglevel", "error",
"-progress", "pipe:1", // Progress reporting
"-i", opts.InputPath,
"-ss", fmt.Sprintf("%.6f", inTime),
"-to", fmt.Sprintf("%.6f", outTime),
"-c", "copy", // Stream copy = lossless
"-avoid_negative_ts", "make_zero",
"-y", // Overwrite
opts.OutputPath,
}
cmd := exec.Command("ffmpeg", args...)
// Parse progress output
// Call progress(percentage) callback
return cmd.Run()
}
// Validate checks if cut points are valid
func Validate(inTime, outTime float64, idx *keyframe.Index) error {
// Check if in-point is close to a keyframe
kf := idx.FindNearestKeyframe(inTime, "nearest")
if kf == nil {
return fmt.Errorf("no keyframes found")
}
diff := math.Abs(kf.Timestamp - inTime)
if diff > 0.5 {
return fmt.Errorf("in-point not near keyframe (%.2fs away)", diff)
}
if outTime <= inTime {
return fmt.Errorf("out-point must be after in-point")
}
return nil
}
Export UI Integration:
exportBtn := widget.NewButton("Export Cut [E]", func() {
	if s.cutInPoint == nil || s.cutOutPoint == nil {
		dialog.ShowError(errors.New("Set in/out points first"), s.window)
		return
	}
	// Ask for the output path, then export with progress
	saveAndExport := func(autoSnap bool) {
		dialog.ShowFileSave(func(uc fyne.URIWriteCloser, err error) {
			if err != nil || uc == nil {
				return
			}
			outputPath := uc.URI().Path()
			uc.Close()
			performExport(outputPath, autoSnap)
		}, s.window)
	}
	// Validate
	err := cut.Validate(*s.cutInPoint, *s.cutOutPoint, s.keyframeIndex)
	if err != nil {
		// Show error with option to auto-snap
		dialog.ShowConfirm(
			"Invalid Cut Point",
			fmt.Sprintf("%v\n\nSnap to nearest keyframe?", err),
			func(snap bool) {
				if snap {
					// Auto-snap and retry
					saveAndExport(true)
				}
			},
			s.window,
		)
		return
	}
	saveAndExport(false)
})
// performExport takes the output path explicitly: it is chosen in the save
// dialog callback and is not otherwise in scope here.
func performExport(outputPath string, autoSnap bool) {
	// Show progress dialog
	progress := widget.NewProgressBar()
	dlg := dialog.NewCustom("Exporting...", "Cancel", progress, s.window)
	dlg.Show()
	go func() {
		err := cut.Export(cut.ExportOptions{
			InputPath:  s.source.Path,
			OutputPath: outputPath,
			InTime:     *s.cutInPoint,
			OutTime:    *s.cutOutPoint,
			AutoSnap:   autoSnap,
		}, s.keyframeIndex, func(pct float64) {
			progress.SetValue(pct)
		})
		dlg.Hide()
		if err != nil {
			dialog.ShowError(err, s.window)
		} else {
			dialog.ShowInformation("Success", "Cut exported", s.window)
		}
	}()
}
Performance:
- Export speed: Much faster than real-time (1-hour video exports in ~30 seconds; stream copy is limited mainly by disk I/O)
- No quality loss (bit-perfect copy)
- Memory usage: <50MB during export
Performance Optimizations
Keyframe Detection
// 1. Parallel processing for multiple videos
var wg sync.WaitGroup
for _, video := range videos {
	wg.Add(1)
	go func(v string) {
		defer wg.Done()
		if _, err := DetectKeyframes(v); err != nil {
			log.Printf("keyframe detection failed for %s: %v", v, err)
		}
	}(video)
}
wg.Wait() // Block until every detection has finished
// 2. Incremental loading: Show UI before detection completes
go func() {
	idx, err := DetectKeyframes(videoPath)
	if err != nil {
		return // idx is nil on error; leave the timeline without markers
	}
	// Update UI when ready
	timeline.SetKeyframes(idx.Keyframes)
}()
// 3. Cache aggressively
cacheKey := fmt.Sprintf("%s-%d", videoPath, fileInfo.ModTime().Unix())
if cached := loadFromCache(cacheKey); cached != nil {
	return cached
}
Memory Management
// 1. Sparse keyframe storage (I-frames only)
// 1-hour video: ~180KB vs 10MB for all frames
// 2. Limit cached indices
const maxCachedIndices = 10
if len(indexCache) > maxCachedIndices {
// Remove oldest
delete(indexCache, oldestKey)
}
// 3. Timeline rendering: Canvas reuse
// Don't recreate canvas objects, update positions only
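The "limit cached indices, remove oldest" rule needs some record of insertion order, which a plain map does not keep. One possible shape is sketched below; the `boundedCache` type and its methods are hypothetical, and it evicts by insertion order rather than by last use.

```go
package main

import "fmt"

// boundedCache keeps at most max entries and evicts the oldest insertion first.
type boundedCache struct {
	max   int
	order []string             // keys, oldest first
	items map[string][]float64 // key -> keyframe timestamps
}

func newBoundedCache(max int) *boundedCache {
	return &boundedCache{max: max, items: make(map[string][]float64)}
}

// Put stores an index and evicts the oldest entries once the limit is exceeded.
func (c *boundedCache) Put(key string, timestamps []float64) {
	if _, exists := c.items[key]; !exists {
		c.order = append(c.order, key)
	}
	c.items[key] = timestamps
	for len(c.order) > c.max {
		oldest := c.order[0]
		c.order = c.order[1:]
		delete(c.items, oldest)
	}
}

// Get returns the cached timestamps for key, if present.
func (c *boundedCache) Get(key string) ([]float64, bool) {
	ts, ok := c.items[key]
	return ts, ok
}

func main() {
	cache := newBoundedCache(2)
	cache.Put("a.mp4", []float64{0, 2})
	cache.Put("b.mp4", []float64{0, 4})
	cache.Put("c.mp4", []float64{0, 1})
	_, hasA := cache.Get("a.mp4")
	fmt.Println("a.mp4 still cached:", hasA, "entries:", len(cache.items))
}
```

This sketch is not safe for concurrent use; since detection runs in background goroutines, a real cache would need a mutex around Put and Get.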
UI Responsiveness
// 1. Debounce position updates
var updateTimer *time.Timer
func updatePosition(pos float64) {
if updateTimer != nil {
updateTimer.Stop()
}
updateTimer = time.AfterFunc(50*time.Millisecond, func() {
frameLabel.SetText(formatFrame(pos))
})
}
// 2. Background goroutines for heavy operations
go detectKeyframes()
go exportCut()
// Never block UI thread
// 3. Efficient timeline redraw
// Only redraw scrubber, not entire timeline
Testing Strategy
Performance Benchmarks
# Target: Competitive with LosslessCut
# Keyframe detection: <5s for 1-hour video
# Frame stepping: <50ms response
# Export: At least real-time speed (1x or faster)
# Memory: <100MB total (including ffplay)
# Test suite:
go test ./internal/keyframe -bench=. -benchtime=10s
go test ./internal/cut -bench=. -benchtime=10s
Test Videos
# 1. Generate test video with known keyframe intervals
ffmpeg -f lavfi -i testsrc=duration=60:size=1280x720:rate=30 \
-c:v libx264 -g 60 -keyint_min 60 \
test_2s_keyframes.mp4
# 2. Various formats
# - H.264, H.265, VP9, AV1
# - Different GOP sizes
# - Variable framerate
Validation
# Verify cut accuracy
ffprobe -v error -show_entries format=duration \
-of default=noprint_wrappers=1:nokey=1 cut_output.mp4
# Verify no re-encoding (check codec)
ffprobe -v error -select_streams v:0 \
-show_entries stream=codec_name cut_output.mp4
# Should match original codec exactly
Resource Usage Targets
Memory:
- Base application: ~30MB
- Video playback (ffplay): ~50MB
- Keyframe index (1-hour): ~1MB
- UI components: ~5MB
- Total: <100MB for typical use
CPU:
- Idle: <1%
- Playback: 5-15% (ffplay + UI updates)
- Keyframe detection: 100% single core for <5s
- Export: 20-40% (FFmpeg stream copy)
Disk:
- Cache per video: <100KB
- Total cache limit: 50MB (500 videos)
- Auto-cleanup on startup
Success Criteria
Performance Parity with LosslessCut:
- Keyframe detection: <5s for 1-hour video
- Frame stepping: <50ms response time
- Timeline rendering: <5ms for 1000 keyframes
- Export speed: At least real-time (1x or faster)
- Memory usage: <100MB total
Feature Completeness:
- Frame-by-frame navigation
- Keyframe visualization
- In/out point marking
- Lossless export
- Keyboard-driven workflow
- Progress reporting
Quality:
- Bit-perfect lossless cuts
- Frame-accurate positioning
- No UI lag or stuttering
- Stable under heavy use
Integration with VideoTools
Reusable Components:
- internal/keyframe/ - Copy directly
- internal/cut/ - Copy directly
- internal/ui/timeline.go - Adapt for Trim module
VideoTools Trim Module:
- Use VT_Player's proven code
- Add batch trimming
- Integrate with queue system
Maintenance:
- VT_Player = Standalone lightweight player
- VideoTools = Full suite including trim capability
- Share core keyframe/cut code between projects
Implementation Priority
Week 1:
- Phase 1: Keyframe detection with caching
- Test performance (<5s target)
Week 2:
- Phase 2: Frame-accurate seeking
- Phase 3: Timeline widget
- Test responsiveness
Week 3:
- Phase 4: Enhanced UI + keyboard shortcuts
- Phase 5: Lossless cut export
- Integration testing
Week 4:
- Performance optimization
- Documentation
- Prepare for VideoTools integration
Next Steps
- Start with Phase 1 (keyframe detection)
- Benchmark against 1-hour test video
- Verify <5s detection time and <1MB memory
- Move to Phase 2 once performance validated
- Iterate rapidly, test continuously
Goal: Lightweight, powerful, competitive with industry tools.