Document Real-ESRGAN upscale pipeline

Stu Leak 2025-12-20 20:58:34 -05:00
parent 505db279d8
commit 55c291406f
3 changed files with 73 additions and 12 deletions


@ -76,18 +76,20 @@ Filters module provides video and audio processing effects:
**Current Status:** Planned for dev15, basic filter system design.
### Upscale 🔄 PLANNED
### Upscale 🔄 PARTIAL
Upscale increases video resolution using advanced scaling algorithms:
- ⏳ **AI-based:** Waifu2x, Real-ESRGAN (via external integration)
- ⏳ **Traditional:** Lanczos, Bicubic, Spline, Super-resolution
- ⏳ **Target resolutions:** 720p, 1080p, 1440p, 4K, custom
- ⏳ Noise reduction and artifact mitigation during upscaling
- ⏳ Batch processing for multiple files
- ⏳ Quality presets balancing speed vs. output quality
- ✅ **AI-based:** Real-ESRGAN (ncnn backend) with presets and model selection
- ✅ **Traditional:** Lanczos, Bicubic, Spline, Bilinear
- ✅ **Target resolutions:** Match Source, 2x/4x relative, 720p, 1080p, 1440p, 4K, 8K
- ✅ Frame extraction → AI upscale → reassemble pipeline
- ✅ Filters and frame-rate conversion can be applied before AI upscaling
- ⏳ Noise reduction and artifact mitigation beyond Real-ESRGAN
- ⏳ Batch processing for multiple files (via queue)
- ✅ Quality presets balancing speed vs. output quality (AI presets)
**FFmpeg Features:** Scale filter, super-resolution filters
**FFmpeg Features:** Scale filter, minterpolate, fps
**Current Status:** Planned for dev16, AI integration research phase.
**Current Status:** AI integration wired (ncnn). Python backend options are documented but not yet executed.
### Audio 🔄 PLANNED
Audio module handles all audio track operations:
@ -248,4 +250,4 @@ This module set covers all major FFmpeg capabilities:
- **Additional Modules:** 0/7 implemented
- **Overall Progress:** ~12% complete
- **Next Major Release:** dev15 (Merge, Trim, Filters modules)
- **Future Focus:** Blu-ray professional authoring system
- **Future Focus:** Blu-ray professional authoring system


@ -1,6 +1,6 @@
# VideoTools Documentation
VideoTools is a professional-grade video processing suite with a modern GUI, currently on v0.1.0-dev14. It specializes in creating DVD-compliant videos for authoring and distribution.
VideoTools is a professional-grade video processing suite with a modern GUI, currently on v0.1.0-dev18. It specializes in creating DVD-compliant videos for authoring and distribution.
## Documentation Structure
@ -15,7 +15,7 @@ VideoTools is a professional-grade video processing suite with a modern GUI, cur
- [Merge](merge/) - Join multiple video clips *(planned)*
- [Trim](trim/) - Cut and split videos *(planned)*
- [Filters](filters/) - Video and audio effects *(planned)*
- [Upscale](upscale/) - Resolution enhancement *(planned)*
- [Upscale](upscale/) - Resolution enhancement *(AI + traditional now wired)*
- [Audio](audio/) - Audio track operations *(planned)*
- [Thumb](thumb/) - Thumbnail generation *(planned)*
- [Rip](rip/) - DVD/Blu-ray extraction *(planned)*

docs/upscale/README.md Normal file

@ -0,0 +1,59 @@
# Upscale Module
The Upscale module raises video resolution using traditional FFmpeg scaling or AI-based Real-ESRGAN (ncnn).
## Status
- AI upscaling is wired through the Real-ESRGAN ncnn backend.
- Traditional scaling is always available.
- Filters and frame rate conversion can be applied before AI upscaling.
## AI Upscaling (Real-ESRGAN ncnn)
### Requirements
- `realesrgan-ncnn-vulkan` in `PATH`.
- Vulkan-capable GPU recommended.
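A minimal sketch of how the `PATH` requirement could be checked before starting a job. The helper name `find_realesrgan` is hypothetical and is not taken from the VideoTools code base; it only relies on the standard library.

```python
import shutil
from typing import Optional


def find_realesrgan(binary_name: str = "realesrgan-ncnn-vulkan") -> Optional[str]:
    """Return the absolute path of the binary if it is on PATH, else None.

    Hypothetical helper for illustration; not part of VideoTools.
    """
    return shutil.which(binary_name)


if __name__ == "__main__":
    path = find_realesrgan()
    if path is None:
        print("realesrgan-ncnn-vulkan not found in PATH; AI upscaling is unavailable")
    else:
        print(f"Using {path}")
```

Traditional scaling does not depend on this check, so a missing binary only needs to disable the AI options.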
### Pipeline
1. Extract frames from the source video (filters and fps conversion applied here if enabled).
2. Run `realesrgan-ncnn-vulkan` on extracted frames.
3. Reassemble frames into a lossless MKV with the original audio.
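The three steps above can be sketched as three command lines. This is an illustration under stated assumptions, not the commands VideoTools actually runs: the function name, the `frames_in`/`frames_out` directory names, the `frame_%08d.png` pattern, and the choice of FFV1 as the lossless codec are all assumptions. Only flags known to exist in `ffmpeg` and `realesrgan-ncnn-vulkan` are used.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def build_pipeline_commands(
    source: str,
    work_dir: str,
    output: str,
    fps: str,
    model: str = "realesr-animevideov3",
    pre_filters: Optional[str] = None,
) -> List[List[str]]:
    """Return command lines for extract -> upscale -> reassemble (hypothetical layout)."""
    frames_in = Path(work_dir) / "frames_in"    # assumed directory name
    frames_out = Path(work_dir) / "frames_out"  # assumed directory name
    pattern = "frame_%08d.png"                  # assumed frame naming

    # Step 1: extract frames; optional filters / fps conversion are applied here.
    extract = ["ffmpeg", "-i", source]
    if pre_filters:
        extract += ["-vf", pre_filters]
    extract.append(str(frames_in / pattern))

    # Step 2: upscale every frame found in the input directory.
    upscale = [
        "realesrgan-ncnn-vulkan",
        "-i", str(frames_in),
        "-o", str(frames_out),
        "-n", model,
        "-f", "png",
    ]

    # Step 3: rebuild the video and copy audio from the source when it has any.
    # FFV1 is an assumption; the README only says "lossless MKV".
    reassemble = [
        "ffmpeg",
        "-framerate", fps,
        "-i", str(frames_out / pattern),
        "-i", source,
        "-map", "0:v:0",
        "-map", "1:a?",
        "-c:v", "ffv1",
        "-c:a", "copy",
        output,
    ]
    return [extract, upscale, reassemble]


if __name__ == "__main__":
    for cmd in build_pipeline_commands("input.mp4", "work", "output.mkv", fps="30000/1001"):
        print(shlex.join(cmd))
```

Both frame directories would need to exist before the commands run, and `fps` should match the frame rate produced by step 1 so that audio stays in sync.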
### AI Controls
- **Model Preset**
  - General (RealESRGAN_x4plus)
  - Anime/Illustration (RealESRGAN_x4plus_anime_6B)
  - Anime Video (realesr-animevideov3)
  - General Tiny (realesr-general-x4v3)
  - 2x General (RealESRGAN_x2plus)
  - Clean Restore (realesrnet-x4plus)
- **Processing Preset**
  - Ultra Fast, Fast, Balanced (default), High Quality, Maximum Quality
  - Presets tune tile size and TTA (test-time augmentation).
- **Upscale Factor**
  - Match Target or fixed 1x/2x/3x/4x/8x.
- **Output Adjustment**
  - Post-scale multiplier (0.5x to 2.0x).
- **Denoise**
  - Available for `realesr-general-x4v3` (General Tiny).
- **Tile Size**
  - Auto/256/512/800.
- **Output Frames**
  - PNG/JPG/WEBP for frame extraction.
- **Advanced**
  - GPU selection, threads (load/proc/save), and TTA toggle.
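As a rough illustration, several of these controls correspond to `realesrgan-ncnn-vulkan` command-line flags: `-n` (model), `-s` (scale), `-t` (tile size, `0` meaning auto), `-f` (output format), `-g` (GPU id), `-j` (load:proc:save threads), and `-x` (TTA). The function below is a hypothetical sketch of that mapping, not the VideoTools implementation. It does not cover Denoise, Output Adjustment, or how the 1x and 8x factors are handled, since the README does not say how those reach the backend.

```python
from typing import List, Optional, Tuple


def realesrgan_args(
    input_dir: str,
    output_dir: str,
    model: str,
    scale: int = 4,
    tile: str = "Auto",
    frame_format: str = "PNG",
    gpu: Optional[int] = None,
    threads: Optional[Tuple[int, int, int]] = None,
    tta: bool = False,
) -> List[str]:
    """Translate UI-style settings into realesrgan-ncnn-vulkan arguments (sketch)."""
    # "Auto" maps to 0, which the ncnn binary treats as automatic tile sizing.
    tile_size = 0 if str(tile).lower() == "auto" else int(tile)

    args = [
        "realesrgan-ncnn-vulkan",
        "-i", input_dir,
        "-o", output_dir,
        "-n", model,
        "-s", str(scale),
        "-t", str(tile_size),
        "-f", frame_format.lower(),  # png / jpg / webp
    ]
    if gpu is not None:
        args += ["-g", str(gpu)]
    if threads is not None:
        # Thread counts are passed as load:proc:save.
        args += ["-j", ":".join(str(n) for n in threads)]
    if tta:
        args.append("-x")
    return args


if __name__ == "__main__":
    print(realesrgan_args("frames_in", "frames_out", "realesr-animevideov3",
                          scale=2, tile="256", gpu=0, threads=(1, 2, 2)))
```

Leaving `gpu` and `threads` unset lets the binary fall back to its own defaults.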
### Notes
- Face enhancement requires the Python/GFPGAN backend and is currently not executed.
- AI upscaling is heavier than traditional scaling; use smaller tiles for low VRAM.
## Traditional Scaling
- **Algorithms:** Lanczos, Bicubic, Spline, Bilinear.
- **Target:** Match Source, 2x/4x, or fixed resolutions (720p → 8K).
- **Output:** Lossless MKV by default (copy audio).
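The options above map naturally onto FFmpeg's `scale` filter, whose `flags` option selects the algorithm. The sketch below is an assumption about how such a filter string could be built. The function name and lookup tables are hypothetical, and the pixel heights assigned to each label (for example 2160 for 4K) are conventional values rather than ones stated in this README.

```python
from typing import Optional

# Assumed label -> output height mapping; width is derived to keep the aspect ratio.
TARGET_HEIGHTS = {"720p": 720, "1080p": 1080, "1440p": 1440, "4K": 2160, "8K": 4320}
RELATIVE_FACTORS = {"2x": 2, "4x": 4}
ALGORITHM_FLAGS = {
    "Lanczos": "lanczos",
    "Bicubic": "bicubic",
    "Spline": "spline",
    "Bilinear": "bilinear",
}


def scale_filter(algorithm: str, target: str) -> Optional[str]:
    """Build an FFmpeg scale filter string, or None when no scaling is needed."""
    flags = ALGORITHM_FLAGS[algorithm]
    if target == "Match Source":
        return None
    if target in RELATIVE_FACTORS:
        factor = RELATIVE_FACTORS[target]
        return f"scale=iw*{factor}:ih*{factor}:flags={flags}"
    height = TARGET_HEIGHTS[target]
    # -2 asks FFmpeg for a width that preserves aspect ratio and is divisible by 2.
    return f"scale=-2:{height}:flags={flags}"


if __name__ == "__main__":
    print(scale_filter("Lanczos", "1080p"))
    print(scale_filter("Bicubic", "2x"))
```

The resulting string would be passed to `ffmpeg` through `-vf`, with `-c:a copy` keeping the original audio.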
## Filters and Frame Rate
- Filters configured in the Filters module can be applied before upscaling.
- Frame rate conversion can be applied with or without motion interpolation.
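The two frame-rate modes can be illustrated with FFmpeg's `fps` and `minterpolate` filters. This is a hypothetical sketch: the function name is invented for illustration, and the use of `mi_mode=mci` (motion-compensated interpolation) is an assumption about which interpolation mode is wanted.

```python
def frame_rate_filter(target_fps: str, interpolate: bool = False) -> str:
    """Return an FFmpeg filter string for frame rate conversion (sketch).

    Without interpolation, the fps filter drops or duplicates frames.
    With interpolation, minterpolate synthesizes intermediate frames.
    """
    if interpolate:
        return f"minterpolate=fps={target_fps}:mi_mode=mci"
    return f"fps={target_fps}"


if __name__ == "__main__":
    print(frame_rate_filter("60"))
    print(frame_rate_filter("60", interpolate=True))
```

Motion interpolation is considerably slower than plain frame dropping or duplication, so it is usually worth enabling only when smoother motion is the goal.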
## Logging
- Each upscale job writes a conversion log in the `logs/` folder next to the executable.