VC-2 is an intra-only, wavelet-based, ultra-low-latency codec developed by the BBC years ago for exactly this purpose. It is royalty-free, and currently the only implementations are in ffmpeg and in the official BBC repository, and both are CPU-based. I am planning to make a CUDA-accelerated version for my master's thesis, since the Vulkan implementations made at GSoC last year still suck quite a bit. I would suggest people look into this codec.
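For anyone wondering what "intra-only wavelet-based" means in practice, here is a minimal sketch of one level of a reversible integer wavelet split on a single row of pixels. It uses a plain Haar lifting step purely for illustration and is not the exact filter arithmetic from the VC-2 spec; the function names `haar_forward` and `haar_inverse` are made up for this example.

```python
def haar_forward(x):
    """One level of integer Haar lifting on an even-length list of ints.

    Returns (low, high): a half-length approximation band and a
    half-length detail band. Illustrative only, not the VC-2 filter.
    """
    assert len(x) % 2 == 0, "sketch assumes an even number of samples"
    low, high = [], []
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        d = b - a          # predict step: detail (high-pass) coefficient
        s = a + (d >> 1)   # update step: floor of the pair average (low-pass)
        low.append(s)
        high.append(d)
    return low, high


def haar_inverse(low, high):
    """Undo haar_forward exactly by running the lifting steps in reverse."""
    out = []
    for s, d in zip(low, high):
        a = s - (d >> 1)   # undo the update step
        b = a + d          # undo the predict step
        out.extend([a, b])
    return out


row = [10, 12, 11, 9, 200, 202, 50, 50]
low, high = haar_forward(row)
restored = haar_inverse(low, high)
```

The detail band is mostly small numbers on smooth content, which is what makes it cheap to quantize and entropy-code. Since each frame is coded without reference to other frames, nothing has to wait for a neighbouring frame, and that is where the low latency comes from.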
Definitely a neat codec! You can get COTS hardware en/decoders that use it via https://atlona.com/omnistream-av-over-ip/.
Do you mind going into some detail as to why they suck? Not a dig, just genuinely curious.
95% GPU usage, but only 2x faster than the reference SIMD encoder/decoder.
What I wonder is, how do you get the video frames to be compressed from the video card into the encoder?
The only frame capture APIs I know of take the image from the GPU to CPU RAM, and then you can put it back into the GPU for encoding.
Are there APIs which can sidestep the "load to CPU RAM" part?
Or is it implied that a game streaming codec has to be implemented with custom GPU drivers?
Some capture cards (Blackmagic comes to mind) have worked together with NVIDIA to expose DMA access. This way video frames are automatically transferred from the card to the GPU memory bypassing the RAM and CPU. I think all GPU manufacturers expose APIs to do this, but it's not that common in consumer products.
> Are there APIs which can sidestep the "load to CPU RAM" part?
On Windows that API is Desktop Duplication. The API delivers D3D11 textures, usually in BGRA8_UNORM format. When HDR is enabled you would need a slightly different API method, which can deliver HDR frames in RGBA16_FLOAT pixel format.
There's also Windows.Graphics.Capture. It lets you get a texture not only for the whole desktop, but also for individual windows.
On Linux you should look into GStreamer and dmabuf.
In your experience, how does VC-2 compare to JPEG XS from a quality perspective? The JPEG XS resources I’ve seen say JPEG XS has higher visual quality, but curious what it’s like in practice.
JPEG-XS is an almost direct successor to VC-2. They use the same techniques, and if you read JPEG-XS's whitepaper they explicitly cite VC-2 as an inspiration and a target to surpass. JPEG-XS is an improvement, there is no doubt about that, but unfortunately they decided to patent it for all uses. In both cases, the publicly available software implementations are very few and CPU-based, and the ones that aren't are implemented in hardware inside business AV solutions.