Transcoding vs Encoding
The terms are often used interchangeably. Strictly:
- Encoding is the act of compressing raw video data into a codec format (e.g. H.264).
- Transcoding is the act of decoding an already-encoded video and re-encoding it, usually to a different codec, bitrate, or resolution. (Changing only the container without re-encoding is remuxing, not transcoding.)
Most “encoding” in cloud video platforms is technically transcoding — your uploaded MP4 (already encoded by your camera or NLE) is decoded and re-encoded into multiple HLS variants.
Why platforms transcode uploads
Source files vary: 4K H.264, 1080p HEVC, 8K ProRes, AV1, VP9, and legacy codecs from older cameras. The platform needs:
- Multiple resolution variants for adaptive bitrate streaming
- A predictable codec for player compatibility (usually H.264)
- HLS-segmented output rather than single-file MP4
- Optionally, encryption applied during the encode step
To produce all that, the platform decodes the source (using FFmpeg, GStreamer, or a hardware decoder) and re-encodes — i.e., transcodes.
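As a rough illustration of the requirements above, the sketch below builds one FFmpeg command per variant, each producing H.264 video in HLS segments. The ladder (heights and bitrates), the file names, and the 6-second segment length are hypothetical values chosen for the example, not AVCaption's actual settings, and encryption is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    name: str        # label used in output file names
    height: int      # target height in pixels
    video_kbps: int  # target video bitrate


# Hypothetical ladder; real platforms tune these values per title or per tier.
LADDER = [
    Variant("1080p", 1080, 5000),
    Variant("720p", 720, 2800),
    Variant("480p", 480, 1400),
]


def hls_variant_command(source: str, out_dir: str, v: Variant,
                        segment_seconds: int = 6) -> List[str]:
    """Build an ffmpeg argument list that transcodes `source` into one HLS variant."""
    return [
        "ffmpeg", "-y", "-i", source,
        # Scale to the target height; -2 keeps the aspect ratio with an even width.
        "-vf", f"scale=-2:{v.height}",
        # Predictable codec for player compatibility.
        "-c:v", "libx264", "-preset", "medium", "-b:v", f"{v.video_kbps}k",
        "-c:a", "aac", "-b:a", "128k",
        # Segmented HLS output instead of a single MP4.
        "-f", "hls", "-hls_time", str(segment_seconds),
        "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/{v.name}_%03d.ts",
        f"{out_dir}/{v.name}.m3u8",
    ]


commands = [hls_variant_command("upload.mp4", "out", v) for v in LADDER]
```

Each list can be passed to subprocess.run(cmd, check=True) on a machine with FFmpeg and libx264 installed. A complete pipeline would also write a master playlist that references the per-variant playlists; that step is omitted here.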
Hardware vs software transcoding
- Software transcoding (FFmpeg + libx264): flexible and runs on any platform, but slower than hardware. Expect roughly 2–4× realtime for the libx264 medium preset at 1080p on a recent x86 CPU, less with slower presets and more with faster ones.
- Hardware transcoding (NVIDIA NVENC, Intel Quick Sync, AMD VCN): fast, but limited to a fixed feature set and dependent on supporting GPU hardware. AVCaption uses NVENC and reaches roughly 10–20× realtime.
Hardware is dramatically faster but slightly less compression-efficient at the same quality. For a video hosting platform shipping multiple variants quickly, hardware is the right trade-off.
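To make the speed difference concrete, the arithmetic below converts the realtime factors quoted above into wall-clock time for a single variant of a 60-minute source. The function and constant names are illustrative, and the ranges are the rough figures from this section rather than measured benchmarks.

```python
from typing import Tuple


def transcode_minutes(duration_minutes: float,
                      realtime_range: Tuple[float, float]) -> Tuple[float, float]:
    """Return (best_case, worst_case) minutes for a (slowest, fastest) realtime factor."""
    slowest, fastest = realtime_range
    if slowest <= 0 or fastest < slowest:
        raise ValueError("realtime_range must be (slowest, fastest) with slowest > 0")
    return (duration_minutes / fastest, duration_minutes / slowest)


# Rough realtime factors taken from the text above.
SOFTWARE_X264_MEDIUM = (2.0, 4.0)
HARDWARE_NVENC = (10.0, 20.0)

software = transcode_minutes(60, SOFTWARE_X264_MEDIUM)  # (15.0, 30.0)
hardware = transcode_minutes(60, HARDWARE_NVENC)        # (3.0, 6.0)
```

Under these assumptions, one variant of a one-hour video takes 15 to 30 minutes in software and 3 to 6 minutes on NVENC. The total grows with the number of variants unless they are encoded in parallel.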
When you’d “just encode”
When the input is raw, uncompressed video (e.g., from a frame grabber or a render pipeline), there is nothing to decode, so “encoding” is the accurate term. AVCaption’s pipeline almost always operates on already-compressed source files, so it’s transcoding in the strict sense.
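One way to tell the two cases apart in practice is to inspect the source with ffprobe. The sketch below is a minimal example, assuming ffprobe is installed and that uncompressed video is reported under the codec name "rawvideo"; the function names are hypothetical and this is not a description of AVCaption's own pipeline.

```python
import json
import subprocess

# Assumption: ffprobe reports uncompressed frames under the codec name "rawvideo".
RAW_CODECS = {"rawvideo"}


def probe_video_codec_json(path: str) -> str:
    """Run ffprobe and return its JSON report for the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name", "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def operation_for(ffprobe_json: str) -> str:
    """Return "encode" for raw input and "transcode" for already-compressed input."""
    streams = json.loads(ffprobe_json).get("streams", [])
    if not streams:
        raise ValueError("no video stream found")
    codec = streams[0].get("codec_name", "")
    return "encode" if codec in RAW_CODECS else "transcode"
```

For a real file, the call would be operation_for(probe_video_codec_json("upload.mp4")). Keeping the parsing separate from the ffprobe call makes the decision logic easy to test without any media files.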