Copyright © 2017 Brightcove, Inc. All Rights Reserved.
Large-scale cloud encoding
Yuriy Reznik, Brightcove, Inc.
DASH-IF workshop, Comcast media center, Centennial, CO
August 10, 2017
OTT content publishing
– chain of processing steps
– single rate vs ABR / multi-rate encoding
– live vs on-demand encoding
Moving things to cloud
– API & notifications
– parallelization of work
– managing cloud
– monitoring quality and costs
Optimizing transcoding system for ABR delivery
– pre-processing
– encoding ladder design
– setting right codec constraints
Outline
Typical processing chain:
Steps:
– origin – a feed from a contribution encoder, mezzanine from post-production, etc.
– ingest – process of acquisition of the content
– demux – decomposition of original container into a set of elementary streams and metadata
– transcode – process of decoding, conversion, and re-encoding of each media component
– mux – process of wrapping of each re-encoded elementary stream into delivery format (ISOBMFF for DASH)
• this step may also be combined with:
– splicing / dynamic ad-insertions
– applying encryption (as needed for DRM)
– segmentation – splitting the output into a sequence of segment files
– manifest – generation of a manifest (DASH .mpd) file describing all adaptation sets and representations
– CDN – uploading the encoded media files and manifests to a CDN for distribution
Notes:
– some of these steps (mux, DRM, manifest generation, CDN fill) can be done dynamically (as client accesses stream)
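As a minimal sketch of the chain above, the steps can be modeled as an ordered list in which each step records whether the notes allow it to run dynamically. The step labels and the `Step` type are illustrative names, not part of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    dynamic: bool  # True if the notes say the step can run as the client accesses the stream


# Order follows the step list above; names are illustrative labels only.
PUBLISHING_CHAIN = [
    Step("ingest", False),
    Step("demux", False),
    Step("transcode", False),
    Step("mux", True),
    Step("drm", True),
    Step("segmentation", False),
    Step("manifest", True),
    Step("cdn_fill", True),
]


def split_chain(chain):
    """Return (ahead_of_time, just_in_time) lists of step names."""
    ahead = [s.name for s in chain if not s.dynamic]
    just_in_time = [s.name for s in chain if s.dynamic]
    return ahead, just_in_time


ahead, just_in_time = split_chain(PUBLISHING_CHAIN)
print("ahead of time:", ahead)
print("on client access:", just_in_time)
```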
OTT content publishing
[Figure: Origin → Ingest → Demux → Transcode → Mux → Manifest → CDN]
Single-rate transcoding:
[Figure: mezzanine video ES → Decode → Video processing → Encode → delivery video ES, driven by encoding targets]
Transcoding for ABR:
[Figure: mezzanine video ES → Decode, followed by n parallel Video processing → Encode branches producing delivery video ES 1 … n, driven by the ladder of ABR targets from the encode request]
Key operations:
– ABR ladder design
– video processing
– encoding
Transcoding
Single transcoder process:
– executed on a local system
[Figure: transcoder process: mezzanine → Demux → Decode → A/V processing → Encode → Mux → delivery stream, controlled by job parameters]
Multiple transcoder processes executed in the cloud:
– executed on multiple systems
[Figure: the origin submits mezzanines and job parameters to the cloud; an API, an orchestration layer, and a dashboard manage multiple cloud instances, each running a transcoding process; delivery streams go to the CDN]
Key aspects:
– API
– Distribution of work
– Cloud orchestration
– Monitoring
– Ensuring reliability
– Minimizing costs
– etc.
Moving things to cloud
Types of parallelization:
– stream-level
• each stream is encoded on a dedicated instance
– segment-level
• stream is split into segments
• each segment is encoded on a dedicated instance
– instance-level
• each instance has multiple CPU/GPU cores
• encoding job is split and allocated to those cores
• codec-level parallelization
On segment-level parallelization:
– key challenges are to ensure that
• encoding quality is consistent across different segments
• concatenation of encoded segments produces an HRD-compliant stream
– simplification:
• for segment-based delivery (DASH, HLS) it is sufficient to align boundaries of encoding segments with boundaries of the final segments used for delivery. Since each delivery segment resets SPS/PPS, this avoids possible HRD mismatch problems.
• Example: delivery is done using either 4-, 6-, or 10-sec segments. Then by using encoding segments of lcm(4, 6, 10) = 60 sec we can ensure that their boundaries will coincide with boundaries of final delivery segments in all cases.
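The alignment rule in the example can be sketched as a least-common-multiple computation. This is a minimal illustration assuming integer segment durations in seconds; the function names are hypothetical.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def encoding_chunk_seconds(delivery_segment_seconds):
    """Shortest chunk length whose boundaries align with every delivery segment length."""
    return reduce(lcm, delivery_segment_seconds)


def chunk_boundaries(duration_seconds, chunk_seconds):
    """Start times (in seconds) of the encoding chunks covering the stream."""
    return list(range(0, duration_seconds, chunk_seconds))


chunk = encoding_chunk_seconds([4, 6, 10])
print(chunk)                         # 60
print(chunk_boundaries(150, chunk))  # [0, 60, 120]
```

Each chunk start is a multiple of 4, 6, and 10 seconds, so a chunk can be encoded on its own instance and still begin exactly on a delivery segment boundary.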
Distribution of work
As an example, below we show Zencoder API:
– Job creation:
• An HTTP POST to: https://app.zencoder.com/api/v2/jobs
• with JSON body containing job request:
{
  "input": "s3://zencodertesting/test.mov",
  "output": [
    {
      "label": "stream-1-240k",
      "audio_bitrate": 56,
      "audio_sample_rate": 22050,
      "base_url": "s3://my-bucket/",
      "decoder_bitrate_cap": 300,
      "decoder_buffer_size": 800,
      "max_frame_rate": 15,
      "public": 1,
      "type": "segmented",
      "video_bitrate": 200,
      "width": 400,
      "format": "ts"
    },
    …
  ]
}
– Notifications:
• Sent upon completion of the job. Job request may include, e.g.:
{
  "notifications": [
    "http://user:[email protected]/zencoder"
  ],
  …
}
API, notifications, monitoring
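A job-creation call of this kind could be assembled as in the sketch below, which only builds the HTTP request and does not send it. The body keys mirror the job request on the slide; the `Zencoder-Api-Key` header, the placeholder API key, and the notification URL `https://example.com/zencoder` are assumptions made for illustration and should be checked against the Zencoder documentation.

```python
import json
import urllib.request

ZENCODER_JOBS_URL = "https://app.zencoder.com/api/v2/jobs"


def build_job_request(api_key, input_url, outputs, notifications=()):
    """Build (but do not send) an HTTP POST carrying a JSON job request."""
    body = {"input": input_url, "output": list(outputs)}
    if notifications:
        body["notifications"] = list(notifications)
    return urllib.request.Request(
        ZENCODER_JOBS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Zencoder-Api-Key": api_key,  # assumption: header-based authentication
        },
        method="POST",
    )


# One rung of the ladder, using the parameter values shown on the slide.
stream_1 = {
    "label": "stream-1-240k",
    "audio_bitrate": 56,
    "audio_sample_rate": 22050,
    "base_url": "s3://my-bucket/",
    "decoder_bitrate_cap": 300,
    "decoder_buffer_size": 800,
    "max_frame_rate": 15,
    "public": 1,
    "type": "segmented",
    "video_bitrate": 200,
    "width": 400,
    "format": "ts",
}

request = build_job_request(
    "YOUR_API_KEY",                    # placeholder, not a real key
    "s3://zencodertesting/test.mov",
    [stream_1],
    ["https://example.com/zencoder"],  # hypothetical notification endpoint
)
print(request.get_method(), request.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```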
Key topics:
– optimizing job allocations
– ensuring reliability
• redundancy
• reliable ingest & upload techniques
• support for legacy & broken formats
– managing costs
• selecting cloud, region, instances, etc.
In addition, the transcoder by itself must be good!
Attention must be given to:
– pre-processing
– encoding ladder design
– setting right codec constraints
Optimizing cloud-based transcoding system
Key functions:
– understand what type of content it is (which may be different from the way it was previously encoded)
• progressive / interlace / telecine, cadence type, field order, etc.
– minimize artifacts introduced by prior generation encoder or sampling process
• blocking, ringing, broken lines, temporal noise, etc.
– perform conversion from source format to format needed for delivery. This includes conversions of:
• spatial resolution
• chroma sampling type (4:4:4, 4:2:2, 4:2:0, different kinds of 4:2:0)
• frame rate
• temporal sampling type (progressive, telecine, interlace, field order)
• color (gamma, matrix, primaries, EOTF, mastering display color volume, other display-related metadata, etc.).
Typical chain:
[Figure: Decoder → Content analysis → Artifact removal filters → Temporal sampling type conversion → Spatial resolution conversion → Color space conversion → Frame-rate conversion → raw video. The decoder supplies sequence and frame-level metadata, bitstream elements, SEIs, etc.; content analysis supplies the detected temporal sampling type/pattern, cuts, broken lines, black bars, etc.]
Pre-processing
Each ABR stream (“representation”) is basically a tradeoff between:
– bitrate $R$ used to send video over the network, and
– quality $Q$ achieved when video is rendered by the receiver
The set of $(R, Q)$ pairs achievable for a particular video sequence and codec is commonly called the quality-rate function $Q(R)$. It is normally monotonic and saturates as $R \to \infty$.
Special points:
– $(R_{\min}, Q_{\min})$ – lowest quality point at which service becomes feasible
– $(R_{\max}, Q_{\max})$ – saturation point (spending more bits does not produce visible improvements)
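One way to locate the saturation point from measured (bitrate, quality) samples is sketched below. The sample values are made up, and the marginal-gain threshold is an assumed stand-in for "no visible improvement"; neither comes from the talk. Bitrates are assumed to be distinct.

```python
def is_monotonic(points):
    """True if quality never decreases as bitrate increases."""
    ordered = sorted(points)
    return all(q2 >= q1 for (_, q1), (_, q2) in zip(ordered, ordered[1:]))


def saturation_point(points, min_gain_per_kbps):
    """First (R, Q) after which the quality gain per extra kbps falls below the threshold."""
    ordered = sorted(points)
    for (r1, q1), (r2, q2) in zip(ordered, ordered[1:]):
        if (q2 - q1) / (r2 - r1) < min_gain_per_kbps:
            return (r1, q1)
    return ordered[-1]  # no saturation observed within the sampled range


# Made-up measurements: (bitrate in kbps, quality score on an arbitrary 0-100 scale).
samples = [(200, 40.0), (400, 60.0), (800, 75.0),
           (1600, 85.0), (3200, 88.0), (6400, 89.0)]

print(is_monotonic(samples))             # True
print(saturation_point(samples, 0.005))  # (1600, 85.0)
```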
Notes on quality metrics:
– broadly available metrics (MSE, PSNR, SSIM, FSIM, etc.) measure only codec noise
• full-reference, specific to each resolution (spatial, framerate, sampling type, etc.)
• have no means for discrimination between resolutions, reproduction volume (HDR/SDR), etc.
– tools/models accounting for different resolutions, display & environmental parameters:
• VDP-derivatives, Tektronix PQA (PQR metric), SSIMWave SQM, etc.
– but… none of these metrics/models is perfect
• human-measured scores typically scatter around the values predicted by a metric
• the most one can infer from an objective quality score is the probability that the user's assessment of quality is above or below a given threshold.
Encoding: quality-rate tradeoffs
Hypothetical shape of quality-rate function:
Some existing metrics (Zhang, et al 2011):
Encoding ladder parameters:
– bitrates: $R = (R_1, \dots, R_n)$
– resolutions: $S = (S_1, \dots, S_n)$, $S_i = (w_i, h_i, f_i)$ (width, height, framerate)
– codec-related constraints: $C = (C_1, \dots, C_n)$
Typical ladder design constraints:
– bitrate monotonicity: $R_{\min} \le R_1 < \dots < R_n \le R_{\max}$
– bound on first rate/start-up latency: $R_1 \le R_{1,\max}$
– bitrate granularity constraints: $\gamma_{\min} \le \frac{R_{i+1}}{R_i} - 1 \le \gamma_{\max}$, $i = 1, \dots, n-1$
– resolution monotonicity: $|S_1| \le \dots \le |S_n|$, $|S_i| = w_i \cdot h_i \cdot f_i$
– codec noise thresholds: $q_{\min} \le q_{S_i}(R_i)$, $i = 1, \dots, n$
– quality monotonicity: $Q_{\min} \le Q(R_1) < \dots < Q(R_n)$
Optimal ladder design:
– find: number of points $n^*$, bitrates $R^* = (R^*_1, \dots, R^*_{n^*})$, resolutions $S^* = (S^*_1, \dots, S^*_{n^*})$, and constraints $C^* = (C^*_1, \dots, C^*_{n^*})$, such that:
$$\Phi(n^*, R^*, S^*, C^*) = \max_{\substack{n \in \mathbb{N},\; R \in \mathbb{R}^n,\; S \in \mathbb{R}^{3 \times n} \\ \text{+ all constraints}}} \Phi(n, R, S, C)$$
– where: $\Phi(n, R, S, C)$ is a figure of merit function
• e.g. best overall quality, lowest storage cost, lowest transcoding cost, highest efficiency (max number of pixels / encoded bit), etc.
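The rate and resolution constraints above can be checked mechanically for a candidate ladder. The sketch below is one possible checker, not the ladder generator mentioned in the talk; it omits the quality and codec-noise constraints because they need a quality model, and the `Rung` type, the parameter names, and the sample ladder are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    bitrate_kbps: float
    width: int
    height: int
    fps: float

    @property
    def sample_rate(self):
        """Pixels per second: w * h * f."""
        return self.width * self.height * self.fps


def ladder_violations(ladder, r_min, r_max, r1_max, gamma_min, gamma_max):
    """Return a list of human-readable constraint violations (empty if none)."""
    if not ladder:
        return ["ladder is empty"]
    problems = []
    if ladder[0].bitrate_kbps < r_min:
        problems.append("R1 is below R_min")
    if ladder[-1].bitrate_kbps > r_max:
        problems.append("Rn is above R_max")
    if ladder[0].bitrate_kbps > r1_max:
        problems.append("R1 exceeds the start-up bound R1_max")
    for i, (lo, hi) in enumerate(zip(ladder, ladder[1:]), start=1):
        if hi.bitrate_kbps <= lo.bitrate_kbps:
            problems.append(f"R{i + 1} is not above R{i}")
            continue
        step = hi.bitrate_kbps / lo.bitrate_kbps - 1
        if not gamma_min <= step <= gamma_max:
            problems.append(f"step R{i + 1}/R{i} - 1 = {step:.2f} is outside the granularity bounds")
        if hi.sample_rate < lo.sample_rate:
            problems.append(f"resolution decreases from rung {i} to rung {i + 1}")
    return problems


# Illustrative ladder; the values are not taken from the talk.
ladder = [
    Rung(300, 480, 270, 30),
    Rung(600, 640, 360, 30),
    Rung(1200, 960, 540, 30),
    Rung(2400, 1280, 720, 30),
]
print(ladder_violations(ladder, r_min=200, r_max=3000, r1_max=400,
                        gamma_min=0.5, gamma_max=1.5))
```

A search over candidate ladders could use such a check as a feasibility filter before evaluating the figure of merit.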
Optimizing design of ABR encoding ladder
VBR control
– for ABR streaming, streams must be encoded as capped VBR
• highly variable VBR encoding may confuse DASH clients and cause buffering or inefficient use of bandwidth
– typically, maximum bitrate cap is set to about 10-35% above average bitrate
• must be lower than next target bitrate in the ABR encoding ladder
– decoder buffer size must also be limited
– hrd_parameters() should be included in the bit-stream
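The capping rules above can be turned into a small helper that derives a peak bitrate and buffer size for one rung. This is a sketch under stated assumptions: the 25% default headroom is simply a value inside the 10-35% range quoted above, and sizing the buffer as a number of seconds at the peak rate is an illustrative choice, not something the slide specifies.

```python
def capped_vbr_settings(avg_kbps, next_rung_kbps=None, headroom=0.25, buffer_seconds=2.0):
    """Derive a peak bitrate cap and decoder buffer size for one ABR rung.

    headroom       -- cap as a fraction above the average bitrate (assumed default)
    buffer_seconds -- buffer size expressed as seconds at the cap (assumption)
    """
    cap_kbps = avg_kbps * (1 + headroom)
    if next_rung_kbps is not None and cap_kbps >= next_rung_kbps:
        raise ValueError("cap must stay below the next rung's target bitrate")
    return {
        "avg_kbps": avg_kbps,
        "max_kbps": cap_kbps,
        "buffer_kbits": cap_kbps * buffer_seconds,
    }


print(capped_vbr_settings(1000, next_rung_kbps=2000))
```

Rejecting a cap that reaches the next rung keeps the peak rate of each stream below the average rate of the stream above it, as the rule requires.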
GOP length and type
– GOP length must be shorter than or equal to the shortest feasible segment length
– GOP length may also be affected by the need to support ad-insertions/splicing, etc.
– closed GOP is most commonly used
Profiles & levels
– H.264 Baseline profile is needed for legacy devices (e.g. mobile devices prior to 2012)
– Main profile is adequate for streams of up to 720p
– High is better for 1080p and beyond
– Level must be sufficient to carry stream at given resolution, framerate, bitrate, CPB size
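As a simplified illustration of the level rule, the sketch below picks the lowest H.264 level whose frame-size and macroblock-rate limits cover a given resolution and framerate. The limits are a subset of Table A-1 of the H.264 specification (levels 3 to 5.1, with level 4.1 omitted because it differs from level 4 only in bitrate limits). The check deliberately ignores the bitrate and CPB limits that a full check also needs, and the function name is hypothetical.

```python
import math

# (level, MaxMBPS in macroblocks/s, MaxFS in macroblocks/frame); subset of H.264 Table A-1.
H264_LEVELS = [
    ("3", 40500, 1620),
    ("3.1", 108000, 3600),
    ("3.2", 216000, 5120),
    ("4", 245760, 8192),
    ("4.2", 522240, 8704),
    ("5", 589824, 22080),
    ("5.1", 983040, 36864),
]


def min_h264_level(width, height, fps):
    """Lowest listed level that fits the frame size and macroblock rate, or None."""
    frame_mbs = math.ceil(width / 16) * math.ceil(height / 16)  # 16x16 macroblocks
    mb_rate = frame_mbs * fps
    for level, max_mbps, max_fs in H264_LEVELS:
        if frame_mbs <= max_fs and mb_rate <= max_mbps:
            return level
    return None


print(min_h264_level(1280, 720, 30))   # 3.1
print(min_h264_level(1920, 1080, 30))  # 4
print(min_h264_level(1920, 1080, 60))  # 4.2
```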
Reference frames, B frames:
– for legacy devices (e.g. mobiles of 2012 and earlier) – no B frames, 1 reference
– most STBs can support up to 4 reference frames, 3 B-frames
Setting right codec constraints
Example level/profile combinations:
Example of uncapped VBR behavior:
Design of cloud-based encoding system poses a number of interesting engineering problems
These include:
– methods of parallelization of encoding work
• stream-level, segment-level, instance/codec-level
– distribution of work in the cloud
• allocation of instances
• managing their workloads
– minimization of costs
• considering different cloud service providers, their cost models
• considering different regions, etc.
– ensuring reliability
• redundancy
• reliable ingest, upload techniques
– optimization of encoding for ABR/OTT delivery
• pre-processing, ladder design, etc.
One can have much fun solving them, or just pick one of the existing solutions.
To give Zencoder a try, please go to https://zencoder.com/en/ and click “Sign Up”.
To check our ABR ladder generator, please email: [email protected]
Summary