Structuring Renditions for Simplicity

In a multi-device world, navigating the murky waters of video support can be tricky, especially when you’re trying to keep costs down and quality high.

Renditions and the Modern World

In its most basic form, online video consists of transcoding a single source file into a single output file that will play over the Web. Each of these video files is called a rendition, and an array of renditions defines how video will be delivered to end-users. When YouTube launched in 2005, it delivered a single output rendition through a basic player. Fast forward to 2013 and the world of online video is defined by HTML5/Flash players, ad-insertion, recommendation engines, paywalls, and anywhere from a handful to a boatload of renditions at different bitrates and in various formats. It may sound like a confusing mess, and it can be, but there are strategies that can simplify your approach to delivering video, shrink costs, and improve the viewer’s experience. It all starts with renditions.

You can download the series “Architecting a Video Encoding Strategy Designed for Growth” as a whitepaper.

Climb the Ladder

Imagine the world of devices as a wall. At the bottom of the wall are the least capable, most painful-to-use feature phones, with 3G connections and tiny screens. At the top of the wall, we have a brand-new HDTV with a fast Internet connection. Between the bottom and the top of the wall is a range of devices, each with different processors, GPUs, network connections and screen sizes. The height of the wall is determined by average content duration; the longer the duration, the higher the wall.

Renditions are like ladders that help us start anywhere along the wall and climb up or down smoothly. If the wall is high, there need to be more rungs on the ladder to ensure users can climb up and down smoothly. If the wall is short, we can get away with only a couple of rungs and still provide a good experience.

Step 1:  The First Ladder

The first step is to decide on a base format. The base format should be playable on a wide range of devices. It might not be the best choice on every device, but it should always be playable. The goal of online video is to get in front of everybody. Zencoder supports a wide swath of the most important output formats for Web, mobile and connected TVs. Valid use cases exist for each of these formats, but in the vast majority of cases MP4 is the best option because it plays on the widest range of devices. The first ladder we build will be based on the MP4 format.

Step 2:  Bitrates - Creating the Ladder’s Rungs

Now that we have decided which ladder to create first, we can begin constructing the rungs.
First, decide where on the wall the service should start and end. For example, consider a user-generated content site where the average video duration is one minute. Each file is small, so there is no need to worry about buffering or stream disruptions; the player should be able to download the whole stream in a few seconds. That means only a couple of renditions are needed, for example one HD and one SD.

On the other hand, consider a movie service with an average video length of 120 minutes. The files are large, which means the user’s device won’t be able to download the entire stream before playback starts. In addition, users generally have higher expectations for the quality of feature films. We need to create a number of renditions so users will be able to watch high-quality video when they have a strong network connection.
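To make the contrast between the two services concrete, here is a minimal sketch that spaces a chosen number of rungs evenly (by ratio) between a lowest and highest bitrate. The 250 and 4000 kbps endpoints, the rung counts, and the geometric spacing are all assumptions made for illustration; they are not Zencoder recommendations.

```python
def build_bitrate_rungs(lowest_kbps: int, highest_kbps: int, rung_count: int) -> list[int]:
    """Return rung_count video bitrates (kbps) from lowest to highest.

    Rungs are spaced by a constant ratio, so each step up the ladder is
    the same relative jump in bandwidth. This spacing is an illustrative
    choice, not a rule from the article.
    """
    if rung_count < 2:
        raise ValueError("a ladder needs at least two rungs")
    if not 0 < lowest_kbps < highest_kbps:
        raise ValueError("expected 0 < lowest_kbps < highest_kbps")

    ratio = (highest_kbps / lowest_kbps) ** (1 / (rung_count - 1))
    return [round(lowest_kbps * ratio ** i) for i in range(rung_count)]


# Short clips: a low wall, so one SD and one HD rung may be enough.
short_form = build_bitrate_rungs(250, 4000, 2)

# Feature films: a high wall, so add intermediate rungs.
feature_film = build_bitrate_rungs(250, 4000, 5)

print(short_form)    # [250, 4000]
print(feature_film)  # [250, 500, 1000, 2000, 4000]
```

The useful part is the shape of the decision: the endpoints fix where the ladder starts and ends on the wall, and the rung count grows with content duration.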

If the connection is poor, we still want them to be able to watch the video, and we want the experience to improve as soon as more bandwidth is available. Intermediate renditions make that possible by letting the player step up the ladder. The longer the content and the higher the quality, the more renditions are needed to provide a consistent viewing experience.

Step 3:  Defining the Rungs

We’ve created a nice, smooth ladder, but there is room for improvement. Aside from bitrate and resolution, H.264 has two other features that are used to target renditions at subsets of devices: profile and level.
Profile defines the set of encoding features used in a given rendition, and therefore how complex it is to decode, ranging from low to high complexity. The three most important profiles are baseline, main and high. Level defines the maximum resolution and bitrate that a rendition is guaranteed not to exceed. The three most important levels are 3.0 (SD/legacy mobile), 3.1 (720p/mobile), and 4.1 (1080p/modern devices).

At the bottom rung, we want the widest possible device support so that we can always deliver a playable video regardless of the device. That means we should choose either baseline 3.0 or main 3.1, and a fairly modest resolution, most likely somewhere between a small mobile resolution and 640x360. As we move up the ladder, we can gradually increase these values until we reach the top, where we can maximize video quality with 1080p high 4.1 renditions.
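One way to picture how profile and level narrow the ladder for a given device is the sketch below. The three rungs, their bitrates, and the `pick_rung` helper are hypothetical; only the profile names, the level numbers, and the 640x360 and 1080p endpoints come from the text above. A real player would make this choice through its own adaptive logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    label: str
    width: int
    height: int
    video_kbps: int
    profile: str   # "baseline", "main" or "high"
    level: float   # H.264 level, e.g. 3.0, 3.1, 4.1


# Higher rank means more decoding complexity.
PROFILE_RANK = {"baseline": 0, "main": 1, "high": 2}

# Illustrative ladder; bitrates and the middle rung are assumptions.
LADDER = [
    Rung("MP4_250", 640, 360, 250, "baseline", 3.0),
    Rung("MP4_1200", 1280, 720, 1200, "main", 3.1),
    Rung("MP4_4000", 1920, 1080, 4000, "high", 4.1),
]


def pick_rung(ladder, bandwidth_kbps, max_profile, max_level):
    """Pick the best rung a device can decode and its connection can sustain."""
    playable = [
        r for r in ladder
        if PROFILE_RANK[r.profile] <= PROFILE_RANK[max_profile]
        and r.level <= max_level
    ]
    if not playable:
        raise LookupError("no rendition is playable on this device")

    affordable = [r for r in playable if r.video_kbps <= bandwidth_kbps]
    if not affordable:
        # Poor connection: fall back to the bottom playable rung.
        return min(playable, key=lambda r: r.video_kbps)
    return max(affordable, key=lambda r: r.video_kbps)


# A legacy phone on a fast connection is still limited to the bottom rung.
print(pick_rung(LADDER, 5000, "baseline", 3.0).label)  # MP4_250
# A modern device on the same connection climbs to the top.
print(pick_rung(LADDER, 5000, "high", 4.1).label)      # MP4_4000
```

Keeping the bottom rung at baseline 3.0 is what guarantees the fallback branch always has something to return for any H.264-capable device.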

Step 4:  Formats - Duplicating Ladders

Now that our MP4s have been created, we have a stable base format, and customers can watch video on a variety of devices. We have a ladder to scale the wall. While MP4 is a strong baseline format, other formats can improve the user’s experience. For example, HLS allows a user’s device to automatically and seamlessly jump up and down the ladder. Since we’ve already created MP4s, and because MP4 is a standard format, we can easily repackage them into other formats.

In fact, this is such an easy task that Zencoder charges only 25% of a normal job to perform this duplication, or transmuxing, and it can be done nearly instantly alongside a group of MP4 encodings by using “source”, “copy_video”, and “copy_audio”. The “source” option tells Zencoder to reuse the file created under a given output “label”. So, if we create a file with “label”: “MP4_250”, all we need to do is use “source”: “MP4_250” to tell Zencoder to reuse this rendition. “copy_video” and “copy_audio” will then extract the elementary audio and video tracks and repackage them into an HLS-formatted file.
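The sketch below shows roughly what such a job body could look like when built in Python. The “label”, “source”, “copy_video” and “copy_audio” keys are the ones described above. The other keys (“input”, “format”, “video_bitrate”, “type”), the HLS_ label prefix, the 1200 kbps rung, and the input URL are assumptions for illustration and should be checked against the current Zencoder API documentation. The code only builds and prints the JSON; it does not submit a job.

```python
import json


def build_job(input_url, bitrates_kbps):
    """Build a job body with one MP4 and one transmuxed HLS output per bitrate."""
    mp4_outputs = [
        {
            "label": f"MP4_{kbps}",
            "format": "mp4",          # assumed key and value
            "video_bitrate": kbps,    # assumed key, in kbps
        }
        for kbps in bitrates_kbps
    ]

    hls_outputs = [
        {
            "label": f"HLS_{kbps}",       # hypothetical naming scheme
            "source": f"MP4_{kbps}",      # reuse the MP4 output with this label
            "copy_video": True,           # repackage video without re-encoding
            "copy_audio": True,           # repackage audio without re-encoding
            "type": "segmented",          # assumed key for segmented HLS output
            "format": "ts",               # assumed key and value
        }
        for kbps in bitrates_kbps
    ]

    return {"input": input_url, "outputs": mp4_outputs + hls_outputs}


# Placeholder input location; replace with a real source file.
job = build_job("s3://example-bucket/source.mov", [250, 1200])
print(json.dumps(job, indent=2))
```

Because each HLS output points back at an MP4 label in the same job, the expensive encode happens once per rung and the second ladder is only a repackaging step.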

We can do the same thing for Smooth Streaming. Almost instantly, and at a fraction of the cost, we’ve created two new ladders that let virtually anybody watch great-quality video.

Step 5:  Refine

The most important thing a video service can do is commit itself to constantly improving, revisiting, and refining its renditions. With the pace of online video accelerating by the day, what seems terrific today might only be sufficient next year. In a couple of years, it will be downright obsolete. Zencoder helps solve these issues by being a driving force behind the bleeding edge of video encoding technology. We are constantly updating and building our tools to make the encoding platform faster, more stable, and higher in quality. The next step is up to you: constantly testing new variations to find the best set of renditions for your users will result in a more stable and optimized delivery infrastructure and a more engaged user base.