What is Adaptive Streaming?

Live streaming is the future. For many people, it is already the present.

You can confidently expect everything to be internet-based in the not-too-distant future. This will include broadcast television, radio, events, news, presidential addresses, war declarations and marriage proposals, among others.

Getting data (bits) to travel in real-time across the earth and back is a big deal.

HTTP is everywhere

What is HTTP?

If you don’t know what a computer network is, or what a protocol is, head over to A Computer Network, from AFRAID: A RAID Primer.

HTTP (Hypertext Transfer Protocol) is basically a language and set of ‘rituals’ that two people (computers) follow to share data.



“What’s up?”

“Hate to trouble you, but I’ve got this REQUEST from this dude.”

“Sure, no problem. What’s the password?”

“Umm…he said *******.”

“Let’s get started.”

One of the foundations of the internet is the client-server model. Your computer or device is a client. If you click your mouse or push a button you expect something to happen.

A server has a Port (a window – like a fast food counter window), through which it waits to listen to any requests. When a client wants something, it sends a message through the port. The server responds.

Your ‘command’ or ‘request’ is executed if you meet the predefined ritualistic behavior guidelines.

Obviously, each website and server has its own rules for what constitutes a correct ritual, but they all speak the same protocol (think of it as language – they all speak English).

HTTP is one of those protocols – the most widely used. The request that the client sends (it could be anything: a request to view a web page, to log in, to view videos, etc.) is called an HTTP Request (like saying ‘please’ in English).
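To make the ritual concrete, here is a minimal sketch in Python using only the standard library. It starts a throwaway server on your own machine, sends it one HTTP Request and reads the reply. The names HelloHandler and fetch_once and the path /video/index.html are made up for illustration; real web servers are far more elaborate.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Toy server side (hypothetical): answers every GET with a short text body."""

    def do_GET(self):
        body = f"you asked for {self.path}".encode("utf-8")
        self.send_response(200)  # 200 means "OK, here you go"
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once(path):
    """Start a local server, send one GET request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)       # the client's HTTP Request
        response = conn.getresponse()   # the server's reply
        status = response.status
        body = response.read().decode("utf-8")
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once("/video/index.html"))
```

The client knocks on the port, asks for a path, and the server answers with a status code and some bytes. Everything else on the web is a variation on this exchange.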

When the client and server have shaken hands (agreed to communicate), a Session begins. When the information transfer is complete, the session ends. HTTP is a stateless protocol, which means once the deal is done neither side has to retain info about the transaction. One way people circumvent this is by using cookies.

What does it mean to stream something over HTTP?

HTTP is such a cool dude. Once the session starts, many things can pass between a client and server: HTML, images, video, JavaScript, viruses and so on. See how it is like English? You can use it or abuse it.

As long as a session is open, video is just another bit stream. In computer-terms it’s all ones and zeroes anyway. It only looks like video to us.

A video can be passed live between a server (the computer serving the video) and the clients (millions of people watching that game live) as long as each client-server session is uninterrupted.

There are other media streaming protocols, like RTSP. These are like other languages. Instead of speaking English, your client and server decide they’ll speak Chinese. Different set of rituals.

Here we are concerned with HTTP only. The great advantage of HTTP is that it is already the backbone of the internet, so why not use it (much like many non-English-speaking countries use English for official work anyway)?

What is adaptive streaming?

The problem of streaming is a problem of transferring bits in real time. Web pages, images and radio are one thing, but video is a whole different problem.

Video files are huge. At 8 Mbps, a 5-minute 1080p video needs 300 MB of storage (8 Mbps is 1 MB/s, times 300 seconds). If a million people watch the video at the same time, the server will have to dish out 300 TB, at about 1 TB/s.

1 TB/s????!!!!
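If you want to check the arithmetic yourself, here is a small Python sketch. It assumes decimal units (1 MB = 1,000,000 bytes) and a constant bit rate. The function name stream_cost is made up for this example.

```python
MEGABIT = 1_000_000  # bits, decimal units (an assumption of this sketch)


def stream_cost(bitrate_mbps, duration_s, viewers):
    """Return (file size, total bytes served, bytes per second served)."""
    bytes_per_second = bitrate_mbps * MEGABIT / 8   # 8 bits in a byte
    file_bytes = bytes_per_second * duration_s      # one copy of the video
    total_bytes = file_bytes * viewers              # one copy per viewer
    peak_bytes_per_second = bytes_per_second * viewers
    return file_bytes, total_bytes, peak_bytes_per_second


if __name__ == "__main__":
    size, total, rate = stream_cost(bitrate_mbps=8, duration_s=5 * 60, viewers=1_000_000)
    print(f"one copy: {size / 1e6:.0f} MB")
    print(f"all viewers: {total / 1e12:.0f} TB")
    print(f"server output: {rate / 1e12:.0f} TB/s")
```

Plug in your own bit rate, duration and audience size to see how quickly the numbers get out of hand.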

As you can imagine, covering large-scale live events is a huge challenge. We haven’t even considered a typical user’s internet connection, which can be as low as 1 Mbps. Even when one has a 4 Mbps broadband connection, the actual sustainable speed might only be half of it. How do you find a common ground?

The traditional method to find the balance between video quality (the more you compress the worse it gets) and download speeds is to offer many different options to the end user, like YouTube does. This is what it looks like:

[Figure: Adaptive streaming]

Don’t be scared. All will be explained.

Your video is sent to an Encoder that must transcode it into several versions at different bit rates on the fly. Each version of your video is a separate bit stream.

The practice of offering different bit rates of the same video in real time is called adaptive streaming. Doing it with stored video (like YouTube) is one thing, and doing it live (an encoder must crunch numbers like a monster) is another.

Each bit stream is divided into Segments (a few seconds each). A list of available bit streams and their segments is available in the Manifest File. It’s like a roster or a party guest-list.
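Here is a rough Python sketch of what such a guest-list could hold. This is not the real file format of any streaming system. The class names, the build_manifest helper and the URL pattern are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BitStream:
    """One version of the video at one bit rate (hypothetical structure)."""
    name: str
    bitrate_kbps: int
    segment_urls: list = field(default_factory=list)


@dataclass
class Manifest:
    """The guest-list: every bit stream and where its segments live."""
    segment_seconds: int
    streams: list = field(default_factory=list)


def build_manifest(bitrates_kbps, duration_s, segment_seconds):
    """List every segment of every bit stream, lowest bit rate first."""
    segment_count = -(-duration_s // segment_seconds)  # ceiling division
    manifest = Manifest(segment_seconds=segment_seconds)
    for kbps in sorted(bitrates_kbps):
        name = f"{kbps}k"
        # The URL pattern below is made up for this example.
        urls = [f"video/{name}/segment_{i:04d}.m4s" for i in range(segment_count)]
        manifest.streams.append(BitStream(name, kbps, urls))
    return manifest


if __name__ == "__main__":
    manifest = build_manifest([8000, 1000, 4000], duration_s=300, segment_seconds=4)
    for stream in manifest.streams:
        print(stream.name, len(stream.segment_urls), stream.segment_urls[0])
```

A 5-minute video cut into 4-second segments gives 75 segments per bit stream, and every bit stream is cut at the same points so the client can switch between them at a segment boundary.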

The client (your computer, mobile, tablet, whatever) references the Manifest often to see what the best quality available is based on your internet speed and device. When speeds are high, you get a higher bit rate video, and when speeds go low, you get a lower bit rate video. As a consumer, you always have access to the stream, and you can ‘adapt’ the stream to suit your device and internet speeds.
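The decision the client makes can be sketched in a few lines of Python. This is one simple rule among many, not how any particular player does it. The function name pick_bitrate and the 0.8 safety margin are assumptions made for this example; the margin is there because, as noted earlier, the sustainable speed is usually lower than the advertised one.

```python
def pick_bitrate(available_kbps, measured_kbps, safety=0.8):
    """Pick the highest bit rate that fits within a fraction of the measured speed.

    If nothing fits, fall back to the lowest bit rate so the stream keeps going.
    """
    budget = measured_kbps * safety
    affordable = [rate for rate in available_kbps if rate <= budget]
    return max(affordable) if affordable else min(available_kbps)


if __name__ == "__main__":
    ladder = [500, 1000, 2500, 4000, 8000]  # bit rates listed in the manifest
    for speed in (12000, 4000, 300):
        print(f"measured {speed} kbps -> play {pick_bitrate(ladder, speed)} kbps")
```

With this rule a 4 Mbps connection gets the 2.5 Mbps stream rather than the 4 Mbps one, which leaves some headroom for the connection to wobble.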

Adaptive streaming is used in different ways. Some important ones are Apple HLS (HTTP Live Streaming), Adobe Dynamic Streaming for Flash and Microsoft Smooth Streaming. The problem with these big dudes is that they tend to prefer ‘closed’ systems and their own proprietary technology and hardware.

And, they don’t like each other too much.

What is DASHing (Dynamic Adaptive Streaming over HTTP)?

Can we have adaptive streaming technology that can work in an automated fashion, dynamically changing, monitoring, connecting, disconnecting, analyzing, etc., in real-time? You bet.

DASH is the technology that:

  • Encodes your video into different bit rates.
  • Breaks down the video into different segments.
  • Uses HTTP to stream data.
  • Gives the client the ability to automatically select the best segments (best bit rate possible) to view in real-time (What? Do you want to pause every time your internet connection changes speed to select the best ‘feed’ from a menu?).
  • Avoids stalling or buffering or rebuffering during playback.
  • Provides a seamless viewing experience under changing network conditions.
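To see why automatic switching matters, here is a toy simulation in Python. It is a simplified model under loud assumptions, not the DASH algorithm: one segment is downloaded at a time, the network speed is constant within a segment, and the client naively guesses that the next segment will download at the speed of the last one. The function name simulate_playback and all the numbers are invented for this example.

```python
def simulate_playback(throughput_kbps, ladder_kbps, adaptive=True,
                      segment_s=4, safety=0.8):
    """Download one segment per throughput sample.

    Returns (bit rates chosen, seconds the viewer spent waiting, startup included).
    """
    buffer_s = 0.0                       # seconds of video ready to play
    waiting_s = 0.0                      # seconds with nothing to play
    choices = []
    estimate_kbps = min(ladder_kbps)     # start conservatively
    for actual_kbps in throughput_kbps:
        if adaptive:
            affordable = [b for b in ladder_kbps if b <= estimate_kbps * safety]
            bitrate = max(affordable) if affordable else min(ladder_kbps)
        else:
            bitrate = max(ladder_kbps)   # always insist on top quality
        download_s = bitrate * segment_s / actual_kbps
        # Playback drains the buffer while the segment downloads.
        if download_s > buffer_s:
            waiting_s += download_s - buffer_s
            buffer_s = 0.0
        else:
            buffer_s -= download_s
        buffer_s += segment_s            # the new segment is now playable
        choices.append(bitrate)
        estimate_kbps = actual_kbps      # naive guess for the next segment
    return choices, waiting_s


if __name__ == "__main__":
    ladder = [1000, 4000, 8000]
    network = [12000, 12000, 2000, 2000, 12000]  # speed dips in the middle
    print("adaptive:", simulate_playback(network, ladder, adaptive=True))
    print("fixed:   ", simulate_playback(network, ladder, adaptive=False))
```

In this made-up scenario the adaptive client waits far less than the one stuck on the top bit rate, but it still stalls once when the speed drops without warning. Real players use smarter estimates and bigger buffers to shrink that gap.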

As a certain Mr. A. Powers would say: It’s DASHing, baby. 

To know more about how DASH is implemented, read What is MPEG-DASH?

What is Live Streaming?

Here’s my definition:

Live streaming is the practice of

  • transferring multimedia content of an event in real-time, as it happens,
  • using dynamic adaptive streaming technology over HTTP (DASH),
  • to many clients via a server, without burdening the end user with technology,
  • while providing a seamless viewing experience similar to live television.

Like I said, it’s the future.