
Master’s Thesis

Live Streaming Support on Mobile Networks for Android and iOS

Jon Andersen & Simon Ekvy

Department of Computer Science Faculty of Engineering LTH Lund University, 2013

ISSN 1650-2884 LU-CS-EX: 2013-09


Live Streaming Support on Mobile Networks for Android and iOS

Jon Andersen, [email protected]

Simon Ekvy, [email protected]

May 30, 2013

Master’s thesis work carried out at Axis Communications.

Supervisors: Anders Persson, [email protected], and Gunnar Erlandsson, [email protected]

Examiner: Mathias Haage, [email protected]


Abstract

Mobile devices are becoming increasingly important, and users are getting used to having access to all their data and systems in their smartphones. Of course, this also includes video surveillance systems. However, iPhone and Android smartphones have limited support for Axis video streaming formats in their native browsers, and there is a need for apps in order to offer a good user experience for remote mobile access.

The purpose of this thesis is to investigate video streaming, network protocols and decoding of video in smartphones. Furthermore, this thesis will look into video quality and how to ensure it on mobile networks.

Several different approaches and architectures for streaming media apps are presented, and the most promising approaches are selected to be prototyped and then evaluated. Since mobile networks present a new area for Axis cameras, the quality and performance of these networks with the Axis cameras are evaluated. We created predefined profiles for different networks such as EDGE, 3G and WiFi, and an algorithm for automatic quality switching between these profiles. However, switching to an optimal profile for best quality requires context awareness. This is possible through several methods, some of which we investigate in this thesis. Based on the results of the evaluations, we make recommendations on the best approach for developing a streaming media app for an Axis camera and propose a quality switching algorithm to handle the varying quality of mobile networks.

Keywords: iOS, Android, Live Streaming, RTP/RTSP, MediaCodec, FFmpeg, Video Quality


Acknowledgements

We would like to express our gratitude to our supervisors Anders Persson and Gunnar Erlandsson at Axis Communications for their support and invaluable feedback. We also want to thank Martin Gunnarsson for his input and help. Furthermore, we would like to thank Mathias Haage, our supervisor at the Department of Computer Science at Lund University, for his guidance and input during the work on the thesis.


Contents

1 Introduction
   1.1 Purpose and goal
   1.2 Previous Work
   1.3 Tool Setup
   1.4 Scenario
   1.5 Limiting factors
   1.6 Streaming
   1.7 Playback

2 Background
   2.1 Protocols
       2.1.1 Streaming media over TCP and UDP
       2.1.2 RTSP
       2.1.3 RTP
       2.1.4 HTTP Live Streaming (HLS)
   2.2 Compression
       2.2.1 JPEG and MPEG
   2.3 Authentication
   2.4 Wireless Networks
   2.5 Network Cameras
   2.6 Smartphones
   2.7 Video Quality

3 Support Analysis
   3.1 Android
       3.1.1 FFmpeg and Gstreamer
       3.1.2 Proxy RTSP
       3.1.3 MediaCodec API
       3.1.4 Conclusion
   3.2 iOS
       3.2.1 FFmpeg
       3.2.2 Proxy RTSP
       3.2.3 Conclusion

4 Design and Development
   4.1 Android Proxy
       4.1.1 Android UDP Proxy
       4.1.2 Android HTTP Tunneled Proxy
   4.2 Android and MediaCodec
   4.3 iOS and FFmpeg
   4.4 Result

5 Streaming and Video Quality
   5.1 Network speed
   5.2 Basic Algorithm Concept
   5.3 Video Quality parameters bit rate use
   5.4 Mobile Profiles
   5.5 App design
   5.6 Results

6 Discussion
   6.1 App discussion
   6.2 Video Quality Discussion
   6.3 Future Work
   6.4 Conclusion

Appendix A Full size figures

Appendix B Code Snippets
   B.1 iOS detect network

Appendix C How to compile FFmpeg
   C.1 Compiling for iOS

Appendix D Decoding with the MediaCodec API
   D.1 The basics
   D.2 Sample code


Chapter 1

Introduction

This chapter will give a general overview of the goals of this thesis, the setup, the limitations and previous work. The work of this thesis was carried out at Axis Communications. Axis is a Swedish company, founded in 1984 in Lund, that manufactures network cameras. At present, Axis is the global market leader in the network video market, with a market share of 31% [1].

1.1 Purpose and goal

The overall goal of this thesis is to investigate the current support on smartphones for live streamed video and to propose how to implement it on iOS and Android. Based on the investigation, the most promising solutions will be the focus of this thesis. Furthermore, this thesis will look into video quality and how to ensure it on mobile networks. The main features the apps aim to provide are:

• Playback of live streamed video (RTSP/H.264 streaming, authentication and decoding)

• Quality control and reliability of live stream video

1.2 Previous Work

While there are apps developed for both the Android and iOS platforms, there is not much documented work on how to develop RTSP streaming for these platforms. Custom media players have previously been developed for the Android platform [12], but this was done when Android was in its infancy and much has changed since.

On the topic of quality there is a large amount of research. This thesis focuses on the Axis camera and tries to develop suitable solutions. In particular, the solutions proposed in [4][3] use the server to handle the quality. This is currently not possible on the Axis camera and, as observed in [4][3], it puts more load on the server, which is not desirable for a network camera. In [3] buffers are used to detect when a quality switch is needed, while [4] suggests an upswitch algorithm based on "distinct characteristics in the mobile links". In [16] the suggested method is to gather statistics on bandwidth in a particular context, in that case while commuting in Oslo. With this information it is possible to decide when to change quality based on where the user is and what the expected bandwidth will be at that point. In [21] the author suggests ways to calculate an acceptable bit rate for a given video, which could potentially be used as a starting point for automatically deciding what bit rate is needed. Another way to handle changes in bandwidth is to use HTTP streaming, such as Apple's HLS [37]; however, this is more suitable for non-live material, such as YouTube or Netflix.

1.3 Tool Setup

The setup for the live streaming environment is shown in Figure 1.1. The Axis camera captures video and transmits a live video stream to connected devices through a network using wired or wireless transmission. The smartphone receives the live video stream through WiFi or a mobile network and displays it.

Figure 1.1: Live streaming environment


1.4 Scenario

The result of our investigation will be advice on the implementation of such apps, together with prototypes. These apps would be used to view IP cameras from a smartphone anywhere there is internet access.

Preparation An Axis camera is installed on a network. Accessible IP addresses are obtained and authentication is configured.

Setup The app connects to the camera with the IP address and the correct credentials.

Use When the app has successfully connected to the camera, a live view of the streamed video or prerecorded media is displayed.

1.5 Limiting factors

The foremost limiting factor was time. The thesis had a time span of 20 weeks in which developing the apps, researching quality and writing the thesis had to be done.

Since the thesis was aimed at Axis cameras, the hardware was a limiting factor. We could not change any protocols or server-side implementations, and thus the clients have to conform to the Axis camera.

1.6 Streaming

There are three main categories of streaming media [17]:

Stored media streaming when streaming stored media files, such as movies or music, from a server.

Live streaming when a client is listening to a broadcast live stream, such as live TV or, in our case, an Axis camera live feed.

Interactive live streaming when a client is interacting with a live stream, such as with Skype or with Axis cameras that support remote control of the camera.

While there are several protocols for streaming, we will only focus on those related to live streaming.

Figure 1.2: Download and Play

Download and play is, as the name suggests, the simplest form of accessing the video content. It downloads the entire video and only after this starts playback, see figure 1.2.


Figure 1.3: Traditional Streaming

As the entire file needs to be available on the client side before playing it, this excludes the use of live video such as with the Axis camera [2].

In traditional streaming the video is transferred to the client in a stream of packets which are interpreted and rendered in real time as they arrive, see figure 1.3. There is a need to keep track of the state of the client application, such as PLAY, PAUSE and the current progress of the streaming. RTSP is the standardized protocol that supports this, and it is the protocol that the Axis camera uses [2].

Figure 1.4: Progressive Download

A hybrid approach between download and play and traditional streaming is progressive download. It uses the HTTP protocol instead of RTSP and RTP as traditional streaming does. Instead of downloading the entire file, segments of the media file are downloaded, which creates a virtual stream and enables the client to start playing the media before it is fully downloaded, see figure 1.4 [2].

Figure 1.5: HTTP Adaptive Streaming


Building on the progressive download of chunks of media, HTTP Adaptive Streaming makes use of the HTTP protocol and enables live streams. Contrary to progressive download, the chunks are much smaller and can be compared to streaming of packets, much like in the case of RTSP and RTP, see figure 1.5 [2].

1.7 Playback

There are three steps in the playback of media files: gathering media data, decoding the audio and video streams, and finally displaying the decoded data. The media player design that this thesis will use is based on a hierarchy where each layer completes its assigned task. The layers in the media player are the data extract layer, the pretreatment layer, the decode layer and finally the user interface. The layers can be seen in Figure 1.6.

Figure 1.6: A conceptual Media Player

The first layer in the media player is the data extract layer, which is responsible for reading media files. It either reads a local file or reads from a remote stream with a protocol such as RTSP.

The pretreatment layer demuxes the incoming media and stores the demuxed information in the buffer. The layer ensures that the buffer receives entire media frames, as the underlying protocols might receive partial frames.

Decoding of the media is done in the decode layer. The decode layer ensures that the decoded frames remain synchronized at the correct frame rate. Decoding can be done either hardware accelerated or in software, but the latter can be a very CPU intensive task. Normally media players have a buffering step here where they accumulate a few seconds worth of data; this ensures that the video can continue to play even if the network has some minor issues such as varying bandwidth. For live video this step has to be minimal to reduce the delay from the camera source to rendering on screen. The implementation of the final layer varies between devices. Rendering methods range from using device specific classes to using OpenGL or SDL (Simple DirectMedia Layer) [12].
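To make the layering concrete, the sketch below outlines how the four layers could be expressed as interfaces. The names and signatures are purely illustrative; they do not correspond to any Android or iOS API or to the thesis implementation.

public class ConceptualMediaPlayer {

    interface DataExtractLayer {                 // reads a local file or a remote stream (e.g. RTSP)
        byte[] readPacket() throws Exception;
    }

    interface PretreatmentLayer {                // demuxes and assembles whole frames in a buffer
        void push(byte[] packet);
        byte[] nextCompleteFrame();              // returns null until a full frame is available
    }

    interface DecodeLayer {                      // software or hardware accelerated decoding
        int[] decodeToPixels(byte[] frame);      // e.g. ARGB pixel data at the correct frame rate
    }

    interface UserInterfaceLayer {               // device specific rendering (OpenGL, SDL, ...)
        void render(int[] pixels, int width, int height);
    }
}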


Chapter 2

Background

This chapter will cover concepts and technologies used today in live streaming. The covered areas are mostly those related to our work, such as RTSP/RTP, H.264 and wireless networks.

2.1 Protocols

2.1.1 Streaming media over TCP and UDP

The User Datagram Protocol (UDP) is a connectionless and unreliable transport protocol. UDP provides process-to-process communication and sends packets between processes without any guarantee of delivery [6].

TCP (Transmission Control Protocol), unlike UDP, is a connection-oriented, reliable, stream transport protocol. Instead of sending packets between processes it sets up a virtual connection in which it sends a stream of bytes. For reliable transport, TCP uses an acknowledgment mechanism to check the arrival of messages and request retransmissions [6].

Using TCP for streaming media has some important advantages: firstly, stable and scalable rate control; secondly, it can eliminate packet losses. However, TCP is rarely used for streaming media, and then mainly to circumvent firewalls. The main issue with using TCP is the delivery guarantee, which effectively leads to increased delivery time due to retransmissions. TCP uses an Additive Increase Multiplicative Decrease (AIMD) rule: when no packet loss (congestion) is detected, transmission is increased at a constant rate, and when congestion is detected, transmission is halved. This rule gives a highly variable throughput, which is not very suitable for streaming media. In general, streaming is done over UDP as it allows for more flexibility in both rate and error control [7].


2.1.2 RTSP

Real-Time Streaming Protocol (RTSP), standardized in RFC 2326, is a protocol designed to establish and control streaming of media, acting as a "network remote control". RTSP is a stateful protocol that resides in the application layer and runs over TCP. It is designed to be very similar to the text-based stateless protocol HTTP, but as opposed to HTTP, RTSP maintains a session for each client. Both the RTSP client and server can issue requests to each other. All data (with one exception) is carried out-of-band by another protocol specified when establishing the session.

RTSP messages are text based; each line is terminated by a carriage return directly followed by a line feed (CRLF), and the message ends with an additional CRLF. A message can also contain a payload, for which the length is then specified in the value of "Content-Length". The following are the recommended and required commands that can be issued in an RTSP session [8].

Figure 2.1: RTSP session

OPTIONS returns a list of supported RTSP commands. This command does not influence the server state and can be used to keep the session alive by issuing it regularly.

DESCRIBE retrieves the Session Description Protocol (SDP) of the media source, which contains all media initialization information such as resolution and frame rate.

SETUP specifies the transport protocol and the ports to be used.

PLAY is used to request the server to start sending data.


PAUSE is used to temporarily stop the stream, which can then be resumed by the PLAY command.

TEARDOWN terminates the data delivery from the server.

An RTSP session typically looks like figure 2.1. The client connects to the server; first it requests all available commands by issuing the OPTIONS command, and by issuing DESCRIBE the client gets all information about the media stream. The client can now initiate the setup process with SETUP, sending its transport information with the message and receiving the server's transport information in the response. After a successful setup the client can request the server to start streaming the media with the command PLAY. The client can stop streaming by issuing either the PAUSE command, from which the stream can be continued by issuing PLAY, or terminate the stream permanently with the command TEARDOWN.
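As an illustration of the text-based exchange, the minimal sketch below issues OPTIONS and DESCRIBE over a plain TCP socket and prints whatever the server answers. The camera address is a placeholder, the media path follows the Axis-style URL used later in listing 4.1, and authentication and proper response parsing are omitted.

import java.io.*;
import java.net.Socket;

// Minimal RTSP request sketch: no authentication, no error handling, no SETUP/PLAY.
public class RtspSketch {
    public static void main(String[] args) throws IOException {
        String host = "192.168.0.90";                         // placeholder camera IP
        String url = "rtsp://" + host + "/axis-media/media.amp";
        try (Socket socket = new Socket(host, 554)) {         // 554 is the default RTSP port
            Writer out = new OutputStreamWriter(socket.getOutputStream(), "US-ASCII");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "US-ASCII"));

            // Every line ends with CRLF and each message ends with an extra CRLF.
            out.write("OPTIONS " + url + " RTSP/1.0\r\nCSeq: 1\r\n\r\n");
            out.write("DESCRIBE " + url + " RTSP/1.0\r\nCSeq: 2\r\nAccept: application/sdp\r\n\r\n");
            out.flush();

            String line;
            while ((line = in.readLine()) != null) {          // dump responses until the camera closes
                System.out.println(line);
            }
        }
    }
}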

2.1.3 RTP

Real-time Transport Protocol (RTP) defines a packet format to handle real-time traffic such as audio and video on the Internet. RTP was first published as an IETF standard (RFC 1889) in 1996, the current version being RFC 3550. RTP does not have any delivery mechanism and must be used with UDP or TCP. Figure 2.2 shows how the RTP packet is encapsulated within UDP or TCP packets. RTP resides in the transport layer above UDP/TCP and enhances them with synchronization, loss detection, payload and source identification, reception quality reporting and marking of events within the media stream. RTP consists of two parts, one which transports the data and one that controls the data, the Real-time Transport Control Protocol (RTCP). RTCP runs alongside RTP and provides periodic reporting of information which is used, for example, for lip synchronization between audio and video [18].

Figure 2.2: Video data stack

The RTP packet consists of 12 mandatory bytes and up to 60 optional bytes. The RTP data packet can be viewed in Figure 2.3. There are four main parts to the RTP packet. The first 12 bytes are the mandatory RTP header, which consists of packet information, a sequence number, a timestamp and the synchronization source identifier. The next part is the contributing source identifiers, consisting of 8 optional bytes which are used for identifying the sources that contributed to the RTP packet; normally there is only one source contributing to the packet. The following 8 bytes are an optional header extension which can be used for extending the RTP header. The third part consists of an optional payload header whose primary function is to provide error resilience to formats not designed to be sent over lossy packet networks. The final part is the payload itself, which contains one or more frames of the media data.

Figure 2.3: A RTP data transfer packet
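As a sketch of how the 12 mandatory header bytes can be unpacked, the class below reads the fields defined in RFC 3550. It assumes the packet has already been separated from its transport (a UDP payload or an interleaved TCP channel) and ignores the optional CSRC list, header extension and payload header.

import java.nio.ByteBuffer;

// Sketch: unpack the 12 mandatory bytes of an RTP header (RFC 3550).
public class RtpHeader {
    public final int version, payloadType, sequenceNumber;
    public final long timestamp, ssrc;
    public final boolean marker;

    public RtpHeader(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet);              // network byte order (big-endian)
        int b0 = buf.get() & 0xFF;
        int b1 = buf.get() & 0xFF;
        version        = b0 >> 6;                              // always 2 for current RTP
        marker         = (b1 & 0x80) != 0;                     // often marks the last packet of a frame
        payloadType    = b1 & 0x7F;                            // e.g. a dynamic type such as 96 for H.264
        sequenceNumber = buf.getShort() & 0xFFFF;              // detects loss and reordering
        timestamp      = buf.getInt() & 0xFFFFFFFFL;           // media clock (90 kHz for video)
        ssrc           = buf.getInt() & 0xFFFFFFFFL;           // synchronization source identifier
    }
}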

2.1.4 HTTP Live Streaming (HLS)

While several HTTP Adaptive Streaming protocols exist, we limit ourselves to discussing Apple's proposed protocol, called HTTP Live Streaming or HLS for short. The protocol consists of three parts: the server component, the distribution component and the client component.

The server component takes a media stream as input and encapsulates it in a format suitable for delivery, preparing the media for distribution. This creates chunks of media, and the server is also responsible for creating the index file containing the URLs of all the media chunks. For an ongoing live session Apple recommends listing three chunks with a duration of ten seconds each.

The distribution component is a standard web server responsible for accepting the client requests and delivering the prepared media to the client.

The client software requests media from the distribution component and receives an index file specifying the chunks. It then downloads the chunks and reassembles them so that the media can be played. The client can seamlessly change the quality based on the alternative streams specified in the index file [37].
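As an illustration, an index file for an ongoing live session with three ten-second chunks could look roughly like the following; the segment names and sequence number are placeholders, and a live playlist has no end marker since new chunks keep being appended.

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:10.0,
segment120.ts
#EXTINF:10.0,
segment121.ts
#EXTINF:10.0,
segment122.ts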

2.2 Compression

Compression is used to reduce image data with as little visible quality loss as possible. The reduction can be as much as 50% to 90% (depending on the compression algorithm used), and in some cases more, without any visible difference from the original image [20]. For a camera that supports 1080p at 30 frames per second with a color depth of 24 bits per pixel, the total uncompressed bandwidth is 30 ∗ 1920 ∗ 1080 ∗ 24 ≈ 1.49 Gbit/s, which is about 187 MB/s. Most users do not have a connection of 200 MB/s and will need efficient compression techniques.

2.2.1 JPEG and MPEG

There are several standard compression formats; among these, JPEG and MPEG are two basic compression standards. Since the first version of MPEG, new versions of the standard have been released offering improvements and new features, such as support for applications with lower bandwidth consumption or applications that require extremely high quality at high bandwidth. Other formats are H.261 and H.263, which are based on MPEG but are not standards and lack some of MPEG's advanced features [20].

JPEG compresses images while MPEG compresses video, which means that JPEG handles still images (see figure 2.5) and MPEG moving pictures (see figure 2.6) [20]. MPEG has both spatial and temporal compression; this means that MPEG takes into account that many adjacent images in a sequence are the same or almost the same [17].

Spatial compression is done with JPEG, where each frame in a video is compressed. Temporal compression handles a sequence of images where redundant images can be removed. The method used in temporal compression is to divide frames into three categories: I-frames, P-frames and B-frames [17].

Figure 2.4: A sequence of I-, B- and P-frames [20].

I-frame (intra coded frame) is self-contained, which means it is not dependent on any other frames and therefore it can be decoded by itself.

P-frame (predicted frame) contains the changes relative to the preceding frame, and therefore it requires the previous I-frame to be decoded.

B-frame (bidirectional frame) is related to both preceding and following I- or P-frames and requires both the I-frame and the P-frame to be decoded.


A video decoder decodes the stream frame by frame; only I-frames can be decoded independently, while B- and P-frames must be decoded together with their reference frames [20].

To set how many P-frames should be sent before a new I-frame, there is a setting, GOV (group of video) length, that can be configured. A lower GOV length means I-frames are sent more frequently, which increases the bit rate [15].

In figure 2.5 the Motion JPEG images are coded and sent as separate unique images (I-frames). Figure 2.6 shows how motion and dependencies between frames are taken into account. The first frame is the I-frame and the entire scene needs to be transmitted. The following frames are P-frames and only the motion, in this case the human, needs to be transmitted; the house and the scene can be derived from the I-frame.

Figure 2.5: JPEG video sequence [20].

Figure 2.6: MPEG video sequence [20].

A video that is represented by a sequence of JPEG pictures is called Motion JPEG (MJPEG). The advantages of MJPEG are that it uses the compression of JPEG, in which quality and compression ratio can easily be changed, and its independence of following and preceding frames, which means that frames can be dropped without any other frames being affected. The disadvantage is that it does not use temporal compression, it only handles still images; this results in a lower compression ratio than that of MPEG [20].

H.264 (also called MPEG-4 Part 10/AVC, for Advanced Video Coding) is an efficient video compression technique that uses the same compression principles as MPEG but with much more advanced algorithms. It can reduce the size of a video file by more than 80% compared with the Motion JPEG format and by 50% more than MPEG-4 Part 2 [20].


2.3 Authentication

In order to protect a server from unauthorized access, some kind of authentication method is required. Depending on what protocol is used, different methods can be applied. The two authentication methods specified for HTTP (RFC 2617) are basic and digest authentication.

To authenticate with the server using basic authentication, the client sends the userid and the password separated by a single colon; the entire string is then Base64 encoded. For the userid 'Aladdin' with password 'open sesame' the message sent would look as follows:

Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

The basic method does however not provide any encryption, and the password is essentially sent in plain text. Digest authentication aims to improve on basic authentication by sending a hash of the password to the server. Hashing the password does not provide any encryption; that has to be achieved with HTTPS or some other encryption mechanism. To authenticate with the server, the client first makes a request to the server. The server replies with a 401 Unauthorized message containing a nonce, which is used when hashing the userid and the password [9].
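As a sketch of the digest computation in RFC 2617 (without the optional qop and cnonce fields), the helper below derives the response value that goes into the Authorization header; the realm and nonce are the values returned in the server's 401 reply.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: RFC 2617 digest response without the optional qop/cnonce fields.
public class DigestAuth {
    static String md5Hex(String s) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(s.getBytes(StandardCharsets.ISO_8859_1));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    static String response(String user, String realm, String password,
                           String method, String uri, String nonce) throws NoSuchAlgorithmException {
        String ha1 = md5Hex(user + ":" + realm + ":" + password);   // the secret part
        String ha2 = md5Hex(method + ":" + uri);                    // the request part
        return md5Hex(ha1 + ":" + nonce + ":" + ha2);               // value sent back to the server
    }
}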

2.4 Wireless Networks

In this section the most commonly used technologies are examined, in particular WiFi, GPRS/EDGE, WCDMA/HSDPA and LTE.

2G Superseding the analog 1G mobile networks, the first call on GSM (Global System for Mobile Communications) was made in Finland in 1991. GSM is the most prominent 2G technology, with a market share of 80-85%, and it is a standard in Europe. Other notable technologies are IS-95 (cdmaOne) [22].

GSM does not provide internet access; this is provided through additions to the GSM network. GPRS (General Packet Radio Service) was introduced in 1999 and adds packet switching. While it requires some additional physical nodes, it is designed to integrate with the existing GSM infrastructure. It has a theoretical throughput ranging from 56 to 115 kbps [22].

EDGE (Enhanced Data rates for GSM Evolution) was deployed on GSM networks in 2003. It further improves GPRS by introducing several more timeslots, and as such provides an improved theoretical throughput of up to 473.6 kbps [23].

3G was marketed and designed as a way to make live video calls on the mobile network. UMTS (Universal Mobile Telecommunications System) is a 3G standard. W-CDMA (Wideband Code Division Multiple Access) is the most used technology in the UMTS family. It comes in two variants, either Time Division Duplex (TDD) or Frequency Division Duplex (FDD), with the latter being the most common implementation [25].


HSDPA (High Speed Downlink Packet Access) is a concept that provides higher throughput in UMTS networks, based on methods developed for EDGE. Current HSDPA deployments can support up to 42 Mbps, compared to the 384 kbps of the first UMTS release, and future development will allow speeds of up to 337 Mbps. In 2009 the most common configurations had a data rate of 3.6 Mbps or 7.2 Mbps [24].

4G is the fourth generation of mobile networks. There are two candidates, WiMAX and LTE (Long Term Evolution). LTE was first released in Stockholm and Oslo in 2009. It has a theoretical downlink of 299.6 Mbps, which is not enough to fulfill the 1 Gbps required to comply with the 4G specification. However, an upcoming version of LTE named LTE-A (Long Term Evolution Advanced) is expected to reach this speed in 2013 [26].

WiFi In addition to all the above technologies, mobile phones are usually capable of connecting to WiFi networks. But as opposed to mobile networks, these have a limited range, considerably limiting the movement of the user. Current theoretical throughput is around 600 Mbps, but standards currently being developed are planned to support up to 7 Gbps [27].

With TCP, network delay raises additional concerns. When measuring the capacity of a "network pipe" there is a concept known as the Bandwidth-Delay Product (BDP). The number of data packets that can be in transit (in flight or unacknowledged) in the network cannot exceed the BDP. The upper bound BDP is calculated as

BDP (bits) = total_available_bandwidth (bits/s) ∗ round_trip_time (s)

In effect, throughput is bounded by the BDP; the network pipe cannot be filled above the BDP since acknowledgments cannot be sent fast enough. When it is exceeded, packets may be dropped or queued [5].
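As a purely illustrative example, a link offering 7.2 Mbit/s with a round-trip time of 100 ms gives BDP = 7,200,000 × 0.1 = 720,000 bits, i.e. roughly 90 kB; no more than about 90 kB of data can be in flight before the lack of acknowledgments starts to limit the throughput.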

2.5 Network Cameras

A network camera is used to send video/audio over an IP network for live viewing or recording of video. IP cameras are often used for security surveillance, and it is possible to ensure that only authorized access is allowed. Network cameras can be seen as a combination of a camera and a server. The main components are a lens, an image sensor, one or several processors and memory. The camera has its own IP address and connects through either a wired or a wireless network connection. The Axis camera comes with a web server, FTP capabilities and email. When connecting to the camera through a web browser, the user can define access, configure the camera settings, and set resolution, frame rate and compression format. In addition to the capabilities already in the camera, Axis allows the installation of applications such as Cross Line Detection [15].

As Axis has several cameras, we focused on the cameras that support VAPIX 3.0 (the Axis API for the cameras). Specifically, the support of interest was Motion JPEG and H.264 for video encoding. Axis' newer line of cameras supports the following transport methods:


RTP: Packets sent without a control protocol and always multicast.

RTP + RTSP: RTP packets and RTSP messages are sent on separate sockets.

Interleaved RTP over RTSP: All RTP packets are sent on the RTSP socket. This is advantageous because it is relatively easy to configure a firewall to allow RTSP traffic. To interleave the RTP packets, the transport parameter is set to RTP/AVP/TCP;unicast. Each RTP packet is preceded by a $ sign (0x24) followed by transport channel and packet length information, to distinguish the RTP packets from RTSP messages (a sketch of this framing is shown after this list).

Interleaved RTP over RTSP over HTTP: The RTSP session is maintained over HTTP and the RTP packets are sent over the RTSP stream. Sending RTP over a single TCP connection, as in interleaved RTP over RTSP, is not always enough to reach a significant part of the internet population. Many users are on private networks where they have indirect access to the public internet via HTTP proxies. HTTP tunneling exploits the capabilities of HTTP GET and POST to send and receive RTP/RTSP data. RTSP messages must be Base64 encoded, otherwise a message can be interpreted as a malformed HTTP message [10].
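To illustrate the interleaved framing, the sketch below reads one interleaved unit from the RTSP TCP stream: a $ byte, one byte of channel id, a two-byte big-endian length and then that many bytes of RTP or RTCP data. Anything that does not start with $ belongs to a plain-text RTSP message. This is only an outline, not the thesis implementation.

import java.io.DataInputStream;
import java.io.IOException;

// Sketch: read one interleaved RTP/RTCP unit from an RTSP TCP connection.
// Framing: '$' (0x24), channel id (1 byte), payload length (2 bytes, big-endian), payload.
public class InterleavedReader {
    public static byte[] readInterleaved(DataInputStream in) throws IOException {
        int first = in.readUnsignedByte();
        if (first != 0x24) {
            // Not an interleaved frame; the byte is part of a plain-text RTSP message.
            throw new IOException("expected '$', got 0x" + Integer.toHexString(first));
        }
        int channel = in.readUnsignedByte();       // by convention: even = RTP, odd = RTCP
        int length = in.readUnsignedShort();       // payload size in bytes
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;                            // handed on to the RTP parser for channel
    }
}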

2.6 Smartphones

A possible definition of a smartphone is that it is an extension of a feature phone [14] that allows installation of new software, generally called apps. Compared to a feature phone, a smartphone comes with more processing and connectivity capabilities. Smartphones are used much like any computer, in addition to normal phone functionality such as calling and texting. Some of the mobile operating systems that modern smartphones use are Google's Android, Apple's iOS, Nokia's Symbian, RIM's BlackBerry OS and Microsoft's Windows Phone. These operating systems can be installed on a variety of different phone hardware [28].

Currently the market is dominated by Apple and Google, with Android having a share of almost 70% and iOS 20% [32]. However, there are indications that Apple dominates the corporate market [33]. Since Google and Apple have such a huge market dominance, this thesis will only focus on the Android and iOS platforms. As of this writing, the current version of Android is 4.2 Jelly Bean (API level 17) and the current version of iOS is 6.1.

Android is an open-source software environment built on the Linux operating system. Android was unveiled in 2007 and the first Android device was sold in 2008. The system architecture is built like a stack: at the base resides a Linux kernel, above it native libraries written in C/C++, and on top, applications and an application framework. Android applications are written in Java using the Android software development kit (SDK) [34].

Released in 2007 by Apple Inc., iOS is a mobile operating system licensed only for installation on Apple hardware. iOS has four abstraction layers: the Core OS layer, the Core Services layer, the Media layer and the Cocoa Touch layer. Apple provides an SDK to develop apps, which can be written in C, C++ or Objective-C [30].


2.7 Video Quality

While discussing quality, most people understand what it means based on the context; however, it is still ambiguous enough that it needs a proper definition for this thesis. In general, quality is defined as something measured against other things of a similar kind. This section aims to clarify what this thesis considers to be quality.

When we discuss quality and improving quality in this thesis, the aim is to point out the aspects specific to smartphones and how to improve them. Generally, a high quality video means that it plays as clearly and smoothly as possible, and gives the viewer a feeling of being there. While this definition might be sufficient for a normal video, we need to take extra consideration of surveillance video aspects such as latency and security. Furthermore, since we are building apps for smartphones, we must consider additional aspects that affect quality, such as battery usage, low quality networks and limited network data plans. When streaming video from an Axis camera the following quality parameters are configurable [19]:

Resolution is a parameter that has a significant impact on the quality. It affects both the size and the sharpness of a picture. The Axis cameras allow the client to choose from a set of resolutions.

Frames per second can be configured and affects the smoothness of the video. The eye can still understand the motion that happens at 15 fps, but it will be considered jerky. Below 10 fps the eye will only perceive a series of images with no motion.

Compression algorithms reduce bandwidth by exploiting redundant information or removing details.

GOV length determines how many P-frames are sent before the next I-frame. Setting a high GOV length saves considerable bandwidth, but with a tradeoff: there is a longer recovery time after the video has changed significantly (time before a new I-frame). Also, if there is congestion on the network, the video quality may decay.

The above parameters can be configured in many ways, and choosing too high a quality can result in lowering the perceived quality. For example, having too high a resolution on a low quality network might lower the frame rate and thus make the video feel sluggish. Setting the quality parameters too low can result in not utilizing the available bandwidth in the network. The optimal settings vary depending on the context, for instance screen resolution and bandwidth. Playing a video with a higher resolution than the screen will not give any higher perceived quality, it will only increase the bit rate [11].

In the Axis camera there are predefined profiles where these parameters are set to fit different quality requirements; for instance there are profiles named Quality, Bandwidth and Balanced. In chapter 5 we create our own profiles based on measurements of bandwidth in mobile networks and the impact the parameters have on bit rate.
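To make the idea of predefined profiles concrete, the sketch below shows one way such profiles could be represented in an app. The parameter values are invented placeholders for illustration; they are not the measured profiles derived in chapter 5.

// Sketch of per-network streaming profiles; all numbers are placeholders.
public enum MobileProfile {
    EDGE(320, 240, 10, 60, 50),      // small picture, low frame rate, high compression
    THREE_G(640, 480, 15, 32, 35),
    WIFI(1280, 720, 30, 32, 30);

    public final int width, height, framesPerSecond, govLength, compression;

    MobileProfile(int width, int height, int framesPerSecond, int govLength, int compression) {
        this.width = width;
        this.height = height;
        this.framesPerSecond = framesPerSecond;
        this.govLength = govLength;
        this.compression = compression;
    }
}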

When deciding which tradeoffs to make between frame rate, resolution and compression, the context needs to be taken into account. For video surveillance, higher resolution might be desired when parts of the image must be detectable, such as faces, and a lower fps might then be acceptable. On the other hand, in casinos a high fps is desired to be able to detect suspicious activities at the gaming tables [19].


Chapter 3

Support Analysis

Before starting development of prototypes we needed to research the current support on the platforms of interest. Since this is an area still in its infancy, much of the information on the subject is scattered across forums, developer pages, APIs, blogs and even source code.

The investigation in this thesis focuses on solutions that support the criteria below, set by Axis. We present several different solutions; only a selected few will be chosen for prototyping, development and further investigation.

• All streaming protocols that Axis cameras support

• Authentication of video streams

• Playback of video streams (H.264, MJPEG)

• If possible, uses native classes and APIs

On the market there are several third party libraries that can be used on iOS and Android. In many cases it can be advantageous to use a library instead of writing your own. There is no comprehensive guide on the support and availability of such libraries. In table 3.1 we present the libraries we found and the support they provide. For the iOS platform, libraries are not limited to Objective-C, as both C and C++ can also be used on iOS. The Android platform uses Java and supports C/C++ through the NDK (Native Development Kit).

FFmpeg is one of the most used libraries for encoding and decoding. It is used in several large projects such as VLC and MPlayer [39]. It supports streaming of both local and remote files/streams. FFmpeg comes with a command line tool that can be used to perform all functions. FFmpeg supports decoding of more than 90 media formats [12].


Name       RTSP  HTTP  Basic  Digest  MJPEG  H.264  Language
FFmpeg     Yes   Yes   Yes    Yes     Yes    Yes    C
JJmpeg     Yes   Yes   Yes    Yes     Yes    Yes    Java
Live555    Yes   Yes   Yes    Yes     No     No     C++
Netty      Yes   Yes   Yes    No      No     No     Java
Gstreamer  Yes   Yes   Yes    Yes     Yes    Yes    C (Java SDK)

Table 3.1: Third party library support for RTSP and decoding

JJmpeg is a Java binding to FFmpeg. The library seems to be maintained by only one committer and has recently not been actively developed [41].

Live555 is a library used for streaming media. While Live555 supports H.264, this is only through their media server application and not as a client [42].

Netty provides a library to simplify network tasks such as handling sockets, and it handles many transport protocols such as HTTP [43].

GStreamer SDK GStreamer is a library for constructing graphs of media-handling components and can be used to build multimedia applications. It provides an SDK for Android which includes tutorials and documentation and aims to be a stable library. There is also an SDK on the horizon for iOS [44].

The following two sections describe some of the solutions and ideas we found during our research. Some of these ideas will be prototyped and explored further in the next chapter. It would be interesting to compare all solutions, but time limits how much we can implement. Each solution has its pros and cons, which we use to select which ones to implement.

3.1 Android

As of Android 3.0 there has been support for RTSP streaming and H.264 playback [29]. However, the Android RTSP implementation only supports streaming over UDP and can only be accessed through the MediaPlayer API. Authentication is not supported and there is no indication that this, or an extension of the streaming support, will be added. These limitations in Android require either an extension of the available classes or a new solution.

Android provides a multimedia framework that handles all multimedia related tasks such as playback, recording and streaming, and also provides interfaces to decoders and encoders. The foundation of the framework is called Stagefright, which as of Android 2.2 replaced OpenCORE. Most interaction with Stagefright is abstracted through the MediaPlayer API via JNI (Java Native Interface) bindings, which enable calling methods in other languages.

It is possible to use Stagefright directly in the NDK, but it is not exposed by default and there is no official documentation or guarantee of behaviour. While it is possible to use Stagefright without the MediaPlayer API, we found it not to be very feasible; as of Android 4.1.1 there is a new API, the MediaCodec API, which exposes the decoders and encoders of the device. The MediaCodec API was made for the purpose of extending support for protocols and formats [31].

Figure 3.1: Android multimedia framework

3.1.1 FFmpeg and Gstreamer

On the market there are several third party libraries that can be used on Android, as described above. On Android there is the option to use either Java libraries or the NDK with C/C++ libraries. In addition to the libraries that can be used for iOS, there are Gstreamer, Netty and JJmpeg.

Figure 3.2: FFmpeg and Gstreamer in Android

FFmpeg provides a complete solution for streaming, authentication and decoding; however, for playback of the video stream we would need to implement a media player. It comes with the drawback that it relies on using the NDK, and in addition there is a lack of documentation. FFmpeg provides its own software decoders, so regardless of support for hardware decoding the software decoders are always available, and performance is only limited by processing power. From version 0.9, FFmpeg has added support for hardware decoding on Android [40], which greatly reduces the CPU usage of decoding.

Advantages of using FFmpeg

• Fully configurable (transport protocols, authentication, delay).

• Device independent; only processing power limits its performance.

Disadvantages

• Lack of documentation.

• Requires use of open source licensing.

• High code complexity: coding in C, which must be used with the NDK and JNI bindings.

Figure 3.2 shows the data flow from the Axis camera to rendering on the device screen. First, FFmpeg establishes the RTSP connection and possibly authenticates with basic or digest authentication. The TCP or UDP stream is then parsed and the video frames are extracted and decoded. Using JNI bindings, the decoded video frame can be fetched and rendered in the media player.

In contrast to FFmpeg, GStreamer provides an SDK with tutorials and documentation, which greatly helps developers. Programming with the GStreamer SDK is done in C using the Android NDK; the functions written can then be called from Java with JNI bindings. Both support the streaming methods provided by the Axis camera. Gstreamer uses the same software decoders as FFmpeg and can be configured to use hardware accelerated decoding with the MediaCodec API. The SDK has a lot of functionality, some of which can be used to help with buffering of a stream, rendering, and audio and video synchronization.

Advantages of using Gstreamer

• Fully configurable (transport protocols, authentication, delay).

• Available integration with the MediaCodec API for hardware decoding.

• Handles tasks such as buffering, playback and synchronization of video and audio.

• Device independent; only processing power limits its performance.

Disadvantages

• High code complexity: coding in C, which must be used with the NDK and JNI bindings.

Figure 3.2 shows how to use Gstreamer on Android. It works similarly to FFmpeg, but Gstreamer also provides playback.


3.1.2 Proxy RTSP

While FFmpeg and GStreamer look promising, let us not forget that Android already comes with a media player that at least has rudimentary support for playing RTSP streams. The MediaPlayer API has support for RTP (UDP) + RTSP, and the goal of this solution is to extend it with authentication for the following types: RTP (UDP) + RTSP, interleaved RTP over RTSP, and interleaved RTP over RTSP over HTTP. It is possible to create a proxy client which restreams the unsupported types into supported types and adds authentication.

Figure 3.3 shows the conceptual design of the proxy RTSP solution. The idea is to create a step in between the Android media player and the Axis camera and add the missing functionality. The media player establishes an RTSP connection to the proxy server on localhost. The server manipulates the RTSP stream to ensure that the Axis camera streams to the proxy client instead of the media player, and possibly adds authentication. The camera stream is sent to the proxy client, which in turn converts a TCP stream into a UDP stream (UDP streams are just forwarded) and sends it to the media player. The Android media player then decodes and renders the incoming video stream on the screen.

Figure 3.3: Proxy RTSP in Android

The proxy RTSP solution is a conceptually easy approach which only adds functionality to the existing classes and can be achieved with low complexity and a low lines-of-code count. The drawback is a lack of configurability, where for example the delay cannot be tweaked. While this solution uses optimized code specifically written for Android, we have observed that the media player classes behave differently on different devices.
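A minimal outline of the proxy idea is sketched below: a local server socket accepts the media player's RTSP connection, adds an Authorization header to each request and forwards it to the camera. The port, the camera address and the credentials are placeholders, and the relaying of responses and RTP packets is only indicated in comments; a real proxy also has to rewrite transport parameters. This is an illustration, not the thesis implementation.

import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;

// Sketch of the proxy concept: accept the local RTSP connection, inject
// an Authorization header and forward the request to the camera.
public class RtspProxySketch {
    public static void main(String[] args) throws IOException {
        try (ServerSocket local = new ServerSocket(8554);         // player uses rtsp://localhost:8554/...
             Socket player = local.accept();
             Socket camera = new Socket("192.168.0.90", 554)) {   // placeholder camera address

            BufferedReader fromPlayer = new BufferedReader(new InputStreamReader(player.getInputStream()));
            Writer toCamera = new OutputStreamWriter(camera.getOutputStream());

            StringBuilder request = new StringBuilder();
            String line;
            while ((line = fromPlayer.readLine()) != null) {
                if (line.isEmpty()) {                             // blank line ends an RTSP request
                    request.append("Authorization: Basic dXNlcjpwYXNz\r\n");  // base64 of "user:pass"
                    request.append("\r\n");
                    toCamera.write(request.toString());
                    toCamera.flush();
                    request.setLength(0);
                    // Here the camera's response would be read and relayed back to the player,
                    // and RTP/RTCP traffic would be forwarded on separate sockets.
                } else {
                    request.append(line).append("\r\n");
                }
            }
        }
    }
}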

Advantages

• Playback is handled by the existing media player classes.

• Uses optimized code specifically written for Android.

• Low complexity and lines-of-code count.

• Hardware decoding.

• Only uses Java.

Disadvantages

• Lack of configurability.

• Behaves differently on different devices.

3.1.3 MediaCodec API

There is the possibility to skip the media player altogether and write our own RTSP client, decoding video with the device decoder. There are two ways to access the decoders without the MediaPlayer API: either bypass the API and directly use functions in the underlying media framework, or use the new MediaCodec API. We will only focus on using the MediaCodec API.

Figure 3.4: MediaCodec API in Android

Figure 3.4 shows the conceptual design of the MediaCodec solution. An RTSP client establishes a connection to the Axis camera and possibly authenticates using either digest or basic authentication. The incoming stream is parsed and the video data extracted. Before the data can be sent to the MediaCodec API for decoding, the video data must be assembled into frames. The MediaCodec API uses the hardware to decode the frames and the resulting image is rendered by the media player.

With the restreaming solution, much of how the stream is handled cannot be changed. If more control over the implementation is required, then the MediaCodec API can be used; the decoding and rendering parts are handled by the API, but the other parts must be implemented.
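A condensed sketch of the decoding loop with the MediaCodec API is shown below (available from Android 4.1). The frame source and the resolution are placeholders; a real application also has to provide the SPS/PPS configuration data, handle end-of-stream and format changes, and run the loop on its own thread.

import java.nio.ByteBuffer;
import android.media.MediaCodec;
import android.media.MediaFormat;
import android.view.Surface;

// Condensed H.264 decoding loop with MediaCodec (Android 4.1+); error handling omitted.
public class MediaCodecSketch {
    void decodeLoop(Surface surface) throws Exception {
        MediaCodec codec = MediaCodec.createDecoderByType("video/avc");
        MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);  // placeholder size
        codec.configure(format, surface, null, 0);
        codec.start();

        ByteBuffer[] inputBuffers = codec.getInputBuffers();
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();

        while (true) {                                           // runs until the stream is torn down
            byte[] frame = nextAssembledFrame();                 // placeholder: one whole H.264 frame
            int inIndex = codec.dequeueInputBuffer(10000);       // wait up to 10 ms for a free buffer
            if (inIndex >= 0) {
                inputBuffers[inIndex].clear();
                inputBuffers[inIndex].put(frame);
                codec.queueInputBuffer(inIndex, 0, frame.length, 0, 0);
            }
            int outIndex = codec.dequeueOutputBuffer(info, 10000);
            if (outIndex >= 0) {
                codec.releaseOutputBuffer(outIndex, true);       // true = render to the surface
            }
        }
    }

    private byte[] nextAssembledFrame() { return new byte[0]; }  // placeholder frame source
}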

Advantages

• Access to the hardware decoder without using the media player classes.

• Gives full control of the whole flow, from setting up a video stream to rendering it on screen.

• Hardware decoding.

• Only uses Java.

Disadvantages

• MediaCodec is only available on Android 4.1.1+.

• Cost of maintaining code.

• Must implement the entire stack.

3.1.4 Conclusion

All these solutions are feasible, but each approach comes with its own advantages and disadvantages. When deciding which solution(s) to take further and use in an app, the solutions need to be compared and balanced against the needs of the specific app. The main differences are code complexity, latency and support on different devices.

Table 3.2 shows what support we found for our solutions on a variety of devices. This is a very small set of devices, but clearly the proxy solution has varying results. On some devices the proxy will not work even if the MediaPlayer does. The MediaCodec API worked on all the devices that have support for the API (Android 4.1.1 and up).

Manufacturer  Model     Android Version  MediaCodec  Proxy  MediaPlayer
Samsung       GT-I8190  4.1.2            Yes         Yes    No
Samsung       GT-I9100  4.1.2            Yes         No     No
LG            Nexus 4   4.2.2            Yes         Yes    Yes
Asus          Nexus 7   4.2.2            Yes         Yes    Yes
Sony          ST25i     4.0.4            No          No     No
Sony          LT26i     4.0.4            No          No     Yes

Table 3.2: Support for MediaCodec, Proxy and native MediaPlayer in Android

There are lots of apps on the market that use FFmpeg; in the iOS sections of this thesis FFmpeg is discussed further and a prototype app is developed. For Android we decided to further investigate the Proxy RTSP solution, because much of the functionality was already available (MediaPlayer, RTSP client); all that was needed was to add support for HTTP tunneling and authentication. MediaCodec was chosen because we wanted more flexibility than the proxy solution offered, and also because MediaCodec is the officially supported way to access the hardware decoder.

3.2 iOS

Currently iOS does not support RTSP streaming, and there is no indication that this will be implemented in the near future as Apple is currently advocating HLS [35]. H.264 decoding is supported [36], but it is only accessible through the media classes provided by Apple. There is no native solution for H.264 live streaming that Axis cameras are compatible with.


3.2.1 FFmpeg

Since iOS lacks native RTSP support, a client has to be implemented. Authentication would also have to be implemented, but since the RTSP client will be written anyway this would not pose much extra work. The lack of an accessible H.264 decoder leaves only one possible native solution, which is to completely ignore H.264 and use MJPEG instead. While constructing an H.264 decoder would be possible, it is certainly not feasible within the time constraint and it is completely outside the scope of this thesis.

Figure 3.5: FFmpeg in iOS

The solution to these problems is to use third party libraries. As shown in table 3.1, both Live555 and FFmpeg are third party libraries that provide streaming. However, FFmpeg provides decoding in addition to streaming and we deem it to be the better option, as it reduces the number of libraries used. If Live555 were to be used, a library to decode H.264 would also be needed. Using one library ensures that the context and data do not need any extra code to convert between libraries. With FFmpeg, in order to display the decoded video stream, a media player needs to be implemented. Figure 3.5 shows the conceptual design of FFmpeg in iOS.

Advantages of using FFmpeg

• Fully configurable (transport protocols, authentication, delay).

• Device independent; only processing power limits its performance.

Disadvantages

• Doesn’t use hardware decoding.

• Lack of documentation.

• Requires use of open source licensing.


3.2.2 Proxy RTSP

AV Foundation is an iOS framework that provides an Objective-C interface to create and play media. It allows for examining, creating, editing and encoding of media files. It also supports real time manipulation of video streams. AV Foundation is the only way to achieve hardware decoding on iOS, but only for a limited number of formats, and the supported streams are limited to HLS or a local file.

To be able to use the native classes and decoders, the solution we propose is to create a proxy on the phone and then restream the RTSP stream to Apple's supported format, HLS. This solution has similarities to the Android solution discussed in section 3.1.2, but only on an abstract level. Where the Android solution can simply restream the data with an appropriate transport protocol, the iOS solution needs to comply with the HLS standard discussed in section 2.1.4.

Figure 3.6: Proxy RTSP in iOS

The proxy would need an RTSP client to communicate with the camera and obtain the video frames. Once obtained, the proxy would need to save the frames in appropriately sized chunks to disk. These chunks would then be served to AVFoundation as an HLS stream. Figure 3.6 shows the conceptual design of this.

Advantages

• Playback is handled by the existing media player classes.

• Uses optimized code specifically written for iOS.

• Hardware decoding.

Disadvantages

• Lack of configurability.

• Inherent delay in the protocol.

• Intense usage of the disk.


3.2.3 Conclusion

Using AVFoundation would be preferred over FFmpeg, as it would enable hardware decoding and reduce external dependencies. To decode an RTSP H.264 stream with AVFoundation, a proxy is needed. While creating a proxy on the iOS device is not conceptually hard, the limitations of iOS make it impractical. As it is not possible to point to an in-memory file, all files need to be saved locally. Saving all files locally puts a heavy load on the device's storage and could turn out to be a bottleneck. The proxy would need to implement an RTSP client to receive the video data, which could be simplified by using a library such as Live555 or FFmpeg. However, the inherent latency of the protocol used in the proxy solution is alone enough for us to discard it. To test its feasibility, we created a simple prototype using a server to act as a proxy and found that, without any optimization, we had a latency of 60 seconds. With optimizations the delay can be lowered, but it would still be too long. In accordance with our conclusions, Apple states in its documentation of HLS that the protocol is not intended to be used for live video streaming.

We found that for iOS, in its current state, using FFmpeg is the only applicable solution. The lack of hardware decoding severely limits performance, and this alone could be a reason to choose another solution should a suitable one appear.


Chapter 4

Design and Development

In this chapter we present our solutions based on the investigation we conducted. For Android we found two solutions to focus on: the Proxy solution and the MediaCodec solution. Limitations in iOS allowed only one solution, using a third-party library, in our case FFmpeg.

4.1 Android Proxy

A typical way to start an RTSP stream in Android is shown in listing 4.1. To play a video on Android, a VideoView object must be created and the video source set; after that it is just a matter of starting playback of the stream.

Listing 4.1: Usage of Android VideoView to play a video (Java)

VideoView view = new VideoView(this);
view.setVideoURI(Uri.parse("rtsp://cameraip/axis-media/media.amp"));
view.start();

Figure 4.1 shows how the media player connects to the camera with a video-only stream. It creates three sockets: a TCP socket (ports a0, a1) for RTSP messages and two UDP sockets (ports b0, b1, c0, c1) for RTP and RTCP packets.

4.1.1 Android UDP Proxy

Since the camera requires authentication, we have to add a layer of functionality on top of the RTSP stream. The proxy implementation acts as a mediator of the RTSP messages between the VideoView and the camera. Instead of connecting directly to the camera, the VideoView connects to a localhost port on the Android device; how this is done can be seen in listing 4.2.


Figure 4.1: RTSP in Android

The proxy listens to this port and forwards the incoming messages to the camera after adding authentication. Changes can be made to the messages to tailor the behavior of the resulting media stream; for instance, the block size [8] can be tweaked to better fit the needs of the application.

Listing 4.2: Usage of Android VideoView to play a video through the proxy (Java)

VideoView view = new VideoView(this);
view.setVideoURI(Uri.parse("rtsp://localhost/axis-media/media.amp"));
view.start();

Our implementation of the proxy creates two additional sockets, one for the client (a1) and one for the server (8080). The proxy has no internal separation of server and client, but from the camera's point of view it acts as a client and from the Android device's point of view as a server. The server socket (8080) accepts incoming messages from the VideoView and modifies them in the Filter class. The messages are forwarded to the camera socket (554). The RTSP messages from the camera are received on the proxy's client socket (a1) and forwarded to the media player socket (a0). RTP and RTCP messages work independently of the proxy and are communicated over UDP sockets (b1 to b0, c1 to c0).

Figure 4.2: Proxy RTSP in Android
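A minimal sketch of the mediation loop described above is shown below. It is illustrative rather than our actual implementation: the class name and the addAuthentication helper are hypothetical, Basic authentication is used for brevity (the camera's Digest challenge would require parsing its reply), and error handling as well as the camera-to-player direction are omitted.

import java.io.*;
import java.net.*;

// Sketch of the RTSP mediation loop: accept the VideoView's connection on
// localhost, forward its requests to the camera with credentials added, and
// relay the camera's replies back unchanged (second thread, omitted).
// RTP and RTCP flow directly over UDP and never pass through this loop.
public class RtspProxySketch {
    public static void run(String cameraIp, String user, String pass) throws IOException {
        try (ServerSocket server = new ServerSocket(8080);
             Socket player = server.accept();              // VideoView connects here
             Socket camera = new Socket(cameraIp, 554)) {  // proxy acts as RTSP client

            BufferedReader fromPlayer = new BufferedReader(new InputStreamReader(player.getInputStream()));
            BufferedWriter toCamera = new BufferedWriter(new OutputStreamWriter(camera.getOutputStream()));

            String line;
            StringBuilder request = new StringBuilder();
            while ((line = fromPlayer.readLine()) != null) {
                if (line.isEmpty()) {                      // blank line ends an RTSP request
                    toCamera.write(addAuthentication(request.toString(), user, pass));
                    toCamera.write("\r\n");
                    toCamera.flush();
                    request.setLength(0);
                } else {
                    request.append(line).append("\r\n");
                }
            }
        }
    }

    // Hypothetical helper: append a Basic Authorization header to the request.
    private static String addAuthentication(String request, String user, String pass) {
        String credentials = android.util.Base64.encodeToString(
                (user + ":" + pass).getBytes(), android.util.Base64.NO_WRAP);
        return request + "Authorization: Basic " + credentials + "\r\n";
    }
}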

4.1.2 Android HTTP Tunneled Proxy

Streaming media over UDP has some problems (UDP ports blocked, NAT traversal) [35], so it is beneficial to send the packets over TCP and set up the connection with HTTP tunneling [10]. Figure 4.3 shows how this can be set up. The difference


between this proxy and the previous one is that it acts as a mediator for all packets and messages, not only the RTSP messages. The setup process is the same as before: the VideoView connects to localhost (socket on port 8080, a0) and the proxy to the camera. This proxy establishes two channels, one for HTTP GET (ports 80, e0) and one for HTTP POST (ports 80, d0), and two sockets to the VideoView for RTP and RTCP (ports b1, b0, c1, c0). The proxy receives data on the GET channel and sends data on the POST channel. When the proxy starts receiving RTP or RTCP packets (over TCP) from the camera, it has to convert these into UDP packets and send them on the specified port to the VideoView. This is achieved by extracting the payload from the TCP packet and simply sending a UDP packet with the extracted payload.

Figure 4.3: Proxy RTSP in Android (HTTP tunneled)
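The payload extraction described above can be sketched as follows, assuming the standard RTSP interleaved framing (a '$' byte, a channel id and a 16-bit big-endian length, as defined in RFC 2326). The class and port names are illustrative, and RTSP replies arriving on the same channel are simply skipped here; a complete implementation would parse them.

import java.io.DataInputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Reads interleaved RTP/RTCP frames from the HTTP GET channel and re-sends
// each payload as a plain UDP datagram to the local VideoView ports.
public class InterleaveToUdp {
    public static void pump(DataInputStream getChannel,
                            int rtpPort, int rtcpPort) throws IOException {
        DatagramSocket udp = new DatagramSocket();
        InetAddress localhost = InetAddress.getByName("127.0.0.1");

        while (true) {
            int magic = getChannel.readUnsignedByte();
            if (magic != '$') continue;                  // not interleaved data (e.g. an RTSP reply)
            int channel = getChannel.readUnsignedByte(); // even channel = RTP, odd = RTCP here
            int length = getChannel.readUnsignedShort(); // 16-bit big-endian payload length

            byte[] payload = new byte[length];
            getChannel.readFully(payload);

            int port = (channel % 2 == 0) ? rtpPort : rtcpPort;
            udp.send(new DatagramPacket(payload, length, localhost, port));
        }
    }
}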

In listing 4.3 we can see the original RTSP message sent from the camera to the proxy. In listing 4.4 the proxy has changed the transport type from TCP to UDP, and the client/server ports have been added.

Listing 4.3: RTSP message from the camera to the proxy

RTSP/1.0 200 OK
CSeq: 2
Session: 5A66A568;timeout=60
Transport: RTP/AVP/TCP;unicast;interleaved=0-1;ssrc=ECAC6DE3;mode="PLAY"

Listing 4.4: RTSP message from the proxy to the VideoView

RTSP/1.0 200 OK
CSeq: 2
Session: 5A66A568;timeout=60
Transport: RTP/AVP;unicast;client_port=15550-15551;server_port=15552-15553;ssrc=ECAC6DE3;mode="PLAY"
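The rewrite from listing 4.3 to listing 4.4 essentially amounts to a string substitution on the Transport header. A minimal sketch is shown below; it assumes the exact header formatting of listing 4.3, and a robust implementation would parse the header fields rather than rely on a fixed pattern. The port numbers are supplied by the proxy for the sockets it actually opened.

// Rewrites the Transport header of the camera's SETUP reply so that the
// VideoView sees a plain UDP transport with local client/server ports.
public class TransportRewriter {
    public static String rewriteTransport(String rtspReply,
                                          int clientRtp, int clientRtcp,
                                          int serverRtp, int serverRtcp) {
        return rtspReply.replaceFirst(
                "Transport: RTP/AVP/TCP;unicast;interleaved=\\d+-\\d+;",
                "Transport: RTP/AVP;unicast;client_port=" + clientRtp + "-" + clientRtcp
                        + ";server_port=" + serverRtp + "-" + serverRtcp + ";");
    }
}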

4.2 Android and MediaCodec

The MediaCodec solution requires us to implement our own media player. This solution needs functionality similar to VideoView, such as buffers and


rendering functionality. The MediaCodec API provides a class that handles rendering; we only have to provide an image to be rendered. The following code snippet shows how this can be achieved.

Listing 4.5: Rendering an image using MediaCodec (Java)

MediaCodec decoder = MediaCodec.createDecoderByType("video/avc");
decoder.configure(format, surface, null, 0);

// Fetch an assembled frame

int inIndex = decoder.dequeueInputBuffer(-1);
ByteBuffer buffer = decoder.getInputBuffers()[inIndex];
buffer.put(frame.data, 0, frame.length);
int outIndex = decoder.dequeueOutputBuffer(info, timeout);
decoder.releaseOutputBuffer(outIndex, true);

This code would display a single frame on the surface. The code is only conceptual and several important steps have been omitted. The general idea is to create a decoder and configure it. The buffers used by the decoder are internal and are used to queue decoding work. When a frame has been decoded it can be rendered on the supplied surface. To get smooth playback, a buffer between the decoding and rendering steps is required.
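One way to realize such a buffer is to hand decoded output-buffer indices, together with their timestamps, to a small queue that a separate renderer thread drains at the right presentation times. The sketch below is illustrative rather than our actual implementation; it assumes a decoder configured as in listing 4.5 and deliberately keeps the queue short, since output buffers held by the client are unavailable to the decoder until they are released.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import android.media.MediaCodec;

// A decoded frame is represented by its output-buffer index and timestamp.
class DecodedFrame {
    final int bufferIndex;
    final long presentationTimeUs;
    DecodedFrame(int bufferIndex, long presentationTimeUs) {
        this.bufferIndex = bufferIndex;
        this.presentationTimeUs = presentationTimeUs;
    }
}

class RenderBuffer {
    // Kept small: indices held here are unavailable to the decoder
    // until releaseOutputBuffer() is called.
    private final BlockingQueue<DecodedFrame> queue = new ArrayBlockingQueue<DecodedFrame>(4);
    private final MediaCodec decoder;
    private long firstFrameTimeUs = -1;
    private long startClockUs;

    RenderBuffer(MediaCodec decoder) { this.decoder = decoder; }

    // Called from the decoding thread after dequeueOutputBuffer().
    void offerFrame(int index, long ptsUs) throws InterruptedException {
        queue.put(new DecodedFrame(index, ptsUs));
    }

    // Renderer thread: wait until each frame's presentation time, then render.
    void renderLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            DecodedFrame frame = queue.take();
            if (firstFrameTimeUs < 0) {
                firstFrameTimeUs = frame.presentationTimeUs;
                startClockUs = System.nanoTime() / 1000;
            }
            long dueUs = startClockUs + (frame.presentationTimeUs - firstFrameTimeUs);
            long waitMs = (dueUs - System.nanoTime() / 1000) / 1000;
            if (waitMs > 0) Thread.sleep(waitMs);
            decoder.releaseOutputBuffer(frame.bufferIndex, true);  // true = render to surface
        }
    }
}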

Our design using the MediaCodec API (see figure 4.4) consists of five essential parts. We limited the implementation to support only HTTP tunneled RTSP.

Figure 4.4: Internal MediaCodec RTSP in Android (HTTP tunneled)

RTSP Client is used to establish and control the RTSP session with a camera.

Stream Parser parses the incoming stream and, depending on the message, forwards the stream content to either the H.264 assembler or the RTSP client.


Demuxer receives an internal Packet object containing the frame data. It identifies the packet types, and video fragments are assembled into single frames. The frames are sent to the decoder.

MediaCodec API works in a similar way as described above.

Media Player receives the decoded frame and puts it in the internal buffer before rendering it on screen.

4.3 iOS and FFmpeg

We decided to use FFmpeg on iOS to stream and play the video stream from the Axis camera. The library provides an API to create an authenticated HTTP tunneled RTSP stream. Figure 4.5 shows our design.

Figure 4.5: Axis MediaPlayer on iOS using FFmpeg

How to use FFmpeg is illustrated with code and comments below. Listing 4.6 shows a very simplified way (note that several crucial steps are omitted) of decoding an image with FFmpeg.

Listing 4.6: FFmpeg workflow (C)

// set options and initialize the stream
AVFormatContext *pFormatCtx;
AVDictionary *opts = 0;
av_dict_set(&opts, "WWW-Authenticate", "Digest", 0);
avformat_open_input(&pFormatCtx, url, NULL, &opts);
avformat_find_stream_info(pFormatCtx, NULL);
// initialize the decoder
...
// stream ready, start reading frames
av_read_frame(pFormatCtx, &packet);
// decode frame
AVPicture picture;
avpicture_alloc(&picture, PIX_FMT_RGB24, width, height);
sws_scale(..)


// Convert image to local context

The decoded image is encapsulated in a container class (ImageContainer). The container class carries additional information such as timestamps and statistics. The timestamps are needed to ensure smooth playback of the video, and the images are buffered before rendering for the same reason. The rendering is performed on a custom UIImage class.

4.4 Result

This chapter aimed to show the design of the apps. Some implementation details can be found in Appendix C for FFmpeg and Appendix D for MediaCodec. This section shows how the implemented apps look on the phone and how they perform. To measure the performance we set a camera to a resolution of 800x600 at 30 frames per second. Compression was set to 30 and we used a GOV length of 32.

In figure 4.6 a list of the current cameras is shown. To add a new camera the user presses the new camera button and is taken to the view in figure 4.7, which is the add/edit camera view. A name and an IP address can be set, and when the user presses the add button this information is saved to a database. The user can now select this camera from the list. When selecting a camera the user is taken to figure 4.8, where playback of the camera stream begins.

Figure 4.6: List of cameras (Android)

Figure 4.7: Edit/Add camera (Android)

Figure 4.8: Playback of camera (Android)

Figures 4.9 and 4.10 (full-size figures can be found in Appendix A) show the initial CPU load and memory usage of the device before starting any app. Both CPU load and memory usage are about 20%. Figures 4.11 and 4.12 show the state while playing a stream with our MediaCodec implementation: the CPU load has risen to about 30% and memory usage by a couple of percent. We compared our implementation with the app RTSP Player [47]. Figures 4.13 and 4.14 show playback of the same stream with the RTSP Player


app, which uses FFmpeg instead of MediaCodec. We can see that our solution, which uses hardware decoding, is much more resource efficient: with RTSP Player the CPU load has risen to about 80% and memory usage to 40%. Note that comparing memory usage this way does not say much about efficiency, since usage depends on things like how large buffers are used (buffers that delay frames to give smooth playback).

Figure 4.9: The initial CPU load
Figure 4.10: The initial memory usage

Figure 4.11: CPU load with our implementation
Figure 4.12: Memory usage with our implementation

Figure 4.13: CPU load with RTSP Player
Figure 4.14: Memory usage with RTSP Player

Figure 4.15 shows a list view containing all cameras that the user has added


on the iOS device. Adding a new camera is done by pressing the + sign in the top right. This takes the user to the edit/add camera view, which can be seen in figure 4.16. There the user can set a name, IP address and port, specify whether to use HTTP tunneled or plain RTSP, set login parameters and finally select quality profiles. Once all parameters are set and the user chooses to save, the entry is stored in a database and presented in the list view. When the user selects an entry, figure 4.17 is shown and playback of the stream starts.

Figure 4.15: List of cameras (iOS)

Figure 4.16: Edit/Add camera (iOS)

Figure 4.17: Playback of camera (iOS)

Figure 4.18 shows how much CPU and memory the app uses on an iPhone 4S. As the device has two CPU cores, the maximum utilization is 200%. It is notable that the app uses approximately 130% of the CPU, compared to the Android device where the app uses approximately 10%. We also compared with an app from Egeniq called CamControl [48]. This app obtains a much better result (see figure 4.19), but this is because it uses MJPEG and not the much more CPU-intensive H.264. It has a 60% CPU usage, but at the cost of a much higher bit rate. Memory-wise the two apps are very similar: 56 MB compared to 60 MB for the CamControl app. When the resolution passes a certain threshold (using our app) the device is no longer able to decode; this happens at approximately 1240x720, at which point the app uses 200% of the CPU.

Figure 4.18: Performance on the iOS device (our app)

Figure 4.19: Performance on the iOS device (CamControl)


We end this chapter by concluding that the two apps work well, but improvements can be made to both performance and user interface design. The hardware decoding on Android allows the device to support a much higher resolution than the iOS device while using fewer system resources.


Chapter 5

Streaming and Video Quality

In previous chapters we investigated how to implement apps that open, decode and render a stream. Decoding and rendering can only be done once the data to process has been obtained. The quality of the stream, in terms of parameters such as resolution and compression level, that can be streamed over a network depends on the network bandwidth. So creating an app that can open a stream and process the data is not enough: what if the bandwidth is too low? How should the quality parameters be set for an acceptable bit rate? In this chapter we answer these questions.

The Axis camera doesn’t support quality change of an ongoing stream, the streamneeds to be closed and a new stream with different quality has to be opened. In additionof this being a rather time consuming task it will also stop playback for a brief periodof time. If possible, this is something we want to avoid. We propose a way to decidethe initial quality of the stream, based on the context as well as an automatic qualityswitching algorithm.

The capabilities of Android and iOS smartphones include detection of network type, screen resolution and other parameters. From this information it is possible to select appropriate quality settings for the current context. We attempt to find the best settings through a series of experiments. The experiments begin by measuring the actual throughput on available networks (Edge, 3G, 4G and WiFi). We then compare the quality parameters, alone and in combination, to find their impact on bit rate. From this we construct profiles to be used in different contexts. Finally, we evaluate our profiles.

5.1 Network speed

In section 2.4 the wireless technologies available to the smartphone are discussed. The theoretical speeds of the networks can vary, not only based on which underlying technologies are used, but also on the local context, such as the number of clients using the network.


Figure 5.1: Bandwidth speed for 3G network

Figure 5.2: Bandwidth speed for Edge network

The profiles are based on measurements performed in Lund rather than theoretical values. We present several results as figures in this section; larger versions can be found in the Appendix.

We used an iPhone 4S to measure network speed, but due to hardware limitations we were only able to test on 3G and Edge. We sampled the network speed at 32 different locations. To measure the bandwidth (downlink and uplink) as well as ping time we used the app SpeedTest [45]. Since 3G is available all over Lund, we tested Edge by disabling 3G on the phone. All samples were taken on Telia's network. Telia's website [46] provides theoretical values for Lund: 4G+ (20-80 Mbps), 4G (2-6 Mbps), Turbo-3G+ (2-20 Mbps), Turbo-3G (2-10 Mbps), 3G (0.1-0.3 Mbps) and Edge (0.1-0.2 Mbps). The iPhone 4S supports up to Turbo-3G (HSPA).

Figure 5.3: Time to connect to Axis camera on different networks

As we can see in figure 5.1, the variance is much higher on 3G than on Edge (figure 5.2). The spread on 3G is because some locations support HSDPA variants with higher speeds. On 3G we obtained download speeds from 1.47 up to 8.98 Mbps, with a maximum uplink of 1.12 Mbps and a minimum of 0.6 Mbps. Edge had a lowest downlink of 0.08 Mbps and a maximum download speed of 0.2 Mbps, while the uplink varied from 0.06 Mbps up to 0.1 Mbps.

From the 3G samples we obtained an average download speed of 4.21 Mbps and a median of 3.47 Mbps. We can note that there are several values below the threshold


that Telia considers its lower limit (2 Mbps). On average, however, it was within the Turbo-3G interval. Edge was fairly stable and the fluctuations were likely caused by connectivity problems; we obtained an average value of 0.15 Mbps and a median value of 0.17 Mbps.

We also evaluated the time it takes to connect to the camera, by measuring the time from when the setup was started until the Axis camera had acknowledged the stream (the setup phase was complete). The result can be seen in figure 5.3. WiFi has an average time of 0.273 seconds, 3G 1.67 seconds, and Edge a connection time of 6.42 seconds.

The long connection time on a network such as Edge highlights the importance of selecting a correct quality before starting the stream, as it would take over 6.5 seconds to start a new stream, not including the buffering required. WiFi does not suffer from these delays and could therefore allow for more frequent quality switches.

5.2 Basic Algorithm ConceptIt’s possible to detect the current network on the phone and select a profile that havebeen designed beforehand. The profile bitrate usage should not exceed the bandwidthlimitations on the network that we measured in section 5.1

Adaptive video streaming can be achieved in a variety of ways; we discussed some above, but none that would be a good fit for our platforms. William Eklof [4] proposes a down- and up-switch algorithm, but due to limitations in the system it cannot be applied. More specifically, Eklof proposes that the server switch the quality of the stream based on the network delay time. This could be applied if the Axis camera later opened up such possibilities.

In order to detect whether a downswitch is needed, the following approaches can be used. There are several other uninvestigated ways that could work as well, such as GPS [16] or quality-of-service feedback using RTCP [4].

Network speed can, by being continuously monitored, be used to determine whether the client has changed network or the current network is experiencing fluctuations. If the expected download speed changes drastically, it can be an indication that the network has changed. However, changes can also occur because the video stream simply requires less bandwidth; in a live streaming environment this can be due to a lamp being turned off or a store closing for the night. In the end this approach will not provide reliable information, as the stream bit rate varies too much. With a static scene and a constant bit rate it could be used to detect changes in the network.

Frames Received Per Second is how many frames the client receives from the server. At any given time it will not be considerably more than the expected frame rate of the stream. Fluctuations in the network can occur, which might lead to bursts of frames or no frames at all. By taking an average over a set period of time it is possible to detect whether the network speed has dropped, as fewer frames will arrive (a minimal sketch of this check follows after this list).


Frames Rendered Per Second is how many frames the client is currently rendering on the screen. In a similar fashion to frames received, it is possible to detect network drops. There are problems with this approach: firstly, there will be a buffer between the network and the renderer, typically somewhere between 1 and 5 seconds, so a network drop will not be detected until the frames have passed through the buffer. Also, the FPS is bound by the CPU processing power (the iPhone 4S has a bottleneck with FFmpeg and 720p).

Detect network The average network speeds can be determined beforehand and used to select a good profile. By continuously probing the network it is possible to detect that the network has changed, for instance from 3G to Edge, in which case the expected network speed drops. Network speeds can vary considerably and it is very difficult to have accurate statistics on network capacity; as seen in our tests above, the network speed can drop below the promised values.

Buffer Usage can be measured to see whether the buffer is utilized enough. The buffer adds a delay to counter jitter, and it is possible to calculate the expected utilization: for instance, if we expect an FPS of 25 we expect the buffer to hold 75 frames with a 3-second jitter constant.
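As referenced in the Frames Received Per Second item above, a minimal sketch of such a check could look as follows. The one-second bucketing and the length of the averaging window are illustrative choices, not values taken from our implementation.

import java.util.ArrayDeque;
import java.util.Deque;

// Counts frames received per second and exposes a short moving average that
// the probing step can compare against the expected frame rate of the profile.
class FramesReceivedMonitor {
    private final Deque<Integer> perSecond = new ArrayDeque<Integer>();
    private final int windowSeconds;
    private int currentCount = 0;
    private long currentSecond = System.currentTimeMillis() / 1000;

    FramesReceivedMonitor(int windowSeconds) { this.windowSeconds = windowSeconds; }

    // Called once for every assembled frame delivered by the stream parser.
    synchronized void onFrameReceived() {
        long now = System.currentTimeMillis() / 1000;
        if (now != currentSecond) {
            perSecond.addLast(currentCount);
            if (perSecond.size() > windowSeconds) perSecond.removeFirst();
            currentCount = 0;
            currentSecond = now;
        }
        currentCount++;
    }

    // Average frames per second over the window, used when probing.
    synchronized double averageFps() {
        if (perSecond.isEmpty()) return 0;
        int sum = 0;
        for (int c : perSecond) sum += c;
        return (double) sum / perSecond.size();
    }
}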

We found two of the above approaches to be suitable: Detect Network and Frames Received Per Second. Detecting the network is not sufficient by itself, but it can ensure that profiles the network cannot support are not used. Frames received is used to detect network fluctuations, such as congestion or lowered bandwidth on the network.

It’s much harder to detect when an upswitch is possible. Since information will besent at a maximum fixed rate for each stream, the current bit rate will only indicatethat the client is able to sustain the currently selected profile (or not). The only reliableway to detect if an upswitch is possible, is to monitor the network for changes, such aschanging from Edge to 3G.

The algorithm is implemented as a state machine, described in figure 5.4.

Figure 5.4: Algorithm state machine


Initial The algorithm starts in this state and begins by gathering information about the network. Since the algorithm is in this state before the actual stream is started, it is not possible to obtain all the measurements described above. The only available information is the current network, and with this information the profiler selects an appropriate initial stream quality.

Normal After the stream has been started, the normal state supplies the renderer with frames. This state continuously monitors frames received so that the probing state has more information. When a fixed timer expires, the state changes to the probing state.

Probing This state evaluates the network based on the frames received per second and the current network, and decides whether the stream needs to be reprofiled or can continue.

Profiling This state obtains current network information and selects the best profile based on it. The profiling state also starts the stream with the selected quality parameters.

5.3 Video Quality Parameters and Bit Rate Use

The weighting factors that lead to a higher bit rate are the number of pixels in each frame, the frames per second and the amount of motion in the image [21]. There are a number of parameters that can be configured on the camera, among them compression level, GOV length and FPS.

To obtain the effect of the quality parameters on bit rate, we conducted a series of experiments. We used an Axis P3367 camera, which supports 1080p at 30 FPS, connected to a D-Link DIR-655 router through a wired connection. We wrote a program that changes quality parameters and measures the bit rate.

The camera captured a video (the Windows 7 Bubbles screen saver, see figure 5.5) from a computer monitor. The amount of motion in the video is relatively high, and we wanted high motion in order to be closer to the upper limit of the bit rate that a set of quality parameters can give. The screen saver also repeats with approximately the same movement, whereas a video would have to be restarted and our measurements would depend on when it was restarted.

The experiments shown in figures 5.6, 5.7, 5.8 and 5.9 cannot represent all cases, but there are clear trends in how the parameters affect bit rate. In figures 5.6 and 5.7 some measurements are missing; this is because of unknown hardware/software problems with these particular settings in our test camera.

Compression level Can be set from 0 to 100. This is a measure of how much data should be discarded by lossy compression. The impact on bit rate for different levels in our test video can be seen in figure 5.6. The compression trend is logarithmic: low values affect the bit rate the most, and after that the gain declines. In most cases it is not desirable to have too high a compression level, as quality will suffer. Figure 5.6 shows that the first halving of bit rate occurs at about compression level 20 and the next at 55. Compression level 20 seems to be a good compromise between quality loss and bit rate.


Figure 5.5: Windows 7 Bubbles Screen Saver

Comparing the maximum resolution with the minimum resolution, there is a clear trend showing that lower resolutions are affected more by compression: the minimum-resolution bit rate is reduced four times while the maximum is only reduced about 2.5 times. However, in absolute terms the bit rate is only reduced from 0.5 to 0.03 Mbit/s for the lowest resolution and from 20 to 4 Mbit/s for the highest resolution.

GOV length Can be set from 1 to 61440. Decides how many P-frames to send before each complete I-frame; its effectiveness is therefore highly dependent on how much movement there is in the video. The impact on bit rate for different lengths in our test video can be seen in figure 5.7, where the vertical axis uses a logarithmic scale with base 2. The GOV length can have a large impact on bit rate, especially in a video with little movement. There is a clear trend that the first halving of bit rate occurs at about length 4 and the next at 16; at 32 and above the bit rate remains constant for all resolutions.

FPS Can be set from 1 to 30. This affects the bit rate in a linear fashion, as it specifies how many frames should be sent per second; for example, halving the FPS also halves the bit rate. Which FPS gives the best quality is subjective, but an FPS of 25 or above is best for a smooth experience.

Resolution Can be set to any of the resolutions that the camera supports. The impact on bit rate for different resolutions in our test video can be seen in figure 5.9.

There is thus a trade-off between bit rate and quality; ideally we would use the highest resolution the device can handle without compressing or lowering the frame rate at all. However, the highest quality is often neither feasible nor required. Some considerations for choosing quality are cost (although outside the scope of this thesis), required quality (is a high resolution necessary?), available bandwidth (3G, WiFi?) and the fact that quality is subjective [19].

With our measurements of available bandwidth and delay on different networks, and of how the quality parameters affect bit rate, we will try to balance all the parameters to find profiles to use in smartphones for different contexts.


Figure 5.6: Compression level and bit rate

Figure 5.7: GOV length and bit rate

Figure 5.8: FPS and bit rate

Figure 5.9: Resolution level and bit rate

Choosing the parameters is a balancing act: a high resolution gives better quality, but with low available bandwidth either a lower frame rate or more compression is required.

5.4 Mobile Profiles

We created base profiles for all networks based on the measurements in sections 5.1 and 5.3. The profiles aim to test the algorithm for quality switching; as such they target a specific bit rate, and we did not investigate the actual experienced quality. We concluded that a profile worked well if the FPS was stable and no (or very few) dropped frames were detected. In [13] (a quality-of-experience study) participants were asked to rate quality, and it was concluded that a lower frame rate is preferable to a lower resolution. Based on this we prioritize a high resolution, then tweak FPS, GOV and compression level to obtain the targeted bit rate. More configurations could have been tried to ensure the best possible quality, but as this is a subjective topic more measurements would have been needed.

In table 5.1 we can see the profiles that we created. The bit rate was found in the same way as in section 5.3. For each profile the resolution is increased and the FPS is capped at 25; when a suitable resolution is found, the other parameters are tweaked to an acceptable bit rate for the network. In the 3G High profile a resolution of 1080p would


give too high a bit rate, while 720p leaves a lot of headroom, so here the GOV length has been lowered.

Context      Resolution   FPS   Compression   GOV   Expected Downlink
Edge Low     160x90       15    35%           32    0.044 Mbps
Edge High    320x180      20    35%           32    0.138 Mbps
3G Low       640x360      25    20%           32    0.477 Mbps
3G Medium    800x450      25    20%           32    1.036 Mbps
3G High      1280x720     25    20%           16    1.709 Mbps
WiFi         1920x1080    25    20%           32    7.936 Mbps

Table 5.1: Profiles with quality parameters and average expected bit rate

5.5 App design

To select a profile, profile=xxx is appended to the URL. This allows us to design the quality selection independently of the underlying media player. The selection of profile is based on the current context. In Android it is possible to find the current network in the following manner:

Listing 5.1: Network detection on Android (Java)

if (type == ConnectivityManager.TYPE_WIFI) {
    // WiFi
} else if (type == ConnectivityManager.TYPE_MOBILE) {
    switch (subType) {
    case TelephonyManager.NETWORK_TYPE_EDGE:
        // Edge
    case TelephonyManager.NETWORK_TYPE_LTE:
        // LTE
    ...
    }
}
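Continuing the Android example, the detected network type can then be mapped to one of the profiles in table 5.1 and appended to the stream URL. The sketch below is illustrative: the profile identifiers are placeholders for whatever profile names are configured on the camera, and the mapping shown is only one possible choice.

import android.net.ConnectivityManager;
import android.telephony.TelephonyManager;

// Maps the detected network to a profile name and builds the stream URL.
public class ProfileSelector {
    public static String buildStreamUrl(String cameraIp, int type, int subType) {
        String profile;
        if (type == ConnectivityManager.TYPE_WIFI) {
            profile = "wifi";
        } else if (subType == TelephonyManager.NETWORK_TYPE_EDGE) {
            profile = "edge_high";
        } else {
            profile = "3g_high";   // start optimistically; the switching algorithm can step down
        }
        return "rtsp://" + cameraIp + "/axis-media/media.amp?profile=" + profile;
    }
}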

iOS only natively provides the possibility to detect whether WiFi is used. However, by reading information off the status bar it is possible to detect the current network type; Appendix B has a code snippet explaining how to do this. This method enables detection of 2G, 3G, 4G and WiFi, but not which underlying 3G protocol is used. 3G might have a speed of 0.1 Mbps or 20 Mbps (in Lund), making this method rather uncertain. However, it still gives a good indication of the speed of the current network, as can be seen in section 5.1.

We decided to implement the quality control on the iOS device, since both of us considered that code to be more stable at the time. The design extends the media player we already constructed, but the media player needs to gather statistics such as frames received; these statistics are then used in the quality module. In algorithm 1 we show how we achieved the switching. The algorithm works by checking whether enough time has passed since the last probing (the last time it checked whether a switch


was needed). The first thing it checks is whether the network has changed: if it changed from Edge to 3G an upswitch is performed, and a downswitch is performed if the opposite is true. A downswitch is also performed if the network has not changed but the frame rate cannot be sustained.

Algorithm 1 Quality switching

if currentTime − lastProbeTime > PROBE_INTERVAL then
    if networkChanged then
        selectNewProfileAndSwitch()
    else if currentFps / expectedFps < FPS_CONSTANT then
        downSwitch()
    end if
end if
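Algorithm 1 translates fairly directly into code. The sketch below is an illustration rather than our implementation: selectNewProfileAndSwitch and downSwitch are hypothetical helpers that reopen the stream with a new profile and return its expected frame rate, the frames-per-second value is assumed to come from a monitor such as the sketch in section 5.2, and FPS_CONSTANT is an example value (the 5-second probing interval matches the one discussed in section 6.2).

// Direct translation of algorithm 1: probe at most once per interval, switch
// profile on a network change, and switch down when the received frame rate
// falls below a fraction of what the current profile should deliver.
class QualitySwitcher {
    private static final long PROBE_INTERVAL_MS = 5_000;   // 5 s, as settled on in the evaluation
    private static final double FPS_CONSTANT = 0.8;        // illustrative tolerance, not a measured value

    private long lastProbeTime = System.currentTimeMillis();
    private int lastNetworkType;
    private int expectedFps;

    QualitySwitcher(int initialNetworkType, int initialExpectedFps) {
        this.lastNetworkType = initialNetworkType;
        this.expectedFps = initialExpectedFps;
    }

    // Called periodically from the normal state with the current network type
    // (from the platform APIs) and the measured frames received per second.
    void probe(int currentNetworkType, double currentFps) {
        long now = System.currentTimeMillis();
        if (now - lastProbeTime <= PROBE_INTERVAL_MS) return;
        lastProbeTime = now;

        if (currentNetworkType != lastNetworkType) {
            lastNetworkType = currentNetworkType;
            expectedFps = selectNewProfileAndSwitch(currentNetworkType);  // up- or downswitch
        } else if (currentFps / expectedFps < FPS_CONSTANT) {
            expectedFps = downSwitch();                                   // profile cannot be sustained
        }
    }

    // Hypothetical helpers: close the stream, reopen it with the chosen
    // profile and return that profile's expected frame rate.
    private int selectNewProfileAndSwitch(int networkType) { return expectedFps; }
    private int downSwitch() { return expectedFps; }
}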

5.6 Results

To see how our implementation works we performed simulated tests. We compared our technique with a base case, where the client connects to the camera and selects the quality profile for that network, but no quality switching is performed. For each test we measured the bit rate usage and the frames received per second. Table 5.2 lists the networks we used to evaluate our algorithm. We performed 5 different scenarios, all with different run times, and compared our down- and upswitch algorithm with a very simple algorithm that selects the best (highest-quality) profile for whatever network it is on. The results can be seen in figures 5.10-5.19, or in full size in Appendix A.1-A.10. The scenarios were performed with an Axis P3367 camera, which we connected to through an Ethernet connection. We did not use actual hardware (an iPhone); instead we used the simulator bundled with Xcode on an iMac 27 from 2013. To simulate the networks we used the Network Link Conditioner, which allows for bandwidth restrictions.

Id   Network   Downlink    Delay    Uplink      Delay
1    Edge      0.10 Mbps   350 ms   0.08 Mbps   370 ms
2    Edge      0.25 Mbps   350 ms   0.2 Mbps    370 ms
3    3G        0.85 Mbps   90 ms    0.42 Mbps   100 ms
4    3G        1.5 Mbps    90 ms    0.9 Mbps    100 ms
5    3G        5.45 Mbps   90 ms    1.0 Mbps    100 ms
6    WiFi      2 Mbps      1 ms     0.5 Mbps    1 ms
7    WiFi      45 Mbps     1 ms     40 Mbps     1 ms

Table 5.2: Networks used

Scenario 1: The client is connected to 7; after 60 seconds the network changes to 4. 60 seconds later the network switches back to 7, and finally after 60 seconds back to 4.

Scenario 2: The client is connected to 5; after 60 seconds the network changes to 1.

Scenario 3: The client is connected to 1; after 60 seconds the network changes to 4.

Scenario 4: The client is connected to 4; after 60 seconds the network changes to 6. 120 seconds later the network switches to 3.

Scenario 5: The client is connected to 3; after 60 seconds the network changes to 5. 60 seconds later the network switches to 7. After 60 seconds the network changes to 3, and finally after 60 seconds it changes to 2.

Figure 5.10: Scenario 1, FPS
Figure 5.11: Scenario 1, Bitrate usage, Max is cropped at 10

In the first scenario the available bandwidth changed between a very high WiFi bandwidth and a normal 3G speed. We can see that during the first 60 seconds there is enough bandwidth available that neither algorithm is affected by its selection; both are able to sustain 25 frames per second. The only suboptimal selection would be to select a much lower quality than is available. At 60 seconds the network changed. Our algorithm detected that the network changed and switched down the quality (in this case to 3G High), while with the other approach frames received per second dropped below 5. After about 10 seconds our algorithm detects that it is not able to sustain the profile, because the frames per second is below the expected value; a downswitch is performed and the 3G Medium profile is selected, which can sustain 25 frames per second. At 120 seconds the network changes back to WiFi and the basic approach is again able to sustain the frame rate. Our approach detects this network change and performs an upswitch, which is successful. The rest of the scenario is a repetition of what happened at 60 seconds.

In scenario 2 the bandwidth changes from 3G down to an Edge network. Both approaches have the maximum 3G profile selected (which is still far below the available bandwidth in terms of usage); the next step up would be the WiFi profile, which is above the available bandwidth. At 60 seconds the network changes, which is detected by our algorithm, and a switch is performed. The basic algorithm suffers from this drastic change and we can see that it is not able to sustain


Figure 5.12: Scenario 2, FPS
Figure 5.13: Scenario 2, Bitrate usage

even 1 frame per second. Our algorithm switches to the higher Edge profile, detects that it cannot sustain it, and therefore downswitches to the lower Edge profile.

Figure 5.14: Scenario 3, FPS
Figure 5.15: Scenario 3, Bitrate usage

Scenario 3 has one network switch, from Edge to 3G. In both cases the initially selected profile is the higher of the two Edge profiles, which neither network can support. Our algorithm performs a downswitch and can sustain a good frame rate, while the other approach has a very uneven frame rate (below the targeted 20 FPS for the profile). Not until 60 seconds have passed is the basic algorithm able to sustain the expected FPS, and as we can see in the bit rate figure its usage is then far below what is available. Our algorithm detects the network change and performs an upswitch; however, the selected profile requires too much bandwidth and a downswitch to the 3G Medium profile is performed. We can see that our algorithm utilizes the bandwidth much more efficiently at this point.

Scenario 4 has two network changes, from 3G to WiFi and back to 3G. Both approaches select the maximum 3G profile, which neither can sustain. Our algorithm performs a downswitch and has smooth playback after this: we are able to sustain 25 frames per second while the basic algorithm is below 20. At 60 seconds the network changes to a slow WiFi, which is however enough to sustain the high 3G profile the basic algorithm is already using. Our algorithm performs an upswitch but is not able to play this profile, so a downswitch to the 3G High profile is performed. At 180 seconds the network changes to the low 3G network. Our algorithm detects the network switch, but since it is already on a 3G profile it does not switch. Instead the frames received drop and two downswitches are performed.


Figure 5.16: Scenario 4, FPS
Figure 5.17: Scenario 4, Bitrate usage


Figure 5.18: Scenario 5, FPS
Figure 5.19: Scenario 5, Bitrate usage, Max is cropped at 10

The last scenario shows how our algorithm performs when several network changes occur. The network changes from 3G low to 3G high to WiFi, back to 3G low and finally to Edge high. This scenario does not provide any insight that the other scenarios have not already given, but it shows that the algorithm is able to adapt to changes and utilize the available bandwidth to its fullest.


Chapter 6

Discussion

6.1 App discussion

This thesis aimed at producing two functional live video streaming apps for the Axis camera. Both platforms had the same requirements, namely RTSP/RTP and H.264. The main challenge was to find a suitable way to decode H.264, as the solutions depend on the decoding. Decoding can be done either in hardware or in software, and we focused on finding implementations that support hardware decoding. Our recommended solutions are for each platform on its own; solutions that work on both platforms are possible but would require future work.

Android discussion

The use of a hardware decoder is essential, as the results in section 4.4 show. The solutions we consider for the Android platform are the ones that support this.

The Proxy RTSP solution is the easiest of the presented solutions and it did not take long to implement. It also made sense for us to start developing this solution since we did not have much prior experience with streaming protocols and decoding of video frames. However, after we had developed a functional prototype and tested it on different devices, we found that the media players of different vendors and devices differ too much for the proxy solution to be recommended. The proxy solution depends too heavily on the media players behaving the same, and they simply do not.

FFmpeg and GStreamer could be great solutions, and prototyping them would have given better insight into how well they perform and what problems exist. However, since we had limited time, it was not possible to prototype all the solutions. Since we were already developing an app with FFmpeg for iOS, we decided against doing the same for Android. GStreamer is similar to FFmpeg and we started looking into prototyping it, but found that it would be better to wait for new versions of the SDK after we


encountered several problems (missing libraries, not working with Eclipse) with it and the Android development platform. Since the iOS app uses FFmpeg, it could be beneficial to use it on the Android device as well; most of the code could then be shared between the two apps.

The MediaCodec API provides an interface to the decoder, and it is certainly possible to combine FFmpeg or GStreamer with the API for hardware decoding. We opted to use the MediaCodec API without any third-party dependencies, both to reduce external dependencies and because we already had a working RTSP client from the proxy solution. However, we would recommend using a third-party library to complete the MediaCodec API with the necessary streaming capabilities. MediaCodec is relatively new and still lacks in-depth documentation, but we foresee that this will become less of a problem as more developers start using it. MediaCodec gives access to the hardware decoder without using the NDK.

We developed our own code for everything that MediaCodec does not do, such as streaming and demuxing; it was a simple implementation in comparison to GStreamer and FFmpeg. For a larger project that has to be more stable and maintained, we would suggest using a third-party library together with the MediaCodec API.

Recommendations: MediaCodec with third-party libraries such as Netty, GStreamer or FFmpeg.

iOS discussion

In contrast to the Android platform, the iOS platform only had one suitable solution, which is to use FFmpeg. As the iOS platform allows code to be written in C, FFmpeg can be used without increasing the complexity. Starting to work with FFmpeg is not an easy task; the documentation is lacking and information is scattered across mailing lists and source code. However, once we had worked with the library long enough, finding information became less of a problem. We still have not utilized all the optimizations and tweaks that can be applied to FFmpeg, so more performance can most likely be obtained. For instance, to get the compilation to work for the iOS device we had to disable assembly code optimizations, which can impact performance.

The lack of hardware decoding on the iOS platform limits it severely, but not to the extent that the platform becomes unusable. It is still able to decode up to 720p, which is more than the number of pixels on the screen, and any further increase in stream resolution would not be noticeable on the device.

Recommendations: FFmpeg

6.2 Video Quality Discussion

The profiles are based on measurements of bit rate usage for different sets of parameters. These measurements could have been more in-depth, for example by testing more videos with different scenes.


In section 5.6 we tested several scenarios with our algorithm. There was no obvious candidate to compare our algorithm with, so we decided to compare it with an algorithm that at least selects an appropriate profile based on the first network it is connected to. Another approach would have been to use a static profile or the camera's default quality settings.

In [4] it’s concluded that the server has to monitor and detect a potential upswitchby varying the transmission rate. This is consistent with our findings that finds noreliable way of detecting upswitch possibility on the same network, but it’s still possibleby detecting the network. In [4] blank data is inserted into the stream to increase thebit rate and then monitor if it’s capable of the higher bit rate. According to the paperthis works very well when the server has such capabilities. Our up switching algorithmis very optimistic compared to [4] as we will always upswitch on network changes andthen downswitch until the correct level is found. In the end the user experience mightsuffer in our solution as the stream will close and reopen, which is avoided in [4]. In [16]four switching algorithms are compared, one of them is a reactive (similar to our butwithout our prediction part) and the others are predictive. The predictive algorithmsuses GPS data to predict the bandwidth and quality. They find that the algorithms arecapable of planning ahead for lower bandwidth areas, where the algorithm will fill thebuffers (with the excess bandwidth) and downswitch before entering that area. Alsothey are successful at determining upswitch possibilities.

Since networks can vary, there either needs to be a profile that is close to the lower band of the network (e.g. 1 Mbps for 3G) or a switching algorithm needs to be used. One could leave this up to the user with a variety of settings and the ability to change between profiles during playback. However, using our approach it is possible to detect at a much earlier stage that the network cannot sustain the targeted frame rate; the user would otherwise need to wait for the buffer to clear before being able to visually detect the reduced FPS.

There is a shortcoming with our tests: they only test the bandwidth and delay changes in the network, not the actual switch, for instance changing network from WiFi to 3G. We performed several tests, but time prevented us from fully developing and testing in a non-simulated environment. We also did some basic tests (disconnecting the router) and observed that new sockets need to be opened on the new network, but not when switching between different 3G networks or Edge networks. We still argue that the algorithm we propose is much better than just connecting to the camera without any profiles. We observe that in almost all cases when the network changed, the algorithm performed a switch (down or up).

The probing interval we used was not based on more than our observation that it works. In the end we settled on a 5-second probing interval, with probing starting 10 seconds after a stream had been opened.

6.3 Future Work

Apps The apps we developed were prototypes and there is much room to investigate

new functionality such as multiple streams (possible and working on iOS), audio and playback of recorded media. More work on the apps needs to be done to


provide end users with a good user experience.

Profiles The profiles we developed can be improved, but they will serve as a solid base for future development.

Quality Algorithm The algorithm we developed for automatic switching leaves much room for improvement. One improvement would be to incorporate more context, such as GPS and movement.

Other Platforms This thesis developed two prototypes, for iOS and Android, due to their large market share; there are still several other platforms that could be investigated, such as Windows Phone.

6.4 Conclusion

The overall goals of this thesis were to investigate the current support on smartphones for live streamed video and to implement prototypes on iOS and Android. This thesis also aimed to ensure video quality on mobile networks.

• Playback of live streamed video (RTSP/H.264 streaming, authentication and decoding)

• Quality control and reliability of live streamed video

Our implemented apps were able to provide this functionality, and the algorithm and profiles we developed enable quality control of live streamed video.


Appendices


Appendix A

Full size figures

Figure A.1: Scenario 1, Frames per second


Figure A.2: Scenario 1, Bitrate usage, Max is cropped at 10

Figure A.3: Scenario 2, Frames per second


Figure A.4: Scenario 2, Bitrate usage

Figure A.5: Scenario 3, Frames per second


Figure A.6: Scenario 3, Bitrate usage

Figure A.7: Scenario 4, Frames per second


Figure A.8: Scenario 4, Bitrate usage

Figure A.9: Scenario 5, Frames per second


Figure A.10: Scenario 5, Bitrate usage, Max is cropped at 10

Figure A.11: The initial CPU load


Figure A.12: The initial memory usage

Figure A.13: CPU load with our implementation


Figure A.14: Memory usage with our implementation

Figure A.15: CPU load with RTSP player


Figure A.16: Memory usage with RTSP player


Appendix B

Code Snippets

B.1 iOS detect network

Listing B.1: iOS detecting network (Objective-C)

/* Return: 0=No Network, 1=2G, 2=3G, 3=4G, 5=WiFi */
- (int)dataNetworkTypeFromStatusBar {
    UIApplication *app = [UIApplication sharedApplication];
    NSArray *subviews = [[[app valueForKey:@"statusBar"]
        valueForKey:@"foregroundView"] subviews];
    NSNumber *dataNetworkItemView = nil;
    for (id subview in subviews) {
        if ([subview isKindOfClass:[NSClassFromString(@"UIStatusBarDataNetworkItemView") class]]) {
            dataNetworkItemView = subview;
            break;
        }
    }
    return [[dataNetworkItemView valueForKey:@"dataNetworkType"] intValue];
}


Appendix C

How to compile FFmpeg

In this chapter we show how to compile FFmpeg. It is targeted at the iOS platform, but the procedure would be very similar on other devices.

The best tutorial we found on how to use FFmpeg was [49]. It is slightly outdated, but the concepts of the library still apply. The more advanced usages of FFmpeg we found by searching the mailing list and the source code.

A complete working example of how to use FFmpeg simply does not fit within the scope of this thesis, and we suggest looking further into [49].

C.1 Compiling for iOS

The first thing that needs to be done is to compile the library so that it can be used on the iOS platform. We also wanted to compile it for the simulator. There is a good tutorial on the tagentsoftworks blog [50] that we used; the following tutorial is taken directly from that blog post.

Before starting, ensure that the Xcode command line tools are installed and that the targeted SDK version is 6.1. If any other version is used, the corresponding paths need to be updated. The first thing to do is to download FFmpeg:

git clone git://source.ffmpeg.org/ffmpeg.git ~/ffmpeg

There are three versions that we need to compile: armv7 (3GS or later), armv7s (iPhone 5) and i386 (simulator). We want to create a universal build for all three. Start by creating the directory structure:

cd ffmpeg
mkdir armv7
mkdir armv7s
mkdir i386
mkdir -p universal/lib


As we want to compile with assembler optimizations we need to install gas-preprocessor. This can be done by downloading it from https://github.com/mansr/gas-preprocessor. Now copy gas-preprocessor.pl to the /usr/bin directory. Finally, change the permissions of the file to Read & Write for all.

The following three steps compile the library. More detailed information on how FFmpeg can be configured can be found by issuing ./configure --help in the FFmpeg folder. The following options configure FFmpeg for the armv7 build.

./configure \
--prefix=armv7 \
--disable-ffmpeg \
--disable-ffplay \
--disable-ffprobe \
--disable-ffserver \
--enable-avresample \
--enable-cross-compile \
--sysroot="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS6.1.sdk" \
--target-os=darwin \
--cc="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/gcc" \
--extra-cflags="-arch armv7 -mfpu=neon -miphoneos-version-min=6.1" \
--extra-ldflags="-arch armv7 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS6.1.sdk -miphoneos-version-min=6.1" \
--arch=arm \
--cpu=cortex-a9 \
--enable-pic

Finally, to build FFmpeg, run the following commands:

make clean && make && make install

To configure FFmpeg for armv7s, use the following:

./configure \
--prefix=armv7s \
--disable-ffmpeg \
--disable-ffplay \
--disable-ffprobe \
--disable-ffserver \
--enable-avresample \
--enable-cross-compile \
--sysroot="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS6.1.sdk" \
--target-os=darwin \
--cc="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/gcc" \
--extra-cflags="-arch armv7s -mfpu=neon -miphoneos-version-min=6.1" \
--extra-ldflags="-arch armv7s -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS6.1.sdk -miphoneos-version-min=6.1" \
--arch=arm \
--cpu=cortex-a9 \
--enable-pic


Then build with

make clean && make && make install

Finally, configure the following for the i386 build. Note that assembler optimizations are disabled, as the build will throw an error otherwise.

./configure \
--prefix=i386 \
--disable-ffmpeg \
--disable-ffplay \
--disable-ffprobe \
--disable-ffserver \
--enable-avresample \
--enable-cross-compile \
--sysroot="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator6.0.sdk" \
--target-os=darwin \
--cc="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/usr/bin/gcc" \
--extra-cflags="-arch i386" \
--extra-ldflags="-arch i386 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator6.0.sdk" \
--arch=i386 \
--cpu=i386 \
--enable-pic \
--disable-asm

And build again with:

make clean && make && make install

Now it is time to create the universal library; this is done with the lipo command. Issue the following in the FFmpeg folder:

cd armv7/lib
for file in *.a
do
cd ../..
xcrun -sdk iphoneos lipo -output universal/lib/$file -create \
-arch armv7 armv7/lib/$file \
-arch armv7s armv7s/lib/$file \
-arch i386 i386/lib/$file
echo "Universal $file created."
cd -
done
cd ../..

It is now time to link the library in Xcode. First, create a new project in Xcode (or use an existing one). Drag the .a files from ffmpeg/universal/lib into the Frameworks folder in the Project Navigator pane. Ensure that "Copy items into destination group's folder (if needed)" is checked.

Now the include files need to be copied. Do this by dragging the files under ffmpeg/armv7/include into the Project Navigator pane and ensure that the same check mark is applied.

Finally, the Header Search Paths need to be set for the project. This can be done by clicking the project in the Project Navigator pane. Under Build Settings, search for "Header Search Paths", add the project path $(SRCROOT) and set it to Recursive. Under Link Binary with Libraries (in Build Phases), add libbz2.dylib, libz.dylib and libiconv.dylib.

FFmpeg should now be ready to be used on the iOS platform. To ensure that the library is working, add the following to AppDelegate.m:

#include "avformat.h"

And in the method didFinishLaunchingWithOptions add:

av_register_all();

If everything works as intended, the app should build and start on the simulator or the device.

Appendix D

Decoding with the MediaCodec API

In this chapter we show how we used the API, how to configure, start and close the decoder, what input and output is expected, and how to render to the screen.

As of this writing the MediaCodec API is still new and there are not many examples of how it can be used. The best introduction to the API we found was a presentation given by Google [31]. The API is explained in the Android reference [38]; however, it only gives a simple example of how the API can be used.

D.1 The basics

In the MediaCodec API a decoder has a number of input and output buffers, and each buffer is referred to by index in API calls. Ownership of the buffers is transferred between the client and the decoder. A call to dequeueInputBuffer(timeout) transfers ownership of an input buffer to the client and returns its index. The client can fill the input buffer while it has ownership and then submit the data to the decoder for decoding by calling queueInputBuffer(inputBufferIndex, ...). After the decoder has finished processing the data it can be obtained in an output buffer by calling dequeueOutputBuffer(timeout), which returns the index. When the client is finished with the output buffer it can be released back to the decoder by calling releaseOutputBuffer(outputBufferIndex, render), where render is a boolean which indicates if the buffer should be rendered on a surface [38].

Calls to the mentioned methods are thread safe and can be made asynchronously. In our implementation we use three threads: the first for submitting data to the input buffers, the second for obtaining the indexes of the output buffers and the third for rendering and releasing the output buffers. A condensed single-threaded sketch of the basic flow is shown below.
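
To make the flow above concrete, the following single-threaded sketch pulls one input buffer, fills it with one encoded frame and retrieves one decoded output buffer. The variables frameData (a byte[] with one encoded frame) and ptsUs (its presentation time) are hypothetical placeholders, and decoder is assumed to be configured with a surface as in the next section; the listings in D.2 show the threaded variant we actually used.

// A minimal single-threaded sketch of the buffer flow described above.
// frameData (byte[]) and ptsUs (long) are placeholders for one encoded frame.
int inIndex = decoder.dequeueInputBuffer(10000);          // timeout in microseconds
if (inIndex >= 0) {
    ByteBuffer inBuf = decoder.getInputBuffers()[inIndex];
    inBuf.clear();
    inBuf.put(frameData);                                 // copy the encoded frame into the codec buffer
    decoder.queueInputBuffer(inIndex, 0, frameData.length, ptsUs, 0);
}

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outIndex = decoder.dequeueOutputBuffer(info, 10000);  // wait for a decoded buffer
if (outIndex >= 0) {
    decoder.releaseOutputBuffer(outIndex, true);          // true = render to the configured surface
}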

D.2 Sample code

It can be a good idea to check if a device has support for the media type that the decoder should decode before actually configuring the decoder. In Listing D.1, based on an example from [31], a check is made to see if a decoder supports the baseline profile of H.264.

Listing D.1: Check H.264 decoder support (Java)

boolean support = false;
for (int i = 0; i < MediaCodecList.getCodecCount(); ++i) {
    MediaCodecInfo inf = MediaCodecList.getCodecInfoAt(i);
    if (inf.isEncoder()) {   // skip encoders; we are looking for a decoder
        continue;
    }
    String[] types = inf.getSupportedTypes();
    for (int j = 0; j < types.length; ++j) {
        MediaCodecInfo.CodecCapabilities caps = inf.getCapabilitiesForType(types[j]);
        if (!types[j].equals("video/avc"))
            continue;
        for (int k = 0; k < caps.profileLevels.length; ++k) {
            switch (caps.profileLevels[k].profile) {
            case MediaCodecInfo.CodecProfileLevel.AVCProfileBaseline:
                Log.i("Decoder", "Supports AVCProfileBaseline");
                support = true;
                break;
            default:
                break;
            }
        }
    }
}

If the check passed, it is time to configure the decoder. In Listing D.2 the decoder is created, configured and started. Here we also start our three worker threads.

Listing D.2: Configure and start the decoder (Java)

// Create the decoder
decoder = MediaCodec.createDecoderByType("video/avc");

// Configure the decoder
MediaFormat format = new MediaFormat();
format.setInteger(MediaFormat.KEY_WIDTH, 800);
format.setInteger(MediaFormat.KEY_HEIGHT, 600);
format.setString(MediaFormat.KEY_MIME, "video/avc");
decoder.configure(format, surface, null, 0);
decoder.start();

// Start the worker threads
Thread th1 = new Thread(new Input());
Thread th2 = new Thread(new Output());
Thread th3 = new Thread(new Render());
th1.start();
th2.start();
th3.start();

All examples that we found by searching on the internet used the API to decode a stored media file. An instance of the MediaExtractor class is created and a file path set. Data to submit to the decoder is obtained by calling extractor.readSampleData(buff, 0). When trying out the MediaCodec API for the first time it can be a good idea to start out by decoding a stored file, as sketched below. In our thesis we handle streams of data, specifically RTP streams carrying H.264 encoded media. Because of this the extractor cannot be used; instead a custom way of obtaining and writing data is required.
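
As an illustration of that file-based approach, a minimal sketch could look as follows. The file path and the choice of track 0 are hypothetical examples, and checked-exception handling is omitted to match the style of the other listings.

MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource("/sdcard/sample.mp4");  // hypothetical path to a stored clip
extractor.selectTrack(0);                       // assume track 0 is the video track

ByteBuffer buff = ByteBuffer.allocate(1024 * 1024);
int size;
while ((size = extractor.readSampleData(buff, 0)) >= 0) {
    long ptsUs = extractor.getSampleTime();     // sample presentation time in microseconds
    // ... hand buff, size and ptsUs to a decoder input buffer as in Listing D.3 ...
    extractor.advance();                        // move on to the next sample
}
extractor.release();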

With H.264 the first submitted data must contain the setup data. In addition to the setup data, each frame must begin with a padding. When submitting the setup data with queueInputBuffer(inputBufferIndex, ..., flag) the flag must be set to MediaCodec.BUFFER_FLAG_CODEC_CONFIG. Listing D.3 shows the input thread. When submitting data with queueInputBuffer(...) the presentationTimeUs attribute signifies the time the frame should be rendered. The presentationTimeUs is not actually used by the API; you can set it here and then retrieve it from MediaCodec.BufferInfo when calling dequeueOutputBuffer(info, ...). The presentation time can be calculated from the RTP timestamps, for example as sketched below.
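
One way to do such a calculation is sketched below. It assumes the 90 kHz RTP clock that is standard for video and uses the RTP timestamp of the first received frame as the zero point; the helper name is our own.

// Convert an RTP timestamp (90 kHz clock) to a presentation time in microseconds.
// firstRtpTimestamp is the RTP timestamp of the first received frame.
long toPresentationTimeUs(long rtpTimestamp, long firstRtpTimestamp) {
    long delta = (rtpTimestamp - firstRtpTimestamp) & 0xFFFFFFFFL; // handle 32-bit wrap-around
    return (delta * 1000000L) / 90000L;                            // ticks -> microseconds
}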

Listing D.3: Input thread (Java)

ByteBuffer[] inputBuffers = decoder.getInputBuffers();
while (run) {
    // Wait for an input buffer index
    int inIndex = decoder.dequeueInputBuffer(-1);

    // Obtain input data; we use a custom class called Frame
    Frame frame = // obtain frame

    // Rewind the buffer position and write the input data to the buffer
    ByteBuffer buffer = inputBuffers[inIndex];
    buffer.rewind();
    buffer.put(frame.buff, 0, frame.length);

    int sampleSize = frame.length;

    // Submit the data to the decoder
    if (frame.config) {
        decoder.queueInputBuffer(inIndex, 0, sampleSize, 0,
                MediaCodec.BUFFER_FLAG_CODEC_CONFIG);
    } else if (frame.end) {
        decoder.queueInputBuffer(inIndex, 0, 0, 0,
                MediaCodec.BUFFER_FLAG_END_OF_STREAM);
        run = false;
    } else {
        decoder.queueInputBuffer(inIndex, 0, sampleSize,
                presentationTimeUs, 0);
    }
}

The output thread in Listing D.4 calls dequeueOutputBuffer(info, -1). The info attribute is of the type MediaCodec.BufferInfo, from which information about the buffer can be obtained, such as whether the end-of-stream flag has been set. The index of the output buffer and its buffer information are put into lists from which the render thread will later retrieve them.

Listing D.4: Output thread (Java)

while (run) {
    MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
    int outIndex = decoder.dequeueOutputBuffer(info, -1);
    indexes.add(outIndex);
    bufferInfo.add(info);
    // flags is a bit mask, so test the end-of-stream bit
    run = (info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) == 0;
}

Listing D.5 shows the render thread. The call to releaseOutputBuffer(outputBufferIndex, render) will release the buffer back to the decoder; because the render flag is set to true and we configured a surface with the decoder, the buffer data will be rendered to that surface. In this sample code the video buffer has been omitted; normally there should be a video buffer before rendering. A sketch of such a time calculation is given after the listing.

Listing D.5: Render thread (Java)

while (true) {
    int outIndex = // get the next index from the index list
    MediaCodec.BufferInfo info = // get the next info from the bufferInfo list

    if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
        break;
    }

    switch (outIndex) {
    // There are more cases that can be checked here
    case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
        Log.d("Render", "output format has changed to " + oformat);
        break;
    default:
        // There should be a video buffer before this step.
        // Use info.presentationTimeUs for the time calculation.
        decoder.releaseOutputBuffer(outIndex, true);
        break;
    }
}
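
The time calculation mentioned in the default case of Listing D.5 could, for example, be performed by a helper along the following lines, which waits until the presentation time of a buffer is due before releasing it for rendering. The helper and its fields are our own sketch and not part of the thesis implementation.

private long firstPtsUs = -1;   // presentation time of the first rendered frame
private long firstRealtimeUs;   // wall-clock time when the first frame was rendered

// Release an output buffer for rendering when its presentation time is due.
void renderWhenDue(MediaCodec decoder, int outIndex, MediaCodec.BufferInfo info)
        throws InterruptedException {
    long nowUs = System.nanoTime() / 1000;
    if (firstPtsUs < 0) {                        // first frame: anchor stream time to wall-clock time
        firstPtsUs = info.presentationTimeUs;
        firstRealtimeUs = nowUs;
    }
    long dueUs = firstRealtimeUs + (info.presentationTimeUs - firstPtsUs);
    long sleepMs = (dueUs - nowUs) / 1000;
    if (sleepMs > 0) {
        Thread.sleep(sleepMs);                   // simple pacing; a real player would also drop late frames
    }
    decoder.releaseOutputBuffer(outIndex, true); // true = render on the configured surface
}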

The MediaCodec.BUFFER_FLAG_END_OF_STREAM flag occurs in all three worker threads. We have used this flag to properly close the threads: the flag is set in the first thread and then traverses and stops the threads in the correct order.

Bibliography

[1] Wikipedia, http://en.wikipedia.org/wiki/Axis_Communications, [Online] [Accessed 13 May 2013]

[2] Charalampos Patrikakis, Nikos Papaoulakis, Chryssanthi Stefanoudaki, and Mario Nunes, Streaming Content Wars: Download and Play Strikes Back, P. Daras and O. Mayora (Eds.): UCMedia 2009, LNICST 40, pp. 218-226, 2010.

[3] Alexander Tarnowski, An Experimental Study of Algorithms for Multimedia Streaming, M.S. thesis, Royal Institute of Technology, Stockholm, Sweden, 2004

[4] William Eklof, Adaptive Video Streaming, M.S. thesis, Royal Institute of Technology, Stockholm, Sweden, 2008

[5] Kai Chen, Yuan Xue, Samarth H. Shah, Klara Nahrstedt, Understanding Bandwidth-Delay Product in Mobile Ad Hoc Networks, University of Illinois at Urbana-Champaign, 2003.

[6] Behrouz A. Forouzan, Data Communications and Networking, chap. 23, pp. 703-760, 2007

[7] John G. Apostolopoulos, Wai-tian Tan, Susie J. Wee, Video Streaming: Concepts, Algorithms, and Systems, 2002

[8] RFC 2326, Real Time Streaming Protocol (RTSP), http://www.ietf.org/rfc/rfc2326.txt, [Online] [Accessed 28 April 2013]

[9] RFC 2617, HTTP Authentication: Basic and Digest Access Authentication, http://www.ietf.org/rfc/rfc2617.txt, [Online] [Accessed 28 April 2013]

[10] Apple, Tunneling QuickTime RTSP and RTP over HTTP, https://developer.apple.com/quicktime/icefloe/dispatch028.html, [Online] [Accessed 6 May 2013]

[11] Whai-En Chen, Chun-Chieh Chiu, Yuan-Bo Chang, A Performance Study on Context-Aware Real-time Multimedia Transmission between Smartphone and Cloud, 2012

[12] Maoqiang Song, Jie Sun, Xiangling Fu, Wenkuo Xiong, Design and Implementation of Media Player Based on Android, 2010

[13] Thomas Zinner, Osama Abboud, Oliver Hohlfeld, Tobias Hossfeld, Impact of Frame Rate and Resolution on Objective QoE Metrics, 2010

[14] Wikipedia, http://en.wikipedia.org/wiki/Feature_phone, [Online] [Accessed 7 May 2013]

[15] Axis Communications, Technical Guide to network video, 2013, http://www.axis.com/files/brochure/bc_techguide_47847_en_1303_lo.pdf, [Online] [Accessed 28 April 2013]

[16] Haakon Riiser, Paul Vigmostad, Carsten Griwodz, Pal Halvorsen, Bitrate and video quality planning for mobile streaming scenarios using a gps-based bandwidth lookup service, 2011.

[17] Behrouz A. Forouzan, Data Communications and Networking, chap. 29, pp. 901-902, 2007

[18] Colin Perkins, RTP: Audio and Video for the Internet, chap. 3, 2008

[19] Motorola, Video Surveillance Trade-Offs, http://www.motorolasolutions.com/web/Business/_Documents/static%20files/VideoSurveillance_WP_3_keywords.pdf, [Online] [Accessed 7 May 2013]

[20] Axis Communications, An explanation of video compression techniques, 2008, http://www.axis.com/files/whitepaper/wp_videocompression_33085_en_0809_lo.pdf, [Online] [Accessed 7 May 2013]

[21] Kush Amerasinghe, H.264 for the rest of us, Adobe Systems.

[22] Wikipedia, http://en.wikipedia.org/wiki/2G, [Online] [Accessed 6 May 2013]

[23] Wikipedia, http://en.wikipedia.org/wiki/Enhanced_Data_Rates_for_GSM_Evolution, [Online] [Accessed 6 May 2013]

[24] Wikipedia, http://en.wikipedia.org/wiki/HSDPA, [Online] [Accessed 28 April 2013]

[25] Wikipedia, http://en.wikipedia.org/wiki/UMTS, [Online] [Accessed 28 April 2013]

[26] Wikipedia, http://en.wikipedia.org/wiki/4G, [Online] [Accessed 28 April 2013]

[27] Wikipedia, http://en.wikipedia.org/wiki/802.11, [Online] [Accessed 28 April 2013]

[28] Wikipedia, http://en.wikipedia.org/wiki/Smartphone, [Online] [Accessed 28 April 2013]

[29] Android, http://developer.android.com/guide/appendix/media-formats.html, [Online] [Accessed 28 April 2013]

[30] Wikipedia, http://en.wikipedia.org/wiki/IOS, [Online] [Accessed 28 April 2013]

[31] MediaCodec, http://www.youtube.com/watch?v=YmCqJlzIUXs, [Online] [Accessed 28 April 2013]

[32] Gartner, http://www.gartner.com/newsroom/id/2335616, [Online] [Accessed 28 April 2013]

[33] Zenprise, http://www.zenprise.com/mdm-cloud-report-q3-2012, [Online] [Accessed 28 April 2013]

[34] Wikipedia, http://en.wikipedia.org/wiki/Android_(operating_system), [Online] [Accessed 28 April 2013]

[35] Apple, http://developer.apple.com/library/ios/#documentation/networkinginternet/conceptual/streamingmediaguide/FrequentlyAskedQuestions/FrequentlyAskedQuestions.html, [Online] [Accessed 28 April 2013]

[36] Apple, http://developer.apple.com/library/ios/#documentation/miscellaneous/conceptual/iphoneostechoverview/MediaLayer/MediaLayer.html, [Online] [Accessed 28 April 2013]

[37] Apple, https://developer.apple.com/resources/http-streaming/, [Online] [Accessed 28 April 2013]

[38] Android, http://developer.android.com/reference/android/media/MediaCodec.html, [Online] [Accessed 28 April 2013]

[39] FFmpeg, http://en.wikipedia.org/wiki/FFmpeg, [Online] [Accessed 28 April 2013]

[40] Changelog FFmpeg (version 0.9), http://www.ffmpeg.org, [Online] [Accessed 28 April 2013]

[41] JJmpeg, http://code.google.com/p/jjmpeg/, [Online] [Accessed 28 April 2013]

[42] Live555, http://www.live555.com/, [Online] [Accessed 28 April 2013]

[43] Netty, http://www.netty.io, [Online] [Accessed 28 April 2013]

[44] Gstreamer, http://www.gstreamer.com/, [Online] [Accessed 28 April 2013]

[45] SpeedTest, http://www.speedtest.net/, [Online] [Accessed 28 April 2013]

[46] Telia, http://www.telia.se/privat/mobilt/mer-om-mobiltelefoni/tackning-hastighet/hastighet-mobilsurf/hastighet-mobilsurf.page, [Online] [Accessed 28 April 2013]

[47] RTSP Player, https://play.google.com/store/apps/details?id=org.rtspplr.app, [Online] [Accessed 8 May 2013]

[48] Egeniq, http://www.egeniq.com, [Online] [Accessed 8 May 2013]

[49] FFmpeg Tutorial, http://dranger.com/ffmpeg/, [Online] [Accessed 13 May 2013]

[50] Compiling FFmpeg, http://www.tangentsoftworks.com/blog/2012/11/12/how-to-prepare-your-mac-for-ios-development-with-ffmpeg-libraries/, [Online] [Accessed 13 May 2013]
