Video playback on iOS & tvOS
When developing an application on iOS and/or tvOS, video playback can be a strategic or an accessory component. Several solutions are available on iOS/tvOS, each with pros and cons depending on the project requirements. The selection is an important milestone, because it determines future limitations and development costs. The unit in charge of playback is a player. This can be the playback engine alone (the library in charge of decoding and displaying the video) or its UI/UX wrapper, which provides a varied set of playback control features.
Technical reminder: iOS and tvOS share the same codebase, even if some code is platform specific, mostly regarding the UI.
Player pick
Let’s take a look at the available alternatives to find the most suitable player for the project requirements. There are essentially three technology choices for the player:
Off the shelf solution
There are different actors on the market, providing players with different sets of features, free or not. They can offer a complete package including, among other features, DRM management. They can also combine on-device playback with content hosting. The risk with those solutions is getting stuck on a feature that cannot be implemented, because a closed API does not allow custom features to be added on top of it, or suffering from bugs or limited support.
AVKit
AVKit is the native “view-level services for media playback” from Apple. It offers a large set of features, but a thin API regarding customization. It guarantees Apple support and offers the most natural user experience.
Full custom player
It consists of doing the same work AVKit does, on your own, your way. This solution allows adding as many features as requested, but it implies a substantial cost, first in development, then in maintenance.
The last two solutions are both based on AVFoundation, the native iOS/tvOS playback engine. During the player selection phase, the key is to list the current requirements, anticipate the future ones and find the solution offering the best compromise.
The key features for a player
Streams
First things first, the streams. The first ingredient of the best playback experience is the stream itself.
On iOS and tvOS, Apple strongly recommends using the HLS format. In this format, a stream is made of several content playlists offering varying video qualities, supported audio formats, languages, etc. The player selects the most appropriate subset depending on the network quality and the user preferences: AVFoundation adapts in real time to bandwidth and network capabilities to select the most suitable variant. When AVFoundation is the playback engine (see next section), the player also takes the screen size into account, so it is good practice to provide a wide range of qualities, from the small iPhone 5 screen to the large iPad Pro screen (plus lower qualities for poor networks, to avoid stalling).
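To make the idea of a quality ladder concrete, here is a minimal sketch of how a player could pick a variant from measured throughput and screen height. It is a simplified model for illustration only, not AVFoundation's actual selection logic; the `Variant` type, the `pickVariant` function and the ladder values are all hypothetical.

```swift
import Foundation

/// One variant of a hypothetical HLS multivariant playlist.
struct Variant: Equatable {
    let bandwidth: Int  // peak bits per second, as in the BANDWIDTH attribute
    let height: Int     // vertical resolution in pixels
}

/// Picks the smallest variant that covers the screen among those the network can sustain.
/// Falls back to the largest sustainable variant, then to the lowest-bandwidth variant,
/// so that playback can still start on a poor network.
func pickVariant(from ladder: [Variant], throughput: Int, screenHeight: Int) -> Variant? {
    let sorted = ladder.sorted { $0.bandwidth < $1.bandwidth }
    let affordable = sorted.filter { $0.bandwidth <= throughput }
    let covering = affordable.first(where: { $0.height >= screenHeight })
    return covering ?? affordable.last ?? sorted.first
}

// Illustrative ladder, from a low-quality fallback up to 4K.
let ladder = [
    Variant(bandwidth: 400_000, height: 234),
    Variant(bandwidth: 1_200_000, height: 540),
    Variant(bandwidth: 4_500_000, height: 1080),
    Variant(bandwidth: 12_000_000, height: 2160),
]

let onFastNetwork = pickVariant(from: ladder, throughput: 20_000_000, screenHeight: 1080)
let onPoorNetwork = pickVariant(from: ladder, throughput: 300_000, screenHeight: 1080)
```

With this ladder, the fast network case resolves to the 1080-pixel variant rather than the 4K one, and the poor network case falls back to the lowest variant instead of stalling. This is why the ladder needs entries at both ends.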
Providing rich HLS streams enhances the user experience. An I-frame playlist enables responsive seeking, especially on tvOS, where a thumbnail displays the seek target frame. Low-Latency HLS brings the latency of live streams down to a few seconds, which is a key feature for live sport events, for example.
On top of the stream format, depending on the content, protection may be required to prevent copyrighted content from being stolen; this is what DRM does. Apple recommends the use of its own solution, FairPlay. This is the only technology supported by AVFoundation and AVKit. Having a proprietary DRM implementation may be a reason to use the associated proprietary player, even if it reduces the available feature set.
Playback engine
Once the streams are good, selecting the playback engine is a quick step. I think not using AVFoundation would be a bad idea: it has been built and maintained by Apple for a long time and benefits from their low-level optimizations. It is efficient regarding both performance and power consumption, and it will keep receiving improvements in future iOS/tvOS versions. For example, AVFoundation supports Low-Latency HLS (introduced in beta with iOS/tvOS 13 and generally available since iOS/tvOS 14) without requiring additional work on the client side.
Controls
Then, the biggest part regarding UI and UX is the controls: the set of buttons and gestures that let users navigate through their video playback. In that specific area, AVKit has great advantages. It provides a rich collection of controls, such as system-level volume control, fine-grained scrubbing through the media, drag to dismiss fullscreen mode, AirPlay support (iOS only), seek thumbnail (tvOS only), audio source selection, language selection and Picture in Picture (iPad only at first, iPhone and Apple TV since iOS/tvOS 14). On top of being elegant and well designed, the AVKit controls are shared across most of the applications on the user's iPhone, iPad or Apple TV, meaning that users are used to them: they know where to find the buttons and how to manipulate the player. This last point is important considering how often people use their phone during the day.
If these features match the application requirements, AVKit is the best solution. It is built to be easy to use, handles all the controls work and provides a great user experience.
Otherwise, mostly when application-specific controls are required, going custom becomes an option. But that option must be weighed knowing that developing and maintaining a player is a very time-consuming operation.
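As a minimal sketch of how little code the AVKit route needs, the function below wires the AVFoundation engine (`AVPlayer`) into the AVKit UI (`AVPlayerViewController`). The function name `makePlayerViewController(for:)` is hypothetical, and the stream URL is supplied by the caller.

```swift
import AVKit
import UIKit

/// Builds the system player UI around an AVFoundation player for the given stream.
func makePlayerViewController(for url: URL) -> AVPlayerViewController {
    let controller = AVPlayerViewController()
    // AVPlayer is the playback engine; AVPlayerViewController adds the native controls.
    controller.player = AVPlayer(url: url)
    return controller
}
```

From a view controller, the result would typically be shown with `present(controller, animated: true) { controller.player?.play() }`, so that playback starts once the player is on screen.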
AVKit pros and cons
AVKit is the player Apple uses in system applications such as Safari on iOS or the Apple TV app on tvOS. Before going through its features and capabilities, it is worth noting that using AVKit guarantees a homogeneous playback experience across the user's devices.
A deep integration with the OS
It benefits from a varied set of features that are well integrated with the OS:
- the volume controls work at system level for up, down and mute
- once the media metadata (title, artwork, etc.) is provided, it is displayed in the various information views across devices
- AirPlay 2 with out-of-the-box multi-room support on both iOS and tvOS
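The metadata item of the list above can be sketched as follows, assuming the `externalMetadata` property that AVKit adds to `AVPlayerItem`. The helper `makeMetadataItem`, the stream URL and the title and description values are hypothetical.

```swift
import AVFoundation
import AVKit

/// Builds one metadata entry; the "und" language tag marks it as language independent.
func makeMetadataItem(_ identifier: AVMetadataIdentifier, value: String) -> AVMetadataItem {
    let item = AVMutableMetadataItem()
    item.identifier = identifier
    item.value = value as NSString
    item.extendedLanguageTag = "und"
    return item
}

// Hypothetical stream URL and metadata values.
let playerItem = AVPlayerItem(url: URL(string: "https://example.com/movie/master.m3u8")!)
playerItem.externalMetadata = [
    makeMetadataItem(.commonIdentifierTitle, value: "Episode 1"),
    makeMetadataItem(.commonIdentifierDescription, value: "A short description of the episode."),
]
```

Artwork would follow the same pattern with `.commonIdentifierArtwork` and image data as the value.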
With a light additional development effort, Picture-in-Picture is possible, even with the application in the background, on both iOS and tvOS. This is not possible without AVKit.
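The "light additional development effort" can be sketched as follows for iOS. It assumes the application also declares the "Audio, AirPlay, and Picture in Picture" background mode in its capabilities; the function name `makePictureInPictureController(for:)` is hypothetical.

```swift
import AVFoundation
import AVKit

/// Configures the audio session for playback and returns a player that allows Picture in Picture.
func makePictureInPictureController(for url: URL) throws -> AVPlayerViewController {
    // Picture in Picture requires an audio session configured for playback.
    try AVAudioSession.sharedInstance().setCategory(.playback, mode: .moviePlayback)

    let controller = AVPlayerViewController()
    controller.player = AVPlayer(url: url)
    controller.allowsPictureInPicturePlayback = true
    return controller
}
```

Whether the device actually supports the feature can be checked at runtime with `AVPictureInPictureController.isPictureInPictureSupported()`.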
A large set of controls over playback
AVKit offers elegant ways of controlling the playback. iOS has a transport bar to seek through the content in both fast and precise ways, while tvOS has a seek thumbnail fed by the I-frame stream, allowing efficient scrubbing with the remote control. Even if it is possible to reimplement those behaviors in a custom player, using AVKit saves a significant amount of time.
On the other hand, it is basically not possible to customize those behaviors or the UI elements. This can be frustrating, but it is a good thing: it guarantees the exact same user experience throughout the Apple ecosystem, so the user knows exactly how the player behaves.
Additional features
tvOS specific
On tvOS, AVKit does not allow adding custom controls (like buttons) to the native ones. There is a “non obscured view” where it’s possible to display additional information over the video, but it won’t receive focus, so it won’t support user interaction. There are some workarounds to display buttons, but not while the native controls are displayed; Netflix does so to present the next episode suggestion within its “binge watching” feature. In addition, AVKit already supports various gestures, and overriding them or adding others on top of them is not recommended, which makes the customization window very tight.
Still, the native gestures are interesting:
- Swipe down opens a multi-tab overlay with:
  - An “info” section displaying the same kind of information as the “now playing” screen in the iOS Control Center, automatically filled with the content metadata
  - Audio settings, automatically filled with the stream languages and subtitles, which also provide system audio settings such as the audio output
  - A custom section to display custom content. And, good news, user interaction is enabled there
- Swipe up opens a custom overlay container
- Swipe right/left is a brand new tvOS 13 feature called channel flipping. It’s designed to zap through channels: AVKit asks for the next/previous channel and changes the stream on its own. This feature can only be enabled for live streams.
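The swipe up and swipe right/left gestures above can be sketched as follows for tvOS 13 or later. The `ChannelFlipper` class and its channel list are hypothetical. In this sketch the delegate swaps the player item itself before calling the completion handler, and returns plain view controllers as interstitials; both are assumptions about how the hand-off with AVKit works, so check them against the AVKit documentation.

```swift
import AVKit
import UIKit

/// Hypothetical coordinator owning a list of live channel URLs (tvOS only).
@available(tvOS 13.0, *)
final class ChannelFlipper: NSObject, AVPlayerViewControllerDelegate {
    private let channels: [URL]
    private(set) var currentIndex = 0

    init(channels: [URL]) {
        precondition(!channels.isEmpty, "At least one channel is required")
        self.channels = channels
    }

    /// Builds a player with a custom overlay (swipe up) and channel flipping (swipe right/left).
    func makePlayerViewController(overlay: UIViewController) -> AVPlayerViewController {
        let controller = AVPlayerViewController()
        controller.player = AVPlayer(url: channels[currentIndex])
        controller.customOverlayViewController = overlay
        controller.delegate = self
        return controller
    }

    func playerViewController(_ playerViewController: AVPlayerViewController,
                              skipToNextChannel completion: @escaping (Bool) -> Void) {
        flip(by: 1, in: playerViewController, completion: completion)
    }

    func playerViewController(_ playerViewController: AVPlayerViewController,
                              skipToPreviousChannel completion: @escaping (Bool) -> Void) {
        flip(by: -1, in: playerViewController, completion: completion)
    }

    // Views shown while the channel changes; empty placeholders in this sketch.
    func nextChannelInterstitialViewController(for playerViewController: AVPlayerViewController) -> UIViewController {
        return UIViewController()
    }

    func previousChannelInterstitialViewController(for playerViewController: AVPlayerViewController) -> UIViewController {
        return UIViewController()
    }

    /// Moves through the channel list, wrapping around at both ends.
    private func flip(by offset: Int, in controller: AVPlayerViewController, completion: (Bool) -> Void) {
        currentIndex = (currentIndex + offset + channels.count) % channels.count
        controller.player?.replaceCurrentItem(with: AVPlayerItem(url: channels[currentIndex]))
        completion(true)
    }
}
```

The coordinator has to be retained by the caller for as long as the player is on screen, since the `delegate` property does not keep it alive.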
iOS specific
As on tvOS, AVKit on iOS does not allow adding controls to the regular ones. Adding buttons into the AVPlayerViewController view hierarchy is strongly discouraged. It’s possible to add buttons on top of it, but not presenting a naked AVPlayerViewController removes features like “pull to exit fullscreen”, which is important because it spares users from reaching for the top of their screen to exit the player. Another difficulty when adding controls is that their display won’t be synchronized with the default ones: it’s possible to mimic the fade-in and the timed fade-out, but it won’t be perfect.
In short, AVKit is not made for inserting custom controls.
AVKit on iOS doesn’t offer as many UI/UX features as on tvOS, but it is way easier to implement them as there is no remote control handling that has to be shared between AVKit and the application.
Still, there are interesting iOS-specific features. For example, AirPlay support lets users stream their content to an Apple TV, at no cost to the developer.
Conclusion
Regarding the choice of a player solution, it should be noted that AVKit offers the best user experience for the features it provides. No other solution can offer an equivalent level of integration with the system, and that experience comes for free, or almost free. But AVKit hasn’t been made to be customized, or even extended with simple features like adding buttons to the control overlay.
If a custom experience is more important than native features, the next solution is to build a custom player.