A Case Study in Making WebVR Social: conVRsation
The question I kept being asked at conferences, presentations and panels was ‘How can you demonstrate your latency remotely?’
Our standard ‘road’ demo, an arrangement of our camera and playback pipeline, immersed the user in a literal out-of-body experience by granting them a third-person perspective. They could see their arms moving and how their outfits looked from a god’s-eye point of view, as it was happening: an onsite demo with a latency of approximately 0.3 seconds. This was enough to show how fast our stitching was, as well as our re-projection to the headset, but there were still missing pieces. Eventually we cranked it up a notch and began streaming the feed through the local network of wherever we were exhibiting, demonstrating how we could take the feed from the camera, stitch, encode, stream through the local network, decode, then feed it back into the headset (usually an Oculus Rift) without compromising our ‘real-time’ promise. The result: an onsite demo with a latency of approximately 0.7 seconds. This was, and still is, one of our go-to demos to draw a crowd.
How We Built a Social WebVR Pipeline
But ‘how can you demonstrate your latency remotely?’
Enter WebVR — the droid we had been looking for.
In a nutshell, WebVR lets you experience virtual reality within your browser instead of on an outside platform such as Oculus, SteamVR or the like. It’s a tool for granting open access to the medium via the internet. In fact, all you need to experience WebVR, as listed on the WebVR information site, is a headset and a compatible browser. Likewise, these are now the only two things that users need to access our remote latency demonstration platform. Easy.
The most logical way of showing virtual reality latency remotely was to make use of existing communication protocols: making a live feed accessible across oceans and time zones meant using RTC (real-time communication) protocols. WebRTC, a free and open-source project, allowed us to clear this hurdle. It provided a framework on which to construct a streaming pipeline from the available APIs, essentially laying the groundwork for peer-to-peer video communication within the browser and eliminating the need for external apps. Fundamentally, it’s the same foundation that Google Hangouts is built on. Currently WebRTC is supported mainly by Google and Mozilla, and as such the compatible browser options are limited to Chrome and Firefox. For VR use, the appropriate browser depends on the HMD (head-mounted display) being used.
The problem with A-Frame, the open-source web framework we used to build the WebVR scene: there is currently no native stereoscopic support for video content. Video feeds are limited to monoscopic formats. To circumvent this issue it was necessary to explore the annals of the shared internet community and look for a loophole.
It took 0.42 seconds to find a path forward. Development of this kind of video support was already ongoing and available through GitHub. A user, Óscar Marín Miró, had written a plug-in that allows A-Frame to play 3D video, a component that was ultimately our starting block for enabling stereoscopic video playback through A-Frame in the browser.
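To illustrate the general idea behind stereoscopic playback from a single video texture, here is a minimal sketch in which each eye samples only its own half of the frame. It is not the API of the plug-in mentioned above; the names `eyeUvRect` and `remapUv`, and the assumption about which half belongs to which eye, are hypothetical and chosen purely for illustration.

```typescript
// Hypothetical helpers for illustration only (not the plug-in's API).
type Eye = "left" | "right";
type StereoLayout = "side-by-side" | "top-bottom";

interface UvRect {
  u0: number;
  v0: number;
  u1: number;
  v1: number;
}

// Assumption: the left eye uses the left half (side-by-side) or the top half
// (top-bottom), and v = 0 is the bottom edge of the texture.
function eyeUvRect(layout: StereoLayout, eye: Eye): UvRect {
  if (layout === "side-by-side") {
    return eye === "left"
      ? { u0: 0, v0: 0, u1: 0.5, v1: 1 }
      : { u0: 0.5, v0: 0, u1: 1, v1: 1 };
  }
  return eye === "left"
    ? { u0: 0, v0: 0.5, u1: 1, v1: 1 }
    : { u0: 0, v0: 0, u1: 1, v1: 0.5 };
}

// Remap a full-frame texture coordinate (u, v in [0, 1]) into one eye's half.
function remapUv(rect: UvRect, u: number, v: number): [number, number] {
  return [
    rect.u0 + u * (rect.u1 - rect.u0),
    rect.v0 + v * (rect.v1 - rect.v0),
  ];
}
```

In a renderer, the remapped coordinates would be applied to the geometry shown to each eye, so both eyes share one video element but see different halves of it.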
Equirectangular projection is the standard for 360 video so most of the existing frameworks and plug-ins are built with this as the focus point but, considering our latency problem, it’s not necessarily the most efficient for web streaming in terms of maintaining the native quality of the feed whilst ensuring instant playback. It was necessary to include support for further projection options.
More concisely, our main contribution to the existing code was to add support for displaying cube map projection and piping it directly into A-Frame. Like anything else, the choice of projection is based on application-specific considerations; there is no universally accepted correct choice. In an application like conVRsation, in which the end user has total freedom to look around the 360 space, the points of focus in the space are unknown.
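For readers unfamiliar with cube maps, the sketch below shows the usual major-axis lookup that turns a viewing direction into one of six faces plus a coordinate on that face. It is a generic illustration, not our production code: the face names, the face orientation (an OpenGL-style table) and the choice of -Z as the forward direction are all assumptions, and a real player has to match whatever layout its video texture actually uses.

```typescript
type CubeFace = "+X" | "-X" | "+Y" | "-Y" | "+Z" | "-Z";

interface CubeSample {
  face: CubeFace;
  s: number; // horizontal face coordinate in [0, 1]
  t: number; // vertical face coordinate in [0, 1]
}

// Pick the face whose axis dominates the direction, then project the other
// two components onto that face.
function directionToCubeSample(x: number, y: number, z: number): CubeSample {
  const ax = Math.abs(x);
  const ay = Math.abs(y);
  const az = Math.abs(z);
  const ma = Math.max(ax, ay, az);
  if (ma === 0) throw new Error("direction must be non-zero");

  let face: CubeFace;
  let sc: number;
  let tc: number;
  if (ax >= ay && ax >= az) {
    face = x > 0 ? "+X" : "-X";
    sc = x > 0 ? -z : z;
    tc = -y;
  } else if (ay >= az) {
    face = y > 0 ? "+Y" : "-Y";
    sc = x;
    tc = y > 0 ? z : -z;
  } else {
    face = z > 0 ? "+Z" : "-Z";
    sc = z > 0 ? x : -x;
    tc = -y;
  }
  return { face, s: (sc / ma + 1) / 2, t: (tc / ma + 1) / 2 };
}

// Longitude/latitude in degrees to a unit direction.
// Assumption: longitude 0, latitude 0 looks down the -Z axis.
function lonLatToDirection(lonDeg: number, latDeg: number): [number, number, number] {
  const lon = (lonDeg * Math.PI) / 180;
  const lat = (latDeg * Math.PI) / 180;
  return [
    Math.cos(lat) * Math.sin(lon),
    Math.sin(lat),
    -Math.cos(lat) * Math.cos(lon),
  ];
}
```

Because every face covers the same 90 degree field of view, no single viewing direction is favoured, which is the property that matters when the viewer's point of focus is unknown.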
In an equirectangular projection the pixel density is unevenly distributed. A good way to visualize the problem is that Greenland looks similar in size to Australia on the ‘unfolded’ map of the earth, when in reality Australia is roughly three and a half times bigger. This is because towards the northern and southern edges of an equirectangular projection the image is stretched: every row holds the same number of pixels, yet rows near the poles cover far less of the sphere. Pixels are spent on these warped regions, and the quality there suffers from the distortion. The central region (the equator) of the projection, usually the main point of focus for an end user, is left with a comparatively lower pixel density and therefore a lesser quality. In a cube map projection the pixel density is more controlled, which suits a case like ours where the regions of interest are not known in advance and results in a more evenly distributed image quality.
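The uneven spread can be put into numbers with a little spherical geometry. The sketch below is an illustration rather than a measurement of our feed: it compares the share of an equirectangular image's pixels that lies poleward of a given latitude with the share of the sphere those pixels actually cover. The function names are hypothetical.

```typescript
const toRad = (deg: number): number => (deg * Math.PI) / 180;

// Share of an equirectangular image's rows (and therefore pixels) that lie
// poleward of +/- latDeg, counting both the northern and southern bands.
function polarPixelShare(latDeg: number): number {
  return (90 - latDeg) / 90;
}

// Share of the sphere's surface that lies poleward of +/- latDeg
// (both polar caps together).
function polarAreaShare(latDeg: number): number {
  return 1 - Math.sin(toRad(latDeg));
}

// How much wider a feature is drawn at latDeg than the same feature
// would be drawn at the equator.
function horizontalStretch(latDeg: number): number {
  return 1 / Math.cos(toRad(latDeg));
}
```

At 60 degrees of latitude, `polarPixelShare(60)` is about 0.33 while `polarAreaShare(60)` is about 0.13, so roughly a third of the pixels describe about an eighth of the scene, and `horizontalStretch(60)` is about 2.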
Consequently, support for cube map projection through WebVR enabled us to stream stereoscopic video feeds without a substantial loss in image quality. Ultimately it grants higher quality at a lower resolution compared to standard equirectangular. With the standard trade-offs of online streaming taken into account, mainly resolution loss, the decision to preserve as much of the image as we can was a no-brainer.
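As a rough worked example of ‘higher quality at lower resolution’, the sketch below compares pixel counts under one common rule of thumb: each 90 degree cube face is given a side equal to a quarter of the equirectangular width, so both formats carry the same number of pixels around the horizon. That sizing rule, the 3840-pixel width used in the note below and the function name are assumptions for illustration, not figures from our pipeline.

```typescript
interface PixelBudget {
  equirect: number; // total pixels in the equirectangular frame(s)
  cubeMap: number;  // total pixels in the six cube faces
  ratio: number;    // cubeMap / equirect
}

// Assumption: a 2:1 equirectangular frame, and cube faces whose side is one
// quarter of the equirectangular width.
function comparePixelBudgets(equirectWidth: number, stereo: boolean = true): PixelBudget {
  const eyes = stereo ? 2 : 1;
  const equirect = equirectWidth * (equirectWidth / 2) * eyes;
  const faceSide = equirectWidth / 4;
  const cubeMap = 6 * faceSide * faceSide * eyes;
  return { equirect, cubeMap, ratio: cubeMap / equirect };
}
```

For a 3840-pixel-wide stereo frame this gives 14,745,600 pixels for equirectangular against 11,059,200 for the cube map, a ratio of 0.75. Under this assumption the cube map streams a quarter fewer pixels for the same detail around the horizon.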
The Suometry Performance Edge
As much as the above-mentioned WebVR pipeline and ‘hacks’ have allowed us to ensure real-time stereoscopic streaming for remote demonstrations, the end result is only as fast as the live stitching that feeds it. As a company our core IP lies in the patented geometry of our cameras, which in turn spills into the accompanying internal stitching software needed to deliver immersive virtual reality video with near-zero latency. Currently our glass-to-glass stitching time is less than the blink of an eye, so ‘plugging’ that video feed into the WebVR pipeline adds no latency on our end.
As a comparison, if you plugged any other off-the-shelf VR camera into the same pipeline, the best available latency would currently be about 2–3 seconds, and a flowing, normal conversation immediately becomes impossible. Building a layer of communication onto the streaming feed is only possible when the incoming video arrives with negligible latency.
On this note, the final challenge was having the incoming video feed recognized as a webcam by the browser. That is, a webcam feed that is already the stitched output from the camera.
How did we do this? DirectShow.
DirectShow, an interface developed by Microsoft and native to Windows, is an API that allows high-quality video capture and playback from your applications. It allows a device to be recognized as a webcam. This results in a smoother flow of the feed into the browser with less loss, as opposed to standard video streaming, which involves encoding the stream at the start of the pipeline and decoding it at the end, two processes that affect the quality of the video. If we take Skype as an example, a platform that uses the built-in webcam of a device, the application receives the camera frames directly, AKA what you see is what you get. DirectShow allowed us to exploit this same principle. Although Windows Media Foundation was supposed to replace the DirectShow API, DirectShow is still the gold standard for this kind of capture and still the most commonly used.
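Once the stitched output is exposed as a webcam, the browser side only has to select it from the list of capture devices. The sketch below shows one way that selection could look. The helper names and the label hint are hypothetical, and the small interfaces simply mirror the fields the browser reports so that the logic can run outside a browser; the calls named in the comments (`navigator.mediaDevices.enumerateDevices` and `getUserMedia`) are the standard browser APIs.

```typescript
// Mirrors the fields of the browser's device entries that we rely on.
interface DeviceInfo {
  kind: string;
  label: string;
  deviceId: string;
}

interface CaptureConstraints {
  video: { deviceId: { exact: string } };
  audio: boolean;
}

// Find the first video input whose label contains the hint (case-insensitive).
function pickVideoInput(devices: DeviceInfo[], labelHint: string): DeviceInfo | undefined {
  const hint = labelHint.toLowerCase();
  for (const device of devices) {
    if (device.kind === "videoinput" && device.label.toLowerCase().indexOf(hint) !== -1) {
      return device;
    }
  }
  return undefined;
}

// Ask for exactly that device, plus a microphone for the audio channel.
function buildConstraints(device: DeviceInfo): CaptureConstraints {
  return { video: { deviceId: { exact: device.deviceId } }, audio: true };
}

// In a browser this would be wired up roughly as follows:
//   const devices = await navigator.mediaDevices.enumerateDevices();
//   const cam = pickVideoInput(devices, "stitched"); // hypothetical label
//   if (cam) {
//     const stream = await navigator.mediaDevices.getUserMedia(buildConstraints(cam));
//   }
```

Note that browsers generally leave device labels blank until the user has granted camera or microphone permission, so a label-based lookup like this only works after the permission prompt has been accepted.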
With all major barriers successfully sledgehammered, it’s now possible to push the stream through the browser in full stereoscopic capture. On the player side of things, the stream is hosted on our own internal website, and from there the video is available to connect, so long as WebRTC does its job, initializes the connection and displays the incoming feed (which it always does). All that is needed to join the feed and view in 3D is access to the unique URL and an HMD. Alternatively, the video stream can be viewed in monoscopic 2D on the desktop or on a mobile device.
conVRsation allows the sharing and joining of real-time virtual reality chat rooms and hangouts. A camera, the patented Suometry model, is set up on the streamer’s end and the feed is initialized through the browser. The instructions for setup are simple and intuitive, but the camera, audio and projection input settings must be specified. On the viewer’s end a headset is mounted and a URL accessed. The video is currently one-way and the audio two-way. The applications are seemingly limited only by the imagination of the user: we’ve had inquiries about virtual tours, webinars, education, adult entertainment and collaborative spaces.
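The one-way video and two-way audio arrangement maps naturally onto WebRTC transceiver directions. The following is a hedged sketch of how a room could describe that, along with a check that the three required streamer settings are present. The type and function names are hypothetical and the platform's real settings format is not shown here; only the direction strings and the `addTransceiver` call named in the comment are standard WebRTC.

```typescript
type Role = "streamer" | "viewer";
type Direction = "sendonly" | "recvonly" | "sendrecv";

interface MediaPlan {
  video: Direction;
  audio: Direction;
}

// Video flows from the streamer to the viewers; audio flows both ways.
function mediaPlanFor(role: Role): MediaPlan {
  return {
    video: role === "streamer" ? "sendonly" : "recvonly",
    audio: "sendrecv",
  };
}

// Hypothetical shape for the settings a streamer must specify.
interface StreamerSettings {
  camera?: string;
  audio?: string;
  projection?: "equirectangular" | "cubemap";
}

// List whichever of the three required settings are still missing.
function missingSettings(settings: StreamerSettings): string[] {
  const missing: string[] = [];
  if (!settings.camera) missing.push("camera");
  if (!settings.audio) missing.push("audio");
  if (!settings.projection) missing.push("projection");
  return missing;
}

// In a browser, the plan would be applied to a peer connection roughly as:
//   const plan = mediaPlanFor("viewer");
//   pc.addTransceiver("video", { direction: plan.video });
//   pc.addTransceiver("audio", { direction: plan.audio });
```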
But the key factor remains: this is a platform for real-time streaming in VR.
Ultimately the latency we generate from the pipeline is symbiotic with that of our core IP. In attempting to build a stable remote demonstration platform we took a deep dive into existing protocols and built directly using WebVR, piggybacking on the open-source project. It’s now our go-to for proving our latency across time-zones as well as demonstrating our innovative stereoscopic capture and stitching whilst maintaining minimal GPU usage.
Beta Platform, User Experience and Moving Forward
Clients can now virtually step into our Montreal office, so to speak. But when hosting remote demos and relaying instructions through email, the user experience challenges inherent in software pipelines needed to be addressed. Starting from the ground up, the headset is the first main consideration. Every demo comes with a set of instructions relevant to the hardware being used on the other end. This applies to both the steps of usage and the browser compatibility detailed above. All of this pertains to the requirements of WebVR.
It’s highly recommended to use connected headsets (Vive, Rift etc.) when viewing the stream, as they offer improved resolution and a better appreciation of the stereoscopic capture, but using Cardboards or mobile VR headsets is also a good way to go, albeit with the resolution risks inherent in WiFi connectivity. Broadband speed is ultimately a large factor in how well the feed will be delivered, just like any other OTT streaming. Currently we always recommend Firefox 64-bit (the latest available version) as the compatible browser. Chrome also does a good job if Google Daydream is the weapon of choice. Ultimately, viewing the stream without an HMD negates the stereoscopic aspect but still allows back-and-forth communication and 360 exploration.
Compatibility also pertains to mobile devices. Currently Android is winning the war with VR hosting and it’s no different here. Although iOS still works, mainly using Chrome, it’s still ‘buggy’ to say the least. The UX challenges of mounting Cardboards with iPhones outside of YouTube are still ongoing. Regardless, we have a preference: Android. (relevant article: https://medium.com/@peterjwest/web-vr-ux-and-fullscreen-on-ios-762192c38102)
Allowing microphone and video access is a go/no-go stipulation for accessing the stream within the browser. Prompting the user when microphone usage is needed is standard on most browsers but it’s common for this to be disregarded or lost in the mix. Having a setup that’s fully compatible with WebVR is now an essential item for joining a conVRsation. Usernames and profile pictures are now a part of the platform, allowing each user to identify the others. In terms of the number of people in a single ‘room’ we try to keep it to four maximum, but support is available for up to ten or more.
Through extensive beta testing over the past six months, user feedback has helped us implement a new and upcoming platform for its deployment, taking into account features and improvements that will help facilitate collaborative conferences, the sharing of images and data across streams, the inclusion of AR and ultimately the comfort of the experience itself in terms of setting up the environment being shown. Ultimately the project is an ever-evolving exploration, one that began as a way of demonstrating latency remotely and has become a possible open-source project for virtual reality communication.
At Suometry, we continually work to deliver the fastest 3D 360 stereoscopic capture on the market today, and as such conVRsation is another step on our development roadmap. It might have been an accidental one, but we are currently deploying it as a tool for allowing others to explore and experience true Social VR, as opposed to relying on avatars or surfing the uncanny valley.
An article about conVRsation: https://skarredghost.com/2018/02/23/suometry-wants-revolutionize-enterprise-vr-video-communication/
For a demo of the conVRsation platform, please contact me at firstname.lastname@example.org