VR audio simulations and binaural sound (in HL2)

xef6
One Eyed Hopeful
Posts: 41
Joined: Sat Nov 03, 2012 11:41 pm

VR audio simulations and binaural sound (in HL2)

Post by xef6 »

So, it seems like the Rift is going to be (for the foreseeable future) our solution to the visual part of VR. But it's not just "Visual Reality", so we still have the problem of audio to worry about. I thought it'd be nice to have a thread dedicated to discussing this, and perhaps a place to share good examples of virtual audio simulations. I'm hoping to avoid non-simulation stuff like virtual barber shop recordings, because they aren't really applicable to realtime games.

Most VR sims will only include visual and audio components, perhaps also gesture input like the Razer Hydra. The Rift is a huge step forward in terms of visual perceptual fidelity (ignoring the temporary issue of dev-kit resolution); I'm curious how long we'll be stuck with "last-gen" audio. Does anyone have examples of games (or anything) that have good positional audio, or good sound design in general?

These are just research projects, but they are probably the best examples I can think of:
[youtube]http://www.youtube.com/watch?v=MQt1jtDBNK4[/youtube]

[youtube]http://www.youtube.com/watch?v=lRRmi5YfwuM[/youtube]

The rest of the youtube channel these videos were posted on has many other cool projects. They've obviously integrated their work with HL2 as a mod, but the source is unavailable AFAIK. Here are links to the project pages:
http://gamma.cs.unc.edu/PrecompWaveSim/
http://gamma.cs.unc.edu/ESM/

edit: added "HL2" to the title because of the videos above
STRZ
Certif-Eyed!
Posts: 559
Joined: Mon Dec 05, 2011 3:02 am
Location: Geekenhausen

Re: VR audio simulations and binaural sound (in HL2)

Post by STRZ »

Very impressive. It could be a form of sound-wave tracing, similar to tracing light: you have the sound source emitting sound waves, which bounce off virtual walls and surfaces with different reflective and dampening characteristics, and your character's ears (L/R channels) receiving those waves. Perceived distance and position localisation would then just be natural side effects, because reverb is basically only a delay: reflected sound waves reach your ear later than those on the direct path from source to ear, and position is relative to which ear faces the path of the incoming waves.

First you'd need to give in-game textures and surfaces reflective and dampening characteristics close to the measured characteristics of similar real-world materials. Then you'd capture a virtual impulse response in this virtual environment, feed that data into a realtime convolution reverb tool, and route all audio from in-game sound sources through it; the tool's L/R output would be your ears.
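The pipeline described above (material properties, captured impulse response, convolution) can be sketched in a few lines. This is a toy illustration in Python/numpy with a made-up two-reflection impulse response, not a measured one:

```python
import numpy as np

def convolution_reverb(dry, ir_left, ir_right):
    """Convolve a mono dry signal with a per-ear impulse response.

    The impulse response captures how the (virtual) room reflects and
    dampens sound on its way to each ear; convolving it with any dry
    source applies those reflections to that source.
    """
    left = np.convolve(dry, ir_left)
    right = np.convolve(dry, ir_right)
    # Pad both channels to a common length so they can be interleaved.
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right])

# Toy impulse response: the direct path, then one quieter reflection
# arriving 3 samples later at the left ear and 5 samples later at the
# right ear (the inter-ear delay is part of what conveys position).
ir_l = np.array([1.0, 0.0, 0.0, 0.4])
ir_r = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.4])
wet = convolution_reverb(np.array([1.0]), ir_l, ir_r)
```

A real implementation would use partitioned FFT convolution to keep latency low, but the principle is the same.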
virror
Sharp Eyed Eagle!
Posts: 427
Joined: Fri Jan 18, 2013 7:13 am
Location: Gothenburg, Sweden

Re: VR audio simulations and binaural sound (in HL2)

Post by virror »

I get upset every time I read about binaural sound, since it has been available since Unreal 1 and probably longer, but Creative ruined it totally : (
BillRoeske
Cross Eyed!
Posts: 102
Joined: Fri May 18, 2012 5:31 pm
Location: Houston, TX
Contact:

Re: VR audio simulations and binaural sound (in HL2)

Post by BillRoeske »

virror wrote:I get upset every time I read about binaural sound, since it has been available since Unreal 1 and probably longer, but Creative ruined it totally : (
Do they have a patent on real-time binaural calculations, or something? I know they were building a bunch of programmable effects into their chips for a while, but I have to admit that I never really looked into what they were doing.
Dantesinferno
Cross Eyed!
Posts: 115
Joined: Sun Feb 10, 2013 7:50 pm
Location: North Carolina

Re: VR audio simulations and binaural sound (in HL2)

Post by Dantesinferno »

I found this info on reddit ! http://www.reddit.com/r/truegaming/comm ... in_gaming/
Ok so I see there are a couple of misconceptions in this thread:
"You need headphones." Actually, there have been binaural demos configured for speakers for quite a while, and they happen to sound very good.
"It's too computationally intensive!" Not necessarily. There are plenty of audio-effect plugins available for audio post-production that let the user spatialize multiple tracks of audio into 3D audio live and without trouble. You can find them here, here, and in any copy of Apple's Logic Pro. It doesn't sound as good as naturally recorded binaural audio (a dummy head with a microphone in each ear), but it is still very convincing.
"You need to simulate the sound waves bouncing off everything." Not true at all; you only need to account for those bounces. There are algorithms and models that approximate, very efficiently, how sound waves are warped by the head, shoulders, and ears. There is also the convolution method (a devilishly clever one, too!) that produces a very nice result. On echo and reverberation: rather than getting reverb from simulating sound waves, it is better to just layer a reverberated track over a non-reverberated 3D sound. The result sounds very good.
"Creative Technologies has a monopoly on this tech." There are a couple of iOS games that spatialize sound on the fly, and some game engines as well. Actually, the Web Audio API does it too!
So why is no one doing it? I think it's a few things. One, people use their eyes way more than they use their ears. People want to see bloom, occlusion, and particle effects; they have no idea how powerful 3D sound is. Graphics and visuals have ALWAYS taken priority over sound, in games, movies, TV, and dare I say it: music videos. Second, the technology allowing binaural sound to be played through speakers took a very long time to perfect, and people aren't quite aware that it has improved drastically (thanks to Princeton University!).
Thirdly, and finally: usability and simplicity are very important. If someone wants to listen in 3D sound, they need to specify whether they're using headphones or not. It seems insignificant, but it is an issue that can make the experience rather jarring. Also, many TVs have speakers right at the center, and for binaural sound to work through speakers, they must be distinctly separated to the left and right.
Source: I've been writing a binaural digital signal processor, and have done more research than I'm comfortable with.
Edit: Fixed a link. Fixed WebGL statement.

and
THANK YOU!
Sometimes I think people are willing to believe anything. There are so many problems with some of the contents of this thread it is baffling. The "Creative monopoly" claim is especially BS: the OpenAL sound library does positional 3D audio (yes, I know 1.1 is proprietary and Creative-owned, but 1.0 does the same thing and is open source), and last time I checked, QSound Labs is still in business.
Even the terminology is wrong: binaural, in this context, refers to recordings made with a specially-made dummy head so that you can hear the position of the audio. Such recordings have to be pre-made and are thus useless for games. The word "binaural" just means "two ears"; you have binaural hearing, and stereo is intrinsically binaural. The correct term is 3D positional audio, or just 3D audio. 3D audio uses fancy algorithms called head-related transfer functions to simulate how the sound should reach your ears.
As for the computational complexity, that's pretty BS too: QSound Labs made a software 3D positional audio system for use in cell phones in 2003. Surely we can do better than that with nearly a decade of improvements? Oh, and by the way, Sensaura's algorithms were put to use in 1993, when the patent term was 17 years; in other words, those patents have expired.
The honest truth is that developers don't bother with 3D audio that much because it is useless. Nobody notices it. Worse, it may actually make some people think the audio is worse; SNK stopped using their "Sphero Symphony" 3D audio technology for the home releases because they realized most people had their systems hooked up monaural, which removes the effect. And that's the biggest reason why they don't use 3D audio today: the technology requires a perfect setup to work properly, and the average end user does not know how to achieve this. Heck, even I sometimes put my headphones on the wrong way.
BUT NOW we have the RIFT so 3D Audio is a MUST!
Dantesinferno
Cross Eyed!
Posts: 115
Joined: Sun Feb 10, 2013 7:50 pm
Location: North Carolina

Re: VR audio simulations and binaural sound (in HL2)

Post by Dantesinferno »

check this site out! http://toni.org/a3d/ GREAT 3d audio examples, go to the demos below
EdZ
Sharp Eyed Eagle!
Posts: 425
Joined: Sat Dec 22, 2007 3:38 am

Re: VR audio simulations and binaural sound (in HL2)

Post by EdZ »

There are actually two different things going on:

1) Simulation of how sound is affected by the environment. On a basic level, echoes from flat surfaces. You can add more work to calculate multiple echoes, attenuation (even frequency-dependent attenuation) due to material composition, etc.

2) The Head-Related Transfer Function, or HRTF. This takes the sound as it arrives just outside the avatar's ears, then passes it through a simulation of the user's outer ear (pinna/auricle). Because the outer ear does not move or deform relative to the inner ear, the reflections and attenuations need only be calculated once, and this transfer function (hence the name) can then be used as a computationally light effect.

A generalised HRTF is included with almost every sound card available, modern motherboards (i.e. Intel HD Audio), and even many older motherboards (AC'97), as well as numerous third-party audio programs, drivers, codecs etc. (for example, ffdshow's audio decoder). It'll usually be labelled as 'headphone mode' or similar.

Most games have at least rudimentary simulation of primary reflection (i.e. basic echoing). Combine this with surround (5.1) audio and the HRTF, and many games will already provide you with a reasonable facsimile of 3D positional audio.

One more thing to add: most HRTFs are set up assuming you will bypass your physical pinna entirely by using In-Ear Monitors (IEMs), sometimes known as 'canalphones'. Using can-style headphones will produce some effect, but it won't be optimal.
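Point 2 above in code form: once the HRIR pair (the time-domain HRTF) has been measured or simulated, rendering a source is just two convolutions. A minimal numpy sketch with invented toy responses rather than measured HRIRs:

```python
import numpy as np

def apply_hrir(mono, hrir_left, hrir_right):
    """Render a mono source binaurally through a precomputed HRIR pair.

    Because the outer ear's filtering is fixed, these two short
    convolutions are the entire per-source runtime cost.
    """
    return np.convolve(mono, hrir_left), np.convolve(mono, hrir_right)

# Toy HRIRs for a source off to the right: the right ear hears it
# earlier and louder; the left ear hears it delayed and head-shadowed.
hrir_r = np.array([0.9, 0.2])
hrir_l = np.array([0.0, 0.0, 0.0, 0.5, 0.1])
left, right = apply_hrir(np.array([1.0, 0.5]), hrir_l, hrir_r)
```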
xef6
One Eyed Hopeful
Posts: 41
Joined: Sat Nov 03, 2012 11:41 pm

Re: VR audio simulations and binaural sound (in HL2)

Post by xef6 »

EdZ wrote:There are actually two different things going on:

1) Simulation of how sound is affected by the environment. On a basic level, echoes from flat surfaces. You can add more work to calculate multiple echoes, attenuation (even frequency-dependent attenuation) due to material composition, etc.

2) The Head-Related Transfer Function, or HRTF. This takes the sound as it arrives just outside the avatar's ears, then passes it through a simulation of the user's outer ear (pinna/auricle). Because the outer ear does not move or deform relative to the inner ear, the reflections and attenuations need only be calculated once, and this transfer function (hence the name) can then be used as a computationally light effect.
This is really good to note. This is explained in the papers linked on the project pages; they are definitely worth skimming imo. Some nice bits:
Precomputed Wave Simulation for Real-Time Sound Propagation of Dynamic Sources in Complex Scenes wrote:The early reflections (ER) phase comprises sparse, high-energy wavefronts which are detected and processed individually in human perception, followed smoothly by the late reverberation (LR) phase, comprising dense arrival of many low-amplitude wavefronts which are fused perceptually to infer aggregate properties such as the decay envelope. Perceptually, the ER conveys a sense of location, while the LR gives a global sense of the scene – its size, level of furnishing and overall absorptivity.
Precomputed Wave Simulation for Real-Time Sound Propagation of Dynamic Sources in Complex Scenes wrote:Binaural perception is sensitive to the exact geometry of the individual listener’s ears, head and shoulders, which can be encapsulated as his head-related transfer function (HRTF). Non-individualized HRTFs can lead to large errors in localization [Hartmann and Wittenberg 1996]. Our system is easily extensible to customized HRTFs and supports them with little additional run-time cost. To avoid the complexity and present results to a general audience, we currently use a simple model [Hartmann and Wittenberg 1996], based on a spherical head and cardioid directivity function.
I'm just curious if there's ever going to be any progress on personalized HRTF models, as that's been a limiting factor for a long time. One thing that's for sure is that the jump from speakers to headphones is HUGE for positional audio (eliminates crosstalk and aberrant positional cues). I imagine that going from on/over/around-ear headphones to IEMs is a smaller jump.
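For what it's worth, the kind of simple spherical-head model the paper falls back on can be approximated with Woodworth's ITD formula plus a cardioid directivity gain. This is a rough sketch of that class of model under assumed average head dimensions, not the paper's actual implementation:

```python
import math

HEAD_RADIUS = 0.0875     # metres, a commonly assumed average
SPEED_OF_SOUND = 343.0   # m/s in air at room temperature

def interaural_time_difference(azimuth):
    """Woodworth's spherical-head ITD approximation, in seconds.

    azimuth is in radians: 0 is straight ahead, +pi/2 directly right.
    """
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth + math.sin(azimuth))

def cardioid_gain(azimuth):
    """Cardioid directivity: (left, right) ear gains for a direction."""
    right = 0.5 * (1.0 + math.sin(azimuth))
    left = 0.5 * (1.0 - math.sin(azimuth))
    return left, right
```

For a source directly to one side this gives an ITD of roughly 0.65 ms, which is in the right ballpark for human heads; personalization would replace these averages with the listener's own measurements.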
EdZ
Sharp Eyed Eagle!
Posts: 425
Joined: Sat Dec 22, 2007 3:38 am

Re: VR audio simulations and binaural sound (in HL2)

Post by EdZ »

I imagine that going from on/over/around-ear headphones to IEMs is a smaller jump.
On its own, it should be no jump at all (apart from the improved passive isolation of an IEM). However, you fundamentally should not be applying an HRTF if you intend to use over-the-ear headphones (cans).
I'm just curious if there's ever going to be any progress on personalized HRTF models, as that's been a limiting factor for a long time.
Some sort of small-volume high-resolution 3D scan would be perfect for this, combined with a high-volume low-resolution scan for localisation: head size, inter-ear distance, head-neck-shoulder distances, etc. The latter could fairly easily be done with an off-the-shelf Kinect or similar, but the former requires a more specialised setup. A structured light scan would work, as would a modified Kinect/Xtion/CIGC (for short distances), or a scanned laser and fixed camera setup.
The big barrier I can see is that you only ever need to do this once, so purchasing a digitise-it-yourself kit is unreasonable. And there aren't likely to be enough potential users to justify a service (due to the physical presence required, you're limited to local users).

It's a business problem rather than a technical one.
virror
Sharp Eyed Eagle!
Posts: 427
Joined: Fri Jan 18, 2013 7:13 am
Location: Gothenburg, Sweden

Re: VR audio simulations and binaural sound (in HL2)

Post by virror »

EdZ wrote: Most games have at least rudimentary simulation of primary reflection (i.e. basic echoing). Combine this with surround (5.1) audio and the HRTF, and many games will already provide you with a reasonable facsimile of 3D positional audio.
Still, there's a world of difference compared to "proper" binaural audio simulation using, for example, the OpenAL library, versus going through DirectSound, which does not care about the x, y, z coordinates of the sound source, and that of course makes proper simulation impossible. At least that's how I understand it. To this day I have never heard as good a soundscape in a game as the Unreal 1 demo that used "proper" binaural audio.
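For comparison, the core of what a positional API does with those x, y, z coordinates (distance rolloff plus direction) is small. This is a simplified stand-in, assuming a listener fixed at the origin facing -z and a constant-power pan where a real 3D path would apply an HRTF; in OpenAL itself you would instead call alSource3f(source, AL_POSITION, x, y, z) and let the library handle it:

```python
import math

def source_to_ears(src, listener, ref_dist=1.0):
    """Map a source's (x, y, z) position to per-ear gains.

    Inverse-distance rolloff (clamped at ref_dist) plus a constant-power
    pan derived from the azimuth. Assumes the listener faces -z.
    """
    dx = src[0] - listener[0]
    dz = src[2] - listener[2]
    dist = max(ref_dist, math.dist(src, listener))
    gain = ref_dist / dist                 # inverse-distance rolloff
    azimuth = math.atan2(dx, -dz)          # 0 = straight ahead
    pan = 0.5 * (1.0 + math.sin(azimuth))  # 0 = hard left, 1 = hard right
    return (gain * math.cos(pan * math.pi / 2),
            gain * math.sin(pan * math.pi / 2))

# A source ahead and to the right: louder in the right ear.
left_gain, right_gain = source_to_ears((1.4, 0.0, -1.4), (0.0, 0.0, 0.0))
```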