Listen and learn

AUTHOR: Inavate

Ville Pulkki is a vice president of the AES and has conducted pioneering academic research into spatial audio and perception of sound. But Anna Mitchell finds that he wasn’t always on course to have such an impact on the audio world.

Ville Pulkki is the current vice president of the Audio Engineering Society’s Northern European region and a senior researcher at Aalto University in Helsinki, Finland. But he wasn’t always on course to be a leading academic in the audio field. Twenty-one years ago he made a decision that has had a lasting impact on his career, Aalto University and the entire audio industry.

“I completed a master’s in technical physics and information sciences but as so many people in the audio field, I was also interested in music, largely playing the piano and singing. I applied to the Sibelius Academy [a prestigious music university in Finland] and spent three years there as a full-time musical student.”

But as a student with little money coming in, Pulkki was forced to look into alternative options to fund his studies. “I was not taken into the Finnish Radio Chamber Choir,” he says. “I had to find something else and discovered a one-month project between Helsinki University of Technology (Aalto) and the Sibelius Academy.”

The Sibelius Academy had a room with a 3D speaker system that comprised 32 loudspeakers and Pulkki’s objective was to investigate how to position virtual sources in this room. “I thought, I can do that,” said Pulkki.

Pulkki approached the project with little knowledge of the audio industry but started to think about how to add spatial attributes to the sound in the room. “I created this vector base amplitude panning (VBAP) theory without reading any audio papers and to my surprise it turned out it was a new thing I did!”

The VBAP method positions virtual sources in arbitrary 2-D or 3-D loudspeaker setups. In amplitude panning the same sound signal is applied to a number of loudspeakers with appropriate non-negative amplitudes.
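To make the idea concrete, here is a minimal sketch of amplitude panning between one pair of loudspeakers in 2-D. It follows the commonly published pairwise formulation, in which the source direction is written as a weighted sum of the two loudspeaker directions and the weights become the gains. It is an illustration only, not Pulkki’s own software: the function names, the angle convention (degrees, counter-clockwise from straight ahead) and the unit-power normalisation are assumptions made for this example.

```python
import math


def unit_vector(azimuth_deg):
    """Unit vector for an azimuth in degrees (assumed: 0 = front, counter-clockwise)."""
    a = math.radians(azimuth_deg)
    return (math.cos(a), math.sin(a))


def vbap_pair_gains(source_deg, spk1_deg, spk2_deg):
    """Return (g1, g2) for one loudspeaker pair, or None if the source
    direction lies outside the arc spanned by the pair.

    Hypothetical helper: solves p = g1*l1 + g2*l2 for the gains, then
    scales them so that g1**2 + g2**2 == 1.
    """
    px, py = unit_vector(source_deg)
    x1, y1 = unit_vector(spk1_deg)
    x2, y2 = unit_vector(spk2_deg)

    det = x1 * y2 - x2 * y1
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")

    # Cramer's rule for the 2x2 system p = g1*l1 + g2*l2.
    g1 = (px * y2 - x2 * py) / det
    g2 = (x1 * py - px * y1) / det

    # Amplitude panning uses non-negative gains only.
    if g1 < -1e-9 or g2 < -1e-9:
        return None
    g1, g2 = max(g1, 0.0), max(g2, 0.0)

    norm = math.hypot(g1, g2)
    return (g1 / norm, g2 / norm)


if __name__ == "__main__":
    # Loudspeakers at +30 and -30 degrees, virtual source 10 degrees to the left.
    print(vbap_pair_gains(10.0, 30.0, -30.0))
```

In a full set-up the same calculation would be repeated for each adjacent pair (or, in 3-D, each triplet) of loudspeakers, and the pair that yields non-negative gains would be used for that source direction.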

“Only after it worked did I explore the possibility to publish it and discovered that scientifically it was a new thing and quite an important discovery. I talked with my professor in acoustics and he gave me three months more time to write the paper. I got the journal paper and I got some funding as a four-year scholarship to complete a PhD.”

It was 1995 and Pulkki had reached a crossroads in his career. “I had to choose if I wanted to be a professional musician or work professionally in the audio field. I was so excited by the project I thought I will stay at Aalto University and I’m still here! Not anymore in the same room but in the same laboratory.” The music world’s loss was the audio industry’s gain.

Pulkki continued his work on vector base amplitude panning and is now an expert in the field. “I’ve developed a free software download and it’s now a de facto standard in multi-loudspeaker systems.”

Pulkki’s academic work has translated successfully into real-world applications and the software is used in multimedia theatres and art installations as well as training and simulation environments. “I believe the US Navy uses it in simulation rooms to create the perception that noises are coming from different locations,” explains Pulkki.

“My professor during my PhD was Matti Karjalainen,” he continues. “He was an inspiring man. After I had completed my VBAP article I started my PhD and that was looking into how well does this feedback work and why it works. Matti was an expert in psychoacoustics and human hearing and his input was invaluable on that side of the research.”

Professor Matti Karjalainen passed away in 2010 and Pulkki has fond memories of him. “He really was a great leader,” he says. “As a research leader he was so kind and helpful to the people he worked with. He just accepted everybody as they were. He was a very good spirit in this laboratory.”

By 2002-3 Pulkki had largely wrapped up his work in VBAP and pushed into the field of directional audio coding. “There was an old British innovation that used a soundfield microphone to output to lots of loudspeakers. It’s a neat technique that takes the soundfield information from the source but it had shortcomings.

“I thought that these shortcomings could be overcome somehow. With a colleague of mine we asked what is the resolution of human spatial hearing and can we use that knowledge in the reproduction of sound from microphone to loudspeakers. Then we figured out a way to use this resolution to overcome many flaws in the traditional technique.

“We created this audio format that allows transmission of about two or three channels of audio plus some meta data, which [in simple terms] tells you where the sound is, what direction it’s coming from and if the sound is diffused or not. This research led to quite a big breakthrough and I think we will see it applied in the next ten years.

“The resulting solution is generic. You can use it with any kind of loudspeaker set-up or headphones – even head-tracked headphone systems. We saw it as a new kind of loudspeaker agnostic sound format set-up and we had some IDR patents on it but have now transferred everything to the Fraunhofer Institute for Integrated Circuits.”
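The format Pulkki describes, a few audio channels plus meta data on direction and diffuseness, can be pictured with a small sketch. Everything named here is hypothetical and chosen for illustration: the `SpatialMeta` record, its fields and the `split_gains` helper are not taken from the actual format. The square-root weighting is one common energy-preserving way to divide a signal into a directional part and a diffuse part, and is assumed rather than confirmed by the article.

```python
import math
from dataclasses import dataclass


@dataclass
class SpatialMeta:
    """Hypothetical per-frame meta data accompanying the audio channels."""
    azimuth_deg: float    # direction the sound arrives from
    elevation_deg: float
    diffuseness: float    # 0.0 = fully directional, 1.0 = fully diffuse


def split_gains(meta):
    """Return (direct_gain, diffuse_gain) with direct**2 + diffuse**2 == 1.

    Assumed weighting: the directional part would be panned towards
    (azimuth, elevation) on whatever loudspeaker layout is available,
    while the diffuse part would be spread over all loudspeakers.
    """
    if not 0.0 <= meta.diffuseness <= 1.0:
        raise ValueError("diffuseness must lie between 0 and 1")
    direct = math.sqrt(1.0 - meta.diffuseness)
    diffuse = math.sqrt(meta.diffuseness)
    return direct, diffuse


if __name__ == "__main__":
    frame = SpatialMeta(azimuth_deg=30.0, elevation_deg=0.0, diffuseness=0.25)
    print(split_gains(frame))
```

Because the meta data describes the sound scene rather than a particular loudspeaker layout, the same record could in principle drive a stereo pair, a large loudspeaker array or headphones, which is the loudspeaker-agnostic property described in the quote.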

Although Fraunhofer had taken receipt of the patents, the work continued as a joint project with two members of Fraunhofer working in Helsinki alongside Pulkki’s team. “We work in science; we don’t make products,” explains Pulkki.

“We try to understand the science behind phenomena: what is going on, why something works and, when it doesn’t work, what is happening.”

The research has already spawned one product that has been implemented by some communications companies. “It’s not a consumer product yet,” says Pulkki. “But it has important applications in teleconferencing. If you are using a cheap microphone, for example a single unit with four omni mics in one square, and you are connecting groups of people [over a videoconferencing set-up] the audio connection will consist of only one channel.

“When people start to talk it can be quite irritating because the sound only comes from a single direction. You see lots of people on the screen and all the sound comes from one place. You have to look at the people’s lips to see who is talking. Adding this meta data to the single channel audio allows us to create the perception that the sound is coming from the speaker. It’s a more natural way of conducting a conference.”

Alongside his academic research Pulkki has also been a rising star in the AES. “I submitted my first journal paper to the AES,” he explains. “It was clearly audio I was working with and it was the only journal in audio. I have published a lot, and indeed still continue to publish in the AES journal and at its conventions and conferences.”

Pulkki started in the Finnish section of the AES and over the years became its chair and vice-chair. Now he is the vice president of the Northern European region, which incorporates the Belgian, British, Danish, Finnish, Moscow, Netherlands, Norwegian, St Petersburg and Swedish sections.

A few months into the job and Pulkki has his eye on establishing working committees in countries that currently don’t have one but have a lot of members. Pulkki believes the AES plays an important role. “It’s multidisciplinary which is important in an industry where there are so many fields involved. Audio covers acoustics, electronics, signal processing, psychoacoustics, perception, material sciences and more. The AES is a place for academics who believe their research results could be useful in audio. But it’s also a place for practitioners and audio engineers. People can come together and talk about mixing and mastering. It encompasses industry people; vendors and resellers. And also there’s a lot of students about. AES events are places where the whole audio chain meets.”

Through his research Pulkki has established himself as an important academic researcher in the audio field and through his work at the AES he has made sure he is a strong link in the complete audio chain.