After laying the foundations for the MP3 standard, Professor Karlheinz Brandenburg continued to revolutionise the field of psychoacoustics and immersive sound. Now he wants to organise your music collection. Anna Mitchell catches up with a true audio industry pioneer.
Dieter Seitzer, a professor at the University of Erlangen, began researching methods for compressing music in 1977. Over time Seitzer came up with the idea that it might be possible to send music over ISDN lines. The concept was largely dismissed and Seitzer was repeatedly told it would not be possible. Undeterred, he pushed on with his research and selected a PhD student to assist him. That student was Karlheinz Brandenburg.
“That was the starting point for my PhD work and for what later became the MP3,” interjects Professor Brandenburg. “Over time some 30 or 40 people had a part in the MP3 story but there was a core group of about five to ten that really drove the development.”
For his postdoctoral studies Professor Brandenburg headed to the USA to further his work at AT&T Bell Laboratories. “This was an important time in the standards group,” notes the professor. “We did our first proposals for the Moving Picture Experts Group.” The Moving Picture Experts Group (MPEG) is a working group of experts that was formed by ISO and IEC to set standards for audio and video compression and transmission.
As development continued Professor Brandenburg returned to Germany, driving the standard forward and lecturing at his old university, Erlangen. Finally, in 1993, with the standard in place, the professor moved over to join his former mentor, Professor Seitzer, at the Fraunhofer Institute for Integrated Circuits. Here, he took up the role of head of audio and multimedia technologies.
“From that time on I was responsible for further developments and bringing all these technologies to market. Over time I was digging deep, doing a lot of simulation code and working with others on the standards, trying to push ideas out to the standards group. Later on I administered the group which worked on developments like AAC (Advanced Audio Coding) and at the same time was trying to push this MPEG Audio Layer 3 into the market.”
The rest, as they say, is history. MP3 became the most common standard for digital audio compression, enabling the transfer and playback of music on digital devices. The standard has pervaded modern life. Many people will happily refer to “their MP3 player” or “what’s in the MP3 charts” without ever thinking of the standard’s development and its beginnings with a professor and a PhD student at a university in Germany.
Now, Professor Brandenburg heads up the Ilmenau-based Fraunhofer IDMT (Institute for Digital Media Technology), after helping to found the organisation in 2000. He describes his work as “building the cornerstones for future digital media” and is particularly passionate about the institute’s developments in Wave Field Synthesis.
“[Wave Field Synthesis] is an idea that originates from Delft University of Technology in the Netherlands,” explains the professor. “They started work on this more than 20 years ago and ten years later we picked it up.
“The idea is to enable better sound reproduction than just two-channel or five-channel allows. With this technology you can use a ring of loudspeakers around the room to really give the feeling of being acoustically somewhere else. We call it immersive sound and have founded a spin-off company called Iosono to develop and market the technology.
“We are continually researching to expand the technology and at the same time there is already commercialisation going on in movie theatres and theme parks. We’ve also developed a similar technology, building on the same mathematical principles, which can be used for concerts and large outdoor events. What I love is demonstrating the system and watching people’s reactions. A lot of them say they didn’t think what we’ve achieved was possible.”
Development is now also focussed on creating flat panel loudspeakers. Professor Brandenburg points out that whilst large numbers of visible loudspeakers are acceptable in cinemas and theme parks, there are many applications where they ruin the aesthetics. “In the end we’ll have wallpaper that doubles as a loudspeaker,” the professor predicts.
The work with Iosono and the development of the MP3 standard both tie into the field of psychoacoustics. “This is the science of trying to understand how our ears and brain work. The big question is what can we hear and what can’t we hear. And it’s been very important for audio coding and for MP3 and AAC because it’s the main effect that makes these coders work well.
“One of the main effects of psychoacoustics is the so-called masking effect. You see this, for example, on a train platform. The train comes in and you’ve been talking to someone. You will find, in many cases, the train is so loud you can’t hear the other person speaking. The louder sound masks a fainter sound. Research has been carried out on this topic since the 1940s and it’s still not a solved problem. Simple ideas of how this masking effect works are a basic feature of every audio encoding system.
“When we talk about sound and sound reproduction these rules from psychoacoustics, as we know them, help us to do things better. In fact for the questions of how we listen to our environment and how we get the illusion of being somewhere else acoustically there is still quite a lot of research to be done. What we do know, for example, is that blind people can walk towards a wall and stop just 30 or 50 cm in front of it because they hear it. Obviously humans are capable of listening to these reflections without even knowing it.”
Professor Brandenburg says this is hugely important for understanding sound reproduction in professional environments. By using simulation techniques, the professor believes, the clarity of announcements in places such as train stations and airports can be greatly improved.
And, if that wasn’t enough, Professor Brandenburg is currently, as part of his work with the Audio Engineering Society (AES), organising a conference on semantic audio. “That’s another large department here in Ilmenau [location of Fraunhofer IDMT],” the professor says. “This is largely based on music analysis, music information retrieval and search and recommendation technologies.
“Fifteen years ago the question was about access to music so we needed compression and audio coding. Nowadays you have an abundance of possibilities: internet radio, large local storage etc. From time to time I ask people how much music they have stored at home and I know a couple of people who have more than a terabyte of music. If you translate that it could very well mean years of uninterrupted listening. These people have no idea what they have and they need help to find things.
“There are a number of technologies out there to provide these services. There are web-based technologies where you have lots and lots of people annotating the music with metadata, or so-called collaborative filtering where you look at what others are listening to and you just get similarity measures from that. We are working on the signal processing part of that, so trying to get as much data from the music itself and then process similarities.”
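As a rough check on the terabyte figure the professor mentions, the short sketch below converts a storage size into continuous playing time. The bitrates are illustrative assumptions, not figures from the interview, and a terabyte is taken as 10^12 bytes; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def listening_years(storage_bytes: float, bitrate_kbps: float) -> float:
    """Years of non-stop playback for a collection of the given size.

    Assumes every file is encoded at one constant bitrate (a simplification).
    """
    seconds = storage_bytes * 8 / (bitrate_kbps * 1000)
    return seconds / SECONDS_PER_YEAR


ONE_TERABYTE = 1e12  # decimal terabyte, an assumption

# Assumed example bitrates in kbit/s; real collections will be mixed.
for bitrate in (128, 192, 320):
    years = listening_years(ONE_TERABYTE, bitrate)
    print(f"{bitrate} kbit/s: about {years:.1f} years")
```

Under these assumptions a terabyte encoded at 128 kbit/s plays for roughly two years without a break, while at 320 kbit/s it is closer to nine or ten months, so "years of uninterrupted listening" is plausible for collections above a terabyte.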
There’s no doubt Professor Brandenburg’s career so far has been packed full of considerable achievements. What I find interesting is that a seemingly disparate string of developments has been largely born from, and is heavily linked to, the professor’s pioneering work on developing the MP3 standard. It all harks back to the fundamental principles of psychoacoustics and the professor’s early research at the University of Erlangen.
Professor Brandenburg says being involved in the MP3 standard development is his proudest achievement and you can see why. Not only has it been a massively influential breakthrough; in a way, it has also given birth to much of his later work and accomplishments.