Everyone knows a video conference without sound is useless, so why isn’t more care taken with the audio? Kevin Hilton looks at the issues.
Audio is a crucial part of any video conference but, although the pictures would mean nothing without sound, it is often taken for granted or almost completely ignored.
“The most common problem is the pick-up in the room,” comments Tim Root, chief technology officer at Revolabs. “It’s a fundamental thing, not being able to hear a person because they’re at the whiteboard a distance from the mic or they’re a quiet talker. There can also be other problems, such as having air conditioning blasting out in the room. At the end of the day having mics as close to humans as possible is going to always improve the experience.”
This option typically involves using what are known as wearable or personal microphones, with a small capsule clipped to a person’s clothing or on a light headset connected to a wireless transmitter hidden out of sight. While these are discreet and produce good quality signals, the speaker does have to feel comfortable wearing and using them. Getting the most out of a small wireless mic also depends on where it is placed, which is usually best left to an experienced sound engineer or AV technician.
Mounting microphones in the ceiling of the meeting room is a common practice but, as Root points out, this can create further problems: “High decibel noise sources live in the ceiling. When you have mics hanging from the ceiling they’re close to the air conditioning. People don’t project their voices looking up, so you’re relying on reflective pick-up.”
Table microphones are closely associated with conferences. While these are still used widely, Root observes that other technology is literally proving to be a barrier. “These mics help a lot when people sit at the table,” he says, “but when they’ve got laptops and other devices on there as well, that breaks up the path of the pick-up.”
Tim Hill, director of engineering at systems integrator AVI-SPL, says: “Proper microphone placement is essential in a video call. One issue is that the microphones need to be accurately spaced so that a speaker has a microphone in front of them regardless which direction they are looking.” Hill also identifies another major technical point when putting sound and pictures together but says it is not as problematic as it once was: “The days of huge delays in the audio that caused people to talk over each other and use video cues to let someone know when they can talk are all but gone.
Both hardware and network advancements have greatly improved the audio quality of video calls. When a video conferencing room is set up properly with the correct physical attributes, microphone selection and placement, audio programming and a robust network in place there is no reason why you can’t achieve near perfect audio and video. One of the biggest mistakes I see people make these days is with the notion that a software codec and web camera can be used for business class video in a conference room without the same room set up requirements.”
Because many video conferencing systems are set up by a technician or AV manager and then used by people with little or no technical knowledge, modern digital signal processing (DSP) technology often comes into play to make the sound as good as possible with no human intervention. Christophe Palluat de Besset, chief executive of French installation and distribution company Sound Directions, emphasises the need for microphones to be as close to the noise source as possible but adds that “a mixer with good DSP to handle problems, as well as phone calls and auto-mixing” is necessary.
An essential part of any conference sound system is acoustic echo cancellation (AEC), which uses algorithms to work out what is wanted in a signal - usually the human voice - and what is not, typically reverberations, reflections and other extraneous noises. Andrew Hug, vice president of systems engineers in EMEA for Polycom, describes AEC as a “fundamental building block” in conference systems, particularly for those installed in “hostile” environments.
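As a rough illustration of how an echo canceller can learn what to remove, here is a minimal sketch of a normalised least mean squares (NLMS) filter, a textbook approach to cancelling loudspeaker echo. It is not the algorithm used by any vendor named in this article, whose methods are proprietary. The function name, the four-tap `echo_path` standing in for a room and the step size `mu` are all assumptions chosen for illustration.

```python
import random

def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return the mic signal with the estimated loudspeaker echo removed."""
    weights = [0.0] * taps   # running estimate of the room's echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    cleaned = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate            # mic signal minus predicted echo
        power = sum(h * h for h in history) + eps
        step = mu * error / power            # normalise by far-end signal power
        weights = [w + step * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned

def energy(signal):
    return sum(v * v for v in signal)

# Toy demonstration: the "room" is a hypothetical four-tap echo path and
# nobody is talking at the near end, so the mic hears only loudspeaker echo.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.0, 0.6, 0.0, -0.3]
mic = []
for n in range(len(far_end)):
    echo = sum(g * far_end[n - k] for k, g in enumerate(echo_path) if n - k >= 0)
    mic.append(echo)

cleaned = nlms_echo_canceller(far_end, mic)
residual_ratio = energy(cleaned[-500:]) / energy(mic[-500:])
print(f"Echo energy remaining after adaptation: {residual_ratio:.2e}")
```

In a real room the echo path is far longer and changes whenever people or furniture move, and near-end speech has to be protected while the filter adapts, which is where commercial systems earn their keep.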
Like other AECs, Polycom’s technology is adaptive, allowing it to react to different circumstances and work out the length of echo. Hug says that the company also uses a method of zoning in its HDX and Group series desk mics to keep out unwanted sounds. The mic looks like a three-pronged star, with each ‘leg’ containing a microphone. These can be selected to create a ‘bubble’ with a very specific field of capture, so that one mic alone will pick up the desired source. Hug adds that the same premise can be used with ceiling-mounted units.
Another company that developed its own AEC is Biamp Systems. The Sona algorithm is built into the company’s Tesira system, which is used in conjunction with a variety of video conferencing packages, from Cisco to Zoom and Skype for Business. Rob Houston, product manager for UC products, describes Sona as “the secret sauce” that differentiates Biamp from its competition. He adds that the system is also able to cope with low levels and exclude background noise.
Hill comments that having a properly set up AEC is crucial to the sound quality of a video call: “You need to first make sure that the AEC of your DSP doesn’t fight with the AEC built into the video codec. The next step is to make sure that the person you are calling has it set up properly. One of the hardest things for users to understand is that if you are hearing echo in your room it is actually the far end that is set up improperly even though they are hearing perfect audio. With all the talk of hardware, software, and network advancements, you can’t forget that a room needs to be physically built correctly. The ambient noise level in a room needs to be less than NC30 [the noise criteria set for AV facilities, teleconferencing rooms, large meeting areas and theatres]. The reverberation time of a room should be RT60 [the acoustical measurement used for calculating reverb time decay]. Meeting these physical conditions reduces the work of the DSP and allows for much higher sound fidelity.”
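For readers unfamiliar with RT60, it is the time taken for sound in a room to decay by 60dB. A common first estimate is Sabine's formula, which divides 0.161 times the room volume (in cubic metres) by the total absorption of its surfaces. The sketch below applies it to a made-up room; the dimensions and absorption coefficients are illustrative assumptions, not figures from Hill or any standard, and the function name is hypothetical.

```python
def sabine_rt60(volume_m3, surfaces):
    """Estimate RT60 in seconds using Sabine's formula.

    surfaces is a list of (area_m2, absorption_coefficient) pairs.
    """
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    if total_absorption <= 0:
        raise ValueError("the room needs some absorption")
    return 0.161 * volume_m3 / total_absorption

# Hypothetical 6m x 4m x 3m meeting room with assumed surface treatments.
length, width, height = 6.0, 4.0, 3.0
surfaces = [
    (length * width, 0.30),                  # carpeted floor (assumed value)
    (length * width, 0.70),                  # acoustic ceiling tiles (assumed value)
    (2 * (length + width) * height, 0.10),   # painted walls (assumed value)
]
rt60 = sabine_rt60(length * width * height, surfaces)
print(f"Estimated RT60: {rt60:.2f} s")
```

The formula makes Hill's point in numbers: adding absorbent surfaces shortens the decay time before any DSP is involved.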
Root says another common technique used to solve acoustic and sound problems is automatic gain control (AGC), which is designed to boost the voice signal. “That comes in our top-end system FLX 2 voice over IP/Bluetooth conference phone system,” he explains. “It does all those tricks and helps but there can be artefacts from those algorithms, which can affect the audio.”
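The principle behind AGC can be shown in a few lines. This is a minimal sketch of a generic envelope-following gain control, not the processing in Revolabs' or anyone else's product; the function name and every parameter value (`target`, `attack`, `release`, `max_gain`) are assumptions picked for the example.

```python
import math

def automatic_gain_control(samples, target=0.3, attack=0.01, release=0.001,
                           max_gain=10.0, floor=1e-4):
    """Scale samples so their tracked level moves towards target."""
    envelope = floor   # smoothed estimate of how loud the signal currently is
    out = []
    for s in samples:
        level = abs(s)
        # Follow rising levels quickly and falling levels slowly.
        coeff = attack if level > envelope else release
        envelope += coeff * (level - envelope)
        # Boost quiet signals and trim loud ones, but cap the boost.
        gain = min(target / max(envelope, floor), max_gain)
        out.append(s * gain)
    return out

def tone(amplitude, n=5000, period=100):
    return [amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]

def settled_peak(signal):
    # Ignore the start, where the gain is still settling.
    return max(abs(v) for v in signal[-1000:])

quiet_in, loud_in = tone(0.05), tone(0.9)
quiet_out = automatic_gain_control(quiet_in)
loud_out = automatic_gain_control(loud_in)
print(settled_peak(quiet_in), settled_peak(quiet_out))
print(settled_peak(loud_in), settled_peak(loud_out))
```

The quiet and loud test tones come out at much the same level once the gain has settled. The cap on `max_gain` is there because the same boost that lifts a quiet talker also lifts the air conditioning behind them, one source of the artefacts Root mentions.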
Which brings the focus back to getting the best possible signal to begin with, starting with the choice of microphone. Root says: “Omnis pick up everything but directional mics give a narrow beam and better response. They’re also not as reverberant, so if you’re in an acoustically challenging space then you need cardioid mics.”
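The difference Root describes can be put in numbers using the standard model of an ideal first-order microphone, whose sensitivity at a given angle is a mix of an omnidirectional part and a cosine-shaped part. The sketch below is an idealisation: real capsules vary with frequency, and the function names are hypothetical.

```python
import math

def first_order_pattern(angle_deg, omni_share):
    """Relative sensitivity of an ideal first-order mic at angle_deg off axis."""
    theta = math.radians(angle_deg)
    return abs(omni_share + (1.0 - omni_share) * math.cos(theta))

def omni(angle_deg):
    # An omni is all omnidirectional component: equal pick-up everywhere.
    return first_order_pattern(angle_deg, 1.0)

def cardioid(angle_deg):
    # A cardioid is an equal mix of the two components.
    return first_order_pattern(angle_deg, 0.5)

for angle in (0, 90, 180):
    print(angle, round(omni(angle), 3), round(cardioid(angle), 3))
```

The ideal cardioid hears a talker in front at full level, sound from the side at half level and nothing from directly behind, which is why it takes in less of the room's reverberation than an omni.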
Rob Houston of Biamp makes the point that no matter how good the microphone, it will not perform at its optimum level if it is poorly positioned. “Ideally everyone should have his or her own mic,” he says, “but that is not always possible.”
The other end of the audio chain is the loudspeaker and this is as important as the microphone because if the output quality is not good then what went before can be wasted. Root says the key point is not to have loudspeakers pointing directly at the microphones, which at worst can cause howl-round and at best make voices sound hollow and unpleasant. “Another problem is if you have a loudspeaker that introduces a lot of artefacts into the system, it can cause problems for the echo cancellation system, producing a lot of distortion because it doesn’t know what it is dealing with,” he adds. Palluat de Besset suggests a “good system of dissemination full of good quality components”, while Houston says a distributed pattern of loudspeakers round the room, typically in a quadrant pattern, delivers a more controlled sound than having everything blasting out from a sound bar or two cabinets at the front.
Hill takes a different view, saying he does not see loudspeaker selection as the top priority. “The room needs the correct number of loudspeakers to evenly fill it or some louder units by the far end display to give some directional audio but the actual audio that is sent through a videoconferencing call for the most part is mono with medium fidelity speech. As long as you select a quality speaker, the number and placement are more important than the actual speaker you select.”
All of which shows that sound is a very subjective area, although most people can tell the difference between good and bad quality when they hear it.