Sound above vision

AUTHOR: Kevin Hilton

Everyone knows a video conference without sound is useless, so why isn’t more care taken with the audio? Kevin Hilton looks at the issues.

Audio is a crucial part of any video conference but, despite the reality that the pictures would mean nothing without sound, it is often taken for granted or almost ignored completely.

“The most common problem is the pick-up in the room,” comments Tim Root, chief technology officer at Revolabs. “It’s a fundamental thing, not being able to hear a person because they’re at the whiteboard a distance from the mic or they’re a quiet talker. There can also be other problems, such as having air conditioning blasting out in the room. At the end of the day having mics as close to humans as possible is going to always improve the experience.”
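Root’s point about distance can be put in rough numbers. The Python sketch below assumes an idealised free field, in which the direct sound from a talker falls by about 6 dB each time the distance to the microphone doubles. The function name and the example distances are illustrative and do not come from Revolabs; real rooms add reflections, so the figures describe the direct sound only.

```python
import math


def level_change_db(near_m: float, far_m: float) -> float:
    """Change in direct-sound level (dB) when a talker moves from
    near_m to far_m away from the microphone.

    Assumption: free-field inverse-square law, no room reflections.
    """
    if near_m <= 0 or far_m <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(far_m / near_m)


if __name__ == "__main__":
    # Hypothetical case: a talker leaves the table (0.3 m from the mic)
    # and walks towards the whiteboard.
    for far in (0.6, 1.2, 2.4):
        print(f"0.3 m -> {far} m: {level_change_db(0.3, far):+.1f} dB")
```

The air conditioning does not get any quieter as the talker walks away, so every decibel lost here comes straight off the ratio of voice to background noise.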

This option typically involves using what are known as wearable or personal microphones, with a small capsule clipped to a person’s clothing or on a light headset connected to a wireless transmitter hidden out of sight. While these are discreet and produce good quality signals, the speaker does have to feel comfortable wearing and using them. Getting the most out of a small wireless mic also depends on where it is placed, which is usually best left to an experienced sound engineer or AV technician.

Mounting microphones in the ceiling of the meeting room is a common practice but, as Root points out, this can create further problems: “High decibel noise sources live in the ceiling. When you have mics hanging from the ceiling they’re close to the air conditioning. People don’t project their voices looking up, so you’re relying on reflective pick-up.”

Table microphones are closely associated with conferences. While these are still used widely, Root observes that other technology is literally proving to be a barrier. “These mics help a lot when people sit at the table,” he says, “but when they’ve got laptops and other devices on there as well, that breaks up the path of the pick-up.”

Tim Hill, director of engineering at systems integrator AVI-SPL, says: “Proper microphone placement is essential in a video call. One issue is that the microphones need to be accurately spaced so that a speaker has a microphone in front of them regardless which direction they are looking.”

Hill also identifies another major technical point when putting sound and pictures together but says it is not as problematic as it once was: “The days of huge delays in the audio that caused people to talk over each other and use video cues to let someone know when they can talk are all but gone. Both hardware and network advancements have greatly improved the audio quality of video calls. When a video conferencing room is set up properly with the correct physical attributes, microphone selection and placement, audio programming and a robust network in place there is no reason why you can’t achieve near perfect audio and video. One of the biggest mistakes I see people make these days is with the notion that a software codec and web camera can be used for business class video in a conference room without the same room set up requirements.”

Because many video conferencing systems are set up by a technician or AV manager and then used by people with little or no technical knowledge, modern digital signal processing (DSP) technology often comes into play to make the sound as good as possible with no human intervention. Christophe Palluat de Besset, chief executive of French installation and distribution company Sound Directions, emphasises the need for microphones to be as close to the noise source as possible but adds that “a mixer with good DSP to handle problems, as well as phone calls and auto-mixing” is necessary.

An essential part of any conference sound system is acoustic echo cancellation (AEC), which uses algorithms to work out what is wanted in a signal - usually the human voice - and what is not, typically reverberations, reflections and other extraneous noises. Andrew Hug, vice president of systems engineers in EMEA for Polycom, describes AEC as a “fundamental building block” in conference systems, particularly for those installed in “hostile” environments.
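For readers who want to see the principle, echo cancellers are commonly built around an adaptive filter that learns how the loudspeaker signal arrives back at the microphone and then subtracts its own estimate of that echo. The Python sketch below uses a textbook normalised least-mean-squares (NLMS) filter as a stand-in. It is not the algorithm of any manufacturer named in this article, and the echo path, filter length and step size are invented for the demonstration.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic.

    far_end: samples sent to the loudspeaker (the reference signal).
    mic:     samples picked up by the microphone.
    Returns the residual that would be sent back to the far end.
    """
    weights = [0.0] * taps   # running estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        # Normalise the update by the reference power (the "N" in NLMS).
        power = sum(h * h for h in history) + eps
        step = mu * e / power
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(e)
    return residual


def simulate(n=4000, seed=1):
    """Build a far-end signal and the echo it leaves at the microphone."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    room = [0.0, 0.6, 0.0, -0.3, 0.1]  # hypothetical echo path, not measured
    mic = []
    for i in range(n):
        echo = sum(room[k] * far_end[i - k]
                   for k in range(len(room)) if i - k >= 0)
        mic.append(echo)
    return far_end, mic


def mean_power(samples):
    return sum(s * s for s in samples) / len(samples)


if __name__ == "__main__":
    far_end, mic = simulate()
    out = nlms_echo_cancel(far_end, mic)
    print("echo power at mic:      ", mean_power(mic[-500:]))
    print("echo power after filter:", mean_power(out[-500:]))
```

The simulation contains only echo, with nobody talking in the room and no background noise, which is why the filter can remove almost all of it. Commercial systems also have to cope with both ends talking at once, much longer echo tails and loudspeaker distortion.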

Like other AECs, Polycom’s technology is adaptive, allowing it to react to different circumstances and work out the length of echo. Hug says that the company also uses a method of zoning in its HDX and Group series desk mics to keep out unwanted sounds. The mic looks like a three-pronged star, with each ‘leg’ containing a microphone. These can be selected to create a ‘bubble’ with a very specific field of capture, so that one mic alone will pick up the desired source. Hug adds that the same premise can be used with ceiling-mounted units.

Another company that developed its own AEC is Biamp Systems. The Sona algorithm is built into the company’s Tesira system, which is used in conjunction with a variety of video conferencing packages, from Cisco to Zoom and Skype for Business. Rob Houston, product manager for UC products, describes Sona as “the secret sauce” that differentiates Biamp from its competition. He adds that the system is also able to cope with low levels and exclude background noise.

Hill comments that having a properly set up AEC is crucial to the sound quality of a video call: “You need to first make sure that the AEC of your DSP doesn’t fight with the AEC built into the video codec. The next step is to make sure that the person you are calling has it set up properly. One of the hardest things for users to understand is that if you are hearing echo in your room it is actually the far end that is set up improperly even though they are hearing perfect audio. With all the talk of hardware, software, and network advancements, you can’t forget that a room needs to be physically built correctly. The ambient noise level in a room needs to be less than NC30 [the noise criteria set for AV facilities, teleconferencing rooms, large meeting areas and theatres]. The reverberation time of a room should be RT60 [the acoustical measurement used for calculating reverb time decay]. Meeting these physical conditions reduces the work of the DSP and allows for much higher sound fidelity.”
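RT60 is the time a sound takes to decay by 60 dB once its source stops. A common first estimate, not mentioned by Hill, is Sabine’s formula, which divides 0.161 times the room volume in cubic metres by the total absorption of its surfaces. The Python sketch below applies it to an imaginary meeting room; the dimensions and absorption coefficients are made up for illustration, and a real installation would rely on measurement.

```python
def sabine_rt60(volume_m3, surfaces):
    """Estimate RT60 in seconds with Sabine's formula (metric units).

    surfaces: iterable of (area_m2, absorption_coefficient) pairs.
    """
    absorption = sum(area * alpha for area, alpha in surfaces)
    if absorption <= 0:
        raise ValueError("total absorption must be positive")
    return 0.161 * volume_m3 / absorption


if __name__ == "__main__":
    # Hypothetical 6 m x 4 m x 3 m room; coefficients are illustrative only.
    room_volume = 6 * 4 * 3
    room_surfaces = [
        (24.0, 0.30),  # carpeted floor
        (24.0, 0.70),  # acoustic ceiling tiles
        (60.0, 0.10),  # painted walls
    ]
    print(f"Estimated RT60: {sabine_rt60(room_volume, room_surfaces):.2f} s")
```

Swapping the ceiling tiles for a hard surface in the list shows how quickly the estimate lengthens, which is the physical side of the room design Hill refers to.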

Root says another common technique used to solve acoustic and sound problems is automatic gain control (AGC), which is designed to boost the voice signal. “That comes in our top-end system FLX 2 voice over IP/Bluetooth conference phone system,” he explains. “It does all those tricks and helps but there can be artefacts from those algorithms, which can affect the audio.”
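The idea behind AGC can be shown in a few lines. The Python sketch below is a deliberately simple version that tracks the running level of the input and scales it towards a target; the target level, smoothing constant and gain ceiling are assumptions chosen for the demonstration and have nothing to do with the FLX 2 or any other product.

```python
import math


def agc(samples, target_rms=0.1, attack=0.01, max_gain=10.0):
    """Scale samples so that their running level drifts towards target_rms."""
    level_sq = target_rms ** 2  # running mean-square estimate of the input
    out = []
    for s in samples:
        level_sq += attack * (s * s - level_sq)
        # Cap the gain so near-silence is not amplified without limit.
        gain = min(max_gain, target_rms / math.sqrt(level_sq + 1e-12))
        out.append(s * gain)
    return out


def tone(amplitude, n=4000, period=40):
    """Test signal: a sine wave standing in for a steady voice level."""
    return [amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    for name, amplitude in (("quiet talker", 0.02), ("loud talker", 1.0)):
        levelled = agc(tone(amplitude))
        print(name, "in:", round(rms(tone(amplitude)), 3),
              "out:", round(rms(levelled[-1000:]), 3))
```

Because the gain follows the signal, it never sits completely still, and it lifts background noise along with a quiet voice. That is one plausible reading of the artefacts Root mentions.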

Which brings the focus back to getting the best possible signal to begin with, starting with the choice of microphone. Root says: “Omnis pick up everything but directional mics give a narrow beam and better response. They’re also not as reverberant, so if you’re in an acoustically challenging space then you need cardioid mics.”
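The difference Root describes follows from the textbook pick-up patterns. The Python sketch below uses the idealised first-order formula, in which sensitivity is a blend of a constant term and the cosine of the off-axis angle; real microphones deviate from it, especially at high and low frequencies, and the function names are illustrative.

```python
import math

# Blend factor a in: sensitivity = a + (1 - a) * cos(angle)
PATTERNS = {"omni": 1.0, "cardioid": 0.5}


def polar_gain(angle_deg: float, pattern: str = "cardioid") -> float:
    """Idealised sensitivity (1.0 = on-axis) at angle_deg off-axis."""
    a = PATTERNS[pattern]
    return abs(a + (1.0 - a) * math.cos(math.radians(angle_deg)))


def diffuse_pickup(pattern: str, steps: int = 1800) -> float:
    """Share of evenly distributed (reverberant) sound energy picked up,
    relative to an omni, found by averaging gain squared over a sphere."""
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta  # midpoint rule
        g = polar_gain(math.degrees(theta), pattern)
        total += g * g * math.sin(theta) * d_theta
    return total / 2.0


if __name__ == "__main__":
    for angle in (0, 90, 180):
        print(angle, "deg  omni:", round(polar_gain(angle, "omni"), 2),
              " cardioid:", round(polar_gain(angle, "cardioid"), 2))
    print("reverberant pick-up, cardioid vs omni:",
          round(diffuse_pickup("cardioid"), 3))
```

In this idealised model a cardioid is deaf directly behind, half as sensitive at the side, and collects roughly a third of the reverberant energy an omni would, which fits Root’s remark that directional mics sound less reverberant.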

Rob Houston of Biamp makes the point that no matter how good the microphone, it will not perform at its optimum level if it is poorly positioned. “Ideally everyone should have his or her own mic,” he says, “but that is not always possible.”

The other end of the audio chain is the loudspeaker and this is as important as the microphone because if the output quality is not good then what went before can be wasted. Root says the key point is not to have loudspeakers pointing directly at the microphones, which at worst can cause howl-round and at best make voices sound hollow and unpleasant. “Another problem is if you have a loudspeaker that introduces a lot of artefacts into the system, it can cause problems for the echo cancellation system, producing a lot of distortion because it doesn’t know what it is dealing with,” he adds. Palluat de Besset suggests a “good system of dissemination full of good quality components”, while Houston says a distributed pattern of loudspeakers round the room, typically in a quadrant pattern, delivers a more controlled sound than having everything blasting out from a sound bar or two cabinets at the front.

Hill takes a different view, saying he does not see loudspeaker selection as the top priority. “The room needs the correct number of loudspeakers to evenly fill it or some louder units by the far end display to give some directional audio but the actual audio that is sent through a videoconferencing call for the most part is mono with medium fidelity speech. As long as you select a quality speaker, the number and placement are more important than the actual speaker you select.”

All of which shows that sound is a very subjective area, although most people can tell the difference between good and bad quality when they hear it.