The Fraunhofer Institute will present a new system at IBC 2023 that can automatically localise and identify people in large media archives based on their faces and voices.
The Audiovisual Identity Suite combines technologies for face and speaker recognition, using artificial intelligence to analyse media content for the presence of specific individuals.
The system gives programme planners a comprehensive view of who appears in TV broadcasts, through a user-friendly interface that supports in-depth insights, trend analyses and statistics.
The tool uses a ‘heatmap’ to show when and how often an individual is visible or audible on different TV channels, including when the individual is speaking but not shown in the picture.
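How the Audiovisual Identity Suite builds its heatmap has not been described, but the idea can be illustrated with a minimal Python sketch. It assumes the face and speaker recognisers emit one record per detection; the `Detection` record, its fields and the `build_heatmap` function are hypothetical names chosen for this illustration and are not part of the Fraunhofer product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One recognised appearance (hypothetical record layout)."""
    person: str
    channel: str
    hour: int        # hour of day, 0-23
    seconds: float   # duration of the appearance
    visible: bool    # face recognised in the picture
    audible: bool    # voice recognised in the audio


def build_heatmap(detections, person):
    """Sum presence time per (channel, hour) cell for one person.

    Returns two dicts keyed by (channel, hour):
    total seconds seen or heard, and seconds heard but not shown.
    """
    presence = defaultdict(float)
    off_screen = defaultdict(float)
    for d in detections:
        if d.person != person or not (d.visible or d.audible):
            continue
        cell = (d.channel, d.hour)
        presence[cell] += d.seconds
        if d.audible and not d.visible:
            off_screen[cell] += d.seconds
    return dict(presence), dict(off_screen)


# Made-up sample data, for illustration only.
sample = [
    Detection("Person A", "Channel 1", 20, 90.0, True, True),
    Detection("Person A", "Channel 1", 20, 30.0, False, True),
    Detection("Person A", "Channel 2", 8, 45.0, True, False),
    Detection("Person B", "Channel 1", 20, 60.0, True, True),
]
presence, off_screen = build_heatmap(sample, "Person A")
```

Each cell of `presence` corresponds to one square of a channel-by-hour heatmap, while `off_screen` isolates the time a person is heard without being shown.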
A cross-modal analysis tool is also included to increase the validity and quality of search results, relying on AI-based algorithms for speaker recognition, gender classification and speech quality analysis.
In the future, Fraunhofer intends to add age estimation based on visual analysis, as well as audio enhancements such as language recognition, speech-to-text conversion and keyword analytics.
Christian Rollwage, speaker recognition specialist at the Fraunhofer Institute, commented: “Our planned enhancements will provide deeper opportunities for analysis. With the addition of text transcription, we can not only determine how often certain people appear but also which topics they are talking about.”