WaveFont – Visualization of information and emotions from the voice in captions
Research article Open access | Available online on: 25 August, 2019 | Last update: 28 October, 2021
While traditional captions only reflect what is spoken in an utterance, the innovative technology WaveFont additionally visualizes how it is spoken. WaveFont was invented by Prof. Dr. Matthias Wölfel, Angelo Stitz and Dr. Tim Schlippe in an art research project. Due to the very positive feedback, Dr. Tim Schlippe decided to make it available to a broader audience by commercializing and further developing WaveFont as the founder and CEO of Silicon Surfer. Hearing-impaired people in particular receive, for the first time, information and emotions from the voice (stresses, pauses, length) from which they were previously excluded. The WaveFont visualization is very intuitive: for example, when someone speaks louder, the font gets bolder, and when someone speaks more slowly, it gets wider.
After a short time of getting used to it, viewers take in information and emotions much better than with conventional captions, which supports accessibility, integration and inclusion.
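The mapping described above (louder speech gives a bolder font, slower speech gives a wider font) can be illustrated with a minimal sketch. This is not Silicon Surfer's implementation: the `Word` fields, the per-word loudness and duration values, and the output ranges are all assumptions chosen for illustration, with the ranges loosely modeled on the `wght` and `wdth` axes of OpenType variable fonts.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    loudness_db: float  # hypothetical per-word loudness measured from the audio
    duration_s: float   # hypothetical per-word duration in seconds


def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map value from [lo, hi] to [out_lo, out_hi], clamped to the range."""
    if hi == lo:
        # All inputs are identical, so fall back to the middle of the output range.
        return (out_lo + out_hi) / 2
    t = (value - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)


def style_words(words):
    """Return (text, weight, width) per word, relative to the other words."""
    louds = [w.loudness_db for w in words]
    # Seconds per character as a rough measure of speaking tempo (assumption).
    rates = [w.duration_s / max(len(w.text), 1) for w in words]
    styled = []
    for w, rate in zip(words, rates):
        # Louder words get a higher weight (assumed range 300 to 900).
        weight = round(scale(w.loudness_db, min(louds), max(louds), 300, 900))
        # Slower words get a larger width (assumed range 75 to 125 percent).
        width = round(scale(rate, min(rates), max(rates), 75, 125))
        styled.append((w.text, weight, width))
    return styled


if __name__ == "__main__":
    caption = [
        Word("no", -30.0, 0.2),
        Word("STOP", -10.0, 0.4),
        Word("please", -20.0, 1.2),
    ]
    for text, weight, width in style_words(caption):
        print(f"{text}: weight={weight}, width={width}%")
```

The values are normalized within one caption, so weight and width express how a word was spoken relative to its neighbors rather than on an absolute scale. A real system would more likely calibrate against the speaker or the whole recording.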
To enable hearing-impaired people to perceive these voice characteristics, Silicon Surfer offers a service that automatically produces WaveFont captions. To generate them, Silicon Surfer processes traditional subtitle files or transcriptions together with the corresponding sound file.
WaveFont is universally applicable across genres and can be ported to new languages and writing systems. So far it is available in English, Spanish and German, and first steps have been taken to port it to Arabic.
Potential in the world
A huge number of people would benefit from additional information in captions: According to the World Health Organization, 466 million people worldwide have disabling hearing loss. This is 5% of the world’s population. Due to demographic change, it is estimated that by 2050 over 900 million people will have hearing impairments. However, this new technology and visualization is not only of interest to hearing-impaired people: For example, there are 244 million migrants living in a country other than the one where they were born. Many of them need to learn a new language. WaveFont captions would help them, since Silicon Surfer’s analyses have demonstrated that the WaveFont visualization is much more intuitive than phonetic transcriptions such as the International Phonetic Alphabet. According to Amazon Prime Video, 30% of its users watch with captions, but only 20% of those are hearing impaired. By far the largest group of people who would benefit from WaveFont captions, however, is on social media: 85% of active Facebook users watch videos without sound. That amounts to 2 billion people. WaveFont supports the UN Sustainable Development Goals 3 (good health and well-being) and 4 (quality education) of the 2030 Agenda.
Potential in Qatar
5.2% of the population in Qatar suffers from hearing loss (Girotto et al., 2014). WaveFont can improve their TV, video and cinema experience, since they receive information from which they were previously excluded and can better imagine how their favorite actors speak.
Another very interesting application for the WaveFont technology in Qatar is the FIFA World Cup. WaveFont is particularly suitable for visualizing emotional scenes. Such scenes can be found in the sportscasts of the matches, but also in the commercials and explanatory videos about the event. Further application areas are information systems, display boards, and media libraries with videos. With WaveFont, Qatar can be a pioneer and offer a much more accessible and inclusive event.
At the ArabicSpeech 2019 conference at the Qatar Computing Research Institute (QCRI), Dr. Tim Schlippe, CEO of Silicon Surfer, presented the WaveFont technology and his first steps to port it to Arabic.
Dr. Tim Schlippe from Silicon Surfer at QCRI.
To make WaveFont available in Qatar, Silicon Surfer plans to participate in Mada’s Innovation Program. This includes porting the technology to Arabic, analyzing its impact and benefit in Qatar, evaluating application areas (especially for the FIFA World Cup 2022), integrating it into information systems, display boards and media libraries with videos, and porting it to further Arabic dialects and languages.