- The Race to Hide Your Voice. Wired, 1 Jun 2022. Magazine article
- Artificial Intelligence: HSBC Releases AI Campaign Featuring the Faces of Fraudsters. Adweek.com, 17 Nov 2022. News article
- HSBC and Wunderman Thompson put AI faces to fraudsters in global fraud prevention campaign. Marketing Beat, 18 Nov 2022. News article
- HSBC puts faces to 'invisible' fraudsters in global blitz. Decision Marketing, 22 Nov 2022. Article
- New and innovative HSBC campaign by Wunderman Thompson UK reveals the real faces of fraudsters. ArabAd, 21 Nov 2022. Article
Voice Intelligence
Profiling humans from their voice
How much information can we deduce from the human voice? Currently, our work is focussed on deducing many aspects of the human persona from voice. Some of this work is described below.
Deducing faces from voices
Sounds produced in our vocal tract resonate in our vocal chambers. They are modulated by our articulators (tongue, lips, jaw etc.) and further influenced by the structures within the vocal tract. The dimensions of our vocal chambers are highly correlated with our skull structure, which in turn defines our facial appearance to a very large extent. Many physical properties and dynamics of our vocal structures also correlate well with our age, ethnicity, height, gender and so on -- which in turn influence our appearance. Thus, the signatures of the many physical factors that naturally influence our voice can also, directly or indirectly, carry information about our facial structure. Technologies that infer facial appearance and structure from voice leverage this web of information embedded in the voice signal. AI systems can also be designed to isolate this information.
Publications
Code
Applications
First introduced in 2017, this technology made a global debut at the World Economic Forum in 2018, creating faces from voices of speakers in a VR environment. Over a thousand speakers tested it at the WEF. Some of our work was published in this book in 2019.
Biomarker discovery
Biomarker discovery techniques identify, design or construct specific mathematical representations of voice signals that bear a strong correlation with a particular influence hypothesized to affect voice. As an example, consider a scenario where we theorize that the consumption of a specific medication must impact the voice. Yet in practice no such effect is apparent -- these alterations, if present, remain undetectable to us, evading even the most direct voice signal analyses. In such instances, a biomarker discovery system could reveal the precise, distinctive signature indicative of the medication's influence on a given voice signal. This distinguishing signature might lie in a complex, high-dimensional mathematical space -- an imperceptible virtual representation that is only useful for computational purposes. Nevertheless, once the identified signature can be extracted from new voice samples, machines can learn to detect, merely from the speaker's voice, the recent consumption of that specific medication. We began work on biomarker discovery techniques in 2016, well before the control of latent space representations using neural architectures became mainstream. Today we continue to expand the set of entities for which we have successfully created AI-based discovery pipelines.
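As a rough illustration of the underlying idea only (not the discovery pipeline described above), the sketch below screens a handful of candidate voice features by how strongly each correlates with a binary condition label. The feature names, the values and the labels are all hypothetical and synthetic, and plain Pearson correlation is a deliberately simple stand-in for a search that in practice happens in a high-dimensional space.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def rank_candidate_features(features, labels):
    """Sort candidate features by absolute correlation with the label.

    features: dict mapping a feature name to one value per recording.
    labels:   one 0/1 value per recording (1 = condition present).
    """
    scored = [(name, pearson(values, labels)) for name, values in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Synthetic toy data (assumption): 4 recordings without and 4 with the condition.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "feature_a": [1.0, 1.2, 0.9, 1.1, 1.0, 1.1, 0.9, 1.2],      # same mean in both groups
    "feature_b": [0.2, 0.3, 0.25, 0.2, 0.6, 0.7, 0.65, 0.7],    # shifts with the label
}

ranked = rank_candidate_features(features, labels)
for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
```

In this toy data, `feature_b` ranks first because its values shift with the label, while `feature_a` has the same mean in both groups. A signature that is invisible in any single hand-picked feature, as in the medication example, would need a learned representation rather than this kind of one-feature-at-a-time screening.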
Publications
Patent
Applications
Currently, multiple entities worldwide are engaged in exploring the possibilities of this technology in healthcare applications.
Deducing vocal fold oscillations
Our vocal folds oscillate in a self-sustained manner during phonation (i.e., when we produce voiced sounds like a sustained "aa"). There is a wealth of information in the fine-level details of how they oscillate. Understanding these details can reveal an astonishing amount of information about the state of the speaker. However, historically it was hard to measure the vocal fold oscillations of each person as they spoke. Doing so required specialized instruments used in clinical settings. At CVIS, we have been developing techniques to deduce the vocal fold oscillations of speakers directly from the voiced sounds in recorded speech signals. This opens doors to analyzing vocal fold oscillations on an individual basis, and studying the changes in their patterns in response to various influencing factors -- from substances to mental problems to infectious diseases like Covid-19. Based on our techniques, we built a live Covid-19 detection system in February 2020, and put together a protocol for analyzing a set of sustained vowel sounds and a couple of continuous speech examples. This protocol is now globally used as a basis for sound analysis for Covid-19 and other conditions. We began this work in 2017 with the interesting goal of detecting voice disguise in-vacuo (without knowing what the original voice sounds like), and have made considerable progress in refining our techniques and exploring practical applications for them.
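To make the idea of reading oscillation information off a recording concrete, here is a minimal sketch that estimates only the rate at which the vocal folds oscillate (the fundamental frequency) from a sustained vowel, using classical autocorrelation. This is a textbook technique, not the CVIS method, which aims at the fine-level details of the oscillation rather than just its rate. The synthetic "vowel", the function names and the default search range of 60 to 400 Hz are assumptions made for illustration.

```python
import math


def synth_vowel(f0_hz, sample_rate, duration_s):
    """Toy stand-in for a sustained vowel: a fundamental plus two weaker harmonics."""
    n = int(sample_rate * duration_s)
    return [
        math.sin(2 * math.pi * f0_hz * t / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0_hz * t / sample_rate)
        + 0.25 * math.sin(2 * math.pi * 3 * f0_hz * t / sample_rate)
        for t in range(n)
    ]


def estimate_f0(signal, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate the oscillation rate in Hz from the best-matching autocorrelation lag."""
    mean = sum(signal) / len(signal)
    x = [s - mean for s in signal]
    lag_min = int(sample_rate / f0_max)  # shortest period considered, in samples
    lag_max = int(sample_rate / f0_min)  # longest period considered, in samples
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        n = len(x) - lag
        num = sum(x[i] * x[i + lag] for i in range(n))
        den = math.sqrt(sum(v * v for v in x[:n]) * sum(v * v for v in x[lag:]))
        score = num / den if den > 0 else 0.0
        # Require a clear improvement, so that multiples of the true period
        # (which score almost identically) do not displace the first peak.
        if score > best_score + 1e-9:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


sample_rate = 8000
vowel = synth_vowel(f0_hz=125.0, sample_rate=sample_rate, duration_s=0.1)
print(f"estimated f0: {estimate_f0(vowel, sample_rate):.1f} Hz")
```

On real speech the estimate would be computed frame by frame, and cycle-to-cycle variation in the detected period is one of the simplest cues to how the oscillation changes over time.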
Publications
Patent
Applications
The most potent use of this technology is in the early detection of Parkinson's and other serious neuromuscular disorders and illnesses. With sufficient discriminative data, it can be applied to detect a plethora of diseases, to break voice disguise, to differentiate synthetic speech from real speech, and in many other profiling applications.
Deducing emotional, physiological, psychological and behavioral states and traits from voice
Vocal fold oscillations are not the sole cues that offer insights into the physiological alterations within our body. Subtleties embedded within our vocal production and control (vocal expression) are also linked with our emotional, behavioral, and psychological characteristics. The deduction of such "states" from vocal nuances has been an area of considerable research, spurred by centuries of observation and correlation between speech patterns and these characteristics. Our work has two facets. In one, we join hands with researchers across the globe, contributing to mainstream techniques for deducing target states from voice. In the other, we move beyond these, aiming to discover the underlying traits: correlations that not only apply universally across populations but also relate specifically to an individual and their current state, and correlations with our genetic makeup. Our research aims to develop technologies to analyze diverse human states and traits, thereby enhancing our comprehension of the intricate relationship between vocal features and human characteristics. As an example, we deduce psychological traits from an aggregation of a person's emotional states -- the emotional spectrum -- together with objective measurements of various voice qualities. These are also supported by measurements of low-level features.
Publications
Patent
Applications
"Creating geriatric specialists takes time, and we already have far too few. In a year, fewer than three hundred doctors will complete geriatric training in the United States, not nearly enough to replace the geriatricians going into retirement, let alone meet the needs of the next decade. Geriatric psychiatrists, nurses, and social workers are equally needed, and in no better supply. The situation in countries outside the United States appears to be little different. In many, it is worse." -- Quoted from Chad Boult, Geriatrics Professor, in "Being Mortal" by A. Gawande, Surgeon, and Professor at Harvard Medical School and Harvard School of Public Health.