Firstpost
Google is training AI to separate voices in a crowd which could find use in video chat apps as well as hearing aids
tech2 News Staff • April 13, 2018, 11:13:27 IST

According to Google, the technique involves combining the auditory and visual signals of an input video to separate the speech.


Google’s computer vision has improved markedly over the years, as shown by the artificial intelligence behind its Photos app, which recognises faces, objects and more. Now Google wants to bring similar capabilities to voice, specifically through audio-visual speech separation.

The Google logo. Reuters

Say you are in a crowd and a friend calls out to you. Even though you may not know where your friend is standing, his or her voice has a pattern that you can recognise immediately despite the noisy crowd around you. A machine may not be able to do that as efficiently. Try controlling a smart speaker at your next house party when you want to play your music and you will see the problem.

Google’s researchers have developed a deep learning system that separates voices by looking at people’s faces while they are speaking and then boosts those voices. The team first trained a neural network to recognise the voices of individual people speaking on their own. It then simulated virtual parties by mixing those individual voices together, teaching the AI to isolate multiple voices into separate audio tracks.
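The "virtual party" idea can be pictured with a minimal sketch: clean single-speaker tracks are summed into one mixture, and the original tracks are kept as the targets a model would learn to recover. This is an illustration only, not Google's code. The function names, the sine tones standing in for recorded voices, and the added background noise are all assumptions made for the example.

```python
import math
import random


def sine_voice(freq_hz, n_samples, sample_rate=16000):
    """Hypothetical stand-in for a clean single-speaker recording (a pure tone)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


def make_training_example(clean_tracks, noise_level=0.1, seed=0):
    """Sum clean tracks plus random background noise into one mixture.

    Returns (mixture, targets): the model input and the per-speaker
    tracks it should learn to recover. The noise term is an assumption.
    """
    rng = random.Random(seed)
    n = min(len(track) for track in clean_tracks)
    targets = [track[:n] for track in clean_tracks]
    mixture = [
        sum(track[i] for track in targets)
        + rng.uniform(-noise_level, noise_level)
        for i in range(n)
    ]
    return mixture, targets


speaker_a = sine_voice(220.0, 1600)
speaker_b = sine_voice(330.0, 1600)
mixture, targets = make_training_example([speaker_a, speaker_b])
```

Because the mixture is built from known clean tracks, every training example comes with its own correct answer, which is what makes this kind of simulated data convenient for supervised training.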

In Google’s test clip, the system separates the voices of two stand-up comedians from the crowd (and from each other) by recognising their faces and generating an audio track for each individual’s speech. As the video progresses, you can push the slider to either end to hear one comedian’s voice more clearly, with the audience laughter drowned out.

According to Google, the technique involves combining the auditory and visual signals of an input video to separate the speech. The system looks at the movements of a person’s mouth and correlates them with the sounds produced as the person speaks. Using the visual signal in addition to the audio, rather than audio alone, helps produce clean speech tracks associated with a particular visible speaker in a video.

This can be useful in video chat services. In fact, Google is looking to explore opportunities to test this feature in products such as Hangouts and Duo, where it would boost the voice of the person you are talking to even if they are in a crowded room.
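Once each speaker has a separate audio track, a slider like the one in the demo clip amounts to a weighted blend of those tracks. The sketch below is a hypothetical illustration of that blending step only; it is not how Google's demo is implemented, and the `remix` function and sample values are made up for the example.

```python
def remix(track_a, track_b, slider):
    """Blend two separated tracks; slider=0.0 keeps only A, 1.0 keeps only B."""
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0.0 and 1.0")
    return [(1.0 - slider) * a + slider * b
            for a, b in zip(track_a, track_b)]


# Made-up sample values standing in for two separated speech tracks.
comedian_a = [0.5, -0.25, 0.0, 0.75]
comedian_b = [-1.0, 0.5, 0.25, 0.0]

only_a = remix(comedian_a, comedian_b, 0.0)
only_b = remix(comedian_a, comedian_b, 1.0)
both = remix(comedian_a, comedian_b, 0.5)
```

Boosting one speaker in a video call would follow the same pattern: a higher weight on the target speaker's track and a lower weight on everything else.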

“We believe this capability can have a wide range of applications, from speech enhancement and recognition in videos, through video conferencing, to improved hearing aids, especially in situations where there are multiple people speaking,” said the Google Research Blog. Google also believes the technology can help automatic closed captioning systems when multiple speakers overlap, and that it can serve as a pre-processing step for speech recognition.

According to Engadget, the technology could also be misused, for example for eavesdropping in public. China could implement something like this on a mass scale, considering how it has been using facial recognition technology to identify law-breakers in the country.
