Microsoft's new AI bot VALL-E can replicate anyone’s voice with just a 3-second audio sample

Mehul Reuben Das • January 10, 2023, 13:39:35 IST

Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything, in a way that attempts to preserve the speaker’s emotional tone as well as the acoustic environment of the original recording.

A team of researchers at Microsoft has developed a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice. All it needs to mimic a voice is a three-second audio sample of the speaker.

Moreover, the researchers claim that once the AI bot learns a specific voice, VALL-E can synthesize audio of that person saying anything, and do it in a way that attempts to preserve the speaker’s emotional tone.

The developers say VALL-E could potentially be used for high-quality text-to-speech applications, for speech editing, where a recording of a person could be edited and changed from a text transcript, and for content creation in conjunction with other generative AI models such as GPT-3.

Microsoft’s VALL-E builds on a technology called EnCodec, which Meta announced in October 2022. Unlike other text-to-speech methods, which typically synthesize speech by manipulating waveforms, VALL-E generates discrete audio codec codes from text and acoustic prompts. Basically, VALL-E analyzes how a person sounds and breaks the voice down into tokens. It then uses its training data to match what it “knows” about how that voice would sound if it spoke other phrases.

Microsoft used LibriLight, an audio library put together by Meta, to train VALL-E’s voice synthesis skills. The library contains 60,000 hours of English-language speech from more than 7,000 different speakers, most of it taken from LibriVox public domain audiobooks. For VALL-E to produce a satisfactory result, the voice in the three-second sample must closely resemble a voice in the training data.

In addition to preserving a speaker’s vocal timbre and emotional tone, VALL-E can also imitate the “acoustic environment” of the sample audio. If the sample came from a telephone call, for instance, the synthetic output will imitate the acoustic and frequency qualities of a telephone call, which is a fancy way of saying that it will sound like a telephone call as well. Additionally, Microsoft’s samples (included in the “Synthesis of Diversity” section) show how VALL-E can produce different voice tones by altering the random seed used during generation.
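
For readers curious what "breaking a voice into tokens" and "altering the random seed" could look like in practice, here is a toy Python sketch. It is not VALL-E or EnCodec: the uniform quantizer, the `quantize` and `sample_tokens` helpers, the 8-level vocabulary and the sine-wave input are all assumptions made purely for illustration. It only shows the general idea that audio can be turned into a sequence of integer codes, and that drawing codes with a different seed can yield a different take, while the same seed always reproduces the same one.

```python
import math
import random


def quantize(samples, levels=8):
    """Map each sample in [-1.0, 1.0] to an integer token id in [0, levels - 1].

    A plain uniform quantizer, used here as a stand-in for a learned codec.
    """
    tokens = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip out-of-range samples
        tokens.append(min(levels - 1, int((s + 1.0) / 2.0 * levels)))
    return tokens


def sample_tokens(weights, length, seed):
    """Draw `length` token ids from a fixed distribution with a seeded generator."""
    rng = random.Random(seed)
    ids = list(range(len(weights)))
    return rng.choices(ids, weights=weights, k=length)


# Hypothetical input: one cycle of a sine wave sampled 16 times.
wave = [math.sin(2 * math.pi * i / 16) for i in range(16)]
tokens = quantize(wave)

# Build a toy distribution from how often each token appeared (plus one,
# so that every token keeps a non-zero chance of being drawn).
weights = [tokens.count(t) + 1 for t in range(8)]

take_a = sample_tokens(weights, 12, seed=0)
take_b = sample_tokens(weights, 12, seed=1)

print("tokens:", tokens)
print("seed 0:", take_a)
print("seed 1:", take_b)
```

A real system would replace the quantizer with a neural codec and the fixed distribution with a language model conditioned on the text and the three-second prompt; the seed would play the same role of selecting one of many plausible outputs.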
