All you need to know about Adobe’s Project VoCo
At its recent annual MAX conference in San Diego, Adobe showcased a new experimental tool called Project VoCo. According to Adobe, Project VoCo lets you edit speech as easily as text, and it doesn’t stop at editing existing recordings: it also lets you use the same voice model to create completely new ones.
When completed, this project would do for audio what Photoshop does for images!
Here’s how it’s expected to work
The software would require about 20 minutes of voice samples from a given speaker. It then analyzes the speech, breaks it down into phonemes, transcribes it, and creates a voice model.
As Adobe noted in its demo during a press event at MAX, the project isn’t built on conventional speech synthesis technology but on what the company called “voice conversion.” One astonishing thing about the program is how little manual intervention it needs. You can correct the auto-generated transcript to improve the synthesis, but there’s no need to set timestamps, for example. The algorithms work those out themselves.
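To make the “edit speech as text” idea a little more concrete, here is a minimal Python sketch of the text side only: comparing the original transcript with an edited one to find which words would have to be newly synthesized. This is not Adobe’s algorithm, which has not been published; the function name `find_transcript_edits` is hypothetical, and the sketch relies only on Python’s standard `difflib` module.

```python
from difflib import SequenceMatcher


def find_transcript_edits(original, edited):
    """Return (operation, old_words, new_words) for each change between
    two transcripts. Illustrative only; not Adobe's actual method."""
    old_words = original.split()
    new_words = edited.split()
    # Compare the transcripts word by word rather than character by character.
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replaced, inserted, or deleted spans
            edits.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return edits


if __name__ == "__main__":
    print(find_transcript_edits("the quick brown fox", "the quick red fox"))
```

A real system would still have to map each changed word back to a position in the audio and generate the new sound in the speaker’s voice, which is the hard part this sketch leaves out.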
Purposes of the project
From the official statement released by Adobe during the event:
“When recording voiceovers, dialog, and narration, people would often like to change or insert a word or a few words due to either a mistake they made or simply because they would like to change part of the narrative. We have developed a technology called Project VoCo in which you can simply type in the word or words that you would like to change or insert into the voiceover. The algorithm does the rest and makes it sound like the original speaker said those words.”
Growing ethical concerns about the project
When completed, this program could be used to insert single new words that the speaker never said, and even to create entirely new, natural-sounding sentences in the speaker’s voice. The technology behind the project raises all kinds of questions. Can we still trust a recording of somebody’s speech? You may soon be able to make people say things they never said.
These concerns are growing within the ICT community, but from a purely technical perspective, the project is pretty impressive.
Whatever its intended audience, if the program is released, it might become harder to trust a recording of someone’s speech. But given the popularity of podcasting, the constant problems with capturing clean audio, and the monotony of ADR (automated dialogue replacement), Adobe’s Project VoCo looks like it could be a real game-changer for filmmakers. It may even reach the market relatively soon. As with all of its previous “Sneak Peeks,” Adobe won’t commit to ever shipping it, but over the years, many of the projects introduced this way have made their way into the company’s products.
VoCo does present some scary, dystopian potential: someone could replicate another person’s voice for any number of malicious purposes. Still, as avid followers of technology trends, we’re excited by Project VoCo’s potential.