

Did you ever imagine you’d hear JAY-Z do Shakespeare’s “To Be, Or Not to Be” soliloquy from Hamlet? How about Billy Joel’s “We Didn’t Start the Fire,” or a decade-old 4chan meme? In late April, audio clips surfaced that appeared to capture JAY-Z rapping several unexpected texts. All of these unlikely recitations were, of course, fake: “entirely computer-generated using a text-to-speech model trained on the speech patterns of JAY-Z,” according to a YouTube description.

“Deepfakes” are super-realistic videos, photos, or audio falsified through sophisticated artificial intelligence. The better-known deepfakes are probably videos, which can be as silly as Green Day frontman Billie Joe Armstrong’s face superimposed on Will Ferrell’s, or as disturbing as non-consensual porn and political disinformation. But audio deepfakes, AI-generated imitations of human voices, are possible too. The software has to be “trained” with audio samples and text transcripts. The self-described “hobbyist” behind the Voice Synthesis channel told the blog Waxy that the JAY-Z deepfakes were created with Tacotron 2, a text-to-speech program developed by Google.
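For the technically curious, the sketch below gives a rough sense of how a Tacotron 2 pipeline turns written text into synthetic speech. It is a minimal example assuming NVIDIA’s publicly released PyTorch implementation and pretrained checkpoints loaded through torch.hub, paired with the WaveGlow vocoder; those checkpoints and that pairing are assumptions about a typical setup, not a description of the Voice Synthesis channel’s actual models, which would have been trained on JAY-Z recordings and matching transcripts.

```python
# Minimal sketch of Tacotron 2 text-to-speech, assuming NVIDIA's public
# PyTorch implementation and pretrained checkpoints via torch.hub.
# The Voice Synthesis channel's JAY-Z models and training data are not public.
import torch

# Tacotron 2 maps text to a mel spectrogram; WaveGlow turns that
# spectrogram into an audible waveform.
tacotron2 = torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_tacotron2")
waveglow = torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_waveglow")
utils = torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_tts_utils")

tacotron2 = tacotron2.to("cuda").eval()
waveglow = waveglow.remove_weightnorm(waveglow).to("cuda").eval()

text = "To be, or not to be, that is the question."
sequences, lengths = utils.prepare_input_sequence([text])

with torch.no_grad():
    mel, _, _ = tacotron2.infer(sequences, lengths)  # text -> mel spectrogram
    audio = waveglow.infer(mel)                      # mel spectrogram -> waveform

# `audio` holds waveform samples (22,050 Hz) that can be written out as a WAV file.
audio_samples = audio[0].data.cpu().numpy()
```

In broad strokes, the model first learns the mapping between a speaker’s recorded audio and its transcripts; once trained on a particular voice, it can “read” any new text aloud in that voice.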


Two days after the JAY-Z clips were posted to YouTube, they were removed due to a copyright claim. The takedowns may have been a first attempt to challenge audio deepfake makers, but musicians and fans could potentially be grappling with the weird consequences of AI voice manipulations long into the future.

How are audio deepfakes different from sampling? Musicians sue all the time over unauthorized samples of their work in other artists’ songs, so it may not seem unreasonable that they could sue for unauthorized samples in an AI simulator of their own voices. The actual voice is used in the creation, but from there it’s all ones and zeros from the AI. It seems, though, the algorithms may have the law in their favor.
