Revolutionizing Voice Synthesis: OpenAI’s Voice Engine Leads the Way in AI-Generated Speech

From just a 15-second sample, artificial intelligence can mimic a human voice and make it speak in other languages

In late March 2024, OpenAI previewed a platform called Voice Engine. The tool creates a synthetic voice from a single 15-second audio sample of a person and can then read text aloud in that voice, either in the speaker's original language or in other languages. To assess beneficial applications and the security measures they require, OpenAI has partnered with companies across several sectors, including Age of Learning, HeyGen, Dimagi, Livox, and the Lifespan health system.

Jeff Harris, a member of OpenAI’s product team for Voice Engine, said that development of the platform began in late 2022. The model was trained on a mix of licensed and publicly available data, and it already powers the preset voices in OpenAI’s text-to-speech API and ChatGPT’s Read Aloud feature. Access to the voice-cloning capability itself, however, will be restricted to around ten developers, reflecting OpenAI’s cautious approach to introducing the technology.

The text-to-audio generation field is advancing rapidly, with companies such as Podcastle and ElevenLabs among the most prominent. While this surge of interest holds great potential for businesses in many sectors, it also raises ethical and security concerns, exemplified by the US Federal Communications Commission’s recent ban on robocalls that use AI-cloned voices without consent.

In response to these risks, OpenAI has established strict usage policies for its partners. They prohibit impersonating anyone without consent and require explicit, informed consent from the original speaker. In addition, all generated audio clips will carry a watermark to aid traceability, and OpenAI will closely monitor how the synthetic voices are used. The broader preventative measures OpenAI proposes include phasing out voice-based authentication for bank account access, safeguarding people’s voices in AI, improving public education about deepfakes, and developing systems for tracking AI-generated content.

Overall, Voice Engine could change how synthetic voices are generated from short speech samples, a development in AI technology with potentially far-reaching implications across many industries.
