Latest news with #voicegeneration

Yahoo

13-07-2025

Business
Yahoo

Meta acquires voice startup Play AI

Meta has acquired Play AI, a startup that uses AI to generate human-sounding voices. A Meta spokesperson has confirmed the acquisition, according to Bloomberg, which also reports that an internal memo stated that the 'entire PlayAI team' will be joining the company next week. (TechCrunch has also reached out to Meta for confirmation.) Meta's memo reportedly said that PlayAI's 'work in creating natural voices, along with a platform for easy voice creation, is a great match for our work and road map, across AI Characters, Meta AI, Wearables and audio content creation.' The company has been making big investments in AI, including aggressive recruiting from OpenAI and a deal with Scale AI that saw the company's CEO Alexandr Wang joining Meta to lead a new group focused on superintelligence. The financial terms of the acquisition were not disclosed. Bloomberg had previously reported that the two companies were in acquisition talks.

TechCrunch

13-07-2025

Business
TechCrunch

Meta acquires voice startup Play AI

Meta has acquired Play AI, a startup that uses AI to generate human-sounding voices. A Meta spokesperson has confirmed the acquisition, according to Bloomberg, which also reports that an internal memo stated that the 'entire PlayAI team' will be joining the company next week. (TechCrunch has also reached out to Meta for confirmation.) Meta's memo reportedly said that PlayAI's 'work in creating natural voices, along with a platform for easy voice creation, is a great match for our work and road map, across AI Characters, Meta AI, Wearables and audio content creation.' The company has been making big investments in AI, including aggressive recruiting from OpenAI and a deal with Scale AI that saw the company's CEO Alexandr Wang joining Meta to lead a new group focused on superintelligence. The financial terms of the acquisition were not disclosed. Bloomberg had previously reported that the two companies were in acquisition talks.

ElevenLabs Eyes India as Strategic Growth Hub in the AI Voice Race

Entrepreneur

24-06-2025

Business
Entrepreneur

ElevenLabs Eyes India as Strategic Growth Hub in the AI Voice Race

"We think we are very well placed to be the voice of the Indic internet where content has no barrier and creativity knows no limit," says Siddharth Srinivasan, GMT, India, ElevenLabs Opinions expressed by Entrepreneur contributors are their own. You're reading Entrepreneur India, an international franchise of Entrepreneur Media. As the global market for AI voice generation continues its rapid expansion is projected to surge from USD 3.5 billion in 2023 to USD 21.7 billion by 2030. AI voice generator Unicorn, ElevenLabs is doubling down on India, recognising the country's unmatched scale in multilingual content consumption, tech talent, and AI adoption. The US-based startup, valued at USD 3.3 billion following its latest Series C funding round, is positioning India at the centre of its international strategy. The company's public breakthrough in India came when it dubbed Prime Minister Narendra Modi's three-hour conversation with Lex Fridman from Hindi to English. Why India? "In many ways, India was always waiting for a solution like this," says Siddharth Srinivasan, GMT, India, ElevenLabs. "We are all natively bilingual or multilingual, and the internet penetration, content consumption, and developer community here are unmatched." According to Srinivasan, India offers an ideal structural fit across five major user cohorts from consumers, creators, developers, startups, to enterprise AI users. He notes that India is now home to 2.5 to 3 million monetised content creators, a number that continues to shock even former YouTube executives like himself. ElevenLabs operates as a SaaS platform with a two-pronged go-to-market strategy. The first is a self-serve model, allowing users to begin with a freemium tier and scale up to subscriptions starting at USD 5 (INR 400) per month. The second targets enterprises that require bespoke solutions, high-volume processing, and dedicated support. This hybrid model has enabled the company to work across various industries. 
"From Pocket FM and Kuku FM in audio storytelling to social media influencers like Varun Maya. Our voice stack is helping them scale content faster and in multiple languages," says Srinivasan. The startup is already working with leading Indian platforms such as Meesho, Apna, 99acres, and NoBroker, particularly in conversational AI and customer engagement workflows. "Some partners are using our tech to 3–4x their customer interaction scale, something they couldn't imagine doing manually," Srinivasan reveals. In education, ElevenLabs is collaborating with startups like Supernova to create personalised, multilingual learning experiences through AI-powered conversational agents. "The promise of education technology has always been one-to-one learning. We are now able to fulfil that using voice AI," he adds. On the cybersecurity challenges Given the increasing misuse of voice cloning in phishing and disinformation campaigns, ElevenLabs claims to take a strict, layered approach to responsibility. "Moderation, accountability, and provenance are built into our system," says Srinivasan. He elaborates that cloning protected voices such as public figures is restricted through a "no-go voice" list, and cloning can only occur with direct consent via their "voice capture" system. The company has also developed a speech classifier capable of identifying if a sample was generated on ElevenLabs with over 99 per cent precision. "We're working with industry standards on watermarking, detection, and are open to law enforcement partnerships," he says, pointing out that the company has successfully avoided misuse during recent US and Indian elections. No government tie-ups yet, but social impact is on the radar While ElevenLabs does not currently have formal partnerships with the Indian government, it is participating in NGO-led initiatives that use AI voice to support people with speech impairments. 
"We distribute the technology for free to those with vocal challenges, enabling them to express themselves," Srinivasan says. He acknowledges the massive potential for collaboration in education and social policy, particularly under the IndiaAI mission. "We hope to have a meaningful role in that ecosystem," he adds. Also, on the competition side, Srinivasan candidly explains, "If there's no competition, the space isn't worth being in." He says ElevenLabs differentiates itself through state-of-the-art research, a deeply user-centric product, and relentless execution. "Speed is the only real moat in AI," he states. India was the first market the company expanded into outside the West, and it's likely to remain a priority. ElevenLabs already supports 11–12 Indian languages and aims to push that further with emotion-rich, dialect-sensitive outputs in its V3 (latest) models. When asked about the future, Srinivasan is clear-eyed in ambition, "We think we are very well placed to be the voice of the Indic internet where content has no barrier and creativity knows no limit." He also hints at upcoming partnerships with Indian startups and research entities.

How to get the most out of Google's free AI Studio

Fast Company

09-06-2025

Fast Company

How to get the most out of Google's free AI Studio

Google's AI Studio and Labs let you experiment for free with new AI tools. I love the way these digital sandboxes, like the one from Hugging Face, let you try out creative new uses of AI. You can dabble around, then download and share what you make, without having to master a complex new platform. Read on for a few Google AI experiments to try. All are free, fast, and easy to use.

1. Transform an image

Upload a photo and use Gemini's AI Studio Image Generation to transform it with prompts. Iterate on your original image until you get a version you like. The model understands natural language, so you don't have to master prompt lingo.

2. Generate an AI voice conversation

AI-generated voices are increasingly hard to distinguish from human ones. If you're surprised, try Generate Speech in the AI Studio or Google's NotebookLM.

How to use Generate Speech in Google's AI Studio

Paste in text, either for a narration or a conversation between two people.
Open the settings tab to pick from 30 AI voices. Each is labeled with a characteristic, e.g. upbeat, gravelly, or mature.
Click run to generate the conversation.
Optionally adjust the playback speed.
Download the file if you want to keep it, or paste in different text to try again.

Example: a silly 90-sec chat between two violinists I scripted with Gemini and rendered quickly with this Generate Speech tool.

Use case: Make a narration track for an instructional video. ElevenLabs has a better professional model for this, but AI Studio's is free, easy and quick.

Alternatives

Google's Gemini AI app can also now generate audio overviews from files you upload, if you're on a paid plan.
Google's free NotebookLM has a new mobile app, and now lets you generate an audio conversation in any of 50 languages. Unlike Generate Speech in AI Studio, NotebookLM audio overviews summarize your material; they don't perform words as written. Why NotebookLM is so useful.
Google's Illuminate lets you generate, listen to, share, and download AI conversations about research papers and famous books. Here's an audio chat about David Copperfield, for example. A bit dry to listen to, but still useful.

3. Make a gif

Alternative: You can also make a static image with Google's Imagen 3 or the new Imagen 4. Write a short prompt and select your preferred aspect ratio. So far I still prefer Ideogram (why I like it) and ChatGPT's new image engine.

4. Generate a short video

Google's Veo 2 and Flow let you generate free short video clips almost instantly with a prompt. Create a clip to add vibrancy or humor to a presentation, or a visual metaphor to help you explain something. Here are 25 other quick ideas for how you might use little AI-generated video scenes.

How to create a video clip with Veo 2

Pick a length (5 to 8 seconds) and select horizontal or vertical orientation.
Write a prompt and optionally upload a photo to suggest a visual direction.

Example: Take a look at a parakeet photo I started with and the 5-second video I generated from the photo with Veo 2.

Tip: Convert short video clips into gifs for free with Ezgif or Giphy. Unlike video files, gifs are easy to share and auto-play in an email or presentation.

What's next: Remarkably lifelike clips made with Google's newer Veo 3 model went viral this week. These AI-generated visuals, with sound, are only available on the $250/month(!) plan for now, so try Veo 2 for free.

5. Explain things with lots of tiny cats

This playful mini app creates short, step-by-step visual guides using charming cat illustrations to explain any concept, from how a violin works to the concept behind the matrix.

ElevenLabs Launches Eleven v3 (alpha) : New Expressive Text to Speech Model

Geeky Gadgets

06-06-2025

Geeky Gadgets

ElevenLabs Launches Eleven v3 (alpha) : New Expressive Text to Speech Model

ElevenLabs has launched Eleven v3 (alpha), a new Text to Speech model designed to deliver highly expressive and realistic speech generation. This version introduces advanced features like multi-speaker dialogue, inline audio tags for emotional and tonal control, and support for over 70 languages. While it requires more prompt engineering than previous models, it offers significant improvements in expressiveness and naturalness, making it ideal for applications in media, audiobooks, and creative projects. A real-time version is under development, and API access will be available soon.

At the core of Eleven v3 is its ability to produce highly expressive and lifelike speech, offering users greater control over tone, emotion, and delivery. This is achieved through several innovative features:

Advanced emotional and tonal controls: Users can fine-tune voice delivery to convey specific emotions or tones, enhancing the natural flow of speech.
Inline audio tags: Tags such as '[whispers]' or '[laughs]' allow for the seamless integration of non-verbal cues like sighs, laughter, and whispers, making speech more dynamic and engaging.
Multi-speaker dialogue synthesis: The new Text-to-Dialogue API enables the creation of overlapping, realistic conversations between multiple speakers, complete with smooth transitions and nuanced emotional shifts.

These features make Eleven v3 particularly valuable for applications such as storytelling, audiobooks, media production, and interactive entertainment. By allowing more natural and expressive speech, the model enhances the overall user experience across a variety of platforms.
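To make the inline audio tags concrete, here is a minimal Python sketch of how a tagged, multi-speaker script might be assembled as plain text before it is sent to a speech model. Only the bracketed tag syntax ('[whispers]', '[laughs]') comes from the article. The `DialogueLine` class, the helper functions, and the "Speaker: text" layout are hypothetical illustrations, not the ElevenLabs Text-to-Dialogue API or its request format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DialogueLine:
    """One spoken line. All field names here are hypothetical."""
    speaker: str                 # illustrative speaker label
    text: str                    # words to be spoken
    tags: Tuple[str, ...] = ()   # audio tags such as "whispers" or "laughs"


def render_line(line: DialogueLine) -> str:
    """Prefix the spoken text with bracketed audio tags, if any."""
    prefix = " ".join(f"[{tag}]" for tag in line.tags)
    return f"{prefix} {line.text}".strip()


def render_script(lines: List[DialogueLine]) -> str:
    """Join lines as 'Speaker: [tag] text', one per line (assumed layout)."""
    return "\n".join(f"{line.speaker}: {render_line(line)}" for line in lines)


script = render_script([
    DialogueLine("Narrator", "The room went quiet.", ("whispers",)),
    DialogueLine("Guest", "I did not expect that!", ("laughs",)),
    DialogueLine("Narrator", "Neither did anyone else."),
])
print(script)
```

The sketch only builds text. How the tagged script is submitted, and how speakers are mapped to voices, depends on the actual API, which the article says is still rolling out.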
Breaking Language Barriers

Eleven v3 addresses the growing demand for multilingual support by offering compatibility with over 70 languages. This capability ensures that speech output maintains natural stress, cadence, and contextual accuracy across diverse linguistic settings.

Improved linguistic adaptability: The model demonstrates a deeper understanding of accents, dialects, and cultural nuances, making it suitable for a wide range of global audiences.
Applications in multilingual projects: Eleven v3 is well-suited for international audiobooks, educational content, and customer support systems, allowing creators to reach broader audiences.

By supporting diverse languages and accents, Eleven v3 fosters inclusive communication and helps bridge language gaps, making it a valuable tool for global accessibility.

Real-Time Capabilities and Developer Integration

Although Eleven v3 currently requires more prompt engineering than its predecessors, a real-time version is under development. This future iteration is expected to cater to applications that demand instantaneous speech synthesis, such as live voiceovers and conversational AI systems. The model also offers robust API integration, allowing developers to incorporate its features into existing workflows and platforms. This flexibility makes Eleven v3 a versatile tool for industries such as:

Gaming: Creating lifelike character voices and immersive in-game dialogues.
Film and media: Enhancing voiceovers and character-driven narratives.
Education: Generating engaging and accessible learning materials.
Accessibility: Improving digital tools for individuals with disabilities.
The combination of real-time capabilities and developer-friendly integration ensures that Eleven v3 can meet the diverse needs of professionals across multiple sectors.

Applications Across Industries

The enhanced expressiveness and realism of Eleven v3 open up a wide range of applications, particularly in creative and functional domains.

Media and entertainment: Filmmakers and game developers can use the model to create lifelike character voices, while audiobook producers can deliver more emotionally resonant narratives.
Accessibility tools: The model's ability to generate clear and expressive speech can improve digital experiences for individuals with visual impairments or other disabilities, making content more inclusive.
Customer service: Multilingual and emotionally nuanced speech capabilities can enhance automated customer support systems, providing a more human-like interaction.
Education: Eleven v3 can be used to create engaging educational content, including language learning tools and interactive lessons.

By offering a combination of emotional depth, linguistic versatility, and technical precision, Eleven v3 has the potential to transform how industries approach voice generation and communication.

Availability and Future Developments

Eleven v3 is currently available on the ElevenLabs platform, with an 80% discount on the ElevenLabs app offered until the end of June.
API access and Studio support are expected to roll out soon, with early access available through direct sales contact. For applications requiring real-time speech synthesis, ElevenLabs recommends using v2.5 Turbo or Flash until the real-time version of v3 becomes available.

Addressing Challenges and Advancing TTS Technology

Eleven v3 was designed to address the limitations of earlier models, particularly in terms of expressiveness and naturalness. By allowing lifelike and responsive speech, the model meets the needs of professionals in industries such as film, gaming, education, and accessibility.

As demand for realistic AI voice generation continues to grow, Eleven v3 represents a significant advancement in TTS technology. Its combination of emotional nuance, multilingual support, and developer-friendly integration positions it as a valuable tool for both creative and functional applications. By focusing on realism, versatility, and accessibility, Eleven v3 demonstrates the potential of AI-driven speech synthesis to enhance communication and storytelling across a wide range of industries.
