Week in Voice AI #10: Local notetaking
Who let the voice out?
I am Ivan Mehta, a consumer tech reporter at TechCrunch. For more than a year, I have covered different aspects of voice AI. This newsletter is an experimental attempt at covering what is happening in this industry, which is growing at a rapid pace.
Top News
Quill gets funding for its privacy-first meeting notetaker
In the years after the COVID pandemic, meeting notetakers have become insanely popular among both users and investors. Companies like Read AI, Granola, and Fireflies have raised millions of dollars in funding. These companies arguably shipped fast and captured user mindshare early.
Quill, a newer entrant in the market, took a different approach. It stayed quiet and focused on acquiring customers rather than doing any press. CEO Michael Daugherty, who was the chief strategy officer at AngelList, said that the company, founded in 2023, was bootstrapped for the first few years with a little money raised from friends and family.
The startup announced earlier this month that it has raised $6.5 million in funding led by Basis Set Ventures.
Quill’s focus is on making the app work on your device as much as possible. For instance, the transcription of meetings happens locally. Users also have a choice of AI providers and can pick either cloud models or local-first models.
“Our view is that in a world with AI, we want to enable people to have more control and more leverage. And that means control over where their data lives, how their AI is processed, but also control over what they do with AI. And so the first trend is that every knowledge worker, every job is becoming a manager of AI agents,” Daugherty told me over a call.
For Daugherty, the core concern is that there are tons of AI companies trying to get more data and context from users, which could be used for training or ads. In this era, Quill wants to provide more privacy controls to users. The company is also open to connecting its tools and data with other services through protocols like MCP (Model Context Protocol).
The meeting tool offers a generous free tier with local recording and transcription. You have to pay for pro tools like live meeting minutes and AI-generated action items.
Study of AI-generated voices puts models from Minimax, WellSaid, and PlayHT at the top
Vocal Image, a site that uses AI to train people in better communication, released a study involving over 10,000 participants to test out different text-to-speech (TTS) models. Notably, the study focused on how humans perceive AI voices.
The study noted that the voice model from Minimax, a China-based company, came out on top with an 86.2% approval score, followed by 85.6% for PlayHT and 82% for WellSaid Labs. Voice AI giants like Eleven Labs and Deepgram only had 74% and 68.4% approval rates, respectively. From big tech, only Microsoft made it to the top 10.
In terms of model quality, PlayHT took first place with a score of 85, followed by WellSaid and Minimax, both of which scored 81.
Vocal Image’s study said that native English speakers are more likely to detect AI-generated voices than non-native English speakers. For voice companies, this means that the latter cohort is more accepting of AI voices and gives them higher approval rates.
The report noted that users in Saudi Arabia had the highest approval rate for AI voices, at 72.5%. India and the U.S., countries with over 2,000 participants each, were fifth and sixth in the rankings with 67.1% and 66.8% approval, respectively.
Quick Bytes
Krisp launches listener-side accent conversion, which can convert your accent with on-device processing.
Speechify launches a Granola competitor for meeting note-taking, but it only works in Chrome at the moment.
Music tool Moises hires singer-songwriter Charlie Puth as its “chief music officer” to drive its AI products.
New Zealand-based company NestEdge has been facing backlash for its AI-powered voice calling tool that sounds too real.
Munich-based voice AI platform for enterprises, Voiceline, raises $11.5 million in Series A funding. The platform helps sales agents improve their performance.
Signals & Experiments
I tried Krisp’s accent-changing tech, and I get the need for it for certain calls, but to me, the tone copying didn’t work. The voice that I tested didn’t feel like my own voice. I might find it useful to convert the voice of someone with a heavy accent, but only if I can hear some words clearly. Otherwise, personally, I don’t mind asking someone to repeat themselves because I didn’t understand them.
Thank you for tuning in. Keep listening.
Have a story about call centers?
Email: voiceaiweek@gmail.com or im@ivanmehta.com


