Lol, folks. Listen to your article before you post it. Doesn't matter what voice. You'll catch things like this from macrumors.com. In the app's settings (accessed via ChatPGT ➝ Settings… in the menu bar when the app's main window ...
A demo of a #spiel sample app. The voices used, in order, are eSpeakNG's "Andy" variant, MBROLA US2, and Piper's Amy. You can observe the different features, like word tracking and quality. #speech #tts #gnome #linux
Honestly, since the fast variants of the voices are a thing, I think I could really switch to the Sonata Neural Voices in NVDA full time. Now remember folks, these are AI voices. Scary, untrustworthy, AI voices that will smear your reputation all over fedi for using these voices! See, they even react to exclamation marks! Isn't that scary? :) Nah, the worst that'll happen, mainly with the HFC male and female voices, is that big numbers get garbled together. But every other voice does fine. I use Amy for work, and HFC for reading, because those are among the most lively voices I've ever heard. And amazingly enough, we can make our own new voices. According to the GitHub repo's README, some people are building more professional voices. And there are already versions of old TTS engines from the past that have been brought back to some semblance of life with this tech.
OpenAI debuts Voice Engine, which lets users generate a synthetic copy of a voice from a 15-second sample. It is available to around 100 partners, including HeyGe. In other words, it's not available to the public just yet.
Maybe we have an open source competitor for ElevenLabs? Check out their demo, in which they switch between the original and the synthesized voice. I can't tell the difference. lol Apparently they're going to fully open source the codebase and model weights. #TTS #AI #ML https://jasonppy.github.io/VoiceCraft_web/
To clear up the hashtags a little bit:
Think of the components of a voice assistant / smart speaker.
You need #stt (speech-to-text) or #asr (automatic speech recognition) on the "input" side of a user request and #tts (text-to-speech) on the "output" side.
To throw in another technology - #nlp (natural language processing) is used in the "middle" to really understand what the user request is all about.
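The three stages above can be sketched as a tiny pipeline. This is only an illustration under loud assumptions: every function here is a hypothetical stand-in (the "audio" is just UTF-8 text and the "NLP" is a keyword match), not the API of any real STT, NLP, or TTS engine.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    name: str       # what the user wants, e.g. "get_time"
    utterance: str  # the transcribed text the intent was derived from


def speech_to_text(audio: bytes) -> str:
    """Hypothetical STT/ASR stage: here the 'audio' is simply UTF-8 text."""
    return audio.decode("utf-8")


def understand(text: str) -> Intent:
    """Hypothetical NLP stage: a trivial keyword match instead of a real model."""
    name = "get_time" if "time" in text.lower() else "unknown"
    return Intent(name=name, utterance=text)


def respond(intent: Intent) -> str:
    """Pick a reply for the recognized intent."""
    if intent.name == "get_time":
        return "I would look up the time now."
    return "Sorry, I did not understand that."


def text_to_speech(text: str) -> bytes:
    """Hypothetical TTS stage: returns bytes where a real engine would return audio."""
    return text.encode("utf-8")


def handle_request(audio: bytes) -> bytes:
    """Input side (STT) -> middle (NLP) -> output side (TTS)."""
    text = speech_to_text(audio)
    intent = understand(text)
    reply = respond(intent)
    return text_to_speech(reply)
```

Calling `handle_request(b"What time is it?")` runs the request through all three stages in the order described: #stt on the way in, #nlp in the middle, #tts on the way out.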
We need a D-Bus interface to get a system-wide Text To Speech provider, and Flatpak apps should be able to register themselves as TTS providers.
In GNOME Settings there should be an option to disable the current TTS provider, open its settings, or switch to another one, similar to how Android manages multiple keyboards, which you can install from the Play Store.
The same goes for Speech To Text. You should be able to install your favorite STT provider, with your preferred voice, from the store.
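To make the idea concrete, here is a rough in-process model of the behaviour being asked for: providers register themselves, and a settings panel can switch or disable the active one. It deliberately does not use D-Bus or Flatpak at all; `TTSRegistry` and its method names are made up for illustration and are not an existing GNOME or D-Bus API.

```python
from typing import Callable, Dict, Optional

# A provider is modelled as a function from text to some "spoken" result.
Speak = Callable[[str], str]


class TTSRegistry:
    """Hypothetical system-wide registry of TTS providers (illustration only)."""

    def __init__(self) -> None:
        self._providers: Dict[str, Speak] = {}
        self._active: Optional[str] = None

    def register(self, name: str, speak: Speak) -> None:
        """Called by an app that wants to offer itself as a TTS provider."""
        self._providers[name] = speak
        if self._active is None:
            # The first provider to register becomes the default.
            self._active = name

    def switch_to(self, name: str) -> None:
        """What a 'switch to another provider' option in settings would do."""
        if name not in self._providers:
            raise KeyError(f"no such provider: {name}")
        self._active = name

    def disable(self) -> None:
        """What a 'disable the current provider' option in settings would do."""
        self._active = None

    def speak(self, text: str) -> Optional[str]:
        """Route a request to the active provider; None if TTS is disabled."""
        if self._active is None:
            return None
        return self._providers[self._active](text)
```

A real implementation would expose `register`, `switch_to`, and `disable` over the session bus instead of as Python methods, but the state it has to track is about this small.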
My smartphone's ROM has no text-to-speech (#TTS) system. Geovelo prompts me to download... the #Google one (Speech Recognition & Synthesis).
DON'T WANT TO. No trust left.
Do you know of a free/libre TTS system for Android (for LineageOS, for example) that supports 🇨🇵🇬🇧, or even 🇪🇸🇵🇹?
Another interesting #TTS #AI system. I need to look closer into it in order to see if it's a voice cloning approach or something else: https://github.com/yl4579/StyleTTS2