zeroun, to random French
@zeroun@mastodon.tedomum.net avatar

Hi,
Can anyone recommend speech recognition software for GNU/Linux?

daeinc, to genart
@daeinc@genart.social avatar

Voice coding session with AI (Google Gemini Pro)
This time with vanilla JS Canvas code + Ssam.js
🔊 (sound on)
video edited to keep it short.

#creativeCoding #createSsam #MachineLearning #AI #javaScript #SpeechRecognition

unfa, to opensource
@unfa@mastodon.social avatar

Common Voice is a project by Mozilla to build an extensive, ethically sourced dataset of spoken word in various languages to help push forward open-source voice recognition technology like DeepSpeech (also by Mozilla).

I just recorded a dozen or so sentences :)

https://commonvoice.mozilla.org/

Kiloku, to random
@Kiloku@burnthis.town avatar

Does anyone here know how to use or other speech recognition stuff?

I tried whisper.cpp but it apparently can only use OpenAI's models, so it's not an option, on ethical grounds.

I want to implement cross-platform voice commands into Open, as it currently only works in Windows with the Microsoft SAPI.

(boosts welcome)

chenzi, to linguistics

Are you looking for a Cantonese forced aligner? Check out my new #Kaldi (source) tutorial on training models of HK Cantonese: 🌟 https://chenzixu.rbind.io/resources/3asr/sr3/

I also replicated the process of training acoustic models for HK Cantonese in a streamlined MFA workflow. It is easily applicable to many other languages. Check out the MFA tutorial: 🌟 https://chenzixu.rbind.io/resources/3asr/sr4/

#linguistics #phonetics #ASR #academic #speechrecognition

KathyReid, to OpenAI
@KathyReid@aus.social avatar

Today in my web browsing history: The tech sector waxing lyrical about 's upgrades to , which include and capabilities, meanwhile on Wikipedia, which was scraped to train many LLMs, begs for donations.

All of the content we've placed online has been mined, processed, refined, and is being sold back to us.

For many of us, that's a new experience.

But I suspect that for those from the global South, it's a repeat of centuries of colonisation.

itnewsbot, to machinelearning
@itnewsbot@schleuss.online avatar

ChatGPT update enables its AI to “see, hear, and speak,” according to OpenAI

On Monday, OpenAI announced a s... - https://arstechnica.com/?p=1970737

KathyReid, to OpenAI
@KathyReid@aus.social avatar

For folks who work with , specifically from - I have heard some anecdotal evidence of transcription with the medium-en model returning paragraphs of "junk" content, like weather reports and adverts for golfing supplies.

I have three confirmed reports from transcripts of interviews of unrelated topics, and am curious if there are other (as yet unreported) instances of similar?

If so, please let me know - DM for email address.

Boosts appreciated.

schizanon, to Amazon
@schizanon@mas.to avatar

"In addition to text, uses speech cues such as tones and pitch to hone in on toxic intent in speech."

Good luck AWS, my toxicity is completely deadpan

Flag harmful language in spoken conversations with Amazon Transcribe Toxicity Detection | AWS Machine Learning Blog https://aws.amazon.com/blogs/machine-learning/flag-harmful-language-in-spoken-conversations-with-amazon-transcribe-toxicity-detection/

KathyReid, to random
@KathyReid@aus.social avatar

A little piece I spoke to, and which my colleague, Lauren Pay, wrangled into coherence - it's about the history of as a - and how the ANU School of Cybernetics can help you learn how to interrogate and shape such systems.

Written to promote the school's new short courses.

Check it out at:
https://cybernetics.anu.edu.au/news/2023/07/05/are-you-listening-to-me/

yingtai, to random
@yingtai@zirk.us avatar

I have been trying to create a whole new system of shortcuts for myself, and I tell you, it is uncommonly like casting magic spells. Like Diane Duane and Patricia Wrede's magic systems: lots of work beforehand so you can trigger the spell with one word later on.

It makes me think the Harry Potter universe must have an awful lot of freeware floating around. Utterly unacknowledged, because JK Rowling isn't great at thinking through her worldbuilding.

itnewsbot, to tech
@itnewsbot@schleuss.online avatar

11 NLP Use Cases: Putting the Language Comprehension Tech to Work - Natural Language Processing (NLP), which encompasses areas such as linguistics, co... - https://readwrite.com/11-nlp-use-cases-putting-the-language-comprehension-tech-to-work/

chris_hayes, to accessibility
@chris_hayes@fosstodon.org avatar

Dictation - Google's Project Relate looks interesting. Google has you train your voice on 500+ cards, then creates a custom model for your voice. Particularly useful for anyone with unusual speech patterns.

Announced in late 2021, the Android app was released in January 2023. It sounds like the dictation accuracy (once trained) is better than current apps (Siri, Echo, Google, Dragon).

https://abilitynet.org.uk/news-blogs/dictating-speech-challenges-testers-experience-google%E2%80%99s-project-relate-speech-app

DoomsdaysCW, to random
@DoomsdaysCW@kolektiva.social avatar

Indigenous groups in NZ, US fear colonisation as AI learns their languages

By Rina Chandran, April 03, 2023

• Generative AI models learn from mass data scraped from web
• Indigenous groups fear losing control over their data
• Some move to protect their information from commercial use

"When U.S. tech firm OpenAI rolled out Whisper, a speech recognition tool offering audio transcription and translation into English for dozens of languages including Māori, it rang alarm bells for many Indigenous New Zealanders.

"Whisper, launched in September by the company behind the ChatGPT chatbot, was trained on 680,000 hours of audio from the web, including 1,381 hours of the Māori language."

Read more: https://www.context.news/ai/nz-us-indigenous-fear-colonisation-as-bots-learn-their-languages?utm_source=pocket-newtab
