@chikim@mastodon.social avatar

chikim

@chikim@mastodon.social

Love music, technology, accessibility! Faculty at Berklee College of Music 👨🏻‍💻🎹🐕‍🦺


vick21, to accessibility
@vick21@mastodon.social avatar

How NVDA & OSARA are empowering blind people globally - Audio described Version: https://youtube.com/watch?v=N-y3yomLLSk&si=xiibf5ZxJzrlDnES

chikim,
@chikim@mastodon.social avatar

@vick21 I understand this is not a good way to measure by any means, but let's do some math anyway! According to the WHO, there are 43M blind people globally. According to NVDA's creators, there are over 250k NVDA users in 175 countries; let's just say 300k. According to the screen reader survey, NVDA accounts for 37.7% of screen reader users. That means only about 1.85% of blind people have access to a screen reader: 0.3M / 0.377 / 43M × 100. That's very sad! :( Let me know if I epically failed this math. lol
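Spelling out the arithmetic in that formula (all inputs are the rough estimates from the post, so treat the result as a back-of-envelope figure):

```python
# Back-of-envelope check of the numbers above.
nvda_users = 300_000          # "over 250k" NVDA users, rounded up to 300k
nvda_share = 0.377            # NVDA's share among screen reader users (37.7%)
blind_worldwide = 43_000_000  # WHO estimate of blind people globally

screen_reader_users = nvda_users / nvda_share        # ~796k total screen reader users
pct = screen_reader_users / blind_worldwide * 100
print(f"{pct:.2f}%")  # prints "1.85%"
```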

chikim,
@chikim@mastodon.social avatar

@miki @pixelate @vick21 Good point! Globally 82% of blind people are over 50. If you adjust the stat to look only at people under 50, it's 11.33%. Also, I think the screen reader survey includes screen readers on mobile devices, so it's not just computers.

chikim, to random
@chikim@mastodon.social avatar

I created samples for all 58 voices for xtts-v2. Hopefully it makes it easier for someone to choose a speaker. https://we.tl/t-9vWd1gO3EN

chikim, to apple
@chikim@mastodon.social avatar

Am I the only one who hates the giant escape key on newer Macbook Pro? It pushes all the function keys to the right, and it really throws off my muscle memory! Also Logitech thinks it's a great idea on MX Keys S. I just got older Logitech MX Mini for this reason. :( #apple #logitech #keyboard

chikim, to ai
@chikim@mastodon.social avatar

lol Cheery! The best name for the ChatGPT-4o voice! Jeff Jarvis suggested it on the TWiT podcast. I agree, she's so over the top!

blindbargains, to random
@blindbargains@mastodon.social avatar

Be My Eyes Accessibility with GPT-4o https://www.youtube.com/watch?v=KwNUJ69RbwY

chikim,
@chikim@mastodon.social avatar

@FreakyFwoof @blindbargains Is GPT-4o on BeMyAI available now, or only for beta users?

chikim, to llm
@chikim@mastodon.social avatar

Interesting, the ChatGPT desktop app for the Mac will be slowly rolling out to Plus subscribers starting today, but OpenAI "plans to release a Windows version of the desktop app later this year." Maybe the rumor that Apple is closing a deal with OpenAI for ChatGPT is true... https://www.macrumors.com/2024/05/13/chatgpt-mac-app/

chikim, to llm
@chikim@mastodon.social avatar

If you missed it, check out the new GPT-4o demo. https://www.youtube.com/watch?v=DQacCB9tDaw

chikim,
@chikim@mastodon.social avatar

@bryansmart lol I'm the one who posted about Google IO, and you replied to me. haha

chikim,
@chikim@mastodon.social avatar

@bryansmart Also funny: the Google I/O event is tomorrow, and OpenAI intentionally revealed their model today. lol

chikim, to llm
@chikim@mastodon.social avatar

GPT-4o combines audio, image, and text. It can analyze actual audio, you can interrupt the voice, and it can pick up emotion from audio. You can also ask for speech in different styles, including singing! It can see images in real time and chat by voice, for example solving an equation in real time as you write it on paper. This is amazing!

chikim, to accessibility
@chikim@mastodon.social avatar

I'm late to the party, but I found out I'm with the majority! lol According to the WebAIM Screen Reader User Survey, 68.2% (779 out of 1,142) of "respondents indicate that individuals should not describe what they look like during a virtual meeting or webinar" for blind and visually impaired participants. https://www.webaxe.org/webaim-screen-reader-user-survey-10/

chikim, to ai
@chikim@mastodon.social avatar

ElevenLabs has joined AI music generation. It's not available to the public yet, but here's a demo clip. It's getting better and better! #AI #ML https://www.youtube.com/watch?v=m9DrkOrr3QM

chikim, to random
@chikim@mastodon.social avatar

Well, Logic Pro on iPad now has built-in stem separation like Demucs. lol

macrumors, to random
@macrumors@mastodon.social avatar
chikim,
@chikim@mastodon.social avatar

@macrumors Hi, do you guys have MacRumorsLive on Mastodon?

bryansmart, to random

@chikim I've really been enjoying VOLlama. Nice work! Would be nice to be able to switch between OpenAI and local models without going in to API prefs. More accelerator keys for menu options would be good, too. Could maybe a blank line be inserted in the log between each entry? Last, can you trap key-down on the Control key to stop the system voice? I know it's a hobby project, so no idea how much time you have for any of that, but just throwing them out there.

chikim,
@chikim@mastodon.social avatar

@bryansmart The alternative is to press Alt or Option+Up to go into edit mode, and it'll paste only one message at a time into the prompt field.

chikim,
@chikim@mastodon.social avatar

@bryansmart Alt or Option+Up/Down will let you edit the history context. It shows one message at a time in the prompt field and lets you edit it. If you want to just review it without editing, press Escape, or Alt/Option+Down all the way to the bottom when you're done reading.

chikim,
@chikim@mastodon.social avatar

@bryansmart Hmm, what kind of error? @vick21 also told me he got an out-of-memory error while creating embeddings.

chikim,
@chikim@mastodon.social avatar

@bryansmart @vick21 Ah, you forgot to download the embedding model: ollama pull nomic-embed-text

chikim,
@chikim@mastodon.social avatar

@bryansmart @vick21 No problem. I totally get it. I hate writing manual/user guide/readme as well as reading them. lol

chikim,
@chikim@mastodon.social avatar

@vick21 @bryansmart That's actually a great idea! VOLlama has only a few files, so I should try feeding everything in and asking it to write the manual. hahahaha

chikim,
@chikim@mastodon.social avatar

@bryansmart Did you start your question with /q?

chikim,
@chikim@mastodon.social avatar

@bryansmart Also, you can't ask general questions like "summarize" that require reading the entire thing. The way it works is it compares your question against the documents through embeddings, then retrieves and feeds a few chunks that might be relevant to answering your question.
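For what it's worth, here's a toy sketch of that embedding-comparison step. This is not VOLlama's actual code; the chunk names and vectors are made up, and a real setup would get the embeddings from a model like nomic-embed-text.

```python
import math

# Toy sketch of embedding-based retrieval: the question and each document
# chunk are vectors, and the top-k most similar chunks become context.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Pretend embeddings (in reality produced by an embedding model).
chunks = {
    "chunk about pricing": [0.9, 0.1, 0.0],
    "chunk about installation": [0.1, 0.8, 0.2],
    "chunk about licensing": [0.7, 0.3, 0.1],
}
question_vec = [0.85, 0.15, 0.05]  # embedding of the user's question

# Retrieve the top 2 chunks by cosine similarity.
top = sorted(chunks, key=lambda c: cosine(question_vec, chunks[c]), reverse=True)[:2]
print(top)  # ['chunk about pricing', 'chunk about licensing']
```

Only those retrieved chunks are fed to the chat model, which is why a "summarize the whole document" question can't work with this approach.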

chikim,
@chikim@mastodon.social avatar

@bryansmart If you want to see which chunks are fed into the model to answer your question, you can check the only checkbox in the RAG settings. lol I don't think it's labeled on Mac due to a wxWidgets bug.

chikim,
@chikim@mastodon.social avatar

@bryansmart The embedding model is only used during indexing. The quality of the answer depends on the model you're chatting with, because it reads the chunks as text and gives you the answer. It really depends on whether LlamaIndex was able to retrieve the relevant chunks or not. You can increase the number of chunks and the chunk length, but you might end up feeding in chunks that aren't related to your question. There's also a threshold you can change to filter out chunks below a certain similarity score.
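The threshold part can be sketched like this (hypothetical chunk names and scores, just to illustrate the filtering described above):

```python
# Post-retrieval filtering: drop retrieved chunks whose similarity score
# falls below the configured threshold. Scores here are made up.
retrieved = [
    ("chunk A", 0.91),
    ("chunk B", 0.62),
    ("chunk C", 0.35),
]
threshold = 0.5  # chunks scoring below this are not fed to the model

kept = [(text, score) for text, score in retrieved if score >= threshold]
print(kept)  # [('chunk A', 0.91), ('chunk B', 0.62)]
```

Raising the threshold trades recall for precision: fewer irrelevant chunks reach the model, but a relevant chunk with a mediocre score can get dropped too.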
