Replies


Caoimhe, to random
@Caoimhe@dragonscave.space avatar

Is it possible to read the content of an Excel cell letter by letter with NVDA?

miki,
@miki@dragonscave.space avatar

@Caoimhe Press F2, though that only works if the text comes directly from the cell, not from a formula. This trick is also pretty useful for advanced formula editing.

jscholes, to apple
@jscholes@dragonscave.space avatar

Costco had #Apple #AirTags on sale. Having never used one, I decided to buy a couple. I now have two smooth, round things on my desk that apparently don't stick or attach to anything without additional hardware, that I guess I can... put in a box that I might lose? Not really sure I understand this product.

miki,
@miki@dragonscave.space avatar

@jscholes They're great to put in a bag if you're a bag-carrying person like I am. Wallets too. You can attach them to a keychain or keyring. People who own a vehicle often put one there, in case it gets stolen or even to find it when parked.

jcsteh, to random

As I understand it, with all current LLMs, having a conversation involves feeding the model the entire conversation up to this point. That is, there is no memory: the prompt you feed it just gets longer and longer. So how does that work with something like GPT-4o, which could be processing audio and/or video at a much faster rate? Surely the prompts must get very large very quickly with anything beyond a short interaction? Doesn't that mean the responses take longer and cost more as the conversation gets longer?

miki,
@miki@dragonscave.space avatar

@chikim @jcsteh Also there might be some caching involved. The expensive operation in LLMs is attention, which needs to be calculated for every pair of tokens, and that's O(n^2). However, when we're only adding m new tokens to an already existing prompt of n tokens, we only need to calculate the new pairs, and that's just O((n+m)*m), not O((n+m)^2). Most implementations throw all those calculations away after finishing every request. This makes sense: these attention vectors take up a lot of memory and there's usually load balancing involved, so even if you make a request with the same prompt, it's probably going to hit another instance. If you have a persistent connection to a single server and it's easy to determine exactly when this connection starts and ends, it might make sense to cache, which lowers the cost considerably.
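
To put rough numbers on the caching argument, here is a minimal Python sketch that only counts token pairs. It assumes the simplified cost model from the post (every token is scored against every other token) and that each new token must be compared against all n + m tokens; the function names are hypothetical and no real inference library is involved.

```python
def full_recompute_pairs(n: int, m: int) -> int:
    """Pairs scored when the whole (n + m)-token prompt is processed from scratch."""
    total = n + m
    return total * total


def incremental_pairs(n: int, m: int) -> int:
    """Pairs scored when results for the n old tokens are cached.

    Only the m new tokens need work, and each of them is compared
    against all n + m tokens in the conversation so far.
    """
    return m * (n + m)


if __name__ == "__main__":
    n, m = 1000, 10  # illustrative sizes: a long history plus a short new turn
    print("from scratch:", full_recompute_pairs(n, m))
    print("with cache:  ", incremental_pairs(n, m))
```

With a long history and a short new turn, the cached count is smaller by roughly a factor of (n + m) / m, which is why a persistent connection to one server can make caching worthwhile.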

KathyReid, to microsoft
@KathyReid@aus.social avatar

Why does #Microsoft want to implement #Recall? It's not about images. It's about modelling what workers do on Windows, and then replacing them.

The most expensive part of a computer is the fallible feelings-filled unpredictable meat sack that operates it.

Google has YouTube, Google Photos, Maps, and a bucket load of search data, Google Analytics, advertising, as well as its data (e.g. transcriptions). And a bunch of data from Android services. From this data they can model speech, model videos and model advertising systems, and how humans respond to them.

But they can't model what people do on computers.

Amazon has Prime data, and a bucket load of compute. But no operating system data. They can build models based around e-commerce and advertising systems.

But they can't model what people do on computers.

Meta has (waves hands) enough analytics to model human behaviour in the Metaverse.

But they can't model what people do on computers.

Microsoft has GitHub.
Microsoft has LinkedIn.
Microsoft has SharePoint.
Microsoft has Teams.
Microsoft has Dynamics.
Microsoft has O365.
Microsoft has Windows telemetry data.

Microsoft can model what people do on (Windows) computers. Like fill out spreadsheets. Write emails. Synthesize web pages of research. Interact with colleagues on Teams. Create and edit documents.

Microsoft wants data so they can model what people do with operating systems.

Then replace them.

Imagine a CoPilot that doesn't just write buggy code. Imagine one that also does spreadsheets. That creates documents on SharePoint. That communicates with colleagues on Teams. That has a customer pipeline on Dynamics.

That's what Recall is about - 360 degree surveillance of the worker, to model their functions, make them fungible, replicable - and replaceable.

miki,
@miki@dragonscave.space avatar

@KathyReid The fatal flaw in this argument is the fact that Recall data stays local and isn't sent to Microsoft.

TheQuinbox, to random

Interesting observation: almost all of the blind hackers in my friend circle are bookworms, me included. I mean, some of us like audio over epub or vice versa, same with genres, but we're all bookworms. Wonder why?

miki,
@miki@dragonscave.space avatar

@TheQuinbox I feel like books for us are what movies are for others. At least for me, it takes a lot less mental effort to listen to a book (whether that be with TTS or audio) than to listen to a movie with AD. In this day and age, there's also a lot less stigma about books than there is about TV / games / social media, so you can read as much as you want, guilt-free.

miki, to random
@miki@dragonscave.space avatar

This whole Microsoft Recall thing makes me want to return to my "permanent storage of speech history" idea. Annotate it with some metadata like timestamps, app name and window title, stick it in a vector database for RAG, and some really interesting possibilities start to emerge.
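
As a rough illustration of the idea, here is a minimal Python sketch of a speech-history store with the metadata mentioned above and a similarity search for the retrieval step. Everything in it is an assumption made for illustration: the `SpeechEntry` and `SpeechHistory` names are hypothetical, the "embedding" is a toy bag-of-words vector rather than a real embedding model, and an in-memory list stands in for a vector database.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class SpeechEntry:
    text: str            # what the screen reader spoke
    timestamp: float     # seconds since the epoch
    app_name: str
    window_title: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SpeechHistory:
    """In-memory stand-in for a vector database of spoken text."""

    def __init__(self):
        self._entries = []  # list of (SpeechEntry, vector) pairs

    def add(self, entry: SpeechEntry) -> None:
        self._entries.append((entry, embed(entry.text)))

    def search(self, query: str, k: int = 3, app_name=None):
        """Return up to k entries most similar to the query.

        The optional app_name filter shows how metadata can narrow retrieval.
        """
        q = embed(query)
        scored = [
            (cosine(q, vec), entry)
            for entry, vec in self._entries
            if app_name is None or entry.app_name == app_name
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for score, entry in scored[:k] if score > 0]
```

In a RAG setup, the entries returned by `search` would be pasted into the prompt of a language model as context; that generation step is left out here.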

miki,
@miki@dragonscave.space avatar

@zersiax Yeah, this would need to be all local, maybe using OpenAI for the RAG step itself, if that.

capital, to random
@capital@scalie.zone avatar

Microsoft Recall is fucking insane.

Recall snapshots are kept on Copilot+ PCs themselves, on the local hard disk, and are protected using data encryption on your device and (if you have Windows 11 Pro or an enterprise Windows 11 SKU) BitLocker.

You're doing what? Microsoft wh-

Recall uses Copilot+ PC advanced processing capabilities to take images of your active screen every few seconds. [...]

[...] The default allocation for Recall on a device with 256 GB will be 25 GB, which can store approximately 3 months of snapshots. [...]

WHAT WHY NO ST-

Note that Recall does not perform content moderation. It will not hide information such as passwords or financial account numbers. That data may be in snapshots that are stored on your device, especially when sites do not follow standard internet protocols like cloaking password entry.

Microsoft please... th-the tech support scams... think about what happens if this gets bre-

Recall also does not take snapshots of certain kinds of content, including InPrivate web browsing sessions...

Oh, okay I guess that's san-

...in Microsoft Edge.

AAAAAAAAAAAAAAAAAAAAAAAA

It treats material protected with digital rights management (DRM) similarly; like other Windows apps such as the Snipping Tool, Recall will not store DRM content.

Ah, but of course. The DRM is protected...

miki,
@miki@dragonscave.space avatar

@capital The fact that DRM is protected isn't some kind of evil / malicious scheme by Microsoft, it's just how Windows (and literally all other systems, Linux included) works. No app, whether Microsoft or third-party, is allowed to touch that data.

miki,
@miki@dragonscave.space avatar

@minneyar @capital Passwords are already protected, how would they detect credit card numbers?
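
For illustration only, here is a Python sketch of one heuristic a screenshot filter could try: find runs of 13 to 19 digits and keep those that pass the Luhn checksum used by payment card numbers. Nothing here reflects how Recall works; the function names are hypothetical. The heuristic also shows why the question is a fair one: it flags any digit run that happens to pass the checksum, and it misses numbers split across fields or rendered as images.

```python
import re

# A digit, then 12 to 18 more digits, each optionally preceded by a space or hyphen.
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_like_numbers(text: str) -> list:
    """Return digit runs in text that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_LIKE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```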

DavidGoldfield, to random

For users of Voice Dream Reader: I found an extremely serious bug in v4.34.4 of VDR. I have reported the following bug to the VDR mailing list. I'm very confident that it will be fixed and so my intent in posting this on Mastodon is not to start a VDR bashing session but just to make other users aware of this bug. Here is what I sent to the list.

Hello. In this latest version of VDR I am no longer able to successfully import any files from OneDrive via the file browser.

Steps to Reproduce

  1. With VoiceOver enabled, double-tap the Add button.
  2. Choose File Browser.
  3. Choose a file with a recognized file format: audio, text, Word document, etc. You will hear that your file has been added.
  4. Select the Done button.
  5. In the VDR library, move focus to the file that you tried to import. VDR sees the file but reports it as being “unavailable.”

If this bug is reproducible, I consider it to be an extremely high-priority issue as I am no longer able to import any new files into VDR.

miki,
@miki@dragonscave.space avatar

@DavidGoldfield Does importing through the Files app work? What about sharing from OneDrive directly?

simon, to random

Does anybody know how AI music creation services like Suno actually work? I assume they are trained on actual music, but I also suspect that polyphonic music with lyrics is far too complex for an AI to replicate. I think it must consist of several models, each of them generating specific things that then get reassembled into a song. It might even be some kind of fancy MIDI generator with lyrics and voices. I really doubt it just cranks out a whole entire song in one go.

miki,
@miki@dragonscave.space avatar

@simon @ivan_soto Judging by the terminology they use and by what those services can and can't do, I strongly suspect it's a diffusion model. Something like Stable Diffusion, but trained on audio spectra, not images. Those models do really well at creating extremely realistic-looking images, so I wouldn't be too surprised if they were also capable of generating extremely realistic audio spectra. There's even precedent in the open source space: the Riffusion model works this way, although it's nowhere near as good.

scottjenson, to LLMs
@scottjenson@social.coop avatar

Saying "LLMs will eventually do every job" is a bit like:

  1. Seeing Wifi wireless data
  2. Then predicting "Wireless" Power saws (no electrical cord or battery) are just around the corner

It's a misapplication of the tech. You need to understand how LLMs work and extrapolate that capability. It's all text, people. Summarizing, collating, template matching. All fair game. But stray outside of that box and things get much harder.

miki,
@miki@dragonscave.space avatar

@scottjenson Saying "we will one day be able to fly from New York to Paris" is like seeing the contraption that the Wright brothers have just designed and extrapolating a jet engine.

talon, to random
@talon@dragonscave.space avatar

In C# you can do:
someVar is > 10 and < 100 and not 60
and I think that's beautiful.

miki,
@miki@dragonscave.space avatar

@x0 @talon Not a C# expert by any means, but I think this is just pattern matching. Pattern matching definitely is cool though.

miki,
@miki@dragonscave.space avatar

@x0 @talon That's functional programming for you. Read "Structure and Interpretation of Computer Programs" if you're into that sort of stuff. I never actually fully finished it myself, but the parts I managed to read were already quite fascinating.

vick21, to accessibility
@vick21@mastodon.social avatar

How NVDA & OSARA are empowering blind people globally - Audio described Version: https://youtube.com/watch?v=N-y3yomLLSk&si=xiibf5ZxJzrlDnES

miki,
@miki@dragonscave.space avatar

@vick21 @chikim @pixelate The WebAIM survey not being translated to other languages makes it pretty much unrepresentative of anything, in my opinion.

Not that it's the only problem with it by any means, it's just the largest one.

Caoimhe, to random
@Caoimhe@dragonscave.space avatar

I didn't know it was so difficult to find a keyboard with the ANSI layout in Europe. I thought those were more common. Or is it just me who doesn't find the ISO layout very convenient?

miki,
@miki@dragonscave.space avatar

@x0 @Caoimhe This created problems with NVDA for years, because on the Polish layout ctrl+alt+n types an accented letter that we use pretty often, but the NVDA desktop shortcut took that over. They handle this in the installer now: if it detects that your system is set to Polish, it uses ctrl+alt+d instead.
