jonny, to DuckDuckGo
@jonny@neuromatch.social avatar

I'm as anti-"AI" as the next person, but I think it's important to keep in mind the larger strategic picture of "AI" when it comes to DuckDuckGo vs. Google - both have the problem of inaccurate information, mining the commons, etc. But Google's use of LLMs in search is specifically a bid to cut the rest of the internet out of information retrieval and treat it merely as a source of training data - replacing traditional search with LLM search. That includes a whole ecosystem of surveillance and enclosure of information systems including assistants, Chrome, Android, Google Drive/Docs et al., and other vectors.

DuckDuckGo simply doesn't have the same market position to do that, and their system is set up as just an allegedly privacy-preserving proxy. So while I think more new search engines are good and healthy, and LLM search is bad and doesn't work, I think we should keep the bigger picture in mind to avoid being reactionary, and I don't think the mere presence of LLM search is a good reason to stop using it.

More here: https://jon-e.net/surveillance-graphs/#the-near-future-of-surveillance-capitalism-knowledge-graphs-get-chatbots

jonny,
@jonny@neuromatch.social avatar

@plaidtron3000
Totally. With this I assume it's just trying to keep rough feature parity for ppl who think AI search is good, and with this and other recent moves I also wonder how much their hand is forced by their relationship with Microsoft via Bing. Like I said, I think more search engines are good, but as usual it's good to have something you can recommend to someone who is not a tech person at all - a normal website they can use everywhere they use Google.

jonny,
@jonny@neuromatch.social avatar

@plaidtron3000
TBC I am not a DDG diehard, I just think it's the least bad option in the general-use search engine space ATM. Though I also love SearXNG and think distributed social bookmarking and indexing is the way forward.

jonny, to LLMs
@jonny@neuromatch.social avatar

Seeing people praise LLMs for finally getting rid of hallucinations through simple RAG techniques of checking for reality in eg. citations. This moment - where a lot of the trivial claims against LLMs stopped being true, but the deeper harms of surveillance and information monopoly remained - was inevitable, and it is the chief danger of dismissing them as "fancy autocomplete." That is why I wrote this almost a year ago, as a warning of what comes next and what we can do about it: https://jon-e.net/surveillance-graphs/

datarama,
@datarama@hachyderm.io avatar

@jonny A footnote: RAG mitigates hallucinations, but it doesn't eliminate them.

I've had one RAG system claim that encrypting my hard drive would protect against data loss if the power gets cut. It even gave a citation (which said nothing of the sort). Another invented a bunch of non-existent functionality in a piece of software (referencing a manual that didn't support its claims).

WordPress, Amazon, and MDN have deployed RAG systems that also still made shit up.
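
A minimal sketch helps pin down what the "reality check" in these RAG systems usually amounts to, and why the failures above slip through. All names and the retrieval scheme here are invented for illustration; no deployed system is this simple:

```python
# Toy retrieval-augmented generation scaffold with a naive citation check.
from dataclasses import dataclass

@dataclass
class Passage:
    doc_id: str
    text: str

def retrieve(query: str, index: list[Passage], k: int = 3) -> list[Passage]:
    """Toy lexical retrieval: rank passages by word overlap with the query."""
    words = set(query.lower().split())
    return sorted(index, key=lambda p: -len(words & set(p.text.lower().split())))[:k]

def citation_resolves(cited_doc_id: str, retrieved: list[Passage]) -> bool:
    """The cheap check: the citation points at a real retrieved document.
    It does NOT check whether that document supports the claim the model
    attached it to - which is exactly the failure mode described above."""
    return any(p.doc_id == cited_doc_id for p in retrieved)
```

A citation can resolve perfectly and still say "nothing of the sort": verifying entailment between claim and source is the part RAG does not get for free.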

jonny,
@jonny@neuromatch.social avatar

@datarama
Yes of course. I am not suggesting it actually works https://neuromatch.social/@jonny/111850288640838937

jonny, to ai
@jonny@neuromatch.social avatar

Molly White is right as usual: "We’ve already tried out having a tech industry led by a bunch of techno-utopianists and those who think they can reduce everything to markets and equations. Let’s try something new, and not just give new names to the old."

trying to articulate new ideologies for computing is where my mind has been at the last few years too. i joke about the 'anti-perf manifesto,' but forging imaginaries that can run on computers that are actively antagonistic to the techno-utopians is all about killing myths of heroism where we are the someone else who goes out and "brings home the spoils." how do we reach a computing that isn't foundationally based on asymmetric power - we serfs at the mercy of the lord of the platform, and vice versa, we altruistic platform providers building things the commoners couldn't possibly understand? The language of "scale," where one or a few services need to expand to provide for millions, hides futures where we can provide for each other horizontally in overlapping quilts of dozens, hundreds. You could shorthand the "AI" boom as the continuation of the information conglomerates trying to provide the everything platform, and if our dreams are to meaningfully challenge theirs we can't also aspire to simply "do what they're doing, except it's us doing it."

I tried to articulate this as the cloud orthodoxy vs. a still-nebulous idea i've landed on as vulgarity in computing, but i'll probably be orbiting this idea for as long as i am online.

re: @molly0xfff
https://hachyderm.io/@molly0xfff/111475137431905986
and
https://newsletter.mollywhite.net/p/effective-obfuscation

The world is asymmetrical and hierarchical. I am a consumer, a user and I trade my power to a developer or platform owner in exchange for convenience. The purpose of the internet is for platform holders to provide services to users. As a user I have a right to speak with the manager, but do not have a right to decide which services are provided or how. As a platform owner I have a right to demand whatever the users will give me in exchange for my services. Services are rented or given away freely rather than sold because to the user the product is convenience rather than software. Powerlessness is a feature: users don’t need to learn anything, and platform owners can freely experiment on users to optimize their experience without their knowledge. Information is asymmetrical in multiple ways: platforms collect and hold more information than the users can have and parcel it back out as services. But also, platform holders are the only ones who know how to create their services, and so they are responsible for the convenience prescribed for a platform but not the convenience of users understanding how to make the platform themselves.
Our infrastructures are social. There is no class distinction between “developer” and “user.” We resist concentrated power in favor of mutual empowerment. We don’t seek to cultivate dependence in councils of elders or create new chokepoints of control. Anything worth making is a potential source of power, so anything worth making is worth distributing governance of. We don’t assume the needs of others, but make tools to empower everyone to meet their own needs. We don’t make platforms, we make protocols with rough consensus based on what works. We are autonomous, but neither isolated nor selfish. Our dream is not one of solipsism, glued to our feed, being stuffed with the pellets of our social reality. We are radically responsible for one another, and by organizing together we can provide services as mutual aid. Mutual empowerment means that we are free to come and go as we please, even if we might be missed. We have no love for venerated institutions and organize fluidly, making systems so we can merge and fork code and ourselves freely [223, 224].

jonny,
@jonny@neuromatch.social avatar

@mauve yeah agreed. I think most wiki engines went the route of "let's be the everything-app platform" (XWiki is explicitly this) rather than "let's distill something core about the wiki model and make it wildly interoperable" - most of what I have reused from that package is the interface to MediaWiki, whose API is mysteriously godawful. I think if they looked more like ways to link a bunch of subpage chunks from different mediums together they would be a lot more interesting, and a way of bridging interfaces like Discord and FB by writing adapters that represent their slots and verbs.
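
One way to read the "slots and verbs" adapter idea as code - a hypothetical sketch, with no existing wiki or chat API behind any of these names:

```python
# Hypothetical medium adapters: each medium exposes named slots (places
# content can live) and verbs (actions it supports); a bridge works
# purely in those terms.
from typing import Protocol

class MediumAdapter(Protocol):
    def slots(self) -> list[str]:
        """Addressable locations: a wiki subpage, a channel, a thread."""
        ...

    def verbs(self) -> list[str]:
        """Actions the medium supports: 'post', 'edit', 'link', 'react'."""
        ...

    def perform(self, verb: str, slot: str, payload: str) -> None:
        """Execute a supported verb against a slot."""
        ...

def bridge(src: MediumAdapter, dst: MediumAdapter, slot: str, payload: str) -> None:
    # The bridge only forwards verbs both ends understand, so a wiki
    # 'edit' and a chat 'post' can interoperate where they overlap.
    shared = set(src.verbs()) & set(dst.verbs())
    if "post" in shared and slot in dst.slots():
        dst.perform("post", slot, payload)
```

The point of the shape is that adding a new medium means writing one adapter, not one integration per pair of mediums.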

jonny,
@jonny@neuromatch.social avatar

@mauve it's a shame the matrix bridges are so promising and yet so limiting. The double-puppeting stuff is cool but extremely underused when it's just doing 1:1 mirrors of channels across mediums - it was easier to abandon matrix bridges altogether than it was to do an all-to-all bridge between multiple channels across Slack and Discord.
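
A sketch of the all-to-all topology that the 1:1 bridge model makes awkward - the class and hooks here are invented for illustration, not part of any Matrix bridge:

```python
# Hypothetical all-to-all relay: any message in a bridged group fans
# out to every other channel in the group, whatever its medium.
from collections import defaultdict
from typing import Callable

class AllToAllBridge:
    def __init__(self) -> None:
        # group name -> {channel id: function that posts into that channel}
        self.groups: dict[str, dict[str, Callable[[str], None]]] = defaultdict(dict)

    def join(self, group: str, channel_id: str, post: Callable[[str], None]) -> None:
        """Register a channel (Slack, Discord, Matrix...) via its posting hook."""
        self.groups[group][channel_id] = post

    def relay(self, group: str, origin: str, payload: str) -> None:
        # One relay per message replaces O(n^2) pairwise 1:1 mirrors:
        # deliver to every member of the group except where it came from.
        for channel_id, post in self.groups[group].items():
            if channel_id != origin:
                post(payload)
```

Joining a Slack channel and two Discord channels to the same group then takes three join calls rather than three separate pairwise bridges.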

jonny, to Amazon
@jonny@neuromatch.social avatar

Amazon releases details on its Alexa LLM, which will use its constant surveillance data to "personalize" the model. Like Google, they're moving away from wakewords towards being able to trigger Alexa contextually - when the assistant "thinks" it should be responding, which of course requires continual processing of speech for content, not just for a wakeword.

The consumer page suggests user data is "training" the model, but the developer page describes exactly the augmented-LLM, iterative generation process, grounded in a personal knowledge graph, that Microsoft, Facebook, and Google all describe as the next step in LLM tech.

https://developer.amazon.com/en-US/blogs/alexa/alexa-skills-kit/2023/09/alexa-llm-fall-devices-services-sep-2023

We can no longer think of LLMs on their own when we consider these technologies; that era was brief and has passed. I've been waving my arms up and down about this since chatGPT was released - criticisms of LLMs that stop short at their current form, arguing about whether the language models themselves can "understand" language, miss the bigger picture of what they are intended for. These are surveillance technologies that act as interfaces to knowledge graphs and external services, putting a human voice on whole-life surveillance.

https://jon-e.net/surveillance-graphs/#the-near-future-of-surveillance-capitalism-knowledge-graphs-get-chatbots

Interest in these multipart systems is widespread, and arguably the norm: A group of Meta researchers describe these multipart systems as “Augmented Language Models” and highlight their promise as a way of “moving away from language modeling” [190]. Google’s reimaginations of search also make repeated reference to interactions with knowledge graphs and other systems [184]. A review of knowledge graphs with authors from Meta, JPMorgan Chase, and Microsoft describes a consensus view that knowledge graphs are essential to compositional behavior in AI [5]. Researchers from DeepMind (owned by Google) argue that research focus should move away from simply training larger and larger models towards “inference-time compute,” meaning querying the internet or other information sources [191].
The immersive and proactive design of KG-LLM assistants also expands the expectations of surveillance. Current assistant design is based around specific hotwords: unless someone explicitly invokes it, the expectation is that it shouldn’t be listening. Like the shift in algorithmic policing from reactive to predictive systems, these systems are designed to be able to make use of recent context to actively make recommendations without an explicit query. Google demonstrates being able to interact with an assistant by making eye contact with a camera in its 2022 I/O keynote [194]. A 2022 Google patent describes a system for continuously monitoring multiple sensors to estimate the level of intended interaction with the assistant, to calibrate whether it should respond and with what detail. The patent includes examples like observing someone with multiple sensors as they ask aloud “what is making that noise?” and look around the room, indicating an implicit intention of interacting with the assistant so it can volunteer information without explicit invocation [201]. A 2021 Amazon patent describes an assistant listening for infra- and ultrasonic tags in TV ads so that if someone asks how much a new bike costs after seeing an ad for a bike, the assistant knows to provide the cost of that specific bike [202]. These UX changes encourage us to accept truly continual surveillance in the name of convenience — it’s good to be monitored so I can ask google “what time is the game.”
This pattern of interaction with assistants is also considerably more intimate. As noted by the Stochastic Parrots authors, the misperception of animacy in assistants that mimic human language is a dangerous invitation to trust them as one would another person — and with details like Google’s assistant “telling you how it is feeling,” these companies seem eager to exploit it. A more violent source of trust prominently exploited by Amazon is insinuating a state of continual threat and selling products to keep you safe: its subsidiary Ring’s advertising material is dripping with fantasies of security and fear, and its doglike robot Astro and literal surveillance drone are advertised as trusted companions who can patrol your home while you are away [203, 204, 205]. Amazon patents describe systems for using the emotional content of speech to personalize recommendations and systems for being able to “target campaigns to users when they are in the most receptive state to targeted advertisements” [206, 207]. The presentation of assistants as always-present across apps, embodied in helpful robots, or as other people, eg. by being present in a contact list, positions them to take advantage of people in emotionally vulnerable moments. Researchers from the Center for Humane Technology describe an instance where Snapchat’s “My AI,” accessible from its normal chat interface, encouraged a minor to have a sexual encounter with an adult they met on Snapchat (47:10 in [208]).
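
To make the continuous intent-estimation in the Google patent above concrete, here is a deliberately simplified sketch; the signals, weights, and threshold are all invented for illustration and come from no actual patent or product:

```python
# Toy fusion of always-on sensor signals into an "interaction intent"
# score that decides whether an assistant answers without a wakeword.

def interaction_intent(gaze_at_device: float,
                       speech_is_question: float,
                       recent_engagement: float) -> float:
    """Each input is a 0..1 confidence from a separate always-on model."""
    weights = (0.5, 0.3, 0.2)  # arbitrary illustrative weights
    signals = (gaze_at_device, speech_is_question, recent_engagement)
    return sum(w * s for w, s in zip(weights, signals))

RESPOND_THRESHOLD = 0.6  # hypothetical calibration point

def should_respond(score: float) -> bool:
    # No wakeword anywhere: the decision to answer is made from
    # continuously collected context, which is the expansion at issue.
    return score >= RESPOND_THRESHOLD
```

Even this toy version requires every input to be computed all the time - the surveillance is in the inputs, not the threshold.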

lkngrrr,
@lkngrrr@hachyderm.io avatar

@jonny @ewhac The patents on beamforming were all about steering the mic arrays. Our brains and ears do a lot of directional filtering that computers just can't do out of the box.

And yep, manual annotation is the only way to build a golden set that you trust, especially on such a wide-ranging data set!

And my apologies if I came off heavy handed, I forgot I’d moved Alexa out of my profile and robbed you of that context.

VE2UWY,
@VE2UWY@mastodon.radio avatar

@jonny

So Alexa (and others, surely) went from Always Listening, hoping to hear a magic word by interpreting every sound it hears at HQ, to Always Listening, planning to insert itself into the conversation when commercial opportunities avail themselves.

When you talk to a family member about this pain you've been having, will the underpaid contract Amazon driver show up with aspirin or do you have to opt out?

jonny, to random
@jonny@neuromatch.social avatar

The NYTimes story on AI writing the news is a story about the repackaging of the knowledge graph. The language model is just an interface. The repackaging as an assistant, the examples of broken factboxes, the sale as a labor-saving device, "we don't intend to replace your writers, we want to give you more convenient access to factual information" - here's a piece that should help make sense of that.

https://jon-e.net/surveillance-graphs/#the-lens-of-search-re-centers-our-focus-away-from-the-generative

The lens of search re-centers our focus away from the generative capabilities of LLMs towards parsing natural language: one of the foundations of contemporary search and what information giants like Google have spent the last 20 years building. The context of knowledge graphs that span public “factual” information with private “personal” information gives further form to their future. The Microsoft Copilot model above is one high-level example of the intended architecture: LLMs parse natural language queries, conditioned by factual and personal information within a knowledge graph, into computer-readable commands like API calls or other interactions with external applications, which can then have their output translated back into natural language as generated by the LLM. Facebook AI researchers describe another “reason first, then respond” system that is more specifically designed to tune answers to questions with factual knowledge graphs [189]. The LLM being able to “understand” the query is irrelevant; it merely serves as a natural language interface to other systems.
Historically, these personal assistants have worked badly and are rightly distrusted by many due to the obvious privacy violation represented by a device constantly recording ambient audio. Impacts from shifts in assistants might then be limited by people simply continuing to not use them. Knowledge graph-powered LLMs appear to be a catalyst in shifting the form of these assistants to make them more difficult to avoid. There is already a clear push to merge assistants with search — eg. Bing Search powered by chatGPT, and Google has merged its Assistant team with the team that is working on its LLM search, Bard [199]. Microsoft’s Copilot 365 demo also shows an LLM prompt modeled as an assistant integrated as a first-class interface feature in its Office products. Google’s 2022 I/O Keynote switches fluidly between a search-like, document-like, and voice interface with its assistant. Combined with the restructuring of app ecosystems to more tightly integrate with assistants, their emerging form appears to look less like a traditional voice assistant and more like a combined search, app launcher, and assistant underlay that is continuous across devices. The intention is to make the assistant the primary means of interacting with apps and other digital systems. As with many stretches of the enclosure of the web, UX design is used as a mechanism to coerce patterns of expectation and behavior.
Regardless of how well this new iteration of assistants works, the intention of their design is to dramatically deepen the intimacy and intensity of surveillance and further consolidate the means of information access.
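
A high-level sketch of the Copilot-style loop described above may be useful: the LLM parses a natural-language query into a structured command, the knowledge graph executes it, and the LLM verbalizes the result. `llm` and `kg` here are hypothetical stand-ins, not any vendor's API:

```python
# The KG-LLM interface pattern, reduced to its skeleton.

def answer(query: str, llm, kg) -> str:
    # 1. Natural language -> machine-readable command (an API call,
    #    a graph query), conditioned on the user's personal graph.
    command = llm.parse_to_command(query)
    # 2. The knowledge graph - public "facts" merged with private
    #    "personal" data - does the actual retrieval or action.
    result = kg.execute(command)
    # 3. Structured result -> natural language. Whether the LLM
    #    "understands" anything never enters the loop.
    return llm.verbalize(query, result)
```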

jonny,
@jonny@neuromatch.social avatar

The rewriting-titles idea is perfectly in line with what they discuss in their investor calls in the context of advertising. It's a natural move if you see the LLMs as scope-limited enterprise tools that are intended to hook companies into dependence on their information access systems (consolidation of power) and hook people into them as the means of interacting with an ecosystem of apps, commerce, etc. (intimacy of surveillance).

The debate about whether the LLMs are sentient is not serving us well. It's true, of course they aren't sentient, but at this point it's obscuring more of the truth of the strategy than it is inoculating us against it. Whether the LLMs are sentient is irrelevant because the plan was never to just continue to use the LLMs on their own. They are interfaces to other systems, and can be presented as tools that can be conditioned by "factual information."

They won't work as advertised, of course, but we have to be very clear about the threat:
The threat is not that LLMs will write the news. That's already happening, do any search.
The threat is that the LLMs will be used to leverage greater control over our access to information by destabilizing our already fragile information ecosystem and presenting themselves as precisely not sentient, but handy assistants to interact with trusted databases - the last trustable sources of information left.

The addition of context-optimized clickbait headers for those willing to pay to be the brand beneath them is just an especially cynical product to sell to whichever suckers are desperate enough to buy it.

https://jon-e.net/surveillance-graphs/#the-most-obvious-power-grab-from-pushing-kg-llms-in-place-of-sea

jonny, to random
@jonny@neuromatch.social avatar

Glad to formally release my latest work - Surveillance Graphs: Vulgarity and Cloud Orthodoxy in Linked Data Infrastructures.

web: https://jon-e.net/surveillance-graphs
hcommons: https://doi.org/10.17613/syv8-cp10

A bit of an overview and then I'll get into some of the more specific arguments in a thread:

This piece is in three parts:

First I trace the mutation of the liberatory ambitions of the semantic web into knowledge graphs, an underappreciated component in the architecture of surveillance capitalism. This mutation plays out against the backdrop of the broader platform capture of the web, rendering us as consumer-users of information services rather than empowered people communicating over informational protocols.

I then show how this platform logic influences two contemporary public information infrastructure projects: the NIH's Biomedical Data Translator and the NSF's Open Knowledge Network. I argue that projects like these, while well intentioned, demonstrate the fundamental limitations of platformatized public infrastructure and create new capacities for harm by their enmeshment in and inevitable capture by information conglomerates. The dream of a seamless "knowledge graph of everything" is unlikely to deliver on the utopian promises made by techno-solutionists, but these projects do create new opportunities for algorithmic oppression -- automated conversion therapy, predictive policing, abuse of bureaucracy in "smart cities," etc. Given the framing of corporate knowledge graphs, these projects are poised to create facilitating technologies (that the info conglomerates write about needing themselves) for a new kind of interoperable corporate data infrastructure, where a gradient of public to private information is traded between "open" and quasi-proprietary knowledge graphs to power derivative platforms and services.

When approaching "AI" from the perspective of the semantic web and knowledge graphs, it becomes apparent that the new generation of are intended to serve as interfaces to knowledge graphs. These "augmented language models" are joint systems that combine a language model as a means of interacting with some underlying knowledge graph, integrated in multiple places in the computing ecosystem: eg. mobile apps, assistants, search, and enterprise platforms. I concretize and extend prior criticism about the capacity for LLMs to concentrate power by capturing access to information in increasingly isolated platforms and expand surveillance by creating the demand for extended personalized data graphs across multiple systems from home surveillance to your workplace, medical, and governmental data.

I pose Vulgar Linked Data as an alternative to the infrastructural pattern I call the Cloud Orthodoxy: rather than platforms operated by an informational priesthood, reorienting our public infrastructure efforts to support vernacular expression across heterogeneous mediums. This piece extends a prior work of mine, Decentralized Infrastructure for (Neuro)science, which has a more complete draft of what that might look like.

(I don't think you can pre-write threads on masto, so i'll post some thoughts as I write them under this) /1

jonny,
@jonny@neuromatch.social avatar

Though the aims of the projects themselves dip into the colonial dream of the great graph of everything, the true harms for both of these projects come from what happens with the technologies after they end. Many information conglomerates are poised to pounce on the infrastructures built by the NIH and NSF projects, stepping in to integrate their work or buy the startups that spin off from them.

The NSF's Open Knowledge Network is much more explicitly bound to the national security and economic interests of the US federal government, intended to provide the infrastructure to power an "AI-driven future." That project is at a much earlier stage, but in its early sketches it promises to take the same patterns of knowledge-graphs plus algorithmic platforms and apply them to government, law enforcement, and a broad range of other domains.

This pattern of public graphs for private profits is well underway at existing companies like Google, and I assume the academics and engineers in both of these projects are operating with the best of intentions and perhaps playing a role they are unaware of.

/6

ronent,

@tkuhn @bengo @photocyte @jonny @knowledgepixels
Love this thread synthesizing so many cool directions! Adding our own perspective to the mix (along with @InferenceActive on birdsite):
https://osf.io/preprints/metaarxiv/9nb3u/
TL;DR
Trying to make the case that attention/sensemaking data (eg what researchers are attending to and their assessments of content) are an important kind of nano-scientific knowledge that gets extracted by platforms instead of helping to power content curation and discovery networks

jonny, to random
@jonny@neuromatch.social avatar

freaking finally

Title page for the same.
Author: Jonny L. Saunders, UCLA - Department of Neurology, Institute of Pirate Technology.
Abstract: Information is power, and that power has been largely enclosed by a handful of information conglomerates. The logic of the surveillance‐driven information economy demands systems for handling mass quantities of heterogeneous data, increasingly in the form of knowledge graphs. An archaeology of knowledge graphs and their mutation from the liberatory aspirations of the semantic web gives us an underexplored lens to understand contemporary information systems. I explore how the ideology of cloud systems steers two projects from the NIH and NSF intended to build information infrastructures for the public good to inevitable corporate capture, facilitating the development of a new kind of multilayered public/private surveillance system in the process. I argue that understanding technologies like large language models as interfaces to knowledge graphs is critical to understand their role in a larger project of informational enclosure and concentration of power. I draw from multiple histories of liberatory information technologies to develop Vulgar Linked Data as an alternative to the Cloud Orthodoxy, resisting the colonial urge for universality in favor of vernacular expression in peer to peer systems.
Original Publication: May 3rd, 2023.
Document source: https://github.com/sneakers-the-rat/surveillance-graphs
Web: https://jon-e.net/surveillance-graphs

jonny,
@jonny@neuromatch.social avatar

@parrhesiastic
I would be honored to receive ur criticism ♥️♥️♥️

jonny,
@jonny@neuromatch.social avatar

@Samuelmoore
oh dang now I really hope it's good ♥️

jonny, to random
@jonny@neuromatch.social avatar

ok we might not make it to an arXiv submission today, but the document is all prepped and ready to go except the abstract, so we will definitely make it tomorrow. phew. finally.

jonny, to random
@jonny@neuromatch.social avatar

sometimes big data solutionism jumps the shark and is just very funny

harnessing the vast amounts of data generated in every sphere of life and transforming them into useful, actionable information and knowledge is crucial to the efficient functioning of a modern society

from NSF's Open Knowledge Network roadmap

theLastTheorist,

@jonny Maybe they ingested the research and it gave the OKN indigestion. That's what happened to me at least.

flourn0,

@jonny This is The Truth.

jonny, to random
@jonny@neuromatch.social avatar

fuck it. this piece is long because the story is long. we're doing a final round of copy editing and putting it on arXiv tomorrow.

jonny,
@jonny@neuromatch.social avatar

beep beep now it's time for some pee too pee

jonny, to random
@jonny@neuromatch.social avatar

talkin bout you nerds on the fedi here. love ya nerds.

jonny, to random
@jonny@neuromatch.social avatar

I don't remember writing this but boy do i hate it.

irenes,
@irenes@mastodon.social avatar

@jonny we agree with your conclusions
