Pleased to share my latest research "Zero-shot counting with a dual-stream neural network model" about a glimpsing neural network model that learns visual structure (here, number) in a way that generalises to new visual contents. The model replicates several neural and behavioural hallmarks of numerical cognition.
I feel there has to be a way of training neural networks to recognise the influence of their training data on the output.
This would probably involve training a complementary indexing network + database that could then, via a kind of "reverse training", resolve and report, at some predetermined accuracy, the #copyright-viable sources for each generated #aiart
I need some help though. A proof of concept would show that the companies know it can be done; they just don't want to do it.
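Roughly the shape of what I'm imagining, as a toy sketch. Everything here is hypothetical: the encoder is a random-projection stand-in, and a real one would have to be trained so that embedding similarity actually tracks training influence:

```python
import torch

# Toy sketch of the "indexing network + database" idea. Hypothetical
# throughout: the encoder is a random projection stand-in; a real one
# would be trained so embedding distance tracks training influence.
torch.manual_seed(0)
encoder = torch.nn.Linear(3 * 64 * 64, 128, bias=False)

# Index every (flattened) training image once, into a database of embeddings.
train_images = torch.randn(1000, 3 * 64 * 64)
index = torch.nn.functional.normalize(encoder(train_images), dim=1)

def attribute(generated, k=5, threshold=0.2):
    """Return indices of the k nearest training images above a similarity threshold."""
    q = torch.nn.functional.normalize(encoder(generated), dim=0)
    sims = index @ q                       # cosine similarity to each training image
    vals, idx = sims.topk(k)
    return idx[vals > threshold].tolist()  # candidate sources at this "accuracy"

candidate_sources = attribute(torch.randn(3 * 64 * 64))
```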
Most of the artificial neural network simulation research I have seen (say, at venues like NeurIPS) takes a very simple conceptual approach to analysing simulation results: treat everything as independent observations with conditions as fixed effects, when the data might be better conceptualised as random effects and repeated measures. Do other people think this? Does anyone have views on whether more complex analyses would be worthwhile, and whether the typical publication venues would accept them? Are there any guides to appropriate analyses for simulation results, e.g. what to do with results from multi-fold cross-validation? (I presume the results are not independent across folds because they share cases.)
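For concreteness, a minimal sketch of the kind of analysis I mean, using a mixed-effects model with training seed as a random effect. Numbers and column names are made up, and a real analysis would need many more runs:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical repeated-measures design: each training seed (run) is
# evaluated under every condition, so "seed" is a grouping factor,
# not a source of independent observations.
df = pd.DataFrame({
    "accuracy":  [0.81, 0.83, 0.78, 0.86, 0.88, 0.84,
                  0.80, 0.85, 0.79, 0.84, 0.87, 0.82],
    "condition": ["a", "b", "c"] * 4,
    "seed":      [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
})

# Fixed effect of condition, random intercept per seed.
model = smf.mixedlm("accuracy ~ condition", data=df, groups=df["seed"])
print(model.fit().summary())
```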
14 years after Alan Turing's death, an unpublished manuscript emerged in which he suggested the idea of an "unorganised" machine, a randomly connected network that anticipated the rise of connectionism.
"Two dangerous falsehoods afflict decisions about artificial intelligence:
First, that neural networks are impossible to understand. Therefore, there is no point in trying.
Second, that neural networks are the only and inevitable method for achieving advanced AI. Therefore, there is no reason to develop better alternatives."
Quite interesting but confusing, as I come from #backpropagation DL.
If I got it right, the authors focus on showing how and why biological neural networks would benefit from being Energy Based Models for Predictive Coding, instead of Feedforward Networks employing backpropagation.
It took me a while to reach the part where they explain how to optimise a ConvNet in PyTorch as an EB model, but they do: there is an algorithm and formulae. I'm still curious how long and how stable training is, and whether it all generalises to typical computer vision architectures (ResNets, MobileNets, ViTs, ...).
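For my own understanding, here's a toy sketch of the general energy-based/predictive-coding recipe as I read it: relax the latent activities to minimise an energy, then update the weights. A two-layer net with arbitrary sizes, not the authors' exact prospective-configuration algorithm (their repo has the real thing):

```python
import torch

# Toy energy-based / predictive-coding training step. Sizes, learning
# rates, and the 50-step relaxation are arbitrary choices of mine.
torch.manual_seed(0)
W1 = torch.randn(20, 10, requires_grad=True)
W2 = torch.randn(5, 20, requires_grad=True)
x0 = torch.randn(10)        # input layer, clamped to the data
target = torch.randn(5)     # output layer, clamped to the label

# Latent activities are free variables, initialised by a forward pass.
x1 = torch.tanh(W1 @ x0).detach().requires_grad_()

def energy():
    e1 = x1 - torch.tanh(W1 @ x0)      # prediction error at layer 1
    e2 = target - torch.tanh(W2 @ x1)  # prediction error at the output
    return (e1 ** 2).sum() + (e2 ** 2).sum()

# Phase 1: relax the activities to minimise the energy (inference).
opt_x = torch.optim.SGD([x1], lr=0.1)
for _ in range(50):
    opt_x.zero_grad()
    energy().backward()
    opt_x.step()

# Phase 2: update the weights against the settled activities (learning).
opt_w = torch.optim.SGD([W1, W2], lr=0.01)
opt_w.zero_grad()
energy().backward()
opt_w.step()
```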
Code is also #opensource at https://github.com/YuhangSong/Prospective-Configuration
I would like to sit at my laptop for a few hours and try to understand this better, but I think in the coming days I will move on to Modern #HopfieldNetworks. These too are EB, and there's an energy function that is optimised by the #transformer's dot-product attention.
I think I understand what attention does in Transformers, so I'm quite curious to see in what sense it's equivalent to consolidating/retrieving patterns in a Dense Associative Memory. In general, I think we're treating memory wrong in our deep neural networks. I see most of them as sensory processing, a shortcut to "reasoning" without short- or long-term memory surrogates, though I can see how some current features may serve similar purposes...
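Before diving in, my current understanding of the equivalence as a minimal sketch (following Ramsauer et al.'s "Hopfield Networks is All You Need"; shapes and beta are illustrative):

```python
import torch

# One retrieval step of a modern Hopfield network: a gradient step on
# the energy turns out to be exactly softmax dot-product attention.
torch.manual_seed(0)
X = torch.randn(16, 64)     # 16 stored patterns ("memories"), dim 64
q = torch.randn(64)         # query / partial pattern to complete
beta = 1.0 / 64 ** 0.5      # inverse temperature, like 1/sqrt(d_k)

# Hopfield update: new_q = X^T softmax(beta * X q)
retrieved = X.T @ torch.softmax(beta * (X @ q), dim=0)

# ...which is dot-product attention with the stored patterns
# playing both keys and values:
attention = torch.softmax(beta * (X @ q), dim=0) @ X
assert torch.allclose(retrieved, attention)
```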
The Machine Learning with Graphs course by Prof. Jure Leskovec from Stanford University (CS224W) focuses on different methods for analyzing massive graphs and complex networks and extracting insights using machine learning models and data mining techniques. 🧵🧶👇🏼
JOSS publishes articles about open-source research software. It is a free, open-source, community-driven, and developer-friendly online journal. JOSS reviews involve downloading and installing the software and inspecting the repository and submitted paper for key elements.
Please reach out if you are interested in reviewing this paper or know someone who could.
Henry Markram, of spike timing dependent plasticity (STDP) fame and infamous for the Human Brain Project (HBP), just got a US patent for "Constructing and operating an artificial recurrent neural network": https://patents.google.com/patent/US20230019839A1/en
How is that not something thousands of undergrads are doing with PyTorch every week?
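For scale, here is a toy version of "constructing and operating" a recurrent artificial neural network in PyTorch (obviously not what the patent's claims cover in its lawyers' eyes, but it makes the point):

```python
import torch
import torch.nn as nn

# "Constructing... an artificial recurrent neural network":
rnn = nn.RNN(input_size=8, hidden_size=32, batch_first=True)

# "...and operating" it on a batch of sequences:
x = torch.randn(4, 10, 8)   # 4 sequences of length 10
output, h_n = rnn(x)        # output: (4, 10, 32), final hidden: (1, 4, 32)
```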
The goal, says the patent text, is "methods and processes for constructing and operating a recurrent artificial neural network that acts as a 'neurosynaptic computer'" – the neurosynaptic-computer part may be patentable, but patenting the construction and operation of an RNN as such is ludicrous overreach.
It seems likely that the legal office at Markram's research institution overreached and got away with it. Good luck enforcing this patent, though: Markram did not invent RNNs.
Now that #NeuralNetworks have had repeated big successes over the last 15 years, we are starting to look for better ways to implement them. Some new ones for me:
#Groq notes that NN inference is bound by memory-to-GPU bandwidth (a quick back-of-envelope after this list shows why). They built an LPU specifically designed for #LLMs https://groq.com/
A wild one: exchange the silicon for moving parts, good old Newtonian physics. A dramatic drop in power utilisation, and it maps onto most NN architectures (h/t @FMarquardtGroup)
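The back-of-envelope behind Groq's bandwidth point, with illustrative numbers (not Groq's): in single-stream decoding, every weight is read once per generated token, so memory bandwidth rather than compute caps throughput.

```python
# Rough upper bound on single-stream decoding speed: each token requires
# reading the whole model from memory. Numbers are illustrative.
model_bytes = 70e9 * 2   # 70B parameters at 2 bytes each (fp16)
bandwidth = 2.0e12       # ~2 TB/s, roughly HBM on a high-end GPU
print(bandwidth / model_bytes, "tokens/s upper bound per stream")  # ~14
```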
Are you interested in cortico-basal ganglia networks and would like to model them, but only have a basic proficiency in Python or computational modeling in general?
Well then, I'm happy to announce the release of CBGTPy, a software package for running biologically realistic simulations of cortico-basal ganglia-thalamic (CBGT) networks in a range of dynamic tasks. It's the latest tool out of our Exploratory Intelligence group at CMU, the University of Pittsburgh, and the University of the Balearic Islands (Spain).