Problems: @pydantic is great for modeling data! But at the moment it doesn't support array data out of the box. Often array shape and dtype are as important as whether something is an array at all, but there isn't a good way to specify and validate them with the Python type system. Many data formats and standards couple their implementation very tightly to their schema, making them less flexible, less interoperable, and more difficult to maintain than they could be. Existing tools for parameterized array types like nptyping and jaxtyping tie their annotations to a specific array library, rather than allowing array specifications that are abstract across implementations.
numpydantic is a super small, few-dependency, and well-tested package that provides generic array annotations for pydantic models. Specify an array along with its shape and dtype, then use that model with any array library you'd like! Extending support for new array libraries is just subclassing - no PRs or monkeypatching needed. The type has some magic under the hood that uses pydantic validators to give a uniform array interface to things that don't usually behave like arrays - pass a path to a video file, and that's an array. Pass a path to an HDF5 file and a nested array within it, and that's an array. We take advantage of the rest of pydantic's features too, including generating rich JSON schema and smart array dumping.
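The core idea - validating shape and dtype at a model's boundary - can be sketched in a few lines of plain numpy. This is an illustrative toy, not numpydantic's actual API (`validate_array` and its parameters are invented here for the sketch; see the numpydantic docs for the real annotation syntax):

```python
import numpy as np

def validate_array(value, shape=None, dtype=None):
    """Coerce `value` to an ndarray and check shape/dtype constraints.

    `None` entries in `shape` are wildcards, standing in for the kind of
    arbitrary-length axes a real array annotation would allow.
    """
    arr = np.asarray(value)
    if dtype is not None and arr.dtype != np.dtype(dtype):
        raise TypeError(f"expected dtype {dtype}, got {arr.dtype}")
    if shape is not None:
        if arr.ndim != len(shape) or any(
            want is not None and want != got
            for want, got in zip(shape, arr.shape)
        ):
            raise ValueError(f"expected shape {shape}, got {arr.shape}")
    return arr

# e.g. "any number of 64x64 uint8 frames", as you might get from a video file
frames = validate_array(np.zeros((10, 64, 64), dtype=np.uint8),
                        shape=(None, 64, 64), dtype=np.uint8)
```

In numpydantic itself the same constraint is expressed declaratively as a type annotation on a pydantic field, and the check runs inside pydantic's validation machinery rather than being called by hand.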
This is a standalone part of my work with @linkmlarrays and rearchitecting neurobio data formats like NWB to be dead simple to use and extend, integrating with the tools you already use and across the experimental process - specify your data in a simple yaml format, and get back high quality data modeling code that is standards-compliant out of the box and can be used with arbitrary backends. One step towards the wild exuberance of FAIR data that is just as comfortable in the scattered scripts of real experimental work as it is in carefully curated archives and high performance computing clusters. Longer term I'm trying to abstract away data store implementations to bring content-addressed p2p data stores right into the python interpreter as simply as if something was born in local memory.
OK, let's try something new. I'm not well connected because I'm bad at in person networking, and this is compounded by my decision to stop flying to conferences. So, can I use mastodon to find potential experimental colleagues who would like to work together?
Ideally for me, this would be people in Europe so I can visit by train, but it's not essential. I have some ideas for interesting projects and grant applications, and I'd love to develop those into concrete projects in close participation with experimental colleagues.
One of the main themes I'm interested in is how we can relate various neural mechanisms (e.g. inhibition, recurrence, nonlinear responses) to functions, using computational modelling to ask 'what if' questions that couldn't be answered by experiments alone.
I'm also interested in thinking about how we can use "information bottleneck" ideas to think more clearly about what computations networks of neurons are doing, going the next step beyond representing information to computing / discarding information.
A big question I'd like to answer is how different brain regions work together in such a flexible and scalable way.
A technique I'm very excited about at the moment is using modern ML algorithms to train spiking neural networks at cognitively challenging tasks, making them directly comparable to both psychophysical and electrophysiological data.
Part of that could involve building in new mechanisms, like dendritic structure or neuromodulators into those networks and allowing the trained networks to make use of them in the best way possible.
I'd also love to build jointly motivated experimental and theoretical/synthetic datasets to test models against.
If any of that sounds interesting to you, take a look at some of my recent papers and get in touch. I'd love to hear from you.
I just want to run some spike detection code I made a while ago.
Instead of spikes I get a weird error. So now I need to update package 1, which requires updating package 2, 3, 4, 7, and 28, which in turn want a newer version of python (except package 9 which refuses to work now of course), so I also need to reinstall anaconda completely (fuck knows why the upgrade button never works)...
And of course none of that actually runs, so I need to figure out how to make things go in a docker container that is in turn wrapped in whatever the hell a singularity is?
@jonny @elduvelle @susanleemburg I had the opposite feeling: everything with Matlab for me was always a struggle, and it was a joy to move over to Python for that reason. However, it's definitely the case that the Python ecosystem has gotten worse in terms of dependency hell than it used to be. I do wonder if the easy availability of virtual environments, Docker, etc. has made it easier to be lazy about backwards compatibility when developing packages. We test @briansimulator on a huge array of combinations of Python versions, operating systems, etc. to guard against this, and it's not that difficult to set up continuous integration infrastructure to do this using GitHub Actions. I wish more people would.
@jonny @susanleemburg @elduvelle @briansimulator I don't agree that enforcing backwards compatibility is a bad idea. Otherwise you get the problem that package X requires a particular range of versions of Y, but package Z requires a non-overlapping range of versions of Y, meaning that X and Z can't be used together. The Python ecosystem is rife with this sort of problem. Backwards compatibility eliminates it: just upgrade everything.
@jonny I'll agree there are times when it's necessary. We broke compatibility once with Brian in its 17 years so far, and that seems like a reasonable frequency. But we also released it under a new package name, so there was actually no break in compatibility. My worry is when breaks are frequent enough that you can't install two packages because of non-overlapping requirements. This has happened to me more than once, so it's a real problem.
UX peeve. Lamps that you have to tap repeatedly to adjust brightness so that if you want it to get less bright you have to cycle through more bright first. Bring back clunky analogue switches. Touch interface is bad for everything except a phone.
Thought about hypothesis testing as an approach to doing science. Not sure if new, would be interested if it's already been discussed. Basically, hypothesis testing is inefficient because you can only get 1 bit of information per experiment at most.
In practice, much less on average. If the hypothesis is not rejected you get close to 0 bits, and if it is rejected it's not even 1 bit because there's a chance the experiment is wrong.
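To put a rough number on that "not even 1 bit" claim: the mutual information between whether the hypothesis is actually false and the accept/reject outcome can be computed directly. The error rates below (5% chance of wrongly rejecting a true hypothesis, 80% chance of correctly rejecting a false one, 50/50 prior) are invented for illustration:

```python
from math import log2

def experiment_information_bits(p_true, p_reject_if_true, p_reject_if_false):
    """Mutual information (in bits) between the hypothesis being true and
    the experiment's binary accept/reject outcome."""
    # Joint distribution over (hypothesis true?, rejected?)
    joint = {
        (True, True): p_true * p_reject_if_true,
        (True, False): p_true * (1 - p_reject_if_true),
        (False, True): (1 - p_true) * p_reject_if_false,
        (False, False): (1 - p_true) * (1 - p_reject_if_false),
    }
    # Marginals over hypothesis and outcome
    p_h = {h: sum(v for (hh, _), v in joint.items() if hh == h) for h in (True, False)}
    p_r = {r: sum(v for (_, rr), v in joint.items() if rr == r) for r in (True, False)}
    return sum(v * log2(v / (p_h[h] * p_r[r]))
               for (h, r), v in joint.items() if v > 0)

# ~0.48 bits with these error rates; an error-free experiment gives exactly 1 bit
bits = experiment_information_bits(0.5, 0.05, 0.80)
```

So even a well-powered experiment with standard significance thresholds delivers around half a bit, and any realistic prior further from 50/50 pushes it lower still.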
One way to think about this is error signals. In machine learning we do much better if we can have a gradient than just a correct/false signal. How do you design science to maximise the information content of the error signal?
In modelling I think you can partly do that by conducting detailed parameter sweeps and model comparisons. More generally, I think you want to maximise the gain in "understanding" the model behaviour, in some sense.
This is very different to using a model to fit existing data (0 bits per study) or make a prediction (at most 1 bit per model+experiment). I think it might be more compatible with thinking of modelling as conceptual play.
I feel like both experimentalists and modellers do this when given the freedom to do so, but when they impose a particular philosophy of hypothesis testing on each other (grant and publication review), this gets lost.
Incidentally this is also exactly the problem with our traditional publication system that only gives you 1 bit of information about a paper (that it was accepted), rather than giving a richer, open system of peer feedback.
@jonny well the first one seems to be arguing for spending longer before doing a hypothesis test, so arguably that's an even lower information rate overall. 😉 The second one seems closer, but I haven't read past the first page yet. Does it talk about how we could have a richer error signal?
@jonny will definitely read, looked interesting and that's a strong recommendation. I think my point is maybe something like: if the real value is not the output of the experiment but the exploratory work, shouldn't we be teaching this and valuing it more highly, rather than denigrating it as fishing trips and rejecting grants on this basis?
#Zotero still asking $60 for 1 year of 6 GB… 😭 @zotero I love you all but it’s 2024 now, maybe you could either reduce the price or increase what we get for it? #ReferenceManager
@elduvelle @zotero I think of it as a way to support development and sustainability. Happy to pay since I can, but also happy for others to use solutions like WebDAV etc.
:gt: @graph_tool is a comprehensive and efficient :python: Python library to work with networks, including structural, dynamical and statistical algorithms, as well as visualization.
It uses :cpp: C++ under the hood for the heavy lifting, making it quite fast.