Python is great, but stuff like this just drives me up the wall

Explanation: Python is a programming language. NumPy is a library for Python that makes it possible to run large computations much faster than in native Python. To make that possible, it keeps its own set of data types that are different from Python’s native data types, which means you now have two different bool types and two different sets of True and False. Lovely.

Mypy is a type checker for Python (Python supports type annotations, but doesn’t actually enforce them at runtime). Mypy treats NumPy’s bool_ and Python’s native bool as incompatible types, leading to the asinine error message above. Mypy is “technically” correct, since they are two completely different classes. But in practice, there is little functional difference between bool and bool_. So you have to do dumb workarounds like declaring every bool value as bool | np.bool_ or casting bool_ down to bool. Ugh. Both NumPy and mypy declared this issue a WONTFIX. Lovely.
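
For anyone who hasn't hit this yet, the two workarounds could look roughly like the sketch below. The helper names `any_negative` and `describe` are made up for illustration and are not from the original post.

```python
from __future__ import annotations

import numpy as np


def any_negative(values: np.ndarray) -> bool:
    # (values < 0).any() yields a numpy.bool_, so it is cast down to a
    # built-in bool to match the declared return type.
    return bool((values < 0).any())


def describe(flag: bool | np.bool_) -> str:
    # Alternative workaround: widen the annotation to accept both types.
    return "yes" if flag else "no"


print(any_negative(np.array([1, -2, 3])))
print(describe(np.array([True, False]).any()))
```

Which of the two is less annoying probably depends on whether the value leaves NumPy-land right away (cast once) or gets passed around a lot (widen the annotation).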

PanArab, 19 days ago

Why use bool when you can use int?

just never #define true 0

elxeno, 20 days ago

Type checker detecting different types?

surprisedpikachu.png

Reddfugee42, 18 days ago

Why is this meme still so fuckin funny 😅

someacnt_, 21 days ago

What years of dynamic typing brainrot does to mf

CCF_100, 20 days ago

I learned Python as my first programming language, but ever since I got into other languages, I don’t like going back to dynamic typing…

ZILtoid1991, 19 days ago

That’s actually quite a bad way of naming types, even if someone really insists on using 32-bit integers for bools for “performance” reasons.

jubilationtcornpone, 20 days ago

I currently work on a NodeJS/React project and apparently I’m going to have to start pasting “‘any’ is not an acceptable return or parameter type” into every damned PR because half the crazy kids who started programming in JavaScript don’t seem to get it.

For fuck’s sake, we have TypeScript for a reason. Use it!

themagzuz, 20 days ago

if you have a pipeline running eslint on all your PRs (which you should have!), you can set no-explicit-any as an error in your eslint config so it’s impossible to merge code with any in it

FunkFactory, 20 days ago

+1 if you can have automated checks do part of your reviews for you, it’s a win. I never comment about code style anymore, if I care enough I’ll build it into the lint config

eager_eagle, 21 days ago (edited 17 days ago)

So you have to do dumb workarounds like declaring every bool value as bool | np.bool_ or casting bool_ down to bool.

These dumb workarounds prevent you from shooting yourself in the foot by not allowing JS-level shit like “1” + 2 === “12”.

semperverus, 21 days ago

The JS thing makes perfect sense though,

“1” is a string. You declared its type by using quotes. myString = “1” in a dynamically typed language is identical to writing string myString = “1” in a statically typed language. You declare it in the symbols used to write it instead of having to manually write out string every single time.

2 is an integer. You know this because you used neither quotes nor a decimal place surrounding it. This is also explicit.

“1” + 2, if your interpreter is working correctly, should do the following:

1. Identify the operands from left to right, including their types.
2. Note that the very first operand in the list is a string type, as you explicitly declared it as such by putting it in quotes.
3. Cast the following operands to string if they are not already.
4. Use the string addition method to add the operands together (in this case, this means concatenation).

In the example you provided, “1” + 2 is equivalent to “1” + “2”, but you’re making the interpreter do more work.

QED: “1” + 2 should, in fact, === “12”, and your lack of ability to handle a language where you declare types by symbols rather than spending extra effort writing the type out as a full English word is your own shortcoming. Learn to declare and handle types in dynamic languages better; don’t blame your own mistakes on the language.

Signed, a software engineer.

lwuy9v5, 21 days ago

TypeError is also a correct response, though, and I think many folks would say it makes more sense. It’s an unnecessary footgun.
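
For comparison, this is roughly how Python handles the same expression. A minimal sketch; `concat_or_error` is a hypothetical helper name used only to make the failure printable.

```python
def concat_or_error(left, right):
    """Return left + right, or the TypeError message if Python refuses."""
    try:
        return left + right
    except TypeError as exc:
        return f"TypeError: {exc}"


print(concat_or_error("1", 2))       # rejected: str + int is not defined
print(concat_or_error("1", str(2)))  # explicit conversion gives "12"
```

The conversion still happens in the second call, it just has to be spelled out by whoever writes the code.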

guy, 19 days ago (edited 13 days ago)

“1” + 2 === “12” is not unique to JS (sans the requirement for the third equals sign), it’s a common feature of multiple strongly typed languages. imho it’s fine.

EDIT: I did some testing:

What it works in: JS, TS, Java, C#, C++, Kotlin, Groovy, Scala, PowerShell

What produces a number, instead of a string: PHP, SQL, Perl, VB, Lua

What it doesn’t work in: R, C, Go, Swift, Rust, Python, Pascal, Ruby, Objective C, Julia, Fortran, Ada, Dart, D, Elixir

And MATLAB appears to produce 51, wtf idk

Perhyte, 10 days ago

And MATLAB appears to produce 51, wtf idk

The numeric value of the ‘1’ character (the ASCII code / Unicode code point representing the digit) is 49. Add 2 to it and you get 51.

C (and several related languages) will do the same if you evaluate ‘1’ + 2.
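
The same arithmetic can be reproduced in Python by doing the character-to-number step explicitly. Just a sketch of the calculation described above, not of what MATLAB does internally.

```python
code_point = ord("1")      # numeric value of the character '1'
shifted = code_point + 2   # the addition MATLAB and C perform implicitly

print(code_point, shifted, chr(shifted))  # 49 51 3
```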

guy, 10 days ago

Oh that makes sense. I didn’t consider it might be treated as a char

fl42v, 20 days ago (edited 16 days ago)

Well, C has implicit casts, and it’s not that weird (although it results in some interesting bugs in certain circumstances). Python is also funny from time to time, albeit for different reasons (e.g. -5**2 is apparently -25 because of the order of operations).

Remavas, 20 days ago

I don’t see how your example is ‘funny’. That’s what you expect to get. -5² is -25. (-5)² = 25.
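
In Python terms, the two readings look like this (a quick sketch; the variable names are just for illustration):

```python
unparenthesized = -5 ** 2   # ** binds tighter than unary minus: -(5 ** 2)
parenthesized = (-5) ** 2   # parentheses force the negation first

print(unparenthesized, parenthesized)  # -25 25
```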

fl42v, 20 days ago

And how’s that different from js’s “1” + 2? One can always convert a number to string, and only sometimes – a string to a number, so it’s pretty logical to go with the former.

RustyNova, 21 days ago

Good meme, bad reasoning. Things like that are why JavaScript is hated. While it looks the same, it should never, in ANY case, be IMPLICITLY turned into another type.

nickwitha_k, 21 days ago

Typing and function call syntax limitations are exactly why I hate JS.

renzev, 18 days ago

reasoning

What reasoning? I’m not trying to make any logical deductions here, I’m just expressing annoyance at an inevitable, but nevertheless cumbersome, outcome of the interaction between numpy and mypy. I like python and I think mypy is a great tool, I wouldn’t be using it otherwise.

danielquinn, 21 days ago (edited 17 days ago)

Honestly, after having served on a Very Large Project with Mypy everywhere, I can categorically say that I hate it. Types are great, type checking is great, but applying it to a language designed without types in mind is a recipe for pain.

folkrav, 21 days ago

Adding types on an untyped project is hell. Greenfield stuff is usually pretty smooth sailing as far as I’m concerned…

acannan, 21 days ago

In my experience, mypy + pydantic is a recipe for success, especially for large python projects

scrion, 21 days ago

I wholeheartedly agree. The ability to describe (in code) and validate all data, from config files to each and every message being exchanged is invaluable.

I’m actively looking for alternatives in other languages now.

expr, 20 days ago

You’re just describing parsing in statically-typed languages, to be honest. Adding all of this stuff to Python is just (poorly) reinventing the wheel.

Python’s a great language for writing small scripts (one of my favorite for the task, in fact), but it’s not really suitable for serious, large scale production usage.

scrion, 20 days ago

I’m not talking about type checking, I’m talking about data validation using pydantic. I just consider mypy / pyright etc. another linting step, that’s not even remotely interesting.

In an environment where a lot of data is being exchanged by various sources, it really has become quite valuable. Give it a try if you haven’t.

expr, 20 days ago

I understand what you’re saying. My point is that data validation is precisely the purpose of parsers (or deserialization) in statically-typed languages. Type-checking is data validation, and parsing is the process of turning untyped, unvalidated data into typed, validated data. What’s more, you can often get this functionality for free without having to write any code other than your type (if the validation is simple enough, anyway). Pydantic exists to solve a problem of Python’s own making and to reproduce what’s standard in statically-typed languages.

In the case of config files, it’s even possible to do this at compile time, depending on the language. Or in other words, you can statically guarantee that a config file exists at a particular location and deserialize it/validate it into a native data structure all without ever running your actual program. At my day job, all of our app’s configuration lives in Dhall files which get imported and validated into our codebase as a compile-time step, meaning that misconfiguration is a compiler error.

scrion, 20 days ago (edited 14 days ago)

I am aware of what you are saying, however, I do not agree with your conclusions. Just for the sake of providing context for our discussion, I wrote plenty of code in statically typed languages, starting in a professional capacity some 33 years ago when switching from pure TASM to AT&T C++ 2, so there is no need to convince me of the benefits :)

That being said, I think we’re talking about different use cases here. When I’m talking configuration, I’m talking runtime settings provided by a customer, or service tech in the field - that hardly maps to a compiler error as you mentioned. It’s also better (more flexible / higher abstraction) than simply checking a JSON schema, and I’m personally encountering multiple new, custom JSON documents every week where it has proven to be a real timesaver.

I also do not believe that all data validation can be boiled down to simple type checking - libraries like pydantic handle complex validation cases with interdependencies between attributes, initialization order, and fields that need to be checked by a finite automaton, regex or even custom code. Sure, you can graft that on after the fact, but what the library does is provide a standardized way of handling these cases with (IMHO) minimal clutter. I know you basically made that point, but the example you gave is oversimplified - at least in what I do, I rarely encounter data that can be properly validated by simple type checking. If business logic and domain knowledge have to be part of the validation, I can save a ton of boilerplate code by writing my validations using pydantic.
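
To make "interdependencies between attributes" concrete, here is a rough sketch of that kind of rule. Note this is deliberately NOT pydantic: it is a standard-library stand-in built on `dataclasses`, and `ServiceConfig`, its fields, and the hostname regex are all hypothetical. Pydantic's own validators cover the same ground with less hand-written code.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical, deliberately loose hostname pattern for the example.
HOSTNAME_RE = re.compile(r"^[a-z0-9.-]+$")


@dataclass(frozen=True)
class ServiceConfig:
    host: str
    port: int
    use_tls: bool
    cert_path: Optional[str] = None

    def __post_init__(self) -> None:
        # Single-field checks: a regex and a range.
        if not HOSTNAME_RE.match(self.host):
            raise ValueError(f"invalid host: {self.host!r}")
        if not 0 < self.port < 65536:
            raise ValueError(f"port out of range: {self.port}")
        # Interdependent rule: one field's validity depends on another.
        if self.use_tls and not self.cert_path:
            raise ValueError("cert_path is required when use_tls is true")


config = ServiceConfig(host="example.org", port=443, use_tls=True,
                       cert_path="/etc/ssl/example.pem")
print(config)
```

The last check is the kind a plain type annotation can't express on its own, since `cert_path` being optional is only valid when `use_tls` is false.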

Type annotations are a completely orthogonal case and I’ll be the first to admit that Python’s type situation is not ideal.

renzev, 18 days ago

Gradual typing isn’t reinventing the wheel, it’s a new paradigm. Statically typed code is harder to write, but easier to debug. Dynamically typed code is harder to debug, but easier to write. With gradual typing, the idea is that you can first write dynamic code (easier to write), and then – wait for it – GRADUALLY turn it into static code by adding type hints (easier to debug). It separates the typing away from the writing, meaning that the programmer doesn’t have to multitask as much. If you know what you’re doing, mypy really does let you eat your cake and keep it too.
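
A tiny sketch of what that workflow can look like. The function names are hypothetical; the point is only that the body stays the same while the hints arrive later.

```python
# Step 1: quick dynamic version, written without any annotations.
def mean_untyped(values):
    return sum(values) / len(values)


# Step 2: the same function once type hints have been added, so a
# checker such as mypy can verify the call sites.
def mean_typed(values: list[float]) -> float:
    return sum(values) / len(values)


print(mean_untyped([1, 2, 3]), mean_typed([1.0, 2.0, 3.0]))
```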

Ephera, 21 days ago

So many people here explaining why Python works that way, but what’s the reason for numpy to introduce its own boolean? Is the Python boolean somehow insufficient?

palordrolap, 21 days ago (edited 21 days ago)

Someone else points out that Python's native bool is a subtype of int, so adding a bool to an int (or performing other mixed operations) is not an error, which might then go on to cause a hard-to-catch semantic/mathematical error.
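
The built-in behaviour described here is easy to see in a few lines. A minimal sketch with made-up variable names:

```python
is_int_subclass = issubclass(bool, int)
true_plus_one = True + 1
count_of_trues = sum([True, False, True])

print(is_int_subclass, true_plus_one, count_of_trues)  # True 2 2
```

Summing booleans to count them is a common idiom, which is also why an accidental bool in arithmetic goes unnoticed so easily.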

I am assuming that trying to add a NumPy bool_ to an int causes a compilation error at best and a run-time warning, or traceable program crash at worst.

mynachmadarch, 21 days ago

Technically the Python bool is fine, but it's part of what makes numpy special. Under the hood numpy uses C-type data structures (you can look into Cython if you want to learn more).

It's part of where the speed comes from for numpy: these more optimized C structures. This means that if you want to compare things (say, checking an array of booleans to find whether any are false), you either need to slow back down and mix Python's frameworks back in or, as numpy did, keep everything Cython, make your own data type, and keep on trucking knowing everything is compatible.

There are probably more reasons, but that's the main one I see. If they depend on any specific logic (say, treating it as an actual boolean and not letting you add two True values together and get an int like you do in base Python), then having their own also ensures that logic.
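
The difference in addition semantics can be checked with a short sketch like this one (assumes NumPy is installed; the variable names are just for illustration):

```python
import numpy as np

python_sum = True + True          # built-in bool behaves like the int 1
numpy_sum = np.True_ + np.True_   # NumPy treats + on two booleans as logical OR
mixed_sum = np.True_ + 1          # mixing with an int promotes to an integer

print(python_sum, numpy_sum, mixed_sum)
```

So two NumPy booleans stay boolean when added, but adding one to a plain int is not an error; it quietly becomes an integer.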

Ephera, 21 days ago

You know, at some point in my career I thought it was kind of silly that so many programming languages optimize speed so much.

But I guess, that’s what you get for not doing it. People having to leave your ecosystem behind and spreading across Numpy/Polars, Cython, plain C/Rust and probably others. 🫠

rwhitisissle, 20 days ago

This is the only actual explanation I’ve found for why numpy leverages its own implementation of what is in most languages a primitive data type, or a derivative of an integer.

baod_rate, 21 days ago

From numpy’s docs:

The bool_ data type is very similar to the Python bool but does not inherit from it because Python’s bool does not allow itself to be inherited from, and on the C-level the size of the actual bool data is not the same as a Python Boolean scalar.

and likewise:

The int_ type does not inherit from the int built-in under Python 3, because type int is no longer a fixed-width integer type.
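
The "does not allow itself to be inherited from" part is plain Python and can be reproduced without NumPy. A small sketch, with `try_subclass_bool` as a hypothetical helper name:

```python
def try_subclass_bool() -> str:
    try:
        # type() builds the class dynamically so the failure can be caught
        # at runtime instead of aborting the whole script.
        type("MyBool", (bool,), {})
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "subclass created"


print(try_subclass_bool())
```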

breadsmasher, 21 days ago (edited 17 days ago)

here’s a good question and answer on this topic

stackoverflow.com/…/boolean-and-type-checking-in-…

plus this is kinda the tools doing their jobs.

bool_ exists for whatever reason. it’s not a bool but functionally equivalent.

the static type checker mypy, correctly, states bool_ and bool aren’t compatible. in the same way other different types aren’t compatible

nickwitha_k, 21 days ago (edited 17 days ago)

Data typing is important. If two types do not have the same in-memory representation but you treat them like they do, you’re inviting a lot of potential bugs and security vulnerabilities to save a few characters.

ETA: The WONTFIX is absolutely the correct response here. This would allow devs to shoot themselves in the foot for no real gain, eliminating the benefit of things like mypy. Type safety is your friend and will keep you from making simple mistakes.

owsei, 21 days ago

Even if they do have the same in-memory representation, you may want to assert types as different just by name.

AccountID: u64

TransactionID: u64

have the same in-memory representation, but are not interchangeable.

pingveno, 20 days ago

Python does allow this with NewType. Type checkers see two different types, but it is the same class at runtime.
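
Applied to the AccountID / TransactionID example from the parent comment, it might look like the sketch below. `close_account` is a hypothetical function, and `int` stands in for u64 since Python has no fixed-width integer built in.

```python
from typing import NewType

AccountID = NewType("AccountID", int)
TransactionID = NewType("TransactionID", int)


def close_account(account_id: AccountID) -> str:
    return f"closing account {account_id}"


account = AccountID(42)
transaction = TransactionID(42)

print(close_account(account))
# close_account(transaction) would still run, but a type checker such as
# mypy is expected to reject it as an incompatible argument type.
print(type(account) is int)  # True: NewType adds no runtime wrapper
```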

nickwitha_k, 21 days ago

That is a very solid point. If user-defined types are NOT explicitly defined as compatible (supposing language support), they should not be.

In your example, if it were, say a banking system, allowing both types to be considered equivalent is just asking for customer data leaks.

Telorand, 21 days ago (edited 18 days ago)

bool_ via Numpy is its own object, and it’s fundamentally different from bool in Python (which is itself a subclass of int, whereas bool_ is not).

They are used similarly, but they’re similar in the same way a fork and a spork can both be used to eat spaghetti.

Donkter, 21 days ago

And do you eat that spaghetti out of a bool?

holgersson, 19 days ago

No, I write some spaghetti with a lot of bools

HStone32, 21 days ago (edited 18 days ago)

I/O issues are problems that come with the territory for scripting languages like python. It’s why I prefer to use bash for scripting instead, because in bash, all I/O are strings. And if there are ever any conflicts, well that’s what awk/sed/Perl are for.

DmMacniel, 21 days ago

Well yeah just because they kinda mean the same thing it doesn’t mean that they are the same. I can wholly understand why they won’t “fix” your inconvenience.

wizardbeard, 21 days ago

Unless I’m missing something big here, saying they “kinda mean the same thing” is a hell of an understatement.

nickwitha_k, 21 days ago

They are two different data types with potentially different in-memory representations.

Ephera, 21 days ago

Well, yeah, but they do mean the exact same thing, hopefully: true or false

Although thinking about it, someone above mentioned that the numpy bool_ is an object, so I guess that is really: true or false or null/None

nickwitha_k, 21 days ago

In an abstract sense, they do mean the same things but, in a technical sense, the one most relevant to programming, they do not.

The standard Python bool type is a subclass of the integer type. This means that it is stored as either 4 bytes (int32) or 8 bytes (int64).

The numpy.bool_ type is something closer to a native C boolean and is stored in 1 byte.

So, memory-wise, one could store a numpy.bool_ in a Python bool but that now leaves 3-7 extra bytes that are unused in the variable. This introduces not just unnecessary memory usage but potential space for malicious data injection or extraction. Now, if one tries to store a Python bool in a numpy.bool_, if the interpreter or OS don’t throw an error and kill the process, you now have a buffer overflow/illegal memory access problem.

What about converting on the fly? Well, that can be done but will come at a performance cost as every function that can accept a numpy.bool_ now has to perform additional type checking, validation, and conversion on every single function call. That adds up quick when processing data on scales where numpy is called for.
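
The storage sizes discussed above can be checked directly rather than taken on faith. A rough sketch, assuming NumPy is installed; the exact number reported for the built-in bool depends on the interpreter, and in CPython a bool is a full object, so `sys.getsizeof` typically reports more than a raw 4 or 8 bytes.

```python
import sys

import numpy as np

numpy_item_bytes = np.dtype(np.bool_).itemsize  # bytes per element in an array
python_object_bytes = sys.getsizeof(True)       # size of the bool object itself

# One thousand flags stored as a NumPy boolean array.
flags = np.zeros(1_000, dtype=np.bool_)

print(numpy_item_bytes, python_object_bytes, flags.nbytes)
```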

rikudou, 21 days ago (edited 21 days ago)

I mean, naming something bool_ should be the first red flag. Python and its ecosystem is a shit show.

breadsmasher, 21 days ago

Numpy named it bool_, not base python

rikudou, 21 days ago

I meant the whole ecosystem, not only Python.

breadsmasher, 21 days ago (edited 18 days ago)

This explanation is pretty clear cut

What exactly is your use case for treating np.bool_ and bool as interchangeable? If np.bool_ isn’t a subclass of bool according to Python itself, then allowing one to be used where the other is expected just seems like it would prevent mypy from noticing bugs that might arise from code that expects a bool but gets an np.bool_ (or vice versa), and can only handle one of those correctly.

mypy and numpy are open source. You could always implement the fix you need yourself?

Ephera, 21 days ago

They’ve declared it as WONTFIX, so unless you’re suggesting that OP creates a fork of numpy, that’s not going to work.

breadsmasher, 21 days ago (edited 17 days ago)

Well, yes exactly

1. Create fixes.
2. Request a merge; assume it gets denied.
3. Fork numpy and add your changes there.
4. After that, just keep pulling new changes over from the source of the fork and deal with any merge issues with the fix.

fubbernuckin, 21 days ago

That’s incredibly inconvenient.

breadsmasher, 21 days ago

That’s what adding strong typing does for you

fossphi, 21 days ago

Fork numpy

I have a feeling that you’re grossly underestimating the magnitude of this endeavour

breadsmasher, 21 days ago

I’m making no estimation one way or the other
