No disrespect to Wes McKinney (I don’t like #pandas, but I personally could have never done something like that myself), but there’s literally 0 reason (apart from running legacy code) to use #pandas now when there’s #polars on #Python. With #RStats, #dplyr is still the GOAT
@brodriguesco long-time #rstats dplyr user here who has been shifting to #python. Mostly working in Pandas and love a few things like pd.read_sql() -- I'm doing ETL, not analysis -- but this is a reminder that I need to check out polars!
@brodriguesco sure! I switched jobs; if I were still mostly analyzing data, I'd use R and the tidyverse.
I'm the only R user here; a few teammates know Python.
I mostly do ETL, no stats anymore, and my main applications are Apache Airflow for data migration and Apache Superset for dashboards and reports. Airflow DAGs have to be Python, both applications are written in Python and I contribute to Superset.
I still use R for intense data wrangling where I feel there's no other choice.
@samfirke thank you for your perspective! I agree, pandas is really behind dplyr, and polars really closes that gap, I must say. You can really feel the influence dplyr had on polars' design when you're using it.
@friessn @samfirke my guess? Ibis will become the de facto dplyr-like library for Python, more tidyverse-inspired packages will join ibis, and in due time there will be a tidyverse-like set of packages for Python. Pandas will be kept around for historical reasons, like plyr.
It's complicated. It'll get there. But it may take a toolchain update (or two, or three, ...) to get there. But all things avant-garde become mainstream eventually due to the passage of time ...