maren, to random
@maren@fosstodon.org avatar

I'm happy to announce that I made my first contribution to the library. 🤩 Polars is becoming increasingly popular in the world of data and I can very much recommend checking it out: https://github.com/pola-rs/polars . Big thanks to @marcogorelli for supporting me!

sergi, to random
@sergi@floss.social avatar

Client libraries are better when they have no API: https://csvbase.com/blog/7

swatantra, to datascience
@swatantra@fosstodon.org avatar

Has anyone tried dataframes in #R? How was your experience, especially when working with a large dataset?

@rstats

astronomerritt, to python
@astronomerritt@hachyderm.io avatar

#Python folk: is there any reason to use #Pandas over #Polars?

Also, does anyone with any experience using both (especially for large data frames) want to weigh in on how much better Polars is re: speed and memory?

Please do not reply with something that does not answer either of these questions. Even if you think it's really helpful. Bear in mind you have no idea what I am doing or why and I have asked these specific questions for a reason.
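
One architectural difference behind the speed/memory question: pandas evaluates each step eagerly and materializes intermediates, while Polars builds a lazy query plan and streams through it. A plain-Python sketch of that distinction (illustrative only, not either library's code):

```python
# Eager pipeline (pandas-style): every step materializes a full
# intermediate result in memory before the next step runs.
def eager_pipeline(rows):
    doubled = [r * 2 for r in rows]           # full intermediate list
    positive = [r for r in doubled if r > 0]  # another full list
    return sum(positive)

# Lazy pipeline (Polars-style): steps are composed up front and only
# evaluated when the result is requested, streaming one element at a
# time with no intermediate allocations.
def lazy_pipeline(rows):
    doubled = (r * 2 for r in rows)           # generator, nothing computed yet
    positive = (r for r in doubled if r > 0)
    return sum(positive)                      # single pass at "collect" time

data = [-3, 1, 4, -1, 5]
assert eager_pipeline(data) == lazy_pipeline(data) == 20
```

The lazy form also lets an engine reorder and fuse steps before running them, which is where much of the speed and memory advantage comes from.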

fohrloop, to python
@fohrloop@fosstodon.org avatar

Using #Polars or #DuckDB for an interactive dashboard, where all data might not fit into memory (need for streaming algorithms / out-of-core computing).

Which one would you suggest? Both seem to be pretty awesome!

https://github.com/pola-rs/polars
https://github.com/duckdb/duckdb
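
Both engines address the larger-than-memory case by streaming: Polars through its lazy engine, DuckDB through out-of-core SQL execution. The underlying idea, in a stdlib-only sketch (hypothetical helper, not either library's API):

```python
import csv
import io

def streaming_mean(fileobj, column):
    """Compute a column mean in a single pass with O(1) memory --
    the kind of work a streaming engine performs without ever
    loading the whole file."""
    total = 0.0
    count = 0
    for row in csv.DictReader(fileobj):  # reads one row at a time
        total += float(row[column])
        count += 1
    return total / count

data = io.StringIO("price\n1.0\n2.0\n3.0\n")
assert streaming_mean(data, "price") == 2.0
```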

peter_mcmahan, to datascience
@peter_mcmahan@mas.to avatar

It seems like no matter how fancy the data science tool (PostgreSQL, Polars, DuckDB, ...), I always end up with a combination of plain text (CSV/jsonl) and LMDB as the fastest and most practical solution.

I get that for production systems those other tools are great, but for one-off academic data-processing pipelines, plain text and LMDB are the only ones that never choke.
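
The plain-text half of that workflow needs nothing beyond the standard library. A minimal jsonl writer/reader sketch (hypothetical helpers):

```python
import json
import os
import tempfile

def write_jsonl(path, records):
    """Append-friendly plain text: one JSON document per line."""
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

def read_jsonl(path):
    """Stream records back one at a time -- no full-file load."""
    with open(path) as f:
        for line in f:
            yield json.loads(line)

path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
write_jsonl(path, [{"id": 1, "ok": True}, {"id": 2, "ok": False}])
assert [r["id"] for r in read_jsonl(path)] == [1, 2]
```

Because each line is independent, jsonl files also survive partial writes and can be processed with ordinary shell tools, which is part of why they rarely "choke".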

brodriguesco, to python
@brodriguesco@fosstodon.org avatar

So, how come it's possible to write dataset.filter(columnA = "1") if Python doesn't have NSE? What am I missing or misunderstanding?
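
One likely answer: no NSE is needed, because Python's **kwargs already delivers argument *names* to the callee as plain strings at call time. A hypothetical Dataset class (not any real library's implementation) showing the mechanism:

```python
# dataset.filter(columnA="1") requires no non-standard evaluation:
# **kwargs hands the method the argument names as ordinary dict keys.
class Dataset:
    def __init__(self, rows):
        self.rows = rows  # list of dicts, one per row

    def filter(self, **conditions):
        # conditions == {"columnA": "1"} -- the column name arrived
        # as a string key, no expression capture required.
        return Dataset([
            row for row in self.rows
            if all(row.get(col) == val for col, val in conditions.items())
        ])

ds = Dataset([{"columnA": "1", "x": 10}, {"columnA": "2", "x": 20}])
assert ds.filter(columnA="1").rows == [{"columnA": "1", "x": 10}]
```

In R, `filter(df, columnA == "1")` must capture an unevaluated expression; in Python the keyword-argument syntax sidesteps that entirely, at the cost of only supporting equality-style conditions this way.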

kellogh, to opensource
@kellogh@hachyderm.io avatar

#Polars is the ideal open source project, imo. It hits all the important things for me:

  • replacing #pandas
  • performance engineering
  • integrates with a large open ecosystem instead of creating a walled garden
  • pleasant to use

https://github.com/pola-rs/polars/releases/tag/rs-0.36.2

andrew, to random
@andrew@fediscience.org avatar

Regular PSA that @grrrck 's tidyexplain animations are phenomenal for visualizing what happens with all of {dplyr}'s join functions and {tidyr}'s pivot_wider and pivot_longer (see all of them here: https://www.garrickadenbuie.com/project/tidyexplain/)

Animation showing how left_join combines two datasets

Volker,
@Volker@fosstodon.org avatar

@andrew @grrrck Super useful. Has anyone adapted these for #pandas or #polars yet? Might be handy for #python users.
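
For anyone porting the animations' examples, the left_join semantics they visualize can be sketched in plain Python (hypothetical helper, not any library's API): every left row is kept, matching right columns are attached, and misses are filled with None.

```python
def left_join(left, right, on):
    """dplyr-style left_join over lists of dicts: keep every row of
    `left`, attach matching columns from `right`, None when no match."""
    index = {}
    for row in right:
        index.setdefault(row[on], []).append(row)
    right_cols = [k for k in (right[0] if right else {}) if k != on]
    joined = []
    for lrow in left:
        # no match -> a single all-None stand-in row on the right side
        for rrow in index.get(lrow[on], [dict.fromkeys(right_cols)]):
            joined.append({**lrow, **{k: rrow.get(k) for k in right_cols}})
    return joined

bands = [{"name": "Mick", "band": "Stones"}, {"name": "John", "band": "Beatles"}]
instruments = [{"name": "John", "plays": "guitar"}]
assert left_join(bands, instruments, "name") == [
    {"name": "Mick", "band": "Stones", "plays": None},
    {"name": "John", "band": "Beatles", "plays": "guitar"},
]
```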

ChristosArgyrop, to python
@ChristosArgyrop@mstdn.science avatar

Until a truly performant (= fast, low memory footprint) two dimensional storage ("table") type (*) emerges, what are the options for managing big data in #perl?

  1. DBI into a performant DBMS (#clickhouse/ #MariaDB column store/ #duckdb)
  2. shell over #R's data.table or #python's #polars / data.table packages, use files to get data in and some form of IPC to get data out
  3. #PDL , others ?
    (*) this is a list of things one could encapsulate as objects for #perltable
    https://duckdb.org/2023/04/14/h2oai.html
    @Perl
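
Option 2's file-based IPC can be sketched from the Python side: a small worker that reads CSV in and writes CSV out, so a Perl driver only ever touches files via system() and temp paths (hypothetical helper, stdlib only):

```python
import csv
import io

def filter_csv(infile, outfile, column, value):
    """Read CSV rows from `infile`, keep those where `column` equals
    `value`, write them to `outfile`. Files in, files out: any
    language (Perl included) can drive this through temp files."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if row[column] == value:
            writer.writerow(row)

# In-memory demo standing in for the real temp files:
out = io.StringIO()
filter_csv(io.StringIO("city,pop\nBerlin,3\nParis,2\n"), out, "city", "Berlin")
assert "Berlin,3" in out.getvalue()
```

In practice the heavy lifting would happen inside the worker with data.table or Polars; the file-in/file-out contract is what keeps the Perl side simple.
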

datascience, to random

Polars is a lightning fast DataFrame library / in-memory query engine with parallel execution and cache efficiency. And now you can use it with the tidyverse syntax: https://www.tidypolars.etiennebacher.com/

victorp, to python

5 Latest Tools You Should Be Using With Python for Data Science.
🗂️ The article provides insightful details on tools like ConnectorX, DuckDB, Optimus, Polars, and Snakemake, which could enhance data wrangling, querying, manipulation, and workflow automation capabilities.

  1. 🧰 ConnectorX: Simplifying the Loading of Data
  2. 🧰 DuckDB: Empowering Analytical Query Workloads
  3. 🧰 Optimus: Streamlining Data Manipulation
  4. 🧰 Polars: Accelerating DataFrames
  5. 🧰 Snakemake: Automating Data Science Workflows

https://www.makeuseof.com/latest-python-data-science-tools/

pandas_dev, to random
@pandas_dev@fosstodon.org avatar

Check out Patrick Hoefler's new blog post benchmarking #pandas against #polars from a pandas point of view:

https://levelup.gitconnected.com/benchmarking-pandas-against-polars-from-a-pandas-pov-554416a863db

It has some nice details about how to optimize pandas code ✨
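
A generic example of the kind of optimization such posts cover (not taken from the article; assumes pandas and NumPy are installed): replace row-wise apply with vectorized column arithmetic.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(1_000), "b": np.arange(1_000)})

# Row-wise apply: one Python-level function call per row (slow).
slow = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Vectorized column arithmetic: one C-level operation (fast).
fast = df["a"] + df["b"]

assert slow.equals(fast)
```

Same result, but the vectorized version is typically orders of magnitude faster on large frames because the loop runs in compiled code.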
