Polars

Lightning-fast DataFrame library for Rust and Python

Familiar from the start

Polars is designed around familiar data wrangling habits. It exposes a complete Python API, including the full set of features for manipulating DataFrames through an expression language that helps you write readable and performant code.

DataFrames to the Rust ecosystem

Polars is written in Rust, uncompromising in its choices to provide a feature-complete DataFrame API to the Rust ecosystem. Use it as a DataFrame library or as a query engine backend for your data models.

On the shoulders of a giant

Polars is built upon the safe Arrow2 implementation of the Apache Arrow specification, enabling efficient resource use and processing performance. By doing so it also integrates seamlessly with other tools in the Arrow ecosystem.

Welcome to fast data wrangling

Polars is a lightning-fast DataFrame library and in-memory query engine. Its embarrassingly parallel execution, cache-efficient algorithms and expressive API make it well suited to efficient data wrangling, data pipelines, snappy APIs and much more.

Polars is about as fast as it gets; see the results in the H2O.ai benchmark.

Rust

Below is a quick demonstration of the Polars API in Rust.

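use polars::prelude::*;

fn example() -> Result<DataFrame, PolarsError> {
    // Build a lazy query over the CSV file; it is executed by `collect`.
    LazyCsvReader::new("foo.csv")
        .has_header(true)
        .finish()?
        // Keep rows where "bar" is greater than 100.
        .filter(col("bar").gt(lit(100)))
        .group_by(vec![col("ham")])
        // Per group: sum "spam" and take the first "ham" after sorting.
        .agg(vec![col("spam").sum(), col("ham").sort(false).first()])
        // Run the query and return the resulting DataFrame.
        .collect()
}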

Python

Below is a quick demonstration of the Polars API in Python.

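import polars as pl

# Build a lazy query; the work is deferred until collect() is called.
q = (
    pl.scan_csv("iris.csv")
    .filter(pl.col("sepal_length") > 5)  # keep rows with sepal_length > 5
    .group_by("species")
    .agg(pl.all().sum())  # sum every remaining column per species
)

# Execute the query and materialize the result as a DataFrame.
df = q.collect()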

Sponsors

Xomnia, Ponte Energy Partners, databento