DuckDB is an in-process SQL OLAP database management system

Why DuckDB?



Simple and portable

In-process, serverless
C++11, no dependencies, single-file build
APIs for Python, R, Java, Julia, Swift, …
Runs on Windows, Linux, macOS, OpenBSD, …



Feature-rich

Transactions, persistence
Extensive SQL support
Direct Parquet, CSV, and JSON querying
Joins, aggregates, window functions



Fast

Optimized for analytics
Vectorized and parallel engine
Larger-than-memory processing
Parallel Parquet, CSV, and NDJSON loaders



Free and extensible

Free & open-source
Permissive MIT License
Flexible extension mechanism

Installation

Choose the environment in which you want to use DuckDB

Command Line
Python
R
Java
Node.js
ODBC

https://github.com/duckdb/duckdb/releases/download/v0.9.2/duckdb_cli-windows-amd64.zip

Latest release: DuckDB 0.9.2

When to use DuckDB



Processing and storing tabular datasets, e.g., from CSV or Parquet files
Interactive data analysis, e.g., join & aggregate multiple large tables
Concurrent large changes to multiple large tables, e.g., appending rows or adding, removing, and updating columns
Large result set transfer to client

When not to use DuckDB



High-volume transactional use cases (e.g., tracking orders in a webshop)
Large client/server installations for centralized enterprise data warehousing
Writing to a single database from multiple concurrent processes
Multiple concurrent processes reading from a single writable database

Blog

Updates to the H2O.ai db-benchmark!

TL;DR: the H2O.ai db-benchmark has been updated with new results. In addition, the AWS EC2 instance used for benchmarking has been changed to a c6id.metal for improved repeatability and fairness across libraries. DuckDB is the fastest library for both join and group by queries at almost every data size. Skip […]

2023-10-27

DuckDB's CSV Sniffer: Automatic Detection of Types and Dialects

TL;DR: DuckDB is primarily focused on performance, leveraging the capabilities of modern file formats. At the same time, we also pay attention to flexible, non-performance-driven formats like CSV files. To create a nice and pleasant experience when reading from CSV files, DuckDB implements a CSV sniffer that automatically detects CSV […]

2023-10-06

DuckCon #4 in Amsterdam

We are excited to hold the next “DuckCon” DuckDB user group meeting for the first time in the birthplace of DuckDB, Amsterdam, the Netherlands. The meeting will take place on February 2, 2024 (Friday) in the OBA Congress Center’s Theater room, five minutes walking distance from Amsterdam Central Station. Conveniently, […]
