Writing DataFrame-Agnostic Python Code With Narwhals

Narwhals is intended for Python library developers who need to analyze DataFrames from a range of libraries, including Polars, pandas, DuckDB, and others. It provides a compatibility layer that handles the differences between these libraries.

In this tutorial, you’ll learn how to use the same Narwhals code to analyze data produced by the latest versions of two very common data libraries, pandas and Polars. You’ll also discover how Narwhals takes advantage of the efficiencies of your source data’s underlying library during analysis. Furthermore, because Narwhals uses syntax that is a subset of Polars, you can reuse your existing Polars knowledge to quickly gain proficiency with Narwhals.

The table below will allow you to quickly decide whether or not Narwhals is for you:

Use Case | Use Narwhals | Use Another Tool
You need to produce DataFrame-agnostic code. | Yes | No
You want to learn a new DataFrame library. | No | Yes

Whether you’re wondering how to develop a Python library to cope with DataFrames from a range of common formats, or just curious to find out if this is even possible, this tutorial is for you. The Narwhals library could provide exactly what you’re looking for.

Get Ready to Explore Narwhals

Before you start, you’ll need to install Narwhals and have some data to play around with. You should also be familiar with the idea of a DataFrame. Although having an understanding of several DataFrame libraries isn’t mandatory, you’ll find a familiarity with Polars’ expressions and contexts syntax extremely useful. This is because Narwhals’ syntax is based on a subset of Polars’ syntax. However, Narwhals doesn’t replace Polars.

In this example, you’ll use data stored in the presidents Parquet file included in your downloadable materials.

This file contains the following six fields to describe United States presidents:

Heading | Meaning
last_name | The president’s last name
first_name | The president’s first name
term_start | Start of the presidential term
term_end | End of the presidential term
party_name | The president’s political party
century | Century the president’s term started

To work through this tutorial, you’ll need to install the pandas, Polars, PyArrow, and Narwhals libraries:

Shell

$ python -m pip install pandas polars pyarrow narwhals

A key feature of Narwhals is that it’s DataFrame-agnostic, meaning your code can work with several formats. But you still need both Polars and pandas because Narwhals will use them to process the data you pass to it. You’ll also need them to create your DataFrames to pass to Narwhals to begin with.

You installed the PyArrow library to correctly read the Parquet files. Finally, you installed Narwhals itself.

With everything installed, create the project’s folder and place your downloaded presidents.parquet file inside it. You might also like to add the books.parquet and authors.parquet files. You’ll need them later.

With that lot done, you’re good to go!

Understand How Narwhals Works

The documentation describes Narwhals as follows:

Extremely lightweight and extensible compatibility layer between dataframe libraries! (Source)

Narwhals is lightweight because it wraps the original DataFrame in its own object ecosystem while still using the source DataFrame’s library to process it. Any data passed into it for processing doesn’t need to be duplicated, removing an otherwise resource-intensive and time-consuming operation.

Narwhals is also extensible. For example, you can write Narwhals code to work with the full API of the following libraries:

It also supports the lazy API of the following:

Read the full article at https://realpython.com/narwhals-python/ »

