Description
Most data scientists and analysts using Python are familiar with Pandas, and if you work in the data science field you have probably invested a significant amount of time learning how to use it to manipulate your data. However, one of the main complaints about Pandas is its speed and inefficiency when dealing with large datasets. Fortunately, a newer DataFrame library, Polars, sets out to address exactly this complaint.
Polars is a DataFrame library written entirely in Rust. In this workshop, I will walk you through the basics of using Polars in Python and show how it can be used in place of Pandas.
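As a quick taste of the kind of code covered in the workshop, here is a minimal sketch of creating and querying a Polars DataFrame; the column names and values are made up purely for illustration, and Pandas users should find the style familiar:

    import polars as pl

    # A small DataFrame built from a plain Python dictionary (column -> values).
    df = pl.DataFrame({
        "name": ["Alice", "Bob", "Charlie"],
        "score": [88, 92, 79],
    })

    # Filter rows with an expression, then compute a simple column statistic.
    print(df.filter(pl.col("score") > 80))
    print(df["score"].mean())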
Topics
- Comparing Pandas and Polars
- Loading a Polars DataFrame
- Selecting columns and rows
- Lazy Evaluation in Polars
- Manipulating values in a DataFrame
- Data cleansing
- Performing GroupBy functions
Content & Process
In this workshop, you will learn about the Polars DataFrame library. It is a lab-intensive workshop in which you will use Jupyter Notebook to experiment with the various features of Polars. Participants are encouraged to follow along as the instructor works through each of the following topics (short illustrative code sketches for each topic follow the list):
• Comparing Pandas and Polars
- Comparing the performance of Pandas and Polars DataFrames
• Loading a Polars DataFrame
- Learn how to load a Polars DataFrame from a data structure (such as a list or dictionary), as well as from CSV files
• Selecting columns and rows
- Learn how to select rows and columns from a Polars DataFrame
• Lazy Evaluation in Polars
- Learn about lazy evaluation, a distinctive feature of Polars, and how it helps to optimize your queries
• Manipulating values in a DataFrame
- Learn the best practices for manipulating the values in your Polars DataFrame
• Data cleansing
- Learn the various ways to clean your data (replacing null values, substituting other values, etc.)
• Performing GroupBy functions
- Learn how to use the GroupBy function to perform exploratory data analytics
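The sketches below illustrate the topics above in order. They are illustrative only: file names such as weather.csv, and all column names and values, are invented for the examples, and minor details may differ between Polars releases.

Comparing Pandas and Polars: one simple way to compare the two is to time the same aggregation on an identical synthetic dataset. The dataset and query here are arbitrary, and the actual timings will depend on your machine and library versions.

    import time

    import numpy as np
    import pandas as pd
    import polars as pl

    # Build the same synthetic dataset (10 million rows) for both libraries.
    n = 10_000_000
    keys = np.random.randint(0, 1_000, n)
    values = np.random.rand(n)

    pd_df = pd.DataFrame({"key": keys, "value": values})
    pl_df = pl.DataFrame({"key": keys, "value": values})

    # Time an identical group-and-aggregate query in each library.
    start = time.perf_counter()
    pd_df.groupby("key")["value"].mean()
    print(f"Pandas: {time.perf_counter() - start:.3f}s")

    start = time.perf_counter()
    pl_df.group_by("key").agg(pl.col("value").mean())
    print(f"Polars: {time.perf_counter() - start:.3f}s")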
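Loading a Polars DataFrame: a sketch of loading from a dictionary, from a list of rows, and from a CSV file (weather.csv is just a placeholder name).

    import polars as pl

    # From a dictionary of column name -> list of values.
    df_from_dict = pl.DataFrame({"city": ["Oslo", "Lima"], "temp_c": [4.0, 22.0]})

    # From a list of rows, supplying the column names explicitly.
    df_from_rows = pl.DataFrame(
        [("Oslo", 4.0), ("Lima", 22.0)],
        schema=["city", "temp_c"],
        orient="row",
    )

    # From a CSV file.
    df_from_csv = pl.read_csv("weather.csv")
    print(df_from_csv.head())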
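Selecting columns and rows: columns are picked with select, rows with a boolean filter expression or positionally.

    import polars as pl

    df = pl.DataFrame({
        "name": ["Alice", "Bob", "Charlie"],
        "dept": ["HR", "IT", "IT"],
        "salary": [5000, 6200, 5800],
    })

    # Columns: by name, or with expressions.
    names_only = df.select("name")
    name_salary = df.select(pl.col("name"), pl.col("salary"))

    # Rows: with a boolean filter expression, or positionally.
    it_staff = df.filter(pl.col("dept") == "IT")
    first_two = df.head(2)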
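Lazy evaluation: scan_csv builds a query plan instead of reading data immediately, so Polars can push filters and column selections down to the reader. In recent Polars releases, explain() prints the optimized plan and collect() actually executes it.

    import polars as pl

    # scan_csv returns a LazyFrame: nothing is read or computed yet.
    lazy = (
        pl.scan_csv("weather.csv")          # placeholder file name
        .filter(pl.col("temp_c") > 0)
        .select("city", "temp_c")
    )

    # Inspect the optimized query plan (predicate/projection pushdown).
    print(lazy.explain())

    # collect() runs the query and returns a regular DataFrame.
    df = lazy.collect()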
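Manipulating values: new or modified columns are built with expressions passed to with_columns; Polars returns a new DataFrame rather than mutating in place.

    import polars as pl

    df = pl.DataFrame({"name": ["Alice", "Bob"], "salary": [5000, 6200]})

    # Expressions create or overwrite columns; the original frame is unchanged.
    df = df.with_columns(
        (pl.col("salary") * 1.05).alias("salary_after_raise"),
        pl.col("name").str.to_uppercase().alias("name_upper"),
    )
    print(df)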
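Data cleansing: nulls can be replaced column by column (with a constant or a derived value) or the offending rows can be dropped altogether.

    import polars as pl

    df = pl.DataFrame({
        "city": ["Oslo", None, "Lima"],
        "temp_c": [4.0, None, 22.0],
    })

    # Replace nulls with a constant or with a value derived from the column...
    cleaned = df.with_columns(
        pl.col("city").fill_null("unknown"),
        pl.col("temp_c").fill_null(pl.col("temp_c").mean()),
    )

    # ...or simply drop any row that contains a null.
    no_nulls = df.drop_nulls()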
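Performing GroupBy functions: grouping plus aggregation is the core pattern for exploratory summaries. Note that the method is spelled group_by in current Polars releases and groupby in older ones.

    import polars as pl

    df = pl.DataFrame({
        "dept": ["HR", "IT", "IT", "HR"],
        "salary": [5000, 6200, 5800, 5300],
    })

    # Group by department and compute summary statistics per group.
    summary = (
        df.group_by("dept")
        .agg(
            pl.col("salary").mean().alias("avg_salary"),
            pl.len().alias("headcount"),   # pl.count() in older releases
        )
        .sort("dept")
    )
    print(summary)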
All the code exercises will be made available as Jupyter Notebooks, and sample datasets will be provided during the workshop.
Audience & Requirements
This workshop is ideal for people who want to get started with data analytics. Experience with
existing libraries like NumPy or Pandas is not mandatory (although helpful). Knowledge of Python is
required.
We will be using Anaconda for this workshop. Before attending this workshop, please ensure that you
download and install Anaconda (https://www.anaconda.com/products/distribution) on your computer
(Windows, macOS, and Linux are all supported).