pandas-wizard

Pandas-Wizard (pandaswizard) is a simple Python module for providing utility functions and wrappers for the pandas module. The module is kept simple and use of external dependencies is minimized unless needed to enhance performance.

This is a relatively new repository. If you find scope for performance or other improvements, please check the contributing guidelines for the organization. All help and criticism are appreciated. If you find any additional use cases, please create a pull request or submit a feature request.

Getting Started

The source code is currently hosted on GitHub: sharkutilities/pandas-wizard. Binary installers for the latest release are available at the Python Package Index (PyPI).

pip install -U pandas-wizard

The list of changes between each release is available here.

The guide below illustrates the main features of pandas-wizard and assumes a working knowledge of the pandas module and its use cases. The example below calculates the percentile of a pandas DataFrameGroupBy object using np.percentile.

import pandaswizard as pdw # an attempt at a ubiquitous alias

# let's calculate the 50th-percentile, i.e. the median for each group
percentiles = df.groupby("group").agg({"A" : pdw.percentile(50)})
percentiles.head()

# or, the preferred usage, in conjunction with other aggregation functions
statistics = df.groupby("group").agg({"A" : ["sum", pdw.percentile(50), pdw.quantile(0.95)]})
statistics.head()

The first example above calculates the 50th percentile, i.e., the median, of the feature “A” for each value of the grouping column “group” in the data frame.
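To make the idea concrete without relying on pandas-wizard internals, here is a minimal sketch of how a percentile aggregator of this kind could be built on np.percentile. The percentile factory below is a hypothetical stand-in written for illustration, not the library's actual implementation, and the sample data frame is invented.

```python
import numpy as np
import pandas as pd


def percentile(q):
    """Hypothetical factory: return a callable computing the q-th percentile (0-100)."""

    def _percentile(values):
        # np.percentile accepts a pandas Series directly
        return np.percentile(values, q)

    # pandas uses the callable's __name__ as the output column label
    _percentile.__name__ = f"percentile_{q}"
    return _percentile


# invented sample data with a grouping column and one numeric feature
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "A": [1.0, 2.0, 3.0, 10.0, 20.0, 60.0],
    }
)

# mix a built-in aggregation (by name) with the custom callable
result = df.groupby("group").agg({"A": ["sum", percentile(50)]})
print(result)
```

Setting __name__ on the returned callable is what gives each percentile its own readable column label when several aggregations are combined in one list.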

Brief Overview of Capabilities

Utility Functions and/or Wrappers for pandas Library

The pandaswizard module includes functions, wrappers, and other utilities for the pandas module. The package is kept simple and minimalistic so that external dependencies are kept to a minimum. The module is divided into the following sections:

  • pandaswizard.aggregate: a set of aggregate functions that can be used with the DataFrame.groupby().agg({...}) method without compromising functionality.

  • pandaswizard.wrappers: a set of decorators/wrappers that can be used alongside a function.
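As a rough illustration of the decorator pattern mentioned above, the sketch below wraps a function that operates on a data frame. The timeit decorator and the group_totals function are hypothetical names invented for this example; they are not taken from pandaswizard.wrappers.

```python
import functools
import time

import pandas as pd


def timeit(func):
    """Hypothetical decorator: record how long each call to func takes."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        # store the elapsed time (in seconds) on the wrapper for later inspection
        wrapper.last_elapsed = time.perf_counter() - start
        return result

    wrapper.last_elapsed = None
    return wrapper


@timeit
def group_totals(frame):
    """Sum column "A" within each value of column "group"."""
    return frame.groupby("group")["A"].sum()


df = pd.DataFrame({"group": ["a", "a", "b"], "A": [1, 2, 3]})
totals = group_totals(df)
print(totals)
print(f"elapsed: {group_totals.last_elapsed:.6f} s")
```

Using functools.wraps keeps the decorated function's metadata intact, which matters when the function is later passed to pandas and its name is used as a label.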

User Defined Functions

A List of Useful Functions to Provide Functionality over a Dataframe

NumPy and pandas have various built-in functions to quickly summarize a dataframe. The functions section lets end users create custom functions (like half-life) and apply them over a pandas dataframe object.
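A custom function of this kind might look like the sketch below, which estimates a half-life per column and applies it with DataFrame.apply. The half_life function here is an illustrative assumption (a log-linear least-squares fit on positive, exponentially decaying data), not the definition used by pandas-wizard, and the data is synthetic.

```python
import numpy as np
import pandas as pd


def half_life(series):
    """Estimate the half-life of an exponentially decaying, positive series.

    Fits log(y) = intercept + slope * t by least squares, where t is the
    positional index; the half-life is ln(2) / (-slope), in index steps.
    """
    t = np.arange(len(series))
    log_y = np.log(series.to_numpy(dtype=float))
    slope, _intercept = np.polyfit(t, log_y, 1)
    return np.log(2) / -slope


# synthetic columns that halve every 2 and every 5 steps respectively
steps = np.arange(10)
df = pd.DataFrame(
    {
        "fast": 100.0 * 0.5 ** (steps / 2.0),
        "slow": 100.0 * 0.5 ** (steps / 5.0),
    }
)

# apply the custom function column by column
half_lives = df.apply(half_life)
print(half_lives)
```

Because DataFrame.apply passes each column as a Series by default, any function that maps a Series to a scalar can be dropped in the same way.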


Footnote: The favicon is designed from the original pandas logo, and no copyright infringement is intended. Since the main objective is to provide utility functions for pandas, the logo is reused and was developed using Canva.