Daffy

Validate pandas and Polars DataFrames with Python decorators.

Daffy catches missing columns, wrong data types, and invalid values at runtime — before they cause errors downstream in your data pipeline. Just add decorators to your functions.

Daffy also supports Modin DataFrames and PyArrow tables.

Lightweight
Column & dtype validation with minimal overhead

Value Constraints
Nullability, uniqueness, range checks

Row Validation
Deep validation with Pydantic models

Multi-Backend
Works with pandas, Polars, Modin, PyArrow

Quick Example

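from daffy import df_in, df_out

@df_in(columns=["price", "bedrooms", "location"])
@df_out(columns=["price_per_room", "price_category"])
def analyze_housing(houses_df):
    # Transform raw housing data into a price analysis (pandas assumed here).
    price_per_room = houses_df["price"] / houses_df["bedrooms"]
    analyzed_df = houses_df.assign(price_per_room=price_per_room)
    # The 100_000 cutoff is an illustrative value, not part of Daffy.
    analyzed_df["price_category"] = (price_per_room > 100_000).map({True: "high", False: "low"})
    return analyzed_df[["price_per_room", "price_category"]]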

If a column is missing, has the wrong dtype, or violates a constraint, Daffy fails fast with a clear error message at the function boundary.
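To make "failing at the function boundary" concrete, here is a minimal standard-library sketch of a column-checking decorator. It is not Daffy's implementation: the `require_columns` name, the dict-of-lists stand-in for a DataFrame, and the error message format are all assumptions made for illustration.

```python
import functools


def require_columns(columns):
    """Hypothetical decorator: check that the first argument has the given columns."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(table, *args, **kwargs):
            # Check before the function body runs, so the failure is
            # reported at the boundary rather than deep inside the pipeline.
            missing = [name for name in columns if name not in table]
            if missing:
                raise AssertionError(
                    f"Missing columns {missing} in call to {func.__name__}"
                )
            return func(table, *args, **kwargs)

        return wrapper

    return decorator


@require_columns(["price", "bedrooms"])
def price_per_room(table):
    return [p / b for p, b in zip(table["price"], table["bedrooms"])]


# A plain dict of lists stands in for a DataFrame here.
good = {"price": [300_000, 450_000], "bedrooms": [3, 5]}
bad = {"price": [300_000, 450_000]}

result = price_per_room(good)

try:
    price_per_room(bad)
    error_message = None
except AssertionError as exc:
    error_message = str(exc)
```

With the incomplete input, the error names the missing column and the function that rejected it, and the function body never runs.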

Installation

pip:

pip install daffy

conda:

conda install -c conda-forge daffy

Daffy works with whatever DataFrame library you already have installed and supports Python 3.10–3.14.

Why Daffy?

Most DataFrame validation tools are schema-first (define schemas separately) or pipeline-wide (run suites over datasets). Daffy is decorator-first: validate inputs and outputs where transformations happen.

✓	Non-intrusive: just add decorators, with no refactoring, no custom DataFrame types, and no schema files
✓	Easy to adopt: add in 30 seconds, remove just as fast if needed
✓	In-process: no external stores, orchestrators, or infrastructure
✓	Pay for what you use: column validation adds minimal overhead; opt into row validation when needed
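The "pay for what you use" trade-off can be sketched with the standard library alone. The following is an illustration under stated assumptions, not Daffy's API: `check_input` and its `row_check` parameter are hypothetical names, a dict of lists stands in for a DataFrame, and a plain predicate stands in for the Pydantic-based row validation that Daffy offers.

```python
import functools


def check_input(columns, row_check=None):
    """Hypothetical decorator: cheap column check plus an optional per-row check."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(table, *args, **kwargs):
            # Cheap step: cost depends only on the number of columns.
            missing = [name for name in columns if name not in table]
            if missing:
                raise ValueError(f"missing columns: {missing}")
            # Opt-in step: cost grows with the number of rows.
            if row_check is not None:
                column_values = (table[name] for name in columns)
                for index, values in enumerate(zip(*column_values)):
                    row = dict(zip(columns, values))
                    if not row_check(row):
                        raise ValueError(f"row {index} failed validation: {row}")
            return func(table, *args, **kwargs)

        return wrapper

    return decorator


@check_input(["price"])
def total_unchecked(table):
    return sum(table["price"])


@check_input(["price"], row_check=lambda row: row["price"] > 0)
def total_checked(table):
    return sum(table["price"])


data = {"price": [100, -5, 20]}

# Column check only: the negative price goes unnoticed.
unchecked_total = total_unchecked(data)

try:
    total_checked(data)
    row_error = None
except ValueError as exc:
    row_error = str(exc)
```

The column check touches only column names, so its cost does not depend on how many rows the data holds. The row check inspects every row, which is why it makes sense as an explicit opt-in.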

Next Steps

Getting Started
Quick introduction to Daffy's core features

Usage Guide
Core validation features for everyday use

Recipes & Patterns
Real-world examples and best practices

API Reference
Decorator signatures and parameters