PanelRegression

class causalpy.experiments.panel_regression.PanelRegression

Panel regression with fixed effects estimation.

Supports both unpooled dummy-variable fixed effects and the demeaned (within) transformation, and enables panel-aware visualization and diagnostics.

Parameters:

data (DataFrame) – A pandas dataframe with panel data. Each row is an observation for a unit at a time period.
formula (str) – A statistical model formula using patsy syntax. For the unpooled dummy-variable fixed-effects approach, include C(unit_var) (and optionally C(time_var)) in the formula. For the demeaned transformation, do NOT include those C(...) terms; fixed effects are removed by transformation before fitting.
unit_fe_variable (str) – Column name for the unit identifier (e.g., “state”, “id”, “country”).
time_fe_variable (str | None) – Column name for the time identifier (e.g., “year”, “wave”, “period”). If provided, time fixed effects will be included. Default is None.
fe_method (Literal['dummies', 'demeaned']) – Method for handling fixed effects:
- “dummies”: Use unpooled dummy-variable fixed effects (C(unit)/C(time) in formula). Yields individual unit effect estimates but creates N-1 dummy columns. Best for small N.
- “demeaned”: Use the demeaned transformation. Scales to large N but doesn’t directly estimate individual unit effects.
model (PyMCModel | RegressorMixin | None) – A PyMC (Bayesian) or sklearn (OLS) model. The default is None, but a model must be provided.

n_units

Number of unique units in the panel.

Type: int

n_periods

Number of unique time periods (None if time_fe_variable is not provided).

Type: int or None

fe_method

The fixed effects method used (“dummies” or “demeaned”).

Type: str

_group_means

Stored group means for recovering unit effects (demeaned method only).

Type: dict

Examples

Small panel with dummy variables:

>>> import causalpy as cp
>>> import numpy as np
>>> import pandas as pd
>>> # Create small panel: 10 units, 20 time periods
>>> np.random.seed(42)
>>> units = [f"unit_{i}" for i in range(10)]
>>> periods = range(20)
>>> data = pd.DataFrame(
...     [
...         {
...             "unit": u,
...             "time": t,
...             "treatment": int(t >= 10 and u in units[:5]),
...             "x1": np.random.randn(),
...             "y": np.random.randn(),
...         }
...         for u in units
...         for t in periods
...     ]
... )
>>> result = cp.PanelRegression(
...     data=data,
...     formula="y ~ C(unit) + C(time) + treatment + x1",
...     unit_fe_variable="unit",
...     time_fe_variable="time",
...     fe_method="dummies",
...     model=cp.pymc_models.LinearRegression(
...         sample_kwargs={"random_seed": 42, "progressbar": False}
...     ),
... )

Large panel with demeaned transformation:

>>> # Create larger panel: 1000 units, 10 time periods
>>> np.random.seed(42)
>>> units = [f"unit_{i}" for i in range(1000)]
>>> periods = range(10)
>>> data = pd.DataFrame(
...     [
...         {
...             "unit": u,
...             "time": t,
...             "treatment": int(t >= 5 and u in units[:500]),
...             "x1": np.random.randn(),
...             "y": np.random.randn(),
...         }
...         for u in units
...         for t in periods
...     ]
... )
>>> result = cp.PanelRegression(
...     data=data,
...     formula="y ~ treatment + x1",  # No C(unit) or C(time) terms needed
...     unit_fe_variable="unit",
...     time_fe_variable="time",
...     fe_method="demeaned",
...     model=cp.pymc_models.LinearRegression(
...         sample_kwargs={"random_seed": 42, "progressbar": False}
...     ),
... )

Notes

The demeaned transformation (de-meaning by group) removes time-invariant confounders but also drops time-invariant covariates from the model. For the "dummies" approach (unpooled FE), individual unit effects can be extracted from the coefficients. For the demeaned approach, unit effects can be recovered post-hoc using the stored group means (_group_means), which are always computed from the original (pre-demeaning) data.
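The mechanics described above can be illustrated with a minimal pandas sketch. This is not CausalPy's internal implementation; the toy data, the `true_effects` values, and all variable names are assumptions made for illustration. It shows that a time-invariant covariate (`z`) is wiped out by demeaning, and that unit effects can be recovered afterwards from group means computed on the original data.

```python
import numpy as np
import pandas as pd

# Toy panel: 3 units x 4 periods, noise-free so the result is easy to check.
rng = np.random.default_rng(0)
df = pd.DataFrame({"unit": np.repeat(["a", "b", "c"], 4)})
true_effects = {"a": 1.0, "b": -2.0, "c": 0.5}  # hypothetical unit effects
df["x1"] = rng.normal(size=len(df))
df["z"] = df["unit"].map({"a": 1.0, "b": 0.0, "c": 1.0})  # time-invariant covariate
df["y"] = 2.0 * df["x1"] + df["unit"].map(true_effects)

cols = ["y", "x1", "z"]

# Group means taken from the original (pre-demeaning) data.
group_means = df.groupby("unit")[cols].mean()

# Within transformation: subtract each unit's mean from its own observations.
demeaned = df[cols] - df.groupby("unit")[cols].transform("mean")

# A covariate that never varies within a unit becomes identically zero.
z_is_wiped_out = bool(np.allclose(demeaned["z"], 0.0))

# Slope estimated on the demeaned data (single regressor, no intercept).
beta = float((demeaned["x1"] * demeaned["y"]).sum() / (demeaned["x1"] ** 2).sum())

# Post-hoc unit effects: mean(y_i) - beta * mean(x1_i) for each unit i.
unit_effects = group_means["y"] - beta * group_means["x1"]
```

Because the toy outcome has no noise, `beta` and `unit_effects` reproduce the values used to generate the data; with real data they would be estimates.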

This class does not yet implement hierarchical/partial-pooling fixed effects. Those semantics are intentionally kept out of scope here so fe_method="dummies" remains an accurate label for the current unpooled estimator.

Two-way fixed effects (unit + time) control for both unit-specific and time-specific unobserved heterogeneity. This is the standard approach in difference-in-differences estimation.
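The link to difference-in-differences can be checked on a tiny balanced two-period panel. The sketch below uses plain NumPy least squares rather than CausalPy, and the data values are made up for illustration. Under these assumptions, the treatment coefficient from a regression with unit and time dummies matches the difference-in-differences of group means.

```python
import numpy as np
import pandas as pd

# Hypothetical balanced panel: units c and d are treated in period 1.
df = pd.DataFrame(
    {
        "unit": ["a", "a", "b", "b", "c", "c", "d", "d"],
        "time": [0, 1] * 4,
        "y": [1.0, 2.0, 2.0, 3.5, 0.0, 4.0, 1.0, 4.5],
    }
)
treated = df["unit"].isin(["c", "d"])
post = df["time"] == 1
df["treatment"] = (treated & post).astype(int)

# Design matrix: intercept + unit dummies + time dummies + treatment.
X = pd.get_dummies(
    df[["unit", "time"]], columns=["unit", "time"], drop_first=True, dtype=float
)
X.insert(0, "intercept", 1.0)
X["treatment"] = df["treatment"].astype(float)

coef, *_ = np.linalg.lstsq(X.to_numpy(), df["y"].to_numpy(), rcond=None)
twfe_estimate = float(coef[-1])  # treatment is the last column

# Difference-in-differences of the four group means.
did_estimate = float(
    (df.loc[treated & post, "y"].mean() - df.loc[treated & ~post, "y"].mean())
    - (df.loc[~treated & post, "y"].mean() - df.loc[~treated & ~post, "y"].mean())
)
```

The two estimates agree here because the panel is balanced and all treated units switch on in the same period; that equivalence should not be assumed for staggered designs.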

Balanced vs unbalanced panels: A panel is balanced when every unit is observed in every time period; otherwise it is unbalanced (e.g. unit entry/exit, missing waves). When both unit and time fixed effects are requested with fe_method="demeaned", the sequential demeaning (first by unit, then by time) is algebraically equivalent to the standard two-way demeaned transformation only for balanced panels. For unbalanced panels, iterative alternating demeaning would be needed for exact convergence; the single-pass approximation used here may introduce small biases. Unbalanced panels are common in practice (e.g. firm or worker panels with attrition); for heavily unbalanced data, consider checking sensitivity or using dedicated FE packages that implement iterative two-way demeaning (e.g. reghdfe, pyfixest).
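To make the balanced/unbalanced distinction concrete, the sketch below checks balance and compares a single demeaning pass with alternating demeaning. The helper names (`is_balanced`, `demean_once`, `demean_alternating`) are hypothetical and written for this illustration only; they are not part of CausalPy, and the class's own implementation may differ in detail.

```python
import numpy as np
import pandas as pd


def is_balanced(df, unit, time):
    """True when every unit is observed exactly once in every period."""
    counts = df.groupby([unit, time]).size()
    full_grid = df[unit].nunique() * df[time].nunique()
    return len(counts) == full_grid and bool((counts == 1).all())


def demean_once(df, col, unit, time):
    """Single pass: demean by unit, then by time."""
    v = df[col] - df.groupby(unit)[col].transform("mean")
    return v - v.groupby(df[time]).transform("mean")


def demean_alternating(df, col, unit, time, tol=1e-10, max_iter=1000):
    """Repeat unit/time demeaning until the values stop changing."""
    v = df[col].astype(float)
    for _ in range(max_iter):
        prev = v
        v = v - v.groupby(df[unit]).transform("mean")
        v = v - v.groupby(df[time]).transform("mean")
        if np.max(np.abs(v - prev)) < tol:
            break
    return v


balanced = pd.DataFrame(
    {
        "unit": ["A", "A", "A", "B", "B", "B"],
        "time": [1, 2, 3, 1, 2, 3],
        "y": [1.0, 2.0, 6.0, 0.0, 4.0, 5.0],
    }
)
unbalanced = balanced.iloc[:-1]  # unit B is missing in period 3

single_pass = demean_once(unbalanced, "y", "unit", "time")
exact = demean_alternating(unbalanced, "y", "unit", "time")

# Unit means left over after each approach; exact demeaning leaves ~0.
leftover_single = single_pass.groupby(unbalanced["unit"]).mean()
leftover_exact = exact.groupby(unbalanced["unit"]).mean()
```

On the unbalanced toy panel the single pass leaves nonzero unit means in `leftover_single`, while `leftover_exact` is numerically zero. Running `is_balanced` before choosing `fe_method="demeaned"` with both unit and time effects is one simple sensitivity check.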

Methods

`PanelRegression.algorithm`()	Run the experiment algorithm: fit the model.
`PanelRegression.effect_summary`(*[, window, ...])	Generate a decision-ready summary of causal effects.
`PanelRegression.fit`(*args, **kwargs)
`PanelRegression.generate_report`(*[, ...])	Generate a self-contained HTML report for this experiment.
`PanelRegression.get_plot_data`(*args, **kwargs)	Recover the data of an experiment along with the prediction and causal impact information.
`PanelRegression.get_plot_data_bayesian`(**kwargs)	Get plot data for Bayesian model.
`PanelRegression.get_plot_data_ols`(**kwargs)	Get plot data for OLS model.
`PanelRegression.input_validation`()	Validate input parameters.
`PanelRegression.plot`(*args[, show, ...])	Plot the model.
`PanelRegression.plot_coefficients`([...])	Plot coefficient estimates with credible/confidence intervals.
`PanelRegression.plot_residuals`([kind])	Plot residual diagnostics.
`PanelRegression.plot_trajectories`([units, ...])	Plot unit-level time series trajectories.
`PanelRegression.plot_unit_effects`([...])	Plot distribution of unit fixed effects.
`PanelRegression.print_coefficients`([round_to])	Ask the model to print its coefficients.
`PanelRegression.set_maketables_options`(*[, ...])	Set optional maketables rendering options for this experiment.
`PanelRegression.summary`([round_to])	Print a summary of the panel regression results.

Attributes

`idata`	Return the InferenceData object of the model.
`supports_bayes`
`supports_ols`
`labels`
`data`

__init__(data, formula, unit_fe_variable, time_fe_variable=None, fe_method='dummies', model=None, **kwargs)

Parameters:

data (DataFrame)
formula (str)
unit_fe_variable (str)
time_fe_variable (str | None)
fe_method (Literal['dummies', 'demeaned'])
model (PyMCModel | RegressorMixin | None)
kwargs (dict)

Return type:

None

classmethod __new__(*args, **kwargs)