pysmatch.Matcher

class pysmatch.Matcher.Matcher(test: DataFrame, control: DataFrame, yvar: str, formula: str | None = None, exclude: List[str] | None = None)[source]

Bases: object

A class to perform propensity score matching (PSM).

This class encapsulates the entire PSM workflow, including propensity score estimation, matching, and balance assessment.

data

The input DataFrame containing treatment, outcome, and covariates.

Type:

pd.DataFrame

treatment

The name of the treatment column.

Type:

str

outcome

The name of the outcome column.

Type:

str

covariates

A list of covariate column names.

Type:

list

exclude

A list of columns to exclude from calculations (often includes treatment and outcome).

Type:

list

scores

Propensity scores estimated for each observation.

Type:

pd.Series

matched_data

DataFrame containing the matched pairs/groups.

Type:

pd.DataFrame

model_fit

The fitted propensity score model object.

Type:

object

balance_stats

Statistics assessing covariate balance before and after matching.

Type:

pd.DataFrame

n_matches

The number of matches to find for each treated unit (used in some matching methods).

Type:

int

method

The matching method used (e.g., “nearest”, “optimal”, “radius”).

Type:

str

assign_weight_vector() → None[source]

Assigns inverse frequency weights to records in the matched dataset.

Calculates weights as 1 / count, where count is the number of times an original record (identified by record_id) appears in the matched dataset. This is often used in analyses after matching with replacement to account for controls matched multiple times. The weights are added as a ‘weight’ column to self.matched_data.

Requires match() to have been run and matched_data to contain ‘record_id’.
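
The weighting logic can be sketched with plain pandas. The `record_id` and `weight` column names follow the description above; the data itself is illustrative:

```python
import pandas as pd

# Illustrative matched dataset: control record 101 was matched twice
# (matching with replacement), so it appears in two matched pairs.
matched = pd.DataFrame({
    "record_id": [1, 101, 2, 101, 3, 102],
    "match_id":  [0, 0, 1, 1, 2, 2],
})

# Inverse-frequency weight: 1 / (number of times the record appears).
counts = matched["record_id"].map(matched["record_id"].value_counts())
matched["weight"] = 1.0 / counts

# Record 101 appears twice, so each of its rows gets weight 0.5.
```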

compare_categorical(return_table: bool = False, plot_result: bool = True)[source]

Compares categorical variables between groups before and after matching.

Delegates the comparison logic and plotting to visualization.compare_categorical. Typically calculates and displays differences in proportions or performs Chi-Square tests for all categorical covariates found in self.xvars.

Parameters:
  • return_table (bool, optional) – If True, returns the comparison results as a DataFrame. Defaults to False.

  • plot_result (bool, optional) – If True, generates and displays plots summarizing the balance for categorical variables. Defaults to True.

Returns:

If return_table is True, returns a DataFrame containing the comparison statistics (e.g., p-values before/after matching). Otherwise, returns None.

Return type:

Optional[pd.DataFrame]

compare_continuous(save: bool = False, return_table: bool = False, plot_result: bool = True)[source]

Compares continuous variables between groups before and after matching.

Delegates the comparison logic and plotting to visualization.compare_continuous. Typically calculates and displays standardized mean differences (SMD) or performs t-tests for all continuous covariates found in self.xvars.

Parameters:
  • save (bool, optional) – Whether to save any generated plots (functionality depends on the implementation in visualization.compare_continuous). Defaults to False.

  • return_table (bool, optional) – If True, returns the comparison results as a DataFrame. Defaults to False.

  • plot_result (bool, optional) – If True, generates and displays plots (e.g., Love plot) summarizing the balance. Defaults to True.

Returns:

If return_table is True, returns a DataFrame containing the comparison statistics (e.g., SMD before/after matching). Otherwise, returns None.

Return type:

Optional[pd.DataFrame]
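
As a sketch of the kind of statistic such a comparison reports, the standardized mean difference (SMD) for a single covariate can be computed directly with NumPy (the data here is illustrative):

```python
import numpy as np

def smd(treated: np.ndarray, control: np.ndarray) -> float:
    """Standardized mean difference with a pooled standard deviation."""
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    return (treated.mean() - control.mean()) / pooled_sd

rng = np.random.default_rng(0)
age_treated = rng.normal(52, 8, 200)   # treated group skews older
age_control = rng.normal(48, 8, 300)

# |SMD| > 0.1 is a common rule of thumb for meaningful imbalance.
print(round(smd(age_treated, age_control), 2))
```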

fit_model(index: int, X: DataFrame, y: Series, model_type: str, balance: bool, max_iter: int = 100) → Dict[str, Any][source]

Fits a single propensity score model.

Internal helper method that calls pysmatch.modeling.fit_model. This is typically used within fit_scores, especially when fitting multiple models for balancing or ensembling.

Parameters:
  • index (int) – An identifier for the model (e.g., its index in an ensemble).

  • X (pd.DataFrame) – The feature matrix (covariates).

  • y (pd.Series) – The target variable (treatment indicator).

  • model_type (str) – The type of model to fit (e.g., ‘linear’, ‘rf’, ‘gb’).

  • balance (bool) – Whether the fitting process should aim to balance covariates (e.g., by undersampling the majority class or using class weights).

  • max_iter (int, optional) – Maximum iterations for iterative solvers (like logistic regression). Defaults to 100.

Returns:

A dictionary containing the fitted model object under the key ‘model’ and its accuracy under the key ‘accuracy’.

Return type:

Dict[str, Any]

fit_scores(balance: bool = True, nmodels: int | None = None, n_jobs: int = 1, model_type: str = 'linear', max_iter: int = 100, use_optuna: bool = False, n_trials: int = 10) → None[source]

Fits propensity score model(s) to estimate scores.

Supports single model fitting, ensemble fitting for balance (by undersampling the majority class across multiple models), or hyperparameter tuning using Optuna.

Parameters:
  • balance (bool, optional) – If True, attempts to create balanced models. If nmodels is greater than 1, this typically involves fitting multiple models on undersampled majority data. If nmodels is 1, it might involve using class weights or other balancing techniques within the single model fit. Defaults to True.

  • nmodels (Optional[int], optional) – The number of models to fit in an ensemble. If None and balance is True, it’s estimated based on the majority/minority class ratio. If None and balance is False, it defaults to 1. Ignored if use_optuna is True. Defaults to None.

  • n_jobs (int, optional) – The number of parallel jobs to run when fitting multiple models (nmodels > 1). Uses ThreadPool. Defaults to 1.

  • model_type (str, optional) – The type of classification model to use for propensity score estimation (e.g., ‘linear’ for Logistic Regression, ‘rf’ for Random Forest, ‘gb’ for Gradient Boosting). Passed to fit_model. Defaults to ‘linear’.

  • max_iter (int, optional) – Maximum iterations for the solver in iterative models like Logistic Regression. Passed to fit_model. Defaults to 100.

  • use_optuna (bool, optional) – If True, uses Optuna for hyperparameter tuning instead of fitting nmodels. nmodels is ignored. Defaults to False.

  • n_trials (int, optional) – The number of trials for Optuna optimization if use_optuna is True. Defaults to 10.

Returns:

Models and accuracies are stored in self.models and self.model_accuracy.

Propensity scores are calculated and stored later via predict_scores().

Return type:

None
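
The balanced-ensemble idea described above can be sketched with plain pandas: each ensemble member sees every minority-class row plus an equally sized random draw from the majority class. The column names and data are illustrative, not the library's internals:

```python
import pandas as pd

df = pd.DataFrame({
    "treated": [1] * 20 + [0] * 100,   # 20 treated vs. 100 controls
    "x": range(120),
})

minority = df[df["treated"] == 1]
majority = df[df["treated"] == 0]

# One balanced training set per ensemble member; each member would then
# fit its own propensity model, and the scores would later be averaged.
nmodels = 5
subsamples = [
    pd.concat([minority, majority.sample(n=len(minority), random_state=i)])
    for i in range(nmodels)
]
```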

match(threshold: float = 0.001, nmatches: int = 1, method: str = 'min', replacement: bool = False) → None[source]

Performs matching based on estimated propensity scores.

Parameters:
  • threshold (float, optional) – The maximum allowable propensity-score distance (caliper) between a treated unit and a control unit for them to be matched. Defaults to 0.001.

  • nmatches (int, optional) – The number of control units to match to each treated unit. Defaults to 1.

  • method (str, optional) – The matching algorithm to use (e.g., “min”, “nn”, “radius”). Defaults to “min”.

  • replacement (bool, optional) – Whether control units can be matched multiple times (matching with replacement). Defaults to False.

Returns:

None. The matched treated and control units are stored in self.matched_data.

Return type:

None

Raises:
  • RuntimeError – If propensity scores have not been estimated yet.

  • ValueError – If an invalid matching method is specified.
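
A minimal sketch of score-based greedy matching under a caliper, without replacement. This illustrates the idea only and is not pysmatch's internal algorithm:

```python
import numpy as np

def greedy_match(treated_scores, control_scores, threshold=0.001):
    """Greedily pair each treated unit with its closest unused control."""
    pairs = []
    used = set()
    for i, t in enumerate(treated_scores):
        dists = np.abs(np.asarray(control_scores, dtype=float) - t)
        dists[list(used)] = np.inf          # controls without replacement
        j = int(np.argmin(dists))
        if dists[j] <= threshold:           # enforce the caliper
            pairs.append((i, j))
            used.add(j)
    return pairs

pairs = greedy_match([0.30, 0.50], [0.2995, 0.31, 0.5008], threshold=0.001)
```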

plot_matched_scores() → None[source]

Plots the distribution of propensity scores after matching.

Visualizes the score overlap between test and control groups specifically within the self.matched_data. Requires match() to have been run.

plot_scores() → None[source]

Plots the distribution of propensity scores before matching.

Visualizes the overlap of scores between the test (treated) and control groups in the original (unmatched) data. Requires scores to be calculated first.

predict_scores() → None[source]

Predicts propensity scores using the fitted model(s).

If multiple models were fitted (ensemble), the scores are averaged across models. The predicted scores are added to the self.data DataFrame as a ‘scores’ column.

Returns:

Scores are stored in self.data[‘scores’].

Return type:

None

Raises:

RuntimeError – If fit_scores() has not been called successfully yet (no models exist).
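
When several models have been fitted, averaging their predicted probabilities is straightforward; a sketch with stand-in per-model score arrays:

```python
import numpy as np
import pandas as pd

# Stand-in predicted probabilities from three ensemble members,
# one column per observation.
per_model_scores = np.array([
    [0.20, 0.80, 0.55],
    [0.30, 0.70, 0.45],
    [0.25, 0.75, 0.50],
])

data = pd.DataFrame({"treated": [0, 1, 1]})
# Average across models, as predict_scores() does for an ensemble.
data["scores"] = per_model_scores.mean(axis=0)
```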

prep_prop_test(data: DataFrame, var: str) → list[source]

Prepares a contingency table for the Chi-Square test.

Creates a cross-tabulation of the specified variable (var) against the treatment variable (self.yvar) from the given DataFrame. Handles potential missing categories by ensuring both treatment groups (0 and 1) are present as columns, filled with 0 counts if necessary.

Parameters:
  • data (pd.DataFrame) – The DataFrame (either original or matched) to use.

  • var (str) – The categorical variable name.

Returns:

A list-of-lists representation of the contingency table, suitable for scipy.stats.chi2_contingency. Returns None if the input data is empty or the variable is missing.

Return type:

Optional[list]
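
The table preparation can be sketched with pd.crosstab; reindexing the columns guarantees both treatment groups are present even when one lacks a category. Column and variable names here are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "treated": [0, 0, 0, 1, 1],
    "region":  ["north", "south", "north", "north", "north"],
})

# Cross-tabulate the variable against treatment; fill any missing
# treatment column with zeros so the table is always two columns wide.
table = (
    pd.crosstab(df["region"], df["treated"])
    .reindex(columns=[0, 1], fill_value=0)
)
contingency = table.values.tolist()  # suitable for scipy.stats.chi2_contingency
```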

prop_test(col: str) → Dict[str, Any] | None[source]

Performs Chi-Square tests for a categorical variable before and after matching.

Compares the distribution of a categorical variable (col) between the test and control groups in both the original (self.data) and matched (self.matched_data) datasets using the Chi-Square test of independence.

Parameters:

col (str) – The name of the categorical column to test. The method checks if the column is likely categorical (not continuous) and not in self.exclude.

Returns:

A dictionary containing the variable name (‘var’), the p-value from the Chi-Square test before matching (‘before’), and the p-value after matching (‘after’). Returns None if the variable is continuous, excluded, or if tests fail.

Return type:

Optional[Dict[str, Any]]

record_frequency() → DataFrame[source]

Calculates the frequency of each original record in the matched dataset.

Useful when matching with replacement, as control units might appear multiple times. Requires match() to have been run successfully. Original records are identified by the ‘record_id’ column of the matched data, the same identifier used by assign_weight_vector().

Returns:

A DataFrame with columns ‘record_id’ and ‘n_records’ (the frequency count), or an empty DataFrame if matching hasn’t been done.

Return type:

pd.DataFrame
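
The frequency count itself is a short pandas expression over the ‘record_id’ column; a sketch with illustrative matched data:

```python
import pandas as pd

matched = pd.DataFrame({"record_id": [1, 101, 2, 101, 3, 101]})

# One row per original record with its appearance count, mirroring the
# ‘record_id’ / ‘n_records’ columns described above.
freq = (
    matched["record_id"]
    .value_counts()
    .rename_axis("record_id")
    .reset_index(name="n_records")
)
```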

tune_threshold(method: str, nmatches: int = 1, rng: ndarray | None = None) → None[source]

Evaluates matching retention across a range of threshold values.

Performs matching repeatedly for different threshold values and plots the proportion of the minority group retained at each threshold. This helps in selecting an appropriate threshold/caliper value.

Parameters:
  • method (str) – The matching method to use (e.g., ‘min’, ‘nn’, ‘radius’) for each threshold evaluation. Passed to matching.tune_threshold.

  • nmatches (int, optional) – The number of matches to seek (relevant for ‘nn’/’min’). Defaults to 1.

  • rng (Optional[np.ndarray], optional) – A NumPy array specifying the sequence of threshold values to test. If None, a default range (0 to 0.001 by 0.0001) is used. Defaults to None.
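
The retention curve can be sketched directly: for each threshold, count the fraction of treated (minority) units that have at least one control within that distance. This simplifies the actual matching step and uses illustrative scores:

```python
import numpy as np

# Default-style threshold range: 0 to 0.001 in steps of 0.0001.
rng_vals = np.arange(0.0, 0.0011, 0.0001)

treated_scores = np.array([0.30, 0.50, 0.70])
control_scores = np.array([0.3002, 0.5009, 0.9])

# Proportion of treated units with any control inside each threshold.
retention = [
    float(np.mean([
        np.min(np.abs(control_scores - t)) <= thr for t in treated_scores
    ]))
    for thr in rng_vals
]
# Retention can only grow as the threshold loosens; plotting it against
# rng_vals helps pick a caliper that keeps enough of the minority group.
```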