Reproducing Kernel Hilbert Space
NPIV
This module provides implementations of RKHS Instrumental Variable (IV) estimators.
- Classes:
_BaseRKHSIV: Base class for RKHS IV methods.
RKHSIV: RKHS IV estimator.
RKHSIVCV: RKHS IV estimator with cross-validation.
RKHSIVL2: RKHS IV estimator with L2 regularization.
RKHSIVL2CV: RKHS IV estimator with L2 regularization and cross-validation.
ApproxRKHSIV: Approximate RKHS IV estimator using kernel approximations.
ApproxRKHSIVCV: Approximate RKHS IV estimator with cross-validation using kernel approximations.
- class rkhsiv.ApproxRKHSIV(kernel_approx='nystrom', n_components=10, kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scale='auto')[source]
Bases: _BaseRKHSIV
Approximate RKHS IV estimator using kernel approximations.
This class implements an approximate RKHS IV estimator using kernel approximations.
- Parameters
kernel_approx (str) – Kernel approximation method (‘nystrom’ or ‘rbfsampler’).
n_components (int) – Number of approximation components.
kernel (str or callable) – Kernel function or string identifier.
degree (int) – Degree for polynomial kernels.
coef0 (float) – Zero coefficient for polynomial kernels.
alpha_scale (str or float) – Scale of the regularization parameter.
kernel_params (dict) – Additional parameters for the kernel.
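The 'rbfsampler' option name suggests a random-Fourier-feature approximation of the RBF kernel. As a rough, library-independent sketch of that idea, the code below maps inputs to `n_components` random cosine features whose inner product approximates exp(-gamma * ||x - y||^2). The function names and the seed are illustrative assumptions, not part of `rkhsiv`, and the module's own approximation may differ in detail.

```python
import math
import random


def make_rff_map(n_features, n_components, gamma, seed=0):
    """Return a feature map z with z(x) . z(y) ~= exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2 * gamma), offsets from U(0, 2 * pi).
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(n_features)]
               for _ in range(n_components)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    scale = math.sqrt(2.0 / n_components)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel value, used here only for comparison."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


feature_map = make_rff_map(n_features=2, n_components=2000, gamma=2.0)
x, y = [0.1, 0.3], [0.4, -0.2]
approx = dot(feature_map(x), feature_map(y))
exact = rbf_kernel(x, y, gamma=2.0)
```

The approximation error shrinks roughly like one over the square root of `n_components`, so a small value such as the default of 10 trades accuracy for speed.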
- _get_new_approx_instance()[source]
Create a new kernel approximation instance.
- Returns
Kernel approximation instance.
- Return type
- fit(Z, T, Y)[source]
Fit the approximate RKHS IV estimator.
- Parameters
Z (array-like) – Instrumental variables.
T (array-like) – Treatments.
Y (array-like) – Outcomes.
- Returns
Fitted estimator.
- Return type
self
- class rkhsiv.ApproxRKHSIVCV(kernel_approx='nystrom', n_components=10, kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]
Bases: ApproxRKHSIV
Approximate RKHS IV estimator with cross-validation using kernel approximations.
This class implements an approximate RKHS IV estimator with cross-validation using kernel approximations.
- Parameters
kernel_approx (str) – Kernel approximation method (‘nystrom’ or ‘rbfsampler’).
n_components (int) – Number of approximation components.
kernel (str or callable) – Kernel function or string identifier.
degree (int) – Degree for polynomial kernels.
coef0 (float) – Zero coefficient for polynomial kernels.
alpha_scales (str or array-like) – Scale of the regularization parameter.
n_alphas (int) – Number of alpha scales to try.
cv (int) – Number of folds for cross-validation.
kernel_params (dict) – Additional parameters for the kernel.
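The cross-validated variants take `n_alphas` candidate regularization scales and `cv` folds. How 'auto' chooses the candidates is not documented here, so the sketch below assumes a geometric grid between two hypothetical endpoints and shows only the generic pattern: score every candidate on each held-out fold and keep the one with the best average score. `geometric_grid`, `kfold_indices`, `select_alpha_scale`, and the toy scoring function are illustrative names, not part of `rkhsiv`.

```python
def geometric_grid(low, high, n_alphas):
    """Return n_alphas values spaced evenly on a log scale from low to high."""
    if n_alphas == 1:
        return [low]
    ratio = (high / low) ** (1.0 / (n_alphas - 1))
    return [low * ratio ** i for i in range(n_alphas)]


def kfold_indices(n_samples, cv):
    """Split range(n_samples) into cv contiguous (train, test) index pairs."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // cv + (1 if i < n_samples % cv else 0)
                  for i in range(cv)]
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


def select_alpha_scale(alpha_scales, score_fold, n_samples, cv):
    """Return the candidate with the highest mean held-out score."""
    folds = kfold_indices(n_samples, cv)

    def mean_score(alpha):
        return sum(score_fold(alpha, tr, te) for tr, te in folds) / len(folds)

    return max(alpha_scales, key=mean_score)


# Hypothetical endpoints; n_alphas=30 and cv=6 are the defaults in the signature.
alpha_scales = geometric_grid(1e-3, 1.0, n_alphas=30)


def toy_score(alpha, train, test):
    """Stand-in for 'fit on train, evaluate on test'; peaks near alpha = 0.1."""
    return -(alpha - 0.1) ** 2


best = select_alpha_scale(alpha_scales, toy_score, n_samples=60, cv=6)
```

In a real run, `toy_score` would be replaced by fitting the estimator on the training indices and evaluating a validation criterion on the test indices.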
- class rkhsiv.RKHSIV(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', alpha_scale='auto', kernel_params=None)[source]
Bases: _BaseRKHSIV
RKHS IV estimator.
This class implements an RKHS IV estimator.
- fit(Z, T, Y)[source]
Fit the RKHS IV estimator.
- Parameters
Z (array-like) – Instrumental variables.
T (array-like) – Treatments.
Y (array-like) – Outcomes.
- Returns
Fitted estimator.
- Return type
self
- class rkhsiv.RKHSIVCV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]
Bases: RKHSIV
RKHS IV estimator with cross-validation.
This class implements an RKHS IV estimator with cross-validation.
- Parameters
kernel (str or callable) – Kernel function or string identifier.
degree (int) – Degree for polynomial kernels.
coef0 (float) – Zero coefficient for polynomial kernels.
alpha_scales (str or array-like) – Scale of the regularization parameter.
n_alphas (int) – Number of alpha scales to try.
cv (int) – Number of folds for cross-validation.
kernel_params (dict) – Additional parameters for the kernel.
- class rkhsiv.RKHSIVL2(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', kernel_params=None)[source]
Bases: _BaseRKHSIV
RKHS IV estimator with L2 regularization.
This class implements an RKHS IV estimator with L2 regularization.
- class rkhsiv.RKHSIVL2CV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]
Bases: RKHSIVL2
RKHS IV estimator with L2 regularization and cross-validation.
This class implements an RKHS IV estimator with L2 regularization and cross-validation.
- Parameters
kernel (str or callable) – Kernel function or string identifier.
degree (int) – Degree for polynomial kernels.
coef0 (float) – Zero coefficient for polynomial kernels.
alpha_scales (str or array-like) – Scale of the regularization parameter.
n_alphas (int) – Number of alpha scales to try.
cv (int) – Number of folds for cross-validation.
kernel_params (dict) – Additional parameters for the kernel.
- class rkhsiv._BaseRKHSIV(*args, **kwargs)[source]
Bases: object
Base class for RKHS IV methods.
This class provides common functionality for RKHS IV estimators.
Nested NPIV
This module provides implementations of nested NPIV estimators for RKHS function classes.
- Classes:
_BaseRKHS2IV: Base class for nested RKHS IV methods.
RKHS2IV: Nested RKHS IV estimator.
RKHS2IVCV: Nested RKHS IV estimator with cross-validation.
RKHS2IVL2: Nested RKHS IV estimator with L2 regularization.
RKHS2IVL2CV: Nested RKHS IV estimator with L2 regularization and cross-validation.
- class rkhs2iv.RKHS2IV(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', kernel_params=None)[source]
Bases: _BaseRKHS2IV
Nested RKHS IV estimator.
This class implements a nested RKHS IV estimator.
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the nested RKHS IV estimator.
- Parameters
A (array-like) – Instrumental variables for the first stage.
B (array-like) – Treatments for the first stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Treatments for the second stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.
subset_ind2 (array-like, optional) – Indices for the second subset. Optional.
- Returns
Fitted estimator.
- Return type
self
- class rkhs2iv.RKHS2IVCV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]
Bases: RKHS2IV
Nested RKHS IV estimator with cross-validation.
This class implements a nested RKHS IV estimator with cross-validation.
- Parameters
kernel (str or callable) – Kernel function or string identifier.
degree (int) – Degree for polynomial kernels.
coef0 (float) – Zero coefficient for polynomial kernels.
alpha_scales (str or array-like) – Scale of the regularization parameter.
n_alphas (int) – Number of alpha scales to try.
cv (int) – Number of folds for cross-validation.
kernel_params (dict) – Additional parameters for the kernel.
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the nested RKHS IV estimator with cross-validation.
- Parameters
A (array-like) – Instrumental variables for the first stage.
B (array-like) – Treatments for the first stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Treatments for the second stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.
subset_ind2 (array-like, optional) – Indices for the second subset. Optional.
- Returns
Fitted estimator.
- Return type
self
- class rkhs2iv.RKHS2IVL2(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', kernel_params=None)[source]
Bases: _BaseRKHS2IV
Nested RKHS IV estimator with L2 regularization.
This class implements a nested RKHS IV estimator with L2 regularization.
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the nested RKHS IV estimator with L2 regularization.
- Parameters
A (array-like) – Instrumental variables for the first stage.
B (array-like) – Treatments for the first stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Treatments for the second stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.
subset_ind2 (array-like, optional) – Indices for the second subset. Optional.
- Returns
Fitted estimator.
- Return type
self
- class rkhs2iv.RKHS2IVL2CV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]
Bases: RKHS2IVL2
Nested RKHS IV estimator with L2 regularization and cross-validation.
This class implements a nested RKHS IV estimator with L2 regularization and cross-validation.
- Parameters
kernel (str or callable) – Kernel function or string identifier.
degree (int) – Degree for polynomial kernels.
coef0 (float) – Zero coefficient for polynomial kernels.
alpha_scales (str or array-like) – Scale of the regularization parameter.
n_alphas (int) – Number of alpha scales to try.
cv (int) – Number of folds for cross-validation.
kernel_params (dict) – Additional parameters for the kernel.
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the nested RKHS IV estimator with L2 regularization and cross-validation.
- Parameters
A (array-like) – Instrumental variables for the first stage.
B (array-like) – Treatments for the first stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Treatments for the second stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.
subset_ind2 (array-like, optional) – Indices for the second subset. Optional.
- Returns
Fitted estimator.
- Return type
self
- class rkhs2iv._BaseRKHS2IV(*args, **kwargs)[source]
Bases: object
Base class for nested RKHS IV methods.
This class provides common functionality for nested RKHS IV estimators.
Random Forest
NPIV
This module provides implementations of ensemble instrumental variable (IV) estimators using RandomForest models.
- Classes:
EnsembleIV: Implements an ensemble learning IV method with adversarial and learner components.
EnsembleIVStar: Similar to EnsembleIV but with a different method for updating the test predictions.
EnsembleIVL2: An extension of EnsembleIV with L2 regularization and optional cross-validation for regularization parameter selection.
- Functions:
_mysign: A helper function that returns 1 if the input is non-negative and -1 otherwise.
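A sign helper of this kind is commonly written as a one-liner. The sketch below is an assumption about its form, not a copy of the `ensemble` module's code: it maps non-negative inputs to 1 and negative inputs to -1, so it differs from a conventional sign function only at zero.

```python
def mysign(x):
    """Return 1 for x >= 0 and -1 otherwise (zero counts as positive)."""
    # A boolean is 0 or 1 in arithmetic, so 2 * bool - 1 is -1 or 1.
    return 2 * (x >= 0) - 1
```

Treating zero as positive keeps the output strictly in {-1, 1}, which is convenient when the result is used as a binary label.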
- class ensemble.EnsembleIV(adversary='auto', learner='auto', max_abs_value=4, n_iter=100)[source]
Bases: object
Implements an ensemble learning IV method with adversarial and learner components.
- Parameters
adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.
learner (str or estimator) – Learner model. If ‘auto’, a default RandomForestClassifier is used.
max_abs_value (float) – Maximum absolute value for the predictions.
n_iter (int) – Number of iterations for the ensemble.
- class ensemble.EnsembleIVL2(adversary='auto', learner='auto', n_iter=100, delta_scale='auto', delta_exp='auto', CV=False, alpha_scales='auto', n_alphas=30, n_folds=5)[source]
Bases: object
An extension of EnsembleIV with L2 regularization and optional cross-validation to select the best regularization parameter.
- Parameters
adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.
learner (str or estimator) – Learner model. If ‘auto’, a default RandomForestRegressor is used.
n_iter (int) – Number of iterations for the ensemble.
delta_scale (str or float) – Scale factor for the critical radius delta. Default is ‘auto’.
delta_exp (str or float) – Exponent for the critical radius delta. Default is ‘auto’.
CV (bool) – Whether to perform cross-validation to select the best alpha value.
alpha_scales (str or list) – Scales for alpha in cross-validation. Default is ‘auto’.
n_alphas (int) – Number of alpha values to test in cross-validation.
n_folds (int) – Number of folds for cross-validation.
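`delta_scale` and `delta_exp` are described as the scale factor and exponent of the critical radius delta. One natural reading, assumed here rather than confirmed by this documentation, is that delta shrinks polynomially with the sample size n as delta = delta_scale / n^delta_exp. The numeric values below are placeholders; what 'auto' resolves to is not stated here.

```python
def critical_radius(n_samples, delta_scale, delta_exp):
    """Assumed form of the critical radius: delta_scale / n_samples ** delta_exp."""
    return delta_scale / (n_samples ** delta_exp)


# Placeholder values chosen for illustration only.
delta_small_sample = critical_radius(100, delta_scale=5.0, delta_exp=0.5)
delta_large_sample = critical_radius(10_000, delta_scale=5.0, delta_exp=0.5)
```

Under this reading, a larger `delta_exp` makes the radius, and with it the strength of any delta-dependent regularization, decay faster as more data arrives.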
- _cross_validate_alpha(Z, T, Y)[source]
Performs cross-validation to select the best alpha value.
- Parameters
Z (array-like) – Instrumental variables.
T (array-like) – Treatment variables.
Y (array-like) – Outcome variables.
- Returns
Best alpha value.
- Return type
- class ensemble.EnsembleIVStar(adversary='auto', learner='auto', max_abs_value=4, n_iter=100)[source]
Bases: object
Similar to EnsembleIV but with a different method for updating the test predictions using a linear combination approach.
- Parameters
adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.
learner (str or estimator) – Learner model. If ‘auto’, a default RandomForestClassifier is used.
max_abs_value (float) – Maximum absolute value for the predictions.
n_iter (int) – Number of iterations for the ensemble.
Nested NPIV
This module provides implementations of nested nonparametric instrumental variable (NPIV) estimators using ensemble RandomForest models.
- Classes:
Ensemble2IV: Implements a nested ensemble learning IV method with two adversaries and two learners.
Ensemble2IVL2: An extension of Ensemble2IV with L2 regularization and optional cross-validation for regularization parameter selection.
- Functions:
_mysign: A helper function that returns 1 if the input is non-negative and -1 otherwise.
- class ensemble2.Ensemble2IV(adversary='auto', learnerg='auto', learnerh='auto', max_abs_value=4, n_iter=100, n_burn_in=10)[source]
Bases: object
Implements a nested ensemble learning IV method with two adversaries and two learners.
- Parameters
adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.
learnerg (str or estimator) – Learner model for g. If ‘auto’, a default RandomForestClassifier is used.
learnerh (str or estimator) – Learner model for h. If ‘auto’, a default RandomForestClassifier is used.
max_abs_value (float) – Maximum absolute value for the predictions.
n_iter (int) – Number of iterations for the ensemble.
n_burn_in (int) – Number of burn-in iterations.
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fits the nested ensemble IV model to the provided data.
- Parameters
A (array-like) – Instrumental variables for the first stage.
B (array-like) – Instrumental variables for the second stage.
C (array-like) – Treatment variables for the first stage.
D (array-like) – Treatment variables for the second stage.
Y (array-like) – Outcome variables.
W (array-like, optional) – Weights for the observations.
subsetted (bool) – If True, use subsets of data as indicated by subset_ind1 and subset_ind2.
subset_ind1 (array-like) – Indices for the first subset.
subset_ind2 (array-like) – Indices for the second subset.
- Returns
Fitted nested ensemble IV model.
- Return type
self
- predict(B, *args)[source]
Predicts outcomes for new data using the fitted nested ensemble IV model.
- Parameters
B (array-like) – Instrumental variables for the second stage.
args (tuple) – Optional second argument for instrumental variables of the first stage.
- Returns
Predicted outcomes for the second stage. If a second argument is provided, returns a tuple with predictions for both stages.
- Return type
array
- class ensemble2.Ensemble2IVL2(adversary='auto', learnerg='auto', learnerh='auto', n_iter=100, n_burn_in=10, delta_scale='auto', delta_exp='auto', CV=False, alpha_scales='auto', n_alphas=30, n_folds=5)[source]
Bases: object
An extension of Ensemble2IV with L2 regularization and optional cross-validation to select the best regularization parameter.
- Parameters
adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.
learnerg (str or estimator) – Learner model for g. If ‘auto’, a default RandomForestRegressor is used.
learnerh (str or estimator) – Learner model for h. If ‘auto’, a default RandomForestRegressor is used.
n_iter (int) – Number of iterations for the ensemble.
n_burn_in (int) – Number of burn-in iterations.
delta_scale (str or float) – Scale factor for the critical radius delta. Default is ‘auto’.
delta_exp (str or float) – Exponent for the critical radius delta. Default is ‘auto’.
CV (bool) – Whether to perform cross-validation to select the best alpha value.
alpha_scales (str or list) – Scales for alpha in cross-validation. Default is ‘auto’.
n_alphas (int) – Number of alpha values to test in cross-validation.
n_folds (int) – Number of folds for cross-validation.
- _cross_validate_alpha(A, B, C, D, Y, W)[source]
Performs cross-validation to select the best alpha value.
- Parameters
A (array-like) – Instrumental variables for the first stage.
B (array-like) – Instrumental variables for the second stage.
C (array-like) – Treatment variables for the first stage.
D (array-like) – Treatment variables for the second stage.
Y (array-like) – Outcome variables.
W (array-like) – Weights for the observations.
- Returns
Best alpha value.
- Return type
- fit(A, B, C, D, Y, W=None, alpha=1.0, cross_validating=False, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fits the nested ensemble IV model with L2 regularization to the provided data.
- Parameters
A (array-like) – Instrumental variables for the first stage.
B (array-like) – Instrumental variables for the second stage.
C (array-like) – Treatment variables for the first stage.
D (array-like) – Treatment variables for the second stage.
Y (array-like) – Outcome variables.
W (array-like, optional) – Weights for the observations.
alpha (float) – Regularization parameter.
cross_validating (bool) – Whether the function is called during cross-validation.
subsetted (bool) – If True, use subsets of data as indicated by subset_ind1 and subset_ind2.
subset_ind1 (array-like) – Indices for the first subset.
subset_ind2 (array-like) – Indices for the second subset.
- Returns
Fitted nested ensemble IV model.
- Return type
self
- predict(B, *args)[source]
Predicts outcomes for new data using the fitted nested ensemble IV model with L2 regularization.
- Parameters
B (array-like) – Instrumental variables for the second stage.
args (tuple) – Optional second argument for instrumental variables of the first stage.
- Returns
Predicted outcomes for the second stage. If a second argument is provided, returns a tuple with predictions for both stages.
- Return type
array
Neural Networks
NPIV
This module provides implementations of adversarial generalized method of moments (AGMM) estimators using neural networks.
- Classes:
_BaseAGMM: Base class for AGMM models.
_BaseSupLossAGMM: Base class for AGMM models with supervised loss.
AGMM: Adversarial Generalized Method of Moments estimator.
KernelLayerMMDGMM: AGMM with kernel layer using Maximum Mean Discrepancy.
CentroidMMDGMM: AGMM with centroid-based Maximum Mean Discrepancy.
KernelLossAGMM: AGMM with kernel loss.
MMDGMM: AGMM with Maximum Mean Discrepancy.
- class agmm.AGMM(learner, adversary)[source]
Bases: _BaseSupLossAGMM
Adversarial Generalized Method of Moments estimator.
- Parameters
learner – a pytorch neural net module for the learner.
adversary – a pytorch neural net module for the adversary.
- class agmm.CentroidMMDGMM(learner, adversary_g, kernel, centers, sigma)[source]
Bases: _BaseSupLossAGMM
AGMM with centroid-based Maximum Mean Discrepancy.
- Parameters
learner – a pytorch neural net module for the learner.
adversary_g – a pytorch neural net module for the g function of the adversary.
kernel – the kernel function.
centers – numpy array containing the initial value of the centers in the Z space.
sigma – float corresponding to the precision of the kernel.
- class agmm.KernelLayerMMDGMM(learner, adversary_g, g_features, n_centers, kernel, centers=None, sigmas=None, trainable=True)[source]
Bases: _BaseSupLossAGMM
AGMM with kernel layer using Maximum Mean Discrepancy.
- Parameters
learner – a pytorch neural net module for the learner.
adversary_g – a pytorch neural net module for the g function of the adversary.
g_features – the number of output features of g.
n_centers – the number of centers to use in the kernel layer.
kernel – the kernel function.
centers – numpy array containing the initial value of the centers in the g(Z) space.
sigmas – numpy array containing the initial value of the sigma for each center.
trainable – whether to train the centers and the sigmas.
- class agmm.KernelLossAGMM(learner, adversary_g, kernel, sigma)[source]
Bases: _BaseAGMM
AGMM with kernel loss.
- Parameters
learner – a pytorch neural net module for the learner.
adversary_g – a pytorch neural net module for the g function of the adversary.
kernel – the kernel function.
sigma – float corresponding to the precision of the kernel.
- fit(Z, T, Y, learner_l2=0.001, adversary_l2=0.0001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, ols_weight=0.0, warm_start=False, logger=None, model_dir='.', device=None, verbose=0)[source]
- Parameters
Z – instruments.
T – treatments.
Y – outcome.
learner_l2 – L2 regularization of the parameters of the learner.
adversary_l2 – L2 regularization of the parameters of the adversary.
learner_lr – learning rate of the Adam optimizer for the learner.
adversary_lr – learning rate of the Adam optimizer for the adversary.
n_epochs – number of passes over the data.
bs – batch size.
train_learner_every – number of adversary training iterations between learner updates.
ols_weight – weight on the OLS (square loss) objective.
warm_start – whether to reset the weights or not.
logger – a function that takes (learner, adversary, epoch, writer) as input and is called after every epoch; intended for logging the state of the learning.
model_dir – folder where the learned models are stored after every epoch.
- class agmm.MMDGMM(learner, adversary_g, n_samples, kernel, sigma)[source]
Bases: _BaseAGMM
AGMM with Maximum Mean Discrepancy.
- Parameters
learner – a pytorch neural net module for the learner.
adversary_g – a pytorch neural net module for the g function of the adversary.
n_samples – number of samples.
kernel – the kernel function.
sigma – float corresponding to the precision of the kernel.
- fit(Z, T, Y, learner_l2=0.001, adversary_l2=0.0001, adversary_norm_reg=0.001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs1=100, bs2=100, bs3=100, train_learner_every=1, train_adversary_every=1, ols_weight=0.0, warm_start=False, logger=None, model_dir='.', device=None, verbose=0)[source]
- Parameters
Z – instruments.
T – treatments.
Y – outcome.
learner_l2 – L2 regularization of the parameters of the learner.
adversary_l2 – L2 regularization of the parameters of the adversary.
learner_lr – learning rate of the Adam optimizer for the learner.
adversary_lr – learning rate of the Adam optimizer for the adversary.
n_epochs – number of passes over the data.
bs – batch size.
train_learner_every – number of adversary training iterations between learner updates.
ols_weight – weight on the OLS (square loss) objective.
warm_start – whether to reset the weights or not.
logger – a function that takes (learner, adversary, epoch, writer) as input and is called after every epoch; intended for logging the state of the learning.
model_dir – folder where the learned models are stored after every epoch.
- class agmm._BaseAGMM[source]
Bases: object
Base class for AGMM models.
- _pretrain(Z, T, Y, learner_l2, adversary_l2, adversary_norm_reg, learner_lr, adversary_lr, n_epochs, bs, train_learner_every, train_adversary_every, warm_start, logger, model_dir, device, verbose, add_sample_inds=False)[source]
Prepares the variables required to begin training.
- predict(T, model='avg', burn_in=0, alpha=None)[source]
- Parameters
T – treatments.
model – one of ('avg', 'final'): whether to use an average of the models across epochs or the final model.
burn_in – number of initial epochs to discard when averaging.
alpha – if a float rather than None, the alpha/2 and 1 - alpha/2 percentiles of the predictions across epochs are also returned (a proxy for a confidence interval).
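The interaction of `model`, `burn_in`, and `alpha` can be sketched on stored per-epoch predictions. The code below is an illustration under stated assumptions, not the `agmm` implementation: it assumes the predictions of every epoch are available as lists, uses linear interpolation for percentiles, and returns `(avg, lower, upper)` in that order when `alpha` is given. `aggregate_predictions` and `percentile` are hypothetical names.

```python
import math


def percentile(values, q):
    """Percentile q in [0, 100], interpolating linearly between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def aggregate_predictions(epoch_preds, model="avg", burn_in=0, alpha=None):
    """Combine predictions, where epoch_preds[e][i] is point i after epoch e."""
    if model == "final":
        return list(epoch_preds[-1])
    kept = epoch_preds[burn_in:]  # discard the first burn_in epochs
    n_points = len(kept[0])
    columns = [[preds[i] for preds in kept] for i in range(n_points)]
    avg = [sum(col) / len(col) for col in columns]
    if alpha is None:
        return avg
    lower = [percentile(col, 100 * alpha / 2) for col in columns]
    upper = [percentile(col, 100 * (1 - alpha / 2)) for col in columns]
    return avg, lower, upper


# Four epochs, two evaluation points; the first epoch is an outlier.
epoch_preds = [[10.0, 0.0], [1.0, 4.0], [2.0, 4.0], [3.0, 4.0]]
avg, lower, upper = aggregate_predictions(epoch_preds, burn_in=1, alpha=0.5)
final = aggregate_predictions(epoch_preds, model="final")
```

Discarding early epochs keeps poorly trained models out of the average, which is why the outlier first epoch has no effect on the result here.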
- class agmm._BaseSupLossAGMM[source]
Bases: _BaseAGMM
Base class for AGMM models with supervised loss.
- fit(Z, T, Y, learner_l2=0.001, adversary_l2=0.0001, adversary_norm_reg=0.001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, ols_weight=0.0, warm_start=False, logger=None, model_dir='.', device=None, verbose=0)[source]
- Parameters
Z – instruments.
T – treatments.
Y – outcome.
learner_l2 – L2 regularization of the parameters of the learner.
adversary_l2 – L2 regularization of the parameters of the adversary.
adversary_norm_reg – adversary norm regularization weight.
learner_lr – learning rate of the Adam optimizer for the learner.
adversary_lr – learning rate of the Adam optimizer for the adversary.
n_epochs – number of passes over the data.
bs – batch size.
train_learner_every – number of adversary training iterations between learner updates.
ols_weight – weight on the OLS (square loss) objective.
warm_start – if False, the network parameters are initialized at the beginning; otherwise training starts from their current weights.
logger – a function that takes (learner, adversary, epoch, writer) as input and is called after every epoch; intended for logging the state of the learning.
model_dir – folder where the learned models are stored after every epoch.
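The `logger` argument is described as a callable that takes (learner, adversary, epoch, writer) and is called after every epoch. A minimal sketch of such a callable follows. The loop that invokes it is a stand-in for the real training loop, the strings stand in for the network modules, and the writer is replaced by a plain list because its actual type is not specified here; `make_logger` and `log_every` are hypothetical names.

```python
def make_logger(log_every=10):
    """Build a callable with the (learner, adversary, epoch, writer) signature."""
    def logger(learner, adversary, epoch, writer):
        # Record a line only every log_every epochs to keep the log short.
        if epoch % log_every == 0:
            writer.append(
                f"epoch {epoch}: learner={learner!r}, adversary={adversary!r}"
            )
    return logger


# Stand-in for the training loop: call the logger once per epoch.
records = []
logger = make_logger(log_every=10)
for epoch in range(30):
    logger("learner-net", "adversary-net", epoch, records)
```

Wrapping the callable in a factory keeps settings such as the logging interval out of the four-argument signature that `fit` expects.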
Nested NPIV
This module provides implementations of joint estimation for nested nonparametric instrumental variables (NPIV) using neural networks.
- Classes:
_BaseAGMM2: Base class for joint estimation of nested NPIV models.
_BaseSupLossAGMM2: Base class for joint estimation of nested NPIV models with supervised loss.
_BaseSupLossAGMM2L2: Base class for joint estimation of nested NPIV models with L2 regularization.
AGMM2L2: Adversarial Generalized Method of Moments estimator for nested NPIV with L2 regularization.
- class agmm2.AGMM2L2(learnerh, learnerg, adversary1, adversary2)[source]
Bases: _BaseSupLossAGMM2L2
Adversarial Generalized Method of Moments estimator for nested NPIV with L2 regularization.
- Parameters
learnerh – torch.nn.Module for second-stage learner
learnerg – torch.nn.Module for first-stage learner
adversary1 – torch.nn.Module for first-stage adversary
adversary2 – torch.nn.Module for second-stage adversary
- class agmm2._BaseAGMM2[source]
Bases: object
Base class for joint estimation of nested NPIV models.
- _pretrain(A, B, C, D, Y, W, learner_l2, adversary_l2, learner_norm_reg, learner_lr, adversary_lr, n_epochs, bs, train_learner_every, train_adversary_every, warm_start, model_dir, device, verbose, add_sample_inds=False, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Prepares the variables required to begin training.
- predict(B, A, model='avg', burn_in=0, alpha=None)[source]
- Parameters
B, A – endogenous variables for the second and first stage, respectively.
model – one of ('avg', 'final') or an integer epoch index.
burn_in – number of initial epochs to discard when averaging.
alpha – confidence interval level (if not None).
- class agmm2._BaseSupLossAGMM2[source]
Bases: _BaseAGMM2
Base class for joint estimation of nested NPIV models with supervised loss.
- fit(A, B, C, D, Y, W=None, learner_l2=0.001, adversary_l2=0.0001, learner_norm_reg=1e-12, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, warm_start=False, model_dir='.', device=None, verbose=0, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit AGMM model with supervised loss.
- Parameters
learner_l2 – L2 penalty on the parameters of the learners.
adversary_l2 – L2 penalty on the parameters of the adversaries.
learner_norm_reg – ridge penalty on the learner outputs.
Other parameters are as in the base class.
- class agmm2._BaseSupLossAGMM2L2[source]
Bases: _BaseAGMM2
Base class for joint estimation of nested NPIV models with L2 regularization on outputs.
- fit(A, B, C, D, Y, W=None, learner_l2=0.001, adversary_l2=0.0001, learner_norm_reg=1e-12, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, warm_start=False, model_dir='.', device=None, verbose=0, subsetted=False, subset_ind1=None, subset_ind2=None)
Fit AGMM model with supervised loss.
- Parameters
learner_l2 – L2 penalty on the parameters of the learners.
adversary_l2 – L2 penalty on the parameters of the adversaries.
learner_norm_reg – ridge penalty on the learner outputs.
Other parameters are as in the base class.
Sparse Linear Function Spaces
NPIV
This module provides implementations of sparse linear NPIV estimators.
Classes
- _SparseLinearAdversarialGMM
Base class for sparse linear adversarial GMM.
- sparse_l1vsl1
Sparse Linear NPIV estimator using \(\ell_1-\ell_1\) optimization.
- sparse_ridge_l1vsl1
Sparse Ridge NPIV estimator using \(\ell_1-\ell_1\) optimization.
- class sparse_l1_l1._SparseLinearAdversarialGMM(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: object
Base class for sparse linear adversarial GMM.
This class implements common functionality for sparse linear models using adversarial GMM.
- property coef
- property intercept
- class sparse_l1_l1.sparse_l1vsl1(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinearAdversarialGMM
Sparse Linear NPIV estimator using \(\ell_1-\ell_1\) optimization.
This class solves the high-dimensional sparse linear problem using \(\ell_1\) relaxations for the minimax optimization problem.
- Parameters
Same as _SparseLinearAdversarialGMM.
- _check_duality_gap(Z, X, Y)[source]
Check the duality gap to monitor convergence.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
Z (array-like) – Instrumental variables.
X (array-like) – Covariates.
Y (array-like) – Outcomes.
- Returns
True if the duality gap is less than the tolerance, otherwise False.
- Return type
bool
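To illustrate how a duality gap can certify convergence, the sketch below uses a stand-in problem rather than the estimator's actual objective, which is not given here: a two-player zero-sum matrix game, min over p and max over q of p^T M q on probability simplices. For a pair of (for example, averaged) strategies, the gap is the maximizer's best-response value against p minus the minimizer's best-response value against q. It is non-negative and equals zero exactly at a saddle point. All names are illustrative; the tolerance 0.01 mirrors the default `tol` in the signature.

```python
def duality_gap(M, p, q):
    """Duality gap of strategies (p, q) in the game min_p max_q p^T M q."""
    n_rows, n_cols = len(M), len(M[0])
    # Best value the maximizing player can reach against p: max_j (p^T M)_j.
    best_max = max(sum(p[i] * M[i][j] for i in range(n_rows))
                   for j in range(n_cols))
    # Best value the minimizing player can reach against q: min_i (M q)_i.
    best_min = min(sum(M[i][j] * q[j] for j in range(n_cols))
                   for i in range(n_rows))
    return best_max - best_min


def has_converged(M, p, q, tol=0.01):
    """True if the duality gap is below the tolerance."""
    return duality_gap(M, p, q) < tol


# Rock-paper-scissors payoffs: the uniform strategies form the saddle point.
M = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
uniform = [1 / 3, 1 / 3, 1 / 3]
pure = [1.0, 0.0, 0.0]

gap_at_saddle = duality_gap(M, uniform, uniform)
gap_at_pure = duality_gap(M, pure, pure)
```

Because the gap upper-bounds how much either player could still gain by deviating, a small gap is a valid stopping certificate even when the true solution is unknown.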
- class sparse_l1_l1.sparse_ridge_l1vsl1(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinearAdversarialGMM
Sparse Ridge NPIV estimator using \(\ell_1-\ell_1\) optimization.
This class solves the high-dimensional sparse ridge problem using \(\ell_1\) relaxations for the minimax optimization problem.
- Parameters
Same as _SparseLinearAdversarialGMM.
- _check_duality_gap(Z, X, Y)[source]
Check the duality gap to monitor convergence.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
Z (array-like) – Instrumental variables.
X (array-like) – Covariates.
Y (array-like) – Outcomes.
- Returns
True if the duality gap is less than the tolerance, otherwise False.
- Return type
bool
Nested NPIV
This module provides implementations of sparse linear NPIV estimators with L1 norm regularization for nested NPIV.
Classes:
- _SparseLinear2AdversarialGMM
Base class for sparse linear adversarial GMM for nested NPIV.
- sparse2_l1vsl1
Sparse Linear NPIV estimator using \(\ell_1-\ell_1\) optimization for nested NPIV.
- sparse2_ridge_l1vsl1
Sparse Ridge NPIV estimator using \(\ell_1-\ell_1\) optimization for nested NPIV.
- class sparse2_l1_l1._SparseLinear2AdversarialGMM(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: object
Base class for sparse linear adversarial GMM for nested NPIV.
This class implements common functionality for sparse linear models using adversarial GMM in a nested NPIV setting.
- Parameters
mu (float) – Regularization parameter.
V1 (int) – Budget parameter for the first stage.
V2 (int) – Budget parameter for the second stage.
n_iter (int) – Number of iterations.
tol (float) – Tolerance for duality gap.
sparsity (int or None) – Sparsity level for the model.
fit_intercept (bool) – Whether to fit an intercept.
- _check_input(A, B, C, D, Y, W)[source]
Check and preprocess input arrays.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like) – Weights.
- Returns
Processed A, B, C, D, Y, W.
- Return type
tuple
- property coef
- property intercept
- predict(B, *args)[source]
Predict using the fitted model.
- Parameters
B (array-like) – Covariates for the second stage.
args (array-like) – Optional covariates for the first stage.
- Returns
Predicted values for the second stage. If args are provided, also returns predicted values for the first stage.
- Return type
array
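The calling convention of `predict(B, *args)` can be sketched with a stand-in class. Everything here is an assumption made for illustration: `NestedLinearPredictor`, its constructor arguments, and the order of the returned pair are hypothetical and are not taken from the library's source. The sketch only shows how an optional first positional argument can switch the return value from one prediction to two.

```python
import numpy as np

class NestedLinearPredictor:
    """Hypothetical stand-in holding two already-fitted linear models."""

    def __init__(self, coef_second, coef_first,
                 intercept_second=0.0, intercept_first=0.0):
        self.coef_second = np.asarray(coef_second, dtype=float)
        self.coef_first = np.asarray(coef_first, dtype=float)
        self.intercept_second = intercept_second
        self.intercept_first = intercept_first

    def predict(self, B, *args):
        # Second-stage prediction is always returned.
        pred_second = np.asarray(B, dtype=float) @ self.coef_second + self.intercept_second
        if len(args) == 0:
            return pred_second
        # Assumption: the first extra argument holds first-stage covariates A.
        A = np.asarray(args[0], dtype=float)
        pred_first = A @ self.coef_first + self.intercept_first
        return pred_second, pred_first

model = NestedLinearPredictor(coef_second=[1.0, 2.0], coef_first=[0.5],
                              intercept_second=1.0)
B = [[1.0, 1.0], [0.0, 2.0]]
A = [[2.0], [4.0]]
only_second = model.predict(B)
second, first = model.predict(B, A)
```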
- class sparse2_l1_l1.sparse2_l1vsl1(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinear2AdversarialGMM
Sparse Linear NPIV estimator using \(\ell_1-\ell_1\) optimization for nested NPIV.
This class solves the high-dimensional sparse linear problem using \(\ell_1\) relaxations for the minimax optimization problem in a nested NPIV setting.
- Parameters
Same as _SparseLinear2AdversarialGMM.
- _check_duality_gap(A, B, C, D, Y, W)[source]
Calculate the duality gap to certify convergence of the algorithm.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like) – Weights.
- Returns
True if the duality gap is below the tolerance level, indicating convergence.
- Return type
bool
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the model.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.
subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.
- Returns
Fitted estimator.
- Return type
self
- class sparse2_l1_l1.sparse2_ridge_l1vsl1(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinear2AdversarialGMM
Sparse Ridge NPIV estimator using \(\ell_1-\ell_1\) optimization for nested NPIV.
This class solves the high-dimensional sparse ridge problem using \(\ell_1\) relaxations for the minimax optimization problem in a nested NPIV setting.
- Parameters
Same as _SparseLinear2AdversarialGMM.
- _check_duality_gap(A, B, C, D, Y, W)[source]
Calculate the duality gap to certify convergence of the algorithm.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like) – Weights.
- Returns
True if the duality gap is below the tolerance level, indicating convergence.
- Return type
bool
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the model.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.
subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.
- Returns
Fitted estimator.
- Return type
self
Regularized Linear Function Spaces
NPIV
This module provides implementations of sparse linear NPIV estimators with L2 norm regularization.
Classes:
- _SparseLinearAdversarialGMM
Base class for sparse linear adversarial GMM.
- sparse_l2vsl2
Sparse Linear NPIV estimator using \(\ell_2-\ell_2\) optimization.
- sparse_ridge_l2vsl2
Sparse Ridge NPIV estimator using \(\ell_2-\ell_2\) optimization.
- class sparse_l2_l2._SparseLinearAdversarialGMM(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: object
Base class for sparse linear adversarial GMM.
This class implements common functionality for sparse linear models using adversarial GMM.
- Parameters
- fit(Z, X, Y)
Fit the model.
- property coef
- property intercept
- class sparse_l2_l2.sparse_l2vsl2(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinearAdversarialGMM
Sparse Linear NPIV estimator using \(\ell_2-\ell_2\) optimization.
This class solves the high-dimensional sparse linear problem using \(\ell_2\) relaxations for the minimax optimization problem.
- Parameters
Same as _SparseLinearAdversarialGMM.
- _check_duality_gap(Z, X, Y)[source]
Check the duality gap to monitor convergence.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
Z (array-like) – Instrumental variables.
X (array-like) – Covariates.
Y (array-like) – Outcomes.
- Returns
True if the duality gap is less than the tolerance, otherwise False.
- Return type
bool
- class sparse_l2_l2.sparse_ridge_l2vsl2(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinearAdversarialGMM
Sparse Ridge NPIV estimator using \(\ell_2-\ell_2\) optimization.
This class solves the high-dimensional sparse ridge problem using \(\ell_2\) relaxations for the minimax optimization problem.
- Parameters
Same as _SparseLinearAdversarialGMM.
- _check_duality_gap(Z, X, Y)[source]
Check the duality gap to monitor convergence.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
Z (array-like) – Instrumental variables.
X (array-like) – Covariates.
Y (array-like) – Outcomes.
- Returns
True if the duality gap is less than the tolerance, otherwise False.
- Return type
bool
Nested NPIV
This module provides implementations of sparse linear NPIV estimators with L2 norm regularization for nested NPIV.
Classes:
- _SparseLinear2AdversarialGMM
Base class for sparse linear adversarial GMM for nested NPIV.
- sparse2_l2vsl2
Sparse Linear NPIV estimator using \(\ell_2-\ell_2\) optimization for nested NPIV.
- sparse2_ridge_l2vsl2
Sparse Ridge NPIV estimator using \(\ell_2-\ell_2\) optimization for nested NPIV.
- class sparse2_l2_l2._SparseLinear2AdversarialGMM(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: object
Base class for sparse linear adversarial GMM for nested NPIV.
This class implements common functionality for sparse linear models using adversarial GMM in a nested NPIV setting.
- Parameters
mu (float) – Regularization parameter.
V1 (int) – Budget parameter for the first stage.
V2 (int) – Budget parameter for the second stage.
n_iter (int) – Number of iterations.
tol (float) – Tolerance for duality gap.
sparsity (int or None) – Sparsity level for the model.
fit_intercept (bool) – Whether to fit an intercept.
- property coef
- property intercept
- predict(B, *args)[source]
Predict using the fitted model.
- Parameters
B (array-like) – Covariates for the second stage.
*args – Optional. If provided, the first argument is treated as the covariates for the first stage (A).
- Returns
Predicted values for the second stage. If first-stage covariates A are passed through *args, returns a tuple of predictions for both stages.
- Return type
array or tuple
- weighted_mean(arr, weights, axis=0)[source]
Compute the weighted mean of an array along the specified axis.
- Parameters
arr (array-like) – Input array.
weights (array-like) – Weights for the mean computation.
axis (int, optional) – Axis along which to compute the mean. Defaults to 0.
- Returns
Weighted mean.
- Return type
array
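A weighted mean along an axis can be sketched with NumPy. This is one plausible reading of the signature above, written as a standalone function rather than the library's method; it assumes `weights` is one-dimensional with length equal to `arr.shape[axis]`.

```python
import numpy as np

def weighted_mean(arr, weights, axis=0):
    """Weighted mean along `axis`: sum(w * arr) / sum(w)."""
    arr = np.asarray(arr, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # np.average normalizes by the sum of the weights along the given axis.
    return np.average(arr, weights=weights, axis=axis)

# Two rows weighted 1 and 3: each column mean is pulled toward the second row.
result = weighted_mean([[1.0, 2.0], [3.0, 4.0]], weights=[1.0, 3.0], axis=0)
```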
- class sparse2_l2_l2.sparse2_l2vsl2(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinear2AdversarialGMM
Sparse Linear NPIV estimator using \(\ell_2-\ell_2\) optimization for nested NPIV.
This class solves the high-dimensional sparse linear problem using \(\ell_2\) relaxations for the minimax optimization problem in a nested NPIV setting.
- Parameters
Same as _SparseLinear2AdversarialGMM.
- _check_duality_gap(A, B, C, D, Y, W)[source]
Calculate the duality gap to certify convergence of the algorithm.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like) – Weights.
- Returns
True if the duality gap is below the tolerance level, indicating convergence.
- Return type
bool
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the model.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.
subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.
- Returns
Fitted estimator.
- Return type
self
- class sparse2_l2_l2.sparse2_ridge_l2vsl2(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]
Bases: _SparseLinear2AdversarialGMM
Sparse Ridge NPIV estimator using \(\ell_2-\ell_2\) optimization for nested NPIV.
This class solves the high-dimensional sparse ridge problem using \(\ell_2\) relaxations for the minimax optimization problem in a nested NPIV setting.
- Parameters
Same as _SparseLinear2AdversarialGMM.
- _check_duality_gap(A, B, C, D, Y, W)[source]
Calculate the duality gap to certify convergence of the algorithm.
The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like) – Weights.
- Returns
True if the duality gap is below the tolerance level, indicating convergence.
- Return type
bool
- fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]
Fit the model.
- Parameters
A (array-like) – Covariates for the first stage.
B (array-like) – Covariates for the second stage.
C (array-like) – Instrumental variables for the second stage.
D (array-like) – Instrumental variables for the first stage.
Y (array-like) – Outcomes.
W (array-like, optional) – Weights. Defaults to None.
subsetted (bool, optional) – Whether to use subsets. Defaults to False.
subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.
subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.
- Returns
Fitted estimator.
- Return type
self
Linear Class
This module provides implementations of two-stage least squares (TSLS) and regularized TSLS using linear and elastic net regression.
Classes:
- tsls
Two-stage least squares estimator.
- regtsls
Regularized two-stage least squares estimator using Elastic Net.
- class tsls.regtsls[source]
Bases: object
Regularized two-stage least squares estimator using Elastic Net.
This class implements the regularized TSLS estimator using Elastic Net regression.
- class tsls.tsls[source]
Bases: object
Two-stage least squares estimator.
This class implements the TSLS estimator.
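The two-stage procedure itself can be sketched in a few lines of NumPy. This is a minimal illustration of textbook TSLS, not the `tsls` class documented above, whose constructor and methods are not shown in this reference; the function name `tsls_fit` and the simulated data are assumptions. Stage one projects the treatment onto the instruments, and stage two regresses the outcome on that projection.

```python
import numpy as np

def tsls_fit(Z, X, Y):
    """Textbook two-stage least squares with an intercept in both stages."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])
    # Stage 1: project the treatment X onto the instruments Z.
    first_stage, *_ = np.linalg.lstsq(Z1, X, rcond=None)
    X_hat = Z1 @ first_stage
    # Stage 2: regress the outcome Y on the projected treatment.
    X1 = np.column_stack([np.ones(len(X_hat)), X_hat])
    beta, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return beta  # [intercept, slope(s)]

# Simulated data with an unobserved confounder U; the true effect of X on Y is 2.
rng = np.random.default_rng(0)
n = 20000
Z = rng.normal(size=(n, 1))
U = rng.normal(size=n)
X = Z[:, 0] + U + 0.1 * rng.normal(size=n)
Y = 2.0 * X + 3.0 * U + 0.1 * rng.normal(size=n)

beta = tsls_fit(Z, X, Y)
ols_slope = np.polyfit(X, Y, 1)[0]  # naive regression, biased by U
print("TSLS slope:", beta[1], "OLS slope:", ols_slope)
```

Because U drives both X and Y, the naive slope overstates the effect, while the instrumented slope lands close to the true value of 2.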