Estimators for Sequential and Simultaneous Nested NPIV

Start Here

Overview

This section summarizes the optimization targets for nested NPIV estimators under different function classes and links each target to practical implementations (RKHS, random forest/ensemble, neural network, sparse/regularized linear, and linear baselines).

Assumptions

  • Observations are i.i.d. draws of \((A, B, C, C', Y)\).

  • Function classes \(\mathcal{G}, \mathcal{H}, \mathcal{F}, \mathcal{F}'\) are chosen by the estimator family.

  • Penalization and/or norm constraints are used to regularize finite-sample minimax estimation.

Notation

  • \(A\): first-stage endogenous treatment/features.

  • \(B\): second-stage endogenous treatment/features.

  • \(C'\): first-stage instruments for recovering \(g\).

  • \(C\): second-stage instruments for recovering \(h\).

  • \(g\): first-stage bridge function; \(h\): structural function of primary interest.

Estimator Objectives

Sequential Nested NPIV:

Given observations \((A_i, B_i, C_i)\), an initial estimator \(\hat{g}\), and hyperparameter values \((\lambda, \mu)\), estimate

\[\hat{h} = \arg\min_{h \in \mathcal{H}} \left[ \sup_{f \in \mathcal{F}} \left( 2 \cdot \text{loss}(f, \hat{g}, h) - \text{penalty}(f, \lambda) \right) + \text{penalty}(h, \mu) \right]\]

where \(\text{penalty}(f, \lambda) = \mathbb{E}_m\{f(C)^2\} + \lambda \cdot \|f\|^2_{\mathcal{F}}\) and \(\text{penalty}(h, \mu) = \mu \cdot \|h\|^2_{\mathcal{H}}\).

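Interpretation: for a fixed \(h\), the adversary \(f\) searches over functions of the instrument \(C\) for the one that best exposes a violation of the IV moment condition; the penalty on \(h\) limits its complexity, which stabilizes the ill-posed inversion.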
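To make the min-max structure concrete, the sketch below works out the case where both classes are linear, so the inner supremum and the outer minimizer have closed forms. It is an illustration under stated assumptions, not the nnpiv implementation: it assumes \(\text{loss}(f, g, h) = \mathbb{E}_m[\{g(A) - h(B)\} f(C)]\) (this section does not define the loss), \(f(C) = C\theta\), \(h(B) = B\beta\), and squared Euclidean norms in place of the class norms. The names `adversary_sup` and `sequential_linear` are hypothetical.

```python
import numpy as np

def adversary_sup(resid, C, lam):
    """Closed-form inner sup over linear f(C) = C @ theta.

    Maximizes 2*mean(resid*f(C)) - mean(f(C)**2) - lam*||theta||^2,
    where resid = g_hat(A) - h(B) (assumed form of the loss).
    """
    n, k = C.shape
    G = C.T @ C / n + lam * np.eye(k)   # curvature of the concave inner problem
    m = C.T @ resid / n                 # empirical moment E_m[resid * C]
    theta = np.linalg.solve(G, m)       # maximizing adversary
    return float(m @ theta), theta      # sup value equals m' G^{-1} m

def sequential_linear(g_hat, B, C, lam, mu):
    """Outer minimizer over linear h(B) = B @ beta with penalty mu*||beta||^2."""
    n, k = C.shape
    p = B.shape[1]
    G = C.T @ C / n + lam * np.eye(k)
    S = C.T @ B / n                     # cross moment E_m[C B']
    m = C.T @ g_hat / n                 # cross moment E_m[C g_hat]
    lhs = S.T @ np.linalg.solve(G, S) + mu * np.eye(p)
    rhs = S.T @ np.linalg.solve(G, m)
    return np.linalg.solve(lhs, rhs)

# Toy data: g_hat is a noise-free stand-in for the first-stage fit g_hat(A_i).
rng = np.random.default_rng(0)
n = 2000
C = rng.normal(size=(n, 3))
B = C[:, :2] + 0.5 * rng.normal(size=(n, 2))
beta_true = np.array([1.0, -2.0])
g_hat = B @ beta_true

beta = sequential_linear(g_hat, B, C, lam=1e-3, mu=1e-6)
sup_value, theta = adversary_sup(g_hat - B @ beta, C, lam=1e-3)
```

At the fitted `beta` the adversary finds almost no remaining moment violation, so `sup_value` is close to zero; increasing `mu` shrinks `beta` toward zero and raises the violation the adversary can detect.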

Sequential Nested NPIV: Ridge:

Given observations \((A_i, B_i, C_i)\), an initial estimator \(\hat{g}\), and hyperparameter \(\mu\), estimate

\[\hat{h} = \arg\min_{h \in \mathcal{H}} \left[ \sup_{f \in \mathcal{F}} \left( 2 \cdot \text{loss}(f, \hat{g}, h) - \text{penalty}(f) \right) + \text{penalty}(h, \mu) \right]\]

where \(\text{penalty}(f) = \mathbb{E}_m\{f(C)^2\}\) and \(\text{penalty}(h, \mu) = \mu \cdot \mathbb{E}_m\{h(B)^2\}\).

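Interpretation: this variant regularizes \(h\) in prediction space, penalizing the empirical second moment \(\mathbb{E}_m\{h(B)^2\}\) instead of the class norm \(\|h\|^2_{\mathcal{H}}\), and it drops the norm penalty on the adversary \(f\).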
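With linear classes the ridge variant reduces to a two-stage least squares fit with an extra prediction-space penalty, which the sketch below spells out. It assumes \(\text{loss}(f, g, h) = \mathbb{E}_m[\{g(A) - h(B)\} f(C)]\) (not defined in this section), \(f(C) = C\theta\), and \(h(B) = B\beta\); the function name `sequential_ridge_linear` and the toy data are hypothetical and do not come from nnpiv.

```python
import numpy as np

def sequential_ridge_linear(g_hat, B, C, mu):
    """Ridge-variant closed form for linear f(C) = C @ theta, h(B) = B @ beta.

    Profiling out the adversary leaves
        mean((P_C r)**2) + mu * mean((B @ beta)**2),  r = g_hat - B @ beta,
    where P_C projects onto the columns of C.
    """
    # Fitted values P_C B and P_C g_hat from a least-squares regression on C.
    coef, *_ = np.linalg.lstsq(C, np.column_stack([B, g_hat]), rcond=None)
    fitted = C @ coef
    B_fit, g_fit = fitted[:, :-1], fitted[:, -1]
    lhs = B_fit.T @ B_fit + mu * (B.T @ B)   # B' P_C B + mu * B' B
    rhs = B_fit.T @ g_fit                    # B' P_C g_hat
    return np.linalg.solve(lhs, rhs)

# Toy data with an endogenous B: the error u shares the component v with B.
rng = np.random.default_rng(1)
n = 5000
C = rng.normal(size=(n, 2))
v = rng.normal(size=n)
B = (C.sum(axis=1) + v)[:, None]
u = v + 0.5 * rng.normal(size=n)
g_hat = 2.0 * B[:, 0] + u                    # stand-in for g_hat(A_i)

beta_iv = sequential_ridge_linear(g_hat, B, C, mu=0.0)     # plain 2SLS
beta_ridge = sequential_ridge_linear(g_hat, B, C, mu=0.5)  # shrunk toward zero
beta_ols, *_ = np.linalg.lstsq(B, g_hat, rcond=None)       # biased by endogeneity
```

Setting `mu=0.0` recovers ordinary two-stage least squares; a positive `mu` shrinks the fitted values \(B\beta\) rather than the coefficients themselves, which is the sense in which the penalty acts in prediction space.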

Simultaneous Nested NPIV:

Given observations \((A_i, B_i, C_i, C_i', Y_i)\) and hyperparameters \((\mu', \mu)\), estimate

\[\begin{split}(\hat{g}, \hat{h}) = \arg\min_{g \in \mathcal{G}, h \in \mathcal{H}} \left[ \sup_{f' \in \mathcal{F}'} \left( 2 \cdot \text{loss}(f', Y, g) - \text{penalty}(f') \right) + \text{penalty}(g, \mu') \right. \\ \left. + \sup_{f \in \mathcal{F}} \left( 2 \cdot \text{loss}(f, g, h) - \text{penalty}(f) \right) + \text{penalty}(h, \mu) \right]\end{split}\]

Interpretation: because \(g\) and \(h\) are fitted jointly, first-stage uncertainty can propagate into the second stage; the diagnostics described under Estimation Diagnostics help assess conditioning before fitting.
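For linear classes, both adversaries in the simultaneous objective can be profiled out, leaving a single stacked least-squares problem in the coefficients of \(g\) and \(h\). The sketch below illustrates that reduction under assumptions this section does not state: \(\text{loss}(f', Y, g) = \mathbb{E}_m[\{Y - g(A)\} f'(C')]\), \(\text{loss}(f, g, h) = \mathbb{E}_m[\{g(A) - h(B)\} f(C)]\), adversary penalties \(\mathbb{E}_m\{f'(C')^2\}\) and \(\mathbb{E}_m\{f(C)^2\}\), and ridge-style penalties \(\mu' \mathbb{E}_m\{g(A)^2\}\) and \(\mu \mathbb{E}_m\{h(B)^2\}\). The names `proj` and `simultaneous_linear` are hypothetical, not nnpiv functions.

```python
import numpy as np

def proj(Z):
    """Orthogonal projector onto the column space of Z (full column rank)."""
    Q, _ = np.linalg.qr(Z)
    return Q @ Q.T

def simultaneous_linear(A, B, C, C_prime, Y, mu_prime, mu):
    """Jointly fit g(A) = A @ gamma and h(B) = B @ beta.

    Minimizes ||P'(Y - A gamma)||^2 + mu' ||A gamma||^2
            + ||P(A gamma - B beta)||^2 + mu ||B beta||^2,
    with P' and P the projectors onto the columns of C_prime and C.
    """
    n, pa = A.shape
    pb = B.shape[1]
    P1, P2 = proj(C_prime), proj(C)
    design = np.vstack([
        np.hstack([P1 @ A, np.zeros((n, pb))]),                  # first-stage moment
        np.hstack([np.sqrt(mu_prime) * A, np.zeros((n, pb))]),   # penalty on g
        np.hstack([P2 @ A, -P2 @ B]),                            # second-stage moment
        np.hstack([np.zeros((n, pa)), np.sqrt(mu) * B]),         # penalty on h
    ])
    target = np.concatenate([P1 @ Y, np.zeros(3 * n)])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[:pa], coef[pa:]

# Noise-free toy data in which g(A) = h(B) holds exactly, so recovery is exact.
rng = np.random.default_rng(2)
n = 300
C = rng.normal(size=(n, 2))
B = C + 0.5 * rng.normal(size=(n, 2))
M = np.array([[1.0, 0.5], [0.0, 1.0]])
A = B @ M
C_prime = A + 0.5 * rng.normal(size=(n, 2))
gamma_true = np.array([1.0, -1.0])
Y = A @ gamma_true

gamma_hat, beta_hat = simultaneous_linear(A, B, C, C_prime, Y, mu_prime=1e-8, mu=1e-8)
```

The third block of rows is what couples the two stages: any error in `gamma_hat` enters the second-stage moment directly, which is the propagation of first-stage uncertainty noted above.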

Progressive Recipe

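# Step 1: prepare arrays (A, B, C_prime, C, Y), one row per observation
from nnpiv.rkhs import RKHS2IVL2

# mu and mu_prime are the penalty weights (mu, mu') from the simultaneous objective
est = RKHS2IVL2(mu=0.1, mu_prime=0.1)

# Step 2: fit simultaneous nested NPIV
# (the first-stage instruments C' are passed through the D argument)
est.fit(A=A, B=B, C=C, D=C_prime, Y=Y)

# Step 3: inspect structural predictions at held-out second-stage features B_test
h_hat = est.predict(B_test)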
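Step 1 of the recipe leaves the arrays unspecified. One way to produce toy inputs for a dry run is sketched below with NumPy only. The data-generating process, the dimensions, and the choice to store `Y` as a column vector are all assumptions made for illustration; check the array shapes the estimator expects before fitting.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_test = 1000, 200

C = rng.normal(size=(n, 2))          # second-stage instruments
C_prime = rng.normal(size=(n, 2))    # first-stage instruments C'
e = rng.normal(size=(n, 1))          # unobserved confounder

M = np.array([[1.0, 0.5], [0.0, 1.0]])
gamma = np.array([[1.0], [-1.0]])    # true g(A) = A @ gamma
beta = M @ gamma                     # true h(B) = B @ beta

# B is driven by C, A by C' and B; the confounder e makes both endogenous.
B = C + 0.5 * e + 0.5 * rng.normal(size=(n, 2))
A = C_prime + B @ M + 0.5 * e
Y = A @ gamma + e + 0.1 * rng.normal(size=(n, 1))

# Held-out second-stage features and the true structural values there.
B_test = rng.normal(size=(n_test, 2))
h_true_test = B_test @ beta
```

By construction \(Y - g(A)\) has mean zero given \(C'\) and \(g(A) - h(B)\) has mean zero given \(C\), so `h_true_test` can serve as a reference when inspecting `h_hat`.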

Estimator Families