Neural Networks

We now consider the case where the function classes are neural networks. In this case, the joint estimator takes the form:

\[\begin{split}(\hat{g}, \hat{h}) = \arg \min _{\theta_1, \theta_2} \max_{\omega_1, \omega_2} \left\{ \mathbb{E}_n\left[2\left\{g_{\theta_1}(A) - Y\right\} f_{\omega_1}'(C') - f_{\omega_1}'(C')^2\right] + \mu' \mathbb{E}_n\{g_{\theta_1}(A)^2\} \right. \\ \left. + \mathbb{E}_n\left[2\left\{h_{\theta_2}(B) - g_{\theta_1}(A)\right\} f_{\omega_2}(C) - f_{\omega_2}(C)^2\right] + \mu \mathbb{E}_n\{h_{\theta_2}(B)^2\} \right\}\end{split}\]

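where \(\theta_1, \theta_2\) are the weights of the learner networks \(g, h\), \(\omega_1, \omega_2\) are the weights of the adversary networks \(f', f\), and \(\mu', \mu\) are the coefficients of the L2 penalties on the learners.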
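To make the structure of the criterion concrete, the following sketch evaluates the empirical objective for fixed function values, one entry per observation. It is an illustration in plain Python, not the package's implementation: the function name `joint_objective` and its argument names are hypothetical, and the networks are replaced by precomputed lists of their outputs.

```python
from statistics import fmean


def joint_objective(g_a, h_b, y, f1, f2, mu_prime, mu):
    """Empirical minimax criterion for fixed function values (hypothetical helper).

    g_a: values g(A_i), h_b: values h(B_i), y: outcomes Y_i,
    f1: values f'(C'_i) of the first adversary, f2: values f(C_i) of the second.
    """
    # First stage: moment 2 * (g(A) - Y) * f'(C') minus the adversary's quadratic term.
    stage1 = fmean(2 * (g - yi) * t - t**2 for g, yi, t in zip(g_a, y, f1))
    # Second stage: moment 2 * (h(B) - g(A)) * f(C) minus the adversary's quadratic term.
    stage2 = fmean(2 * (h - g) * t - t**2 for h, g, t in zip(h_b, g_a, f2))
    # L2 penalties on the two learners.
    penalty = mu_prime * fmean(g**2 for g in g_a) + mu * fmean(h**2 for h in h_b)
    return stage1 + stage2 + penalty


value = joint_objective(
    g_a=[1.0, 2.0], h_b=[1.0, 2.0], y=[0.0, 1.0],
    f1=[1.0, 0.0], f2=[0.5, 0.5], mu_prime=0.0, mu=0.0,
)
print(value)
```

The learners minimize this value over \(\theta_1, \theta_2\) while the adversaries maximize it over \(\omega_1, \omega_2\); when both adversaries output zero, only the penalty terms remain.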

Following Dikkala et al. (2020), we solve this minimax problem with the Optimistic Adam algorithm of Daskalakis et al. (2017).
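The idea behind optimistic updates can be illustrated on a toy problem. The sketch below is not Optimistic Adam itself and does not use the `oadam` module: it applies the simpler optimistic gradient descent-ascent rule (step along twice the current gradient minus the previous gradient, without Adam's moment estimates) to the bilinear game \(\min_x \max_y xy\). The function names, step size, and iteration count are assumptions chosen for illustration.

```python
import math


def grads(x, y):
    # Toy objective f(x, y) = x * y, minimized over x and maximized over y.
    return y, x  # (df/dx, df/dy)


def final_distance(optimistic, steps=1000, lr=0.1):
    """Run simultaneous descent-ascent and return the distance to the saddle point (0, 0)."""
    x, y = 1.0, 1.0
    gx_prev, gy_prev = grads(x, y)  # makes the first step a plain gradient step
    for _ in range(steps):
        gx, gy = grads(x, y)
        if optimistic:
            # Optimistic step: 2 * current gradient - previous gradient.
            dx, dy = 2 * gx - gx_prev, 2 * gy - gy_prev
        else:
            dx, dy = gx, gy
        x, y = x - lr * dx, y + lr * dy  # descent in x, ascent in y
        gx_prev, gy_prev = gx, gy
    return math.hypot(x, y)


dist_plain = final_distance(optimistic=False)
dist_optimistic = final_distance(optimistic=True)
print(f"plain: {dist_plain:.3g}, optimistic: {dist_optimistic:.3g}")
```

On this toy game the plain iterates spiral away from the saddle point, while the optimistic iterates contract towards it, which is the behaviour that motivates using an optimistic optimizer for the adversarial objective.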

oadam.OAdam(*args, **kwargs)

Implements the Optimistic Adam algorithm.

Subsetted Estimator

To restrict the estimator to a subset of the observations, modify the adversary's loss so that the test function is zero for the observations outside the restriction:

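# Evaluate the adversary (test function) on the batch zb
test = self.adversary(zb)
# indices_ selects the observations outside the restriction
test[indices_] = 0
# Adversary loss: negative moment term plus a quadratic penalty on the test function
G_loss = - torch.mean((yb - pred) * test) + torch.mean(test**2)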

Single estimator

agmm.AGMM(learner, adversary)

Adversarial Generalized Method of Moments estimator.

agmm.KernelLayerMMDGMM(learner, adversary_g, ...)

AGMM with kernel layer using Maximum Mean Discrepancy.

agmm.CentroidMMDGMM(learner, adversary_g, ...)

AGMM with centroid-based Maximum Mean Discrepancy.

agmm.KernelLossAGMM(learner, adversary_g, ...)

AGMM with kernel loss.

agmm.MMDGMM(learner, adversary_g, n_samples, ...)

AGMM with Maximum Mean Discrepancy.

Joint estimator

agmm2.AGMM2L2(learnerh, learnerg, ...)

Adversarial Generalized Method of Moments estimator for nested NPIV with L2 regularization.