Neural Networks

We now consider the case where the function classes are neural networks. In this case, the joint estimator takes the form:

\[\begin{split}(\hat{g}, \hat{h}) = \arg \min _{\theta_1, \theta_2} \max_{\omega_1, \omega_2} \left\{ \mathbb{E}_n\left[2\left\{g_{\theta_1}(A) - Y\right\} f_{\omega_1}'(C') - f_{\omega_1}'(C')^2\right] + \mu' \mathbb{E}_n\{g_{\theta_1}(A)^2\} \right. \\ \left. + \mathbb{E}_n\left[2\left\{h_{\theta_2}(B) - g_{\theta_1}(A)\right\} f_{\omega_2}(C) - f_{\omega_2}(C)^2\right] + \mu \mathbb{E}_n\{h_{\theta_2}(B)^2\} \right\}\end{split}\]

where \(\theta_1, \theta_2, \omega_1, \omega_2\) are weights of the neural networks.
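To make the structure of the objective concrete, the following minimal sketch evaluates the empirical criterion for fixed network outputs. It is an illustration only, not part of the library: the function name `joint_objective` and its arguments are hypothetical, and plain Python lists stand in for the network outputs \(g_{\theta_1}(A)\), \(h_{\theta_2}(B)\), \(f_{\omega_1}'(C')\) and \(f_{\omega_2}(C)\).

```python
def mean(values):
    """Empirical average E_n[.] over the sample."""
    return sum(values) / len(values)


def joint_objective(g_a, h_b, y, f1, f2, mu1, mu2):
    """Evaluate the joint minimax criterion for fixed function values.

    g_a, h_b : values of g(A) and h(B) (learners, minimised over)
    f1, f2   : values of f'(C') and f(C) (adversaries, maximised over)
    mu1, mu2 : penalties mu' and mu (hypothetical argument names)
    """
    # First stage: moment 2 * (g(A) - Y) * f'(C') - f'(C')^2 plus mu' * E_n[g(A)^2].
    stage1 = mean([2 * (g - yi) * f - f ** 2 for g, yi, f in zip(g_a, y, f1)])
    stage1 += mu1 * mean([g ** 2 for g in g_a])
    # Second stage: moment 2 * (h(B) - g(A)) * f(C) - f(C)^2 plus mu * E_n[h(B)^2].
    stage2 = mean([2 * (h - g) * f - f ** 2 for h, g, f in zip(h_b, g_a, f2)])
    stage2 += mu2 * mean([h ** 2 for h in h_b])
    return stage1 + stage2


value = joint_objective(
    g_a=[1.0, 2.0], h_b=[2.0, 2.0], y=[0.0, 2.0],
    f1=[1.0, 0.0], f2=[0.5, 1.0], mu1=0.5, mu2=0.25,
)
print(value)
```

In training, the learners' weights would be updated to decrease this value and the adversaries' weights to increase it.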

We solve this minimax problem with the Optimistic Adam algorithm of Daskalakis et al. (2017), as also proposed in Dikkala et al. (2020).
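The reason for using an optimistic update can be seen on a toy problem. The sketch below is not the library's optimizer: it uses plain optimistic gradient descent-ascent (without Adam's moment estimates) on the bilinear game \(\min_x \max_y xy\), and the function names `gda` and `ogda` are hypothetical. The optimistic step replaces the current gradient with twice the current gradient minus the previous one.

```python
# Toy comparison on the bilinear game min_x max_y x*y (saddle point at (0, 0)).
# This is plain optimistic gradient descent-ascent, not the Adam variant.

def gda(x, y, lr, steps):
    """Simultaneous gradient descent on x and ascent on y."""
    for _ in range(steps):
        gx, gy = y, x  # d(xy)/dx = y, d(xy)/dy = x
        x, y = x - lr * gx, y + lr * gy
    return x, y


def ogda(x, y, lr, steps):
    """Optimistic variant: step along 2 * current gradient - previous gradient."""
    prev_gx, prev_gy = y, x  # initialise the "previous" gradient with the current one
    for _ in range(steps):
        gx, gy = y, x
        x, y = x - lr * (2 * gx - prev_gx), y + lr * (2 * gy - prev_gy)
        prev_gx, prev_gy = gx, gy
    return x, y


def norm(point):
    """Euclidean distance from the saddle point at the origin."""
    return (point[0] ** 2 + point[1] ** 2) ** 0.5


start = (1.0, 1.0)
gda_end = gda(*start, lr=0.1, steps=2000)
ogda_end = ogda(*start, lr=0.1, steps=2000)
print("GDA distance from saddle: ", norm(gda_end))
print("OGDA distance from saddle:", norm(ogda_end))
```

On this game the plain iterates spiral away from the saddle point, while the optimistic iterates contract towards it.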

oadam.OAdam(*args, **kwargs)

Implements the Optimistic Adam algorithm.

Subsetted estimator

To restrict the estimator to a subset of the observations, modify the adversary's loss so that the adversary's output is set to zero for the observations outside the restriction:

test = self.adversary(zb)
test[indices_] = 0
G_loss = - torch.mean((yb - pred) * test) + torch.mean(test**2)
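The arithmetic of the snippet above can be checked on a small example. The sketch below is a dependency-free illustration that uses plain Python lists instead of tensors; the function name `masked_adversary_loss` and the argument `excluded` (playing the role of `indices_`) are hypothetical. Note that, as with `torch.mean` over the full batch, both averages divide by the total number of observations, not by the number of retained ones.

```python
def masked_adversary_loss(y, pred, test, excluded):
    """Adversary loss with the test function zeroed outside the restriction.

    y, pred  : outcomes and learner predictions
    test     : adversary outputs, one per observation
    excluded : positions of observations outside the restriction
    """
    masked = list(test)  # copy, so the caller's list is not modified
    for i in excluded:
        masked[i] = 0.0
    n = len(y)
    # Moment term: mean of (y - pred) * test over all n observations.
    moment = sum((yi - pi) * ti for yi, pi, ti in zip(y, pred, masked)) / n
    # Penalty term: mean of test^2 over all n observations.
    penalty = sum(ti ** 2 for ti in masked) / n
    return -moment + penalty


loss = masked_adversary_loss(
    y=[1.0, 2.0, 3.0, 4.0],
    pred=[0.5, 2.5, 2.0, 4.0],
    test=[1.0, -1.0, 2.0, 0.5],
    excluded=[1, 3],
)
print(loss)
```

Excluded observations contribute zero to both terms, so only the retained observations influence the adversary's update.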

Single estimator

`agmm.AGMM`(learner, adversary)	Adversarial Generalized Method of Moments estimator.
`agmm.KernelLayerMMDGMM`(learner, adversary_g, ...)	AGMM with kernel layer using Maximum Mean Discrepancy.
`agmm.CentroidMMDGMM`(learner, adversary_g, ...)	AGMM with centroid-based Maximum Mean Discrepancy.
`agmm.KernelLossAGMM`(learner, adversary_g, ...)	AGMM with kernel loss.
`agmm.MMDGMM`(learner, adversary_g, n_samples, ...)	AGMM with Maximum Mean Discrepancy.

Joint estimator

agmm2.AGMM2L2(learnerh, learnerg, ...)

Adversarial Generalized Method of Moments estimator for nested NPIV with L2 regularization.