Hidden Units
So far we have focused our discussion on design choices for neural networks that are common to most parametric machine learning models trained with gradient-based optimization. Now we turn to an issue that is unique to feedforward neural networks: how to choose the type of hidden unit to use in the hidden layers of the model.