Build Basic GANs, Week 3
Mode Collapse
Why can mode collapse occur?
The generator gets stuck in a local minimum.
Correct! Mode collapse occurs when the generator gets stuck generating a single mode. When this happens, the discriminator eventually learns to differentiate the generator's fakes and outskills it, ending the model's learning.
Problem with BCE Loss
What is the problem with using BCE Loss?
The discriminator does not output useful gradients (feedback) for the generator when the real/fake distributions are far apart.
Correct! This is also called the vanishing gradient problem because the gradients approach 0 when the distributions are far apart.
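This can be seen in a small sketch (hypothetical numbers): with the minimax generator loss log(1 − D(G(z))) and D(G(z)) = sigmoid(x), the gradient with respect to the discriminator's logit x is −sigmoid(x), which shrinks toward 0 as the discriminator becomes confident a sample is fake.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Gradient of log(1 - sigmoid(x)) with respect to x is -sigmoid(x):
# it approaches 0 as the discriminator grows confident the sample
# is fake (x << 0), so the generator gets almost no feedback.
for logit in [0.0, -2.0, -5.0, -10.0]:
    print(f"logit={logit:6.1f}  D(fake)={sigmoid(logit):.5f}  gradient={-sigmoid(logit):.5f}")
```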
Earth Mover’s Distance
What does Earth Mover’s distance measure?
How different two distributions are, based on the distance and the amount of mass that needs to be moved.
Correct! Earth mover’s distance is a measure of how different two distributions are by estimating the effort it takes to make the
generated distribution equal to the real one.
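For equal-size one-dimensional samples, the earth mover's distance reduces to matching sorted points and averaging how far each unit of "earth" travels; a minimal sketch (`emd_1d` is a hypothetical helper name):

```python
def emd_1d(u, v):
    """1-D earth mover's distance between two equal-size samples:
    sort both and average the pairwise distances, since each unit
    of 'earth' moves to the matching sorted position."""
    assert len(u) == len(v)
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)

print(emd_1d([0.0, 1.0], [3.0, 4.0]))  # each point moves 3 units -> 3.0
```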
Wasserstein Loss
What do the discriminator and critic have in common?
They try to maximize the distance between the real distribution and the fake distribution.
Correct! They both want to maximize the difference between the expected values of the predictions for real and fake.
Because Wasserstein Loss is not bounded, the critic is allowed to improve without degrading its feedback to the generator.
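The two objectives can be sketched as follows (hypothetical helper names; the scores are the critic's raw, unbounded outputs):

```python
def critic_loss(real_scores, fake_scores):
    # The critic maximizes E[C(real)] - E[C(fake)]; framed as a loss
    # to minimize, that becomes E[C(fake)] - E[C(real)].
    return sum(fake_scores) / len(fake_scores) - sum(real_scores) / len(real_scores)

def generator_loss(fake_scores):
    # The generator tries to raise the critic's expected score on fakes.
    return -sum(fake_scores) / len(fake_scores)

print(critic_loss([2.0, 4.0], [0.0, 1.0]))  # 0.5 - 3.0 = -2.5
print(generator_loss([0.0, 1.0]))           # -0.5
```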
Condition on Wasserstein Critic
What points on a function are considered for the evaluation of 1-Lipschitz continuity?
All points on the function.
Correct! For a function to be 1-Lipschitz Continuous, the magnitude of its slope cannot exceed 1 at any point.
When is a function 1-Lipschitz Continuous?
When its gradient norm is less than or equal to 1 at all points.
Correct! A function is 1-Lipschitz Continuous when its slope is no greater than 1 at all points.
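A rough numerical check over sample points (`is_1_lipschitz` is a hypothetical helper; it only inspects the points you give it, not the whole function):

```python
import math

def is_1_lipschitz(f, points, tol=1e-9):
    # |f(a) - f(b)| must not exceed |a - b| for any pair of points.
    return all(abs(f(a) - f(b)) <= abs(a - b) + tol
               for a in points for b in points)

points = [x / 10 for x in range(-50, 51)]
print(is_1_lipschitz(math.tanh, points))        # True: tanh's slope never exceeds 1
print(is_1_lipschitz(lambda x: 2 * x, points))  # False: slope 2 violates the condition
```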
1-Lipschitz Continuity Enforcement
Why do you use an intermediate image for calculating the gradient penalty?
Taking the gradient of the critic with respect to an intermediate image approximates enforcing the gradient norm to be 1 almost everywhere.
Correct! Since checking the critic's gradient at every possible point of the feature space is virtually impossible, you can approximate it by using interpolated images.
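A minimal PyTorch sketch of this interpolation step, assuming a critic that maps an image batch to one score per image (`gradient_penalty` is a hypothetical helper name):

```python
import torch

def gradient_penalty(critic, real, fake):
    # One random mix ratio per image, then interpolate between real and fake.
    eps = torch.rand(real.size(0), 1, 1, 1)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(mixed)
    # Gradient of the critic's scores with respect to the mixed images.
    grad, = torch.autograd.grad(outputs=scores.sum(), inputs=mixed,
                                create_graph=True)
    grad_norm = grad.view(grad.size(0), -1).norm(2, dim=1)
    # Penalize the gradient norm for deviating from 1.
    return ((grad_norm - 1) ** 2).mean()
```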
What is a soft way to restrict the critic to be 1-Lipschitz?
Adding a regularization term to the loss function that penalizes the critic's gradient norm when it deviates from 1.
Correct! By using a gradient penalty, you are not strictly enforcing 1-L continuity, but encouraging it.
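Putting it together, the penalty term is simply added to the critic's loss with a weight (the helper names are hypothetical; a weight of 10 is the value suggested in the WGAN-GP paper):

```python
LAMBDA = 10.0  # penalty weight; 10 is the value suggested in the WGAN-GP paper

def critic_loss_with_gp(real_scores, fake_scores, gp):
    # Wasserstein term (to minimize) plus the soft 1-Lipschitz penalty.
    wasserstein = (sum(fake_scores) / len(fake_scores)
                   - sum(real_scores) / len(real_scores))
    return wasserstein + LAMBDA * gp

print(critic_loss_with_gp([2.0, 4.0], [0.0, 1.0], gp=0.1))  # -2.5 + 1.0 = -1.5
```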