
Deep Learning Interviews

Introduction


Q1: Distribution of maximum entropy

What is the distribution of maximum entropy, i.e., the distribution that has the maximum entropy among all distributions supported on a bounded interval \([a, b] \subset (-\infty, +\infty)\)?

Solution

On a bounded interval \([a, b]\), the UNIFORM DISTRIBUTION has the maximum entropy. Its differential entropy is \(\log(b-a)\), and the variance of the uniform distribution \(\mathcal{U}(a, b)\) is \(\sigma^2 = \frac{(b-a)^2}{12}\), so \(b-a = \sqrt{12}\,\sigma\). Therefore, the maximum entropy attainable on a bounded interval \([a, b]\) is \(\left(\frac{\log{12}}{2} + \log(\sigma)\right)\).
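As a quick numerical check (a minimal sketch using SciPy; the interval endpoints are arbitrary), the closed-form expression above agrees with the differential entropy SciPy reports for a uniform distribution:

import numpy as np
from scipy.stats import uniform

a, b = 2.0, 5.0
sigma = (b - a) / np.sqrt(12)                       # standard deviation of U(a, b)
h_closed_form = 0.5 * np.log(12) + np.log(sigma)    # equals log(b - a)
h_scipy = uniform(loc=a, scale=b - a).entropy()     # SciPy's differential entropy
print(h_closed_form, h_scipy)                       # both print log(3) ≈ 1.0986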


Q2: What's the purpose of this code-snippet?

Describe, in your own words, the purpose of this code snippet.

self.transforms = []
if rotate:
    self.transforms.append(RandomRotate())
if flip:
    self.transforms.append(RandomFlip())

Solution

Overfitting is a common problem that occurs during the training of machine-learning systems. Among the various strategies for overcoming overfitting, data augmentation is a very handy one. Data augmentation is a regularization technique that synthetically expands the dataset by applying label-preserving transformations to create additional, invariant examples from the same data samples. It is also very useful for balancing the data distribution across the various classes in the dataset. Common data-augmentation techniques include random rotation, cropping, random flipping, and zooming.

Usually, data augmentation is performed on the CPU before the batched data is uploaded to the GPU for training the model.
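For illustration only, a comparable pipeline could be assembled with torchvision (an assumption about the framework; the RandomRotate and RandomFlip classes in the snippet above belong to the question's own codebase, and the parameter values here are arbitrary):

import torchvision.transforms as T

# Label-preserving augmentations, typically run on the CPU by the DataLoader workers
train_transforms = T.Compose([
    T.RandomRotation(degrees=15),      # random rotation
    T.RandomHorizontalFlip(p=0.5),     # random flip
    T.RandomResizedCrop(size=224),     # random crop + zoom
    T.ToTensor(),                      # convert the PIL image to a tensor
])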


Logistic Regression

Q3: Drawbacks of model fitting

For a fixed number of observations in a dataset, introducing more variables normally generates a model that has a better fit to the data. What may be the drawbacks of such a model-fitting strategy?

Solution

Introducing more variables increases the capacity of the model. If the number of data points in the dataset is kept fixed, increasing the number of model parameters (variables) leads to OVERFITTING. Overfitting is a scenario in which the trained model performs very well on the training data but poorly on the test dataset, due to a lack of generalization: the oversized model has simply memorized the data points instead of learning the features and data distribution of the training set.

Question: Keeping the number of data points in the training data fixed, increasing the number of model variables can produce a trained model that fits the training data extremely well. What are the drawbacks of doing this?

Answer: Increasing the model's variables increases its capacity, allowing it to fit the training data more closely and learn more about the training-data distribution. However, if we increase the number of variables without also increasing the amount of training data, the trained model develops a defect known as OVERFITTING. An overfit model knows the training data very well but performs poorly on the test data, because during training it failed to develop its generalization ability and most likely just memorized the training data. We should therefore try to prevent overfitting as far as possible; common approaches include: (1) dropout, (2) data augmentation, (3) pruning.
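As a minimal sketch of this effect (using NumPy polynomial fitting; the data and polynomial degrees are arbitrary illustrative choices), increasing the number of fitted parameters while holding the number of observations fixed typically drives the training error toward zero while the test error grows:

import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)                            # fixed, small number of observations
y_train = np.sin(2 * np.pi * x_train) + 0.3 * rng.standard_normal(10)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 3, 9):                                   # more variables = higher capacity
    coeffs = np.polyfit(x_train, y_train, degree)          # fit a polynomial of this degree
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_mse, 4), round(test_mse, 4))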


Q4: Odds of Success

Define the term odds of success, both qualitatively and formally. Give a numerical example that stresses the relationship between the probability of an event occurring and its odds.

Solution

The odds of success of an event in an experiment is the ratio of the probability that the event occurs to the probability that it does not occur, i.e. \(\text{odds}(E) = \frac{P(E)}{1 - P(E)}\). For example, if an event E occurs with probability \(P(E) = 0.8\), the odds of success are \(0.8 / 0.2 = 4\), commonly written as 4:1 in favour of E.
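A minimal numerical sketch (values chosen arbitrarily) of converting between probability and odds:

# Convert a probability into odds of success, and recover the probability from the odds
p = 0.75                      # probability that event E occurs
odds = p / (1 - p)            # odds of success = 3.0, i.e. 3:1 in favour of E
p_back = odds / (1 + odds)    # inverse mapping returns 0.75
print(odds, p_back)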