Why does my network overfit so early?
I'm training a neural network for binary classification, and I can't understand why it starts overfitting so early in training. I thought the network was too big and was memorizing the dataset, but when I make it smaller, it does not learn at all.
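For concreteness, "overfitting too early" here means the validation loss bottoms out after only a few epochs while the training loss keeps falling. Below is a minimal sketch of how that point could be located from per-epoch loss histories. The function name `overfit_onset`, the `patience` parameter, and the loss values are all hypothetical and purely illustrative; they are not taken from the actual training run.

```python
def overfit_onset(train_loss, val_loss, patience=2):
    """Return the epoch of the best validation loss once validation has
    failed to improve for `patience` epochs while training loss is still
    lower than it was at that best epoch. Return None if that never happens.
    """
    best_epoch = 0
    for epoch in range(1, len(val_loss)):
        if val_loss[epoch] < val_loss[best_epoch]:
            # Validation is still improving: move the reference point.
            best_epoch = epoch
        elif (epoch - best_epoch >= patience
              and train_loss[epoch] < train_loss[best_epoch]):
            # Validation stalled while training kept improving.
            return best_epoch
    return None


# Made-up loss curves (illustration only, not real logs).
train = [0.69, 0.55, 0.42, 0.30, 0.21, 0.15]
val = [0.68, 0.60, 0.58, 0.61, 0.66, 0.72]
print(overfit_onset(train, val))  # -> 2
```

A small return value relative to the total number of epochs is what "too early" refers to: the gap between training and validation loss opens before the model has reached a useful validation score.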
How can I avoid this situation? Dropout didn't help, data augmentation helped a bit, as expected, and regularization made no difference.
Can you guys explain the likely causes and how I can fix them?