Why does my network overfit so early?
I want to train a neural network that basically does binary classification, and I can't understand why it overfits so early in training. I thought the network was too big and was memorizing the dataset, but when I make it smaller, it doesn't learn at all. Dropout didn't help, augmentation techniques helped a bit, and regularization didn't change anything. Can you explain the likely reasons, and how I can avoid this?
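For reference, here is a minimal toy sketch of the kind of setup I mean: a small one-hidden-layer binary classifier with inverted dropout and an L2 penalty. Everything here (the data, sizes, and hyperparameters) is hypothetical and only stands in for my real model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data standing in for the real dataset.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

# One hidden layer, binary output.
W1 = rng.normal(scale=0.1, size=(10, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 1));  b2 = np.zeros(1)

lr, l2, drop_p = 0.1, 1e-3, 0.5  # hypothetical hyperparameters

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for epoch in range(50):
    # Forward pass: ReLU hidden layer with inverted dropout.
    h = np.maximum(X @ W1 + b1, 0.0)
    mask = (rng.random(h.shape) > drop_p) / (1.0 - drop_p)
    h_d = h * mask
    p = sigmoid(h_d @ W2 + b2).ravel()

    # Binary cross-entropy plus L2 penalty on the weights.
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    loss += l2 * (np.sum(W1**2) + np.sum(W2**2))
    losses.append(loss)

    # Backward pass (full-batch gradient descent).
    dlogits = (p - y)[:, None] / len(y)
    dW2 = h_d.T @ dlogits + 2 * l2 * W2
    db2 = dlogits.sum(0)
    dh = (dlogits @ W2.T) * mask * (h > 0)
    dW1 = X.T @ dh + 2 * l2 * W1
    db1 = dh.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

On this toy data the training loss goes down over the 50 epochs; in my real setup, the training loss keeps dropping like this while the validation loss turns around very early.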