What is the gradient of the max pooling function?
How does TensorFlow calculate the gradient of functions like max_pooling, avg_pooling, etc.?
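If I'm not mistaken, TensorFlow keeps a registered gradient function for every built-in op and builds the backward pass by chaining those functions together with the chain rule (automatic differentiation). That's why you need to supply the gradient function as well when you define your own custom op. The gradients you get this way are exact up to floating-point precision; as far as I know, TensorFlow does not fall back to iterative or numerical approximation for them. For max pooling, the gradient passes each upstream value to the input position that held the maximum in its window and gives zero to the other positions. For average pooling, each upstream value is split evenly across all positions in its window.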
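To make the pooling gradients concrete, here is a minimal pure-Python sketch. It is not TensorFlow's implementation: it assumes a single 2D input, non-overlapping 2x2 windows, and ties broken in favour of the first maximum. The function names (`max_pool_2x2`, `max_pool_2x2_grad`, `avg_pool_2x2_grad`) are made up for this illustration.

```python
def max_pool_2x2(x):
    """Forward pass: non-overlapping 2x2 max pooling over a 2D list.

    Returns the pooled output and, for each output cell, the (row, col)
    of the input element that was selected as the maximum.
    """
    rows, cols = len(x), len(x[0])
    out, argmax = [], []
    for i in range(0, rows - 1, 2):
        out_row, arg_row = [], []
        for j in range(0, cols - 1, 2):
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in range(2) for dj in range(2)]
            # max() keeps the first maximal element, so ties go to the
            # earliest position in the window (an assumption of this sketch).
            value, pos = max(window, key=lambda t: t[0])
            out_row.append(value)
            arg_row.append(pos)
        out.append(out_row)
        argmax.append(arg_row)
    return out, argmax


def max_pool_2x2_grad(x, argmax, upstream):
    """Backward pass: route each upstream value to the argmax position."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for oi, arg_row in enumerate(argmax):
        for oj, (r, c) in enumerate(arg_row):
            grad[r][c] += upstream[oi][oj]
    return grad


def avg_pool_2x2_grad(x, upstream):
    """Backward pass of 2x2 average pooling: split each value four ways."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for oi, up_row in enumerate(upstream):
        for oj, g in enumerate(up_row):
            for di in range(2):
                for dj in range(2):
                    grad[2 * oi + di][2 * oj + dj] += g / 4.0
    return grad


# Example input and an arbitrary upstream gradient (both invented here).
x = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 7, 2],
     [2, 6, 3, 4]]
upstream = [[10.0, 20.0],
            [30.0, 40.0]]

pooled, argmax = max_pool_2x2(x)
max_grad = max_pool_2x2_grad(x, argmax, upstream)
avg_grad = avg_pool_2x2_grad(x, upstream)
```

Here `max_grad` is nonzero only at the four positions that won their window, while `avg_grad` has every entry equal to a quarter of the corresponding upstream value.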