It does not matter. You can have both.
In general, people use the sigmoid activation function when they have two classes to predict, so putting a sigmoid on top of your network seems like a good idea.
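To make the two-class case concrete, here is a minimal sketch in plain Python of what a sigmoid on the last layer does: it squashes the network's raw output (the logit) into a value between 0 and 1 that can be read as the probability of the positive class. The function names and the 0.5 threshold are assumptions for illustration, not part of any particular framework.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a value in [0, 1]."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_class(logit: float, threshold: float = 0.5) -> int:
    """Return 1 (positive class) if the sigmoid output reaches the threshold, else 0."""
    return 1 if sigmoid(logit) >= threshold else 0
```

With this sketch, a logit of 0 gives a probability of exactly 0.5, positive logits lean towards class 1, and negative logits lean towards class 0.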

