For binary outputs you can use 1 output unit:

```python
self.outputs = nn.Linear(NETWORK_WIDTH, 1)
```
Then you use a sigmoid activation to map the values of your output unit to a range between 0 and 1 (of course you need to arrange your training data this way too):
```python
def forward(self, x):
    # other layers omitted
    x = self.outputs(x)
    return torch.sigmoid(x)
```
Finally, you can use `torch.nn.BCELoss`:
```python
criterion = nn.BCELoss()

net_out = net(data)
# target must be a float tensor of the same shape as net_out
loss = criterion(net_out, target)
```
This should work fine for you.
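Putting the pieces together, here is a minimal end-to-end sketch; the hidden layer, `n_features`, and the dummy batch are just placeholders, not part of your model:

```python
import torch
import torch.nn as nn

NETWORK_WIDTH = 64  # placeholder value

class Net(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        self.hidden = nn.Linear(n_features, NETWORK_WIDTH)  # placeholder hidden layer
        self.outputs = nn.Linear(NETWORK_WIDTH, 1)

    def forward(self, x):
        x = torch.relu(self.hidden(x))
        x = self.outputs(x)
        return torch.sigmoid(x)

net = Net(n_features=10)
criterion = nn.BCELoss()

data = torch.randn(8, 10)                      # dummy batch of 8 samples
target = torch.randint(0, 2, (8, 1)).float()   # labels as floats in {0, 1}, shape (8, 1)

loss = criterion(net(data), target)
loss.backward()
```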
You can also use `torch.nn.BCEWithLogitsLoss`; this loss function already includes the sigmoid, so you can leave it out of your `forward`.
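In that case the changes relative to the sketch above would look like this (same layers, just no sigmoid in `forward`):

```python
def forward(self, x):
    # other layers omitted
    x = self.outputs(x)
    return x  # raw logits, no sigmoid

criterion = nn.BCEWithLogitsLoss()

net_out = net(data)                # logits
loss = criterion(net_out, target)  # sigmoid is applied inside the loss
```

This variant is also more numerically stable than a separate sigmoid followed by `BCELoss`.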
If you want to use 2 output units, this is also possible. But then you need to use `torch.nn.CrossEntropyLoss` instead of `BCELoss`. The softmax activation is already included in this loss function.
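A minimal sketch of the 2-unit variant; note that `CrossEntropyLoss` expects the target as class indices, not floats:

```python
self.outputs = nn.Linear(NETWORK_WIDTH, 2)  # in __init__

def forward(self, x):
    # other layers omitted
    return self.outputs(x)  # raw logits, shape (batch_size, 2)

criterion = nn.CrossEntropyLoss()

net_out = net(data)                 # shape (batch_size, 2)
target = torch.randint(0, 2, (8,))  # dummy class indices 0/1, LongTensor, shape (batch_size,)
loss = criterion(net_out, target)
```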
Edit: I just want to emphasize that there is a real difference in doing so. Using 2 output units gives you twice as many weights compared to using 1 output unit, so these two alternatives are not equivalent.
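To make the difference concrete (`NETWORK_WIDTH = 64` is just a placeholder value):

```python
import torch.nn as nn

NETWORK_WIDTH = 64  # placeholder value

one_unit = nn.Linear(NETWORK_WIDTH, 1)   # 64 weights + 1 bias   = 65 parameters
two_units = nn.Linear(NETWORK_WIDTH, 2)  # 128 weights + 2 biases = 130 parameters

print(sum(p.numel() for p in one_unit.parameters()))   # 65
print(sum(p.numel() for p in two_units.parameters()))  # 130
```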