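In logistic regression we learn a family of functions $h$ from $\mathbb{R}^d$ to the interval $[0,1]$. Although the output is real-valued, logistic regression is used for classification: $h(x)$ is interpreted as the probability that the label of $x$ is $1$. Each such function is the composition of a sigmoid function, here the logistic function, with a linear function.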
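The hypothesis class is therefore (where for simplicity we are using homogeneous linear functions):
$$H_{\mathrm{sig}} = \{\, x \mapsto \phi_{\mathrm{sig}}(\langle w, x \rangle) : w \in \mathbb{R}^d \,\}, \qquad \phi_{\mathrm{sig}}(z) = \frac{1}{1 + \exp(-z)}.$$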
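Note that when $\langle w, x \rangle$ is very large, the sigmoid output $\phi_{\mathrm{sig}}(\langle w, x \rangle)$ is close to $1$, whereas when $\langle w, x \rangle$ is very negative it is close to $0$.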
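The saturation behaviour described here can be checked numerically. The following is a minimal sketch, assuming the standard logistic function $1/(1+\exp(-z))$ and homogeneous linear predictors; the names `sigmoid` and `predict_proba` are illustrative and do not come from the text.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) lies in [0, 1) and cannot overflow
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """h_w(x) = sigmoid(<w, x>): the predicted probability that the label of x is 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

# Hypothetical inputs: the inner product <w, x> takes the values 0, 20 and -20.
for w, x in [([1.0, -2.0], [2.0, 1.0]), ([4.0, 0.0], [5.0, 1.0]), ([-4.0, 0.0], [5.0, 1.0])]:
    print(predict_proba(w, x))
```

Splitting on the sign of `z` keeps the argument of `math.exp` non-positive, so the function stays well defined even for inner products of very large magnitude.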
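Next, we need to specify a loss function. That is, we should define how bad it is to predict some $h_w(x) \in [0,1]$ given that the true label is $y \in \{\pm 1\}$. We would like $h_w(x)$ to be large when $y = 1$, and $1 - h_w(x) = \frac{1}{1 + \exp(\langle w, x \rangle)}$, the predicted probability of the label $-1$, to be large when $y = -1$.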
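Therefore, any reasonable loss function would increase monotonically with $\frac{1}{1 + \exp(y \langle w, x \rangle)}$, the probability that the predictor assigns to the wrong label when the true label is $y \in \{\pm 1\}$, or equivalently with $1 + \exp(-y \langle w, x \rangle)$. The logistic loss penalizes $h_w$ by the logarithm of this quantity, $\log\bigl(1 + \exp(-y \langle w, x \rangle)\bigr)$.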
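As an illustration, the loss for a single example can be sketched as below. This assumes the standard logistic loss $\log(1 + \exp(-y \langle w, x \rangle))$ with labels in $\{+1, -1\}$; the function name `logistic_loss` is illustrative.

```python
import math

def logistic_loss(w, x, y):
    """log(1 + exp(-y <w, x>)) for a label y in {+1, -1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For a negative margin use log(1 + e^{-m}) = -m + log(1 + e^{m}),
    # which avoids overflow in exp(-margin).
    return -margin + math.log1p(math.exp(margin))

# Hypothetical predictor and point: the same x is scored against both labels.
w, x = [3.0], [1.0]
print(logistic_loss(w, x, +1))  # small loss: the prediction agrees with the label
print(logistic_loss(w, x, -1))  # large loss: the prediction contradicts the label
```

The loss depends on the example only through the margin $y \langle w, x \rangle$, and it grows roughly linearly as the margin becomes more negative.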
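Therefore, given a training set $S = (x_1, y_1), \ldots, (x_m, y_m)$, the ERM problem associated with logistic regression is
$$\operatorname*{argmin}_{w \in \mathbb{R}^d} \; \frac{1}{m} \sum_{i=1}^{m} \log\bigl(1 + \exp(-y_i \langle w, x_i \rangle)\bigr).$$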
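The objective of this ERM problem can be evaluated directly. The sketch below assumes it is the average of $\log(1 + \exp(-y_i \langle w, x_i \rangle))$ over the training set; the toy data `S` and the function names are hypothetical and serve only as illustration.

```python
import math

def logistic_loss(margin):
    """log(1 + exp(-margin)), evaluated without overflow."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def empirical_risk(w, samples):
    """(1/m) * sum_i log(1 + exp(-y_i <w, x_i>)) over samples [(x_i, y_i), ...]."""
    total = 0.0
    for x, y in samples:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        total += logistic_loss(margin)
    return total / len(samples)

# Hypothetical toy training set with labels in {+1, -1}.
S = [([1.0, 2.0], 1), ([2.0, 0.5], 1), ([-1.0, -1.5], -1), ([-2.0, 0.0], -1)]

print(empirical_risk([0.0, 0.0], S))  # w = 0 gives log(2) for every example
print(empirical_risk([1.0, 1.0], S))  # a w that classifies every point correctly
```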
The advantage of the logistic loss function is that it is a convex function with respect to $w$; hence the ERM problem can be solved efficiently using standard methods.
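Because the objective is convex in $w$, even plain gradient descent makes steady progress on it. The sketch below is one possible solver, not a method prescribed by the text: it assumes the average logistic loss as the objective, and the toy data `S`, the step size, and the iteration count are hypothetical choices.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def risk(w, samples):
    """Average logistic loss (1/m) * sum_i log(1 + exp(-y_i <w, x_i>))."""
    total = 0.0
    for x, y in samples:
        m = y * dot(w, x)
        total += math.log1p(math.exp(-m)) if m >= 0 else -m + math.log1p(math.exp(m))
    return total / len(samples)

def gradient(w, samples):
    """Gradient of the risk: -(1/m) * sum_i y_i * x_i / (1 + exp(y_i <w, x_i>))."""
    g = [0.0] * len(w)
    for x, y in samples:
        m = y * dot(w, x)
        # coeff = 1 / (1 + exp(m)), written so that exp never overflows
        if m >= 0:
            em = math.exp(-m)
            coeff = em / (1.0 + em)
        else:
            coeff = 1.0 / (1.0 + math.exp(m))
        for j, xj in enumerate(x):
            g[j] -= y * xj * coeff / len(samples)
    return g

def gradient_descent(samples, dim, step=0.5, iters=200):
    """Fixed-step gradient descent started from w = 0."""
    w = [0.0] * dim
    for _ in range(iters):
        g = gradient(w, samples)
        w = [wj - step * gj for wj, gj in zip(w, g)]
    return w

# Hypothetical toy training set with labels in {+1, -1}.
S = [([1.0, 2.0], 1), ([2.0, 0.5], 1), ([-1.0, -1.5], -1), ([-2.0, 0.0], -1)]
w_hat = gradient_descent(S, dim=2)
print(risk([0.0, 0.0], S), risk(w_hat, S))
```

On linearly separable data such as this toy set the risk has no finite minimizer, so the iterates keep growing in norm while the risk approaches zero. In practice one would stop early or add regularization.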