Artificial Intelligence : Notes

Logistic Regression

Logistic Regression is a supervised learning algorithm that predicts the probability that an instance belongs to a particular class. It is particularly useful for binary classification problems, where the outcome is one of two classes (e.g., spam or not spam). It can also be extended to multiclass classification, where there are more than two classes.

Logistic model for binary classification

Let $X$ denote the input and $Y \in \{0, 1\}$ the class that $X$ belongs to. Define:

$$p(X) = P(Y = 1 \mid X)$$

The logistic model is:

$$p(X) = \frac{\exp(\beta_0 + \beta_1 X)}{1 + \exp(\beta_0 + \beta_1 X)}$$

Solving for the log-odds (simple algebra), it can be rewritten as:

$$\text{(E1)}: \quad \log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X$$
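The step from the logistic model to (E1) can be sketched as follows, writing $z = \beta_0 + \beta_1 X$ as shorthand (the symbol $z$ is introduced here only for brevity and is not part of the original notation):

$$
\begin{aligned}
p(X) &= \frac{e^{z}}{1 + e^{z}}, \qquad 1 - p(X) = \frac{1}{1 + e^{z}} \\
\frac{p(X)}{1 - p(X)} &= e^{z} \\
\log\left(\frac{p(X)}{1 - p(X)}\right) &= z = \beta_0 + \beta_1 X
\end{aligned}
$$

The left-hand side is the log-odds (logit) of $Y = 1$, so the model is linear in the log-odds rather than in the probability itself.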

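The estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ are chosen to maximize the likelihood function, which is equivalent to minimizing the binary cross-entropy over the $n$ training pairs $(x_i, y_i)$, where $p_i = p(x_i)$:

$$-\frac{1}{n} \sum_{i=1}^{n} \Big( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \Big)$$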
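As an illustration, the minimization can be sketched with plain gradient descent in NumPy. This is a minimal sketch under stated assumptions, not a reference implementation: the helper names (`sigmoid`, `binary_cross_entropy`, `fit_logistic`), the learning rate, the iteration count, and the synthetic data are all chosen here for the example.

```python
import numpy as np

def sigmoid(z):
    # Logistic function exp(z) / (1 + exp(z)), in the equivalent form 1 / (1 + exp(-z)).
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y, p, eps=1e-12):
    # Mean binary cross-entropy; probabilities are clipped to avoid log(0).
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fit_logistic(x, y, lr=0.1, n_iter=5000):
    # Gradient descent on the mean cross-entropy for one predictor.
    # lr and n_iter are illustrative defaults, not tuned values.
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = sigmoid(b0 + b1 * x)
        b0 -= lr * np.mean(p - y)        # d(loss)/d(b0)
        b1 -= lr * np.mean((p - y) * x)  # d(loss)/d(b1)
    return b0, b1

# Synthetic data drawn from a logistic model with assumed coefficients.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
true_b0, true_b1 = -0.5, 2.0
y = (rng.random(500) < sigmoid(true_b0 + true_b1 * x)).astype(float)

b0_hat, b1_hat = fit_logistic(x, y)
loss_start = binary_cross_entropy(y, sigmoid(np.zeros_like(x)))  # all p_i = 0.5
loss_fit = binary_cross_entropy(y, sigmoid(b0_hat + b1_hat * x))
print(b0_hat, b1_hat, loss_start, loss_fit)
```

With both coefficients at zero every $p_i$ equals 0.5, so the starting loss is $\log 2$; the fitted loss should be lower. In practice a library solver (Newton's method or L-BFGS) is usually preferred over hand-written gradient descent.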

Logistic model for multiple predictors

With the same notation, and assuming $X = (X_1, \ldots, X_p)$, the model generalizes equation (E1) to:

$$\text{(E2)}: \quad \log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p$$
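A small numerical check may make the multi-predictor model concrete: the probability is the logistic function of the linear predictor, and taking the log-odds recovers that linear predictor. The function name `predict_proba` and the coefficient values below are illustrative assumptions.

```python
import numpy as np

def predict_proba(X, beta0, beta):
    # X has shape (n, p); beta has shape (p,). Returns P(Y = 1 | X) for each row.
    z = beta0 + X @ beta
    return 1.0 / (1.0 + np.exp(-z))

# Two observations with p = 3 predictors (made-up values).
X = np.array([[0.5, -1.0,  2.0],
              [1.5,  0.0, -0.5]])
beta0 = 0.1
beta = np.array([1.0, -2.0, 0.5])

p = predict_proba(X, beta0, beta)
log_odds = np.log(p / (1.0 - p))  # should equal beta0 + X @ beta
print(p, log_odds)
```

Each coefficient $\beta_j$ is therefore the change in log-odds for a one-unit increase in $X_j$, holding the other predictors fixed.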

Multinomial Logistic Regression

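Denote by $\beta_{0k}, \ldots, \beta_{pk}$ the coefficients for the $k$-th class. Assuming $K$ categories, the model is typically expressed as:

$$P(Y = k \mid X) = \frac{\exp(\beta_{0k} + \beta_{1k} X_1 + \ldots + \beta_{pk} X_p)}{\sum_{j=1}^{K} \exp(\beta_{0j} + \beta_{1j} X_1 + \ldots + \beta_{pj} X_p)}$$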
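The multinomial formula is the softmax of the per-class linear scores. The sketch below computes it in NumPy; the function name `softmax_proba`, the array layout (one column of coefficients per class), and the example values are assumptions made for illustration.

```python
import numpy as np

def softmax_proba(X, B0, B):
    # X: (n, p) inputs; B0: (K,) intercepts; B: (p, K) coefficients, one column per class.
    # Returns an (n, K) array whose entry [i, k] is P(Y = k | X_i).
    scores = B0 + X @ B
    # Subtracting the row maximum leaves the ratios unchanged and avoids overflow in exp.
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

# Two observations, p = 2 predictors, K = 3 classes (made-up coefficients).
X = np.array([[1.0, 0.0],
              [0.0, 1.0]])
B0 = np.zeros(3)
B = np.array([[2.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])

P = softmax_proba(X, B0, B)
predicted_class = P.argmax(axis=1)  # index of the most probable class per row
print(P, predicted_class)
```

Each row of `P` sums to 1, and the predicted class is the one with the largest linear score.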

See also

DNN for classification
