Supervised and Unsupervised Learning Model Reference

Posted on Jun 13, 2026 in Computer Engineering

Supervised Classification

Logistic Regression
Type: Binary Classification
Scaling: Yes (StandardScaler)
Outliers: Not robust
Categorical Variables: No (encode first)
Core Idea: The sigmoid function maps a linear combination of the features to a probability between 0 and 1; a probability ≥ 0.5 (the default threshold) predicts class 1.
Advantages: Fast, simple, interpretable, outputs probabilities.
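Disadvantages: Natively binary (multiclass needs a one-vs-rest or multinomial extension), assumes a linear decision boundary, underfits non-linear data unless features are transformed.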
Metrics: Accuracy, Precision, Recall, F1-Score, Confusion Matrix.
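
Example: the sigmoid-and-threshold step can be sketched in a few lines of plain Python. This is a minimal illustration, not a fitted model: the names `sigmoid` and `predict_class` and the coefficient values are assumptions made up for the example.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real-valued score to a probability in (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_class(features, weights, bias, threshold=0.5):
    """Return (probability of class 1, predicted label) for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    return p, int(p >= threshold)

# Hypothetical coefficients, not learned from any real data.
weights, bias = [0.8, -0.4], -0.2
prob, label = predict_class([2.0, 1.0], weights, bias)
print(f"p = {prob:.3f}, class = {label}")
```

Raising `threshold` trades recall for precision, which is why the metrics above are usually read together rather than accuracy alone.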

Decision Tree
Type: Classification and Regression
Scaling: No (not required)
Outliers: Robust
Categorical Variables: Yes in principle (some libraries, e.g. scikit-learn, still require numeric encoding)
Core Idea: IF-ELSE splits by feature; leaf nodes represent final predictions.
Advantages: Interpretable, no scaling needed, handles any data type, fast.
Disadvantages: Prone to overfitting, sensitive to small data changes.
Metrics: Accuracy, Confusion Matrix. Gini Impurity is the split criterion used while growing the tree, not an evaluation metric.
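
Example: Gini impurity scores how mixed the classes in a node are (0 means pure), and a tree prefers the split with the lowest weighted impurity across its children. The sketch below assumes a toy yes/no label set; the function names `gini` and `split_gini` are illustrative, not from any library.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(left, right):
    """Impurity of a candidate split, weighted by child node size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Toy data: a 50/50 parent node and one candidate IF-ELSE split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print("parent impurity:", gini(parent))
print("split impurity: ", split_gini(left, right))
```

The split is worth making here because the weighted child impurity is lower than the parent's.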

Random Forest
Type: Classification and Regression
Scaling: No
Outliers: Robust
Categorical Variables: Yes
Core Idea: Ensemble of trees (Bagging), each trained on a bootstrap sample; the final prediction is the majority vote (classification) or the average (regression) of the trees.
Advantages: Reduces overfitting, stable, provides feature importance; some implementations also handle missing values.
Disadvantages: Slow, less interpretable, requires hyperparameter tuning.
Metrics: Accuracy, Out-of-Bag (OOB) error. Feature Importance is used for interpretation rather than evaluation.
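
Example: two pieces of bagging are easy to show without a library: drawing a bootstrap sample (which leaves some rows out-of-bag) and combining per-tree predictions by majority vote. This is only a sketch; the per-tree votes are hard-coded placeholders rather than outputs of real trees, and the helper names are assumptions.

```python
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Sample n row indices with replacement (one tree's training set)."""
    return [rng.randrange(n) for _ in range(n)]

def majority_vote(predictions):
    """Most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_samples = 10

in_bag = bootstrap_indices(n_samples, rng)
# Rows never drawn are "out-of-bag" and can be used to estimate OOB error.
out_of_bag = sorted(set(range(n_samples)) - set(in_bag))

# Placeholder predictions from five hypothetical trees for one sample.
tree_votes = ["spam", "ham", "spam", "spam", "ham"]
final = majority_vote(tree_votes)

print("in-bag rows:    ", sorted(in_bag))
print("out-of-bag rows:", out_of_bag)
print("forest predicts:", final)
```

For regression the vote would be replaced by the mean of the per-tree predictions.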

Linear Regression
Type: Regression (continuous target only)
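Scaling: Recommended (plain least-squares predictions do not depend on it; regularized variants such as Ridge and Lasso, and coefficient comparison, do)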
Outliers: Not robust (outliers distort the regression line)
Categorical Variables: No (encode first)
Core Idea: y = b0 + b1x1 + b2x2 + … predicts a continuous value.
Advantages: Simple, fast, interpretable, coefficients indicate feature impact.
Disadvantages: Assumes linearity, fails on complex patterns, sensitive to outliers.
Metrics: MAE, MSE, RMSE, R².
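
Example: the prediction formula and the four metrics above can be computed by hand as follows. The intercept, coefficient, and data values are made up for illustration, and `predict` and `regression_metrics` are hypothetical helper names.

```python
import math

def predict(x, b0, coefs):
    """y = b0 + b1*x1 + b2*x2 + ... for one sample."""
    return b0 + sum(b * xi for b, xi in zip(coefs, x))

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² (R² is undefined for a constant y_true)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)               # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total variation
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse),
            "R2": 1.0 - ss_res / ss_tot}

# Illustrative model y = 1 + 2*x and four observed targets.
X = [[1.0], [2.0], [3.0], [4.0]]
y_true = [3.5, 4.5, 7.0, 10.0]
y_pred = [predict(x, b0=1.0, coefs=[2.0]) for x in X]

metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MSE squares each error, the single miss of 1.0 contributes more to MSE and RMSE than the two misses of 0.5 combined, which is the outlier sensitivity noted above.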

K-Nearest Neighbors (KNN)
Type: Classification and Regression
Scaling: Yes (Distance-based; mandatory)
Outliers: Not robust
Categorical Variables: No (encode first)
Core Idea: Classification via majority vote of K neighbors; Regression via average of K neighbors.
Advantages: Simple, no training phase, effective with small datasets.
Disadvantages: Slow on large data, sensitive to outliers.
Note: K too low overfits to noise; K too high underfits. A common starting point is K = 5 (the scikit-learn default for n_neighbors).
Metrics: Accuracy, F1 (Class); MAE, MSE, R² (Regression).
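
Example: a brute-force sketch of both KNN modes, assuming Euclidean distance and tiny hand-made datasets. The function names are illustrative; real libraries use faster neighbor search than sorting every point.

```python
import math
from collections import Counter

def k_nearest(train, query, k):
    """train is a list of (features, target); return the k closest items."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]

def knn_classify(train, query, k=5):
    """Majority vote among the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return Counter(target for _, target in neighbors).most_common(1)[0][0]

def knn_regress(train, query, k=5):
    """Average target of the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(target for _, target in neighbors) / len(neighbors)

# Toy classification data: two well-separated groups.
labeled = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
           ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B")]
# Toy regression data: one feature, numeric target.
numeric = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]

print(knn_classify(labeled, (2, 2), k=3))
print(knn_regress(numeric, (2,), k=3))
```

There is no fitting step: all the work happens at query time, which is why prediction slows down as the training set grows.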

K-Means Clustering
Type: Clustering (No target variable)
Scaling: Yes (Distance-based)
Outliers: Not robust (shifts centroids)
Categorical Variables: No
Core Idea: Iteratively assign points to the nearest of K centroids and update centroids until stable.
Inertia: Sum of squared distances to centroid; lower is more compact. Used in the Elbow method.
Elbow Method: Plot inertia vs. K; pick K where the curve bends.
Advantages: Fast, simple, scalable.
Disadvantages: Must pre-set K, assumes roughly spherical clusters, results vary with random initialization (mitigated by multiple restarts or k-means++ seeding).
Metrics: Inertia, Silhouette score.
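
Example: the assign/update loop and the inertia used by the Elbow method, written out in plain Python. This sketch assumes fixed starting centroids (instead of random initialization) so the result is repeatable; all names and data points are illustrative.

```python
import math

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)),
                key=lambda j: math.dist(p, centroids[j])) for p in points]

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (empty clusters stay put)."""
    new = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new.append(old)
    return new

def kmeans(points, centroids, max_iter=100):
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        new = update(points, assign(points, centroids), centroids)
        if new == centroids:   # stable: no centroid moved
            break
        centroids = new
    labels = assign(points, centroids)
    inertia = sum(math.dist(p, centroids[lab]) ** 2
                  for p, lab in zip(points, labels))
    return centroids, labels, inertia

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
_, _, inertia_k1 = kmeans(points, [(0, 0)])
centroids, labels, inertia_k2 = kmeans(points, [(0, 0), (10, 10)])

print("K=1 inertia:", round(inertia_k1, 3))
print("K=2 inertia:", round(inertia_k2, 3), "labels:", labels)
```

Inertia drops sharply from K=1 to K=2 on this data; plotting it for larger K and looking for where the drop flattens is the Elbow method described above.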

Hierarchical (Agglomerative) Clustering
Type: Clustering (No target variable)
Scaling: Yes (Distance-based)
Outliers: Depends on linkage
Categorical Variables: No
Core Idea: Agglomerative approach; start with each of the N points as its own cluster and repeatedly merge the closest pair of clusters until one remains.
Dendrogram: Tree diagram showing merge history; cut horizontally to determine cluster count.
Linkage Types:
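- Single: Distance between the two closest points of the clusters (prone to chaining; isolates outliers, so useful for spotting them).
- Complete: Distance between the two farthest points (favors compact, roughly spherical clusters).
- Average: Mean distance over all cross-cluster point pairs (less sensitive to outliers than single or complete).
- Ward: Merges the pair that least increases within-cluster variance (a common general-purpose default).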
Advantages: No K needed in advance, deterministic, shows full merge history.
Disadvantages: Slow on large data, highly sensitive to linkage choice.
Metrics: Dendrogram, Silhouette score, Elbow (distortion).
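
Example: the linkage types differ only in how they measure the distance between two clusters. The sketch below computes each one for a pair of toy clusters, taking average linkage as the mean over all cross-cluster pairs and Ward as the increase in within-cluster sum of squares a merge would cause. Function names and data are illustrative assumptions.

```python
import math
from itertools import product

def pairwise(a, b):
    """All Euclidean distances between a point in a and a point in b."""
    return [math.dist(p, q) for p, q in product(a, b)]

def single_linkage(a, b):
    return min(pairwise(a, b))          # closest pair

def complete_linkage(a, b):
    return max(pairwise(a, b))          # farthest pair

def average_linkage(a, b):
    d = pairwise(a, b)
    return sum(d) / len(d)              # mean over all pairs

def centroid(cluster):
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

a = [(0, 0), (0, 1)]
b = [(3, 0), (4, 0)]

print("single:  ", single_linkage(a, b))
print("complete:", round(complete_linkage(a, b), 4))
print("average: ", round(average_linkage(a, b), 4))
print("ward:    ", ward_cost(a, b))
```

At each step the algorithm merges the pair of clusters with the smallest value under the chosen linkage, and the dendrogram records the value at which each merge happened.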

Note (association rules): A rule with high confidence but Lift ≈ 1 is uninformative. The consequent is just as common with or without the antecedent, so the antecedent adds no predictive value.
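
Example: the note can be checked numerically. In the made-up basket data below, the rule milk → bread has 80% confidence, yet bread appears in 80% of all baskets anyway, so lift is 1. The function name `rule_metrics` and the transactions are assumptions for illustration.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur at least once.
    """
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_both = sum(1 for t in transactions if (a | c) <= t)
    confidence = n_both / n_a            # P(consequent | antecedent)
    lift = confidence / (n_c / n)        # relative to P(consequent)
    return {"support": n_both / n, "confidence": confidence, "lift": lift}

# Ten hypothetical baskets: bread is popular regardless of milk.
transactions = (
    [{"milk", "bread"}] * 4
    + [{"milk"}]
    + [{"bread"}] * 4
    + [{"eggs"}]
)

result = rule_metrics(transactions, {"milk"}, {"bread"})
print(result)
```

A lift clearly above 1 would mean the antecedent makes the consequent more likely than its baseline rate; a lift below 1 would mean it makes it less likely.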