🧠 AIMS Coursework

Bayesian Deep Active Learning

Late 2024 — Active Learning with Uncertainty Quantification

BNNs · MC Dropout · BALD · Margin Sampling

About

Implementations and analyses of active learning experiments using Bayesian neural networks (BNNs) with MC Dropout. The experiments span MNIST and Dirty-MNIST datasets, investigating different acquisition strategies including Random Sampling, Margin Sampling, and BALD (Bayesian Active Learning by Disagreement).

View on GitHub →

MNIST Experiments

1. Setup: Training a BNN with MC Dropout

The first step was training a Bayesian neural network with MC Dropout on the full MNIST training set, then evaluating test accuracy and negative log-likelihood (NLL). A sketch of the model follows.
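A minimal sketch of what such a model can look like, assuming PyTorch; the architecture, dropout rate, and the names MCDropoutCNN / mc_predict are illustrative rather than the exact coursework configuration. The key detail is that dropout stays stochastic at test time, so repeated forward passes act as samples from an approximate posterior:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MCDropoutCNN(nn.Module):
    """Small CNN whose dropout stays active at test time, so repeated
    forward passes approximate samples from the weight posterior."""

    def __init__(self, p: float = 0.5, num_classes: int = 10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.fc1 = nn.Linear(64 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, num_classes)
        self.p = p

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # 28x28 -> 13x13
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # 13x13 -> 5x5
        x = x.flatten(1)
        # training=True keeps dropout stochastic even in eval mode.
        x = F.dropout(F.relu(self.fc1(x)), p=self.p, training=True)
        return self.fc2(x)  # logits

@torch.no_grad()
def mc_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 20) -> torch.Tensor:
    """MC Dropout class probabilities, shape (n_samples, batch, classes)."""
    return torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
```

Test NLL is then the mean negative log of the MC-averaged predictive probability assigned to the true class.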

Final Results (Epoch 10/10):

  • Train Accuracy: 97.25%
  • Test Accuracy: 98.74%
  • Train NLL: 0.0943
  • Test NLL: 0.0667

Figure: Training and test accuracy and NLL curves over 10 epochs.

2. Baseline: Random Acquisition

Active learning with random acquisition, starting from 2 labeled examples per class (20 total) and acquiring batches of K=10 samples until reaching 1,020 labeled examples; results are averaged over 10 trials. The acquisition loop is sketched below.
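A sketch of the loop under these settings; train_model and evaluate are hypothetical stand-ins for the actual training and evaluation code:

```python
import numpy as np

def random_acquisition_run(X_pool, y_pool, init_idx, n_rounds=100, k=10, seed=0):
    """Pool-based active learning with random acquisition.

    Starts from init_idx (e.g. 2 examples per class, 20 total) and moves
    k randomly chosen pool points into the labeled set each round; 100
    rounds of k=10 reach 20 + 1000 = 1020 labels. train_model and
    evaluate are hypothetical stand-ins for the real training/eval code.
    """
    rng = np.random.default_rng(seed)
    labeled = list(init_idx)
    pool = [i for i in range(len(X_pool)) if i not in set(init_idx)]
    curve = []
    for _ in range(n_rounds):
        model = train_model(X_pool[labeled], y_pool[labeled])  # retrain from scratch
        curve.append(evaluate(model))                          # e.g. test accuracy, NLL
        chosen = rng.choice(len(pool), size=k, replace=False)  # random acquisition
        for j in sorted(chosen, reverse=True):                 # pop from the back first
            labeled.append(pool.pop(j))
    return curve
```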

Figure: Random acquisition learning curve with confidence intervals; accuracy improves rapidly in the early rounds.

3. Scaling Behavior: Epochs vs. Labeled Data

Exploring the trade-off between training longer (more epochs) and acquiring more labeled data; a sketch of the two sweeps follows the figure.

Figure: Varying epochs (fixed 5 labels/class, epochs 1→2048) vs. varying labels (fixed 10 epochs, 2→512 labels/class).
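The two sweeps reduce to a small grid of runs. A sketch, where train_and_eval is a hypothetical helper that trains for the given budget and returns test metrics:

```python
# Hypothetical helper: train_and_eval(labels_per_class, epochs) -> (test_acc, test_nll)
epoch_sweep = [train_and_eval(labels_per_class=5, epochs=2**i)
               for i in range(12)]        # 1, 2, 4, ..., 2048 epochs
label_sweep = [train_and_eval(labels_per_class=2**i, epochs=10)
               for i in range(1, 10)]     # 2, 4, ..., 512 labels per class
```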

Key Insight: Acquiring more labeled examples generally improves accuracy and reduces NLL more than training for additional epochs does.

4. Margin Sampling

Selects the samples with the smallest difference between the probabilities of the two most likely classes, prioritizing inputs where the model is least decisive between its top two predictions.
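A minimal sketch of the score, assuming class probabilities averaged over MC Dropout samples; margin_score is an illustrative name:

```python
import torch

def margin_score(probs: torch.Tensor) -> torch.Tensor:
    """Margin between the two most probable classes.

    probs has shape (batch, classes); with MC Dropout it would be the
    mean over stochastic forward passes. A smaller margin means more
    uncertainty, so acquisition picks the points with the lowest scores.
    """
    top2 = probs.topk(2, dim=-1).values
    return top2[:, 0] - top2[:, 1]

# Acquire the k most uncertain pool points (smallest margins):
# idx = margin_score(probs).topk(k, largest=False).indices
```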

Figure: Margin Sampling learning curve.

5. BALD (Bayesian Active Learning by Disagreement)

A Bayesian acquisition strategy that selects samples with high epistemic uncertainty, quantified as the disagreement among the predictions of the MC Dropout ensemble.
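Formally, the BALD score is the mutual information between the predicted label and the model parameters, I(y; w | x, D) = H[E_w p(y|x,w)] - E_w H[p(y|x,w)]. A sketch of the estimator, assuming MC Dropout samples shaped like the output of the mc_predict sketch above:

```python
import torch

def bald_score(mc_probs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """BALD score estimated from MC Dropout samples.

    mc_probs: (n_samples, batch, classes) class probabilities.

    I = H[mean_w p(y|x,w)]  -  mean_w H[p(y|x,w)]
        (total uncertainty)    (expected aleatoric uncertainty)
    """
    mean_probs = mc_probs.mean(dim=0)
    predictive_entropy = -(mean_probs * (mean_probs + eps).log()).sum(dim=-1)
    expected_entropy = -(mc_probs * (mc_probs + eps).log()).sum(dim=-1).mean(dim=0)
    return predictive_entropy - expected_entropy  # high = high epistemic uncertainty
```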

Figure: BALD acquisition learning curve.

All Strategies Compared

Figure: All acquisition strategies compared.

Results Summary:

  • 🥇 Margin Sampling: Best accuracy, convergence speed, and stability
  • 🥈 Random Acquisition: Reasonable baseline performance
  • 🥉 BALD: Slower convergence but valuable for understanding uncertainty

6. Top-K Trade-Offs for BALD

Comparing single-sample acquisition (K=1) with batch acquisition (K=10) in BALD. Batch acquisition amortizes the cost of retraining the model after each acquisition round; a sketch follows.
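Top-K acquisition is just a k-largest selection over the BALD scores. A sketch, with the trade-off noted in the comments:

```python
import torch

def acquire_top_k(bald_scores: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k pool points with the highest BALD scores.

    K=1 re-scores the pool after every single label (most adaptive, most
    retraining); K=10 labels ten points per retrain, so e.g. 1,000
    acquired labels need 100 retraining rounds instead of 1,000. Scores
    are computed independently, so a batch may contain informative but
    mutually redundant near-duplicates.
    """
    return bald_scores.topk(k).indices
```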

Figure: K=1 vs. K=10 BALD; K=10 shows more stable learning and faster convergence.

Dirty-MNIST Experiments

Evaluating acquisition strategies on the noisy Dirty-MNIST dataset to test their robustness to noise.

Figure: Dirty-MNIST strategy comparison.

Key Finding: Margin Sampling is the most robust to noise, outperforming both BALD and entropy-based acquisition (sketched below) on noisy data.
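For reference, the entropy baseline mentioned here scores points by predictive entropy. A sketch under the same MC Dropout setup as above; unlike BALD, predictive entropy mixes aleatoric and epistemic uncertainty, which is one plausible reason it struggles on inherently ambiguous Dirty-MNIST points:

```python
import torch

def entropy_score(mc_probs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Predictive entropy H[mean_w p(y|x,w)] from MC Dropout samples.

    mc_probs: (n_samples, batch, classes). High entropy flags uncertain
    points, but it does not separate epistemic uncertainty (reducible
    with more data) from aleatoric noise in the inputs themselves.
    """
    mean_probs = mc_probs.mean(dim=0)
    return -(mean_probs * (mean_probs + eps).log()).sum(dim=-1)
```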

Key Takeaways

  • Margin Sampling wins — most effective for both clean and noisy datasets
  • More data beats more epochs — acquiring labels is more valuable than training longer
  • Batch acquisition (K=10) is more practical than single-sample (K=1)
  • BALD provides uncertainty insights but has computational overhead