Bagging vs Boosting

Swagata Ashwani
2 min read · May 29, 2022
Photo by Claudio Schwarz on Unsplash

Before we understand the difference between Bagging and Boosting, let's understand what ensemble learning is.

Ensemble learning in Machine Learning is a way of building a model from multiple classifiers instead of one. Ensembles generally give better results than single traditional models. One widely used example is Random Forest.
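The benefit is easy to see with a toy simulation: if three independent classifiers are each correct 70% of the time, a majority vote over them is correct about 78% of the time. The numbers below are purely illustrative, a minimal sketch rather than a real benchmark:

```python
import random

# Hypothetical setup: three independent classifiers, each correct with
# probability 0.7, combined by majority vote.
rng = random.Random(42)
trials = 10_000
single_correct = 0
ensemble_correct = 0
for _ in range(trials):
    votes = [rng.random() < 0.7 for _ in range(3)]  # True = classifier was right
    single_correct += votes[0]                       # track one classifier alone
    ensemble_correct += sum(votes) >= 2              # majority vote is right
print(single_correct / trials)    # ~0.70
print(ensemble_correct / trials)  # ~0.78 (exactly 0.7**3 + 3*0.7**2*0.3 = 0.784)
```

The gain comes from independence: the majority is wrong only when at least two of the three classifiers err at once.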

There are two main types of ensemble methods-

Bagging

Bootstrap Aggregating, also known as bagging, is an ensemble learning method that trains multiple classifiers in parallel to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. It decreases variance and helps to avoid overfitting.

Steps involved in Bagging-

  1. Create multiple bootstrap samples of the dataset by selecting observations with replacement; each sample is typically the same size as the original dataset.
  2. A base model is created for each of these samples.
  3. Each model is trained in parallel on its own sample, independently of the others.
  4. The final prediction is determined by combining the predictions from all the models (majority vote for classification, averaging for regression).

Random Forest applies the bagging technique by combining multiple decision trees, each trained on its own bootstrap sample.
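As a minimal sketch of the four steps above, here is a bagging ensemble of decision stumps (one-split trees) on a hypothetical one-dimensional dataset. The data, model, and function names are all illustrative assumptions, not a production implementation:

```python
import random
from collections import Counter

# Hypothetical 1-D dataset: points below ~4.5 are class 0, above are class 1.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0, 0, 0, 0, 1, 1, 1, 1]

def fit_stump(xs, ys):
    """Base model: a one-split decision stump (predict 1 when x > threshold).
    Returns the threshold with the lowest training error."""
    best_t, best_err = None, float("inf")
    for t in xs:
        err = sum(1 for xi, yi in zip(xs, ys) if (xi > t) != bool(yi))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging_fit(X, y, n_models=25, seed=0):
    """Steps 1-3: draw bootstrap samples (with replacement, same size as the
    original data) and train one independent stump per sample."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def bagging_predict(stumps, x):
    """Step 4: combine the models' predictions by majority vote."""
    votes = [int(x > t) for t in stumps]
    return Counter(votes).most_common(1)[0][0]

stumps = bagging_fit(X, y)
print(bagging_predict(stumps, 2.5))  # 0 (left cluster)
print(bagging_predict(stumps, 7.5))  # 1 (right cluster)
```

Each stump sees a slightly different resampled dataset, so the stumps disagree near the boundary; averaging their votes smooths out that variance, which is exactly why bagging reduces overfitting.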

Boosting

Boosting is an ensemble learning method that builds a strong classifier from a number of weak classifiers. It does this by training weak models sequentially, unlike bagging, where models are trained in parallel. First, a model is built from the training data. Then a second model is built that tries to correct the errors of the first. This procedure continues, adding models until either the complete training set is predicted correctly or the maximum number of models is reached.

Steps involved in Boosting-

  1. Assign an equal weight to each observation in the dataset.
  2. Train a weak model on the weighted dataset and identify the incorrect predictions.
  3. Increase the weights of the incorrectly classified data points (and decrease the weights of the correct ones), so the next model focuses on the hard cases.
  4. Repeat from step 2 until all training predictions are correct or the maximum number of models is reached; the final prediction is a weighted combination of all the models.
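The steps above can be sketched in the style of AdaBoost, the classic boosting algorithm, using decision stumps as the weak models. The dataset, weight-update formula, and stopping rule below are a minimal illustrative example, not a full implementation:

```python
import math

# Hypothetical 1-D dataset with labels in {-1, +1}; no single threshold
# separates it, so a sequence of weak models is needed.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1, 1, -1, -1, 1, 1]

def stump_predict(x, t, pol):
    """Weak learner: predict `pol` when x > t, else -pol."""
    return pol if x > t else -pol

def fit_stump(X, y, w):
    """Step 2: train the weak model, minimising the *weighted* error."""
    best, best_err = None, float("inf")
    for t in X:
        for pol in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if stump_predict(xi, t, pol) != yi)
            if err < best_err:
                best, best_err = (t, pol), err
    return best, best_err

def predict(ensemble, x):
    """Final prediction: weighted vote of all weak models."""
    score = sum(a * stump_predict(x, t, pol) for a, t, pol in ensemble)
    return 1 if score > 0 else -1

def adaboost(X, y, max_models=10):
    n = len(X)
    w = [1.0 / n] * n                                  # Step 1: equal weights
    ensemble = []
    for _ in range(max_models):
        (t, pol), err = fit_stump(X, y, w)             # Step 2
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, t, pol))
        # Step 3: increase the weights of misclassified points, decrease
        # the weights of correctly classified ones, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, t, pol))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
        # Step 4: stop once the whole training set is predicted correctly.
        if all(predict(ensemble, xi) == yi for xi, yi in zip(X, y)):
            break
    return ensemble

ensemble = adaboost(X, y)
print([predict(ensemble, xi) for xi in X])  # matches y: [1, 1, -1, -1, 1, 1]
```

No single stump can fit this data, but after a few rounds the weighted combination classifies every training point correctly, which is the sequential error-correction the steps describe.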

