
Tag: randomforest

What’s in a “Random Forest”? Predicting Diabetes

Python Implementation

Now that we’ve covered the concepts behind random forests and decision trees, and how they make their decisions, let’s actually implement the algorithm in Python!

For this implementation, I’ll be using recent real-world data from patients at the Sylhet Diabetes Hospital in Sylhet, Bangladesh. The data was collected and published in June 2020 in a research paper by Dr. MM Faniqul Islam and others (cited below), and is freely available on the UC Irvine Machine Learning Repository.

First, you’ll need to import the CSV file once it’s downloaded from the Repository.…
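As a rough illustration of that import step, here is a minimal sketch using only Python’s standard library. The file name in the comment and the column names in the stand-in data are placeholders I’ve assumed for the example, not details taken from the post; in practice `pandas.read_csv` is the more common way to do this.

```python
import csv
import io


def load_rows(source):
    """Read CSV text from an open file-like object into a list of dicts."""
    reader = csv.DictReader(source)
    return list(reader)


# Stand-in data so the sketch runs on its own.
# The column names are hypothetical placeholders.
sample = io.StringIO("Age,Gender,class\n40,Male,Positive\n58,Female,Negative\n")
rows = load_rows(sample)
print(len(rows), rows[0]["Age"])

# With the downloaded file (the file name is an assumption):
# with open("diabetes_data.csv", newline="") as f:
#     rows = load_rows(f)
```

Note that `csv.DictReader` returns every value as a string, so numeric columns such as age would still need converting before any modelling.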

Essential guide to Impute Missing Values in a single line of Python code

Predict missing data using Random Forest and k-NN based Imputation

Image by Willi Heidelbach from Pixabay

A real-world dataset often has many missing records, which may be caused by data corruption or a failure to record the values. To train a robust machine learning model, handling missing values is an essential part of the feature engineering pipeline.

There are various imputation strategies that can be used to impute missing records for categorical, numerical, or time-series features. You can refer to one of my previous articles where I have discussed 7 strategies or techniques to handle missing records in the dataset.
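To make the idea of k-NN based imputation concrete, here is a simplified sketch in plain Python. It is my own illustration, not the one-line method the article goes on to describe: each missing value is filled with the mean of that column over the `k` nearest rows, with distance measured only over the columns both rows have observed. The function name `knn_impute` and the toy data are assumptions made for the example. scikit-learn’s `KNNImputer` is the usual ready-made tool for this and also rescales distances for missing columns, which this sketch does not.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is Euclidean over the columns that both rows have observed.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # Skip the row itself and rows that also lack column j.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Toy numeric data with one missing value (None) in the second row.
data = [
    [1.0, 2.0],
    [1.1, None],
    [0.9, 1.8],
    [8.0, 9.0],
]
result = knn_impute(data, k=2)
print(result[1])
```

Here the second row’s two nearest neighbours are the first and third rows, so the gap is filled with the average of their second-column values rather than being pulled towards the distant fourth row.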


Wisdom of the Crowd: Voting Classifier, Bagging-Pasting, Random Forest and Extra Trees

Using multiple algorithms (ensemble learning), with Python implementation

Photo by Cliff Johnson on Unsplash

People consult other people’s opinions before making decisions on most issues. In a collective setting, the decision usually follows what the majority says. This holds at the individual level, and some companies also run surveys on a global scale. Decisions made by a collective crowd, rather than by a single expert, are called the “wisdom of the crowd”, and Aristotle first made this argument in his work Politics.
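The majority rule described above is what a hard-voting ensemble does with its models’ predictions. A minimal sketch, assuming each model has already produced one label for the same sample (the function name `majority_vote` and the labels are made up for illustration):

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted most often; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# One prediction per model for a single sample (hypothetical labels).
votes = ["positive", "negative", "positive"]
print(majority_vote(votes))
```

Two of the three models say "positive", so that is the ensemble’s answer even though one model disagrees.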


A More Accessible and Replicable Method for Satellite-Based Mapping of Hand-Harvested Crops in…