Splitting in Decision Tree
This article walks through the various methods used to split a Decision Tree.
Machine Learning is one of the most in-demand technologies: everyone wants to master it, and most businesses need highly trained Machine Learning engineers. A wide range of machine-learning algorithms has been developed to tackle complicated problems quickly. These algorithms are highly automated and self-modifying, improving over time as more data is added and with minimal human intervention.
Contents
- What is a Decision Tree?
- Decision Tree Terminologies
- How do you split nodes in a Decision tree?
- Methods to split Decision Tree
- Gini Impurity vs Information Gain
- Conclusion
What is a Decision Tree?
Decision trees are one of the predictive modelling approaches used in machine learning. As a predictive model, a decision tree travels from observations about an item (represented by the branches) to conclusions about the item’s target value (represented by the leaves).
A decision tree’s main idea is to locate the features that contain the most information about the target feature and then split the dataset along their values. The most informative feature is the one that best reduces the uncertainty about the target feature. The search for the most informative feature continues recursively until we are left with pure leaf nodes.
Decision Tree Terminologies
Root Node: Represents the entire sample. It is further divided into two or more homogeneous sets.
Decision Node: A node, branched from the root node (or from another decision node), that is split further based on a condition.
Branch: A section of the tree formed by splitting.
To summarize, every input is routed through the root node of the tree. The root node is further segmented into decision nodes, each of which applies a condition to the observations and branches on the outcome.
Dividing a single node into two or more sub-nodes is known as splitting. A leaf node, also known as a terminal node, is a node that does not split any further. A branch, sometimes called a sub-tree, is a section of a decision tree. The opposite of splitting is pruning, which removes sub-nodes from a decision node.
Decision trees classify instances by sorting them from the root down to some leaf/terminal node, with the leaf/terminal node providing the classification. Each node in the tree tests some attribute of the instance, and each edge descending from the node corresponds to one of the possible outcomes of that test. This procedure is repeated recursively for the subtree rooted at each new node.
How do you split nodes in a Decision tree?
Decision trees depend entirely on the target variable: the algorithms used for classification trees differ from those used for regression trees, and within each there are a variety of methods for selecting how to partition the data.
The essence of decision trees is that they divide the dataset into sections, resulting in an inverted tree with the root node at the top. An input is passed down through the nodes of this layered model until it reaches a leaf, which gives the final outcome.
Each node tests an attribute (feature) that drives further splitting in the downward direction.
Multiple features take part in the decision-making process, so the relevance and impact of each feature must be assessed: the most relevant feature is assigned to the root node, and node splitting proceeds downward from there.
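To make this routing concrete, here is a minimal, illustrative Python sketch (not taken from the article or any particular library): the Node structure, the predict helper, and the tiny two-level tree splitting on hypothetical "age" and "income" features are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    feature: Optional[str] = None      # feature tested at this node
    threshold: Optional[float] = None  # split point for that feature
    left: Optional["Node"] = None      # branch taken when value <= threshold
    right: Optional["Node"] = None     # branch taken when value > threshold
    prediction: Optional[str] = None   # set only on leaf (terminal) nodes


def predict(node: Node, sample: dict) -> str:
    """Route a sample from the root down to a leaf and return the leaf's class."""
    if node.prediction is not None:  # reached a leaf/terminal node
        return node.prediction
    if sample[node.feature] <= node.threshold:
        return predict(node.left, sample)
    return predict(node.right, sample)


# Hypothetical two-level tree: the root splits on "age", then on "income".
tree = Node(feature="age", threshold=30,
            left=Node(prediction="no"),
            right=Node(feature="income", threshold=50_000,
                       left=Node(prediction="no"),
                       right=Node(prediction="yes")))

print(predict(tree, {"age": 45, "income": 80_000}))  # -> "yes"
```

Each internal node tests one feature, and the branch taken depends on the outcome of that test, until a leaf node supplies the prediction.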
Methods to split Decision Tree
Several key splitting criteria address the concerns described above. Within the scope of this post, we discuss Entropy, Gini Impurity, and Information Gain.
1. Entropy
Entropy is a measure of the degree of uncertainty, impurity, or disorder of a random variable. In essence, it assesses the impurity or unpredictability of a set of data points.
If all of the elements belong to the same class, the distribution is called “pure”; if they do not, it is “impure”.
To put it another way, a high degree of disorder indicates a high level of impurity. For a two-class problem, entropy ranges from 0 to 1; it can exceed 1 when more classes are present in the data, but the interpretation is the same: higher entropy means more disorder.
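For reference, the entropy of a node with class probabilities p1, …, pk is the sum of −p·log2(p) over those probabilities. Below is a minimal Python sketch of that calculation; the entropy helper and the toy label lists are illustrative assumptions, not code from the article.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a node: sum of -p * log2(p) over its class probabilities."""
    total = len(labels)
    probs = [count / total for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probs)


print(entropy(["yes"] * 10))              # 0.0 -> a pure node has no disorder
print(entropy(["yes"] * 5 + ["no"] * 5))  # 1.0 -> maximum disorder for two classes
print(entropy(["a", "b", "c", "d"]))      # 2.0 -> can exceed 1 with more classes
```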
2. Gini Impurity
If all elements are correctly separated into distinct classes, the division is called pure (an ideal scenario). The Gini impurity (pronounced “genie”) is used to predict the likelihood that a randomly chosen example would be incorrectly classified by a particular node. It is referred to as an “impurity” measure because it shows how far the node departs from such a pure division.
Gini impurity is measured on a scale of 0 to 1, with 0 indicating that all elements belong to a single class (a perfectly pure node) and values approaching 1 indicating that elements are scattered randomly across many classes. For a two-class problem, the maximum is 0.5, which occurs when the elements are distributed uniformly across the two classes.
Now that we have seen what Gini impurity is, let us see how to calculate it:
- Calculate the Gini score for each sub-node from the probabilities of success (p) and failure (q) as p² + q²; the Gini impurity of the sub-node is then 1 − (p² + q²)
- Next, calculate the Gini impurity of the split as the weighted average of the sub-node scores, weighting each sub-node by its share of the samples (a short sketch of this calculation follows the list)
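Here is a minimal Python sketch of these two steps, assuming a binary (yes/no) target; the gini and weighted_gini helpers and the toy label lists are illustrative assumptions rather than code from the article.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of one node: 1 minus the sum of squared class probabilities."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Gini impurity of a split: sub-node impurities weighted by sub-node size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Two sub-nodes produced by a hypothetical candidate split.
left = ["yes", "yes", "yes", "no"]
right = ["no", "no", "no", "no", "yes"]

print(gini(left))                  # 0.375
print(gini(right))                 # 0.32
print(weighted_gini(left, right))  # ~0.344 -- the lower, the better the split
```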
3. Information Gain
When it comes to measuring information gain, the concept of entropy is key. Information gain is grounded in information theory: it identifies the features/attributes that convey the most information about a class. Following the entropy principle, the goal is to reduce entropy from the root node down to the leaf nodes. Information gain is the difference in entropy before and after a split, and this difference reflects how much the impurity of the class distribution has been reduced.
Information Gain = Entropy(parent) − weighted average Entropy(children)
The entropy generally changes when we use a node in a decision tree to partition the training instances into smaller subsets, and information gain is a measure of this change in entropy.
The larger the drop in entropy, the more information is gained by the split.
Now that we have seen what information gain is, let us see how to calculate it:
- For each candidate split, calculate the entropy of each child node independently
- Calculate the entropy of the split as the weighted average entropy of the child nodes
- Choose the split with the lowest entropy, i.e. the greatest information gain
- Repeat these steps until the split nodes are homogeneous (a sketch of the calculation follows this list)
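Putting these steps together, here is a minimal Python sketch of the information-gain calculation for one candidate split; the entropy and information_gain helpers and the toy parent/children lists are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a node: sum of -p * log2(p) over its class probabilities."""
    total = len(labels)
    probs = [count / total for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probs)


def information_gain(parent, children):
    """Entropy of the parent minus the weighted average entropy of its children."""
    n = len(parent)
    weighted_child_entropy = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted_child_entropy


parent = ["yes"] * 5 + ["no"] * 5              # parent node, entropy = 1.0
children = [["yes"] * 5 + ["no"], ["no"] * 4]  # one candidate split of the parent

print(information_gain(parent, children))      # ~0.61 -- the higher, the better the split
```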
Now, let us compare Information Gain and Gini Impurity
Information Gain vs Gini Impurity
We’ll go over some comparison points gleaned from the preceding discussion to assist in deciding which strategy to adopt.
- Information gain is based on entropy: each class’s probability is multiplied by the log base 2 of that probability, and the negated sum is taken. Gini impurity is determined by subtracting the sum of each class’s squared probability from one.
- Gini impurity tends to prefer larger partitions (distributions) and is easy to compute, whereas information gain tends to prefer smaller partitions with many distinct values, which may require experimenting with the data and the splitting criterion.
- The CART algorithm employs the Gini index, whereas the ID3 and C4.5 algorithms employ information gain (a short scikit-learn sketch showing both criteria follows this list).
- Information gain computes the difference between entropy before and after the split, whereas the Gini index directly measures the impurity of the class distribution at a node.
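As a practical aside, scikit-learn’s DecisionTreeClassifier exposes this choice through its criterion parameter ("gini" or "entropy"); note that scikit-learn always grows CART-style binary trees, so the parameter only changes the impurity measure, not the tree-building algorithm. The sketch below assumes scikit-learn is installed and uses its bundled iris dataset purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Same estimator, two different splitting criteria.
gini_tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
entropy_tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

print(gini_tree.score(X, y), entropy_tree.score(X, y))
```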
Conclusion
I hope this helps! In real-world analysis, both the Gini index and information gain are used to decide how to split nodes. Both are measures of “data impurity” in the class distribution at a node, and they let us determine how large or small a role each feature plays in the decision-making.