What is the first step in machine learning?
What is the first step in machine learning?, we’ll begin with a general introduction of machine learning models and how they’re employed.
If you’ve done statistical modeling or machine learning before, this may seem simple.
What is Machine Learning?
Machine Learning is the application of Artificial Intelligence to allow computers to learn a task via experience rather than being programmed expressly for that activity.
(In other words, machines learn without human intervention!!!)
This procedure begins with providing them with high-quality data, which is then used to train the machines by creating multiple machine learning models based on the data and various methods.
The algorithms we use are determined by the type of data we have and the task we are attempting to automate.
As you work through the following scenario in this article, you will be aware of how the model works:
Let’s take an example, speculating in real estate has made your cousin millions of dollars.
Because of your passion for data science, he has volunteered to become your business partner.
He’ll provide the funds, and you’ll provide the models that anticipate the value of various homes.
You inquire as to how your cousin has anticipated real estate values in the past, and he responds that it is simply intuition.
However, further interrogation reveals that he has detected price patterns in previous houses that he has viewed, and he utilizes those patterns to generate forecasts for new residences he is contemplating.
The same is true for machine learning. We’ll begin with a model known as the Decision Tree.
There are more sophisticated models that provide more accurate forecasts.
However, decision trees are simple to understand and serve as the foundation for some of the best data science models.
To keep things simple, we’ll start with the smallest decision tree conceivable.
First Decision Trees
It categorizes houses into only two types. The estimated price for each given house is the historical average price of properties in the same category.
We utilize data to select how to divide the houses into two groups, and then calculate the expected price for each group.
Fitting or training the model refers to the process of extracting patterns from data. The training data is the data that was utilized to fit the model.
The intricacies of how the model is fit (for example, how to break up the data) are complicated enough that we will save them for later.
After fitting the model, you may apply it to new data to anticipate the prices of further dwellings.
Improving the Decision Tree
Which of the two decision trees is more likely to emerge from the real estate training data?
The decision tree on the left (Decision Tree 1) is perhaps more logical, as it reflects the fact that houses with more bedrooms sell for greater money than those with fewer bedrooms.
The model’s major flaw is that it ignores most of the elements that influence property prices, such as the number of bathrooms, lot size, and location.
With a tree with more “splits,” you can capture more variables. These trees are referred to as “deeper” trees.
The following is an example of a decision tree that takes into account the overall size of each house’s lot: 2nd Tree Depth
You may anticipate the price of any house by tracing through the decision tree and choosing the path that corresponds to the characteristics of that house.
The house’s anticipated price is near the bottom of the tree. A leaf is a place at the bottom when we make a prediction.