Description

A step-by-step guide that walks you through the fundamentals of Python programming and then uses Python libraries to build a random forest from scratch. A comprehensive course designed both for beginners with some programming experience and for those who know nothing about ML and random forest!

Machine learning is the study of methods that 'learn' from data to improve performance on a set of tasks. Machine learning algorithms are used in a plethora of applications, including medicine, email filtering, and speech recognition, where it is challenging to develop conventional algorithms to perform the task.

The course begins with an introduction to machine learning concepts and explains the motivation for machine learning. It teaches the major Python concepts, including variables, objects, strings, loops, decision-making statements, and classes, with a small project to recap. You will learn to use the power of Python to train your machine, make predictions, and implement the ML algorithm Random Forest. You will use NumPy for array handling, Pandas data frames for Excel files, Matplotlib for data visualization, and scikit-learn (sklearn) for a ready-made Random Forest model.

Upon completion, you will be able to implement the structure of a forest, impurity, information gain, partitions, leaf nodes, and decision nodes in Python, and create a complete structure for Random Forest by building one tree that lets you create an entire forest. You will also write an accuracy calculator function and implement Random Forest on any dataset.

All resources are available at: https://github.com/PacktPublishing/Machine-Learning-Random-Forest-with-Python-from-Scratch-

What You Will Learn

Use Random Forest with sklearn and Matplotlib for Python plotting
Use scikit-learn for Random Forest on the Titanic dataset
Learn forest structure, impurity, partition, leaf/decision nodes
Create a complete Random Forest structure from scratch using Python
Build one tree that adds up to create a complete forest
Write accuracy calculator functions and implement them on any dataset

Audience

This course is for you if you want to learn how to program in Python for machine learning or want to make a predictive analysis model.

It is also for absolute beginners who have little or no knowledge of machine learning and want to learn random forest from zero to hero.

Approach

This course delivers a step-by-step tutorial with carefully structured video lectures. Each lecture builds on what has already been explained and moves one step forward. A small task is assigned at the beginning of each lecture, keeping you actively engaged throughout.

Key Features

Use the power of Python to train your machine to learn like a human and make predictions!
Learn data preprocessing steps to prepare data for machine learning algorithms
Master machine learning concepts and implement the essential ML algorithm, Random Forest

GitHub Repo

https://github.com/PacktPublishing/Machine-Learning-Random-Forest-with-Python-from-Scratch-

About the Author

AI Sciences

AI Sciences is a team of experts, PhDs, and artificial intelligence practitioners with backgrounds in computer science, machine learning, and statistics. Some work in big companies such as Amazon, Google, Facebook, Microsoft, KPMG, BCG, and IBM. AI Sciences produces a series of courses dedicated to beginners and newcomers on techniques and methods of machine learning, statistics, artificial intelligence, and data science. They aim to help those who wish to understand techniques more easily and start with less theory and less extended reading. Today, they publish more comprehensive courses on specific topics for wider audiences. Their courses have successfully helped more than 100,000 students master AI and data science.

Course Outline

1. Introduction to the Course

This section briefly outlines the Random Forest machine learning algorithm. The section also details what you will learn in this course and the benefits of enrolling in this course.

1. Introduction and Instructor

In this video, you will learn about the course content in general, the features and benefits of this course, as well as a brief introduction to the instructor.

2. Motivation for the Course

This lecture provides an overview of the benefits of enrolling in this course.

3. Past, Present, and Future of Machine Learning

This lecture illustrates why machine learning has evolved so quickly in recent years and what the future holds for machine learning.

4. Course Overview

This video will introduce you to Python, machine learning, and Random Forest and discuss the live implementations, quizzes, and projects.

2. Introduction to Python

This section focuses on the Python programming language. This section will delve deep into datatypes, numbers, strings, lists, sets, dictionaries, and operators used in Python. You will also learn about loops and functions and work on a calculator project.

1. Hello World

In this video, we will understand the importance of Python for machine learning, use an IDE (Jupyter), and create a Hello World program.

2. Introduction to Data Types

This video introduces us to Python's six data types: numbers, strings, lists, dictionaries, tuples, and sets.

3. Numbers

This lesson will teach us about numbers, the first standard datatype in Python, used for arithmetic operations and storing information.

4. Strings

Here, we will look at the second datatype, strings, which are characters, used to represent text instead of numbers.

5. Tuples

In this topic on tuples, you will learn how to get a starting index of a substring and write a simple program to find the starting index of the word WORLD.
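
The task described above can be sketched with the built-in `str.find` method. The sample string is an assumption for illustration; the lecture's exact text is not given.

```python
# Sample text chosen for illustration only.
text = "HELLO WORLD"

# str.find returns the index where the substring starts, or -1 if it is absent.
start = text.find("WORLD")
missing = text.find("PYTHON")

print(start)
print(missing)
```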

6. Lists

Next, we will look at lists, a datatype that is mutable (changeable) after creation.

7. Sets

Our next datatype is a set, an unordered, mutable, and iterable collection that holds no duplicate elements.

8. Dictionaries

Let's look at the final datatype, dictionaries, a changeable collection of key-value pairs indexed by key.

9. Comparison Operators

Here, you will learn about two of Python's operator types: comparison operators and logical operators.

10. Logical Operators, User Input, Game

This video will teach us about Python's three logical operators: and, or, and not.

11. Decision Making (if, else, elif)

Let's understand the fundamental element of any programming language: decision-making. We will look at the if, else, and elif statements.

12. Decision Making (nested if)

Continuing with the decision-making lesson, we will understand what a nested if statement is.

13. Better Coding Practice, Completing the Game

In this lecture, we will complete the game we left in the previous lesson.

14. For Loop

Let's understand the for loop and how it is used to iterate over a sequence such as a list, tuple, string, dictionary, or set.

15. While Loop

Let's understand the while loop, a conditional loop that keeps running as long as a given condition holds.

16. Simple Functions

Here, we will understand what a function is, how to pass data or parameters into a function, and understand the different types of functions.

17. Boolean and Value Returning Function

In this video, we will discuss a function that returns values as true or false, also called the Boolean function.

18. Calculator Project

This video demonstrates how to create our calculator using addition and subtraction functions.
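
A minimal sketch of such a calculator, assuming two helper functions and a small dispatcher. The names `add`, `subtract`, and `calculate` are illustrative, not taken from the course.

```python
def add(a, b):
    """Return the sum of two numbers."""
    return a + b


def subtract(a, b):
    """Return the difference of two numbers."""
    return a - b


def calculate(operation, a, b):
    """Look up the function for the given symbol and apply it."""
    operations = {"+": add, "-": subtract}
    if operation not in operations:
        raise ValueError(f"Unsupported operation: {operation}")
    return operations[operation](a, b)


print(calculate("+", 7, 3))
print(calculate("-", 7, 3))
```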

3. Introduction to Machine Learning

This section delves deep into the core concepts of machine learning. You will learn how to train a machine, create labels and features, learn about the various data formats, and understand classification, regression, and clustering.

1. Let's Introduce Machine Learning

This brief video outlines machine learning, its importance, and how it can be used in various applications to make life easier.

2. Kids versus Computer Learning

In this lesson, we will compare how a kid and a computer learn to understand the fundamental difference between the two.

3. Dataset

In this lesson, we will dive into the hardcore process of machine learning and the fundamental elements used in machine learning, like datasets, training and testing data, outliers, models, and so on.

4. Labels and Features

In this lesson, we will look at a label (or target), which is the output data, and a feature, which is a measurable property or characteristic of an element.

5. Outliers

Let's look at outliers, the data points of a dataset that differ markedly from the others; they are usually spotted by visualizing the dataset and then excluded.

6. Model and Training

Here, we will look at a machine learning model, a mathematical expression or equation that is fitted to data and learns to make predictions.

7. Overfitting and Underfitting

In this video, you will learn about overfitting, a modeling error when a model performs well in training but not in testing, and underfitting, where the model neither performs well during training nor during testing.

8. Accuracy and Error

This lecture explains accuracy and error: a prediction counts toward accuracy when the predicted outcome matches the expected result, and toward error when it does not.

9. Formats of Data

You will learn about the different data formats in machine learning, including structured (labeled or unlabeled) and unstructured data, and how to choose the best format.

10. Types of Learning

Here, we will understand the learning types, including supervised and unsupervised machine learning algorithms.

11. Classification versus Regression

Let's learn about the three modes of machine learning: classification, regression, and clustering.

12. Clustering

In this lesson, you will learn about an unsupervised branch of learning called clustering, which involves grouping elements with no labels to classify them.

13. Recap, Flow of Machine Learning Project

This video is a quick recap of what we have learned thus far, and we will also be working on a project called the Flow of Machine Learning.

4. Random Forest Step-by-Step

This section focuses on the Random Forest concepts that deal with reading and manipulating datasets; you will learn about the pros and cons of Random Forest and use pandas for the Random Forest.

1. Introduction and Motivation

This video outlines the concepts of Random Forest, when we can use it, and its benefits and features.

2. How Decision Trees and Random Forest Work

We will understand what a decision tree is and create a decision tree and get a prediction result from the decision tree.

3. Pros and Cons of Random Forest

In this video, we will look at the benefits and limitations of Random Forest and the complexities involved in decision-making using Random Forest.

4. Introduction to the Final Project

In this video, we will discuss our final project, which classifies the Titanic dataset using Random Forest.

5. Using NumPy for Random Forest

In this lesson, you will learn about NumPy, a Python library for array handling, and look at the advantages of using it.
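
As a rough illustration of the array handling mentioned above, the sketch below uses a few invented age values. It assumes NumPy is installed.

```python
import numpy as np

# Invented sample values, for illustration only.
ages = np.array([22, 38, 26, 35])

# Arithmetic applies to every element at once, with no explicit loop.
ages_in_months = ages * 12

# Aggregations are built in.
mean_age = ages.mean()

# A Boolean mask selects the elements that satisfy a condition.
over_30 = ages[ages > 30]

print(ages_in_months, mean_age, over_30)
```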

6. Using Pandas for Random Forest (1)

This video introduces us to Pandas data structures and analysis tools, which help make data easy to handle and intuitive.

7. Using Pandas for Random Forest (2)

This is a continuation of the previous lesson, and here we will look at conditionally selecting values from a dataset.
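
Conditional selection in Pandas might look like the sketch below. The small table and its column names are invented for illustration and are not the course dataset; it assumes Pandas is installed.

```python
import pandas as pd

# Invented rows, for illustration only.
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cara", "Dan"],
    "Age": [22, 38, 26, 35],
    "Survived": [1, 0, 1, 0],
})

# A Boolean mask keeps only the rows where the condition is True.
older_than_30 = df[df["Age"] > 30]

# Conditions combine with & (and) or | (or); each needs its own parentheses.
young_survivors = df.loc[(df["Age"] < 30) & (df["Survived"] == 1), "Name"]

print(older_than_30)
print(young_survivors.tolist())
```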

8. Reading and Manipulating Dataset

Previously, you learned how to read a dataset; now, we will look at manipulating the data and using a sample dataset in our code.

9. Using Matplotlib for Data Visualization (1)

This video demonstrates data visualization using Matplotlib functions and explains data cleaning by removing outliers and filling in missing values.

10. Using Matplotlib for Data Visualization (2)

In this continuation lecture, we will focus on completing the assignment that we started on in the last lesson.

11. Dealing with Missing Values

Let's look at the first step involved in the data cleaning process, which is filling or removing missing values from a dataset.
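
Both options, filling and removing, can be sketched in plain Python. Filling with the mean is one common choice assumed here; the course may use a different strategy, and the values are invented.

```python
from statistics import mean

# Invented column with gaps marked as None.
ages = [22.0, None, 26.0, 35.0, None]


def fill_missing_with_mean(values):
    """Replace each None with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    fill_value = mean(present)
    return [fill_value if v is None else v for v in values]


def drop_missing(values):
    """Remove the missing entries entirely."""
    return [v for v in values if v is not None]


filled = fill_missing_with_mean(ages)
dropped = drop_missing(ages)
print(filled)
print(dropped)
```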

12. Outliers Removal

In the second part of the data cleaning process, we will look at an outlier in detail and learn how to correct or remove the outlier.
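
One widely used way to remove outliers is the interquartile range (IQR) rule, sketched below. This rule is an assumption made for illustration; the lecture may rely on visual inspection or another method, and the fare values are invented.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep only the values within k * IQR of the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Invented fares with one extreme value.
fares = [7.25, 8.05, 7.9, 8.5, 9.0, 7.75, 512.33]
cleaned = remove_outliers_iqr(fares)
print(cleaned)
```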

13. Categorical to Numeric Conversion

A machine understands values only in the form of numbers. You will learn how to convert non-numeric data to numeric data without changing what the feature represents.
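
A simple conversion assigns each distinct category an integer code. The helper below is a hypothetical sketch of that idea, not the course's code.

```python
def encode_categories(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    encoded = []
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
        encoded.append(mapping[value])
    return encoded, mapping


# Invented column, for illustration only.
sex = ["male", "female", "female", "male"]
codes, mapping = encode_categories(sex)
print(codes)
print(mapping)
```

The mapping is returned alongside the codes so that the same encoding can be reused on new rows.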

14. Quick Implementation of Random Forest Model

Let's quickly implement Random Forest using the sklearn Random Forest model and tune the model's performance to suit the project.
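
A quick scikit-learn run could look like the sketch below. It assumes scikit-learn is installed and uses synthetic data from `make_classification` rather than the course dataset; the parameter values are arbitrary starting points, not tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 rows, 5 features, 2 classes.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# n_estimators is the number of trees; max_depth limits how deep each one grows.
model = RandomForestClassifier(n_estimators=50, max_depth=4, random_state=0)
model.fit(X_train, y_train)

# score() reports mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)

# One importance value per feature; the values sum to 1.
importances = model.feature_importances_

print(accuracy)
print(importances)
```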

15. Feature Importance

After understanding features in a previous lesson, we will look at finding which features are the most critical to the model.

16. Recursion

In this video, you will learn about a complete implementation of the Random Forest using only Pandas to read the data and look at the results of the sklearn function.

17. Structure

In this video, we will discuss the structures of a Random Forest, namely forest, tree, leaf node, and decision node.
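
To make the forest-and-tree relationship concrete, the sketch below treats a forest as a list of trees and predicts by majority vote. The three one-rule "trees" are hypothetical stand-ins written by hand, not trees learned from data.

```python
from collections import Counter


def forest_predict(trees, row):
    """Ask every tree for a prediction and return the most common answer."""
    votes = [tree(row) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-in trees: each is a single hand-written rule.
trees = [
    lambda row: "survived" if row["sex"] == "female" else "died",
    lambda row: "survived" if row["age"] < 10 else "died",
    lambda row: "survived" if row["pclass"] == 1 else "died",
]

passenger = {"sex": "female", "age": 30, "pclass": 1}
print(forest_predict(trees, passenger))
```

Using an odd number of trees avoids ties when there are two classes.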

18. Importing Data, Helper Functions

Before creating a decision tree, we will first learn to import our dataset using Pandas.

19. Question and Partition

In this video, you will learn to create two more helper functions, question and partition, which define statements for querying and retrieving data.
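
One possible shape for these two helpers is sketched below. It assumes each row is a list whose last item is the label, that numeric columns are tested with `>=`, and that other columns are tested with `==`. The class and function names are illustrative.

```python
class Question:
    """A test on one column: numeric cells use >=, other cells use ==."""

    def __init__(self, column, value):
        self.column = column
        self.value = value

    def match(self, row):
        cell = row[self.column]
        if isinstance(cell, (int, float)):
            return cell >= self.value
        return cell == self.value


def partition(rows, question):
    """Split rows into those that satisfy the question and those that do not."""
    true_rows, false_rows = [], []
    for row in rows:
        if question.match(row):
            true_rows.append(row)
        else:
            false_rows.append(row)
    return true_rows, false_rows


# Invented rows: [age, sex, label].
rows = [[22, "male", 0], [38, "female", 1], [26, "female", 1], [35, "male", 0]]
true_rows, false_rows = partition(rows, Question(0, 30))
print(true_rows)
print(false_rows)
```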

20. Impurity

Here, we will look at impurity, a measure of how mixed the labels in a set of rows are, and why it should be as low as possible after a good split.
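
The course description does not say which impurity measure is used. Gini impurity is a common choice for decision trees and is assumed in the sketch below, which again treats the last item of each row as the label.

```python
from collections import Counter


def gini(rows):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(rows)
    if total == 0:
        return 0.0
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Invented rows; only the label column matters here.
pure = [["a", 1], ["b", 1]]
mixed = [["a", 1], ["b", 0]]
print(gini(pure))
print(gini(mixed))
```

A set with a single class has impurity 0, and an even two-class mix has impurity 0.5.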

21. Information Gain

In this video, we will define columns for questioning and determine how much information can be gained by splitting a column.
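
Information gain can be sketched as the parent's impurity minus the size-weighted impurity of the two partitions. Gini impurity is assumed as the measure, and the function names are illustrative.

```python
from collections import Counter


def gini(rows):
    """Gini impurity of rows whose last item is the label."""
    total = len(rows)
    if total == 0:
        return 0.0
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def info_gain(true_rows, false_rows, parent_impurity):
    """Parent impurity minus the weighted impurity of the two children."""
    total = len(true_rows) + len(false_rows)
    weight = len(true_rows) / total
    return (
        parent_impurity
        - weight * gini(true_rows)
        - (1 - weight) * gini(false_rows)
    )


# Invented rows: a parent with two rows of each class.
parent = [[1], [1], [0], [0]]
perfect_gain = info_gain([[1], [1]], [[0], [0]], gini(parent))
useless_gain = info_gain([[1], [0]], [[1], [0]], gini(parent))
print(perfect_gain)
print(useless_gain)
```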

22. Best Split

Here, we will determine the best split at any decision node, the one where information gain is maximum, and split the data into two branches, true and false.
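
A brute-force search for the best split tries every value in every feature column and keeps the question with the highest gain. The sketch below assumes Gini impurity, rows whose last item is the label, and questions stored as hypothetical `(column, value)` pairs. Candidate values are sorted only to make the result repeatable, which assumes the values within a column are comparable.

```python
from collections import Counter


def gini(rows):
    """Gini impurity of rows whose last item is the label."""
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())


def matches(row, column, value):
    """Numeric cells are tested with >=, other cells with ==."""
    cell = row[column]
    if isinstance(cell, (int, float)):
        return cell >= value
    return cell == value


def partition(rows, column, value):
    """Split rows into those that match the question and those that do not."""
    true_rows = [r for r in rows if matches(r, column, value)]
    false_rows = [r for r in rows if not matches(r, column, value)]
    return true_rows, false_rows


def find_best_split(rows):
    """Return (gain, (column, value)) for the highest-gain question found."""
    best_gain, best_question = 0.0, None
    parent_impurity = gini(rows)
    for column in range(len(rows[0]) - 1):  # the last column is the label
        for value in sorted({row[column] for row in rows}):
            true_rows, false_rows = partition(rows, column, value)
            if not true_rows or not false_rows:
                continue  # this question does not divide the rows
            weight = len(true_rows) / len(rows)
            gain = (
                parent_impurity
                - weight * gini(true_rows)
                - (1 - weight) * gini(false_rows)
            )
            if gain > best_gain:
                best_gain, best_question = gain, (column, value)
    return best_gain, best_question


# Invented rows: [age, sex, label].
rows = [[22, "male", 0], [38, "female", 1], [26, "female", 1], [35, "male", 0]]
gain, question = find_best_split(rows)
print(gain, question)
```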

23. Leaf and Decision Node

In this lesson, you will learn to create two classes, a leaf node and a decision node, with a constructor.
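
The two classes might be laid out as below. This is a sketch under the assumption that a leaf stores label counts and a decision node stores a question plus its two branches; the attribute names are illustrative.

```python
from collections import Counter


class Leaf:
    """Terminal node: counts the labels of the training rows that reached it."""

    def __init__(self, rows):
        self.predictions = Counter(row[-1] for row in rows)


class DecisionNode:
    """Internal node: holds a question and the branch for each answer."""

    def __init__(self, question, true_branch, false_branch):
        self.question = question
        self.true_branch = true_branch
        self.false_branch = false_branch


# Invented rows: [age, sex, label]; the question is a hypothetical (column, value) pair.
female_leaf = Leaf([[38, "female", 1], [26, "female", 1]])
male_leaf = Leaf([[22, "male", 0]])
root = DecisionNode((1, "female"), female_leaf, male_leaf)
print(root.true_branch.predictions)
```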

24. How to Build a Tree

After creating the decision node and leaf node classes, we will build our tree out of these nodes.
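
Tree building is naturally recursive: find the best question, partition the rows, and repeat on each side until no question improves purity. The sketch below is self-contained and rests on the same assumptions as before (Gini impurity, label in the last column, questions as `(column, value)` pairs); it is not the course's implementation.

```python
from collections import Counter


def gini(rows):
    """Gini impurity of rows whose last item is the label."""
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())


def matches(row, column, value):
    """Numeric cells are tested with >=, other cells with ==."""
    cell = row[column]
    if isinstance(cell, (int, float)):
        return cell >= value
    return cell == value


def partition(rows, column, value):
    true_rows = [r for r in rows if matches(r, column, value)]
    false_rows = [r for r in rows if not matches(r, column, value)]
    return true_rows, false_rows


def find_best_split(rows):
    """Try every (column, value) question and keep the highest-gain one."""
    best_gain, best_question = 0.0, None
    parent_impurity = gini(rows)
    for column in range(len(rows[0]) - 1):
        for value in sorted({row[column] for row in rows}):
            true_rows, false_rows = partition(rows, column, value)
            if not true_rows or not false_rows:
                continue
            weight = len(true_rows) / len(rows)
            gain = (parent_impurity - weight * gini(true_rows)
                    - (1 - weight) * gini(false_rows))
            if gain > best_gain:
                best_gain, best_question = gain, (column, value)
    return best_gain, best_question


class Leaf:
    def __init__(self, rows):
        self.predictions = Counter(row[-1] for row in rows)


class DecisionNode:
    def __init__(self, question, true_branch, false_branch):
        self.question = question
        self.true_branch = true_branch
        self.false_branch = false_branch


def build_tree(rows):
    """Recursively split until no question yields any information gain."""
    _, question = find_best_split(rows)
    if question is None:
        return Leaf(rows)
    true_rows, false_rows = partition(rows, *question)
    return DecisionNode(question, build_tree(true_rows), build_tree(false_rows))


# Invented rows: [age, sex, label].
rows = [[22, "male", 0], [38, "female", 1], [26, "female", 1], [35, "male", 0]]
tree = build_tree(rows)
print(tree.question)
```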

25. How to Classify

Let's learn to write a classification method that uses our trained model to give us predictions.
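
Classification walks a row down the tree, following the true or false branch at each decision node until it reaches a leaf. The tree in this sketch is built by hand purely for illustration, and its rules are hypothetical rather than learned from data.

```python
from collections import Counter


class Leaf:
    def __init__(self, rows):
        self.predictions = Counter(row[-1] for row in rows)


class DecisionNode:
    def __init__(self, question, true_branch, false_branch):
        self.question = question  # a hypothetical (column, value) pair
        self.true_branch = true_branch
        self.false_branch = false_branch


def matches(row, column, value):
    """Numeric cells are tested with >=, other cells with ==."""
    cell = row[column]
    if isinstance(cell, (int, float)):
        return cell >= value
    return cell == value


def classify(row, node):
    """Follow the branches until a leaf, then return its most common label."""
    if isinstance(node, Leaf):
        return node.predictions.most_common(1)[0][0]
    column, value = node.question
    if matches(row, column, value):
        return classify(row, node.true_branch)
    return classify(row, node.false_branch)


# Hand-built tree: first ask "sex == female?", otherwise ask "age >= 10?".
age_node = DecisionNode((0, 10), Leaf([[40, "male", 0]]), Leaf([[5, "male", 1]]))
root = DecisionNode((1, "female"), Leaf([[30, "female", 1]]), age_node)

print(classify([28, "female"], root))
print(classify([45, "male"], root))
print(classify([4, "male"], root))
```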

26. Accuracy and Error

In this video, you will learn to implement the accuracy method to help determine our model's performance.
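
An accuracy function can be as small as the sketch below: the fraction of predictions that equal the actual labels. The function name and sample labels are illustrative.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Invented labels, for illustration only.
actual = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 0]

acc = accuracy(actual, predicted)
error = 1 - acc
print(acc, error)
```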

5. Conclusion

This section focuses on all the concepts learned so far and summarizes them succinctly.

1. Concluding Remarks

In this video, we will look at the concluding remarks of the course and recap what we learned through the course, briefly.

Course Images

Machine Learning: Random Forest with Python from Scratch©

By Packt