Splitting Training And Test Data For Machine Learning Using Python And Scikit Learn Tutorial

Welcome to the video series Introduction to Machine Learning with Scikit Learn and Python. This is Chapter 7, and in this chapter we will talk about how to judge the performance of our machine learning algorithm.

This video is part of a scikit-learn tutorial series in which I talk about using scikit-learn for our machine learning implementations.

Selecting a machine learning algorithm involves a catch-22: the data you have is used for training, but to test the algorithm you need unseen (new) data, which is available only in production.

To get around this and understand the performance of the selected machine learning algorithm, we need to generate a test dataset from the available dataset.

We can do this by splitting the available dataset into a training dataset and a testing dataset. Scikit Learn provides a utility function called train_test_split that helps us achieve this.
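As a minimal sketch of what such a split looks like, the snippet below calls train_test_split on a small made-up array. The toy data, the 20% test size, and the random_state value are assumptions chosen for illustration; they are not taken from the video.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 10 samples with 2 features each, plus binary labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Hold out 20% of the rows for testing. The rows are shuffled before
# splitting; fixing random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("training rows:", X_train.shape[0])
print("testing rows:", X_test.shape[0])
```

Every row of the original data ends up in exactly one of the two sets, so the test rows are never seen during training.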

This video explains how to use the train_test_split function to generate training and testing datasets.
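To illustrate how the held-out data helps judge performance, here is a hedged sketch that trains a model on the training portion and scores it on both portions. The iris dataset, the LogisticRegression model, and the 25% test size are assumptions picked for the example and may differ from what the video uses.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

# Keep 25% of the samples aside as unseen test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit only on the training portion.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Accuracy on data the model has seen vs. data it has not seen.
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

The test accuracy is the more honest estimate of how the model may behave on new data; a training accuracy that is much higher than the test accuracy is a common sign of overfitting.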

#python #Machinelearning #scikitlearn #ArtificialIntelligence #softwaredevelopment #programming #pandas #datascience #dataanalytics

Hi, I am Deepak K Gupta (nickname Daksh, which I prefer). This channel is for budding as well as experienced software developers who want to explore the awesome world of programming.

Subscribe to my YouTube channel here: bit.ly/Sub_CodesBay

Here is a brief list of things you can find on my YouTube channel:

1. C++ programming (latest specifications, C++17 and C++20): create high-performance system applications.
2. Create microservices designed for multiple CPU cores using my Golang tutorial.
3. Create web applications as well as backend applications using my JavaScript and Node.js tutorials.
4. Create cross-platform mobile apps using my Flutter tutorial.
5. Learn Python programming, a language in demand, and effective ways of doing data science and machine learning. My Python tutorials include, but are not limited to, supervised and unsupervised learning, logistic regression, and gradient descent. You will also be able to create neural networks using my PyTorch tutorial.
6. Learn source control with my Git tutorial; Git is one of the most widely used distributed version control systems. Learn how to create a branch using git branch, merge changes using git merge, check out a branch using git checkout, and commit your changes using git commit.
7. Learn about persistent NoSQL databases like MongoDB using my MongoDB tutorial, as well as in-memory NoSQL databases like Redis using my Redis tutorial. You'll also learn about using Redis with Node.js.
8. Understand the concepts of handling large data using my big data tutorial and tools like Apache Spark.
9. Learn about graph theory and graph databases, and how to make use of graph databases like Neo4j.
