Hello everyone,

I am Yashwant Kumar, and in this post I will show which ML model works best for handwritten digit classification.

Handwriting Recognition:

Recognizing Handwritten Digits with scikit-learn:

The scikit-learn library provides several datasets that are useful for testing data analysis and prediction problems. One of them is an image dataset called Digits. It consists of 1,797 images, each 8x8 pixels in size, and each image is a handwritten digit in grayscale.

I will do all the work in a Jupyter Notebook, so let's start a notebook and load the Digits dataset with the following code:

from sklearn.datasets import load_digits

digits = load_digits()

The result is shown in Fig. 1.

Fig. 1


I also explored the Digits dataset with a few commands:

dir(digits)

This lists all the attributes available in the dataset. Then, using

digits.target_names

digits.data

I inspected the target names and the data in the dataset.

Notice that the data is an ndarray, because each row holds the pixel values of one digit image. There are 1,797 samples in total.
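The exploration steps above can be put together in one small cell. This is a minimal sketch; the shapes in the comments follow from the dataset size described earlier (1,797 images of 8x8 pixels).

```python
from sklearn.datasets import load_digits

digits = load_digits()

# Attributes exposed by the object that load_digits() returns
print(dir(digits))

# Each 8x8 image is flattened into a row of 64 pixel values
print(digits.data.shape)     # (1797, 64)
print(digits.images.shape)   # (1797, 8, 8)

# The ten class labels, digits 0 through 9
print(digits.target_names)
```

The data array is what the models are trained on, while the images array keeps the original 8x8 layout for plotting.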

Next we will build a DataFrame with pandas. scikit-learn models also accept NumPy arrays directly, but a DataFrame makes the data easier to inspect and manipulate. For this we need

import pandas as pd

and the rest is shown in Fig. 2.

Fig. 2

As the figure shows, I created a DataFrame from digits.data and then added the target values to it as a new column.
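Since the figure is not reproduced here, the following is a rough sketch of what this step could look like. The column name "target" is an assumption based on the description, not necessarily the exact code in the figure.

```python
import pandas as pd
from sklearn.datasets import load_digits

digits = load_digits()

# One row per image, one column per pixel (columns are numbered 0..63)
df = pd.DataFrame(digits.data)

# Add the true digit label as an extra column (column name assumed)
df["target"] = digits.target

print(df.shape)   # (1797, 65)
print(df.head())
```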

Next I will prepare X and y for the split, as shown in Fig. 3.

Fig. 3

Here I first used pandas and a lambda function to add a column with the digit name that each row belongs to. To build X, I then dropped both the target and digit-name columns, because X must contain only the pixel data.
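A sketch of this step, under the assumption that the columns are called "target" and "target_name" (the exact names in the figure may differ):

```python
import pandas as pd
from sklearn.datasets import load_digits

digits = load_digits()
df = pd.DataFrame(digits.data)
df["target"] = digits.target

# Map each numeric target to its name with a lambda function
# (column name "target_name" is an assumption)
df["target_name"] = df["target"].apply(lambda t: digits.target_names[t])

# X keeps only the 64 pixel columns
X = df.drop(["target", "target_name"], axis="columns")
print(X.shape)   # (1797, 64)
```

For this dataset the target names are simply the digits 0 to 9, so the new column mirrors the target column.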

Now I will create the y variable, which contains the target values, as shown in Fig. 4.

Fig. 4

Here I split the data into one part for training the model and another for testing it after training.

For this I imported train_test_split from the sklearn.model_selection module.

In train_test_split I assigned 20% of the data to the test set and 80% to the training set.
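The split described above could look roughly like this. The random_state value is an assumption added only to make the split reproducible, and the pixel array is used directly for brevity.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X = digits.data     # pixel values
y = digits.target   # true digit labels

# Hold out 20% of the samples for testing (random_state is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))   # 1437 360
```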

After this I will do hyperparameter tuning to find the most suitable model for our data. By most suitable I mean the model that gives the highest accuracy.

This is shown in Fig. 5.

Fig. 5

For hyperparameter tuning I took several models from the scikit-learn library, using the following imports:

from sklearn.svm import SVC

from sklearn.linear_model import LogisticRegression

from sklearn.pipeline import make_pipeline

from sklearn.model_selection import GridSearchCV

from sklearn.ensemble import RandomForestClassifier

I then made a dictionary of models, as shown in Fig. 5, with some parameter values that I chose arbitrarily; you can choose your own as well.

Then, using GridSearchCV, I searched for the best estimator of each model, appended the results to a list, and converted that list into a DataFrame,

as shown in Fig. 6.

Fig. 6
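Here is one possible sketch of such a search. The dictionary layout, the parameter values, the StandardScaler step, cv=5 and random_state are all illustrative assumptions, not the exact contents of the figures.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# Candidate models with a small, arbitrarily chosen parameter grid each
model_params = {
    "svm": {
        "model": SVC(gamma="auto"),
        "params": {"C": [1, 10], "kernel": ["linear", "rbf"]},
    },
    "random_forest": {
        "model": RandomForestClassifier(random_state=42),
        "params": {"n_estimators": [10, 50]},
    },
    "logistic_regression": {
        # Scaling first helps the solver converge; pipeline step names
        # are the lowercased class names, hence the parameter prefix
        "model": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "params": {"logisticregression__C": [1, 10]},
    },
}

scores = []
for name, mp in model_params.items():
    # 5-fold cross-validated search over this model's grid
    search = GridSearchCV(mp["model"], mp["params"], cv=5)
    search.fit(X_train, y_train)
    scores.append({
        "model": name,
        "best_score": search.best_score_,
        "best_params": search.best_params_,
    })

results = pd.DataFrame(scores, columns=["model", "best_score", "best_params"])
print(results)
```

The best_score column holds the mean cross-validated accuracy of the best parameter combination for each model, so the rows can be compared directly.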

According to this DataFrame, the highest accuracy was achieved by the SVM model with parameters kernel='linear' and C=1.0.

So I created an SVM model, fit it on X_train and y_train, and got 98.3% accuracy, which is a good result for this task.
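A minimal sketch of training and scoring this model follows. The random_state is an assumption, and the exact accuracy depends on how the data happens to be split, so the number you see may differ slightly from the one reported above.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# SVM with the parameters selected by the grid search
model = SVC(kernel="linear", C=1.0)
model.fit(X_train, y_train)

# Fraction of correctly classified test images
accuracy = model.score(X_test, y_test)
print(accuracy)
```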

I then stored all the predicted values in y_predicted in order to build a confusion matrix, as shown in Fig. 7.

Fig. 7

I built the confusion matrix and plotted it using the seaborn and matplotlib modules, with these imports:

import matplotlib.pyplot as plt

import seaborn as sn

In the matrix we can compare the true and predicted values. We can clearly see a few small errors, for example some cases where the true digit is 9 but the model predicts 3, along with a few others. This is expected: since our accuracy is about 98%, roughly 2% of the predictions are wrong.

Still, our model is very good and we can use it.
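For readers who want to reproduce a matrix like this, here is a hedged sketch. The helper name plot_confusion, the figure size and random_state are assumptions, and the plotting imports sit inside the helper so the matrix itself can be computed without them.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)
model = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
y_predicted = model.predict(X_test)

# Rows are the true digits, columns are the predicted digits
cm = confusion_matrix(y_test, y_predicted)
print(cm)


def plot_confusion(cm):
    """Draw the confusion matrix as an annotated heatmap (hypothetical helper)."""
    import matplotlib.pyplot as plt
    import seaborn as sn

    plt.figure(figsize=(10, 7))
    sn.heatmap(cm, annot=True, fmt="d")
    plt.xlabel("Predicted")
    plt.ylabel("Truth")
    plt.show()
```

Calling plot_confusion(cm) in a notebook cell draws the heatmap. Correct predictions lie on the diagonal, and every off-diagonal cell counts one kind of mistake.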

I hope this explanation was useful to you. Thank you!

I am also thankful to the mentors at https://internship.suvenconsultants.com for providing awesome problem statements and giving many of us a coding internship experience. Thank you, www.suvenconsultants.com.










