Support Vector Machines (SVM) Fundamentals Part-I

Hi,

Today I am going to discuss the Support Vector Machine (SVM). I will try to keep it very simple, so if you are a beginner in Artificial Intelligence, or in Support Vector Machines in particular, this can be your beginner's guide from which you can kick-start further learning. The only prerequisite for this post is elementary knowledge of geometry and algebra. I have divided the topic into three posts. In this first post we will discuss the building blocks of the Support Vector Machine and the motivation behind it; in the next two posts we will elaborate on its concepts further.

Developed by Vladimir Vapnik and colleagues in the 1990s, the Support Vector Machine remains one of the most popular classifiers to date. The reason is that it not only classifies the patterns, it also optimizes the decision boundary. How it does that we will see later, but first let us refresh some basic concepts.

Linear Separability vs Not-Linear Separability:  Two sets of points are linearly separable if they can be separated by a single line in two dimensions, or by a hyperplane in more than two dimensions. If they cannot be separated by a line (or, in general, by a hyperplane), they are said to be not-linearly separable.
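To make the definition concrete, here is a small sketch in Python. It writes a line as w1*x1 + w2*x2 = c (the general form used in the next section) and checks whether a given line puts the two sets strictly on opposite sides. The point sets, the candidate lines, and the function names are all my own illustrative assumptions, not data from the figures.

```python
def side(w, c, x):
    """Return w.x - c; its sign tells which side of the line/hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) - c


def separates(w, c, class_a, class_b):
    """True if class_a lies strictly on one side and class_b strictly on the other."""
    a_pos = all(side(w, c, p) > 0 for p in class_a)
    a_neg = all(side(w, c, p) < 0 for p in class_a)
    b_pos = all(side(w, c, p) > 0 for p in class_b)
    b_neg = all(side(w, c, p) < 0 for p in class_b)
    return (a_pos and b_neg) or (a_neg and b_pos)


# Hypothetical toy data: two well-separated clusters.
stars = [(0.0, 0.0), (0.2, 0.3), (0.3, 0.1)]
pluses = [(1.0, 1.0), (0.9, 1.2), (1.3, 0.8)]
print(separates((1.0, 1.0), 1.0, stars, pluses))  # line x1 + x2 = 1 -> True

# Hypothetical XOR-style layout: opposite corners share a class.
xor_a = [(0.0, 0.0), (1.0, 1.0)]
xor_b = [(0.0, 1.0), (1.0, 0.0)]
candidates = [
    ((1.0, 1.0), 1.0),   # x1 + x2 = 1
    ((1.0, 0.0), 0.5),   # x1 = 0.5
    ((0.0, 1.0), 0.5),   # x2 = 0.5
    ((1.0, -1.0), 0.0),  # x1 - x2 = 0
]
print(any(separates(w, c, xor_a, xor_b) for w, c in candidates))  # False
```

Note that the second check only shows that these four candidate lines fail. The XOR layout is a well-known example of not-linearly separable data, but trying a handful of lines is an illustration, not a proof.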

Figure: Linearly and not-linearly separated datasets

 

Equation of Line and Hyperplane:  A line in a two-dimensional plane is defined by the equation X2 = mX1 + C, or X2 - mX1 = C, or in general w1X1 + w2X2 = C, where w1 = -m and w2 = 1.

In functional form, the line is the set of points where f(X) = w1X1 + w2X2 - C = 0.

For a hyperplane in n-dimensional space this function can be generalized as

W^T X = C, where W = [w1, w2, ..., wn] and X = [X1, X2, ..., Xn]
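If you prefer to see this as code, the following is a minimal sketch of the same algebra. The helper names line_to_weights and f and the example numbers are my own assumptions, chosen only for illustration.

```python
def line_to_weights(m, C):
    """Rewrite X2 = m*X1 + C as w1*X1 + w2*X2 = C with w1 = -m and w2 = 1."""
    return (-m, 1.0), C


def f(W, X, C):
    """Functional form f(X) = w1*X1 + ... + wn*Xn - C; zero exactly on the hyperplane."""
    return sum(w * x for w, x in zip(W, X)) - C


# Hypothetical line X2 = 2*X1 + 1.
W, C = line_to_weights(2.0, 1.0)
print(W, C)                  # (-2.0, 1.0) 1.0
print(f(W, (3.0, 7.0), C))   # 7 = 2*3 + 1, so the point is on the line: 0.0
print(f(W, (3.0, 9.0), C))   # the point is off the line: 2.0

# The same function works unchanged in higher dimensions, here n = 4.
print(f((1.0, 2.0, 3.0, 4.0), (1.0, 1.0, 1.0, 1.0), 10.0))  # 0.0
```

The point of the sketch is that nothing changes between two dimensions and n dimensions except the length of W and X.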

Some points to remember:

    1. The hyperplane W^T X = C separates the space into two half spaces (a positive half space and a negative half space). The positive half space satisfies the inequality W^T X > C, which means that for all points lying in the positive half space w1X1 + w2X2 + ... + wnXn > C. Similarly, for all points lying in the negative half space w1X1 + w2X2 + ... + wnXn < C.
    2. The dimension of a hyperplane is always (n-1), where n is the dimension of the space. This means that in a 2-dimensional plane the hyperplane is a line, in a 3-dimensional space the hyperplane is a 2-dimensional plane, and so on.
    3. A hyperplane is also known as a linear discriminant, as it linearly divides the space into two halves.
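The first point is really the whole idea of a linear classifier, so here is a rough sketch of it. The function name half_space and the example plane X1 + X2 + X3 = 3 are assumptions of mine; the plane is a 2-dimensional hyperplane in a 3-dimensional space, as in the second point.

```python
def half_space(W, X, C):
    """Return +1 in the positive half space, -1 in the negative one, 0 on the hyperplane."""
    value = sum(w * x for w, x in zip(W, X)) - C
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0


# Hypothetical plane in 3-dimensional space: X1 + X2 + X3 = 3.
W, C = (1.0, 1.0, 1.0), 3.0
for X in [(2.0, 2.0, 2.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]:
    print(X, half_space(W, X, C))
# (2.0, 2.0, 2.0) 1
# (0.0, 0.0, 0.0) -1
# (1.0, 1.0, 1.0) 0
```

Assigning one class to the +1 side and the other class to the -1 side is what turns the hyperplane into a classifier.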

Now get ready for a shocking revelation: the Support Vector Machine is a linear discriminant, or you can say a linear classifier. Simple enough to understand. Now you may ask what is new in this, or what about patterns that are not-linearly separable. Keep calm, we will address these one by one by taking examples. Let's take the case of a 2-dimensional space with two classes (class 1: *, class 2: +) that are linearly separable, as shown in the figure below:

Figure: linearly separable classes

A line (linear discriminant) with equation x1 + x2 = 1 classifies the two patterns. From the above figure it is evident that there can be an infinite number of possible lines that successfully classify the two patterns, as shown in the figure below.

Figure: linearly separable classes with many possible separating lines

Now the question arises: which line is the best? As you can see from the above figure, the lines in red are quite at risk of misclassifying the patterns. Suppose that, due to measurement noise or something similar, points in class 1 shift towards the left or points in class 2 shift towards the right; the red lines would then no longer be able to classify the patterns. So which line will you choose? The answer is very simple: the line that maximizes the margin from both patterns. That is the virtue of the Support Vector Machine: it not only classifies the patterns, it also maximizes the margin.
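To get a feel for what "margin" means, here is a small sketch that measures it for a few hand-picked lines. I take the margin of a line to be the distance from the line to the closest point of either class. The data points and candidate lines are hypothetical, and comparing a short list of candidates is only an illustration: it is not how a Support Vector Machine is actually trained.

```python
import math


def distance(W, C, X):
    """Perpendicular distance from point X to the line W.X = C."""
    value = sum(w * x for w, x in zip(W, X)) - C
    return abs(value) / math.sqrt(sum(w * w for w in W))


def margin(W, C, points):
    """Distance from the line to the closest point of either class."""
    return min(distance(W, C, p) for p in points)


# Hypothetical toy data for class 1 (*) and class 2 (+).
stars = [(0.0, 0.0), (0.2, 0.3), (0.3, 0.1)]
pluses = [(1.0, 1.0), (0.9, 1.2), (1.3, 0.8)]
points = stars + pluses

# Hand-picked candidate lines; each one separates the two classes above.
candidates = {
    "x1 + x2 = 0.6": ((1.0, 1.0), 0.6),
    "x1 + x2 = 1": ((1.0, 1.0), 1.0),
    "x1 + x2 = 1.25": ((1.0, 1.0), 1.25),
    "x1 = 0.8": ((1.0, 0.0), 0.8),
}

for name, (W, C) in candidates.items():
    print(f"{name}: margin = {margin(W, C, points):.3f}")

best = max(candidates, key=lambda name: margin(*candidates[name], points))
print("widest margin among these candidates:", best)  # x1 + x2 = 1.25
```

All four lines classify the toy data correctly, but the ones that pass close to a point (such as x1 + x2 = 0.6) leave very little room for noise, which is exactly the risk described above.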

Let us conclude the first post here. How the Support Vector Machine maximizes the margin, how it handles not-linearly separable data, how it is trained, and other topics I will discuss in the next posts. The link to the second post is as follows:

Support Vector Machines (SVM) Fundamentals Part-II

My email address is panthimanshu17@gmail.com, and my LinkedIn profile is on the About page.

Suggested Reading

Support Vector Machine (Information Science and Statistics)
