In the last post Support Vector Machines (SVM) Fundamentals Part-I , I talked about basic building blocks of support vector machines (SVM), lets just recap some important points:
- equation of n-dimensional hyperplane can be written asWT X =C, where W=[w1,w2,…….wn] and X=[X1,X2,……Xn]
- hyperplane separates the space into two half spaces (positive half space and negative half space).
- A hyperplane is also known as linear discriminant as it linearly divides the space in two halfs.
- Support vector machine is a linear discriminant.
Now lets move ahead. As I have mentioned equation of a hyperplane can be written as WT X =C , we will discuss about some properties W and C
- C determines the position of hyperplane and W determines the orientation (angle with axis) of a hyperplane. How? lets analyze it by taking a 2-dimensional surface, for a two dimensional surface hyperplane will be a line and the equation will be w1x1+w2x2=C , this equation can be written as x2=-(w1/w2)*x1+C , now compare it with general line equation y=mx+C, as you can see m determines the angle with the axis (-w1/w2) and C determines position in the X-Y Plane.
- Vector W is orthogonal to the hyperplane, the direction of W is in the direction of positive half. How? lets take two points m and n in hyperplane, now these points are in hyperplane so they will satisfy the equation WT m=C and WT n=C . Subtracting these two equations we will get WT(m-n)=0 , now direction of vector (m-n) will be in the direction of plane and as the dot product of W with (m-n) is zero, it means vector W is orthogonal (perpendicular in layman term) to hyperplane.
- Shortest distance between a point and the hyperplane. For Illustration lets take an example of figure below:
To find the minimum distance between Point X and and decision boundary, we need to find point Xp in decision boundary such that the vector (Xp-X) is orthogonal to the boundary. This becomes an optimization problem with a objective function:
find Xp Such that ||Xp-X|| is minimum and WT Xp =C (as Xp is on decision boundary)
Solving above optimization problem requires Formulation of Lagrangian and applying Karush-Kuhn-Tucker (KKT) conditions. For the sake of simplicity I am just producing Optimization Result. (If you want mathematics involved just send me a mail).
Xp=X -( (WT X – C)W/||W||2 ) and the distance D between Xp and X:
D= (WT X – C)/||W||
The above Distance equation is very important as it forms basis of Support Vector Machines (SVM)
Now I think that we have discussed enough fundamentals and building block so lets dive straight into Support Vector Machines. As I discussed in the last post that support Vector machine is a linear classifier that maximizes the margin. For Illustration purpose lets have a look at figure below:
Take an example of two class problem classified by hyperplane WT X – C=0, positive class is represented by hyperplane WT X – C=1 and negative class s represented by hyperplane WT X – C=-1, the patterns near the boundaries of each class falls on these hyperplanes. these hyperplanes shown in green in above figure are known as support planes. Pattern Vectors lying on the support plane are known as Support Vectors. These support vectors are sufficient for training of support vector machine it also results in data reduction.
As Support Vector Machine works on principle of margin maximization. from the above equation distance D between point (support vector) on support plane and and the classifier hyperplane WT X – C=0 is
D= (WT X – C)/||W||
now for negative class WT X – C=-1 and for positive class WT X – C=1 so the total margin between two support planes
Support Vector Machine aims to maximize Dt. so it becomes an optimization problem with a statement
It can be converted into dual problem: Minimize ||W||/2 or ||W||2/2 Subjected to:
WT X – C>=1 for positive half patterns and WT X – C<= -1 for negative half patterns
Lets conclude second post here. In the next post (https://panthimanshu17.wordpress.com/2013/08/21/support-vector-machines-svm-fundamentals-part-iii/) we will look into some more aspects of Support Vector Machines.
my email address is: firstname.lastname@example.org
My LinkedIn profile is in about page.