In the second post Support Vector Machines (SVM) Fundamentals Part-II , I discussed about margin maximization, support planes and support vectors. Lets have a quick recap:

- In a basic sense SVM is a two class classifier. it learns from the data lying in support planes.
- Data lying on support planes are called support vectors.
- Support Vector Machine aims to maximize Dt (margin) . So its a maximum margin classifier with objective function (dual problem)
*Minimize (**||W||*^{2}/2 )

Before discussing the learning process of support vector machines (SVM) you must understand what Karush–Kuhn–Tucker (KKT) conditions are. So before going ahead I recommend you to get some understanding of Karush–Kuhn–Tucker (KKT) conditions. Here are some references:

http://www.onmyphd.com/?p=kkt.karush.kuhn.tucker http://apmonitor.com/me575/index.php/Main/KuhnTucker http://www.aass.oru.se/Research/Learning/drdv_dir/seminar_optim_2011/kkt.pdf

At this point I assume that you are having basic understanding of KKT conditions. For the sake of clarity I am just mentioning KKT in breif.

* min f (x) subject to **g _{i}(x) <= 0, i = 1, . . . , n *

*The Lagrangian for the function*

*L (x, λ) = f (x) +*

*λ1g1(x)*

*+*

*λ2g2(x)………*

*+*

*λngn(x)*

Lets say x* is a local optimum point so KKT conditions for point x* to be local minimum are

∇*L (x*, λ)=0—————(stationary) *

* **gi(x*) <= 0, for all i ————- (primal feasibility) *

* λigi(x*)=0 for all i ———————-(complementary slackness) *

* λi=>0 for all i ———————-(dual feasibility)*

* ** ** *Lets take a case of one dimensional data-set. I am taking One dimensional case for the sake of simplicity. It can be easily extended for multidimensional case. Also data patterns are linearly separable. I will show you afterwards how it can be extended for not-linearly separable cases.

Suppose there are three patterns. Pattern 1 (value=2) and 2 (value=4) belong to class # and pattern 3 (value=10) belongs to class * shown in figure below.

In this case *w *will be a single value instead of a vector. So we have to minimize * ||w||^{2}/2 .* Constraints are:

*2w-c<=-1 ; 4w-c<=-1 and 10w-c=>1*

*so Lagrangian will be*

*min L= ||w||^{2}/2 –λ1(-2w+c-1)-λ2(-4w+c-1)-λ3(10w-c-1).*

Taking derivative of L with respect to w ,c and *λ we get*

*w+2 λ1+4λ2-10λ3=0 (1)*

*λ1+ λ2-λ3=0 (2)*

–*2w+c-1=0 (3) ; -4w+c-1=0 (4); 10w-c-1=0 (5)*

also other kkt conditions are *λ1(- 2w+c-1)=0 (6) ; λ2(-4w+c-1)=0 (7); λ3(10w-c-1)=0 (8)*

solving (4) and (5) we get 6*w=2 or w=1/3 and c=7/3 . Also it is implied from (6) λ1 = 0 so from (2) λ2=λ3 and from (1) λ2=1/18=λ3.*

now decision boundary is wx-c=0 so x/3-7/3=0 or ** x=7. **It is obvious from above figure that point x=7 classifies the two classes with maximum margin (3) as pattern 2 is a boundary for class # and pattern 3 is a boundary for class *. Same reasoning can be extended for higher dimensions.

**Not-Linearly Separable Case: **

Lets take a case of data shown be

as it is evident from above figure class # and class * are not linearly separable. But when we project this data into two dimension (x, x^{2}) it becomes linearly separable as shown in figure below.

Now we can use formulation discussed to classify the data with maximum margins. The trick here is to use advantage of higher dimension without actually going into it and thats the beauty of support vector machines(SVM).

this completes the blog posts “Support Vector Machines Fundamentals” series. For any queries you can mail me at panthimanshu17@gmail.com.

Thanks for reading post.

###### Related articles

- Support Vector Machines (SVM) Fundamentals Part-II (panthimanshu17.wordpress.com)
- Support Vector Machines (SVM) Fundamentals Part-I (panthimanshu17.wordpress.com)

Simple and interesting

Part 1 and 2 … were very interesting and i understood all very easily, however, Part three is difficult…. as i donot have a background knowledge of KKT and Lagrangian. So if it can be re-written for people like me in a little digestible

Thanks. This was helpful.

Well I must admit this is the best introduction to SVM . But i would like you to make further articles on SVM advance topic like kernels and others . Again, I really learnt from you . Thanks.

Your articles are really valuable. It made me learn a little bit more about SVM. Thanks.

Thanks. It’s really nice tutorial. However, I had to review many topics of school math 😀