Data Science (Incl AI/ML): SVM (Support Vector Machines) Modelling

Thursday, October 31, 2019

SVM (Support Vector Machines) Modelling

SVM is a maximum-margin classifier: in its basic form it is a linear model, and kernels extend it to non-linear boundaries
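Handles high-dimensional data well; on very large data sets a linear SVM scales better than a kernel SVM, whose training cost grows quickly with the number of samples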
Not suitable for imbalanced data (why?)

An SVM trained on imbalanced data often produces a model that is biased towards the majority class and therefore performs poorly on the minority class.
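
The bias can be seen by comparing minority-class recall with and without class weighting. This is a minimal sketch, assuming scikit-learn is available. The synthetic 95:5 data set and the `class_weight="balanced"` setting are choices made here for illustration, not something these notes prescribe.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data with roughly a 95:5 class ratio (illustrative assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Same kernel twice; the only difference is the class weighting.
plain = SVC(kernel="rbf").fit(X_train, y_train)
weighted = SVC(kernel="rbf", class_weight="balanced").fit(X_train, y_train)

# Recall on class 1 (the minority class) is the number to watch.
recall_plain = recall_score(y_test, plain.predict(X_test))
recall_weighted = recall_score(y_test, weighted.predict(X_test))

print(f"minority recall, unweighted: {recall_plain:.2f}")
print(f"minority recall, balanced:   {recall_weighted:.2f}")
```

Overall accuracy can look fine on data like this even when minority recall is low, which is why recall is printed here rather than accuracy.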

Tips on choosing the right type of kernel for SVM

It ultimately boils down to trial and error
But it is not completely random; a lot of it is informed by experience
You get a sense of what the classification boundary looks like while doing the EDA (exploratory data analysis)

If the plotted classes are intermingled, a non-linear kernel is needed.
If the data points are more or less separable by a linear hyperplane, a linear kernel should do.

If you could draw a reasonably smooth curve as a separator, a polynomial kernel should work.
If the data is hopelessly intermingled, you need a more flexible kernel. The most common choice in this case is an RBF kernel.
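
The trial-and-error part can be made systematic by cross-validating each candidate kernel on the same data. This is a minimal sketch, assuming scikit-learn is available. The two-moons toy data set and the default kernel hyperparameters are assumptions made here for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy data with a curved class boundary (illustrative assumption).
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf"):
    # Scale the features first; SVMs are sensitive to feature scale.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()

for kernel, score in scores.items():
    print(f"{kernel:>6}: mean CV accuracy = {score:.3f}")

best_kernel = max(scores, key=scores.get)
print("best kernel on this data:", best_kernel)
```

The ranking depends on the data set, so the scores are a starting point to check against what the EDA plots suggest rather than a final answer.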

SVM Notations

Equation of a hyperplane in 2-D (which is simply a line)
W0 + W1X1 + W2X2 = 0
n-D equation
W0 + W1X1 + W2X2 + ... + WnXn = 0
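
The hyperplane equation also tells you which side of the boundary a point falls on: evaluate the left-hand side for that point and look at the sign. Here is a minimal sketch in plain Python. The helper names `decision_value` and `side`, and the example weights, are hypothetical and chosen only for illustration.

```python
def decision_value(weights, bias, x):
    """Return W0 + W1*X1 + ... + Wn*Xn for a single point x."""
    if len(weights) != len(x):
        raise ValueError("weights and x must have the same length")
    return bias + sum(w * xi for w, xi in zip(weights, x))


def side(weights, bias, x):
    """Return +1 or -1 for the two sides of the hyperplane, 0 if x lies on it."""
    value = decision_value(weights, bias, x)
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0


# Hypothetical 2-D hyperplane: -3 + 1*X1 + 1*X2 = 0
w0 = -3.0
w = [1.0, 1.0]

for point in ([2.0, 2.0], [1.0, 1.0], [1.0, 2.0]):
    print(point, decision_value(w, w0, point), side(w, w0, point))
```

The same two functions work in n dimensions without changes, since only the length of the weight list changes.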
