SVM (Support Vector Machines) Modelling
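- SVM is a maximum-margin classifier: it looks for the hyperplane that separates the classes with the widest margin (a regression variant, SVR, also exists)
- Copes well with high-dimensional data; kernel SVMs get slow on very large datasets, where a linear SVM is the usual fallback
- Not well suited to imbalanced data (why?)
- On imbalanced data, SVM often produces a model that is biased towards the majority class and therefore performs poorly on the minority class.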
- Tips on choosing the right type of kernel for SVM
- Ultimately boils down to trial and error
- But not completely random; a lot of it is informed by experience
- EDA usually gives you a sense of what the classification boundary looks like
- If the classes in the data plot are intermingled, a non-linear kernel is needed.
- If the data points are more or less separable by a linear hyperplane, a linear kernel should do.
- If you could draw a reasonably smooth curve as the separator, a polynomial kernel should work.
- If the data is hopelessly intermingled, you need a more complex kernel. The most common choice in this case is the RBF kernel.
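The trial-and-error step can be made systematic by cross-validating each candidate kernel on the same data. The sketch below is only an illustration and assumes scikit-learn is installed; the toy dataset (`make_moons`), the 5-fold split and the default kernel hyperparameters are choices made here, not something prescribed by these notes.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy data (an assumption for illustration): two interleaving half-moons,
# which no straight line separates cleanly.
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf"):
    # Scale features first; SVMs are sensitive to feature scale.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    # Mean accuracy over 5 cross-validation folds.
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()

for kernel, score in scores.items():
    print(f"{kernel:>6}: {score:.3f}")

best_kernel = max(scores, key=scores.get)
print("best kernel on this data:", best_kernel)
```

On real data you would also tune `C`, `gamma` and the polynomial `degree` per kernel (for example with `GridSearchCV`) before comparing, since the defaults can favour one kernel unfairly.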
SVM Notations
- Equation of a hyperplane in 2-D (where it is simply a line)
$W_0 + W_1 X_1 + W_2 X_2 = 0$
- Equation of a hyperplane in n-D
$W_0 + W_1 X_1 + W_2 X_2 + \dots + W_n X_n = 0$
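The hyperplane notation can be read as a function of a point: plug the coordinates into the left-hand side, and the sign of the result tells you which side of the hyperplane the point lies on. A minimal sketch in plain Python follows; the names `decision_value` and `side` and the example weights are hypothetical, with `w0` standing for W0 and the list `w` for W1 ... Wn.

```python
def decision_value(w0, w, x):
    """Return w0 + w1*x1 + ... + wn*xn for weights w and point x."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return w0 + sum(wi * xi for wi, xi in zip(w, x))


def side(w0, w, x):
    """Return +1 or -1 for the side of the hyperplane x lies on, 0 if on it."""
    value = decision_value(w0, w, x)
    return (value > 0) - (value < 0)


# Hypothetical 2-D hyperplane: -3 + 1*x1 + 1*x2 = 0, i.e. the line x1 + x2 = 3.
w0, w = -3.0, [1.0, 1.0]

print(side(w0, w, [1.0, 2.0]))  # point on the line
print(side(w0, w, [2.0, 2.0]))  # point on the positive side
print(side(w0, w, [0.0, 0.0]))  # point on the negative side
```

The same function works unchanged in n dimensions, because the sum simply runs over however many weights and coordinates are supplied.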