Thursday, October 31, 2019

SVM (Support Vector Machines) Modelling


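  • SVM is a supervised learning model used mainly for classification (a regression variant, SVR, also exists). In its basic form it is a linear classifier that picks the separating hyperplane with the largest margin; it is not a linear regression model.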
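  • Copes well with high-dimensional data. Training time for kernel SVMs, however, grows quickly with the number of samples, so very large datasets can be slow to fit.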
  • Not well suited to imbalanced data (why?)
    • An SVM trained on imbalanced data often produces a decision boundary biased towards the majority class, and hence performs poorly on the minority class.
  • Tips on choosing the right type of kernel for SVM
    • Ultimately boils down to trial and error
    • But it is not completely random; a lot of this is informed by experience
    • You would get a sense of what the classification boundary looks like while doing the EDA
      • If the classes in the plot are completely intermingled, a non-linear kernel is needed.
      • If the data points are reasonably separable by a linear hyperplane, more or less, a linear kernel should do.
      • If there is a reasonably smooth curve you could draw as a separator, a polynomial kernel should work.
      • If the data is hopelessly intermingled, you need a more flexible kernel. The most common choice in this case is the RBF kernel.
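The trial-and-error part of kernel selection can be sketched with cross-validation. This is only an illustration, assuming scikit-learn is installed; the synthetic two-ring dataset from `make_circles` and the helper name `compare_kernels` are choices made for this example, not something from the notes above.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic data (an assumption for this sketch): one class forms an inner
# ring, the other an outer ring, so no straight line separates them.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.3, random_state=0)


def compare_kernels(X, y, kernels=("linear", "poly", "rbf"), cv=5):
    """Return the mean cross-validated accuracy for each candidate kernel."""
    scores = {}
    for kernel in kernels:
        # All other SVC hyperparameters are left at their defaults.
        model = SVC(kernel=kernel)
        scores[kernel] = cross_val_score(model, X, y, cv=cv).mean()
    return scores


scores = compare_kernels(X, y)
for kernel, score in scores.items():
    print(f"{kernel:>6}: mean CV accuracy = {score:.3f}")
```

On ring-shaped data like this, the linear kernel should do little better than guessing while RBF separates the classes well, which matches the EDA rule of thumb above. On real data you would normally scale the features first and also tune `C` and `gamma` rather than rely on the defaults.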


SVM Notations

  • 2-D hyperplane equation (in 2-D a hyperplane is just a line)
    $W_0 + W_1 X_1 + W_2 X_2 = 0$
  • n-D equation
    $W_0 + W_1 X_1 + W_2 X_2 + \dots + W_n X_n = 0$
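To make the notation concrete, here is a small sketch that evaluates the left-hand side of the hyperplane equation for a point and reports which side of the hyperplane the point falls on. It assumes the n-D form simply extends the 2-D one with one weight per feature; the function names `hyperplane_value` and `side` and the example weights are made up for illustration.

```python
from typing import Sequence


def hyperplane_value(w0: float, w: Sequence[float], x: Sequence[float]) -> float:
    """Compute W0 + W1*X1 + ... + Wn*Xn for a single point x."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same number of components")
    return w0 + sum(wi * xi for wi, xi in zip(w, x))


def side(w0: float, w: Sequence[float], x: Sequence[float]) -> int:
    """Return +1 or -1 for the two sides of the hyperplane, 0 if x lies on it."""
    value = hyperplane_value(w0, w, x)
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0


# Hypothetical 2-D example: the line -4 + 1*X1 + 1*X2 = 0, i.e. X1 + X2 = 4.
w0, w = -4, [1, 1]
print(side(w0, w, [1, 3]))  # point on the line
print(side(w0, w, [3, 3]))  # point on the positive side
print(side(w0, w, [0, 0]))  # point on the negative side
```

A linear SVM classifies a point by exactly this sign; training is about choosing the weights so that the hyperplane separates the classes with the widest margin.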
