by Freddy Daniel
Last Updated July 12, 2019 04:19 AM

I have trained two models using the sklearn library in Python. My dataset has about 750 samples, 250 per class (three classes), and I trained on only one feature dimension (a 1-D array). These are the results:

- SVM

Training plus testing (with a 0.20 test split) took: 0.029801 s

- KNN

Training plus testing (with a 0.20 test split) took: 0.0074096 s

As we can see, k-NN had the shorter execution time, ≈ 7.4 milliseconds, versus ≈ 29.8 milliseconds for the SVM.
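The question doesn't include the timing code, so here is a minimal sketch of how such a measurement could be reproduced. The synthetic 1-D data (750 samples, 250 per class, well-separated Gaussians) is an assumption chosen to mirror the setup described above; the actual dataset and hyperparameters may differ.

```python
# Hedged sketch: timing SVC vs. KNeighborsClassifier on synthetic 1-D data.
# The data below is assumed for illustration; only the sample counts and the
# 0.20 test split follow the question's description.
import time

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# 750 samples, 250 per class, one feature each (reshaped to a column vector).
X = np.concatenate([rng.normal(loc=c, scale=0.5, size=250) for c in (0, 3, 6)])
y = np.repeat([0, 1, 2], 250)
X = X.reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20, random_state=0)

for model in (SVC(), KNeighborsClassifier()):
    start = time.perf_counter()
    model.fit(X_tr, y_tr)        # training
    model.predict(X_te)          # inference on the held-out 20%
    elapsed = time.perf_counter() - start
    print(f"{type(model).__name__}: {elapsed:.6f} s")
```

Note that a single wall-clock measurement like this reflects constants, implementation details, and dataset size, not asymptotic complexity.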

I'm interested in knowing which of these two models is more computationally complex. According to [1], the complexity of SVM (LibSVM) is O(n^3); sklearn uses libsvm as its backend solver for SVM problems (linear and non-linear).

According to [2], the complexity of k-NN is O(nd).

"Because big-O notation gives only an asymptotic upper bound, and not an asymptotically tight bound, we can make statements that at first glance seem incorrect but are technically correct. For example, it is absolutely correct to say that binary search runs in O(n) time, because its running time grows no faster than a constant times n. In fact, it grows more slowly." [3]

Which is more complex, O(n^3) or O(nd), and why?

From my point of view, k-NN is less complex in execution time than the SVM model. Thanks so much.

[1] https://core.ac.uk/download/pdf/48595285.pdf
[2] k-NN computational complexity
[3] https://es.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/big-o-notation


Empirical evaluation can't really determine which of two algorithms has lower asymptotic complexity. In fact, I'm pretty sure that would violate Rice's theorem.

Which is more complex, O(n^3) or O(nd), and why?

Well, these aren't directly comparable: one is a function of the number of data points alone, while the other is a function of both the number of data points and the dimension.

Furthermore, I really doubt the complexity of SVM training is independent of dimension, so the $O(n^3)$ bound was probably derived assuming some fixed dimension, which makes it even less comparable with a bound derived for $d$-dimensional data points.

Because big-O notation gives only an asymptotic upper bound, and not an asymptotically tight bound

This is a bit of a mathematical nuance, but to abuse some notation you can think of "$O$" as the "$\leq$" inequality. So it is valid to say that a constant time algorithm is in $O(e^n)$, because $1 \leq e^n$. Of course such looseness is rarely useful so people use $\Theta$ to denote a tight bound. (And in most cases when people say $O$ they really mean $\Theta$).
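The binary-search example from the quote can be made concrete by counting comparisons. This is an illustrative sketch (not from the original post): the step count grows like $\log_2 n$, so the algorithm is $\Theta(\log n)$, and saying it is $O(n)$ is technically correct but loose.

```python
# Minimal sketch: binary search makes ~log2(n) comparisons, so it is
# Theta(log n) -- and therefore also (loosely) O(n), as the quote says.
def binary_search_steps(sorted_items, target):
    """Return (found, number_of_comparisons) for a standard binary search."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return True, steps
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, steps

for n in (1_000, 1_000_000):
    found, steps = binary_search_steps(list(range(n)), n - 1)
    # steps stays near log2(n): roughly 10 for n=1000, roughly 20 for n=10^6,
    # i.e. it grows far more slowly than n itself.
    print(n, steps)
```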

According to [2] the complexity of K-NN is O(nd)

There are two tasks at hand here: training and inference. For SVM, training takes $O(n^3)$ according to you, but inference takes $O(d)$, since you only need to determine which side of a hyperplane a given point lies on. For KNN, no training is needed, but inference is substantially more expensive (that is where the $O(nd)$ bound comes from). So really, it doesn't make much sense to compare the training time of one classifier with the inference time of another.
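The training/inference distinction above can be sketched in a few lines. This is a hedged illustration, not the original poster's code: it assumes a binary linear SVM with a pre-trained hyperplane $(w, b)$ and brute-force 1-NN, and the random data is made up for the example.

```python
# Per-query inference cost, side by side:
#   linear SVM -> one dot product, sign(w . x + b): O(d) work.
#   brute-force 1-NN -> a distance to every stored training point: O(nd) work.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X_train = rng.normal(size=(n, d))       # stored training set (k-NN keeps all of it)
w, b = rng.normal(size=d), 0.1          # assumed pre-trained SVM hyperplane
x = rng.normal(size=d)                  # one query point

# SVM inference: which side of the hyperplane is x on? A single O(d) dot product.
svm_label = int(np.dot(w, x) + b >= 0)

# Brute-force 1-NN inference: n distance computations, each O(d) -> O(nd) total.
dists = np.linalg.norm(X_train - x, axis=1)
nearest = int(np.argmin(dists))

print(svm_label, nearest)
```

This is why comparing the SVM's training time against k-NN's inference time (or vice versa) mixes two different quantities.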

Updated March 08, 2019 21:19 PM
