Using benchmark datasets, we study the performance of three efficient clustering algorithms that find cluster centers using a fixed number of random samples. The algorithms are also compared with two other well-known algorithms, namely k-means and PAM. One of the efficient algorithms, CLARA, is well known, while the other two, k-means-lite and PAM-lite, were introduced recently. CLARA and PAM-lite are based on the k-medoids approach, while k-means-lite adopts the k-means approach. The study shows that k-means-lite is the most efficient, followed by PAM-lite, which in turn is faster than CLARA. PAM-lite exhibits the best balance of efficiency and accuracy: its results are the closest to those of PAM, which is the most accurate of the algorithms compared but also the least efficient.
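To illustrate the general idea of finding cluster centers from a fixed number of random samples, the following is a minimal CLARA-style sketch in Python. It is not the authors' implementation of any of the algorithms evaluated. The function names, the parameters `n_samples` and `sample_size`, and their default values are assumptions made for illustration, and an exhaustive search over each small sample stands in for PAM.

```python
import math
import random
from itertools import combinations


def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)


def best_medoids_exhaustive(sample, k):
    """Brute-force k-medoids on a small sample (a stand-in for PAM)."""
    return min(combinations(sample, k), key=lambda meds: total_cost(sample, meds))


def clara_sketch(points, k, n_samples=5, sample_size=12, seed=0):
    """Cluster a fixed number of random samples and keep the medoid set
    that gives the lowest cost on the full dataset."""
    rng = random.Random(seed)
    best, best_cost = None, math.inf
    for _ in range(n_samples):
        # Draw a fixed-size random sample without replacement.
        sample = rng.sample(points, min(sample_size, len(points)))
        # Solve the small clustering problem on the sample only.
        medoids = best_medoids_exhaustive(sample, k)
        # Score the candidate medoids against all points.
        cost = total_cost(points, medoids)
        if cost < best_cost:
            best, best_cost = list(medoids), cost
    return best, best_cost


# Hypothetical usage on two well-separated groups of 2-D points.
data = [(i % 5, i // 5) for i in range(20)]
data += [(100 + i % 5, 100 + i // 5) for i in range(20)]
medoids, cost = clara_sketch(data, k=2)
print(medoids, cost)
```

The cost of each round depends on the sample size rather than the dataset size, apart from the single pass over all points needed to score the candidate medoids, which is why sampling-based methods of this kind scale to large datasets.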
Reference:
Olukanmi, P.O., Nelwamondo, F.V. and Marwala, T. 2019. Performance evaluation of sampling-based large-scale clustering algorithms. SAUPEC/RobMech/PRASA Conference, Bloemfontein, South Africa, 28-30 January 2019, pp. 200-204.
Olukanmi, P., Nelwamondo, F. V., & Marwala, T. (2019). Performance evaluation of sampling-based large-scale clustering algorithms. IEEE. http://hdl.handle.net/10204/11121
Olukanmi, PO, Fulufhelo V Nelwamondo, and T Marwala. "Performance evaluation of sampling-based large-scale clustering algorithms." (2019): http://hdl.handle.net/10204/11121
Copyright: 2019. IEEE. Due to copyright restrictions, the attached PDF file only contains the abstract of the full text item. For access to the full text item, kindly consult the publisher's website.