Yibin (Spencer) Sun

I am currently pursuing my doctorate in Computer Science at the University of Waikato. My research focuses on advanced machine learning algorithms for streaming data. My interests extend to all fields related to machine learning, data mining, and artificial intelligence. I also enjoy applying data stream techniques to real-world data, such as energy pricing and electromyography (EMG) signals.

CapyMOA

On May 2, 2024, we launched a novel machine learning library for data streams! See more information about it below. We presented CapyMOA tutorials at several venues in 2024, including PAKDD (Taipei), IJCAI (Jeju, South Korea), KDD (Barcelona, Spain), ECML (Vilnius, Lithuania), KiwiPycon (Wellington, NZ), PRICAI (Kyoto, Japan), and ICONIP (Auckland, NZ). Material is available on the CapyMOA Discord here.

Publications

Real-Time Energy Pricing in New Zealand: An Evolving Stream Analysis

Y Sun, H M Gomes, B Pfahringer, A Bifet. Pacific Rim International Conference on Artificial Intelligence (PRICAI), 2024. Springer. Longer version at arXiv.

This study applies data stream analysis to real-time energy pricing in New Zealand, aiming to provide timely and accurate predictions of energy costs.

Adaptive Prediction Interval for Data Stream Regression

Y Sun, H M Gomes, B Pfahringer, A Bifet. Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2024. Springer.

This paper proposes an adaptive prediction interval model for data stream regression, improving accuracy and reliability over traditional methods by dynamically adjusting prediction intervals.

SOKNL: A novel way of integrating k-nearest neighbours with adaptive random forest regression for data streams

Y Sun, H M Gomes, B Pfahringer. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2022. Springer.

This paper introduces SOKNL, a new approach that combines k-nearest neighbours with adaptive random forest regression to improve regression performance on data streams.