2018-08-09 | Lin Yang:Clustering High Dimensional Dynamic Data Streams

2018-08-09

Abstract

We present a data streaming algorithms for the k-median problem in high-dimensional dynamic geometric data streams, i.e. streams allowing both insertions and deletions of points from a discrete Euclidean space {1,2,…Δ}^d. Our algorithms use k/eps^2 * poly(d log Δ) space/time and maintain with high probability a small weighted set of points (a coreset) such that for every set of k centers the k-median cost of the coreset (1+eps)-approximates the cost of the streamed point set. We can use this coreset to compute a (1+eps)-approximation for the k-median problem by any efficient offline k-median algorithm. This is the first algorithm achieving space polynomial in d in this setting.


Time

8月9日(周四)14:00-15:00


Speaker

Lin Yang is currently a postdoc researcher at Princeton university, working with Prof. Mengdi Wang. He obtained his PhD degrees (in computer science and in physics) from Johns Hopkins University and B.S. degree from Tsinghua University. He is broaderly interested in designing efficient algorithms for streaming data analysis and machine learning.


Venue

信息管理与工程学院  602会议室

上海财经大学

上海市杨浦区武东路100号