Yan Zheng
PhD student. Visa Research.



Journal

2017

  • Visualization of Big Spatial Data using Coresets for Kernel Density Estimates. (Project Website)
    By Yan Zheng, Yi Ou, Alexander Lex, Jeff M. Phillips
    Vol. abs/1709.04453, CoRR, 2017.

2013

  • Geometric Inference on Kernel Density Estimates. (Project Website)
    By Jeff M. Phillips, Bei Wang, Yan Zheng
    Vol. abs/1307.7760, CoRR, 2013.

Conference

2017

  • Coresets for Kernel Regression (Project Website)
    By Yan Zheng and Jeff M. Phillips
    In Proceedings of ACM Conference on Knowledge Discovery and Data Mining (KDD), pages ??-??, August 2017.
    Abstract

    This project developed new data summaries for kernel regression, which have not been formally studied before. These approaches are related to kernel density estimates, for which there are several effective summaries, but those model only the density of the data, not an associated weight. We produce both sample complexity results and deterministic approaches that adapt to the structure of the data. We demonstrate these approaches on a variety of 1-dimensional (e.g., time series), 2-dimensional (e.g., spatial), and high-dimensional data sets. They can summarize sets of enormous size down to a few megabytes with minimal loss of accuracy.
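
    As a rough illustration of what summarizing data for kernel regression means, the sketch below evaluates a Nadaraya-Watson estimator (a kernel-weighted average of the values) on a full data set and on a uniform random subsample. The Gaussian kernel, the bandwidth, the synthetic data, and every function name are assumptions made for this example; uniform sampling stands in as the simplest possible summary and is not claimed to be the method developed in the paper.

```python
import math
import random


def kernel_regression(xs, ys, q, bandwidth):
    """Nadaraya-Watson estimate at query q: Gaussian-weighted average of ys."""
    weights = [math.exp(-0.5 * ((q - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # No point carries any weight at q; return a neutral value.
        return 0.0
    return sum(w * y for w, y in zip(weights, ys)) / total


# Synthetic 1-dimensional data (hypothetical): noisy samples of sin(x) on [0, 10].
reg_rng = random.Random(1)
full_x = [reg_rng.uniform(0.0, 10.0) for _ in range(3000)]
full_y = [math.sin(x) + reg_rng.gauss(0.0, 0.1) for x in full_x]

# Naive summary: keep 300 (x, y) pairs chosen uniformly at random.
kept = reg_rng.sample(range(len(full_x)), 300)
summary_x = [full_x[i] for i in kept]
summary_y = [full_y[i] for i in kept]

# Compare the two estimates on a grid of query points from 0.5 to 9.5.
reg_queries = [0.5 + 0.1 * i for i in range(91)]
reg_err = max(
    abs(kernel_regression(full_x, full_y, q, 0.3)
        - kernel_regression(summary_x, summary_y, q, 0.3))
    for q in reg_queries
)
print("largest gap between full and summarized regression:", reg_err)
```

    The gap printed at the end is the kind of worst-case error a summary is meant to keep small while storing far fewer points.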

2015

  • Geometric Inference on Kernel Density Estimates. (Project Website)
    By Jeff M. Phillips, Bei Wang, Yan Zheng
    In Proceedings of Symposium on Computational Geometry (COMPGEOM 2015), pages 857-871, 2015.
  • Subsampling in Smoothed Range Spaces. (Project Website)
    By Jeff M. Phillips, Yan Zheng
    In Proceedings of ALT (ALT 2015), pages 224-238, 2015.
  • L∞ Error and Bandwidth Selection for Kernel Density Estimates of Large Data. (Project Website)
    By Yan Zheng, Jeff M. Phillips
    In Proceedings of KDD (KDD 2015), pages 1533-1542, 2015.

2013

  • Quality and Efficiency for Kernel Density Estimates in Large Data. (Talk)
    By Yan Zheng, Jeffrey Jestes, Jeff M. Phillips, Feifei Li
    In Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD 2013), pages 433-444, June 2013.
    Abstract

    Kernel density estimates are important for a broad variety of applications including media databases, pattern recognition, computer vision, data mining, and the sciences. Their construction has been well-studied, but existing techniques are expensive on massive datasets and/or only provide heuristic approximations without theoretical guarantees. We propose randomized and deterministic algorithms with quality guarantees which are orders of magnitude more efficient than previous algorithms. Our algorithms do not require knowledge of the kernel or its bandwidth parameter and are easily parallelizable. We demonstrate how to implement our ideas in a centralized setting and in MapReduce, although our algorithms are applicable to any large-scale data processing framework. Extensive experiments on large real datasets demonstrate the quality, efficiency, and scalability of our techniques.
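
    To make the idea of approximating a kernel density estimate from a smaller point set concrete, here is a minimal sketch that compares a Gaussian KDE built on all points with one built on a uniform random sample, and reports the largest difference over a grid of queries. The kernel, bandwidth, sample size, synthetic data, and all names below are assumptions chosen for illustration; the abstract does not spell out the algorithms, so this should be read as a generic sampling baseline rather than the paper's method.

```python
import math
import random


def gaussian_kde(points, x, bandwidth):
    """1-dimensional KDE at x: the average of Gaussian kernels centered at each point."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
    return norm * total / len(points)


def max_abs_error(full, subset, queries, bandwidth):
    """Largest absolute difference between the two KDEs over the query points."""
    return max(
        abs(gaussian_kde(full, q, bandwidth) - gaussian_kde(subset, q, bandwidth))
        for q in queries
    )


# Synthetic data (hypothetical): a mixture of two Gaussian clusters.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(2000)]
data += [rng.gauss(4.0, 0.5) for _ in range(2000)]

# Uniform random sample used in place of the full data set.
sample = rng.sample(data, 400)

# Query grid from -3.0 to 6.0 in steps of 0.1.
queries = [-3.0 + 0.1 * i for i in range(91)]
err = max_abs_error(data, sample, queries, bandwidth=0.3)
print("largest KDE difference on the query grid:", err)
```

    Note that the sample is drawn without looking at the kernel or the bandwidth, so the same 400 points can be reused if either is changed later.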