Big data today is stored in a distributed fashion across many different machines or data sources. This poses new algorithmic and systems challenges for efficient analysis of the full data set. To address these difficulties, the PIs are building the MIDDLE (Mergeable and Interactive Distributed Data LayEr) Summarization System and deploying it on large real-world datasets. The MIDDLE system builds and maintains a special class of summaries, called mergeable summaries, that can be efficiently constructed and updated while still allowing fine-grained analysis on the heavy tail. A mergeable summary can represent any data set with a guaranteed tradeoff between size and accuracy, and any two such summaries can be merged to create a new summary with the same size-accuracy tradeoff.
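To make the merge property concrete, the sketch below uses the Misra-Gries frequency summary, a well-known example of a mergeable summary. It is an illustration only and is not taken from the MIDDLE system itself; the class name `MisraGries`, its methods, and the sample streams are all hypothetical. With at most `k` counters, each estimated count is never above the true count and falls short of it by at most n/(k+1), where n is the total number of items seen, and that bound still holds after two summaries are merged.

```python
from collections import Counter


class MisraGries:
    """Frequency summary holding at most k counters (illustrative sketch)."""

    def __init__(self, k):
        self.k = k                 # maximum number of counters kept
        self.n = 0                 # total number of items summarized
        self.counters = Counter()  # item -> (under)estimated count

    def _shrink(self):
        # Subtract the (k+1)-th largest count from every counter and
        # drop the counters that fall to zero or below.
        cut = sorted(self.counters.values(), reverse=True)[self.k]
        self.counters = Counter(
            {item: c - cut for item, c in self.counters.items() if c > cut}
        )

    def update(self, item):
        """Add one item from the stream."""
        self.n += 1
        self.counters[item] += 1
        if len(self.counters) > self.k:
            self._shrink()

    def merge(self, other):
        """Return a new summary of the combined data, still using k counters."""
        merged = MisraGries(self.k)
        merged.n = self.n + other.n
        merged.counters = self.counters + other.counters
        if len(merged.counters) > merged.k:
            merged._shrink()
        return merged

    def estimate(self, item):
        """Estimated count: never above the true count."""
        return self.counters.get(item, 0)


# Two machines summarize their local data independently.
left, right = MisraGries(k=4), MisraGries(k=4)
for x in ["a"] * 60 + ["b"] * 30 + [f"x{i}" for i in range(10)]:
    left.update(x)
for x in ["a"] * 40 + ["c"] * 25 + [f"y{i}" for i in range(15)]:
    right.update(x)

# The merged summary covers all 180 items with the same size budget.
merged = left.merge(right)
print(merged.n, dict(merged.counters))
```

In this sketch only the two small summaries need to be exchanged between machines, never the raw data, which is the practical reason mergeability matters in a distributed setting.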
Master's/PhD student. Now at Visa Research.
Graduated with a Master's degree in Summer 2015. First employment: Facebook.
Jeff M. Phillips
PhD student; graduated in 2013. Now at Microsoft.
PhD student. Research interest: knowledge base construction and application.
PhD student. Research interest: distributed chain database.