Medium: Collaborative: Seal: Secure Engine for AnaLytics - From Secure Similarity Search to Secure Data Analytics

Many organizations and individuals rely on the cloud to store their data and process their analytical queries. But such data may contain sensitive information. Not only do users want to conceal their data on a cloud, they may also want to hide analytical queries over their data, results of such queries, and data access patterns from a cloud service provider (that may be compromised either from within or by a third party).


  • NSF SaTC Program

  • People

    Feifei Li
    Associate Professor

    Zhao Chang
    PhD Student. Research Interest: Secure data systems and analytics.

    Jeff M. Phillips
    Assistant Professor


  • Privacy Preserving Subgraph Matching on Large Graphs in Cloud, Talk
    By Zhao Chang, Lei Zou, Feifei Li
    In Proceedings of 35th ACM SIGMOD International Conference on Management of Data (SIGMOD 2016), pages 199-213, 2016.

    The wide presence of large graph data and the increasing popularity of storing data in the cloud drive the need for graph query processing on a remote cloud. But a fundamental challenge is to process user queries without compromising sensitive information. This work focuses on privacy preserving subgraph matching in a cloud server. The goal is to minimize the overhead on both cloud and client sides for subgraph matching, without compromising users' sensitive information. To that end, we transform an original graph G into a privacy preserving graph Gk, which meets the requirement of an existing privacy model known as k-automorphism. By making use of the symmetry in a k-automorphic graph, a subgraph matching query can be efficiently answered using a graph Go, a small subset of Gk. This approach saves both space and query cost in the cloud server. In addition, we anonymize the original query graphs to protect their label information using a label generalization technique. To reduce the search space for a subgraph matching query, we propose a cost model to select the more effective label combinations. The effectiveness and efficiency of our method are demonstrated through extensive experimental results on real datasets.
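
    The label generalization idea in the abstract above can be illustrated with a small sketch. This is not the paper's algorithm: the grouping rule (sorted labels cut into fixed-size groups), the function names, and the sample labels are all assumptions made for illustration, and the paper's cost model for choosing label combinations is not reproduced. The sketch only shows the general pattern, in which the cloud sees group labels, returns a superset of candidates, and the client filters with the true labels.

```python
def build_label_groups(labels, group_size):
    """Assign each distinct label to a group of up to `group_size` labels.

    Assumption: groups are formed by sorting the labels and cutting the
    sorted list into chunks. A real system would choose groups with a
    cost model instead.
    """
    ordered = sorted(set(labels))
    return {label: f"group{i // group_size}" for i, label in enumerate(ordered)}


def generalize(vertex_labels, groups):
    """Replace every vertex label by its group label (what the cloud stores)."""
    return {v: groups[label] for v, label in vertex_labels.items()}


def cloud_candidates(query_group, data_groups):
    """Cloud side: return all vertices whose group label matches the query."""
    return sorted(v for v, g in data_groups.items() if g == query_group)


# Hypothetical data graph vertices and their sensitive labels (client side).
data_labels = {1: "alice", 2: "bob", 3: "carol", 4: "dave"}
groups = build_label_groups(data_labels.values(), group_size=2)
data_groups = generalize(data_labels, groups)

# The client asks for "bob" but only reveals the group label to the cloud.
candidates = cloud_candidates(groups["bob"], data_groups)

# Client side: drop false positives using the true labels.
matches = [v for v in candidates if data_labels[v] == "bob"]
```

    Larger groups hide more about each label but return more false positives for the client to filter, which is the kind of trade-off a cost model would have to weigh.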

  • Oblivious RAM: A Dissection and Experimental Evaluation (Project Website), Talk
    By Zhao Chang, Dong Xie, Feifei Li
    In Proceedings of Very Large Data Bases (VLDB 2016), pages 1113-1124, New Delhi, India, September, 2016.

    Many companies choose the cloud as their data and IT infrastructure platform. Remote access to the data raises the issue of trust, and the potential risk of compromising sensitive information should not be underestimated. Despite the use of strong encryption schemes, adversaries can still learn valuable information about encrypted data by observing the data access patterns. To that end, one can hide the access patterns, which may leak sensitive information, using Oblivious RAMs (ORAMs). Numerous works have proposed different ORAM constructions. Nevertheless, many such ORAM constructions are of only theoretical interest and hence are not useful in practice. Several more practical ORAM constructions do exist, but they have never been thoroughly compared against one another and tested on large databases. There is no open source implementation of these schemes, making such a study challenging to carry out (since most ORAMs are quite contrived in terms of both theoretical analysis and practical implementations). These limitations make it difficult for researchers and practitioners to choose and adopt a suitable ORAM for their applications. To address this issue, we provide a thorough study of several practical ORAM constructions, and implement them under the same library. We perform extensive experiments to provide insights into their performance characteristics with respect to efficiency, scalability, and communication cost. Lastly, we plan to release our ORAM implementations through GitHub so that the communities at large may benefit from and contribute to an open source ORAM library under one unified framework.
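
    To make the notion of hiding access patterns concrete, here is a minimal sketch of the trivial linear-scan ORAM, in which every logical access reads and re-encrypts every block so the server sees the same sequence of operations regardless of which block was requested. It is not one of the constructions evaluated in the paper and not the project's library. The class name, the `trace` list that records what the server observes, and the toy XOR cipher built on SHAKE-256 are assumptions for illustration; the cipher in particular is a stand-in and should not be used to protect real data.

```python
import hashlib
import secrets


class LinearScanORAM:
    """Trivial ORAM: each access touches all N blocks (O(N) cost per access)."""

    def __init__(self, num_blocks, block_size=16):
        self.block_size = block_size
        self.key = secrets.token_bytes(32)
        # Server-side storage: one ciphertext per block, initially all zeros.
        self.server = [self._encrypt(bytes(block_size)) for _ in range(num_blocks)]
        # What the server observes: a list of (operation, physical index).
        self.trace = []

    def _pad(self, nonce, length):
        # Toy keystream (illustration only, not a vetted encryption scheme).
        return hashlib.shake_256(self.key + nonce).digest(length)

    def _encrypt(self, plaintext):
        nonce = secrets.token_bytes(16)  # fresh nonce, so ciphertexts never repeat
        pad = self._pad(nonce, len(plaintext))
        return nonce + bytes(a ^ b for a, b in zip(plaintext, pad))

    def _decrypt(self, ciphertext):
        nonce, body = ciphertext[:16], ciphertext[16:]
        pad = self._pad(nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, pad))

    def access(self, op, index, data=None):
        """Read or write logical block `index`; returns the block's prior content."""
        if op == "write" and len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        result = None
        for i in range(len(self.server)):
            block = self._decrypt(self.server[i])
            self.trace.append(("read", i))
            if i == index:
                result = block
                if op == "write":
                    block = data.ljust(self.block_size, b"\0")
            # Re-encrypt every block, changed or not, under a fresh nonce.
            self.server[i] = self._encrypt(block)
            self.trace.append(("write", i))
        return result


oram = LinearScanORAM(num_blocks=4)
oram.access("write", 2, b"secret")
trace_of_write = list(oram.trace)
oram.trace.clear()
first_block = oram.access("read", 0)
trace_of_read = list(oram.trace)
```

    In this sketch a write to block 2 and a read of block 0 leave identical traces, so the server cannot tell them apart from the sequence of operations alone. The price is a full scan per access, which is the overhead that more practical ORAM constructions aim to reduce.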