Medium: Collaborative: Seal: Secure Engine for AnaLytics - From Secure Similarity Search to Secure Data Analytics

Many organizations and individuals rely on the cloud to store their data and process their analytical queries. But such data may contain sensitive information. Not only do users want to conceal their data on a cloud, they may also want to hide analytical queries over their data, results of such queries, and data access patterns from a cloud service provider (that may be compromised either from within or by a third party).


  • NSF SaTC Program

  • People

    Feifei Li

    Zhao Chang
    PhD Student. Research Interest: Secure data systems and analytics.

    Jeff M. Phillips
    Associate Professor


  • Privacy Preserving Subgraph Matching on Large Graphs in Cloud, Talk
    By Zhao Chang,    Lei Zou,    Feifei Li
    In Proceedings of 35th ACM SIGMOD International Conference on Management of Data (SIGMOD 2016), pages 199-213, 2016.

    The wide presence of large graph data and the increasing popularity of storing data in the cloud drive the needs for graph query processing on a remote cloud. But a fundamental challenge is to process user queries without compromising sensitive information.This work focuses on privacy preserving subgraph matching in a cloud server. The goal is to minimize the overhead on both cloud and client sides for subgraph matching, without compromising users%u2019 sensitive information. To that end, we transform an originalgraph G into a privacy preserving graph Gk, which meets the requirement of an existing privacy model known as k-automorphism. By making use of the symmetry in a k-automorphic graph, a subgraph matching query can be efficiently answered using a graphGo, a small subset of Gk. This approach saves both space and query cost in the cloud server. In addition, we anonymize the original query graphs to protect their label information using label generalization technique. To reduce the search space for a subgraph matching query, we propose a cost model to select the more effectivelabel combinations. The effectiveness and efficiency of our method are demonstrated through extensive experimental results on real datasets.

  • Oblivious RAM: A Dissection and Experimental Evaluation (Project Website), Talk
    By Zhao Chang,    Dong Xie,    Feifei Li
    In Proceedings of Very Large Data Bases (VLDB 2016), pages 1113-1124, New Delhi, India, September, 2016.

    Many companies choose the cloud as their data and IT infrastructure platform. The remote access of the data brings the issue of trust, and the potential risk of compromising sensitive information should not be underestimated. Despite the use of strong encryption schemes, adversaries can still learn valuable information regarding encrypted data by observing the data access patterns. To that end, one can hide the access patterns, which may leak sensitive information, using Oblivious RAMs (ORAMs). Numerous works have proposed different ORAM constructions. Nevertheless, many such ORAM constructions are of only theoretical interest, hence, are notuseful in practice. Several more practical ORAM constructions do exist, but they have never been thoroughly compared against and tested on large databases. There are no open source implementation of these schemes, making such a study challenging to carry out (since most ORAMs are quite contrived in terms of both theoretical analysis and practical implementations).These limitations make it difficult for researchers and practitioners to choose and adopt a suitable ORAM for their applications. To address this issue, we provide a thorough study over several practical ORAM constructions, and implement them under the same library. We perform extensive experiments to provide insights into their performance characteristics with respect to efficiency, scalability, and communication cost. Lastly, we plan to release our ORAM implementations through GitHub so that the communities at large may benefit from and contribute to an open source ORAM library under one unified framework.

  • Secure DIMM: Moving ORAM Primitives Closer to Memory, Talk
    By Ali Shafiee,    Rajeev Balasubramonian,    Mohit Tiwari,    Feifei Li
    In Proceedings of 24th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2018), pages 428-440, February , 2018.

    As more critical applications move to the cloud, there is a pressing need to protect the data and computations hosted on the cloud. While cloud infrastructures are vulnerable to a variety of attacks, in this work, we focus on an attack model where an untrusted cloud operator has physical access to the server and can monitor the signals emerging from the proces- sor socket. Even if data packets are encrypted, the sequence of addresses touched by the program serves as an informa- tion side channel. To eliminate this side channel, Oblivious RAM constructs have been investigated for decades, but con- tinue to pose large overheads. In this work, we make the case that ORAM overheads can be dramatically reduced by mov- ing ORAM functionality into the memory system. We first de- sign a secure DIMM (or SDIMM) that uses commodity low- cost memory and an ASIC as a secure buffer chip. We then design two new ORAM protocols that leverage SDIMMs to reduce bandwidth, latency and energy per ORAM access. In both protocols, each SDIMM is responsible for part of the ORAM tree. Each SDIMM performs a number of ORAM op- erations that are not visible to the main memory channel. By having many SDIMMs in the system, we are able to achieve highly parallel ORAM operations. The main memory chan- nel uses its bandwidth primarily to service blocks requested by the CPU, and not to perform the many shuffle operations required by conventional ORAM. The new protocols guaran- tee the same obliviousness properties as Path ORAM. On a set of memory-intensive workloads, our ORAM protocols %u2013 Independent ORAM and Split ORAM %u2013 are able to improve performance and energy by 1.9%uFFFD and 2.55%uFFFD, compared to Freecursive ORAM.