CSR: Medium: Energy-Efficient Architectures for Emerging Big-Data Workloads


In modern server architectures, the processor socket and the memory system are implemented as separate modules. Data exchange between these modules is expensive: it is slow, it consumes a large amount of energy, and narrow data links impose long wait times. Emerging big-data workloads will require especially large amounts of data movement between the processor and memory. To reduce the cost of data movement for big-data workloads, this project designs new server architectures that leverage 3D stacking technology. The proposed approach, referred to as Near Data Computing (NDC), reduces the distance between a subset of computational units and a subset of memory, and can yield high efficiency for workloads that exhibit locality. The project will also develop new big-data algorithms and runtime systems that can exploit the properties of the new architectures.

The project will lead to technologies that can boost performance and reduce the energy demands of big-data workloads. Several reports have cited the importance of these workloads to national, industrial, and scientific computing infrastructures. The project outcomes will be integrated into University of Utah curricula and will play a significant role in a new degree program on datacenter design and operation. The PIs will broaden their impact by publicly distributing parts of their software infrastructure and by engaging in outreach programs that involve minorities and K-12 students.


Funding


  • co-PI, NSF CNS Program, 07/01/13-06/30/17, $873,286
  • http://www.nsf.gov/awardsearch/showAward?AWD_ID=1302663

People


    Feifei Li
    Associate Professor


    Robert Christensen
    PhD student. Research interests: interactive data analytics and systems.



Publications


  • Comparing Implementations of Near-Data Computing with In-Memory MapReduce Workloads
    By Seth H. Pugsley, Jeffrey Jestes, Rajeev Balasubramonian, Vijayalakshmi Srinivasan, Alper Buyuktosunoglu, Al Davis, Feifei Li
    Vol. 34, pages 44-52, IEEE Micro Special Issue on Big Data (IEEE MICRO), 2014.
    Abstract

    Moving computation near memory has become more practical because of 3D stacking technology. This article discusses in-memory MapReduce in the context of Near-Data Computing (NDC). The authors consider two NDC architectures: one that exploits Hybrid Memory Cube devices and one that does not. They examine the benefits of different NDC approaches and quantify the potential for improvement for an important emerging big-data workload.

  • Fixed-Function Hardware Sorting Accelerators for Near Data MapReduce Execution
    By Seth Pugsley, Arjun Deb, Rajeev Balasubramonian, Feifei Li
    In Proceedings of the 33rd IEEE International Conference on Computer Design (IEEE ICCD-33), pages 439-442, New York, October 2015.
    Abstract

    A large fraction of MapReduce execution time is spent processing the Map phase, and a large fraction of Map phase execution time is spent sorting the intermediate key-value pairs generated by the Map function. Sorting accelerators can achieve high performance and low power because they lack the overheads of sorting implementations on general purpose hardware, such as instruction fetch and decode. We find that sorting accelerators are a good match for 3D-stacked Near Data Processing (NDP) because their sorting throughput is so high that it saturates the memory bandwidth available in other memory organizations. The increased sorting performance and low power requirement of fixed-function hardware lead to very high Map phase performance and energy efficiency, reducing Map phase execution time by up to 92%, and reducing energy consumption by up to 91%. We further find that sorting accelerators in a less exotic form of NDP outperform more expensive forms of 3D-stacked NDP without accelerators. We also implement the accelerator on an FPGA to validate our claims.
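    To make the described bottleneck concrete, the following is a minimal, illustrative Python sketch (not the paper's implementation, and using a hypothetical word-count Map function) of a Map phase that emits intermediate key-value pairs and then sorts them by key. The sort call is the operation the fixed-function hardware accelerator would offload.

    ```python
    # Illustrative word-count Map phase. Per the abstract above, sorting the
    # intermediate key-value pairs dominates Map-phase execution time; a
    # fixed-function sorting accelerator would replace the sort step below.

    def map_phase(records):
        """Apply the Map function to each record, then sort intermediate pairs by key."""
        intermediate = []
        for record in records:
            # Hypothetical user-supplied Map function: emit (word, 1) per word.
            for word in record.split():
                intermediate.append((word, 1))
        # Sorting groups equal keys together for the Reduce phase;
        # this is the step the accelerator targets.
        intermediate.sort(key=lambda kv: kv[0])
        return intermediate

    pairs = map_phase(["the quick fox", "the lazy dog"])
    ```

    In software this sort costs instruction fetch, decode, and branch overhead on every comparison; the paper's point is that a dedicated sorter removes those overheads and can saturate the bandwidth of 3D-stacked memory.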