USING COMPRESSION TO IMPROVE CHIP MULTIPROCESSOR Sameh Meligy, Tarek Khorshed, Moustafa Youssef, Moustafa

  • View

  • Download

Embed Size (px)

Text of USING COMPRESSION TO IMPROVE CHIP MULTIPROCESSOR Sameh Meligy, Tarek Khorshed, Moustafa Youssef,...




    Alaa R. Alameldeen

    A dissertation submitted in partial fulfillment of

    the requirements for the degree of

    Doctor of Philosophy

    (Computer Sciences)

    at the



  •  Copyright by Alaa R. Alameldeen 2006

    All Rights Reserved

  • i



    gle chip


    uch as

    he and


    s in the




    p pin




    CMP is

    pact of


    d. We

    he com-

    k com-

    Chip multiprocessors (CMPs) combine multiple processors on a single die, typically with private leve

    caches and a shared level-two cache. However, the increasing number of processors cores on a sin

    increases the demand on two critical resources: the shared L2 cache capacity and the off-chip pin

    width. Demand on these critical resources is further exacerbated by latency-hiding techniques s

    hardware prefetching. In this dissertation, we explore using compression to effectively increase cac

    pin bandwidth resources and ultimately CMP performance.

    We identify two distinct and complementary designs where compression can help improve CMP p

    mance: Cache Compression and Link Compression. Cache compression stores compressed line

    cache, potentially increasing the effective cache size, reducing off-chip misses and improving p

    mance. On the downside, decompression overhead can slow down cache hit latencies, possibly de

    performance. Link (i.e., off-chip interconnect) compression compresses communication messages

    sending to or receiving from off-chip system components, thereby increasing the effective off-chi

    bandwidth, reducing contention and improving performance for bandwidth-limited configurations. W

    compression can have a positive impact on CMP performance, practical implementations of compr

    raise a few concerns: (1) Compression algorithms have too high an overhead to implement at the

    level; (2) compression overhead can degrade performance; (3) the potential for compression on a

    unknown; (4) most benefits of compression can be achieved by hardware prefetching; and (5) the im

    compression on a balanced CMP design is not well understood.

    In this dissertation, we make five contributions that address the above concerns. We propose a com

    L2 cache design based on a simple compression algorithm with a low decompression overhea

    develop an adaptive compression scheme that dynamically adapts to the costs and benefits of cac

    pression, and employs compression only when it helps performance. We show that cache and lin

  • s. We

    m that

    duct of



    ii pression both combine to improve CMP performance for commercial and (some) scientific workload

    show that compression interacts in a strong positive way with hardware prefetching, whereby a syste

    implements both compression and hardware prefetching can have a higher speedup than the pro

    speedups of each scheme alone. We also provide a simple analytical model that helps provide qu

    intuition into the trade-off between cores, caches, communication and compression, and use full-

    simulation to quantify this trade-off for a set of commercial workloads.

  • iii


    . I can-



    t com-

    rs, and

    me as I




    o work

    n to


    , Guri


    nt, and

    o me



    e to

    Thank God that I have received a lot of help and support from many people over the past few years

    not imagine making it this far without such support. It is with great pleasure that I thank and acknow

    those who contributed to my success, or as many of them as I can remember.

    I am deeply grateful to my advisor, Prof. David Wood. He has been an outstanding mentor who pro

    me with a lot of support and guidance. I have learned a lot under his supervision, and not just abou

    puter architecture. He taught me how to conduct research, guided me in writing research pape

    trained me on how to present my research to others. I am thankful that he gave me as much of his ti

    asked for, and provided me with support when I needed it the most. On both a professional and a p

    level, I greatly enjoyed being his student for all these years!

    The University of Wisconsin is a great place to learn about computer science in general, and com

    architecture in particular. I would like to thank many outstanding professors at UW-Madison who ta

    me a lot and helped me throughout my Ph.D. program. I feel very fortunate to have had the chance t

    with and learn from Prof. Mark Hill, the other director of the Multifacet project. He is a great perso

    work with. I benefited a lot from his sound advice, excellent insight, and valuable feedback on my

    and especially my presentation skills. I would also like to thank Professors Jim Goodman, Jim Smith

    Sohi and Mikko Lipasti for building and maintaining an excellent computer architecture program. I t

    Prof. Jim Goodman for teaching me about multiprocessors, for his strong support and encourageme

    for his useful feedback on my work. I also thank Prof. Guri Sohi who has been extremely helpful t

    since my preliminary exam. I would also like to thank all my committee members: Professors Mark

    Mikko Lipasti, Jim Smith and Guri Sohi for their help, support, useful feedback, and flexibility in sche

    ing my preliminary and final oral exams. I thank them especially for encouraging and challenging m

  • ohi,



    ate to

    ul to



    l as his

    ng me

    for not


    y inter-

    om his

    uite a

    , as







    iv improve my dissertation research. I would also like to thank Professors David Wood, Mark Hill, Guri S

    Jim Goodman, Mikko Lipasti and Jeff Naughton for taking the time to write reference letters for me.

    I would also like to acknowledge many other UW-Madison faculty members. I thank Professors

    Shulte, Charlie Fischer and Mary Vernon for their useful feedback on my research. I also thank Prof.

    DeWitt who was the best instructor I had in graduate school, as well as all professors I was fortun

    learn from.

    I received a lot of support from student members of the Multifacet project. Milo Martin was very helpf

    me since the first day I joined the group. Milo and Dan Sorin were both good mentors to me. They

    were the first to think about workload variability, and they pointed me in the right direction to continue

    work. Carl Mauer has been a good friend, and I really appreciate his kindness towards me, as wel

    helping me solve various technical problems. Anastassia Ailamaki was a tremendous help in getti

    started on my database workloads work. I thank Brad Beckmann, my office-mate for the past year,

    complaining—at least to my face!—when I was bothering him with questions. I also thank him for he

    me use his code to get started in studying hardware prefetching. Kevin Moore engaged me in man

    esting discussions. Min Xu has been a source of tremendous technical help, and I benefited a lot fr

    knowledge. I also enjoyed listening to Mike Marty’s interesting opinions. Ross Dickson helped me q

    bit with the simulation infrastructure. I would like to acknowledge other former Multifacet members

    well as current members Luke Yen, Dan Gibson, Jayaram Bobba, Michelle Moravan, and Andy P

    Many thanks are due to project assistants Alicia Walley, Erin Miller, Carrie Pritchard and Caitlin Sc

    for their help with project administration and proof-reading.

    I’ve also had the opportunity to meet many excellent students in Madison. While I cannot possibly me

    the names of all those who gave me moral support, help and encouragement, I would like to spec

    thank a few people. My friend Ashraf Aboulnaga (currently Prof. Aboulnaga!) has helped me a lot d

    my first three years in Madison. I was glad I had the opportunity to work with him on a database res

  • uer,






    up. I



    , Greg



    to the

    d Paul

    l: the



    e bad


    her. I

    d, and

    v publication, much to my surprise! I am grateful to Craig Zilles, Adam Butts, Brian Fields, Carl Ma

    Brad Beckmann and Philip Wells for running the always-informative, often-entertaining architecture

    ing group, and to Allison Holloway for organizing the architecture students lunch. I also had a lot of i

    esting discussions with some of my colleagues here that I would like to thank: Trey Cain, Ravi Ra

    Ashutosh Dhodapkar, Kevin Lepak, Shiliang Hu, Tejas Karkhanis, Collin McCurdy, Jichuan Ch

    Gordie Bell, Sai Balakrishnan, Matt Allen, Koushik Chakraborty, and all others I might have forgo

    Trey Cain, Carl Mauer and I had a lot of enlightening discussions in our