
Fast Multiprocessor Memory Allocation and Garbage Collection

Hans-J. Boehm
Internet and Mobile Systems Laboratory
HP Laboratories Palo Alto
HPL-2000-165
December 8th, 2000*

E-mail: [email protected]

garbage collection, memory allocation, multiprocessors, threads

We extended our garbage collecting memory allocator to provide good performance for multi-threaded applications on multiprocessors. The basic design is similar to the approach previously pursued in [12]. However, we concentrate on issues important to more common small-scale multiprocessors, and on specific issues not reported elsewhere. We argue that a reasonable level of garbage collector scalability can be achieved with relatively minor additions to the underlying collector code. Furthermore, the scalable collector does not need to be appreciably slower on a uniprocessor.

Since our collector can serve as a plug-in replacement for malloc/free, we have the opportunity to compare it to scalable malloc/free implementations, notably Hoard [3]. Somewhat surprisingly, our collector significantly outperforms Hoard in some tests, a property that is mostly shared by the garbage collecting allocator in [ETY97]. We argue that garbage collectors currently require significantly less synchronization than explicit allocators, but that it may be possible to derive significantly faster explicit allocators from this observation.

Speedy access to thread-local storage is a significant issue in the design of allocators that must conform to standard calling conventions. We present empirical evidence that, at least in the presence of a garbage collector, this can often be accomplished faster in a thread-independent way than through the standard thread library facilities, casting some doubt on the utility of the latter.

* Internal Accession Date Only
Approved for External Publication
Copyright Hewlett-Packard Company 2000



Fast Multiprocessor Memory Allocation and Garbage Collection

Hans-J. Boehm
Hewlett-Packard Laboratories
1501 Page Mill Rd.
Palo Alto, CA 94304

Hans [email protected]

ABSTRACT

1. INTRODUCTION



2. RELATED WORK



3. CONTEXT

4. PARALLEL ALLOCATION

5. PARALLEL MARKING



[Figure: local mark stacks (one per marker thread) and the global mark queue; region labels "Cleared" and "To be traced"]

6. ISSUES AFFECTING ABSOLUTE PERFORMANCE

6.1 Mark bit representation



6.2 Thread-specific-data



7. COLLECTOR MEASUREMENTS

7.1 Allocators

7.2 Benchmarks



7.2.1 Ghostscript

7.2.2 MT GCBench2

7.2.3 Larson



7.2.4 Larson-small

8. OBSERVATIONS ABOUT EXPLICIT DEALLOCATION



9. REFERENCES
