"Measuring the Effects of Thread Placement on the Kendall Square KSR1" by Amy Apon, T D. Wagner et al.

Publications

Title

Measuring the Effects of Thread Placement on the Kendall Square KSR1

Authors

Amy Apon, Clemson University
T D. Wagner, Vanderbilt University
E Smirni, Vanderbilt University
M Madhukar, Vanderbilt University
L W. Dowdy, Vanderbilt University

Document Type

Article

Publication Date

8-1993

Abstract

This paper describes a measurement study of the effects of thread placement on memory access times on the Kendall Square multiprocessor, the KSR1. The KSR1 uses a conventional shared memory programming model in a distributed memory architecture. The architecture is based on a ring of rings of 64-bit superscalar microprocessors. The KSR1 has a Cache-Only Memory Architecture (COMA). Memory consists of the local cache memories attached to each processor. Whenever an address is accessed, the data item is automatically copied to the local cache memory module, so that access times for subsequent references will be minimal. If a local cache has space allocated for a particular data item, but does not have a current valid copy of that data item, then it is possible for the cache to acquire a valid read-only copy before it is requested by the local processor due to a request by a different processor that happens to pass by on the ring. This automatic prefetching can greatly reduce the average time for a thread to acquire data items. Because of the automatic prefetching, the time required to obtain a valid copy of a data item does not depend simply on the distance from the owner of the data item, but also depends on the placement and number of other processing threads which share the same data item. Also, the strategic placement of processing threads helps programs take advantage of the unique features of the memory architecture which help eliminate memory access bottlenecks for shared data sets. Experiments run on the KSR1 across a wide variety of thread configurations show that shared memory access is accelerated through strategic placement of threads which share data. The results indicate strategies for improving the performance of application programs, and illustrate that KSR1 memory access times can remain nearly constant even when the number of participating threads increases.
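The automatic prefetching described in the abstract can be illustrated with a toy model. The sketch below is not taken from the paper and does not model the real KSR1 hardware: it assumes a single one-way ring (the KSR1 uses a ring of rings), assumes every response is supplied by the owner, and assumes each reader already has space allocated for the data item. The names `ring_path` and `count_ring_requests` are hypothetical.

```python
def ring_path(src, dst, n_cells):
    """Cells visited travelling one way around the ring from src to dst
    (src excluded, dst included)."""
    path = []
    pos = src
    while pos != dst:
        pos = (pos + 1) % n_cells
        path.append(pos)
    return path


def count_ring_requests(n_cells, owner, readers, prefetch=True):
    """Count how many reads must go out on the ring in this toy model.

    Assumptions (not from the paper): one unidirectional ring, the owner
    always supplies the data, and every reader holds an allocated but
    invalid slot for the item before the run starts.
    """
    valid = {owner}              # cells holding a valid copy
    allocated = set(readers)     # cells with space allocated for the item
    requests = 0
    for reader in readers:       # readers access the item in this order
        if reader in valid:
            continue             # local hit, no ring traffic
        requests += 1
        # The response travels from the owner to the requesting cell.
        for cell in ring_path(owner, reader, n_cells):
            # With prefetching on, any allocated cell the response passes
            # picks up a valid read-only copy.
            if cell == reader or (prefetch and cell in allocated):
                valid.add(cell)
    return requests


if __name__ == "__main__":
    # The farthest reader goes first, so its response passes the other two.
    print(count_ring_requests(8, owner=0, readers=[6, 2, 4]))
    print(count_ring_requests(8, owner=0, readers=[6, 2, 4], prefetch=False))
```

In this toy model the number of ring requests depends on where the sharing threads sit relative to the owner and on the order in which they read, which mirrors the abstract's point that access time is not determined by distance from the owner alone.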

Comments

This article has been placed in the public domain courtesy of Oak Ridge National Laboratory, U.S. Dept. of Energy.
