spcl.inf.ethz.ch @spcl_eth
ATOMICS: PERFORMANCE DIMENSIONS
Cache coherence state?
Modified: in one cache and dirty
Exclusive: in one cache and clean
Shared: in >1 cache and clean
Invalid: garbage data

ATOMICS: PERFORMANCE DIMENSIONS
Architecture

RESEARCH QUESTIONS
How do we model the performance of atomics?
What is the performance difference between various atomics?
What is the influence of various parameters and mechanisms?

LATENCY MODEL
[Diagram labels: Core, Cache … Cache, Cache line, Atomic, Cache coherence state]
Read for ownership = max(read latency, invalidation latency)
Execute = constant
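The two terms on this slide can be combined into a small function. This is a minimal sketch, assuming that the read-for-ownership and execute costs simply add up (the slide lists both terms but does not state how they combine). The name `atomic_latency` and all numeric values are hypothetical and chosen only for illustration.

```python
def atomic_latency(read_latency, invalidation_latency, execute_latency):
    """Estimated latency of one atomic under the slide's model.

    Read for ownership takes as long as the slower of fetching the line
    and invalidating other copies. The execute step is modeled as a
    constant. Adding the two terms is an assumption of this sketch.
    """
    read_for_ownership = max(read_latency, invalidation_latency)
    return read_for_ownership + execute_latency


# Hypothetical latencies in nanoseconds, not measurements from the talk.
shared_line = atomic_latency(
    read_latency=20.0, invalidation_latency=35.0, execute_latency=5.0
)
print(shared_line)  # 40.0
```

With these made-up inputs the invalidation dominates the read, so the estimate is 35 + 5 = 40.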

LATENCY MODEL
EXCLUSIVE OR MODIFIED STATE
[Diagram labels: Core, Cache … Cache, Cache line, Atomic, Cache coherence state]
Read for ownership = read latency
Execute = constant
[Plot legend: observed data, mean of observed data, predictions]
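The special case on this slide follows from the general formula: a line in Exclusive or Modified state lives in a single cache, so there are no other copies to invalidate and only the read latency remains. The sketch below makes that case distinction explicit. The `State` enum and the function `read_for_ownership` are hypothetical names, and falling back to the general `max(...)` formula for the remaining states is an assumption of this sketch.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def read_for_ownership(state, read_latency, invalidation_latency):
    """Read-for-ownership cost as a function of the coherence state."""
    if state in (State.MODIFIED, State.EXCLUSIVE):
        # Single copy: nothing to invalidate, so only the read counts.
        return read_latency
    # Assumption: all other states use the general formula.
    return max(read_latency, invalidation_latency)
```

For example, with a hypothetical read latency of 20 and invalidation latency of 35, the Exclusive case yields 20 while the Shared case yields 35.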

LATENCY
HASWELL, EXCLUSIVE
[Plot]

LATENCY
BULLDOZER, EXCLUSIVE
[Plot; labels: FAA, CAS]

LATENCY
HASWELL, EXCLUSIVE
Alignment?
[Plot]

LATENCY
BULLDOZER, EXCLUSIVE
Operand size?
[Plot; labels: 64 bit, 128 bit]

BANDWIDTH
HASWELL, ATOMICS
[Plot]

CONCLUSIONS
PERFORMANCE INSIGHTS
Different atomics have the same latency in most scenarios
CAS is the fastest in some cases
Unaligned atomics should be avoided at all costs
No parallel execution (low bandwidth), even if there are no data dependencies
Small operand sizes give the best performance
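To act on the alignment advice, one can check where an atomic operand would land in memory before using it. The following is a minimal sketch with hypothetical helper names. The 64-byte cache line is an assumption (it is the usual line size on x86 processors), and the slides do not say which kind of misalignment is the costly one, so both checks are shown.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86 cache line size


def is_naturally_aligned(address, operand_bytes):
    """True if the address is a multiple of the operand size."""
    return address % operand_bytes == 0


def crosses_cache_line(address, operand_bytes, line_bytes=CACHE_LINE_BYTES):
    """True if the operand's first and last byte fall in different lines."""
    first_line = address // line_bytes
    last_line = (address + operand_bytes - 1) // line_bytes
    return first_line != last_line
```

For instance, an 8-byte operand at address 60 is not naturally aligned and spans two 64-byte lines, while the same operand at address 56 is aligned and stays within one line.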