SIGNET: NETWORK-ON-CHIP FILTERING FOR COARSE VECTOR DIRECTORIES
Natalie Enright Jerger, University of Toronto
Interaction of Coherence and Network
  - Cache coherence protocol drives network-on-chip traffic
  - Scalable coherence protocols needed for many-core architectures
  - Consider interconnection network optimizations to help facilitate scalable coherence
DATE 2010, Natalie Enright Jerger
Talk Outline
  - Introduction
  - Network-on-Chip Challenges with Scalable Coherence Protocol
  - SigNet Architecture: Network filtering solution
  - Evaluation
  - Conclusion
Many-Core Cache Coherence Challenges
  Broadcast
    - Good latency
    - Poor scaling due to bandwidth requirements
  Directory
    - Good scalability due to point-to-point communication
    - Storage overheads
  [Figure: directory protocol message sequence, labeled Store Miss, ForwardX, Response]
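The bandwidth contrast between the two protocols can be illustrated with a rough message count. This is a minimal sketch, not a model from the talk: the function names are hypothetical, the broadcast count covers request messages only, and the directory count assumes the three-message sequence suggested by the figure labels (Store Miss, ForwardX, Response) with a single owner to contact.

```python
def broadcast_request_messages(num_cores):
    """Requests sent when a miss is broadcast to every other core.

    Assumption: one request per remote core; responses are not counted.
    """
    return num_cores - 1


def directory_messages():
    """Messages for one store miss resolved through the directory.

    Assumption: Store Miss (requester to directory), ForwardX (directory
    to the single owner), Response (owner to requester).
    """
    return 3


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, broadcast_request_messages(n), directory_messages())
```

Under these assumptions the broadcast cost grows linearly with the core count while the directory cost stays constant, which is the scaling argument the slide makes.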
Scalable Cache Coherence
  Directory protocol storage overheads
    - Single bit per core in sharing vector (full map)
    - 256 cores ➔ 32 bytes of overhead per cache line!
  Coarse Vector Directories
    - Dir_iCV_r
        i: # of pointers
        r: # of cores in region
    - Example: Dir_2CV_2
        Requires 1/2 as much storage
  [Figure annotations]
    - Full map: cores 1, 2, 5 & 15 share cache line
    - 2 pointers: only 2 sharers, 1 & 15
    - 3rd sharer: overflow to coarse vector; each bit represents a region (e.g. cores 0 & 1)
    - 3 sharers but 6 invalidations
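The storage arithmetic and the overflow behaviour above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation from the talk: the class and function names are hypothetical, the entry is assumed to reuse the same bits for either i exact pointers or a coarse vector with one bit per r-core region, mode and state bits are ignored, and the 16-core system and the choice of core 5 as the third sharer are assumptions made to match the figure annotations.

```python
def full_map_bits(num_cores):
    """Full-map sharing vector: one bit per core."""
    return num_cores


def dir_cv_bits(num_cores, i, r):
    """Bits for a Dir_iCV_r entry (assumption: pointer and coarse-vector
    modes share storage, so the entry needs the larger of the two)."""
    pointer_width = (num_cores - 1).bit_length()  # bits to name one core
    pointer_bits = i * pointer_width
    coarse_bits = -(-num_cores // r)  # ceiling division: one bit per region
    return max(pointer_bits, coarse_bits)


class DirCVEntry:
    """One directory entry: exact pointers until overflow, then regions."""

    def __init__(self, num_cores, i, r):
        self.num_cores = num_cores
        self.i = i
        self.r = r
        self.pointers = []   # exact sharer ids while in pointer mode
        self.regions = None  # set of region indices after overflow

    def add_sharer(self, core):
        if self.regions is not None:
            self.regions.add(core // self.r)
        elif core in self.pointers:
            return
        elif len(self.pointers) < self.i:
            self.pointers.append(core)
        else:
            # Overflow: fall back to one bit per region of r cores.
            self.regions = {c // self.r for c in self.pointers + [core]}
            self.pointers = []

    def invalidation_targets(self):
        """Cores that must receive an invalidation on a write."""
        if self.regions is None:
            return sorted(self.pointers)
        return sorted(
            core
            for region in self.regions
            for core in range(region * self.r,
                              min((region + 1) * self.r, self.num_cores))
        )


if __name__ == "__main__":
    print(full_map_bits(256) // 8)                    # bytes per cache line
    print(dir_cv_bits(16, 2, 2), full_map_bits(16))   # Dir_2CV_2 vs full map
    entry = DirCVEntry(num_cores=16, i=2, r=2)
    for core in (1, 15, 5):
        entry.add_sharer(core)
    print(entry.invalidation_targets())
```

With 16 cores, the sketch gives 8 bits for Dir_2CV_2 against 16 for the full map, and after the third sharer is added the entry targets cores 0, 1, 4, 5, 14 and 15: three sharers, six invalidations. The three extra invalidations go to cores that do not hold the line, which is the unnecessary traffic that motivates filtering in the network.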