GNNs for HL-LHC Tracking ExaTrkX @ Berkeley Lab Daniel Murnane Office of BERKELEY LAB 1 Science
Goal
Sub-second processing of HL-LHC hit data into:
• Seeds (i.e. triplets) for further processing with traditional techniques, AND/OR
• Tracks, where each hit is assigned to exactly one track

The Current Pipeline
1. Raw hit data embedded
2. Filter likely, adjacent doublets
3. Train/classify doublets in GNN
4. Filter, convert to triplets
5. Train/classify triplets in GNN
6. Apply cut for seeds, and/or DBSCAN for track labels

Dataset
• “TrackML Kaggle Competition” dataset
• Generated by simulation
• 8000 collisions to train on
• Each collision has up to 100,000 hits from around 10,000 particles

Dataset
• Ideal final result is a “TrackML score” T ∈ [0, 1]
• All hits belonging to the same track labelled with the same unique label ⇒ T = 1
• We use the barrel as a test case, and ignore noise

Embedding + MLP Construction
• Won’t give any detail (Nick’s talk next on embeddings)
• Generally:
1. For each hit in event, embed features (co-ordinates, cell direction data, etc.) into N-dimensional space
2. Associate hits from same tracks as close in N-dimensional distance
3. Score each hit within embedding neighbourhood against the “seed” hit at centre
4. Filter by score, to create a set of doublets for the neighbourhood
5. All doublets in event generate a graph, converted to a directed graph (by ordering layers)

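The five steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ExaTrkX code: the 2-D "embedded" coordinates are invented, `score_pair` is a hypothetical stand-in for the trained MLP filter, and the radius and score cut are arbitrary values.

```python
import math

# Hypothetical hits: (hit_id, layer, embedded coordinates). In the real
# pipeline the coordinates would come from a learned embedding.
HITS = [
    (0, 0, (0.00, 0.00)),
    (1, 1, (0.10, 0.05)),
    (2, 2, (0.22, 0.12)),
    (3, 0, (2.00, 2.00)),
    (4, 1, (2.10, 1.95)),
]


def score_pair(a, b):
    """Stand-in for the MLP filter: closer pairs score higher, in (0, 1]."""
    return math.exp(-math.dist(a[2], b[2]))


def build_doublets(hits, radius=0.5, score_cut=0.8):
    """Return directed doublets (inner_id, outer_id), ordered by layer."""
    doublets = set()
    for seed in hits:  # every hit takes a turn as the "seed" at the centre
        for other in hits:
            if other[0] == seed[0] or other[1] == seed[1]:
                continue  # skip the seed itself and same-layer hits
            if math.dist(seed[2], other[2]) > radius:
                continue  # outside the embedding neighbourhood
            if score_pair(seed, other) < score_cut:
                continue  # removed by the score filter
            # Direct the edge from the inner layer to the outer layer.
            inner, outer = sorted((seed, other), key=lambda h: h[1])
            doublets.add((inner[0], outer[0]))
    return sorted(doublets)


doublets = build_doublets(HITS)
print(doublets)
```

With these toy values the pair (0, 2) falls inside the neighbourhood radius but is dropped by the score filter, so only layer-adjacent doublets survive.
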
Segmentation
A full graph from the embedding does not fit on a single GPU. The event graphs are therefore segmented according to how large the GNN model is expected to be.
[Figure panels: Hard cut; One-directional soft cut; Bi-directional soft cut]

Previous ML Approaches
• Tracks as images (CNN)
• Tracks as sequences of points (LSTM)

Graph Neural Network for Edge Classification
Classify each edge with a score in [0, 1]:
• score > cut: true
• score < cut: fake

Passing information around the graph gives it learning power
• Can make a node “aware” of its neighbours by concatenating the neighbouring hidden features
• Iterating this neighbourhood learning passes information around the graph
• Can be considered a generalisation of a flat CNN convolution

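A toy sketch of the idea above: one step concatenates a node's own features with the sum of its neighbours' features, and repeating the step carries information further along the graph. The path graph, the one-hot features, and the averaging used in place of a learned layer are all assumptions made for illustration.

```python
def message_passing_step(hidden, neighbours):
    """One round: each node combines its own features with its neighbours'."""
    new_hidden = {}
    for node, own in hidden.items():
        # Sum the hidden features of all neighbouring nodes.
        agg = [0.0] * len(own)
        for nb in neighbours[node]:
            agg = [a + x for a, x in zip(agg, hidden[nb])]
        # Concatenate: [own features | aggregated neighbour features].
        concat = own + agg
        # Stand-in for a learned layer mapping the concatenation back to
        # the original width (here: the average of the two halves).
        width = len(own)
        new_hidden[node] = [
            0.5 * (concat[i] + concat[width + i]) for i in range(width)
        ]
    return new_hidden


# Path graph 0 - 1 - 2 - 3 with one-hot input features per node.
NEIGHBOURS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
h0 = {n: [1.0 if i == n else 0.0 for i in range(4)] for n in NEIGHBOURS}

h1 = message_passing_step(h0, NEIGHBOURS)
h2 = message_passing_step(h1, NEIGHBOURS)
h3 = message_passing_step(h2, NEIGHBOURS)

# Node 0 only "sees" node 2 after two rounds and node 3 after three.
print(h1[0], h2[0], h3[0])
```
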
GNN Edge prediction architecture
• Message Passing
• Attention Message Passing
• Attention Message Passing with Residuals (have found best efficiency & purity performance)
• Attention Message Passing with Recursion

Edge attention architecture
• Input node features
• Hidden node features
• Hidden edge features
• Edge score
• Attention aggregation
• New hidden node features
• New hidden edge features
• New edge score
Repeated x n iterations (n is a hyperparameter)

[Figure, built up over several slides with the same list: per-node hidden feature vectors, per-edge scores in [0, 1] (example values such as 0.4, 0.6, 0.8), and neighbouring hidden features summed into each node, repeated x n iterations.]

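The loop listed above (edge score, attention aggregation, new hidden node features, new edge score, repeated n times) might look roughly like this in plain Python. All of it is a sketch under assumptions: `edge_score` and the `tanh` node update are hypothetical stand-ins for learned networks, the hidden edge features are folded into the score function for brevity, and the toy graph is invented.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def edge_score(h_src, h_dst):
    """Stand-in for the learned edge network: sigmoid of a dot product."""
    return sigmoid(sum(a * b for a, b in zip(h_src, h_dst)))


def attention_step(hidden, edges):
    """Score every edge, then aggregate neighbours weighted by that score."""
    scores = {(s, d): edge_score(hidden[s], hidden[d]) for s, d in edges}
    new_hidden = {}
    for node, own in hidden.items():
        agg = [0.0] * len(own)
        for (s, d), w in scores.items():
            if s == node:
                nb = d
            elif d == node:
                nb = s
            else:
                continue
            # Attention: low-score edges contribute little to the sum.
            agg = [a + w * x for a, x in zip(agg, hidden[nb])]
        # Stand-in for the learned node network (own + aggregated, squashed).
        new_hidden[node] = [math.tanh(o + a) for o, a in zip(own, agg)]
    return new_hidden


def run_gnn(hidden, edges, n_iterations=3):
    """Iterate the attention step, then return the final edge scores."""
    for _ in range(n_iterations):
        hidden = attention_step(hidden, edges)
    return {(s, d): edge_score(hidden[s], hidden[d]) for s, d in edges}


# Toy graph: three nodes with 2-D hidden features, two candidate edges.
HIDDEN = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [-1.0, 0.5]}
EDGES = [(0, 1), (1, 2)]
final_scores = run_gnn(HIDDEN, EDGES)
print(final_scores)
```
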
Doublet GNN Performance
Threshold    0.5      0.8
Accuracy     0.9761   0.9784
Purity       0.9133   0.9694
Efficiency   0.9542   0.9052
Two points to keep in mind
• In the past, graphs have been constructed with a heuristic procedure that had much lower efficiency than the learned embedding. This GNN is classifying a ∼96% efficient doublet dataset
• These metrics are not the end product: we use the scores of the doublets to create triplets without losing efficiency

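For reference, the three metrics in the table can be computed from edge scores and truth labels as below, taking purity as TP / (TP + FP) and efficiency as TP / (TP + FN). The scores and labels here are invented toy values, and efficiency in this sketch is relative to the edges present in the input graph.

```python
def edge_metrics(scores, truth, cut):
    """Accuracy, purity and efficiency of edge classification at a cut."""
    tp = fp = tn = fn = 0
    for score, is_true in zip(scores, truth):
        predicted_true = score > cut
        if predicted_true and is_true:
            tp += 1
        elif predicted_true and not is_true:
            fp += 1
        elif not predicted_true and is_true:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "purity": tp / (tp + fp) if (tp + fp) else 0.0,
        "efficiency": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy edge scores and truth labels (1 = true edge, 0 = fake edge).
SCORES = [0.95, 0.85, 0.70, 0.60, 0.30, 0.10]
TRUTH = [1, 1, 1, 0, 0, 0]

loose = edge_metrics(SCORES, TRUTH, cut=0.5)
tight = edge_metrics(SCORES, TRUTH, cut=0.8)
print(loose)
print(tight)
```

Raising the cut in this toy case trades efficiency for purity, the same direction as the 0.5 and 0.8 columns in the table.
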
Why not simply join together our doublet predictions?
[Figure: hits x1 to x4 ordered by distance from detector centre; doublet (x1, x2) scores 0.99, and the two competing doublets out of x2 score 0.01 and 0.99]
Pretty easy decision

Doublet choice can be ambiguous
[Figure: hits x1 to x4 ordered by distance from detector centre; doublet (x1, x2) scores 0.99, and the two competing doublets out of x2 score 0.87 and 0.84]
Not so easy… so teach the network how to combine

But a GNN doesn’t know about “triplets”
[Figure: hits x1 to x4 ordered by distance from detector centre; the two competing doublets out of x2 are marked “?”]
A GNN only knows about nodes and edges

Moving to a “doublet graph” gives us back GNN power
[Figure: the hit graph on x1 to x4 redrawn as a doublet graph with nodes (x1, x2) scoring 0.99, (x2, x4) scoring 0.87 and (x2, x3) scoring 0.84]
Now… nodes represent doublets, edges represent triplets

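One way to picture the conversion above: every directed doublet becomes a node, and two doublets are joined by an edge when the outer hit of one is the inner hit of the other, which is exactly a triplet. The function name and the hit labels below are illustrative assumptions, not the pipeline's own code.

```python
def to_doublet_graph(doublets):
    """Turn directed doublets (inner, outer) into a doublet graph.

    Nodes are the doublets themselves; each edge joins two doublets that
    share a middle hit, so each edge stands for one triplet.
    """
    by_inner_hit = {}
    for doublet in doublets:
        by_inner_hit.setdefault(doublet[0], []).append(doublet)

    triplet_edges = []
    for first in doublets:
        # Any doublet starting where `first` ends continues the chain.
        for second in by_inner_hit.get(first[1], []):
            triplet_edges.append((first, second))
    return list(doublets), triplet_edges


# The example from the slide: x2 has two candidate continuations.
DOUBLETS = [("x1", "x2"), ("x2", "x3"), ("x2", "x4")]
nodes, edges = to_doublet_graph(DOUBLETS)

# Recover the hit triplets that the edges stand for.
triplets = [(a[0], a[1], b[1]) for a, b in edges]
print(triplets)
```
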
Triplet Propaganda
Doublet GNN
Threshold    0.5      0.8
Accuracy     0.9761   0.9784
Purity       0.9133   0.9694
Efficiency   0.9542   0.9052
* relative
Triplet GNN
Threshold    0.5      0.8
Accuracy     0.9960   0.9957
Purity       0.9854   0.9923
Efficiency   0.9939   0.9850
* relative

Triplet propaganda
Gold: Unambiguously correct triplet or quadruplet
Other colours: False positive/negative
Key:
• Silver: Ambiguously correct triplet or quadruplet (i.e. edge shared by correct triplet and false positive triplet)
• Bronze dashed: Correct triplet, but missed quadruplet (i.e. edge shared by correct triplet and false negative triplet)
• Red: Completely false positive triplet
• Blue dashed: Completely false negative triplet

Triplet GNN improves doublet GNN results
Black: Triplet classifier correctly labelled, doublet classifier mislabelled
Red: Doublet classifier correctly labelled, triplet classifier mislabelled
In this graph, the triplet classifier:
• Fixes 389 edges
• Worsens 10 edges

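The "fixes" and "worsens" counts above can be tallied as in the sketch below, assuming the output of both classifiers has already been expressed as one true/fake label per edge of the same graph. The labels here are invented toy values, not the data behind the 389 and 10 figures.

```python
def compare_classifiers(truth, doublet_pred, triplet_pred):
    """Count edges the triplet classifier fixes or worsens vs the doublet one."""
    fixed = worsened = 0
    for t, d, p in zip(truth, doublet_pred, triplet_pred):
        if p == t and d != t:
            fixed += 1  # triplet classifier right, doublet classifier wrong
        elif d == t and p != t:
            worsened += 1  # doublet classifier right, triplet classifier wrong
    return fixed, worsened


# Toy per-edge labels (1 = true edge, 0 = fake edge).
TRUTH = [1, 1, 0, 0, 1]
DOUBLET_PRED = [1, 0, 1, 0, 1]
TRIPLET_PRED = [1, 1, 0, 0, 0]

fixed, worsened = compare_classifiers(TRUTH, DOUBLET_PRED, TRIPLET_PRED)
print(fixed, worsened)
```
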