Leveraging Graphs for Better AI
Alicia M. Frame, PhD, Lead Data Scientist, Neo4j
alicia.frame@neo4j.com
Graph Data Science Applications: Financial Services, Drug Discovery, Recommendations, Customer Segmentation, Cybersecurity, Churn Prediction, Search/MDM, Predictive Maintenance
Novel & More Accurate Predictions with the Data You Already Have
• Current data science models ignore network structure
• Graphs add highly predictive features to existing ML models
• Otherwise unattainable predictions based on relationships
(Figure: Machine Learning Pipeline)
“The idea is that graph networks are bigger than any one machine-learning approach. Graphs bring an ability to generalize about structure that the individual neural nets don't have.”
"Where do the graphs come from that graph networks operate over?"
Building a Graph ML Model
Data Sources (Parquet, JSON, and more…): Aggregate Disparate Data and Cleanse
Native Graph Platform: Unify Graphs and Engineer Features
Machine Learning (MLlib and more…): Build Predictive Models
Example: Spark & Neo4j Workflow
Spark Graph: Cypher 9 in Spark 3.0 to create non-persistent graphs
Native Graph Platform (Graph Transactions, Graph Analytics): Native Graph Algorithms, Processing, and Storage
Machine Learning: MLlib to Train Models
Explore Graphs
• Massively scalable
• Powerful data pipelining
• Robust ML Libraries
• Non-persistent, non-native graphs
Build Graph Solutions
• Persistent, dynamic graphs
• Graph native query and algorithm performance
• Constantly growing list of graph algorithms and embeddings
The Steps of Graph Data Science
Stages: Knowledge Graphs, Graph Feature Engineering, Graph Native Learning
Steps, in order of increasing complexity: Query Based Knowledge Graph, Query Based Feature Engineering, Graph Algorithm Feature Engineering, Graph Embeddings, Graph Neural Networks
(Figure axes: Data Science Complexity vs. Enterprise Maturity; Graph Persistence)
Steps Forward in Graph Data Science: Query Based Knowledge Graph, Query Based Feature Engineering, Graph Algorithm Feature Engineering, Graph Embeddings, Graph Neural Networks (Data Science Complexity vs. Enterprise Maturity)
Query based knowledge graphs: Connecting the Dots at NASA
“Using Neo4j someone from our Orion project found information from the Apollo project that prevented an issue, saving well over two years of work and one million dollars of taxpayer funds.”
Query-Based Feature Engineering: Telecom-churn prediction
Churn prediction research has found that simple hand-engineered features are highly predictive:
• How many calls/texts has an account made?
• How many of their contacts have churned?
Telecommunication networks are easily represented as graphs
Query-Based Feature Engineering: Telecom-churn prediction (Khan et al., 2015)
Add connected features based on graph queries to tabular data
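The two hand-engineered churn features named on these slides (calls made, churned contacts) can be sketched in plain Python. This is only an illustration: the call records, account names, and churn labels are made up, and `connected_features` is a hypothetical helper. In a Neo4j deployment the same counts would come from graph queries and then be joined onto the tabular training data.

```python
from collections import defaultdict

# Hypothetical call records as (caller, callee) pairs. Illustration data only.
calls = [
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("dave", "alice"), ("alice", "bob"),
]
# Hypothetical set of accounts already known to have churned.
churned = {"carol", "dave"}


def connected_features(calls, churned):
    """Return {account: {"n_calls": ..., "n_churned_contacts": ...}}."""
    n_calls = defaultdict(int)    # outgoing calls per account
    contacts = defaultdict(set)   # distinct accounts each account has talked to
    for caller, callee in calls:
        n_calls[caller] += 1
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return {
        account: {
            "n_calls": n_calls[account],
            "n_churned_contacts": len(contacts[account] & churned),
        }
        for account in sorted(contacts)
    }


features = connected_features(calls, churned)
for account, row in features.items():
    print(account, row)
```

Each row of `features` can be appended as extra columns to an existing per-account table before model training.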
Knowledge Graphs: Getting Started Example with Spark
Spark Graph
• Merge distributed data into DataFrames
• Reshape your tables into graphs
• Explore Cypher queries
Native Graph Platform (Graph Transactions, Graph Analytics)
• Move to Neo4j to build expert queries
• Persist your graph
Machine Learning
• Bring query based graph features to ML pipeline
Graph Feature Engineering
Feature engineering is how we combine and process the data to create new, more meaningful features, such as clustering or connectivity metrics.
Add More Descriptive Features:
- Influence
- Relationships
- Communities
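As a rough illustration of "clustering or connectivity metrics", the sketch below computes two per-node features: degree (a connectivity metric) and the local clustering coefficient. The toy graph and the function names are assumptions made for this example and do not come from the talk.

```python
from itertools import combinations

# Hypothetical undirected graph as an adjacency map. Illustration data only.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}


def degree(g, v):
    """Number of direct neighbors of v."""
    return len(g[v])


def local_clustering(g, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    k = len(g[v])
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    linked_pairs = sum(1 for x, y in combinations(sorted(g[v]), 2) if y in g[x])
    return linked_pairs / (k * (k - 1) / 2)


node_features = {
    v: {"degree": degree(graph, v), "clustering": local_clustering(graph, v)}
    for v in sorted(graph)
}
for v, row in node_features.items():
    print(v, row)
```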
Graph Feature Categories & Algorithms
• Pathfinding & Search: Finds the optimal paths or evaluates route availability and quality
• Centrality / Importance: Determines the importance of distinct nodes in the network
• Community Detection: Detects group clustering or partition options
• Heuristic Link Prediction: Estimates the likelihood of nodes forming a relationship
• Similarity: Evaluates how alike nodes are
• Embeddings: Learned representations of connectivity or topology
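To make the heuristic link prediction category concrete, here is a minimal sketch of three standard scores (common neighbors, preferential attachment, Adamic-Adar) for a candidate pair of nodes. The toy graph and function names are assumptions for illustration, and the code is not Neo4j's implementation.

```python
import math

# Hypothetical undirected graph as an adjacency map. Illustration data only.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "e"},
    "d": {"a"},
    "e": {"c"},
}


def common_neighbors(g, u, v):
    """Number of neighbors shared by u and v."""
    return len(g[u] & g[v])


def preferential_attachment(g, u, v):
    """Product of the two degrees: well-connected nodes attract more links."""
    return len(g[u]) * len(g[v])


def adamic_adar(g, u, v):
    """Shared neighbors weighted so that low-degree ones count more.

    Assumes u != v, so every shared neighbor has degree >= 2 and log(degree) > 0.
    """
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v])


# Score a pair that is not yet connected.
pair = ("b", "e")
print(common_neighbors(graph, *pair),
      preferential_attachment(graph, *pair),
      adamic_adar(graph, *pair))
```

A higher score suggests a relationship is more likely to form; the scores can be used directly for ranking or fed to a model as features.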
Financial Crime: Detecting Fraud
Large financial institutions already have existing pipelines to identify fraud via heuristics and models
Graph based features improve accuracy:
• Connected components to identify disjointed graphs sharing identifiers
• PageRank to measure influence and transaction volumes
• Louvain to identify communities that frequently interact
• Jaccard to measure account similarity based on relationships
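Two of the fraud features listed above can be sketched in a few lines: connected components over shared identifiers (using union-find) and Jaccard similarity between accounts. The accounts and identifiers are invented for illustration, and Jaccard is computed here over shared identifiers as a stand-in for relationships. A production pipeline would more likely run the native graph algorithms than this plain-Python version.

```python
# Hypothetical accounts and the identifiers they use. Illustration data only.
account_ids = {
    "acct1": {"phone:555-0100", "addr:1 Main St"},
    "acct2": {"phone:555-0100", "device:X"},
    "acct3": {"device:X"},
    "acct4": {"addr:9 Elm St"},
}


def connected_components(account_ids):
    """Group accounts that share any identifier, directly or transitively."""
    parent = {a: a for a in account_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    first_seen = {}  # identifier -> first account seen using it
    for acct, ids in account_ids.items():
        for ident in ids:
            if ident in first_seen:
                parent[find(acct)] = find(first_seen[ident])  # union
            else:
                first_seen[ident] = acct

    groups = {}
    for acct in account_ids:
        groups.setdefault(find(acct), set()).add(acct)
    return sorted(sorted(g) for g in groups.values())


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


components = connected_components(account_ids)
print(components)
print(jaccard(account_ids["acct1"], account_ids["acct2"]))
```

The component an account belongs to, and its size, can be added as columns to an existing fraud model's input.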
+142,000 peer-reviewed publications on graph fraud / anomaly detection in the last 10 years
Graph Feature Engineering: Getting Started Example with Spark
Spark Graph
• Merge distributed data into DataFrames
• Reshape your tables into graphs
• Explore Cypher queries and simple algorithms
Native Graph Platform (Graph Transactions, Graph Analytics)
• Persist your graph
• Create rule based features
• Run native graph algorithms and write to graph or stream
Machine Learning
• Bring graph features to ML pipeline for training
Graph Algorithms in Neo4j
Pathfinding & Search: Parallel Breadth First Search, Parallel Depth First Search, Shortest Path, Single-Source Shortest Path, All Pairs Shortest Path, Minimum Spanning Tree, A* Shortest Path, Yen’s K Shortest Path, K-Spanning Tree (MST), Random Walk
Centrality / Importance: Degree Centrality, Closeness Centrality, CC Variations (Harmonic, Dangalchev, Wasserman & Faust), Betweenness Centrality, Approximate Betweenness Centrality, PageRank, Personalized PageRank, ArticleRank, Eigenvector Centrality
Community Detection: Triangle Count, Clustering Coefficients, Connected Components (Union Find), Strongly Connected Components, Label Propagation, Louvain Modularity (1 Step & Multi-Step), Balanced Triad (identification)
Link Prediction: Adamic Adar, Common Neighbors, Preferential Attachment, Resource Allocations, Same Community, Total Neighbors
Similarity: Euclidean Distance, Cosine Similarity, Jaccard Similarity, Overlap Similarity, Pearson Similarity
neo4j.com/docs/graph-algorithms/current/
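As one worked example from the centrality column, the sketch below implements the textbook power-iteration form of PageRank. It is an illustration under stated assumptions, not Neo4j's implementation: the toy graph, the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from nodes without outgoing links are all choices made for this example.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration; returned ranks sum to 1."""
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank


# Hypothetical directed graph: node -> list of nodes it points to.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 4))
```

In this toy graph, node c collects links from three other nodes and ends up with the highest rank, while d, which nothing points to, ends up with the lowest.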
Graph Embeddings
An embedding transforms a graph into a vector, or set of vectors, describing the topology, connectivity, or attributes of the nodes and edges in the graph
• Vertex embeddings: describe the connectivity of each node
• Path embeddings: describe traversals across the graph
• Graph embeddings: encode an entire graph into a single vector
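A very rough sketch of a vertex embedding follows. It generates truncated random walks (the sampling stage used by random-walk methods such as DeepWalk) and then, as a deliberate simplification, uses normalized co-occurrence counts as each node's vector instead of training a model on the walks. The toy graph, function names, and parameters are assumptions for illustration, and every node is assumed to have at least one neighbor.

```python
import random

# Hypothetical undirected graph as adjacency lists. Illustration data only.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}


def random_walks(g, walk_length=5, walks_per_node=10, seed=0):
    """Start several fixed-length uniform random walks from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(g):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(g[walk[-1]]))
            walks.append(walk)
    return walks


def cooccurrence_embedding(g, walks, window=2):
    """One vector per node: normalized counts of nodes seen nearby on walks."""
    order = sorted(g)
    index = {v: i for i, v in enumerate(order)}
    vectors = {v: [0.0] * len(order) for v in order}
    for walk in walks:
        for i, v in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[v][index[walk[j]]] += 1.0
    for v in order:
        total = sum(vectors[v])
        if total:
            vectors[v] = [x / total for x in vectors[v]]
    return vectors


walks = random_walks(graph)
embedding = cooccurrence_embedding(graph, walks)
for node in sorted(embedding):
    print(node, [round(x, 2) for x in embedding[node]])
```

Nodes that sit in similar neighborhoods tend to get similar vectors, which is the property that downstream models rely on.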
Graph Embeddings - Recommendations: Explainable Reasoning over Knowledge Graphs for Recommendation
Graph Feature Engineering: Getting Started Example with Spark
Spark Graph
• Merge distributed data into DataFrames
• Reshape your tables into graphs
• Explore Cypher queries and simple algorithms
Native Graph Platform (Graph Transactions, Graph Analytics)
• Move to Neo4j to build expert queries
• Write to persist
• Stay tuned for DeepWalk and DeepGL algorithms
Machine Learning
• Bring graph features to ML pipeline for training
Graph Native Learning
Deep learning refers to training multi-layer neural networks using gradient descent
Graph Native Learning
Graph native learning refers to deep learning models that take a graph as input, perform computations, and return a graph (Battaglia et al., 2018)
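The "graph in, graph out" idea can be illustrated with a single round of message passing: each node's feature vector is updated from its own features and the mean of its neighbors' features, and the result is again a graph with the same structure. This is a minimal sketch, not the graph network formulation from the cited paper. The scalar weights `w_self` and `w_neigh` are fixed here purely for illustration, whereas a real model would learn weight matrices by gradient descent.

```python
def message_passing_step(node_features, edges, w_self=0.5, w_neigh=0.5):
    """One message-passing round on an undirected graph.

    New features = ReLU(w_self * own + w_neigh * mean(neighbor features)).
    Returns (updated node features, unchanged edges): graph in, graph out.
    """
    neighbors = {v: [] for v in node_features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for v, feat in node_features.items():
        nbrs = neighbors[v]
        if nbrs:
            mean = [sum(node_features[n][k] for n in nbrs) / len(nbrs)
                    for k in range(len(feat))]
        else:
            mean = [0.0] * len(feat)  # isolated node: no incoming messages
        updated[v] = [max(0.0, w_self * x + w_neigh * m)
                      for x, m in zip(feat, mean)]
    return updated, edges


# Hypothetical input graph: three nodes on a path, two features per node.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "b"), ("b", "c")]

new_features, new_edges = message_passing_step(features, edges)
print(new_features)
```

Stacking several such rounds lets information travel further across the graph, which is the basic mechanism behind graph neural networks.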
Graph Native Learning Example: electron path prediction (Bradshaw et al., 2019)
Given reactants and reagents, what will the products be?
Progressing in Graph Data Science
Stages: Knowledge Graphs, Graph Feature Engineering, Graph Native Learning
Steps, in order of increasing complexity: Query Based Knowledge Graph, Query Based Feature Engineering, Graph Algorithm Feature Engineering, Graph Embeddings, Graph Neural Networks
(Figure axes: Data Science Complexity vs. Enterprise Maturity; Graph Persistence)
Resources
neo4j.com/graph-algorithms-book
Business
• neo4j.com/use-cases/artificial-intelligence-analytics/
Data Scientists/Developers
• neo4j.com/sandbox
• neo4j.com/developer/
• community.neo4j.com
alicia.frame@neo4j.com
@aliciaframe1