Introduction in Graph Databases and Neo4j



  1. Introduction in Graph Databases and Neo4j
     most slides from: Stefan Armbruster, Michael Hunger
     t: @darthvader42
     e: stefan.armbruster@neotechnology.com

  2. The Path Forward
     1. No .. NO .. NOSQL
     2. Why graphs?
     3. What's a graph database?
     4. Some things about Neo4j.
     5. How do people use Neo4j?

  3. Trends in BigData & NOSQL
     1. increasing data size (big data)
        • “Every 2 days we create as much information as we did up to 2003” - Eric Schmidt
     2. increasingly connected data (graph data)
        • for example, text documents to html
     3. semi-structured data
        • individualization of data, with common sub-set
     4. architecture - a facade over multiple services
        • from monolithic to modular, distributed applications

  4. NOSQL

  6. http://www.flickr.com/photos/crazyneighborlady/355232758/

  7. http://gallery.nen.gov.uk/image82582-.html

  8. http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale

  9. NOSQL Databases

  10. Living in a NOSQL World
      (chart; axes: Density ~= Complexity, Volume ~= Size; plotted: Graph Databases, RDBMS, Document Databases, Column Family, Key-Value Store)

  11. complexity = f(size, connectedness, uniformity)

  12. Patrik Runald @patrikrunald, 3 Nov: “@mgonto: The best explanation about what BigData is. Hilarious: pic.twitter.com/d8ZVP7xJFu”

  14. A Graph? Yes, a graph

  15. Leonhard Euler 1707-1783

  17. They are everywhere

  18. They are everywhere http://www.bbc.co.uk/london/travel/downloads/tube_map.html

  19. Graphs Everywhere
      ๏ Relationships in
        • Politics, Economics, History, Science, Transportation
      ๏ Biology, Chemistry, Physics, Sociology
        • Body, Ecosphere, Reaction, Interactions
      ๏ Internet
        • Hardware, Software, Interaction
      ๏ Social Networks
        • Family, Friends
        • Work, Communities
        • Neighbours, Cities, Society

  20. Good Relationships
      ๏ the world is rich, messy and related data
      ๏ relationships are at least as important as the things they connect
      ๏ Graphs = Whole > Σ parts
      ๏ complex interactions
      ๏ always changing, change of structures as well
      ๏ Graph: Relationships are part of the data
      ๏ RDBMS: Relationships are part of the fixed schema

  21. Everyone is talking about graphs...

  22. Everyone is talking about graphs...

  23. Graph DB 101

  24. A graph database...
      NO: not for charts & diagrams, or vector artwork
      YES: for storing data that is structured as a graph
      remember linked lists, trees? graphs are the general-purpose data structure
      “A relational database may tell you the average age of everyone in this session, but a graph database will tell you who is most likely to buy you a beer.”

  25. You know relational
      (diagram: foo, foo_bar, bar)

  26. now consider relationships...
      (diagram: foo, foo_bar, bar)

  27. We're talking about a Property Graph
      Properties (each a key+value)
      + Indexes (for easy look-ups)
      + Labels (Neo4j 2.0)
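
To make the property-graph idea concrete, here is a minimal in-memory sketch in plain Java. It is not the Neo4j API: the class `PropertyGraphSketch`, its nested `Node` and `Relationship` types, and the `nameIndex` map are hypothetical names chosen for illustration, and the sample data (Adam, Sarah, `FRIEND_OF`, `Person`) is only example input.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model for illustration; these are not Neo4j classes.
final class PropertyGraphSketch {

    static final class Node {
        final Set<String> labels = new HashSet<>();              // e.g. "Person"
        final Map<String, Object> properties = new HashMap<>();  // key + value pairs
        final List<Relationship> relationships = new ArrayList<>();
    }

    static final class Relationship {
        final Node start;
        final Node end;
        final String type;                                       // e.g. "FRIEND_OF"
        final Map<String, Object> properties = new HashMap<>();

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // A very small "index": finds a node by its "name" property.
    private final Map<String, Node> nameIndex = new HashMap<>();

    Node createNode(String label, String name) {
        Node node = new Node();
        node.labels.add(label);
        node.properties.put("name", name);
        nameIndex.put(name, node);
        return node;
    }

    Relationship relate(Node start, Node end, String type) {
        Relationship rel = new Relationship(start, end, type);
        start.relationships.add(rel);  // both ends know about the relationship
        end.relationships.add(rel);
        return rel;
    }

    Node findByName(String name) {
        return nameIndex.get(name);
    }

    public static void main(String[] args) {
        PropertyGraphSketch graph = new PropertyGraphSketch();
        Node adam = graph.createNode("Person", "Adam");
        Node sarah = graph.createNode("Person", "Sarah");
        graph.relate(adam, sarah, "FRIEND_OF");

        Node found = graph.findByName("Adam");
        System.out.println(found.labels + " " + found.properties
                + " relationships=" + found.relationships.size());
    }
}
```

The point of the sketch is that relationships are stored as data next to the nodes they connect, and can carry their own properties, instead of being implied by a join table.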

  28. Looks different, fine. Who cares?
      ๏ a sample social graph
        • with ~1,000 persons
      ๏ average 50 friends per person
      ๏ pathExists(a,b) limited to depth 4
      ๏ caches warmed up to eliminate disk I/O

      Database              # persons    query time
      Relational database   1,000        2000 ms
      Neo4j                 1,000        2 ms
      Neo4j                 1,000,000    2 ms
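
The slide does not show how `pathExists(a,b)` was implemented. One plausible reading is a breadth-first search that stops after a fixed number of hops. The sketch below assumes that reading. The class `PathExistsSketch`, the adjacency-map representation, and the `chain` helper that builds test data are all hypothetical, and it makes no attempt to reproduce the timings in the table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a depth-limited path check; not the benchmark's code.
final class PathExistsSketch {

    // Breadth-first search that gives up after maxDepth hops.
    static boolean pathExists(Map<String, List<String>> friends,
                              String a, String b, int maxDepth) {
        if (a.equals(b)) {
            return true;
        }
        Set<String> visited = new HashSet<>();
        visited.add(a);
        List<String> frontier = new ArrayList<>();
        frontier.add(a);
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String person : frontier) {
                List<String> direct =
                        friends.getOrDefault(person, Collections.<String>emptyList());
                for (String friend : direct) {
                    if (friend.equals(b)) {
                        return true;            // reached b within the hop limit
                    }
                    if (visited.add(friend)) {  // true only the first time we see friend
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    // Test data: an undirected chain p0 - p1 - ... - p(n-1).
    static Map<String, List<String>> chain(int n) {
        Map<String, List<String>> friends = new HashMap<>();
        for (int i = 0; i < n; i++) {
            List<String> direct = new ArrayList<>();
            if (i > 0) {
                direct.add("p" + (i - 1));
            }
            if (i < n - 1) {
                direct.add("p" + (i + 1));
            }
            friends.put("p" + i, direct);
        }
        return friends;
    }

    public static void main(String[] args) {
        Map<String, List<String>> friends = chain(6);
        System.out.println(pathExists(friends, "p0", "p4", 4)); // p4 is 4 hops away
        System.out.println(pathExists(friends, "p0", "p5", 4)); // p5 is 5 hops away
    }
}
```

Each hop only touches the neighbours of nodes already reached, so the work depends on the size of the neighbourhood explored rather than on the total number of persons stored.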

  29. Graph Database: Pros & Cons
      ๏ Strengths
        • Powerful data model, as general as RDBMS
        • Fast, for connected data
        • Easy to query
      ๏ Weaknesses:
        • Sharding (though they can scale reasonably well)
          ‣ also, stay tuned for developments here
        • Requires conceptual shift
          ‣ though graph-like thinking becomes addictive

  30. And, but, so how do you query this "graph" database?

  31. Query a graph with a traversal
      // lookup starting point in an index
      start n=node:People(name = 'Andreas')
      // then traverse to find results
      match (n)--()--(foaf) return foaf
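
What the two-hop pattern `(n)--()--(foaf)` asks for can be sketched in plain Java over an adjacency map. This is an illustration only, not how Neo4j executes the query. The class `FriendOfFriendSketch`, the `connect` helper, and the sample names and edges are assumptions. The sketch excludes the start node and returns each person once, whereas Cypher would return one row per matching path.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a friend-of-a-friend traversal; not Neo4j code.
final class FriendOfFriendSketch {

    // Step 1 of the slide (index lookup) is the map lookup by name.
    // Step 2 (traverse) walks exactly two hops out from the start.
    static Set<String> friendsOfFriends(Map<String, Set<String>> graph, String start) {
        Set<String> result = new TreeSet<>();
        for (String friend : graph.getOrDefault(start, Collections.<String>emptySet())) {
            for (String foaf : graph.getOrDefault(friend, Collections.<String>emptySet())) {
                if (!foaf.equals(start)) {  // do not walk straight back to the start
                    result.add(foaf);
                }
            }
        }
        return result;
    }

    // Adds an undirected edge, matching the direction-less "--" in the pattern.
    static void connect(Map<String, Set<String>> graph, String a, String b) {
        graph.computeIfAbsent(a, k -> new TreeSet<>()).add(b);
        graph.computeIfAbsent(b, k -> new TreeSet<>()).add(a);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = new HashMap<>();
        connect(graph, "Andreas", "Peter");   // sample edges, made up for illustration
        connect(graph, "Andreas", "Allison");
        connect(graph, "Peter", "Emil");
        System.out.println(friendsOfFriends(graph, "Andreas"));
    }
}
```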

  32. Modeling for graphs

  34. (graph diagram; text in diagram: Adam, Sarah, LOL Cat, FUNNY, SHARED, FRIEND_OF, LIKES, ON, COMMENTED)

  35. (graph diagram; text in diagram: Adam, Sarah, LOL Cat, FUNNY, SHARED, FRIEND_OF, LIKES, ON, COMMENTED)

  36. Neo4j 2.0: Labels
      (same graph diagram, now with labels: Person on Adam, Person on Sarah, Photo on LOL Cat)

  37. Neo4j - the Graph Database

  39. Neo4j is a Graph Database
      ๏ A Graph Database:
        • a schema-free Property Graph
        • perfect for complex, highly connected data
      ๏ A Graph Database:
        • reliable with real ACID Transactions
        • fast with more than 1M traversals / second
        • Server with REST API, or Embeddable on the JVM
        • scale out for higher-performance reads with High-Availability

  40. Whiteboard --> Data
      (graph diagram: Andreas, Peter, Allison and Emil, connected by "knows" relationships)
      // Cypher query - friend of a friend
      start n=node(0) match (n)--()--(foaf) return foaf

  41. Two Ways to Work with Neo4j
      ๏ 1. Embeddable on JVM
        • Java, JRuby, Scala...
        • Tomcat, Rails, Akka, etc.
        • great for testing

  42. Show me some code, please
      // PresentationTypes (a RelationshipType enum) and today are assumed to be defined elsewhere
      GraphDatabaseService graphDb = new EmbeddedGraphDatabase("var/neo4j");
      Transaction tx = graphDb.beginTx();
      try {
          Node steve = graphDb.createNode();
          Node michael = graphDb.createNode();
          steve.setProperty("name", "Steve Vinoski");
          michael.setProperty("name", "Michael Hunger");
          Relationship presentedWith = steve.createRelationshipTo(
                  michael, PresentationTypes.PRESENTED_WITH);
          presentedWith.setProperty("date", today);
          tx.success();   // mark the transaction as successful
      } finally {
          tx.finish();    // commits if success() was called, otherwise rolls back
      }

  43. Spring Data Neo4j
      @NodeEntity
      public class Movie {
          @Indexed private String title;
          @RelatedToVia(type = "ACTS_IN", direction = INCOMING)
          private Set<Role> cast;
          private Director director;
      }

      @NodeEntity
      public class Actor {
          @RelatedTo(type = "ACTS_IN")
          private Set<Movie> movies;
      }

      @RelationshipEntity
      public class Role {
          @StartNode private Actor actor;
          @EndNode private Movie movie;
          private String roleName;
      }

  44. Cypher Query Language
      ๏ Declarative query language
        • Describe what you want, not how
        • Based on pattern matching
      ๏ Examples:

        // index lookup
        START david=node:people(name="David")
        MATCH david-[:knows]-friends-[:knows]-new_friends
        WHERE new_friends.age > 18
        RETURN new_friends

        // node IDs
        START user=node(5, 15, 26, 28)
        MATCH user--friend
        RETURN user, COUNT(friend), SUM(friend.money)
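
In the second example, `user` is returned without an aggregate, so it acts as the grouping key: `COUNT` and `SUM` are computed per user over that user's friends. A plain-Java sketch of that per-user aggregation follows. The class `FriendAggregationSketch`, its `Person` type, and the sample money values are hypothetical and only illustrate the shape of the result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-user aggregation; not Neo4j code.
final class FriendAggregationSketch {

    static final class Person {
        final String name;
        final long money;
        final List<Person> friends = new ArrayList<>();

        Person(String name, long money) {
            this.name = name;
            this.money = money;
        }
    }

    // Undirected connection, like the direction-less "--" in the pattern.
    static void connect(Person a, Person b) {
        a.friends.add(b);
        b.friends.add(a);
    }

    // Counterpart of COUNT(friend) for one user.
    static long countFriends(Person user) {
        return user.friends.size();
    }

    // Counterpart of SUM(friend.money) for one user.
    static long sumFriendMoney(Person user) {
        long sum = 0;
        for (Person friend : user.friends) {
            sum += friend.money;
        }
        return sum;
    }

    public static void main(String[] args) {
        Person user = new Person("user", 0);      // sample data, made up for illustration
        connect(user, new Person("friendA", 10));
        connect(user, new Person("friendB", 32));
        System.out.println(user.name + " " + countFriends(user) + " " + sumFriendMoney(user));
    }
}
```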

  45. Create Graph with Cypher
      CREATE (steve {name: "Steve Vinoski"})
             -[:PRESENTED_WITH {date:{day}}]->
             (michael {name: "Michael Hunger"})

  46. Two Ways to Work with Neo4j
      ๏ 2. Server with REST API
        • every language on the planet
        • flexible deployment scenarios
        • DIY server, or cloud managed

  47. Bindings REST://

  48. Two Ways to Work with Neo4j
      ๏ Server capability == Embedded capability
        • same scalability, transactionality, and availability

  49. Neo4j in HA mode: replicating the graph

  50. the Real World

  51. Industry: Communications
      Use case: Recommendations
      San Jose, CA
      Cisco.com
      (diagram: Knowledge Base Article, Support Case, Solution, Message)
      • Cisco.com serves customer and business customers with Support Services
      • Needed real-time recommendations, to encourage use of online knowledge base
      • Cisco had been successfully using Neo4j for its internal master data management solution.
      • Identified a strong fit for online recommendations
      • Call center volumes needed to be lowered by improving the efficacy of online self service
      • Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc.
      • Problem resolution times, as well as support costs, needed to be lowered
      • Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j
      • Real-time reading recommendations via Neo4j
      • Neo4j Enterprise with HA cluster
      • The result: customers obtain help faster, with decreased reliance on customer support
