
A Little Graph Theory for the Busy Developer, Dr. Jim Webber, Chief Scientist, Neo Technology



  1. Session code: 6191 A Little Graph Theory for the Busy Developer Dr. Jim Webber Chief Scientist, Neo Technology @jimwebber

  2. Roadmap • Imprisoned data • Graph models • Graph theory – Local properties, global behaviours – Predictive techniques • Graph matching – Predictive, real-time analytics for fun and profit • Fin

  3. http://www.flickr.com/photos/crazyneighborlady/355232758/

  4. http://gallery.nen.gov.uk/image82582-.html

  5. http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale

  6. Aggregate-Oriented Data http://martinfowler.com/bliki/AggregateOrientedDatabase.html “There is a significant downside - the whole approach works really well when data access is aligned with the aggregates, but what if you want to look at the data in a different way? Order entry naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate structure. The advantage of not using an aggregate structure in the database is that it allows you to slice and dice your data different ways for different audiences. This is why aggregate-oriented stores talk so much about map-reduce.”

  7. complexity = f(size, connectedness, uniformity)

  8. http://www.bbc.co.uk/london/travel/downloads/tube_map.html

  9. Property graphs • Property graph model: – Nodes with properties – Named, directed relationships with properties – Relationships have exactly one start and end node • Which may be the same node

  10. Property Graph Model

  11. Property Graph Model

  12. Property Graph Model name: the Doctor age: 907 species: Time Lord first name: Rose last name: Tyler vehicle: tardis model: Type 40
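
The three nodes on this slide can be written down as plain data structures. The following is a minimal sketch in Python, not Neo4j's API: the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical. `COMPANION_OF` is borrowed from the example query later in the deck, and `OWNS` is an assumed relationship name, since the transcript does not preserve the labels from the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node carrying arbitrary key/value properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed relationship with exactly one start and one end node."""
    start: Node
    name: str
    end: Node
    properties: dict = field(default_factory=dict)


doctor = Node({"name": "the Doctor", "age": 907, "species": "Time Lord"})
rose = Node({"first name": "Rose", "last name": "Tyler"})
tardis = Node({"vehicle": "tardis", "model": "Type 40"})

# Relationship names are assumptions made for illustration only.
relationships = [
    Relationship(rose, "COMPANION_OF", doctor),
    Relationship(doctor, "OWNS", tardis),
]


def outgoing(node, rels):
    """Return the relationships that start at the given node."""
    return [r for r in rels if r.start is node]
```

Because a relationship simply holds references to its start and end nodes, a relationship from a node to itself needs no special handling.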

  13. Property graphs are very whiteboard-friendly

  14. http://blogs.adobe.com/digitalmarketing/analytics/predictive-analytics/predictive-analytics-and-the-digital-marketer/

  15. Meet Leonhard Euler • Swiss mathematician • Inventor of Graph Theory (1736) http://en.wikipedia.org/wiki/File:Leonhard_Euler_2.jpg

  16. http://en.wikipedia.org/wiki/Seven_Bridges_of_Königsberg

  17. Triadic Closure name: Kyle name: Stan name: Kenny

  18. Triadic Closure name: Kyle name: Stan name: Kenny name: Kyle FRIEND name: Stan name: Kenny

  19. Structural Balance name: Cartman name: Craig name: Tweek

  20. Structural Balance name: Cartman name: Craig name: Tweek name: Cartman FRIEND name: Craig name: Tweek

  21. Structural Balance name: Cartman name: Craig name: Tweek name: Cartman ENEMY name: Craig name: Tweek

  22. Structural Balance name: Kyle name: Stan name: Kenny name: Kyle FRIEND name: Stan name: Kenny

  23. Structural Balance is a key predictive technique • And it’s domain-agnostic
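
Structural balance can be checked mechanically. In the usual formulation (as in Easley and Kleinberg), a triangle is balanced when the product of its three edge signs is positive: either all three pairs are friends, or one pair are friends with a common enemy. The sketch below assumes that definition. The function names and the specific FRIEND/ENEMY assignments are illustrative, since the transcript does not record which pairs the slides connect.

```python
from itertools import combinations

FRIEND, ENEMY = 1, -1


def is_balanced(edges, a, b, c):
    """A triangle is balanced when the product of its edge signs is positive."""
    product = (edges[frozenset((a, b))]
               * edges[frozenset((b, c))]
               * edges[frozenset((a, c))])
    return product > 0


def unbalanced_triangles(edges):
    """List every fully connected triple whose signs are unbalanced."""
    nodes = sorted(set().union(*edges))
    result = []
    for a, b, c in combinations(nodes, 3):
        pairs = [frozenset(p) for p in ((a, b), (b, c), (a, c))]
        if all(p in edges for p in pairs) and not is_balanced(edges, a, b, c):
            result.append((a, b, c))
    return result


# Assumed signs: Cartman is friends with both, who are enemies of each other.
edges = {
    frozenset(("Cartman", "Craig")): FRIEND,
    frozenset(("Cartman", "Tweek")): FRIEND,
    frozenset(("Craig", "Tweek")): ENEMY,
}
unstable = unbalanced_triangles(edges)

# Flipping one friendship to enmity resolves the tension.
resolved = dict(edges)
resolved[frozenset(("Cartman", "Tweek"))] = ENEMY
stable = unbalanced_triangles(resolved)
```

Under these assumed signs, `unstable` contains the one triangle and `stable` is empty, which is the sense in which balance is predictive: unbalanced triangles are under pressure to change.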

  24. Allies and Enemies UK Austria France Germany Italy Russia

  25. Allies and Enemies UK Austria France Germany Italy Russia

  26. Allies and Enemies UK Austria France Germany Italy Russia

  27. Allies and Enemies UK Austria France Germany Italy Russia

  28. Allies and Enemies UK Austria France Germany Italy Russia

  29. Allies and Enemies UK Austria France Germany Italy Russia

  30. Predicting WWI [Easley and Kleinberg]

  31. Strong Triadic Closure: If a node has strong relationships to two neighbours, then these neighbours must have at least a weak relationship between them. [Wikipedia]
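
The property stated on this slide translates directly into a check over a graph whose relationships carry a strength. A minimal sketch follows, assuming undirected relationships labelled "strong" or "weak" and no self-relationships. The function name `stc_violations` is hypothetical, and the choice of Stan as the node with two strong ties is an assumption about the diagram.

```python
from itertools import combinations


def stc_violations(edges):
    """Return (node, b, c) triples where node has strong ties to b and c
    but b and c have no relationship at all."""
    strong = {}
    for pair, strength in edges.items():
        if strength == "strong":
            a, b = tuple(pair)
            strong.setdefault(a, set()).add(b)
            strong.setdefault(b, set()).add(a)
    violations = []
    for node in sorted(strong):
        for b, c in combinations(sorted(strong[node]), 2):
            if frozenset((b, c)) not in edges:
                violations.append((node, b, c))
    return violations


# Assumed ties: Stan is strongly tied to both Kenny and Cartman.
edges = {
    frozenset(("Stan", "Kenny")): "strong",
    frozenset(("Stan", "Cartman")): "strong",
}
before = stc_violations(edges)

# Adding even a weak tie between Kenny and Cartman satisfies the property.
closed = dict(edges)
closed[frozenset(("Kenny", "Cartman"))] = "weak"
after = stc_violations(closed)
```

Each violation is also a prediction: the missing pair is a candidate for a relationship that is likely to form.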

  32. Triadic Closure (weak relationship) name: Kenny name: Stan name: Cartman

  33. Triadic Closure (weak relationship) name: Kenny name: Stan name: Cartman name: Kenny FRIEND 50% name: Stan name: Cartman

  34. Weak relationships • Relationships can have “strength” as well as intent – Think: weighting on a relationship in a property graph • Weak links play another super-important structural role in graph theory – They bridge neighbourhoods

  35. Local Bridges FRIEND name: Cartman name: Kenny ENEMY FRIEND FRIEND name: Kyle name: Stan name: Sally FRIEND 50% FRIEND FRIEND name: Wendy name: Bebe

  36. Local Bridge Property “If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong relationships, then any local bridge it is involved in must be a weak relationship.” [Easley and Kleinberg]
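
Local bridges are easy to find once the definition is spelled out: in Easley and Kleinberg's terms, a relationship between A and B is a local bridge when A and B have no neighbours in common. The sketch below assumes that definition. The helper names are hypothetical and the example graph (two tight groups joined by a single Stan–Sally link) is an assumed reading of the slide's diagram, not a transcription of it.

```python
def build_adjacency(pairs):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def local_bridges(adjacency):
    """Return relationships whose endpoints share no common neighbour."""
    bridges = []
    for a in sorted(adjacency):
        for b in sorted(adjacency[a]):
            if a < b and not (adjacency[a] & adjacency[b]):
                bridges.append((a, b))
    return bridges


# Assumed structure: two triangles joined by one link.
adjacency = build_adjacency([
    ("Kyle", "Stan"), ("Stan", "Kenny"), ("Kyle", "Kenny"),
    ("Sally", "Wendy"), ("Wendy", "Bebe"), ("Sally", "Bebe"),
    ("Stan", "Sally"),
])
bridges = local_bridges(adjacency)
```

In this assumed graph the only local bridge is the Stan–Sally link, which matches the slide's point that the link bridging two neighbourhoods is the weak one.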

  37. University Karate Club

  38. Graph Partitioning • (NP) Hard problem – Recursively remove the spanning links between dense regions – Or recursively merge nodes into ever larger “subgraph” nodes – Choose your algorithm carefully – some are better than others for a given domain • Can be used to (almost exactly) predict the break-up of the karate club!
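
One well-known way to "remove the spanning links between dense regions" is the Girvan–Newman method: repeatedly delete the relationship with the highest edge betweenness until the graph falls apart. The slide does not name an algorithm, so treat the following as a sketch of that one technique, for small unweighted graphs only. All function names are hypothetical and the six-node graph is invented for illustration.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph.
    Each pair is counted from both ends, which does not affect the ranking."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        dist = {source: 0}
        paths = {v: 0 for v in adj}
        paths[source] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:  # breadth-first search, counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        credit = {v: 0.0 for v in adj}
        for w in reversed(order):  # push credit back towards the source
            for v in preds[w]:
                share = paths[v] / paths[w] * (1 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    return score


def components(adj):
    """Return the connected components as sorted lists of nodes."""
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            group.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        result.append(sorted(group))
    return result


def split_once(adj):
    """Remove highest-betweenness links until one more component appears."""
    graph = {v: set(ns) for v, ns in adj.items()}
    target = len(components(graph)) + 1
    while len(components(graph)) < target:
        score = edge_betweenness(graph)
        if not score:
            break
        u, v = max(score, key=lambda e: (score[e], sorted(e)))
        graph[u].discard(v)
        graph[v].discard(u)
    return components(graph)


# Invented example: two triangles joined by the single link c-d.
links = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
adj = {}
for x, y in links:
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set()).add(x)
parts = split_once(adj)
```

Recomputing betweenness after every removal is what makes this approach expensive on large graphs, which is one reason the choice of algorithm matters for a given domain.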

  39. University Karate Clubs

  40. Cypher • Declarative graph pattern matching language – “SQL for graphs” – Columnar results • Supports graph matching queries – And aggregation, ordering and limit, etc. – Mutation

  41. Cypher is Declarative • Imperative – follow relationship – breadth-first vs depth-first – explicit algorithm • Declarative – specify starting point – specify desired outcome – algorithm adaptable based on query

  42. Cypher is a pattern matching language A B C

  43. Un-named Nodes & Rels () --> ()

  44. Un-named Relationship (A) --> (B)

  45. ASCII Art Patterns A -[:LOVES]-> B

  46. ASCII Art Patterns A --> B --> C

  47. ASCII Art Patterns A --> B --> C, A --> C • A --> B --> C <-- A

  48. Variable Length Paths A -[*]-> B

  49. Optional Relationships A -[?]-> B

  50. Example Query • The top 5 most frequently appearing companions:
      start doctor=node:characters(character = 'Doctor')   // start node from index
      match (doctor)<-[:COMPANION_OF]-(companion)-[:APPEARED_IN]->(episode)   // subgraph pattern
      return companion.character, count(episode)   // accumulates rows by episode
      order by count(episode) desc
      limit 5   // limit returned rows

  51. Category: Category: baby alcoholic drinks MEMBER_OF MEMBER_OF Category: Category: beer Category: consumer nappies electronics MEMBER_OF MEMBER_OF MEMBER_OF SKU : 2555f258 SKU : 49d102bc Category: Product : Product : Baby console Peewee Pilsner Dry Nights Firstname : SKU : 5e175641 SKU : 49d102bc BOUGHT BOUGHT Mickey Product : Product : XBox Surname : Smith Badgers 360 DoB : 19781006 Nadgers Ale

  52. Category: beer Category: nappies Firstname : * BOUGHT Category: game Surname : * console DoB : 1996 > x > 1972

  53. Category: beer Category: nappies Firstname : * !BOUGHT Category: game Surname : * console DoB : 1996 > x > 1972

  54. (nappies) (beer) (console) () () () (daddy)

  55. Flatten the graph (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies) (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer) (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)

  56. Wrap in a Cypher MATCH clause MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies) , (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer) , (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)

  57. Cypher WHERE clause MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies), (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer), (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console) WHERE b is null

  58. Full Cypher query
      START beer=node:categories(category='beer'),
            nappies=node:categories(category='nappies'),
            xbox=node:products(product='xbox 360')
      MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
            (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
            (daddy)-[b?:BOUGHT]->(xbox)
      WHERE b is null
      RETURN distinct daddy
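
To make the intent of this query concrete, the same match can be spelled out imperatively over a tiny in-memory data set: find everyone who bought something in the beer category and something in the nappies category, but has no BOUGHT relationship to the XBox 360. This is a Python sketch of the query's logic, not of how Neo4j executes it. The purchase data is invented for illustration (names and products are borrowed from the slides, but who bought what is an assumption).

```python
# Product -> category (the MEMBER_OF relationships).
member_of = {
    "Peewee Pilsner": "beer",
    "Badgers Nadgers Ale": "beer",
    "Dry Nights": "nappies",
    "XBox 360": "game console",
}

# Person -> products (the BOUGHT relationships); invented for illustration.
bought = {
    "Mickey Smith": {"Peewee Pilsner", "Dry Nights", "XBox 360"},
    "Rory Williams": {"Badgers Nadgers Ale", "Dry Nights"},
}


def bought_in_category(person, category):
    """True if the person bought at least one product in the category."""
    return any(member_of.get(product) == category for product in bought[person])


# Beer and nappies buyers with no BOUGHT relationship to the XBox 360.
daddies = sorted(
    person for person in bought
    if bought_in_category(person, "beer")
    and bought_in_category(person, "nappies")
    and "XBox 360" not in bought[person]
)
```

The imperative version has to say how to walk the data. The Cypher version only describes the shape to find, with the optional relationship and the `b is null` test standing in for the final "has not bought" condition.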

  59. Results
      ==> +---------------------------------------------+
      ==> | daddy                                       |
      ==> +---------------------------------------------+
      ==> | Node[15]{name:"Rory Williams",dob:19880121} |
      ==> +---------------------------------------------+
      ==> 1 row
      ==> 6 ms
      ==> neo4j-sh (0)$

  60. What are graphs good for? • Recommendations • Business intelligence • Social computing • Geospatial • MDM • Systems management • Web of things • Genealogy • Time series data • Product catalogue • Web analytics • Scientific computing (especially bioinformatics) • Indexing your slow RDBMS • And much more!

  61. Free O’Reilly eBook! Visit: http://GraphDatabases.com

  62. Thanks for listening Neo4j: http://neo4j.org Neo Technology: http://neotechnology.com Me: @jimwebber

  63. Neo4j Meetup in Hilversum Next Week
