APOC Pearls Michael Hunger Developer Relations Engineering, Neo4j Follow @mesirii
APOC Unicorns
All Images by TeeTurtle.com & Unstable Unicorns
Power Up
Backercorns:
https://unstable-unicorns.backerkit.com/hosted_preorders/project_updates?page=4
https://www.kickstarter.com/projects/ramybadie/unstable-unicorns-control-and-chaos-the-backercorn/posts/2271771
Extending Neo4j
User Defined Procedures let you write custom code that is:
• Written in any JVM language
• Deployed to the Database
• Accessed by applications via Cypher
[Diagram: Applications, Bolt, Neo4j Execution Engine, User Defined Procedure]
APOC History: My Unicorn Moment
• 3.0 was about to have User Defined Procedures
• Add the missing utilities
• Grew quickly: 50 - 150 - 450
• Active OSS project
• Many contributors
Agenda
• why and how of user defined extensions: procedures, functions, aggregation functions
• history of apoc
• 5 pearls -> come to the training if you want to see more
• apoc.help() + doc & videos
• 1. 3x utilities - text, map and collection functions
• 2. aggregation functions
• 3. data integration - load json
• 4. handling large updates - periodic iterate
• 5. graph refactoring
• 6. path expanders
• 7. triggers
• 8. time to live
• 9. graph grouping
• 10. cypher functions
Available On
• Neo4j Sandbox
• Neo4j Desktop
• Neo4j Cloud
Install
What's in the Box?
• Utilities & Converters
• Data Integration
• Import / Export
• Graph Generation / Refactoring
• Transactions / Jobs / TTL
• much more ...
Where can I learn more?
• Videos
• Documentation
• Browser Guide
• APOC Training
• Neo4j Community Forum
• apoc.help()
If you learn one thing: call apoc.help("keyword")
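A minimal sketch of the call, assuming you want everything whose name or description mentions "load"; the search term is just an example.

```cypher
// Search procedures and functions by keyword and show how to call them.
CALL apoc.help('load')
YIELD type, name, text, signature
RETURN type, name, text, signature
ORDER BY name;
```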
APOC Video Series Youtube Playlist: r.neo4j.com/apoc-videos
APOC Docs
• installation instructions
• videos
• searchable overview table
• detailed explanation
• examples
neo4j-contrib.github.io/neo4j-apoc-procedures
Browser Guide
• :play apoc
• live examples
The Pearls - That give you Superpowers
Data Integration
Data Integration
• Relational / Cassandra
• MongoDB, Couchbase, ElasticSearch
• JSON, XML, CSV, XLS
• Cypher, GraphML
• ...
apoc.load.json
• load json from web-apis and files
• JSON Path
• streaming JSON
• compressed data
neo4j-contrib.github.io/neo4j-apoc-procedures/#_load_json
    WITH "https://api.stackexchange.com/2.2/questions?pagesize=100..." AS url
    CALL apoc.load.json(url) YIELD value
    UNWIND value.items AS q
    MERGE (question:Question {id:q.question_id})
      ON CREATE SET question.title = q.title,
                    question.share_link = q.share_link,
                    question.favorite_count = q.favorite_count
    MERGE (owner:User {id:q.owner.user_id})
      ON CREATE SET owner.display_name = q.owner.display_name
    MERGE (owner)-[:ASKED]->(question)
    FOREACH (tagName IN q.tags |
      MERGE (tag:Tag {name:tagName})
      MERGE (question)-[:TAGGED]->(tag))
    …
StackOverflow data model
Huge Transactions
apoc.periodic.iterate
• driving statement
• executing statement
• batching
• parallel execution
• handling retries
neo4j-contrib.github.io/neo4j-apoc-procedures/#_apoc_periodic_iterate
Run large scale imports

    CALL apoc.periodic.iterate(
      'LOAD CSV … AS row RETURN row',
      'MERGE (n:Node {id:row.id}) SET n.name = row.name',
      {batchSize:10000})
Run large scale imports

    CALL apoc.periodic.iterate(
      'UNWIND range(1,1000000) AS id RETURN id',
      'CREATE (n:Node {id:id, name:"an "+id})',
      {batchSize:10000, parallel:true})
    YIELD batches, total, timeTaken;

    +-------------------------------+
    | batches | total   | timeTaken |
    +-------------------------------+
    | 100     | 1000000 | 1         |
    +-------------------------------+
    1 row available after 1868 ms, consumed after another 0 ms
Run large scale updates

    CALL apoc.periodic.iterate(
      'MATCH (n:Person) RETURN n',
      'SET n.name = n.firstName + " " + n.lastName',
      {batchSize:10000, parallel:true})
Utilities
Text Functions - apoc.text.*
• indexOf, indexesOf
• split, replace, regexpGroups
• format
• capitalize, decapitalize
• random, lpad, rpad
• snakeCase, camelCase, upperCase
• charAt, hexCode
• base64, md5, sha1
https://neo4j-contrib.github.io/neo4j-apoc-procedures/#_text_functions
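A few of these in one query, as a rough sketch; the input strings are made up for illustration.

```cypher
// Each function takes plain strings, so they compose freely in RETURN.
RETURN apoc.text.capitalize('neo4j')                    AS capitalized,
       apoc.text.camelCase('user defined proc')         AS camel,
       apoc.text.snakeCase('userDefinedProc')           AS snake,
       apoc.text.lpad('42', 5, '0')                     AS padded,
       apoc.text.replace('Hello World', '[aeiou]', '_') AS replaced;
```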
Collection Functions - apoc.coll.*
• sum, avg, min, max, stdev
• zip, partition, pairs
• sort, toSet, contains, split
• indexOf, .different
• occurrences, frequencies, flatten
• disjunct, subtract, union, …
• set, insert, remove
• randomItem(s)
https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/3.4/docs/overview.adoc#collection-functions
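A small sketch of some collection helpers on a literal list; the sample values are arbitrary.

```cypher
// One input list, several views of it.
WITH [3, 1, 2, 3] AS xs
RETURN apoc.coll.sum(xs)                       AS total,
       apoc.coll.toSet(xs)                     AS distinctValues,
       apoc.coll.sort(xs)                      AS sorted,
       apoc.coll.pairs(xs)                     AS pairs,
       apoc.coll.frequencies(xs)               AS frequencies,
       apoc.coll.zip(xs, ['a', 'b', 'c', 'd']) AS zipped;
```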
Map Functions - apoc.map.*
• .fromNodes, .fromPairs, .fromLists, .fromValues
• .merge
• .setKey, .removeKey
• .clean(map,[keys],[values])
• .groupBy(Multi)
https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/3.4/docs/overview.adoc#map-functions
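Building and reshaping maps, as a rough sketch; keys and values here are invented for the example.

```cypher
// Build a map from key/value pairs, then derive variants of it.
WITH apoc.map.fromPairs([['name', 'Joe'], ['age', 42]]) AS m
RETURN m,
       apoc.map.fromLists(['a', 'b'], [1, 2]) AS fromLists,
       apoc.map.merge(m, {city: 'Dresden'})   AS merged,
       apoc.map.setKey(m, 'age', 43)          AS updated,
       apoc.map.removeKey(m, 'age')           AS removed,
       apoc.map.clean(m, ['age'], [])         AS cleaned;
```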
JSON - apoc.convert.*
• .toJson([1,2,3])
• .fromJsonList('[1,2,3]')
• .fromJsonMap('{"a":42,"b":"foo","c":[1,2,3]}')
• .toTree([paths],[lowerCaseRels=true])
• .getJsonProperty(node,key)
• .setJsonProperty(node,key,complexValue)
(JSON)-[:IS]->(everywhere)-[:LIKE]->(graphs)
Graph Refactoring
Graph Refactoring - apoc.refactor.*
• .cloneNodes
• .mergeNodes
• .extractNode
• .collapseNode
• .categorize
Relationship Modifications
• .to(rel, endNode)
• .from(rel, startNode)
• .invert(rel)
• .setType(rel, 'NEW-TYPE')
apoc.refactor.mergeNodes

    MATCH (n:Person)
    WITH n.email AS email, collect(n) AS people
    WHERE size(people) > 1
    CALL apoc.refactor.mergeNodes(people) YIELD node
    RETURN node
apoc.create.addLabels

    MATCH (n:Movie)
    CALL apoc.create.addLabels(id(n), [n.genre]) YIELD node
    REMOVE node.genre
    RETURN node
Triggers
Triggers
CALL apoc.trigger.add(name, statement, {phase: before/after})
apoc.trigger.pause/resume/list/remove
• Transaction-Event-Handler calls Cypher statement
• parameters: createdNodes, assignedNodeProperties, deletedNodes, ...
• utility functions to extract entities/properties from update-records
• triggers stored in graph, restored at startup
https://medium.com/neo4j/streaming-graph-loading-with-neo4j-and-apoc-triggers-188ed4dd40d5
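A minimal sketch of registering a trigger, assuming apoc.trigger.enabled=true is set in the config. The trigger name 'timestamp' and the ts property are illustrative choices, and older APOC releases spell the parameter {createdNodes} rather than $createdNodes.

```cypher
// Stamp every newly created node with the current time (illustrative name and property).
CALL apoc.trigger.add(
  'timestamp',
  'UNWIND $createdNodes AS n SET n.ts = timestamp()',
  {phase: 'before'});

// Inspect the registered triggers, then remove this one again.
CALL apoc.trigger.list();
CALL apoc.trigger.remove('timestamp');
```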
Time to Live
Time To Live (TTL)
• enable in config: apoc.ttl.enabled=true
• Label :TTL
• apoc.date.expire(In)(node, time, unit)
• Creates Index on :TTL(ttl)
Time To Live (TTL)
background job (every 60s - configurable) that deletes nodes whose ttl timestamp has passed:

    MATCH (n:TTL)
    WHERE n.ttl < timestamp()
    WITH n LIMIT 1000
    DETACH DELETE n
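Marking a node for expiry could look like this, as a sketch assuming TTL is enabled in the config; the :Session label and its id value are hypothetical.

```cypher
// Let the matched node expire one hour from now;
// the background job removes it once its ttl has passed.
MATCH (s:Session {id: 'abc'})
CALL apoc.date.expireIn(s, 1, 'h')
RETURN s;
```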
Aggregation Functions
Aggregation Function - apoc.agg.*
• more efficient variants of collect(x)[a..b]
• .nth, .first, .last, .slice
• .median(x)
• .percentiles(x,[0.5,0.9])
• .product(x)
• .statistics() provides a full numeric statistic
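A sketch of these aggregations side by side, assuming Person nodes with name and born properties as in the Movies example graph. Without an ORDER BY, which row counts as "first" is up to the database.

```cypher
// Aggregate over all Person nodes in a single pass.
MATCH (p:Person)
RETURN apoc.agg.first(p.name)                   AS firstName,
       apoc.agg.last(p.name)                    AS lastName,
       apoc.agg.median(p.born)                  AS medianBorn,
       apoc.agg.percentiles(p.born, [0.5, 0.9]) AS bornPercentiles,
       apoc.agg.statistics(p.born)              AS bornStats;
```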
Graph Grouping
Graph Grouping

    MATCH (p:Person) SET p.decade = p.born / 10;

    MATCH (p1:Person)-->()<--(p2:Person)
    WITH p1, p2, count(*) AS c
    MERGE (p1)-[r:INTERACTED]-(p2)
      ON CREATE SET r.count = c;

    CALL apoc.nodes.group(['Person'], ['decade'])
    YIELD node, relationship
    RETURN *;
Cypher Procedures
Custom Procedures (WIP)
apoc.custom.asProcedure/asFunction(name, statement, columns, params)
• Register statements as real procedures & functions
• 'custom' namespace prefix
• Pass parameters, configure result columns
• Stored in graph and distributed across cluster
Custom Procedures (WIP)

    CALL apoc.custom.asProcedure(
      'neighbours',
      'MATCH (n:Person {name:$name})-->(nb) RETURN nb AS neighbour',
      [['neighbour','NODE']], [['name','STRING']]);

    CALL custom.neighbours('Joe') YIELD neighbour;
Report Issues Contribute!
Ask Questions neo4j.com/slack community.neo4j.com
APOC on GitHub
Join the Workshop tomorrow!
Any Questions?
Best Question gets a box!