Marc Feger authored
Cypher: A data exploration of D-BAS
Connect to the D-BAS database
Because this tool runs in the same network as D-BAS, the containers web and db can be addressed by name, so queries can be sent directly to the D-BAS database. To let Neo4j submit requests to PostgreSQL, the APOC plugin and the PostgreSQL JDBC driver are provided.
The following example connects Neo4j to the D-BAS database db.
Please enter the password of db:
WITH 'jdbc:postgresql://db/discussion?user=postgres&password=' + 'FooBar' AS url
CALL apoc.load.jdbc(url, 'textversions')
YIELD row
RETURN row
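The queries in this guide concatenate the password directly into the JDBC URL. A password containing characters such as & or = would then be read as part of the query string. The following is a minimal Python sketch, assuming the PostgreSQL JDBC driver accepts percent-encoded parameter values; the helper name build_jdbc_url is hypothetical and only illustrates how the URL could be prepared before pasting it into a query.

```python
from urllib.parse import quote


def build_jdbc_url(password, user="postgres", host="db", database="discussion"):
    """Return a PostgreSQL JDBC URL with percent-encoded credentials.

    Hypothetical helper: host, database and user default to the values
    used in this guide.
    """
    # safe="" also encodes "/" so that no reserved character stays unescaped.
    return (
        f"jdbc:postgresql://{host}/{database}"
        f"?user={quote(user, safe='')}&password={quote(password, safe='')}"
    )


if __name__ == "__main__":
    print(build_jdbc_url("FooBar"))
    print(build_jdbc_url("p&ss=w/rd"))
```

For the placeholder password FooBar the result is identical to the URL used in the queries, so nothing changes for simple passwords.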
Creating the statement nodes
Password:
WITH 'jdbc:postgresql://db/discussion?user=postgres&password=' + 'FooBar' AS url
CALL apoc.load.jdbc(url, 'statements')
YIELD row
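// Merge on the key only, so changed properties update the node instead of creating a duplicate.
MERGE (a:Statement {uid: row.uid})
SET a.is_position = row.is_position, a.is_disabled = row.is_disabled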
RETURN a
Filling the statement nodes with textversions
Password:
WITH 'jdbc:postgresql://db/discussion?user=postgres&password=' + 'FooBar' AS url
CALL apoc.load.jdbc(url, 'textversions')
YIELD row
MATCH (a:Statement{uid:row.statement_uid})
WHERE NOT EXISTS(a.content)
SET a += {content:row.content}
RETURN a
Creating the user nodes
Password:
WITH 'jdbc:postgresql://db/discussion?user=postgres&password=' + 'FooBar' AS url
CALL apoc.load.jdbc(url, 'users')
YIELD row
// Merge on the key only; MERGE would fail on a null public_nickname inside the pattern.
MERGE (a:User {uid: row.uid})
SET a.public_nickname = row.public_nickname
RETURN a
Create relation between Users and Statements
Password:
WITH 'jdbc:postgresql://db/discussion?user=postgres&password=' + 'FooBar' AS url
CALL apoc.load.jdbc(url, 'textversions')
YIELD row
MATCH (a:User {uid: row.author_uid}), (b:Statement {uid: row.statement_uid})
MERGE (a)-[r:HAS_WRITTEN]->(b)
RETURN a, b, r
Create issue nodes
Password:
WITH 'jdbc:postgresql://db/discussion?user=postgres&password=' + 'FooBar' AS url
CALL apoc.load.jdbc(url, 'issues')
YIELD row
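// Merge on the key only, so a changed title updates the node instead of creating a duplicate.
MERGE (a:Issue {uid: row.uid})
SET a.title = row.title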
RETURN a
Connect statements with issues
Password:
WITH 'jdbc:postgresql://db/discussion?user=postgres&password=' + 'FooBar' AS url
CALL apoc.load.jdbc(url, 'statement_to_issue')
YIELD row
MATCH (a:Statement{uid:row.statement_uid}), (b:Issue{uid:row.issue_uid})
MERGE (a)-[r:WRITTEN_IN]->(b)
RETURN a,b,r
Every User who has written a Position likes it
Rating between 0 and 2:
MATCH (a:User)-[:HAS_WRITTEN]->(b:Statement {is_position: true})
MERGE (a)-[r:LIKES {rating: toInteger('2')}]->(b)
RETURN a, b, r
Every User who has not written a Position gets a random rating for each Position
Rating between 0 and 2:
MATCH (a:User)
WHERE NOT EXISTS((a)-[:LIKES]->())
MATCH (b:Statement)
WHERE b.is_position
MERGE (a)-[:LIKES {rating: round(rand() * toInteger('2'))}]->(b)
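Note that rounding a uniform random number does not give uniform ratings: with a maximum of 2, the values 0 and 2 are each drawn about a quarter of the time and 1 about half of the time. The following Python sketch simulates this, assuming that rand() is uniform on [0, 1) and that round() rounds to the nearest integer. It also shows a floor-based alternative, which is an illustration and not part of the original guide. The function name simulate_ratings is hypothetical.

```python
import math
import random
from collections import Counter


def simulate_ratings(max_rating, samples, seed=0):
    """Count how often each rating is drawn by two sampling schemes.

    rounded: mimics round(rand() * max_rating) from the query above.
    floored: floor(rand() * (max_rating + 1)), a uniform alternative.
    """
    rng = random.Random(seed)
    rounded = Counter()
    floored = Counter()
    for _ in range(samples):
        u = rng.random()  # uniform on [0, 1)
        rounded[math.floor(u * max_rating + 0.5)] += 1
        floored[math.floor(u * (max_rating + 1))] += 1
    return rounded, floored


if __name__ == "__main__":
    rounded, floored = simulate_ratings(max_rating=2, samples=100_000)
    for rating in range(3):
        print(rating, rounded[rating] / 100_000, floored[rating] / 100_000)
```

If uniform ratings are wanted, the floor-based variant would correspond to something like toInteger(floor(rand() * 3)) in Cypher.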
Get sub-graph
Uid between 0 and 10:
MATCH (a:User), (b:Statement {is_position: true})
WHERE a.uid IN range(0, toInteger('10'))
RETURN a,b
Get k-nearest-neighbours with Pearson-Correlation
Find the top 5 neighbors for Björn
MATCH (p1:User {public_nickname: 'Björn'})-[l:LIKES]->(statement)
WITH p1, algo.similarity.asVector(statement, l.rating) AS p1Vector
MATCH (p2:User)-[l:LIKES]->(statement) WHERE p2 <> p1
WITH p1, p2, p1Vector, algo.similarity.asVector(statement, l.rating) AS p2Vector
RETURN p1.public_nickname AS from,
p2.public_nickname AS to,
algo.similarity.pearson(p1Vector, p2Vector, {vectorType: "maps"}) AS similarity
ORDER BY similarity DESC
LIMIT toInteger('5')
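To make the neighbour search easier to follow outside of Neo4j, here is a minimal Python sketch of a Pearson-based k-nearest-neighbour lookup. It assumes that the correlation is computed only over statements both users have rated and that a similarity of 0 is used when the correlation is undefined; whether algo.similarity.pearson treats these cases in exactly the same way is not confirmed here. The names pearson and nearest_neighbours, and the data layout (a dict from user to a dict of statement uid to rating), are hypothetical.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the statements rated by both users."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # assumption: too little overlap counts as "no similarity"
    xs = [ratings_a[s] for s in common]
    ys = [ratings_b[s] for s in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: constant ratings carry no correlation signal
    return cov / math.sqrt(var_x * var_y)


def nearest_neighbours(target, all_ratings, k):
    """Return the k users most similar to target as (user, similarity) pairs."""
    sims = [
        (other, pearson(all_ratings[target], ratings))
        for other, ratings in all_ratings.items()
        if other != target
    ]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]


if __name__ == "__main__":
    ratings = {
        "Björn": {1: 0, 2: 1, 3: 2},
        "Christian": {1: 0, 2: 1, 3: 2},
        "Other": {1: 2, 2: 1, 3: 0},
    }
    print(nearest_neighbours("Björn", ratings, k=2))
```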
Get two Users and their connection
User A: Björn, User B: Christian
MATCH (a:User {public_nickname: 'Björn'}), (b:User {public_nickname: 'Christian'}), (s:Statement)
WHERE (a)-[:LIKES]->(s) OR (b)-[:LIKES]->(s)
RETURN a, b, s
Get Top-N Prediction for User with Pearson-Similarity + Weighted-Average + kNN
Find the top 5 predictions for Björn in the 5-NN
MATCH (p1:User {public_nickname: 'Björn'})-[l:LIKES]->(statement)
WITH p1, algo.similarity.asVector(statement, l.rating) AS p1Vector
MATCH (p2:User)-[l:LIKES]->(statement) WHERE p2 <> p1
WITH p1, p2, p1Vector, algo.similarity.asVector(statement, l.rating) AS p2Vector
WITH p1 AS from, p2 AS to, algo.similarity.pearson(p1Vector, p2Vector, {vectorType: "maps"}) AS similarity
ORDER BY similarity DESC LIMIT toInteger('5')
MATCH (to)-[r:LIKES]->(s:Statement) WHERE NOT EXISTS((from)-[:LIKES]->(s))
// Group by statement only, so the sums run over all neighbours that rated s.
// The denominator falls back to 1 when the absolute similarities sum to 0.
RETURN from, s, sum(similarity * r.rating) / [x IN [sum(abs(similarity)), 1] WHERE NOT x = 0][0] AS prediction
ORDER BY prediction DESC LIMIT toInteger('5')
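The weighted-average prediction can be read as: for a statement s, sum similarity times rating over the neighbours that rated s, and divide by the sum of the absolute similarities. The following Python sketch shows this under the assumption that the sums are taken per statement across all neighbours, and that the denominator falls back to 1 when it would be 0, as in the query. The function name and the (similarity, ratings) pair layout are hypothetical.

```python
def predict_weighted_average(neighbours, statement):
    """Predict a rating for statement from (similarity, ratings) pairs.

    neighbours: list of (similarity, {statement_uid: rating}) tuples,
    e.g. the k nearest neighbours of the target user.
    """
    numerator = 0.0
    denominator = 0.0
    for similarity, ratings in neighbours:
        if statement not in ratings:
            continue  # this neighbour has not rated the statement
        numerator += similarity * ratings[statement]
        denominator += abs(similarity)
    # Fall back to a denominator of 1 when the similarities sum to 0.
    return numerator / denominator if denominator != 0 else numerator


if __name__ == "__main__":
    knn = [(1.0, {7: 2}), (0.5, {7: 0}), (0.9, {8: 1})]
    print(predict_weighted_average(knn, 7))
```

In the example, only the first two neighbours rated statement 7, so the prediction is (1.0 * 2 + 0.5 * 0) / 1.5.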
Get Top-N Prediction for User with Pearson-Similarity + Mean-Centering + kNN
Find the top 5 predictions for Björn in the 5-NN
MATCH (p1:User {public_nickname: 'Björn'})-[l:LIKES]->(statement)
WITH p1, algo.similarity.asVector(statement, l.rating) AS p1Vector, avg(l.rating) as u1_avg
MATCH (p2:User)-[l:LIKES]->(statement) WHERE p2 <> p1
WITH p1, p2, u1_avg, p1Vector, algo.similarity.asVector(statement, l.rating) AS p2Vector, avg(l.rating) as u2_avg
WITH p1 AS from, p2 AS to, algo.similarity.pearson(p1Vector, p2Vector, {vectorType: "maps"}) AS similarity, u1_avg, u2_avg
ORDER BY similarity DESC LIMIT toInteger('5')
MATCH (to)-[r:LIKES]->(s:Statement) WHERE NOT EXISTS((from)-[:LIKES]->(s))
// Group by statement only, so the sums run over all neighbours that rated s.
// Each neighbour's rating is centered on that neighbour's own average before weighting.
RETURN from, s, u1_avg + sum(similarity * (r.rating - u2_avg)) / [x IN [sum(abs(similarity)), 1] WHERE NOT x = 0][0] AS prediction
ORDER BY prediction DESC LIMIT toInteger('5')
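Mean-centering compensates for users who rate generally high or low: each neighbour's rating is taken relative to that neighbour's own average, and the weighted deviation is added to the target user's average. The following is a minimal Python sketch of that idea, assuming the averages are taken over all ratings of the respective user and the sums run per statement across all neighbours. The function name and data layout are hypothetical.

```python
def predict_mean_centered(target_ratings, neighbours, statement):
    """Mean-centered kNN prediction for one statement.

    target_ratings: {statement_uid: rating} of the target user (non-empty).
    neighbours: list of (similarity, {statement_uid: rating}) tuples.
    """
    target_avg = sum(target_ratings.values()) / len(target_ratings)
    numerator = 0.0
    denominator = 0.0
    for similarity, ratings in neighbours:
        if statement not in ratings:
            continue  # this neighbour has not rated the statement
        neighbour_avg = sum(ratings.values()) / len(ratings)
        numerator += similarity * (ratings[statement] - neighbour_avg)
        denominator += abs(similarity)
    # Fall back to a denominator of 1 when the similarities sum to 0.
    deviation = numerator / denominator if denominator != 0 else numerator
    return target_avg + deviation


if __name__ == "__main__":
    target = {1: 2, 2: 0}
    knn = [(1.0, {1: 2, 2: 0, 7: 2})]
    print(predict_mean_centered(target, knn, 7))
```

When no neighbour has rated the statement, the sketch simply returns the target user's average rating.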
Get Top-N Prediction for User with Pearson-Similarity + Z-Score-Normalization + kNN
Find the top 5 predictions for Björn in the 5-NN
MATCH (p1:User {public_nickname: 'Björn'})-[l:LIKES]->(statement)
WITH p1, algo.similarity.asVector(statement, l.rating) AS p1Vector, avg(l.rating) as u1_avg, stDev(l.rating) as u1_std
MATCH (p2:User)-[l:LIKES]->(statement) WHERE p2 <> p1
WITH p1, p2, u1_avg, u1_std, p1Vector, algo.similarity.asVector(statement, l.rating) AS p2Vector, avg(l.rating) as u2_avg, stDev(l.rating) as u2_std
WITH p1 AS from, p2 AS to, algo.similarity.pearson(p1Vector, p2Vector, {vectorType: "maps"}) AS similarity, u1_avg, u1_std, u2_avg, u2_std
ORDER BY similarity DESC LIMIT toInteger('5')
MATCH (to)-[r:LIKES]->(s:Statement) WHERE NOT EXISTS((from)-[:LIKES]->(s))
// Group by statement only, so the sums run over all neighbours that rated s.
// Each neighbour's rating is converted to a z-score before weighting, then rescaled with u1_std.
RETURN from, s, u1_avg + u1_std * sum(similarity * (r.rating - u2_avg) / u2_std) / [x IN [sum(abs(similarity)), 1] WHERE NOT x = 0][0] AS prediction
ORDER BY prediction DESC LIMIT toInteger('5')
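Z-score normalization goes one step further than mean-centering: each neighbour's deviation is also divided by that neighbour's standard deviation, and the weighted result is scaled back with the target user's standard deviation. The following Python sketch illustrates this, assuming the sample standard deviation is used. Skipping neighbours whose standard deviation is 0 is an assumption of this sketch and not taken from the query. The function name and data layout are hypothetical.

```python
import statistics


def predict_z_score(target_ratings, neighbours, statement):
    """Z-score-normalized kNN prediction for one statement.

    target_ratings: {statement_uid: rating} of the target user (>= 2 entries).
    neighbours: list of (similarity, {statement_uid: rating}) tuples.
    """
    target_avg = statistics.mean(target_ratings.values())
    target_std = statistics.stdev(target_ratings.values())
    numerator = 0.0
    denominator = 0.0
    for similarity, ratings in neighbours:
        if statement not in ratings or len(ratings) < 2:
            continue  # not rated, or too few ratings for a deviation
        neighbour_avg = statistics.mean(ratings.values())
        neighbour_std = statistics.stdev(ratings.values())
        if neighbour_std == 0:
            continue  # assumption: skip neighbours with constant ratings
        z = (ratings[statement] - neighbour_avg) / neighbour_std
        numerator += similarity * z
        denominator += abs(similarity)
    # Fall back to a denominator of 1 when the similarities sum to 0.
    weighted = numerator / denominator if denominator != 0 else numerator
    return target_avg + target_std * weighted


if __name__ == "__main__":
    target = {1: 0, 2: 2}
    knn = [(1.0, {1: 0, 2: 2, 7: 2})]
    print(predict_z_score(target, knn, 7))
```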