Graph Database: A First Look at Neo4j

Time: 2021-10-03 23:38:53
  <header class="article-header">

<h1 class="article-title" itemprop="name">
Graph Database: A First Look at Neo4j
</h1> <a href="/2018/08/17/图数据库-Neo4j-初探/" class="archive-article-date">
<time datetime="2018-08-17T05:15:46.000Z" itemprop="datePublished"><i class="icon-calendar icon"></i>2018-08-17</time>

  </header>

<div class="article-entry" itemprop="articleBody">

    <p>This first look covers how to install <code>Neo4j</code> and the basic syntax of <code>Cypher</code>.</p>

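1. Installing Neo4j

  • Desktop edition

    neo4j-desktop

  • Server edition (Community)

    This is the edition I recommend, because the Desktop edition keeps crashing and also requires activation.

    • Download the Neo4j database

      neo4j-server-community

    • Download the plugin with the common algorithms

      Put the downloaded algorithm plugin into the $NEO4J_HOME/plugins folder.

    • Server edition: edit the configuration file $NEO4J_HOME/conf/neo4j.conf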

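      # Works around the "unauthorized" error at login by turning authentication off
      dbms.security.auth_enabled=false
      # Allow the downloaded algorithm plugins to run unrestricted
      dbms.security.procedures.unrestricted=apoc.*,algo.*
      apoc.import.file.enabled=true

      # Increase the page cache to at least 4G; 20G is recommended:
      dbms.memory.pagecache.size=4g
      # JVM heap: start from 1G, up to a maximum of 4G:
      dbms.memory.heap.initial_size=1g
      dbms.memory.heap.max_size=4g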
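    • Start/stop (add the server's path to the system PATH)

      # It is recommended to add the Neo4j path to the system $PATH,
      # export NEO4J_HOME="path-to-neo4j"
      $NEO4J_HOME/bin/neo4j start      # start in the background
      $NEO4J_HOME/bin/neo4j console    # or run in the foreground
      $NEO4J_HOME/bin/neo4j stop
      # -u/-p are cypher-shell options, not neo4j start options; they are only needed when authentication is enabled
      $NEO4J_HOME/bin/cypher-shell -u neo4j -p neo4j

      CALL dbms.procedures() // list the procedures available in Neo4j, including those from the plugins just installed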
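
      To check specifically whether the plugins were picked up, the procedure list can be filtered. This is a minimal sketch, assuming the 3.x dbms.procedures() call used above and the apoc./algo. name prefixes from the configuration file:

      ```cypher
      // List only the procedures contributed by the APOC and algorithm plugins
      CALL dbms.procedures() YIELD name, signature
      WHERE name STARTS WITH "apoc." OR name STARTS WITH "algo."
      RETURN name, signature
      ORDER BY name
      ```

      An empty result suggests the jar files were not found under $NEO4J_HOME/plugins.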

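2. Basic Cypher syntax

  • Node syntax

    In Cypher, a node is written as a pair of parentheses.

    • () matches any node
    • (node1) matches any node and binds it to the variable node1
    • (:Label) matches nodes with a given label
    • (person:Label) matches nodes with a given label and binds them to a variable
    • (person:Label {name:"Xiao Wang"}) matches nodes with that label whose property has the given value
    • (person:Label {name:"Xiao Wang",age:23}) several properties can be given at once; they are combined with AND
  • Relationship syntax

    A relationship is written with a pair of dashes. A directed relationship is either incoming or outgoing; if no direction is given, both directions are matched.

    • --> points to a node
    • -[role]-> binds the relationship to a variable
    • -[:acted_in]-> matches relationships of a given type
    • -[role:acted_in]-> matches relationships of a given type and binds them to a variable
    • -[role:acted_in {roles:["neo","Hadoop"]}]-> matches relationships of a given type with the given property values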
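
    Node and relationship patterns are combined into one path pattern inside a MATCH clause. The following is only a sketch: the labels Person and Movie, the type ACTED_IN and the property names are assumptions chosen for illustration.

    ```cypher
    // Find the movies a given person acted in, together with the roles played
    MATCH (person:Person {name: "Xiao Wang"})-[role:ACTED_IN]->(movie:Movie)
    RETURN person.name, role.roles, movie.title
    ```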
  • Creating/deleting nodes

    // Insert a node labelled Artist with a property Name whose value is Lady Gaga
    CREATE (a:Artist {Name:"Lady Gaga"})

    // Create and return
    CREATE (a:Artist {Name:"Lady Gaga", Gender:"Female"}) RETURN a

    // Create several at once
    CREATE (a:Album { Name: "Killers"}), (b:Album { Name: "Fear of the Dark"})
    RETURN a, b

    CREATE (a:Album { Name: "Piece of Mind"})
    CREATE (b:Album { Name: "Somewhere in Time"})
    RETURN a, b

    // Delete a node; a node that is still connected to other nodes cannot be deleted on its own
    MATCH (a:Album {Name: "Killers"}) DELETE a

    // Delete several nodes at once
    MATCH (a:Artist {Name: "Iron Maiden"}), (b:Album {Name: "Powerslave"})
    DELETE a, b

    // Delete all nodes (this fails while any relationships remain)
    MATCH (n) DELETE n
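
    Running CREATE twice with the same properties produces two separate nodes. When that is not wanted, MERGE can be used instead; the sketch below reuses the Artist example, and the choice of Name as the identifying property is an assumption.

    ```cypher
    // MERGE matches the node if it already exists and creates it otherwise
    MERGE (a:Artist {Name: "Lady Gaga"})
    ON CREATE SET a.Gender = "Female"
    RETURN a
    ```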
  • Creating/deleting relationships

    // Create a RELEASED relationship between Lady Gaga and the album Piece of Mind
    MATCH (a:Artist), (b:Album)
    WHERE a.Name = "Lady Gaga" AND b.Name = "Piece of Mind"
    CREATE (a)-[r:RELEASED]->(b)
    RETURN r

    MATCH (a:Artist), (b:Album), (p:Person)
    WHERE a.Name = "Strapping Young Lad" AND b.Name = "Heavy as a Really Heavy Thing" AND p.Name = "Devin Townsend"
    CREATE (p)-[pr:PRODUCED]->(b), (p)-[pf:PERFORMED_ON]->(b), (p)-[pl:PLAYS_IN]->(a)
    RETURN a, b, p

    // Delete specific relationships
    MATCH (:Artist)-[r:RELEASED]-(:Album)
    DELETE r

    MATCH (:Artist {Name: "Strapping Young Lad"})-[r:RELEASED]-(:Album {Name: "Heavy as a Really Heavy Thing"})
    DELETE r

    // Delete all RELEASED relationships
    MATCH ()-[r:RELEASED]-()
    DELETE r

    // Remove all nodes and relationships
    MATCH (n)
    OPTIONAL MATCH
    (n)-[r]-()
    DELETE n,r

    // Empty the whole database (DETACH DELETE removes each node together with its relationships)
    MATCH (n) DETACH DELETE n
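
    When neither endpoint exists yet, the nodes and the relationship can also be created in one statement rather than with MATCH followed by CREATE. A small sketch, reusing the artist and album names from the delete example:

    ```cypher
    // Creates both nodes and the relationship between them in a single statement
    CREATE (a:Artist {Name: "Iron Maiden"})-[r:RELEASED]->(b:Album {Name: "Powerslave"})
    RETURN a, type(r), b
    ```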

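  • Creating/dropping constraints

    Like SQL, Neo4j supports UNIQUE constraints on properties. In the 3.x line used here, uniqueness constraints can only be declared on node properties, not on relationship properties.

    CREATE CONSTRAINT ON (a:Artist) ASSERT a.Name IS UNIQUE

    DROP CONSTRAINT ON (a:Artist) ASSERT a.Name IS UNIQUE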
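
    The effect of the constraint can be seen by trying to insert a duplicate. This sketch assumes a database that contains no Artist nodes yet (creating the constraint itself fails if duplicates already exist) and that the statements are run one at a time, for example in cypher-shell:

    ```cypher
    CREATE CONSTRAINT ON (a:Artist) ASSERT a.Name IS UNIQUE;

    CREATE (:Artist {Name: "Lady Gaga"});

    // Rejected: a node with label Artist and this Name already exists
    CREATE (:Artist {Name: "Lady Gaga"});
    ```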
  • Creating/dropping indexes

    CREATE INDEX ON :Album(Name)

    // View the schema
    :schema

    DROP INDEX ON :Album(Name)
  • Updating a node/relationship

    MATCH (n:Person { name: "Andres" })
    SET n.name = "Taylor";
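
    The example above updates a node; a relationship is updated the same way once it has been bound to a variable. In this sketch the friend type is borrowed from the filtering examples, and since is a hypothetical property used only for illustration.

    ```cypher
    // Set a property on every outgoing friend relationship of Taylor
    MATCH (:Person {name: "Taylor"})-[r:friend]->(:Person)
    SET r.since = 2018
    RETURN r
    ```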
  • Filtering

    // WHERE (=~ is a regular-expression match)
    MATCH (p1: Person)-[r:friend]->(p2: Person)
    WHERE p1.name =~ "K.+" OR p2.age = 24 OR "neo" IN r.rels
    RETURN p1, r, p2

    // NOT
    MATCH (p:Person)-[:ACTED_IN]->(m)
    WHERE NOT (p)-[:DIRECTED]->()
    RETURN p, m
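
    Besides regular expressions, WHERE accepts simpler string, range and null predicates. A brief sketch; the age and nickname properties are assumptions about the data, not something the examples above define:

    ```cypher
    MATCH (p:Person)
    WHERE p.name STARTS WITH "K"
      AND p.age >= 20 AND p.age < 30
      AND p.nickname IS NOT NULL
    RETURN p.name, p.age
    ```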
  • Returning results

    MATCH (p:Person)
    RETURN p, p.name AS name, upper(p.name), coalesce(p.nickname,"n/a") AS nickname, { name: p.name, label:head(labels(p))} AS person

    MATCH (n)
    RETURN DISTINCT n.name;
  • Aggregate functions

    Cypher supports count, sum, avg, min and max.

    null values are skipped during aggregation.

    count also supports the form count(DISTINCT role).

    MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(director:Person)
    RETURN actor,director,count(*) AS collaborations

    // Collect the aggregated values into a list
    MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
    RETURN m.title AS movie, collect(a.name) AS cast, count(*) AS actors
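
    How null and DISTINCT affect the aggregates can be tried without any data in the database, by feeding a literal list through UNWIND. A minimal sketch:

    ```cypher
    UNWIND [1, 2, 2, null] AS x
    RETURN count(*) AS allRows,            // 4: counts every row, including the null one
           count(x) AS nonNull,            // 3: null is skipped
           count(DISTINCT x) AS distinctX, // 2: the values 1 and 2
           sum(x) AS total                 // 5: 1 + 2 + 2
    ```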
  • Sorting and paging

    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    RETURN a, count(*) AS appearances
    ORDER BY appearances DESC SKIP 3 LIMIT 10;
  • UNION

    MATCH (actor:Person)-[r:ACTED_IN]->(movie:Movie)
    RETURN actor.name AS name, type(r) AS acted_in, movie.title AS title
    // UNION removes duplicate rows; write UNION ALL to keep them
    UNION
    MATCH (director:Person)-[r:DIRECTED]->(movie:Movie)
    RETURN director.name AS name, type(r) AS acted_in, movie.title AS title
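
    The difference between UNION and UNION ALL is easiest to see on literal lists. A minimal sketch that needs no data in the database (row order is not guaranteed):

    ```cypher
    // UNION removes duplicates: three rows ("a", "b", "c")
    UNWIND ["a", "b"] AS name RETURN name
    UNION
    UNWIND ["b", "c"] AS name RETURN name;

    // UNION ALL keeps them: four rows ("a", "b", "b", "c")
    UNWIND ["a", "b"] AS name RETURN name
    UNION ALL
    UNWIND ["b", "c"] AS name RETURN name;
    ```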
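  • The WITH clause

    The WITH clause gives Cypher a powerful pipelining capability: the output of one query part becomes the input of the next. It is very similar to RETURN; the only difference is that every computed expression in WITH must be given an alias (plain variables such as p can be passed through as they are).

    With WITH, we can keep querying on top of an intermediate result.

    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WITH p, count(*) AS appearances, collect(m.title) AS movies
    WHERE appearances > 1
    RETURN p.name, appearances, movies

    This is somewhat like HAVING in SQL; here it is achieved by WITH and WHERE together.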
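
    WITH can also carry ORDER BY and LIMIT, so that only part of an intermediate result flows into the next MATCH. A sketch on the same Person/Movie data; the DIRECTED type comes from the earlier examples and the cut-off of 3 is arbitrary:

    ```cypher
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WITH p, count(*) AS appearances
    ORDER BY appearances DESC
    LIMIT 3
    // Only the three most frequent actors reach this second match
    OPTIONAL MATCH (p)-[:DIRECTED]->(d:Movie)
    RETURN p.name, appearances, collect(d.title) AS directed
    ```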

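  • Finding the shortest path

    // The * makes the relationship variable-length; without it only direct (one-hop) connections are considered
    MATCH (ms:Person { name: "Node A" }),(cs:Person { name:"Node B" }), p = shortestPath((ms)-[:Follow*]-(cs))
    RETURN p;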
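
    When several paths tie for the shortest length, allShortestPaths returns all of them, and an upper bound on the hop count keeps the search from wandering through a large graph. A sketch using the same two nodes; the bound of 6 is an arbitrary assumption.

    ```cypher
    MATCH (ms:Person {name: "Node A"}), (cs:Person {name: "Node B"}),
          p = allShortestPaths((ms)-[:Follow*..6]-(cs))
    RETURN length(p) AS hops, [n IN nodes(p) | n.name] AS names
    ```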
  • Loading data

    Cypher Neo4j Couldn't load the external resource

    A first look at neo4j

    When loading data stored on the local server, the prefix /path-to-neo4j/neo4j-community-3.4.5/import is automatically prepended to the path, i.e. the import directory under the server's installation path.

    // Load addresses (toInteger replaces the deprecated toInt)
    LOAD CSV WITH HEADERS FROM "file:///data/addresses.csv" AS csvLine
    CREATE (p:Person {id: toInteger(csvLine.id), email: csvLine.address })

    // Load emails
    LOAD CSV WITH HEADERS FROM "file:///data/emails.csv" AS csvLine
    CREATE (e:Email {id: toInteger(csvLine.id), time: csvLine.time, content: csvLine.content })


    // Create the sender/recipient relationships
    USING PERIODIC COMMIT 500 // commit in batches of 500 rows
    LOAD CSV WITH HEADERS FROM "file:///data/relations.csv" AS csvLine
    MATCH (p1:Person {id: toInteger(csvLine.fromId)}),(e:Email { id: toInteger(csvLine.emailId)}),(p2:Person{ id: toInteger(csvLine.toId)})
    // MERGE replaces the deprecated CREATE UNIQUE: the relationship is created only if it does not exist yet
    MERGE (p1)-[:FROM]->(e)
    CREATE (e)-[:TO]->(p2)
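
    LOAD CSV reads every field as a string, which is why the ids are converted before being stored. Before writing anything, the file can be previewed by returning a few rows instead of creating nodes. A sketch, assuming the same addresses.csv with id and address columns:

    ```cypher
    // Inspect the first rows and the converted id without modifying the graph
    LOAD CSV WITH HEADERS FROM "file:///data/addresses.csv" AS csvLine
    RETURN csvLine.id AS rawId, toInteger(csvLine.id) AS id, csvLine.address AS email
    LIMIT 5
    ```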

    To import data from elsewhere, you can use:

    LOAD CSV FROM "https://path-to-csv" AS csvLine
    CREATE (:Genre {GenreId: csvLine[0], Name: csvLine[1]})

    // Use the header row of the CSV file
    LOAD CSV WITH HEADERS FROM "https://path-to-csv" AS csvLine
    CREATE (:Genre {GenreId: csvLine.Id, Name: csvLine.Track, Length: csvLine.Length})

    // Use a custom field separator
    LOAD CSV WITH HEADERS FROM "https://path-to-csv" AS csvLine FIELDTERMINATOR ";"
    CREATE (:Genre {GenreId: csvLine.Id, Name: csvLine.Track, Length: csvLine.Length})
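  • Importing data with neo4j-import

    • Requirements
      • Neo4j must be shut down first
      • It cannot add to an existing database; it can only generate a new one
      • The input files must be in CSV format
    • Options
      • --into: the database name
      • --bad-tolerance: the number of bad entries tolerated (the import aborts once this number is exceeded); default 1000
      • --multiline-fields: whether fields may span multiple lines (so that data containing line breaks can still be read)
      • --nodes: the nodes to insert
      • --relationships: the relationships to insert
      • For more options, run bin/neo4j-import

    bin/neo4j-import --multiline-fields=true --bad-tolerance=1000000 --into graph.db --id-type string --nodes:person node.csv  --relationships:related relation_header.csv,relation.csv

    When it has finished, put the generated graph.db into data/databases, overwriting the existing database, and start Neo4j.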
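
    After restarting, a quick count shows whether the import produced what was expected. A sketch, assuming the label person and the relationship type related given on the command line above:

    ```cypher
    // Count the imported nodes with the label "person"
    MATCH (n:person) RETURN count(n) AS people;

    // Count the imported relationships of type "related"
    MATCH ()-[r:related]->() RETURN count(r) AS relations;
    ```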


</div>

Original article: https://chenson.cc/2018/08/17/图数据库-Neo4j-初探/