Neo4j: merging 2 or more duplicate nodes

Time: 2022-09-04 18:01:09

I am populating my Neo4j database manually with Cypher, which makes it easy to introduce errors such as duplicate nodes.

Each duplicate node will have its own relationships to other nodes. Is there a built-in function to merge these nodes, or should I do it manually?

It sounds possible, but complicated, with a Cypher script:

    1. Get the relationships of each duplicate node.
    2. Recreate them (with their properties) with the correct node (given node id).
    3. Remove relationships to the duplicate nodes.
    4. And finally remove the duplicate nodes.
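
The four steps above can be sketched in plain Cypher for one pair of nodes and one relationship type. This is only a sketch under assumptions: the parameters $keepId and $dupId (internal ids of the node to keep and of its duplicate) are hypothetical, and FOO stands in for whatever relationship type you actually use. The query would have to be repeated for every relationship type, and mirrored for incoming relationships.

```cypher
// Hypothetical parameters: $keepId = id of the node to keep,
// $dupId = id of its duplicate. FOO is a placeholder relationship type.
MATCH (keep), (dup)
WHERE id(keep) = $keepId AND id(dup) = $dupId

// 1. Get the outgoing FOO relationships of the duplicate node.
MATCH (dup)-[oldRel:FOO]->(target)

// 2. Recreate each one, with its properties, on the node to keep.
CREATE (keep)-[newRel:FOO]->(target)
SET newRel = properties(oldRel)

// 3. Remove the relationship to the duplicate node.
DELETE oldRel

// 4. Finally remove the duplicate node. DETACH also drops any
//    relationship of the duplicate that was not copied above.
WITH DISTINCT dup
DETACH DELETE dup
```

If the duplicate has no outgoing FOO relationship, the second MATCH returns no rows and nothing is deleted, so such a node would need a separate DETACH DELETE.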

2 answers

#1


To avoid this situation in the future, please look at the MERGE keyword in Cypher. Unfortunately, as far as I know, there is nothing in Cypher (yet) like:

MATCH (n:MyNode),(m:MyNode)
WHERE ID(n) <> ID(m) AND
PROPS(n) IN PROPS(m) AND PROPS(m) IN PROPS(n)
(...) DELETE (...)

The fictional PROPS function on the third line is not part of the Cypher language, and user-defined functions have not yet made it into Neo4j.
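
For readers on a more recent Neo4j version: as far as I am aware, Cypher now has a built-in properties() function that returns a node's properties as a map, and maps can be compared with =. Under that assumption, the idea behind the fictional query could be sketched as follows. It only lists candidate pairs and deletes nothing.

```cypher
// Sketch: list pairs of MyNode nodes whose property maps are identical.
// id(n) < id(m) reports each pair once and never pairs a node with itself.
MATCH (n:MyNode), (m:MyNode)
WHERE id(n) < id(m)
  AND properties(n) = properties(m)
RETURN id(n) AS keep, id(m) AS duplicate
```

This compares every MyNode with every other one, so it is likely to be slow on large graphs.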

If you're not working with a production instance, the easiest option is probably to back up your data folder and start the insertion over (with MERGE).
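
As a rough illustration of re-inserting with MERGE instead of CREATE, here is a minimal sketch. The label MyNode comes from this thread. The name, city and since properties and the KNOWS relationship type are made up for the example.

```cypher
// MERGE matches an existing node with this label and key property,
// and only creates one if no match is found.
MERGE (a:MyNode {name: 'Alice'})
  ON CREATE SET a.city = 'Paris'   // applied only when the node is new
MERGE (b:MyNode {name: 'Bob'})
// The relationship is merged too, so re-running the script adds no duplicate.
MERGE (a)-[k:KNOWS]->(b)
  ON CREATE SET k.since = 2014
```

MERGE matches on the whole pattern it is given. It is therefore usually safer to merge on the identifying property alone and set the remaining properties with ON CREATE SET, as done here.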

Otherwise, you can also try writing a traversal to collect the duplicates and delete them in batch (here is an example with the REST API).
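
A batch clean-up can also be sketched in Cypher itself, under two assumptions that are not part of the original answer. The first is that duplicates are nodes sharing the same value of a hypothetical name property. The second is that the APOC plugin is installed, so that its apoc.refactor.mergeNodes procedure is available. The configuration keys below are given from memory of the APOC documentation and should be checked against the version you run.

```cypher
// Group MyNode nodes by a hypothetical key property and keep only
// the groups that contain more than one node.
MATCH (n:MyNode)
WHERE n.name IS NOT NULL          // avoid grouping all key-less nodes together
WITH n.name AS key, collect(n) AS nodes
WHERE size(nodes) > 1
// APOC (assumed installed): merge every node of the list into the first one,
// moving their relationships onto it.
CALL apoc.refactor.mergeNodes(nodes, {properties: 'discard', mergeRels: true})
YIELD node
RETURN key, id(node) AS survivor
```

Run it on a backup first, since the merge cannot be undone once committed.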

#2


Try this:

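// n is the node to keep, m is the duplicate, o is the node the relationship should point to
MATCH (n:MyNode),(m:MyNode),(o:OtherNode {id:123})
WHERE n <> m
// every FOO relationship leaving the duplicate
MATCH (m)-[r:FOO]->()
// re-create it from n to o and copy its properties
CREATE (n)-[r2:FOO]->(o)
SET r2 = r
// drop the old relationship and the duplicate node
DELETE r,m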
