I am trying to import a huge amount of data into neo4j from a file. I am using node.js code, simple JavaScript without much complexity.
The thing is that I have 386213 lines or 'nodes' to introduce, but when it executes (after waiting 3 hours) I only see roughly half of them. I think some of the queries get lost along the way, but I do not know why...
I am using the npm node-neo4j package for the connection and so on.
Here is my node.js code:
var neo4j = require('neo4j');
var readline = require('readline');
var fs = require('fs');

var db = new neo4j.GraphDatabase('http://neo4j:Gemitis26@localhost:7474');

var rl = readline.createInterface({
    input: fs.createReadStream('C:/Users/RRamos/Documents/Projects/test-neo4j/Files/kaggle_songs.txt')
});

var i = 1;
rl.on('line', function (line) {
    var str = line.split(" ");
    // Fire one CREATE query per input line. Note: parameters must not be
    // wrapped in quotes, otherwise '{line1}' is stored as a literal string.
    db.cypher({
        query: "CREATE (:Song {id: {line1}, num_id: {line2}})",
        params: {
            line1: str[0],
            line2: str[1],
        },
    }, callback);
    console.log(i + " " + "CREATE (:Song {id: '" + str[0] + "', num_id: " + str[1] + "})");
    i = i + 1;
});

function callback(err, results) {
    if (err) throw err;
}
1 Answer
#1
Making 386213 separate Cypher REST queries (in separate transactions) is probably the slowest possible way to create such a large number of nodes.
There are at least 3 better ways (in order of increasing performance):
1. Create multiple nodes at a time by sending an array containing the data for multiple nodes as a single parameter. For example, you can create 8 nodes by sending this array parameter:

[['a', 1],['b', 2],['c', 3],['d', 4],['e', 5],['f', 6],['g', 7],['h', 8]]

and using this query:

UNWIND {data} AS d CREATE (:Song {id: d[0], num_id: d[1]})
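As a rough node.js sketch of this batching approach (assuming the same node-neo4j db.cypher callback API used in your question; the BATCH_SIZE value and the pause/resume backpressure are my own choices, not part of the package):

var neo4j = require('neo4j');
var readline = require('readline');
var fs = require('fs');

var db = new neo4j.GraphDatabase('http://neo4j:Gemitis26@localhost:7474');
var BATCH_SIZE = 1000; // arbitrary; tune for your setup

var batch = []; // accumulates [id, num_id] pairs for the current batch

var rl = readline.createInterface({
    input: fs.createReadStream('C:/Users/RRamos/Documents/Projects/test-neo4j/Files/kaggle_songs.txt')
});

// Send all accumulated rows as one UNWIND ... CREATE query.
function flush(done) {
    var data = batch;
    batch = [];
    if (data.length === 0) return done();
    db.cypher({
        query: 'UNWIND {data} AS d CREATE (:Song {id: d[0], num_id: d[1]})',
        params: { data: data },
    }, function (err) {
        if (err) throw err;
        done();
    });
}

rl.on('line', function (line) {
    var str = line.split(' ');
    batch.push([str[0], str[1]]);
    if (batch.length >= BATCH_SIZE) {
        rl.pause(); // stop reading until this batch has been written
        flush(function () { rl.resume(); });
    }
});

rl.on('close', function () {
    flush(function () { console.log('import finished'); }); // final partial batch
});

Pausing the stream between batches also addresses the "lost queries" symptom: instead of firing 386213 concurrent HTTP requests at the server, at most one batch is in flight at a time.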
2. You can use the LOAD CSV clause to create the nodes. Since your input file seems to use a space to separate node property values, this might work for you:

LOAD CSV FROM 'file:///C:/Users/RRamos/Documents/Projects/test-neo4j/Files/kaggle_songs.txt' AS line FIELDTERMINATOR ' ' CREATE (:Song {id: line[0], num_id: line[1]})
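With a file of this size it is usually also worth splitting the LOAD CSV into multiple transactions. In the Neo4j versions that use the {param} syntax shown above, that is done by prefixing the query with USING PERIODIC COMMIT; the 10000 rows per commit below is an arbitrary choice:

USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///C:/Users/RRamos/Documents/Projects/test-neo4j/Files/kaggle_songs.txt' AS line FIELDTERMINATOR ' '
CREATE (:Song {id: line[0], num_id: line[1]})

Note that, depending on your server configuration, file:/// URLs may be resolved relative to the database's import directory rather than as absolute paths.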
3. For even better performance, you could use the Import tool, which is a command line tool for initializing a new DB.
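As a rough illustration only (the tool name and flags vary between Neo4j releases, so treat this as a hypothetical invocation and check the manual for your version), the 3.x batch importer runs against a fresh, empty database along these lines:

bin/neo4j-import --into data/databases/graph.db --delimiter " " --nodes:Song songs.csv

where songs.csv would need a header row such as "id:ID num_id:int" so the importer knows the column names and types.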