Neo4j performance compared to MySQL (how can it be improved?)

Date: 2022-12-12 18:03:04

This is a follow-up to "Can't reproduce/verify the performance claims in Graph Databases and Neo4j in Action books". I have updated the setup and tests, and did not want to change the original question too much.

The whole story (including scripts etc) is on https://baach.de/Members/jhb/neo4j-performance-compared-to-mysql

Short version: while trying to verify the performance claims made in the 'Graph Databases' book, I got the following results (querying a random dataset of n people, with 50 friends each):

My results for 100k people

depth    neo4j             mysql       python

1        0.010             0.000        0.000
2        0.018             0.001        0.000
3        0.538             0.072        0.009
4       22.544             3.600        0.330
5     1269.942           180.143        0.758

"*": single run only

My results for 1 million people

depth    neo4j             mysql       python

1        0.010             0.000        0.000
2        0.018             0.002        0.000
3        0.689             0.082        0.012
4       30.057             5.598        1.079
5     1441.397*          300.000        9.791

"*": single run only

Using Neo4j 1.9.2 on 64-bit Ubuntu, I have set up neo4j.properties with these values:

neostore.nodestore.db.mapped_memory=250M
neostore.relationshipstore.db.mapped_memory=2048M

and neo4j-wrapper.conf with:

wrapper.java.initmemory=1024
wrapper.java.maxmemory=8192
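
As a rough sanity check on these settings, the store sizes for the 1 million person dataset can be estimated. This is only a sketch under stated assumptions: it assumes fixed record sizes of 9 bytes per node and 33 bytes per relationship (figures commonly quoted for the 1.x store format, not taken from this post), and it ignores the property and index stores entirely. The class and method names are made up for illustration.

```java
// Back-of-the-envelope store sizes for the 1 million person dataset.
// ASSUMPTION: 9 bytes per node record and 33 bytes per relationship record.
// Property and index stores are deliberately left out.
final class StoreSizeEstimate {
    static final long NODE_RECORD_BYTES = 9;          // assumed record size
    static final long RELATIONSHIP_RECORD_BYTES = 33; // assumed record size

    // Bytes needed to hold one node record per person.
    static long nodeStoreBytes(long people) {
        return people * NODE_RECORD_BYTES;
    }

    // Bytes needed to hold one relationship record per friend edge.
    static long relationshipStoreBytes(long people, long friendsPerPerson) {
        return people * friendsPerPerson * RELATIONSHIP_RECORD_BYTES;
    }

    // Whole mebibytes, rounded down.
    static long toMebibytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long people = 1_000_000L;
        long friendsPerPerson = 50L;
        long nodes = nodeStoreBytes(people);
        long rels = relationshipStoreBytes(people, friendsPerPerson);
        System.out.printf("node store:         %d bytes (~%d MiB)%n", nodes, toMebibytes(nodes));
        System.out.printf("relationship store: %d bytes (~%d MiB)%n", rels, toMebibytes(rels));
    }
}
```

Under those assumed record sizes the relationship store comes to roughly 1573 MiB and the node store to roughly 8 MiB, so both would fit inside the mapped memory configured above. If the real record sizes differ, the numbers change accordingly.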

My query to neo4j looks like this (using the REST api):

start person=node:node_auto_index(noscenda_name="person123") match (person)-[:friend]->()-[:friend]->(friend) return count(distinct friend);

The node_auto_index is in place, obviously.
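
For reference, the quantity the query asks for (the number of distinct people reached by following exactly n friend edges) can be sketched in plain memory. This is a minimal illustration, not the script behind the "python" column: the graph generator, class name and method names are hypothetical, and Cypher's per-path relationship uniqueness is not modelled.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// In-memory sketch of "count distinct friends at depth n".
// All names here are illustrative and not taken from the original scripts.
final class FriendsAtDepth {

    // Builds a random directed graph in which every person has
    // `friendsEach` distinct friends, none of them themselves.
    static int[][] randomGraph(int people, int friendsEach, long seed) {
        if (friendsEach >= people) {
            throw new IllegalArgumentException("friendsEach must be smaller than people");
        }
        Random random = new Random(seed);
        int[][] friends = new int[people][];
        for (int person = 0; person < people; person++) {
            Set<Integer> chosen = new HashSet<>();
            while (chosen.size() < friendsEach) {
                int candidate = random.nextInt(people);
                if (candidate != person) {
                    chosen.add(candidate);
                }
            }
            friends[person] = chosen.stream().mapToInt(Integer::intValue).toArray();
        }
        return friends;
    }

    // Expands the frontier one level at a time; after `depth` levels the
    // frontier holds every distinct person reachable in exactly `depth` hops.
    static int countAtDepth(int[][] friends, int start, int depth) {
        Set<Integer> frontier = new HashSet<>();
        frontier.add(start);
        for (int level = 0; level < depth; level++) {
            Set<Integer> next = new HashSet<>();
            for (int person : frontier) {
                for (int friend : friends[person]) {
                    next.add(friend);
                }
            }
            frontier = next;
        }
        return frontier.size();
    }

    public static void main(String[] args) {
        int[][] graph = randomGraph(10_000, 50, 42L);
        for (int depth = 1; depth <= 4; depth++) {
            long start = System.nanoTime();
            int count = countAtDepth(graph, 0, depth);
            double millis = (System.nanoTime() - start) / 1_000_000.0;
            System.out.printf("depth %d: %d distinct friends in %.3f ms%n", depth, count, millis);
        }
    }
}
```

Because each level deduplicates into a set, the work per level is bounded by the number of people times 50, which is one plausible reason an in-memory approach stays fast at depth 4 and 5 while a path-enumerating query does not.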

Is there anything I can do to speed Neo4j up (to be faster than MySQL)?

There is also another benchmark in * with the same problem.

2 answers

#1


4  

I'm sorry you can't reproduce the results. However, on a MacBook Air (1.8 GHz i7, 4 GB RAM) with a 2 GB heap and the GCR cache, with no cache warming and no other tuning, and with a similarly sized dataset (1 million users, 50 friends per person), I repeatedly get approx 900 ms using the Traversal Framework on 1.9.2:

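// Imports assume the Neo4j 1.9.x package layout.
import static org.neo4j.graphdb.DynamicRelationshipType.withName;
import static org.neo4j.helpers.collection.IteratorUtil.count;

import java.util.Iterator;

import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Path;
import org.neo4j.graphdb.index.Index;
import org.neo4j.graphdb.traversal.Evaluation;
import org.neo4j.graphdb.traversal.Evaluator;
import org.neo4j.graphdb.traversal.TraversalDescription;
import org.neo4j.kernel.Traversal;
import org.neo4j.kernel.Uniqueness;

public class FriendOfAFriendDepth4
{
    // Depth-first traversal over outgoing FRIEND relationships that visits
    // each node at most once across the whole traversal.
    private static final TraversalDescription traversalDescription =
         Traversal.description()
            .depthFirst()
            .uniqueness( Uniqueness.NODE_GLOBAL )
            .relationships( withName( "FRIEND" ), Direction.OUTGOING )
            .evaluator( new Evaluator()
            {
                @Override
                public Evaluation evaluate( Path path )
                {
                    // At depth 4, return the path and stop expanding it.
                    if ( path.length() >= 4 )
                    {
                        return Evaluation.INCLUDE_AND_PRUNE;
                    }
                    // Shorter paths are not returned, but are expanded further.
                    return Evaluation.EXCLUDE_AND_CONTINUE;
                }
            } );

    private final Index<Node> userIndex;

    public FriendOfAFriendDepth4( GraphDatabaseService db )
    {
        this.userIndex = db.index().forNodes( "user" );
    }

    // Returns the depth-4 paths starting at the user with the given name.
    public Iterator<Path> getFriends( String name )
    {
        return traversalDescription.traverse(
            userIndex.get( "name", name ).getSingle() )
                .iterator();
    }

    // Counts the end nodes of those depth-4 paths.
    public int countFriends( String name )
    {
        return count( traversalDescription.traverse(
            userIndex.get( "name", name ).getSingle() )
                .nodes().iterator() );
    }
}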

Cypher is slower, but nowhere near as slow as you suggest, at approx 3 seconds:

START person=node:user(name={name})
MATCH (person)-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->(friend)
RETURN count(friend)

Kind regards

ian

#2


3  

Yes, I believe the REST API is significantly slower than the regular bindings and therein lies your performance problem.
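
One way to test this would be to time the same query through each access path with an identical harness, discarding warm-up runs and comparing medians rather than single runs. The following is only a sketch of such a harness: `QueryTimer` and its methods are hypothetical names, and the `Supplier` in `main` is a stand-in workload, not a REST call or an embedded traversal.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Minimal timing harness: warm up, then time repeated runs.
// The query is passed in as a Supplier so that a REST call and an
// embedded call can be measured the same way (both are left to the reader).
final class QueryTimer {

    // Runs `warmups` untimed iterations, then `runs` timed ones.
    // Returns the timed durations in milliseconds, sorted ascending.
    static double[] timeMillis(Supplier<?> query, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            query.get();
        }
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.get();
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        Arrays.sort(millis);
        return millis;
    }

    // Median of an already sorted, non-empty array.
    static double median(double[] sorted) {
        int n = sorted.length;
        return n % 2 == 1
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the real query under test.
        Supplier<Long> standIn = () -> {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {
                sum += i % 7;
            }
            return sum;
        };
        double[] timings = timeMillis(standIn, 3, 7);
        System.out.printf("min %.3f ms, median %.3f ms, max %.3f ms%n",
                timings[0], median(timings), timings[timings.length - 1]);
    }
}
```

If the median through REST is much larger than the median through the embedded API for the same query and dataset, that would support the explanation above. If the two are close, the difference lies elsewhere.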