Stress Testing Cassandra 3.7 with YCSB

Date: 2022-09-03 04:50:32
1. Get the source code from https://github.com/brianfrankcooper/YCSB/ (with Git or as a direct download; my Git clone stalled at 82%, so I downloaded the zip file instead)


2. Compile the source code
Build the whole project: mvn clean package
Build only the Cassandra binding: mvn -pl com.yahoo.ycsb:cassandra-binding -am clean package
I built only the Cassandra binding, which is faster.


3. After a successful build


If the build output shows Cassandra 2.1+ DB Binding .... SUCCESS, the Cassandra binding compiled successfully. The other modules can be ignored; just pick up /home/yc/YCSB-master/cassandra/target/ycsb-cassandra-binding-0.11.0-SNAPSHOT.tar.gz.

#mkdir /usr/local/ycsb/
#cp ycsb-cassandra-binding-0.11.0-SNAPSHOT.tar.gz /usr/local/ycsb/
#cd /usr/local/ycsb/
#tar -zxvf ycsb-cassandra-binding-0.11.0-SNAPSHOT.tar.gz
#cd ycsb-cassandra-binding-0.11.0-SNAPSHOT


4. Create the keyspace and column family with Cassandra's cqlsh

(1) Create the keyspace:

create keyspace usertable with replication = {'class':'SimpleStrategy', 'replication_factor':1};


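(2) Create the table, i.e. the column family (qualified with the keyspace name so that it is created inside the usertable keyspace):

create table usertable.usertable (y_id varchar primary key,field0 varchar,field1 varchar,field2 varchar,field3 varchar,field4 varchar,field5 varchar,field6 varchar,field7 varchar,field8 varchar,field9 varchar);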
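
The table needs one column per YCSB field (field0 through field9 for the default fieldcount of 10). If fieldcount is changed, the statement has to change with it. A minimal sketch of generating it follows; the function name and its parameters are made up for illustration, and the table name is qualified with the keyspace so the statement does not depend on a prior USE.

```python
def create_usertable_cql(field_count=10, keyspace="usertable", table="usertable"):
    """Build a CREATE TABLE statement with y_id plus field0..field{N-1}.

    Hypothetical helper: the names mirror the statement in this post.
    """
    columns = ["y_id varchar primary key"]
    columns += [f"field{i} varchar" for i in range(field_count)]
    return f"create table {keyspace}.{table} ({', '.join(columns)});"


if __name__ == "__main__":
    # Paste the printed statement into cqlsh.
    print(create_usertable_cql(10))
```
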


5. Check the ycsb command syntax

#cd bin
#./ycsb
usage: ./ycsb command database [options]


Commands:
load Execute the load phase
run Execute the transaction phase
shell Interactive mode


Databases:
accumulo https://github.com/brianfrankcooper/YCSB/tree/master/accumulo
aerospike https://github.com/brianfrankcooper/YCSB/tree/master/aerospike
arangodb https://github.com/brianfrankcooper/YCSB/tree/master/arangodb
asynchbase https://github.com/brianfrankcooper/YCSB/tree/master/asynchbase
basic https://github.com/brianfrankcooper/YCSB/tree/master/basic
cassandra-cql https://github.com/brianfrankcooper/YCSB/tree/master/cassandra
cassandra2-cql https://github.com/brianfrankcooper/YCSB/tree/master/cassandra2
couchbase https://github.com/brianfrankcooper/YCSB/tree/master/couchbase
couchbase2 https://github.com/brianfrankcooper/YCSB/tree/master/couchbase2
dynamodb https://github.com/brianfrankcooper/YCSB/tree/master/dynamodb
elasticsearch https://github.com/brianfrankcooper/YCSB/tree/master/elasticsearch
geode https://github.com/brianfrankcooper/YCSB/tree/master/geode
googlebigtable https://github.com/brianfrankcooper/YCSB/tree/master/googlebigtable
googledatastore https://github.com/brianfrankcooper/YCSB/tree/master/googledatastore
hbase094 https://github.com/brianfrankcooper/YCSB/tree/master/hbase094
hbase098 https://github.com/brianfrankcooper/YCSB/tree/master/hbase098
hbase10 https://github.com/brianfrankcooper/YCSB/tree/master/hbase10
hypertable https://github.com/brianfrankcooper/YCSB/tree/master/hypertable
infinispan https://github.com/brianfrankcooper/YCSB/tree/master/infinispan
infinispan-cs https://github.com/brianfrankcooper/YCSB/tree/master/infinispan
jdbc https://github.com/brianfrankcooper/YCSB/tree/master/jdbc
kudu https://github.com/brianfrankcooper/YCSB/tree/master/kudu
mapkeeper https://github.com/brianfrankcooper/YCSB/tree/master/mapkeeper
memcached https://github.com/brianfrankcooper/YCSB/tree/master/memcached
mongodb https://github.com/brianfrankcooper/YCSB/tree/master/mongodb
mongodb-async https://github.com/brianfrankcooper/YCSB/tree/master/mongodb
nosqldb https://github.com/brianfrankcooper/YCSB/tree/master/nosqldb
orientdb https://github.com/brianfrankcooper/YCSB/tree/master/orientdb
rados https://github.com/brianfrankcooper/YCSB/tree/master/rados
redis https://github.com/brianfrankcooper/YCSB/tree/master/redis
riak https://github.com/brianfrankcooper/YCSB/tree/master/riak
s3 https://github.com/brianfrankcooper/YCSB/tree/master/s3
solr https://github.com/brianfrankcooper/YCSB/tree/master/solr
tarantool https://github.com/brianfrankcooper/YCSB/tree/master/tarantool
voldemort https://github.com/brianfrankcooper/YCSB/tree/master/voldemort


Options:
-P file Specify workload file
-cp path Additional Java classpath entries
-jvm-args args Additional arguments to the JVM
-p key=value Override workload property
-s Print status to stderr
-target n Target ops/sec (default: unthrottled)
-threads n Number of client threads (default: 1)


Workload Files:
There are various predefined workloads under workloads/ directory.
See https://github.com/brianfrankcooper/YCSB/wiki/Core-Properties
for the list of workload properties.
ycsb: error: too few arguments

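As the usage text shows, -P loads configuration files, -p sets individual properties as key=value pairs, -s prints status information at regular intervals, and -threads sets the number of client threads.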
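
To show how these options combine, here is a small sketch that assembles a ycsb command line. The helper name ycsb_command and its parameters are made up for illustration; only the flags themselves (-P, -p, -s, -threads) come from the usage text above, and the file names are the ones used later in this post.

```python
import shlex


def ycsb_command(phase, binding, property_files, overrides=None, threads=1, status=True):
    """Assemble an argument list for ./bin/ycsb (hypothetical helper)."""
    if phase not in ("load", "run", "shell"):
        raise ValueError(f"unknown ycsb command: {phase}")
    cmd = ["./bin/ycsb", phase, binding]
    for path in property_files:
        cmd += ["-P", path]                 # each -P loads one properties file
    for key, value in (overrides or {}).items():
        cmd += ["-p", f"{key}={value}"]     # each -p overrides a single property
    if status:
        cmd.append("-s")                    # periodic status output on stderr
    cmd += ["-threads", str(threads)]
    return cmd


if __name__ == "__main__":
    cmd = ycsb_command(
        "load",
        "cassandra-cql",
        ["workloads/workloada", "cassandra.properties"],
        overrides={"columnfamily": "usertable"},
        threads=20,
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```
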



6. Create the Cassandra connection file (the available properties can be looked up in the source file /home/yc/YCSB-master/cassandra/src/main/java/com/yahoo/ycsb/db/CassandraCQLClient.java)

#vim cassandra.properties


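# Comments go on their own lines: in a Java properties file an inline # becomes part of the value
# Comma-separated list of hosts
hosts = spark131,spark130,spark129
port = 9042
# Keyspace used for the test
cassandra.keyspace = usertable
# Cassandra user name
cassandra.username = ershixiong
# Cassandra password
cassandra.password = 111111
# Cassandra accepts ANY only for writes, so reads use ONE
cassandra.readconsistencylevel = ONE
cassandra.writeconsistencylevel = ANY
cassandra.maxconnections = 100
cassandra.connecttimeoutmillis = 1000000000
cassandra.readtimeoutmillis = 1000000000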
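
One pitfall with this file: in the Java properties format, # starts a comment only at the beginning of a line, so a trailing "# ..." after a value is read as part of the value. The sketch below is a simplified Python approximation of that behaviour (it is not the real java.util.Properties parser, and it assumes YCSB loads -P files in that format); parse_properties is a name made up for illustration.

```python
def parse_properties(text):
    """Simplified key=value parser mimicking Java properties comment rules.

    Approximation only: handles '=' separators and full-line comments,
    not escapes, ':' separators, or line continuations.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue                      # blank line or full-line comment
        key, sep, value = line.partition("=")
        if not sep:
            continue                      # no '=' on this line; skipped in this sketch
        props[key.strip()] = value.strip()
    return props


if __name__ == "__main__":
    sample = "# full-line comment\nport = 9042\nhosts = spark131,spark130 # inline note\n"
    for key, value in parse_properties(sample).items():
        print(repr(key), "->", repr(value))
```

Running it shows that the inline note ends up inside the hosts value, which is why comments are safer on their own lines.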



7. Configure the workload

#vim workloads/workloada
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=false
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
requestdistribution=zipfian
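fieldcount: the number of fields in each record; default 10.
fieldlength: the length of each field's value; default 100.
readallfields: whether a read fetches all fields of a record; true or false.
readproportion,
updateproportion,
scanproportion,
insertproportion: the fractions of read, update, scan, and insert operations in the workload; the four values sum to 1.
requestdistribution: how requests are distributed over the records; uniform, zipfian, and latest are currently supported; default uniform.
maxscanlength: applies to scans; the maximum number of records per scan; default 1000.
scanlengthdistribution: also applies to scans; the distribution of scan lengths; default uniform.
insertorder: either ordered or hashed; default hashed.
operationcount: the total number of operations.
maxexecutiontime: the maximum execution time of the workload, in seconds.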
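AverageLatency(us): in the YCSB result output, the average latency per operation of a given type (READ, UPDATE, INSERT, and so on) as measured by the YCSB client, in microseconds. Lower is better: a lower value means each request waits less time for the database to respond.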
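
The rule above that the four proportions sum to 1 is easy to check before starting a long run. A minimal sketch, assuming the workload file uses plain key=value lines as in workloada; the function names are made up for illustration.

```python
PROPORTION_KEYS = (
    "readproportion",
    "updateproportion",
    "scanproportion",
    "insertproportion",
)


def read_proportions(workload_text):
    """Collect the four proportion settings from workload file text."""
    values = {}
    for line in workload_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() in PROPORTION_KEYS:
            values[key.strip()] = float(value.strip())
    return values


def proportions_sum_to_one(values, tolerance=1e-9):
    """True when the collected proportions add up to 1 within a tolerance."""
    return abs(sum(values.values()) - 1.0) <= tolerance


if __name__ == "__main__":
    workloada = (
        "readproportion=0.5\n"
        "updateproportion=0.5\n"
        "scanproportion=0\n"
        "insertproportion=0\n"
    )
    values = read_proportions(workloada)
    print(values, proportions_sum_to_one(values))
```
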


8. Run the load phase

#cd ..
#./bin/ycsb load cassandra-cql -P workloads/workloada -P cassandra.properties -p columnfamily=usertable -s -threads 20 > ./write7read3-log.log



The database binding name is cassandra-cql
YCSB workload configuration: workloads/workloada
Cassandra configuration: cassandra.properties
The column family name is usertable
The load runs with 20 threads
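
The redirected log ends with summary lines of the form "[SECTION], Metric, value", for example "[INSERT], AverageLatency(us), ...". A minimal sketch for pulling those numbers out of the log follows; parse_ycsb_summary is a name made up for illustration, and the sample values in the demo are invented placeholders, not measurements.

```python
def parse_ycsb_summary(log_text):
    """Map (section, metric) -> float for lines like '[READ], AverageLatency(us), 123.4'."""
    metrics = {}
    for line in log_text.splitlines():
        if not line.startswith("["):
            continue                       # not a summary line
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3:
            continue
        section, name, value = parts
        try:
            metrics[(section.strip("[]"), name)] = float(value)
        except ValueError:
            continue                       # value is not numeric; skip it
    return metrics


if __name__ == "__main__":
    # Placeholder text in the shape of YCSB output; the numbers are invented.
    sample = (
        "[OVERALL], RunTime(ms), 2000.0\n"
        "[OVERALL], Throughput(ops/sec), 500.0\n"
        "[INSERT], AverageLatency(us), 512.0\n"
    )
    for (section, name), value in parse_ycsb_summary(sample).items():
        print(section, name, value)
```

To use it on a real run, read ./write7read3-log.log into a string and pass it to the function.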