当操作的数据量很大时,相比于单条更新/插入,批量更新/插入数据能显著减少操作时间。本文通过几个简单的例子介绍如何用executeBatch进行批量操作,以及在批量操作中常见的几个坑。
首先在数据库中新建一张表用于测试:
create table stuInfo(
id INT(11) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
stu_name varchar(30) NOT NULL COMMENT "学生姓名",
stu_age tinyint(4) NOT NULL COMMENT "学生年龄",
score float DEFAULT 0 COMMENT "学生成绩"
);
随机生成100000条数据,分别用批量操作和单条数据操作进行插入,比较两种方法耗费的时间。代码如下(ConnectMysql 类的代码在文章最后给出):
import java.util.Random;
/**
* Created by Alex on 2016/10/16.
*/
public class TestMysalExecuteBatch {
private ConnectMysql sql = new ConnectMysql();
/**
* 数据库插入数据测试
*/
public void testInsert() throws Exception {
sql.initConnection("localhost:3306", "knowledge", "root", "*******"); //连接数据库
String sqlStr = "insert into stuInfo(stu_name, stu_age, score) values(?,?,?)";
sql.pst = sql.conn.prepareStatement(sqlStr);
String[] familyNames = {"赵", "钱", "孙", "李", "周", "吴", "郑", "王", "冯"};
String[] givenNames = {"计", "算", "机", "科", "学", "与", "技", "术"};
Random random = new Random();
int i = 0;
long startTime = System.currentTimeMillis(); //记录开始时间
while (i < 100000) {
i++;
String familyName = familyNames[random.nextInt(familyNames.length)];
String givenName = givenNames[random.nextInt(givenNames.length)];
int age = random.nextInt(18);
float score = random.nextFloat();
sql.pst.setString(1, familyName + givenName);//设置姓名
sql.pst.setInt(2, age);//设置年龄
sql.pst.setFloat(3, score);//设置分数
int result = sql.pst.executeUpdate();//插入一条数据
}
long endTime = System.currentTimeMillis(); //记录结束时间
System.out.println("testInsert cost time:" + (endTime - startTime) + "ms");
sql.closeConnection(); //断开数据库连接
}
/**
* 数据库批量插入数据测试
*/
public void testExecuteBatch() throws Exception {
sql.initConnection("localhost:3306", "knowledge", "root", "*******"); //连接数据库
String sqlStr = "insert into stuInfo(stu_name, stu_age, score) values(?,?,?)";
sql.pst = sql.conn.prepareStatement(sqlStr); //注意,prepareStatement放在循环之外
sql.conn.setAutoCommit(false); //很重要,取消自动提交
String[] familyNames = {"赵", "钱", "孙", "李", "周", "吴", "郑", "王", "冯"};
String[] givenNames = {"计", "算", "机", "科", "学", "与", "技", "术"};
Random random = new Random();
int count = 0;
int i = 0;
long startTime = System.currentTimeMillis();//记录开始时间
while (i < 100000) {
i++;
String familyName = familyNames[random.nextInt(familyNames.length)];
String givenName = givenNames[random.nextInt(givenNames.length)];
int age = random.nextInt(18);
float score = random.nextFloat();
sql.pst.setString(1, familyName + givenName);//设置姓名
sql.pst.setInt(2, age);//设置年龄
sql.pst.setFloat(3, score);//设置分数
sql.pst.addBatch();// 该条数据加入到批处理数据集中
count ++;
if(count>=10000){
try {
//批量操作,返回结果result数组保存每条语句更新的数据数量
int[] result = sql.pst.executeBatch();
sql.conn.commit();
count = 0;
System.out.println("testExecuteBatch>> executeBatch length of result is:" + result.length);
}catch(Exception e){
sql.conn.rollback(); //出错的时候回滚
count = 0;
System.out.println("testExecuteBatch>> executeBatch error");
e.printStackTrace();
}
}
}
long endTime = System.currentTimeMillis();//记录结束时间
System.out.println("testExecuteBatch cost time:" + (endTime - startTime) + "ms");
sql.closeConnection(); //断开数据库连接
}
public static void main(String[] args) throws Exception {
TestMysalExecuteBatch testMysalExecuteBatch = new TestMysalExecuteBatch();
testMysalExecuteBatch.testExecuteBatch();
testMysalExecuteBatch.testInsert();
}
}
测试结果:
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch>> executeBatch length of result is:10000
testExecuteBatch cost time:25180ms
testInsert cost time:3798286ms
可以发现批量数据操作比单条数据操作要快很多。在进行批量数据操作的时候,有几个点需要注意:
1. 首先要取消自动提交,从而更新语句能保存在批量更新数据集中等待批量更新executeBatch(),否则更新操作语句会自动提交执行。
sql.conn.setAutoCommit(false);
2. prepareStatement要放在循环之外,否则只会在批量操作集中加入最后的数据。
sql.pst = sql.conn.prepareStatement(sqlStr);
比如上面的例子,把testExecuteBatch函数修改成:
/**
* 数据库批量插入数据测试
*/
public void testExecuteBatch() throws Exception {
sql.initConnection("localhost:3306", "knowledge", "root", "045116"); //连接数据库
String sqlStr = "insert into stuInfo(stu_name, stu_age, score) values(?,?,?)";
// sql.pst = sql.conn.prepareStatement(sqlStr);//注意,prepareStatement放在循环之外
sql.conn.setAutoCommit(false); //很重要,取消自动提交
String[] familyNames = {"赵", "钱", "孙", "李", "周", "吴", "郑", "王", "冯"};
String[] givenNames = {"计", "算", "机", "科", "学", "与", "技", "术"};
Random random = new Random();
int count = 0;
int i = 0;
long startTime = System.currentTimeMillis();
while (i < 100000) {
i++;
String familyName = familyNames[random.nextInt(familyNames.length)];
String givenName = givenNames[random.nextInt(givenNames.length)];
int age = random.nextInt(18);
float score = random.nextFloat();
sql.pst = sql.conn.prepareStatement(sqlStr);//prepareStatement放在循环内进行测试
sql.pst.setString(1, familyName + givenName); //设置姓名
sql.pst.setInt(2, age); //设置年龄
sql.pst.setFloat(3, score); //设置分数
sql.pst.addBatch(); // 该条数据加入到批处理数据集中
count ++;
if(count>=10000){
try {
//批量操作,返回结果result数组保存每条语句更新的数据数量
int[] result = sql.pst.executeBatch();
sql.conn.commit();
count = 0;
System.out.println("testExecuteBatch>> executeBatch length of result is:" + result.length);
}catch(Exception e){
sql.conn.rollback(); //出错的时候回滚
count = 0;
System.out.println("testExecuteBatch>> executeBatch error");
e.printStackTrace();
}
}
}
long endTime = System.currentTimeMillis();
System.out.println("testExecuteBatch cost time:" + (endTime - startTime) + "ms");
sql.closeConnection(); //断开数据库连接
}
运行结果为:
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch>> executeBatch length of result is:1
testExecuteBatch cost time:2204ms
可以发现每次executeBatch都只执行了一条插入语句,因为每次调用prepareStatement都会把statement的Batch内的数据刷新掉。
3.select操作可以通过sql语句实现批量获取,所以没有专门为select语句提供批量操作的接口,例如: select * from stuInfo where stu_name in (张三,李四,王五)
附加:
ConnectMysql类简单的对数据库连接及断开连接做了一层包装,代码如下:
public class ConnectMysqlSout {
public final String driver = "com.mysql.jdbc.Driver";
public Connection conn = null;
public PreparedStatement pst = null;
public void initConnection(String hostID,String database, String username, String password){
try {
Class.forName(driver);//指定连接类型
conn = DriverManager.getConnection("jdbc:mysql://"+hostID+"/"+database+"?user="+username+"&password="+password);
System.out.println("[ConnectMysql]-->initialize connection succeed.");
}
catch (Exception e) {
System.out.println("[ConnectMysql]-->initialize connection error.");
e.printStackTrace();
}
}
public void closeConnection(){
try {
if(null!= conn) conn.close();
if(null!= pst) pst.close();
System.out.println("[ConnectMysql]-->mysql connection closed.");
} catch (SQLException e) {
System.out.println("[ConnectMysql]-->close connection error.");
e.printStackTrace();
}
}
}