check1<-rimpala.query("select * from sum2")
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.sql.SQLException: Method not supported
dim(sum2) is 49501 rows and 18 columns.
check1<-rimpala.query("select * from sum3")
dim(sum3) is 102 rows and 6 columns.
It worked with the smaller table.
Sorry that I can't provide a reproducible example. Has anyone encountered the same problem with larger data? Any ideas on how to solve this? Thanks.
3 solutions
#1
1
As noted elsewhere on *, RImpala does not implement executeUpdate and so cannot run any query that modifies state. I suspect you hit your error not by running a larger SELECT query but rather because you tried to insert, update, or delete some data.
If you'd like to use Impala from R, I'd recommend using dplyrimpaladb.
#2
0
The RImpala build (v0.1.6) has been updated with support for executing DDL queries using executeUpdate.
The latest build contains the following fixes / additions:
- Support for DDL query execution.
- A fetchSize parameter in the query function that sets the number of records retrieved from Impala in one round trip.
- A fix for queries failing when NULL values are returned.
- Compatibility with CDH 5.x.x.
You can run DDL queries using the query function as illustrated below:
rimpala.query(Q="drop table sample_table",isDDL="true")
You can also specify fetchSize in the query function to read large result sets more efficiently.
rimpala.query(Q="select * from sample_table",fetchSize="10000")
The latest build is available on CRAN: http://cran.r-project.org/web/packages/RImpala/index.html
Source code: https://github.com/Mu-Sigma/RImpala
#3
0
I had the same problem with the RImpala package and recommend using the RJDBC package instead:
library(RJDBC)
# Load the Hive JDBC driver from the directory holding the downloaded jar files
drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
            classPath = list.files("path_to_jars", pattern = "jar$", full.names = TRUE),
            identifier.quote = "`")
# 21050 is Impala's default HiveServer2 port
conn <- dbConnect(drv, "jdbc:hive2://localhost:21050/;auth=noSasl")
check1 <- dbGetQuery(conn, "select * from sum3")
I used these jar files and everything works as expected: https://downloads.cloudera.com/impala-jdbc/impala-jdbc-0.5-2.zip
For more information and a speed comparison, see this blog post: http://datascience.la/r-and-impala-its-better-to-kiss-than-using-java/
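If the larger table is still a problem over RJDBC, one option is to pull the result set in chunks rather than in a single dbGetQuery call. The following is only a minimal sketch: fetch_in_chunks is a hypothetical helper (not part of RJDBC or RImpala), the chunk size of 10000 is an arbitrary assumption, and the commented usage assumes the conn object created above and a reachable Impala server.

```r
# Hypothetical helper (not part of any package): reads a result set in chunks.
# `fetch_chunk` is any function that takes a row count n and returns a data
# frame with up to n rows, or zero rows once the result set is exhausted.
fetch_in_chunks <- function(fetch_chunk, chunk_size = 10000) {
  chunks <- list()
  repeat {
    chunk <- fetch_chunk(chunk_size)
    # Stop as soon as a call yields no more rows
    if (is.null(chunk) || nrow(chunk) == 0) break
    chunks[[length(chunks) + 1]] <- chunk
  }
  if (length(chunks) == 0) return(data.frame())
  # Bind all chunks into one data frame
  do.call(rbind, chunks)
}

# Usage with RJDBC (needs a live Impala server, so it is left commented out):
# res <- dbSendQuery(conn, "select * from sum2")
# sum2_df <- fetch_in_chunks(function(n) fetch(res, n), chunk_size = 10000)
# dbClearResult(res)
# dbDisconnect(conn)
```

Passing the fetch step in as a function keeps the loop independent of the driver, so it can be tried against a plain data frame before pointing it at Impala.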