I'm working with PostgreSQL 9.6 and Spark 2.0.0
I want to create a DataFrame
from a PostgreSQL table, as follows:
val query =
  """(
    SELECT events.event_facebook_id,
           places.placeid, places.likes AS placelikes,
           artists.facebookId, artists.likes AS artistlikes
    FROM events
    LEFT JOIN eventsplaces ON eventsplaces.event_id = events.event_facebook_id
    LEFT JOIN places ON eventsplaces.event_id = places.facebookid
    LEFT JOIN eventsartists ON eventsartists.event_id = events.event_facebook_id
    LEFT JOIN artists ON eventsartists.artistid = artists.facebookid) df"""
The query is valid (running it in psql produces no error), but with Spark, executing the following code throws a NullPointerException:
sqlContext
  .read
  .format("jdbc")
  .options(Map(
    "url" -> claudeDatabaseUrl,
    "dbtable" -> query))
  .load()
  .show()
If, in the query, I replace artists.facebookId
with another column such as artists.description
(which, unlike facebookId, can be null), the exception disappears.
I find this very strange. Any ideas?
1 answer
#1
You have two different spellings of facebookId
in your query: artists.facebook[I]d (capital I)
and artists.facebook[i]d (lowercase i).
Please try to use the correct one.
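Following the answer, here is a minimal sketch of the corrected read. It assumes the real column in PostgreSQL is the lowercase `facebookid` (the spelling the JOIN conditions already use); `claudeDatabaseUrl` and the table names come from the question, and the explicit `driver` option is an assumption, not something the question showed.

```scala
// Sketch under the assumption that the actual column is `facebookid`.
// PostgreSQL folds unquoted identifiers to lowercase, so `facebookId`
// and `facebookid` resolve to the same column in psql, which is why
// the query succeeds there even with inconsistent spelling.
val query =
  """(
    SELECT events.event_facebook_id,
           places.placeid, places.likes AS placelikes,
           artists.facebookid, artists.likes AS artistlikes
    FROM events
    LEFT JOIN eventsplaces ON eventsplaces.event_id = events.event_facebook_id
    LEFT JOIN places ON eventsplaces.event_id = places.facebookid
    LEFT JOIN eventsartists ON eventsartists.event_id = events.event_facebook_id
    LEFT JOIN artists ON eventsartists.artistid = artists.facebookid) df"""

val df = sqlContext.read
  .format("jdbc")
  .options(Map(
    "url" -> claudeDatabaseUrl,          // JDBC URL from the question
    "driver" -> "org.postgresql.Driver", // assumed: PostgreSQL JDBC driver class
    "dbtable" -> query))                 // parenthesized subquery with alias `df`
  .load()

df.show()
```

Note that `dbtable` must be either a table name or a parenthesized subquery with an alias, which is why the query string is wrapped in `( ... ) df`.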