I want to read the uk.txt file from a UK NGA GeoNames download using Python Blaze, and then use odo to insert it into a PostgreSQL database.
Code is:
import blaze as bz
from odo import odo
dataPath = 'uk.txt'
myData = bz.Data(dataPath, sep='\t')
out = odo(myData, 'postgresql://postgres:postgres@localhost:5432/blaze_test::uk_geonames')
I get the error ValueError: cannot safely convert passed user dtype of <i8 for object dtyped data in column 0, which I take to mean that a datatype can't be converted for insertion into the db.
Should I force dtype to equal something? How would I fix this?
A sample input from the file is:
RC UFI UNI LAT LONG DMS_LAT DMS_LONG MGRS JOG FC DSG PC CC1 ADM1 POP ELEV CC2 NT LC SHORT_FORM GENERIC SORT_NAME_RO FULL_NAME_RO FULL_NAME_ND_RO SORT_NAME_RG FULL_NAME_RG FULL_NAME_ND_RG NOTE MODIFY_DATE DISPLAY NAME_RANK NAME_LINK TRANSL_CD NM_MODIFY_DATE
1 380952 475802 54.086111 -6.655556 540510 -63920 29UPV5334795644 NN29-06 H STM EI,UK EI,UK N Clarebane CLAREBANERIVER Clarebane River Clarebane River CLAREBANERIVER Clarebane River Clarebane River 2014-06-27 1,2,3 2
1 Answer
For some reason, the header isn't being inferred correctly. You can state it explicitly by passing the has_header keyword argument to odo's CSV, like so:
In [12]: from blaze import Data
In [13]: from odo import CSV, odo
In [14]: d = Data(CSV('uk.txt', sep='\t', has_header=True))
In [15]: d.head(5)
Out[15]:
RC UFI UNI LAT LONG DMS_LAT DMS_LONG \
0 1 380952 475802 54.086111 -6.655556 540510 -63920
1 1 380952 475801 54.086111 -6.655556 540510 -63920
2 1 380954 475805 54.104722 -6.648889 540617 -63856
3 1 380955 475806 54.098056 -6.644167 540553 -63839
4 1 380958 475810 54.040556 -6.614444 540226 -63652
MGRS JOG FC ... SORT_NAME_RG \
0 29UPV5334795644 NN29-06 H ... CLAREBANERIVER
1 29UPV5334795644 NN29-06 H ... CLAREBANE
2 29UPV5371497729 NN29-06 H ... ALINA LOUGH
3 29UPV5404796997 NN29-06 H ... CORLISSLOUGH
4 29UPV5620690667 NN29-06 H ... DRUMBOYLOUGH
FULL_NAME_RG FULL_NAME_ND_RG NOTE MODIFY_DATE DISPLAY NAME_RANK \
0 Clarebane River Clarebane River NaN 2014-06-27 1,2,3 2
1 Clarebane Clarebane NaN 2014-06-27 1,2,3 1
2 Alina, Lough Alina, Lough NaN 2014-06-27 1,2,3 1
3 Corliss Lough Corliss Lough NaN 2014-06-27 1,2,3 1
4 Drumboy Lough Drumboy Lough NaN 2014-06-27 1,2,3 1
NAME_LINK TRANSL_CD NM_MODIFY_DATE
0 NaN NaN 2014-06-27
1 NaN NaN 2014-06-27
2 NaN NaN 2014-06-27
3 NaN NaN 2014-06-27
4 NaN NaN 2014-06-27
[5 rows x 34 columns]
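To see why a missed header would likely produce the ValueError from the question, here is a minimal sketch using only the standard library (it does not use blaze or odo, and the helper name first_column_as_int is made up for illustration). If the header line is treated as data, column 0 contains the string "RC", which cannot be converted to an integer:

```python
import csv
import io

# Two lines shaped like the start of uk.txt (first five columns only, for brevity).
SAMPLE = (
    "RC\tUFI\tUNI\tLAT\tLONG\n"
    "1\t380952\t475802\t54.086111\t-6.655556\n"
)


def first_column_as_int(text, has_header):
    """Parse tab-separated text and convert column 0 of every data row to int."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if has_header:
        rows = rows[1:]  # skip the header line
    return [int(row[0]) for row in rows]


# Header skipped: the RC column is purely numeric, so the conversion works.
print(first_column_as_int(SAMPLE, has_header=True))

# Header treated as data: "RC" ends up in column 0 and int() raises ValueError.
try:
    first_column_as_int(SAMPLE, has_header=False)
except ValueError as exc:
    print("conversion failed:", exc)
```

The real error message comes from the CSV parsing underneath blaze, but the mechanism is presumably the same: an integer type was chosen for column 0 while the column still held the header text.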
After that, simply odo it into the desired table:
In [16]: t = odo(d, 'postgresql://localhost::uk')
In [17]: uk = Data(t)
In [19]: uk.head(5)
Out[19]:
RC UFI UNI LAT LONG DMS_LAT DMS_LONG \
0 1 380952 475802 54.086111 -6.655556 540510 -63920
1 1 380952 475801 54.086111 -6.655556 540510 -63920
2 1 380954 475805 54.104722 -6.648889 540617 -63856
3 1 380955 475806 54.098056 -6.644167 540553 -63839
4 1 380958 475810 54.040556 -6.614444 540226 -63652
MGRS JOG FC ... SORT_NAME_RG \
0 29UPV5334795644 NN29-06 H ... CLAREBANERIVER
1 29UPV5334795644 NN29-06 H ... CLAREBANE
2 29UPV5371497729 NN29-06 H ... ALINA LOUGH
3 29UPV5404796997 NN29-06 H ... CORLISSLOUGH
4 29UPV5620690667 NN29-06 H ... DRUMBOYLOUGH
FULL_NAME_RG FULL_NAME_ND_RG NOTE MODIFY_DATE DISPLAY NAME_RANK \
0 Clarebane River Clarebane River NaN 2014-06-27 1,2,3 2
1 Clarebane Clarebane NaN 2014-06-27 1,2,3 1
2 Alina, Lough Alina, Lough NaN 2014-06-27 1,2,3 1
3 Corliss Lough Corliss Lough NaN 2014-06-27 1,2,3 1
4 Drumboy Lough Drumboy Lough NaN 2014-06-27 1,2,3 1
NAME_LINK TRANSL_CD NM_MODIFY_DATE
0 NaN NaN 2014-06-27
1 NaN NaN 2014-06-27
2 NaN NaN 2014-06-27
3 NaN NaN 2014-06-27
4 NaN NaN 2014-06-27
[5 rows x 34 columns]
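If you want to try the same load pattern without a PostgreSQL server, here is a rough stand-in using csv and sqlite3 from the standard library. This is not what odo does internally, and the load_tsv helper and the in-memory database are assumptions for illustration only. It takes the column names from the header line and inserts the remaining rows:

```python
import csv
import io
import sqlite3

# A few lines shaped like the start of uk.txt (first five columns only).
SAMPLE = (
    "RC\tUFI\tUNI\tLAT\tLONG\n"
    "1\t380952\t475802\t54.086111\t-6.655556\n"
    "1\t380952\t475801\t54.086111\t-6.655556\n"
)


def load_tsv(conn, table, text):
    """Create `table` from the header line of `text` and insert the data rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)  # first line holds the column names
    columns = ", ".join('"{}"'.format(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    # The reader now yields only data rows, one list of strings per line.
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), reader
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load_tsv(conn, "uk", SAMPLE)
print(conn.execute('SELECT COUNT(*) FROM "uk"').fetchone()[0])
```

Unlike odo, this sketch stores every value as text and does no type inference, so it is only useful for checking that the header and rows line up.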