I would like to read the sample CSV file shown below:
--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------
I tried
pd.read_csv("sample.csv",sep="|")
But it didn't work well.
How can I read this CSV?
3 solutions
#1
11
You can add the comment parameter to read_csv so that the dashed lines are skipped, and then remove the resulting all-NaN columns with dropna:
import pandas as pd
import io
temp=u"""--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep="|", comment='-').dropna(axis=1, how='all')
print(df)
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9
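To see why the dropna step is needed, here is a minimal sketch that inspects the frame before and after it. The variable names sample, raw and clean are only illustrative. The point is that the empty fields before the first | and after the last | are read as two extra columns that contain nothing but NaN.

```python
import io
import pandas as pd

# sample data copied from the question
sample = """--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""

# comment='-' makes pandas skip the dashed lines entirely
raw = pd.read_csv(io.StringIO(sample), sep="|", comment="-")
# the empty fields outside the first and last | show up as unnamed columns
print(raw.columns.tolist())

# drop only the columns in which every value is NaN
clean = raw.dropna(axis=1, how="all")
print(clean.columns.tolist())
```

Note that how='all' matters here: a column is dropped only when every value in it is missing, so a real column with a few missing values is kept.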
A more general solution:
import pandas as pd
import io
temp=u"""--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""
# after testing, replace io.StringIO(temp) with the filename
# the separator is a character that does NOT occur in the csv, so each line is read as one column
df = pd.read_csv(io.StringIO(temp), sep="^", comment='-')
# remove the first and last | in the data and in the column name
df.iloc[:,0] = df.iloc[:,0].str.strip('|')
df.columns = df.columns.str.strip('|')
# split the column name into the individual column names
cols = df.columns.str.split('|')[0]
# split the data into separate columns (the values remain strings)
df = df.iloc[:,0].str.split('|', expand=True)
df.columns = cols
print(df)
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9
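One caveat with comment='-' is that pandas stops parsing a line at the first '-', so a row that contains a negative number would be cut short. A possible workaround, sketched below under that assumption, is to filter the lines in plain Python before handing them to read_csv. The helper name read_boxed_table and the modified sample with -2 are hypothetical and only serve as illustration.

```python
import io
import pandas as pd

def read_boxed_table(text):
    """Hypothetical helper: parse a |-delimited table with dashed rule lines."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # skip blank lines and rule lines that consist only of dashes
        if not stripped or set(stripped) == {"-"}:
            continue
        # remove exactly one leading and one trailing | so that
        # no empty columns are created at the edges
        if stripped.startswith("|"):
            stripped = stripped[1:]
        if stripped.endswith("|"):
            stripped = stripped[:-1]
        kept.append(stripped)
    return pd.read_csv(io.StringIO("\n".join(kept)), sep="|")

# the question's layout, with one negative value added for illustration
sample = """--------------
|A|B|C|
--------------
|1|-2|3|
--------------
|4|5|6|
--------------"""

df = read_boxed_table(sample)
print(df)
```

For a real file, the text could be obtained with open(filename).read() before calling the helper. Because pandas does the final parsing, the columns get numeric dtypes without an extra conversion.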
#2
1
Try the csv module instead of using pandas directly.
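import csv

easy_csv = []
# newline='' is what the csv module expects for files opened in text mode
with open('sample.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter='|')
    for row in reader:
        # skip blank lines and the ---- separator lines
        if not row or row[0].startswith('-'):
            continue
        # the leading and trailing | give empty first and last fields; drop them
        easy_csv.append(row[1:-1])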
After this preprocessing, you can save the rows to a comma-separated CSV file, which pandas can then read easily.
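The save-and-reload step could look roughly like the sketch below. It uses in-memory io.StringIO buffers in place of real files so that it runs on its own. The names raw, rows and buffer are illustrative, and with real data the buffers would be replaced by open(...) calls.

```python
import csv
import io
import pandas as pd

# sample data copied from the question
raw = """--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""

rows = []
for row in csv.reader(io.StringIO(raw), delimiter="|"):
    # skip blank lines and the dashed separator lines
    if not row or row[0].startswith("-"):
        continue
    # drop the empty fields produced by the leading and trailing |
    rows.append(row[1:-1])

# write the cleaned rows as ordinary comma-separated text
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
buffer.seek(0)

# pandas can now read the data with its default settings
df = pd.read_csv(buffer)
print(df)
```

If the intermediate file is not needed, the cleaned rows could also be passed straight to pd.DataFrame(rows[1:], columns=rows[0]), although the values would then remain strings until converted.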