Python: Read only certain columns and rows from an Excel CSV file

Time: 2021-12-27 18:13:30

I can read a CSV file, but instead of reading the whole file, how can I print only certain rows and columns?

Imagine this is an Excel sheet:

A           B                   C                  D                   E
State       Heart Disease Rate  Stroke Death Rate  HIV Diagnosis Rate  Teen Birth Rate
Alabama     235.5               54.5               16.7                18.01
Alaska      147.9               44.3               3.2                 N/A
Arizona     152.5               32.7               11.9                N/A
Arkansas    221.8               57.4               10.2                N/A
California  177.9               42.2               N/A                 N/A
Colorado    145.3               39                 8.4                 9.25

Here's what I have:

import csv

try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file

except:
    while risk != "riskfactors.csv":  # if the file cant be found if there is an error
    print("Could not open", risk, "file")
    risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')

        data = []
        for row in reader:# Number of rows including the death rates 
            for col in (2,4): # The columns I want read   B and D
                data.append(row)
                data.append(col)
        for item in data:
            print(item) #print the rows and columns

I need to read only columns B and D, with all of their statistics, so that the output looks like this:

A           B                   D
State       Heart Disease Rate  HIV Diagnosis Rate
Alabama     235.5               16.7
Alaska      147.9               3.2
Arizona     152.5               11.9
Arkansas    221.8               10.2
California  177.9               N/A
Colorado    145.3               8.4

Edit: I get no errors.

Any ideas on how to tackle this? Nothing I have tried works. Any help or advice is much appreciated.

3 solutions

#1


3  

If you're still stuck: for a simple file like this one you don't strictly need the csv module, because each line is just a comma-separated string. (That shortcut breaks as soon as a field contains a quoted comma, which the csv module handles.) For something simple you could try this, which gives you a list of tuples of the form (state, heart disease rate, HIV diagnosis rate):

output = []

# Open the file for reading (universal newlines are the default in Python 3)
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.split(",")
        # Keep the first, second and fourth columns (A, B and D)
        output.append((cells[0], cells[1], cells[3]))

print(output)

Just note that you would then have to go through and ignore the header rows if you wanted to do any sort of data analysis.
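The header caveat can also be handled with the csv module itself. The following is only a minimal sketch, assuming the file is comma-delimited with a single header row. The inline sample text and the helper name `select_columns` are hypothetical stand-ins for the real riskfactors.csv:

```python
import csv
import io

# Hypothetical sample standing in for riskfactors.csv (assumed comma-delimited).
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
"""


def select_columns(lines, indexes=(0, 1, 3)):
    """Return (header, rows), keeping only the columns at the given 0-based indexes."""
    reader = csv.reader(lines)
    header = next(reader)  # consume the header row so it is not mixed into the data
    picked_header = tuple(header[i] for i in indexes)
    # Skip completely empty lines, keep the chosen cells of every other row
    rows = [tuple(row[i] for i in indexes) for row in reader if row]
    return picked_header, rows


header, rows = select_columns(io.StringIO(SAMPLE))
print(header)
for row in rows:
    print(row)
```

With a real file, an open file object can be passed in place of `io.StringIO(SAMPLE)`, since `csv.reader` accepts any iterable of lines.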

#2


9  

I hope you have heard about Pandas for Data Analysis.

The following code takes care of reading the columns. As for reading only certain rows, you would have to explain more about which rows you need.

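import pandas

# usecols takes 0-based positions: 0 = State (A), 1 = Heart Disease Rate (B),
# 3 = HIV Diagnosis Rate (D), i.e. the 1st, 2nd and 4th columns
df = pandas.read_csv('test.csv', sep=",", usecols=[0, 1, 3])
print(df)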

#3


2  

Try this:

data = []
for row in reader:  # every row in the file, header included
    data.append([row[1], row[3]])  # the columns I want: B and D
for item in data:
    print(item)  # print the selected columns of each row
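The snippet above relies on a `reader` created elsewhere. Note that the reader in the question was built with `delimiter=' '`, which would not split comma-separated fields. Here is a self-contained sketch, assuming a comma-delimited file, that picks columns by header name and limits the rows with `itertools.islice`. The helper name `read_subset` and the sample data are hypothetical:

```python
import csv
import io
from itertools import islice

# Hypothetical sample standing in for riskfactors.csv (assumed comma-delimited).
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
Arkansas,221.8,57.4,10.2,N/A
"""


def read_subset(lines, columns, start=0, stop=None):
    """Yield lists with only `columns`, for data rows start..stop-1 (0-based, header excluded)."""
    reader = csv.DictReader(lines)  # the first row supplies the field names
    for record in islice(reader, start, stop):
        yield [record[name] for name in columns]


wanted = ["State", "Heart Disease Rate", "HIV Diagnosis Rate"]
# Data rows 1 and 2 only (Alaska and Arizona in this sample)
subset = list(read_subset(io.StringIO(SAMPLE), wanted, start=1, stop=3))
for item in subset:
    print(item)
```

Selecting by header name rather than by position keeps the code working if the column order in the file changes, at the cost of requiring the names to match the header exactly.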
