I'm wondering if anyone with a better understanding of Python and GAE can help me with this. I am uploading a CSV file from a form to the GAE datastore.
class CSVImport(webapp.RequestHandler):
    def post(self):
        csv_file = self.request.get('csv_import')
        fileReader = csv.reader(csv_file)
        for row in fileReader:
            self.response.out.write(row)
I'm running into the same problem that someone else mentions here - http://groups.google.com/group/google-appengine/browse_thread/thread/bb2d0b1a80ca7ac2/861c8241308b9717
That is, csv.reader is iterating over each character rather than each line. A Google engineer left this explanation:
The call self.request.get('csv') returns a string. When you iterate over a string, you iterate over its characters, not its lines. You can see the difference here:
class ProcessUpload(webapp.RequestHandler):
    def post(self):
        self.response.out.write(self.request.get('csv'))
        file = open(os.path.join(os.path.dirname(__file__), 'sample.csv'))
        self.response.out.write(file)
        # Iterating over a file
        fileReader = csv.reader(file)
        for row in fileReader:
            self.response.out.write(row)
        # Iterating over a string
        fileReader = csv.reader(self.request.get('csv'))
        for row in fileReader:
            self.response.out.write(row)
I really don't follow the explanation, and was unsuccessful implementing it. Can anyone provide a clearer explanation of this and a proposed fix?
Thanks, August
3 answers
#1
13
Short answer, try this:
fileReader = csv.reader(csv_file.split("\n"))
Long answer, consider the following:
for thing in stuff:
    print thing.strip().split(",")
If stuff is a file pointer, each thing is a line. If stuff is a list, each thing is an item. If stuff is a string, each thing is a character.
Iterating over the object returned by csv.reader behaves like iterating over the object you passed in, except that each item is CSV-parsed. If you pass in a string, you get a CSV-parsed version of each character.
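To make the difference concrete, here is a minimal sketch written for Python 3 (the thread itself uses Python 2). The sample data and variable names are made up for illustration, and splitlines() is used in place of split("\n") only because it also copes with "\r\n" line endings.

```python
import csv

# Hypothetical sample data standing in for the uploaded form value.
csv_text = "name,qty\napple,3\npear,5\n"

# Passing the string itself: csv.reader iterates it one character at a time,
# so every character is parsed as if it were a whole line.
rows_from_string = list(csv.reader(csv_text))

# Passing a list of lines: each item is parsed as one CSV record.
rows_from_lines = list(csv.reader(csv_text.splitlines()))

print(rows_from_string[:3])
print(rows_from_lines)
```

The first print shows single-character rows, which is the symptom described in the question; the second shows one list per CSV line.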
#2
8
I can't think of a clearer explanation than what the Google engineer you mentioned said. So let's break it down a bit.
The Python csv module is normally used with file-like objects, that is, a file or something that behaves like a Python file. More precisely, csv.reader() takes as its only required parameter any iterable that yields one line of text per iteration, and an open file is the usual example.
The webapp.RequestHandler request object provides access to the HTTP parameters that are posted in the form. In HTTP, parameters are posted as key-value pairs, e.g., csv=record_one,record_two. When you invoke self.request.get('csv'), this returns the value associated with the key csv as a Python string. A Python string is not a file-like object. The csv module simply iterates over whatever it is given, and iterating over a string yields one character at a time (e.g., for c in 'Test String': print c will print each character in the string on a separate line), so each character ends up being parsed as if it were a line.
Fortunately, Python provides a StringIO class that allows a string to be treated as a file-like object. So (assuming GAE supports StringIO, and there's no reason it shouldn't) you should be able to do this:
import csv
import StringIO  # Python 2 module

class ProcessUpload(webapp.RequestHandler):
    def post(self):
        self.response.out.write(self.request.get('csv'))
        # Iterating over a string as a file
        stringReader = csv.reader(StringIO.StringIO(self.request.get('csv')))
        for row in stringReader:
            self.response.out.write(row)
This should work as you expect it to.
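For readers on Python 3, the same idea can be sketched with io.StringIO, which takes the place of the Python 2 StringIO module. The posted_value string below is a hypothetical stand-in for whatever self.request.get('csv') would return; it is not taken from the original post.

```python
import csv
import io

# Hypothetical stand-in for the string returned by self.request.get('csv').
posted_value = 'id,title\r\n1,"Hello, world"\r\n2,Plain\r\n'

# Wrap the string so it behaves like a file. newline="" leaves the line
# endings untouched so the csv module can interpret them itself.
reader = csv.reader(io.StringIO(posted_value, newline=""))
rows = list(reader)

print(rows)
```

Unlike splitting on "\n", wrapping the string this way keeps the line endings intact, which matters when a quoted field itself contains a line break.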
Edit: I'm assuming that you are using something like a <textarea/> to collect the CSV data. If you're uploading an attachment, different handling may be necessary (I'm not all that familiar with Python GAE or how it handles attachments).
#3
0
You need to call csv_file = self.request.POST.get("csv_import") and not csv_file = self.request.get("csv_import").
The second one just gives you a string, as you mentioned in your original post. But accessing via self.request.POST.get gives you a cgi.FieldStorage object.
This means that you can call csv_file.filename to get the object’s filename and csv_file.type to get the mimetype. Furthermore, if you access csv_file.file, it’s a StringO object (a read-only object from the StringIO module), not just a string. As ig0774 mentioned in his answer, the StringIO module allows you to treat a string as a file.
Therefore, your code can simply be:
class CSVImport(webapp.RequestHandler):
    def post(self):
        csv_file = self.request.POST.get('csv_import')
        fileReader = csv.reader(csv_file.file)
        for row in fileReader:
            # row is now a list containing all the column data in that row
            self.response.out.write(row)
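As a rough illustration of the file-upload case outside GAE, the sketch below is written for Python 3 and fakes the uploaded field with a small stand-in object. Only the attribute names filename, type and file come from the answer above; the upload object, its contents and the UTF-8 encoding are assumptions made for the example.

```python
import csv
import io
from types import SimpleNamespace

# Hypothetical stand-in for the uploaded form field. A real framework would
# supply this object; here it is built by hand for illustration.
upload = SimpleNamespace(
    filename="sample.csv",
    type="text/csv",
    file=io.BytesIO(b"name,qty\napple,3\npear,5\n"),
)

# The uploaded data is binary, so decode it before handing it to csv.reader.
# UTF-8 is an assumption about the file's encoding.
text_file = io.TextIOWrapper(upload.file, encoding="utf-8", newline="")
rows = list(csv.reader(text_file))

print(upload.filename, upload.type, rows)
```

Because the reader is given a file-like object rather than a string, each row comes back as a list of column values.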