
Time: 2021-02-28 15:27:13

I'm trying to use Python and BeautifulSoup to scrape some web info, iterate through it, and then insert some pieces into a sqlite3 DB, but I keep running into this error:


  File "/Users/Chris/Desktop/BS4/", line 103, in TBTscrape
    c.execute(item)
sqlite3.OperationalError: near "s": syntax error


This same code works fine on a very similar scraper and does not throw these errors. Here's the portion of it:


listicle.append("INSERT INTO headlines (heds, url, image_loc, time, source) VALUES ('" + TBTheadline + "', '" + TBTurl + "', '" + imageName + "', '" + TBTpostDate + "', '" + source + "')")

        print "TBT item already in database"

print listicle

for item in listicle:
    c.execute(item)
    row = c.fetchall()
    print "This has been inserted successfully: ", item

1 Answer



You are concatenating the collected data into your SQL statements. Never do that; it is the mother of all anti-patterns. Aside from the problem you are seeing (probably caused by a ' or similar character in the scraped HTML), it leaves a gaping SQL injection hole in your code (which may or may not matter in your case).
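To see why an apostrophe leads to this kind of message, here is a small sketch that reproduces the failure against an in-memory database. The table layout is guessed from the INSERT statement in the question, and the headline text is made up for illustration.

```python
import sqlite3

# In-memory database; the column list is assumed from the question's INSERT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE headlines (heds, url, image_loc, time, source)")

# Hypothetical scraped headline containing an apostrophe.
headline = "Here's the news"

# Same string-concatenation pattern as in the question, shortened to one column.
sql = "INSERT INTO headlines (heds) VALUES ('" + headline + "')"

try:
    conn.execute(sql)
    error_message = None
except sqlite3.OperationalError as exc:
    # The quote after "Here" closes the SQL string literal early,
    # so the parser trips over the text that follows it.
    error_message = str(exc)

print(error_message)
```

The statement that reaches SQLite is INSERT INTO headlines (heds) VALUES ('Here's the news'), which is no longer valid SQL once the literal ends after 'Here'.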


Anyway, sqlite3 has a nice way of doing exactly what you want: executemany. Collect the values for each row as a tuple in listicle instead of building SQL strings, and in your case:



conn.executemany("INSERT INTO headlines (heds, url, image_loc, time, source) VALUES (?,?,?,?,?)", listicle)
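The following sketch runs that call end to end against an in-memory database. It assumes the five columns from the question, with listicle holding one tuple per row; the sample values are placeholders, not data from the original scraper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE headlines (heds, url, image_loc, time, source)")

# One tuple per row, in column order. These values are placeholders
# standing in for TBTheadline, TBTurl, imageName, TBTpostDate and source.
listicle = [
    ("Here's a headline with an apostrophe", "http://example.com/a", "a.jpg", "2014-01-01", "TBT"),
    ('A "quoted" headline', "http://example.com/b", "b.jpg", "2014-01-02", "TBT"),
]

# sqlite3 binds each tuple to the ? placeholders, so quotes in the
# data never become part of the SQL text.
conn.executemany(
    "INSERT INTO headlines (heds, url, image_loc, time, source) VALUES (?,?,?,?,?)",
    listicle,
)
conn.commit()

# Read the rows back to confirm the apostrophe survived intact.
rows = conn.execute("SELECT heds FROM headlines ORDER BY url").fetchall()
```

Because the values travel separately from the SQL text, no manual escaping is needed, and the same statement is reused for every row.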


