I want to store articles in a database, but I can't find much information on the best way to do it. From what I have read, opinion is split on how to do this effectively: one person suggests an approach and others point out SQL injection issues with it. I can't find much recent material on the topic.
Here is the HTML of an article:
<div id="main">
<article>
<header>
<h3> Title </h3>
<time pubdate="pubdate"> 2011-07-22 </time>
</header>
<p> Article Text </p>
</article>
</div>
Ideally, I think it would be best to store the chunk of HTML that makes up each article in a database, but there seem to be a lot of problems with this. As I said, I can't find many posts on this particular topic, and as someone new to PHP and databases I would like some input on the best approach before I proceed.
6 Solutions
#1
0
Store your article as TEXT :) Just pass it through this PHP function first to prevent injection attacks:
// Prevent MySQL injection attacks.
// $con is assumed to be an open mysqli connection; the old mysql_* functions
// and get_magic_quotes_gpc() are gone from current PHP versions.
function cleanQuery(mysqli $con, $string) {
    return $con->real_escape_string($string);
}
#2
2
Whenever I store a large amount of user text, I just base64-encode it. Before you display it, make sure to run it through htmlspecialchars, which keeps any HTML from being rendered, so htmlspecialchars(base64_decode($content)) would work fine for displaying. If you are using BBCode for formatting, make sure to run htmlspecialchars before you start formatting the BBCode.
This isn't the only way. You can sanitize input without base64-encoding it, but I see no reason not to, especially when nobody needs to look directly into the database.
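As a minimal sketch of the display step described above: the helper name renderArticleText and the sample string are made up for illustration, and UTF-8 content is assumed. It also shows that base64 only changes how the text is stored; the escaping at output time is what keeps the markup from being rendered.

```php
<?php
// Hypothetical helper: escape stored article text for safe display in HTML.
function renderArticleText(string $stored): string
{
    // ENT_QUOTES escapes both single and double quotes; UTF-8 is assumed.
    return htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
}

// Sample user input (made up for this example).
$input = '<script>alert("x")</script> Article Text';

// base64 round trip: what comes out is exactly what went in.
$encoded = base64_encode($input);
$decoded = base64_decode($encoded);

echo renderArticleText($decoded), "\n";
// prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; Article Text
```

If BBCode formatting is applied, it would run on the string returned by renderArticleText, not on the raw input.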
#3
1
Storing it in a SQL database is fine, but you can, and must, protect against SQL injection in your code.
That is, clean all user input before sending it to the database.
See the PHP manual on SQL injection.
#4
1
I think the best method is to store just plain text, but that is usually not an option when you want extra formatting. You can convert the HTML tags to BBCode or similar tags, which can prevent SQL injection; however, if you escape the HTML content, it is as safe as any other content. So run mysql_real_escape_string on whatever data you put into the database and you will be fine.
However, the best practice would be to store the formatted article as an HTML file, which you serve when the user requests the article, and to store only the plain text in the database for indexing and search purposes. This is ideal because you do not need the HTML markup for searching anyway, and it will also prevent SQL attacks if the content stored in the database is purely text. When the user requests the article, read the HTML file for that article, which contains the formatted text, and serve that.
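One way to derive the plain-text copy for indexing is sketched below. The function name articlePlainText is hypothetical, and the sample markup is a shortened version of the article HTML from the question. The resulting text would still normally be inserted with a parameterized query, and writing the HTML file itself is left out here.

```php
<?php
// Hypothetical helper: reduce article HTML to plain text for indexing/search.
function articlePlainText(string $html): string
{
    // Drop all tags, keeping only the text between them.
    $text = strip_tags($html);
    // Turn entities such as &amp; back into the characters they stand for.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace left behind by the removed tags.
    return trim(preg_replace('/\s+/', ' ', $text) ?? '');
}

$html = '<article><header><h3> Title </h3></header><p> Article Text </p></article>';

echo articlePlainText($html), "\n";
// prints: Title Article Text
```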
#5
1
Use Lucene or Sphinx; Lucene is available through Zend_Lucene or through Solr. They make indexing the articles faster, and you can also run full-text searches on them. Using Lucene or Solr to index and search in cases like this is pretty much standard procedure, and will let you scale to millions of articles.
Sphinx is a daemon that runs "in parallel" to the MySQL daemon. To use Sphinx, you can use the PECL sphinx extension.
If you want to go with Lucene, you can try Zend_Lucene or Solr, which is actually a Tomcat distro with a webapp that exposes Lucene as a web service, so you can access it in a standard way, independently of the language.
Choosing either of them is OK. You can index by full text (content), by category, or by whatever else you need.
#6
1
The safest way to prevent SQL injection here is to use a prepared statement.
$stmt = $con->prepare("INSERT INTO Articles (Title, Date, Article) VALUES (?, ?, ?)");
$stmt->bind_param("sss", $title, $currentDate, $articleBody);
The question marks represent the values you will pass. "sss" says that each of the three variables is a string. You can then execute the prepared statement with the correct values.
$title = $_POST['title'];
$currentDate = date("Y-m-d H:i:s");
$articleBody = $_POST['article'];
$stmt->execute();
This makes sure that no malicious SQL can be injected into your database.
Hope this helps!