如何在Sql Server中保存HTML数据

时间:2023-01-08 00:55:23

I have feedback panel where user can write HTML formated feedback using AJAX HTMLEditor

我有反馈面板,用户可以使用AJAX HTMLEditor编写HTML格式化反馈

I want to save this HTML DATA in SQL server

我想在SQL Server中保存此HTML DATA

HTML SOURCE

This is <span style="font-weight: bold; ">nice</span> question

HTML OUTPUT

This is nice question

Now how can i search to my database if your find "is nice" then my query can not response is nice because database contains HTML tags too.

现在我如何搜索我的数据库,如果你的发现“很好”然后我的查询无法响应很好,因为数据库也包含HTML标签。

So what are best practices to save and retrieve HTML data using SQL Query & ASP.net.

那么使用SQL Query和ASP.net保存和检索HTML数据的最佳实践是什么?

4 个解决方案

#1


4  

You might get some mileage out the SQL Server's full-text search capability. Here is a resource that describes strategies to apply full-text search to HTML text stored in SQL Server:

您可能会从SQL Server的全文搜索功能中获得一些好处。这是一个描述将全文搜索应用于存储在SQL Server中的HTML文本的策略的资源:

http://www.developmentnow.com/blog/SQL+Server+2005+Full+Text+Search+On+HTML+Documents.aspx

#2


2  

If you're using SQL Server 2008 then full-text indexing is a good option. Store your HTML in a varbinary(max) column and set its associated file type to ".html" in a file type column. The full-text indexer will parse the data as HTML and search only the text content while ignoring the HTML tags.

如果您使用的是SQL Server 2008,则全文索引是一个不错的选择。将HTML存储在varbinary(max)列中,并在文件类型列中将其关联的文件类型设置为“.html”。全文索引器将数据解析为HTML并仅搜索文本内容,同时忽略HTML标记。

#3


0  

Store the data twice in two different columns; once as HTML and another as just plain text. Do the display from the HTML column, and perform any searches against the text column.

将数据存储在两个不同的列中两次;一次作为HTML,另一次作为纯文本。从HTML列中显示,并对文本列执行任何搜索。

#4


0  

Another answer is to use a CTE to strip out the HTML before doing a search.

另一个答案是在进行搜索之前使用CTE去除HTML。

The following CTE extracts likely rows that satisfy the search criteria and recursively strips out the HTML. The Query then uses the results of the CTE to filter out rows still containing HTML and ones that don’t exactly match the search criteria.

以下CTE提取可能满足搜索条件的行,并递归地删除HTML。然后,查询使用CTE的结果过滤掉仍包含HTML的行和与搜索条件不完全匹配的行。

The CTE isn’t as complicated as it looks. Most of the fiddling is to cope with PATINDEX returning 0.

CTE并不像看起来那么复杂。大多数摆弄是为了应对PATINDEX返回0。

--** Test table
DECLARE @HTML TABLE (id INT IDENTITY, html VARCHAR(max))
INSERT INTO @HTML SELECT 'This is a <span style="font-weight: bold; ">nice</span> question';
INSERT INTO @HTML SELECT 'The cat sat <span style="font-weight: bold; ">on the</span> mat';

--** Search criteria
DECLARE @Search VARCHAR(50) = 'is a nice';

--** CTE to return the matching rows ignoring the HTML
;WITH Search_CTE (html_id, html_text)
AS (
    SELECT h.id AS 'html_id'
         , LEFT(h.html,REPLACE(PATINDEX('%<%',h.html)-1,-1,999999)) + SUBSTRING(h.html,CONVERT(INT,REPLACE(PATINDEX('%>%',h.html)+1,1,999999)),LEN(h.html)) AS 'html_text'
      FROM @HTML AS h
     WHERE h.html LIKE '%' + REPLACE(@Search,' ','%') + '%'
     UNION ALL
    SELECT c.html_id AS 'html_id'
         , LEFT(c.html_text,REPLACE(PATINDEX('%<%',c.html_text)-1,-1,999999)) + SUBSTRING(c.html_text,CONVERT(INT,REPLACE(PATINDEX('%>%',c.html_text)+1,1,999999)),LEN(c.html_text)) AS 'html_text'
      FROM Search_CTE AS c
     WHERE PATINDEX('%<%',c.html_text) > 0
)
SELECT h.html AS 'Original HTML'
     , cte.html_text AS 'HTML Text'
  FROM Search_CTE AS cte
  JOIN @HTML AS h
    ON h.id = cte.html_id
 WHERE PATINDEX('%<%',cte.html_text) = 0   --** Filter out rows still containing HTML
   AND html_text LIKE '%' + @Search + '%'; --** Filter out rows not matching the search criteria

This query has the limitation that it doesn't handle the situation where > or < is in the text, but this can be coded around if required.

此查询具有以下限制:它不处理文本中>或 <的情况,但如果需要,可以对其进行编码。< p>

#1


4  

You might get some mileage out the SQL Server's full-text search capability. Here is a resource that describes strategies to apply full-text search to HTML text stored in SQL Server:

您可能会从SQL Server的全文搜索功能中获得一些好处。这是一个描述将全文搜索应用于存储在SQL Server中的HTML文本的策略的资源:

http://www.developmentnow.com/blog/SQL+Server+2005+Full+Text+Search+On+HTML+Documents.aspx

#2


2  

If you're using SQL Server 2008 then full-text indexing is a good option. Store your HTML in a varbinary(max) column and set its associated file type to ".html" in a file type column. The full-text indexer will parse the data as HTML and search only the text content while ignoring the HTML tags.

如果您使用的是SQL Server 2008,则全文索引是一个不错的选择。将HTML存储在varbinary(max)列中,并在文件类型列中将其关联的文件类型设置为“.html”。全文索引器将数据解析为HTML并仅搜索文本内容,同时忽略HTML标记。

#3


0  

Store the data twice in two different columns; once as HTML and another as just plain text. Do the display from the HTML column, and perform any searches against the text column.

将数据存储在两个不同的列中两次;一次作为HTML,另一次作为纯文本。从HTML列中显示,并对文本列执行任何搜索。

#4


0  

Another answer is to use a CTE to strip out the HTML before doing a search.

另一个答案是在进行搜索之前使用CTE去除HTML。

The following CTE extracts likely rows that satisfy the search criteria and recursively strips out the HTML. The Query then uses the results of the CTE to filter out rows still containing HTML and ones that don’t exactly match the search criteria.

以下CTE提取可能满足搜索条件的行,并递归地删除HTML。然后,查询使用CTE的结果过滤掉仍包含HTML的行和与搜索条件不完全匹配的行。

The CTE isn’t as complicated as it looks. Most of the fiddling is to cope with PATINDEX returning 0.

CTE并不像看起来那么复杂。大多数摆弄是为了应对PATINDEX返回0。

--** Test table
DECLARE @HTML TABLE (id INT IDENTITY, html VARCHAR(max))
INSERT INTO @HTML SELECT 'This is a <span style="font-weight: bold; ">nice</span> question';
INSERT INTO @HTML SELECT 'The cat sat <span style="font-weight: bold; ">on the</span> mat';

--** Search criteria
DECLARE @Search VARCHAR(50) = 'is a nice';

--** CTE to return the matching rows ignoring the HTML
;WITH Search_CTE (html_id, html_text)
AS (
    SELECT h.id AS 'html_id'
         , LEFT(h.html,REPLACE(PATINDEX('%<%',h.html)-1,-1,999999)) + SUBSTRING(h.html,CONVERT(INT,REPLACE(PATINDEX('%>%',h.html)+1,1,999999)),LEN(h.html)) AS 'html_text'
      FROM @HTML AS h
     WHERE h.html LIKE '%' + REPLACE(@Search,' ','%') + '%'
     UNION ALL
    SELECT c.html_id AS 'html_id'
         , LEFT(c.html_text,REPLACE(PATINDEX('%<%',c.html_text)-1,-1,999999)) + SUBSTRING(c.html_text,CONVERT(INT,REPLACE(PATINDEX('%>%',c.html_text)+1,1,999999)),LEN(c.html_text)) AS 'html_text'
      FROM Search_CTE AS c
     WHERE PATINDEX('%<%',c.html_text) > 0
)
SELECT h.html AS 'Original HTML'
     , cte.html_text AS 'HTML Text'
  FROM Search_CTE AS cte
  JOIN @HTML AS h
    ON h.id = cte.html_id
 WHERE PATINDEX('%<%',cte.html_text) = 0   --** Filter out rows still containing HTML
   AND html_text LIKE '%' + @Search + '%'; --** Filter out rows not matching the search criteria

This query has the limitation that it doesn't handle the situation where > or < is in the text, but this can be coded around if required.

此查询具有以下限制:它不处理文本中>或 <的情况,但如果需要,可以对其进行编码。< p>