How are CLR table-valued functions "streaming"?

Time: 2022-09-12 01:43:43

The MSDN documentation on CLR table-valued functions states:

Transact-SQL table-valued functions materialize the results of calling the function into an intermediate table. ... In contrast, CLR table-valued functions represent a streaming alternative. There is no requirement that the entire set of results be materialized in a single table. The IEnumerable object returned by the managed function is directly called by the execution plan of the query that calls the table-valued function, and the results are consumed in an incremental manner. ... It is also a better alternative if you have very large numbers of rows returned, because they do not have to be materialized in memory as a whole.

Then I found out that no data access is allowed in the fill-row method. This means that you still have to do all of your data access in the init method and keep the results in memory, waiting for the fill-row method to be called. Have I misunderstood something? If I don't force my results into an array or list, I get an error: 'ExecuteReader requires an open and available Connection. The connection's current state is closed.'

Code sample:

[<SqlFunction(DataAccess = DataAccessKind.Read, FillRowMethodName = "Example8Row")>]
static member InitExample8() : System.Collections.IEnumerable = 
   let c = cn() // opens a context connection
   // I'd like to avoid forcing enumeration here:
   let data = getData c |> Array.ofSeq
   data :> System.Collections.IEnumerable

static member Example8Row ((obj : Object),(ssn: SqlChars byref)) = 
   do ssn <- new SqlChars(new SqlString(obj :?> string))
   ()

I'm dealing with several million rows here. Is there any way to do this lazily?

3 Answers

#1


8  

I'm assuming you're using SQL Server 2008. As mentioned by a Microsoft employee on this page, 2008 requires methods to be marked with DataAccessKind.Read much more frequently than 2005. One of those times is when the TVF participates in a transaction (which seemed to always be the case, when I tested). The solution is to specify enlist=false in the connection string, which, alas, cannot be combined with context connection=true. That means your connection string needs to be in typical client format: Data Source=.;Initial Catalog=MyDb;Integrated Security=sspi;Enlist=false and your assembly must be created with permission_set=external_access, at minimum. The following works:

using System;
using System.Collections;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

namespace SqlClrTest {
    public static class Test {
        [SqlFunction(
            DataAccess = DataAccessKind.Read,
            SystemDataAccess = SystemDataAccessKind.Read,
            TableDefinition = "RowNumber int",
            FillRowMethodName = "FillRow"
            )]
        public static IEnumerable MyTest(SqlInt32 databaseID) {
            using (var con = new SqlConnection("data source=.;initial catalog=TEST;integrated security=sspi;enlist=false")) {
                con.Open();
                using (var cmd = new SqlCommand("select top (100) RowNumber from SSP1 where DatabaseID = @DatabaseID", con)) {
                    cmd.Parameters.AddWithValue("@DatabaseID", databaseID.IsNull ? (object)DBNull.Value : databaseID.Value);
                    using (var reader = cmd.ExecuteReader()) {
                        while (reader.Read())
                            yield return reader.GetInt32(0);
                    }
                }
            }
        }
        public static void FillRow(object obj, out SqlInt32 rowNumber) {
            rowNumber = (int)obj;
        }
    }
}

Here's the same thing in F#:

namespace SqlClrTest

module Test =

    open System
    open System.Data
    open System.Data.SqlClient
    open System.Data.SqlTypes
    open Microsoft.SqlServer.Server

    [<SqlFunction(
        DataAccess = DataAccessKind.Read,
        SystemDataAccess = SystemDataAccessKind.Read,
        TableDefinition = "RowNumber int",
        FillRowMethodName = "FillRow"
        )>]
    let MyTest (databaseID:SqlInt32) =
        seq {
            use con = new SqlConnection("data source=.;initial catalog=TEST;integrated security=sspi;enlist=false")
            con.Open()
            use cmd = new SqlCommand("select top (100) RowNumber from SSP1 where DatabaseID = @DatabaseID", con)
            cmd.Parameters.AddWithValue("@DatabaseID", if databaseID.IsNull then box DBNull.Value else box databaseID.Value) |> ignore
            use reader = cmd.ExecuteReader()
            while reader.Read() do
                yield reader.GetInt32(0)
        } :> System.Collections.IEnumerable

    let FillRow (obj:obj) (rowNumber:SqlInt32 byref) =
        rowNumber <- SqlInt32(unbox obj)

The good news is: Microsoft considers this a bug.

#2


1  

Yes, you would need to pull the results into memory and then return them from there, although the intention is that you avoid the need for such operations.

You can see an example of the approach in one of the sections of the MSDN doc you linked to ("Sample: Returning the results of a SQL Query")

The examples are a bit contrived, though: a real-world implementation of email validation would use a scalar function rather than a table-valued one, returning a bool for each input email value rather than a list of the invalid ones.

Are you able to explain a bit more about what you're trying to achieve? There might be a better way of structuring the function.
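
For reference, the difference between pulling the results into memory and streaming them can be sketched without SQL Server. This is only an illustration: MaterializeVsStream, Produce, Materialized, Streamed, Consume and Log are hypothetical names, and Produce stands in for whatever actually reads the data. The log records the order in which rows are produced and consumed.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class MaterializeVsStream
{
    // Records the order of events so the two strategies can be compared.
    public static readonly List<string> Log = new List<string>();

    // Stand-in for a data source (assumption: any row-at-a-time reader).
    // Each row is logged at the moment it is actually produced.
    private static IEnumerable<int> Produce(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Log.Add("produce " + i);
            yield return i;
        }
    }

    // Materialized: every row is produced and buffered before the caller sees any.
    public static IEnumerable Materialized(int count)
    {
        return new List<int>(Produce(count));
    }

    // Streamed: a row is produced only when the caller asks for the next one.
    public static IEnumerable Streamed(int count)
    {
        return Produce(count);
    }

    // Plays the role of the engine calling the fill-row method once per item.
    public static void Consume(IEnumerable rows)
    {
        foreach (object row in rows)
            Log.Add("consume " + row);
    }
}
```

With the materialized version all "produce" entries come before the first "consume" entry; with the streamed version they alternate, so only one row needs to be held at a time.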

#3


1  

What you can do is wrap a SqlDataReader in an IEnumerable whose enumerator, when its MoveNext method is called, calls Read on the SqlDataReader and returns the reader itself as the current item. Your FillRow method then expects a SqlDataReader as its object argument. If you have your enumerator close the SqlDataReader and the database connection once there are no more rows, then you've effectively streamed your output to the FillRow method. You can do this with context connection=true as well...

...the trouble here is that you have to be able to return the results of an actual query: if you're doing more complex things to create your result set, then you're out of luck.
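
A minimal sketch of the wrapper described above, under some assumptions: ReaderStreaming and Stream are hypothetical names, and the code is written against IDataReader rather than SqlDataReader so that it can be tried outside SQL Server (for example with DataTable.CreateDataReader()). Whether this works with the context connection on SQL Server 2008 is not verified here; answer #1 reports needing a regular connection with enlist=false.

```csharp
using System;
using System.Collections;
using System.Data;
using System.Data.SqlTypes;

public static class ReaderStreaming
{
    // Hypothetical helper: hands out the reader itself once per row and
    // disposes it when the rows run out or the consumer stops enumerating.
    public static IEnumerable Stream(IDataReader reader)
    {
        using (reader)
        {
            while (reader.Read())
                yield return reader; // the current row is visible through IDataRecord
        }
    }

    // Fill-row style method: receives the reader positioned on the current row.
    public static void FillRow(object obj, out SqlInt32 rowNumber)
    {
        var record = (IDataRecord)obj;
        rowNumber = record.GetInt32(0);
    }
}
```

In a real function the init method would return Stream(cmd.ExecuteReader(...)). One way to also close the connection when the reader is disposed is to pass CommandBehavior.CloseConnection to ExecuteReader. Because every item is the same reader object, each row has to be read in FillRow before the next one is requested.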
