How can I "transpose" data using SQL while also removing duplicates?

Time: 2023-02-05 04:41:09

I have the following data structure in my database:

LastName    FirstName    CourseName
John        Day          Pricing
John        Day          Marketing
John        Day          Finance
Lisa        Smith        Marketing
Lisa        Smith        Finance
etc...

The data shows employees within a business and which courses they would prefer to attend. The number of courses per employee varies (e.g. above, John has 3 courses and Lisa has 2).

I need to take this data from the database and pass it to a web page view (ASP.NET MVC).

I would like the data that comes out of my database to match the view as closely as possible, so I want to transform it using SQL to look like the following:

LastName    FirstName    Course1    Course2    Course3
John        Day          Pricing    Marketing  Finance
Lisa        Smith        Marketing  Finance

Any thoughts on how this may be achieved?

Note: one of the reasons I am trying this approach is that the original data structure does not lend itself to being iterated over with the typical MVC syntax:

<% foreach (var item in Model.courseData) { %>

Because of the duplication of names in the original data, I would end up with lots of conditionals in my View, which I would like to avoid.

I have tried transforming the data using C# in my ViewModel but have found it tough going, and I feel that I could lighten the workload by leveraging SQL before I return the data.

Thanks.

6 Answers

#1


1  

NOTE: I don't want to use the database PIVOT feature, because the set of classes will most likely change and you would then have to update your SQL query each time.

What you are trying to do should not be done in the View. The Model object should be simple enough to render as soon as it is passed to a View, because the View shouldn't have any complicated logic in it.

Therefore, we need to pre-process the data returned from the SQL Server Database first.

I am going to assume that you have an object like this, since you didn't mention a unique identifier such as "CustomerId".

public class CourseData {
    public string FirstName { get; set; }
    public string LastName  { get; set; }
    public string ClassName { get; set; }
}

This is going to be my [Model] object.

public class MyViewModel {
    public IDictionary<string, IList<string>> CourseData { get; set; }
    public int MaxCourseCount { get; set; }
}

This is my Action method in the Controller.

public ActionResult MyView()
{
    IDictionary<string, IList<string>> dict = new Dictionary<string, IList<string>>();

    // retrieve data from the database
    IList<CourseData> result = RetrieveData();
    foreach (var item in result)
    {
        // [FirstName] and [LastName] combo will be used as KEY entry
        string key = item.FirstName + " " + item.LastName;

        if (dict.ContainsKey(key))
        {
            // add the class name into an existing "string" collection
            dict[key].Add( item.ClassName );
        }
        else
        {
            // instantiate a new "string" collection and add the class name.
            dict[key] = new List<string> { item.ClassName };
        }
    }

    // find out which Person has attended the most number of classes.
    int maxCourseCount = 0;
    foreach (var key in dict.Keys)
    {
        int valueCount = dict[key].Count;
        if (valueCount > maxCourseCount)
            maxCourseCount = valueCount;
    }


    MyViewModel model = new MyViewModel {
        CourseData = dict,
        MaxCourseCount = maxCourseCount
    };
    return View(model);
}

I aggregated the data into a Data Structure that is easier to render. I intentionally added [MaxCourseCount] property to my Model object because it seems like you want to render the Course Data in a <table> format.

So, all you need to do now is

  1. Loop through the Keys of your Model.CourseData dictionary object.
  2. Render the course names in individual <td> table cells.
  3. Render the remaining (empty) <td> table cells so that every row ends up with [MaxCourseCount] course cells.

I hope this helps.

-Soe

#2


1  

You can transpose the data using PIVOT, but you will get the courses as column names, and a 1 or 0 as data for attending / not attending. Using that and a bit of code in the ViewModel should get you what you want:

SELECT  * FROM 
(SELECT FirstName, LastName, Course FROM Courses) src
PIVOT (COUNT(Course) FOR Course IN ([Finance] ,[Marketing],[Pricing])) AS pvt

Which gives you:

FirstName    LastName  Finance Marketing Pricing
--------------------------------------------------------------
John         Day       1       1         1
Lisa         Smith     1       1         0
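
If the list of courses changes over time, the IN list in the query above would need editing every time. One possible workaround is to build that list with dynamic SQL. The following is only a sketch: it assumes the same Courses table and Course column used in the query above, and the variable names @cols and @sql are purely illustrative.

```sql
-- Sketch only: assumes the Courses(FirstName, LastName, Course) table used above.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build a comma-separated, bracket-quoted list of the distinct course names,
-- for example [Finance],[Marketing],[Pricing].
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(Course)
    FROM (SELECT DISTINCT Course FROM Courses) AS c
    ORDER BY Course
    FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)'), 1, 1, '');

-- Splice the generated column list into the same PIVOT query as above.
SET @sql = N'
SELECT *
FROM (SELECT FirstName, LastName, Course FROM Courses) AS src
PIVOT (COUNT(Course) FOR Course IN (' + @cols + N')) AS pvt;';

EXEC sp_executesql @sql;
```

QUOTENAME is used so that course names containing spaces or brackets still form valid column identifiers. The trade-off is that the result set's columns are no longer known ahead of time, so the calling code has to read them dynamically.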

#3


1  

Before you begin the presentation layer, you may want to re-examine your data layer.

I would normalize your data into three tables. One that contains people called Person (with some primary key - your call) and one that contains courses called Course (again, with some form of primary key). Then, you can handle your mappings in a third table (call it Schedule) that relates the primary key of Person to the primary key of Course.

If I can get to a spot where I can whip up an ER diagram, I'll post it in an edit.

Once you have your data in order, it makes the task of displaying and arranging it much easier.
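
In place of a diagram, the three-table layout described above might look roughly like the sketch below. The table names Person, Course and Schedule come from the description; the key columns PersonId and CourseId, the data types and the constraints are assumptions made for illustration.

```sql
-- Sketch only: column names, types and constraints are assumed, not prescribed.
CREATE TABLE Person (
    PersonId  int IDENTITY(1,1) PRIMARY KEY,
    LastName  varchar(25) NOT NULL,
    FirstName varchar(25) NOT NULL
);

CREATE TABLE Course (
    CourseId   int IDENTITY(1,1) PRIMARY KEY,
    CourseName varchar(25) NOT NULL UNIQUE
);

-- Mapping table: one row per (person, course) preference.
-- The composite primary key prevents the same pairing from being stored twice.
CREATE TABLE Schedule (
    PersonId int NOT NULL REFERENCES Person (PersonId),
    CourseId int NOT NULL REFERENCES Course (CourseId),
    PRIMARY KEY (PersonId, CourseId)
);

-- Rebuild the flat list from the question by joining through the mapping table.
SELECT p.LastName, p.FirstName, c.CourseName
FROM Schedule AS s
JOIN Person AS p ON p.PersonId = s.PersonId
JOIN Course AS c ON c.CourseId = s.CourseId;
```

With this layout each person's name is stored once, and the repeated names only appear in the joined result, where they can be grouped or pivoted as needed.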

#4


1  

You can do this in straight SQL (runnable example):

DECLARE @data AS TABLE (LastName varchar(25), FirstName varchar(25), CourseName varchar(25))

INSERT INTO @data VALUES ('John', 'Day', 'Pricing')
INSERT INTO @data VALUES ('John', 'Day', 'Marketing')
INSERT INTO @data VALUES ('John', 'Day', 'Finance')
INSERT INTO @data VALUES ('Lisa', 'Smith', 'Marketing')
INSERT INTO @data VALUES ('Lisa', 'Smith', 'Finance')

SELECT *
FROM (
    SELECT LastName, FirstName, CourseName,
           Bucket = 'Course' + CAST(ROW_NUMBER() OVER(PARTITION BY LastName, FirstName ORDER BY CourseName) AS varchar)
    FROM @data
) AS n
PIVOT (MIN(CourseName) FOR Bucket IN ([Course1], [Course2], [Course3])) AS pvt

#5


0  

You could use aggregate string concatenation with, e.g., ', ' as the separator, and get results more like this:

LastName    FirstName    Courses
John        Day          Pricing, Marketing, Finance
Lisa        Smith        Marketing, Finance
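
The answer does not name a specific function. One way to do this kind of concatenation, assuming SQL Server 2017 or later, is STRING_AGG; the sketch below reuses the sample rows from the question in a table variable.

```sql
-- Sketch only: STRING_AGG requires SQL Server 2017 or later.
DECLARE @data AS TABLE (LastName varchar(25), FirstName varchar(25), CourseName varchar(25));

INSERT INTO @data VALUES
    ('John', 'Day',   'Pricing'),
    ('John', 'Day',   'Marketing'),
    ('John', 'Day',   'Finance'),
    ('Lisa', 'Smith', 'Marketing'),
    ('Lisa', 'Smith', 'Finance');

-- One row per person, with that person's courses joined into a single string.
-- WITHIN GROUP makes the order inside the string deterministic (alphabetical here).
SELECT LastName, FirstName,
       STRING_AGG(CourseName, ', ') WITHIN GROUP (ORDER BY CourseName) AS Courses
FROM @data
GROUP BY LastName, FirstName;
```

Because the sample data has no column that records the original order, the courses here are sorted alphabetically rather than in the order shown in the table above.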

#6


0  

There is nothing in your sample data to indicate priority. E.g., how do we know that Day John's first choice is Pricing? Presuming you had such a column, then you can do something like:

Select LastName, FirstName
    , Min( Case When Priority = 1 Then CourseName End ) As Course1
    , Min( Case When Priority = 2 Then CourseName End ) As Course2  
    , Min( Case When Priority = 3 Then CourseName End ) As Course3
From Table
Group By LastName, FirstName
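
Since the sample data has no Priority column, one stand-in is to derive a number with ROW_NUMBER and then apply the same conditional aggregation. The sketch below does that against the question's sample rows; ordering by CourseName is an arbitrary assumption and does not reflect any real preference order.

```sql
-- Sketch only: Priority is derived alphabetically because no real priority exists.
DECLARE @data AS TABLE (LastName varchar(25), FirstName varchar(25), CourseName varchar(25));

INSERT INTO @data VALUES
    ('John', 'Day',   'Pricing'),
    ('John', 'Day',   'Marketing'),
    ('John', 'Day',   'Finance'),
    ('Lisa', 'Smith', 'Marketing'),
    ('Lisa', 'Smith', 'Finance');

-- Number each person's courses 1, 2, 3, ... within that person's group.
WITH Ranked AS (
    SELECT LastName, FirstName, CourseName,
           ROW_NUMBER() OVER (PARTITION BY LastName, FirstName
                              ORDER BY CourseName) AS Priority
    FROM @data
)
-- Spread the numbered rows across fixed columns; missing slots come back as NULL.
SELECT LastName, FirstName,
       MIN(CASE WHEN Priority = 1 THEN CourseName END) AS Course1,
       MIN(CASE WHEN Priority = 2 THEN CourseName END) AS Course2,
       MIN(CASE WHEN Priority = 3 THEN CourseName END) AS Course3
FROM Ranked
GROUP BY LastName, FirstName;
```

This form does not depend on the PIVOT operator, but the number of CourseN columns is still fixed in the query text.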
