I am writing a program that collects many individual pieces of data from a MySQL database and serializes them into an XML document.
My XML document has five basic groups, each of which contains three subgroups, and I am collecting about 100 pieces of data in total.
The content itself is order details from a shopping website. I first run a query to work out and return the next Order ID that needs to be sent to the client, and then use this Order ID to pull each piece of data (Address Line 1, Address Line 2, items purchased, etc.) from various tables in a relational database.
No matter which way I go about this, it's messy. I end up with massive methods, with one variable defined for each piece of data I need, and each variable is set to the return value of a method that runs a query against my MySQL database. Some of the data needs extra logic applied (e.g. some data needs to be cleaned, and some dates need to be formatted a certain way for the XML).
A few points:
- I didn't create the database structure or the shopping website, and I can't change them.
- I have a working program which is currently running. This is a program I wrote months ago that I am going back to improve.
Some example code is below:
string valueOne = myMethod("SELECT `valueOne` FROM `tableOne` WHERE `OrderID` = '12345';");
string valueTwo = myMethod("SELECT `valueTwo` FROM `tableOne` WHERE `OrderID` = '12345';");
string valueThree = myMethod("SELECT `valueThree` FROM `tableTwo` WHERE `OrderID` = '12345';");
int valueFour = Convert.ToInt32(myMethod("SELECT `valueFour` FROM `tableThree` WHERE `OrderID` = '12345';"));
string valueFive = myMethod("SELECT `valueFive` FROM `tableThree` WHERE `OrderID` = '12345';");
if (valueFive == "FooBar")
{
    // Do stuff
}
string valueSix = myMethod("SELECT `valueSix` FROM `tableThree` WHERE `OrderID` = '12345';");
DateTime valueSeven = DateTime.Parse(myMethod("SELECT `valueSeven` FROM `tableFour` WHERE `OrderID` = '12345';"));
string valueEight = myMethod("SELECT `valueEight` FROM `tableFive` WHERE `OrderID` = '12345';");
string valueNine = String.Format("QWERTY - {0} - YTREWQ", myMethod("SELECT `valueNine` FROM `tableSix` WHERE `OrderID` = '12345';"));
string valueTen = myMethod("SELECT `valueTen` FROM `tableSeven` WHERE `OrderID` = '12345';");
MyClass fooBar = new MyClass()
{
    valueOne = valueOne,
    valueTwo = valueTwo,
    valueThree = valueThree,
    valueFour = valueFour,
    valueFive = valueFive,
    valueSix = valueSix,
    valueSeven = valueSeven,
    mySecondClass = new MySecondClass()
    {
        valueEight = valueEight,
        valueNine = valueNine,
        myThirdClass = new MyThirdClass() { valueTen = valueTen }
    }
};
SerializeToXML<MyClass>(fooBar);
Imagine that, but with a lot more data. Yes, it works, but it is untidy, difficult to maintain and, more generally, not very good.
So my question is: what is the proper way to collect large amounts of data in a .NET application?
2 Answers
#1
2
Right off the bat it looks like you can combine a LOT of queries into a single one. For example, you have the following:
string valueOne = myMethod("SELECT `valueOne` FROM `tableOne` WHERE `OrderID` = '12345';");
string valueTwo = myMethod("SELECT `valueTwo` FROM `tableOne` WHERE `OrderID` = '12345';");
string valueThree = myMethod("SELECT `valueThree` FROM `tableTwo` WHERE `OrderID` = '12345';");
int valueFour = Convert.ToInt32(myMethod("SELECT `valueFour` FROM `tableThree` WHERE `OrderID` = '12345';"));
string valueFive = myMethod("SELECT `valueFive` FROM `tableThree` WHERE `OrderID` = '12345';");
I would get rid of those individual queries and use a single one:
SELECT one.valueOne, one.valueTwo, two.valueThree, three.valueFour, three.valueFive
FROM tableOne one
INNER JOIN tableTwo two on (two.OrderId = one.OrderId)
INNER JOIN tableThree three on (three.OrderId = one.OrderId)
WHERE one.OrderId = '12345';
Quite frankly, it looks like you can do the same thing with the remaining queries.
Also, I'm not entirely sure what myMethod does, but it looks like it simply returns a scalar value from a query. Get rid of that and replace it with something that gives you a DataTable or an object collection back. That way you can pull all the data you need in one go.
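As a rough illustration of that replacement, here is a minimal sketch that runs one combined query and returns the whole row instead of a single scalar. It assumes an already-open connection from whichever MySQL ADO.NET provider you use, and that the provider accepts `@name` parameter markers; `OrderQueries`, `LoadOrderRow` and `ToDictionary` are hypothetical names, and the table and column names are just the placeholders from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class OrderQueries
{
    // Combined query; table and column names are the placeholders from the question.
    private const string OrderSql = @"
        SELECT one.valueOne, one.valueTwo, two.valueThree, three.valueFour, three.valueFive
        FROM tableOne one
        INNER JOIN tableTwo two ON (two.OrderId = one.OrderId)
        INNER JOIN tableThree three ON (three.OrderId = one.OrderId)
        WHERE one.OrderId = @orderId;";

    // Runs the combined query once and returns the first row as
    // column name -> value, or null when no row matches.
    // Assumption: the connection passed in is already open.
    public static Dictionary<string, object> LoadOrderRow(IDbConnection connection, string orderId)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = OrderSql;

            // Pass the order ID as a parameter instead of concatenating it into the SQL text.
            IDbDataParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@orderId";
            parameter.Value = orderId;
            command.Parameters.Add(parameter);

            using (IDataReader reader = command.ExecuteReader())
            {
                return reader.Read() ? ToDictionary(reader) : null;
            }
        }
    }

    // Copies the current row of a reader into a case-insensitive dictionary,
    // mapping database NULLs to null.
    public static Dictionary<string, object> ToDictionary(IDataRecord record)
    {
        var row = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < record.FieldCount; i++)
        {
            row[record.GetName(i)] = record.IsDBNull(i) ? null : record.GetValue(i);
        }
        return row;
    }
}
```

With something like this, five round trips become one, and the caller reads `row["valueOne"]`, `row["valueFour"]` and so on from a single result.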
NOTE: the inner joins work if you expect every one of those tables to have a row for the OrderId. If some of them may not, use outer joins instead, starting from the table that will always have a matching record. For example:
SELECT one.valueOne, one.valueTwo, two.valueThree, three.valueFour, three.valueFive
FROM orders o
LEFT OUTER JOIN tableOne one on (one.OrderId = o.OrderId)
LEFT OUTER JOIN tableTwo two on (two.OrderId = o.OrderId)
LEFT OUTER JOIN tableThree three on (three.OrderId = o.OrderId)
WHERE o.OrderId = '12345';
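One consequence of the outer-join version is that columns from a table with no matching row come back as NULL, so the reading code has to tolerate that. A small sketch of how that might look, using only the standard `System.Data` interfaces; the `RecordExtensions` class and its method names are hypothetical helpers, not part of any library.

```csharp
using System;
using System.Data;

public static class RecordExtensions
{
    // Returns the column as a string, or the fallback when the database value is NULL
    // (for example, when a LEFT OUTER JOIN found no matching row).
    public static string GetStringOrDefault(this IDataRecord record, string column, string fallback = "")
    {
        int ordinal = record.GetOrdinal(column);
        return record.IsDBNull(ordinal) ? fallback : Convert.ToString(record.GetValue(ordinal));
    }

    // Returns the column as an int, or null when the database value is NULL.
    public static int? GetNullableInt32(this IDataRecord record, string column)
    {
        int ordinal = record.GetOrdinal(column);
        return record.IsDBNull(ordinal) ? (int?)null : Convert.ToInt32(record.GetValue(ordinal));
    }
}
```

A caller would then write `reader.GetNullableInt32("valueFour")` rather than `Convert.ToInt32(...)` directly, which would throw on a NULL.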
#2
1
You are really under-utilizing object-oriented programming. It's hard to infer the relationships between your tables from your example alone, but Chris Lively did a great job of working them out, and you described the data as order details from a shopping website. That certainly implies some consistent relationships.
I would approach the problem this way:
- Develop the most efficient consolidated queries to get the data you need.
- For each query, create a class. Structure your classes to match your data ("five basic groups, each of which contains 3 subgroups").
- Follow good OOP principles and hide the complexity a class requires inside that class, if that's the only place it is needed.
- Use much more descriptive names for your variables and classes.
That should yield much more readable, maintainable, "tidy" code.
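To make the list above a little more concrete, here is a minimal sketch of one class per query, with the cleaning and date formatting hidden inside the classes. Everything named here (`Order`, `ShippingAddress`, `OrderXml`, and the columns `OrderId`, `PlacedAt`, `AddressLine1`, `AddressLine2`) is a hypothetical stand-in for your real schema, and the `yyyy-MM-dd` date format is only an assumed example of what the XML might require.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;
using System.Xml.Serialization;

public class ShippingAddress
{
    private string line1;

    // The cleanup rule lives with the data it applies to.
    public string Line1
    {
        get { return line1; }
        set { line1 = value == null ? null : value.Trim(); }
    }

    public string Line2 { get; set; }
}

public class Order
{
    public string OrderId { get; set; }

    // Kept as a DateTime in code; the formatted string below is what is serialized.
    [XmlIgnore]
    public DateTime PlacedAt { get; set; }

    [XmlElement("PlacedAt")]
    public string PlacedAtText
    {
        get { return PlacedAt.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture); }
        set { PlacedAt = DateTime.ParseExact(value, "yyyy-MM-dd", CultureInfo.InvariantCulture); }
    }

    public ShippingAddress ShippingAddress { get; set; }

    // Builds an Order from the current row of one consolidated query.
    // "as string" turns a database NULL (DBNull) into null.
    public static Order FromRecord(IDataRecord record)
    {
        return new Order
        {
            OrderId = Convert.ToString(record["OrderId"]),
            PlacedAt = Convert.ToDateTime(record["PlacedAt"], CultureInfo.InvariantCulture),
            ShippingAddress = new ShippingAddress
            {
                Line1 = record["AddressLine1"] as string,
                Line2 = record["AddressLine2"] as string
            }
        };
    }
}

public static class OrderXml
{
    // Serializes an Order to an XML string with the standard XmlSerializer.
    public static string Serialize(Order order)
    {
        var serializer = new XmlSerializer(typeof(Order));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, order);
            return writer.ToString();
        }
    }
}
```

The calling method then shrinks to roughly "run the query, call `Order.FromRecord`, serialize", and each formatting rule has exactly one home.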