As background, I'm writing php code that parses csv files and does something with each row of each csv file. What that "something" is depends on the values in the row. It's easy to test the values with an "if" structure, however, hardcoding the conditions is not optimal for two reasons:
- There are several hundred possible conditions to be tested. That's just to begin with. More conditions will be added in the future.
- Each csv row does not need to be tested for each condition; as soon as a condition for a row evaluates to true, no further conditions need be evaluated.
Ideally, for my situation, the "if" conditions would be stored in a postgres table, put into a string variable one by one, and then each variable would be tested by a single if structure (inside a loop of some kind) until a condition evaluates to true.
Simplified example:
$arrayOne[3] = "foo";
// in practice, the value of this variable would not be hard-coded;
// it would come from a postgres table
$conditionString = "\$arrayOne[3] == \"VANILLA\"";
if ($conditionString) {
    // do something, then exit the loop this if statement would be
    // inside of in actual practice
}
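For reference, the simplified example cannot behave as hoped as written: PHP does not evaluate the contents of a string inside an `if`, it only casts the string to a boolean, and every non-empty string other than "0" casts to true. A minimal sketch of that behaviour, reusing the variable names from the example (the two result variables are added here purely for illustration):

```php
<?php
$arrayOne = [3 => "foo"];

// The condition is just text; PHP does not execute it.
$conditionString = "\$arrayOne[3] == \"VANILLA\"";

// Casting a string to bool only checks that it is non-empty and not "0".
$stringIsTruthy = (bool) $conditionString;

// The comparison the string describes, written as real code.
$realComparison = ($arrayOne[3] == "VANILLA");

var_dump($stringIsTruthy); // bool(true)
var_dump($realComparison); // bool(false)
```

So the `if` body would run for every row, whatever the row contains.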
This question was essentially asked at "PHP - if condition inside string".
There were three basic answers:
- You could use eval(), but DON'T! It's evil! (agreed)
- Hard-code all the conditions (not optimal for the reasons provided above and others)
- Store the conditions in a schema (which is what I'm doing) and then parse and convert to php code.
Solution 3 is generally what I'm looking for, but the second part seems inefficient and unnecessarily complex. After all, why build up php code from numerous strings (of an unknown number, by the way, which complicates storage in postgres) when it seems so much easier to just store and then evaluate the single string you need?
Is there a way to do that?
Many thanks for the responses so far. ComFreek, thanks to you in particular for the detailed response. The solution you suggest may be just what I need, but frankly, I don't have the experience to know immediately if that's the case. I'll definitely spend time trying to understand what you're saying. Hopefully, it will solve my problem.
In case it doesn't, and in the meantime, to answer a few questions others asked:
1) The if conditions will not usually be simple. Many will contain multiple and compound AND and OR tests. A sample condition, in pseudo-code, might be: ( field2=="BUY" AND ( strpos("REINVEST DIVIDEND", field6) OR strpos("CASH MERGER", field6) ) AND field2!="TTXY" AND field3>0 ).
2) The CSV files come from numerous financial institutions. They contain much the same information, but each has unique data, and all of them have the data in different locations. What's more, they express the data differently. In some, a payment is indicated by a negative number; in others, by a positive number. Some have separate fields for deposits and withdrawals; some indicate deposits and withdrawals with a code in another column. And so on. The code needs to determine the nature of the transaction (credit card purchase, check, stock buy or sell, retirement contribution, and on and on) and then, if it can, assign the correct debit/credit account numbers (from a chart of accounts) to that transaction. Altogether, there are hundreds of possible conditions, likely to become thousands. (In case anyone is wondering, the code can determine the institution a particular csv file comes from and will test the transactions in that file only against conditions relevant to that institution.)
3) The code needs to be flexible enough that new tests can be added in the future easily, in other words without having to write new code. For me, being able to add a new condition to a postgres table (which will then be another test for the code to check) is sufficient flexibility.
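To make point 2 concrete, here is a rough sketch of how rows from different institutions might be brought into one shape before any condition is tested. Everything in it is an assumption for illustration: the institution keys (`bank_a`, `broker_b`), the column positions, the `paymentSign` flag and the `normalizeRow()` helper do not come from the actual code.

```php
<?php
// Hypothetical layouts: which CSV column holds which logical field, and
// whether a payment is written as a negative (-1) or positive (+1) number.
$layouts = [
    'bank_a'   => ['columns' => ['date' => 0, 'description' => 1, 'amount' => 2], 'paymentSign' => -1],
    'broker_b' => ['columns' => ['date' => 2, 'description' => 0, 'amount' => 4], 'paymentSign' => 1],
];

/**
 * Turn one raw CSV row into an associative array with a shared convention:
 * payments are always negative after normalization.
 */
function normalizeRow(array $layout, array $csvRow): array
{
    $fields = [];
    foreach ($layout['columns'] as $name => $index) {
        $fields[$name] = $csvRow[$index] ?? null;
    }
    // Flip the sign when the institution records payments as positive.
    $fields['amount'] = (float) $fields['amount'] * ($layout['paymentSign'] === 1 ? -1 : 1);
    return $fields;
}

$a = normalizeRow($layouts['bank_a'], ['2014-01-05', 'GROCERY STORE', '-42.50']);
$b = normalizeRow($layouts['broker_b'], ['GROCERY STORE', 'x', '2014-01-05', 'y', '42.50']);

var_dump($a['amount'] === $b['amount']); // bool(true): both are -42.5
```

With rows normalized like this, conditions can refer to field names instead of per-institution column numbers, which may reduce how many institution-specific conditions are needed.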
To try to answer Phil's questions and comments (which I may not be understanding correctly):
1) I know what preg_match is, but haven't really explored what it can do, so it may in fact be the answer to my problem. I'll check it out.
2) Currently, the code does not group transactions (a transaction being a single row from a single csv file); rather, it looks at each row, determines what it is, stores it and additional data in the appropriate postgres tables, then moves to the next row. There are certain "types" of transactions (say, credit card purchases), but they're never grouped for further processing.
3) Each transaction should satisfy one and only one condition (though that condition may be complex).
4) With regard to matching the entire string, unless I'm missing something (very possible), it's not that simple. For example, let's say a given transaction is a stock purchase. The code can determine that by seeing that the "action" field contains the word "Buy" AND the "quantity" field is greater than zero (one or the other of these conditions alone might not be enough to be certain the transaction is a stock purchase), but the "ticker" field could be any of thousands of strings that aren't known in advance: "GOOG", "MSFT", "KO" or whatever.
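The stock purchase test in point 4 can be sketched as follows. The field names `action`, `quantity` and `ticker` are taken from the description above; the function name and the sample rows are made up. The check is hard-coded here only to show the logic, not as a suggestion for how to store it. Note the `!== false`: `strpos()` and `stripos()` return the position 0 when the match is at the start of the string, and 0 is falsy in a plain boolean test.

```php
<?php
/**
 * Hypothetical rule: a row is a stock purchase when "action" contains "Buy"
 * and "quantity" is greater than zero. "ticker" is deliberately not tested.
 */
function isStockPurchase(array $row): bool
{
    return stripos($row['action'], 'buy') !== false
        && (float) $row['quantity'] > 0;
}

$rows = [
    ['action' => 'Buy',  'quantity' => '10', 'ticker' => 'GOOG'],
    ['action' => 'Buy',  'quantity' => '3',  'ticker' => 'KO'],
    ['action' => 'Sell', 'quantity' => '10', 'ticker' => 'MSFT'],
    ['action' => 'Buy',  'quantity' => '0',  'ticker' => 'MSFT'],
];

// One boolean per row: true, true, false, false.
$matches = array_map('isStockPurchase', $rows);
var_dump($matches);
```

The ticker never appears in the test, so any symbol passes as long as the other two fields match.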
Again, thanks to all for the responses so far.
1 Solution
Summary: build an extensible system of handlers for specific comparison types, and store the related data in the database.
You need:
- Well-known types of conditions
- An extensible system of registerable handlers which deal with specific types of conditions (e.g. EqualityHandler, StringLengthComparisionHandler)
- Each handler is associated with a documented object format
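One possible reading of "documented object format" is a small JSON object per condition, stored in a text or json column and decoded with `json_decode()`. The sketch below assumes a format with the keys `type`, `field` and `value`; those key names, the sample row and the body of `handle()` are illustrative assumptions, not a fixed specification.

```php
<?php
// Hypothetical JSON as it might be stored in a postgres column.
$stored = '{"type": "Equality", "field": "action", "value": "BUY"}';

// Decode to an associative array (second argument true).
$condition = json_decode($stored, true);

interface Handler
{
    public function handle(array $handlerData, array $data): bool;
}

/** Documented format: {"type": "Equality", "field": <name>, "value": <expected>} */
class EqualityHandler implements Handler
{
    public function handle(array $handlerData, array $data): bool
    {
        // True only when the named field exists and matches exactly.
        return isset($data[$handlerData['field']])
            && $data[$handlerData['field']] === $handlerData['value'];
    }
}

$handler = new EqualityHandler();
$matchesBuy  = $handler->handle($condition, ['action' => 'BUY',  'quantity' => '10']);
$matchesSell = $handler->handle($condition, ['action' => 'SELL', 'quantity' => '10']);

var_dump($matchesBuy, $matchesSell); // bool(true) bool(false)
```

Adding a new condition of an existing type is then an INSERT of another JSON object; only a genuinely new type of comparison needs a new handler class.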
Advantages:
- The system is highly extensible. If you ever need comparison type X or Y, just write a handler. This is comparable to the plugin system of a browser or editor.
- You don't store code in the database. Storing code for the same type of comparison over and over goes against the DRY principle (Don't Repeat Yourself).
- Unit tests. I cannot imagine how unit tests would work with a database full of code snippets. They would be really painful to write.
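To illustrate the unit test point: a handler is a small class with no database dependency, so it can be checked in isolation with a table of cases. The sketch assumes a made-up format for `StringLengthComparisionHandler` (keys `field` and `max`) and uses a plain loop instead of a test framework to stay self-contained; a real project would more likely use something like PHPUnit. It requires PHP 7.1 or later for the array destructuring in `foreach`.

```php
<?php
/** Hypothetical format: ['field' => name, 'max' => int]; true when strlen <= max. */
class StringLengthComparisionHandler
{
    public function handle(array $handlerData, array $data): bool
    {
        return strlen($data[$handlerData['field']]) <= $handlerData['max'];
    }
}

// Each case: [condition data, row, expected result].
$cases = [
    [['field' => 'ticker', 'max' => 4], ['ticker' => 'KO'],    true],
    [['field' => 'ticker', 'max' => 4], ['ticker' => 'GOOG'],  true],
    [['field' => 'ticker', 'max' => 4], ['ticker' => 'GOOGL'], false],
];

$handler = new StringLengthComparisionHandler();
$failures = 0;
foreach ($cases as [$condition, $row, $expected]) {
    if ($handler->handle($condition, $row) !== $expected) {
        $failures++;
    }
}

echo $failures === 0 ? "all cases passed\n" : "$failures case(s) failed\n";
```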
Disadvantages:
- It requires you to write some code before you can actually start evaluating your data.
However, this type of problem is practically crying out for an OOP solution! It really teaches you how OOP can be applied and used. At least in my opinion, it is fun to see how adding just one handler gives your application entirely new functionality!
Pseudo code:
interface Handler {
    public function handle($handlerData, $data);
}

class EqualityHandler implements Handler {
    public function handle($handlerData, $data) {
        // checks for equality and returns true or false
        return true;
    }
}

// Like Java's EqualityHandler.class, PHP 5.5+ offers EqualityHandler::class,
// which yields the class name as a string.
$app->registerHandler('EqualityHandler');

// loop all rows
foreach ($rows as $csvFields) {
    foreach (retrieveConditions($csvFields) as $condition) {
        if (handleCondition($app, $condition, $csvFields)) {
            break; // the first matching condition wins; skip the rest
        }
    }
}

// $app is passed in explicitly because functions do not see global variables
function handleCondition($app, $condition, $csvFields) {
    if ($app->getHandler($condition['type'])) {
        return $app->instantiateHandler($condition['type'])->handle($condition, $csvFields);
    }
    else {
        throw new HandlerNotFoundException('...');
    }
}
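For anyone who wants something to run, here is a minimal self-contained sketch of the same idea, under several assumptions: the class names (`HandlerRegistry`, `EqualsHandler`, `ContainsHandler`, `GroupHandler`), the condition keys (`type`, `field`, `value`, `conditions`) and the account labels are all invented for illustration, and PHP 7.1 or later is assumed. The group handler covers the compound AND/OR conditions mentioned in the question by evaluating nested conditions through the registry.

```php
<?php
interface Handler
{
    public function handle(array $condition, array $row, HandlerRegistry $registry): bool;
}

class HandlerRegistry
{
    private $handlers = [];

    public function register(string $type, Handler $handler): void
    {
        $this->handlers[$type] = $handler;
    }

    public function evaluate(array $condition, array $row): bool
    {
        $type = $condition['type'];
        if (!isset($this->handlers[$type])) {
            throw new InvalidArgumentException("No handler registered for type '$type'");
        }
        return $this->handlers[$type]->handle($condition, $row, $this);
    }
}

class EqualsHandler implements Handler
{
    public function handle(array $condition, array $row, HandlerRegistry $registry): bool
    {
        return ($row[$condition['field']] ?? null) === $condition['value'];
    }
}

class ContainsHandler implements Handler
{
    public function handle(array $condition, array $row, HandlerRegistry $registry): bool
    {
        return stripos($row[$condition['field']] ?? '', $condition['value']) !== false;
    }
}

// Combines nested conditions: AND when $requireAll is true, OR otherwise.
class GroupHandler implements Handler
{
    private $requireAll;

    public function __construct(bool $requireAll)
    {
        $this->requireAll = $requireAll;
    }

    public function handle(array $condition, array $row, HandlerRegistry $registry): bool
    {
        foreach ($condition['conditions'] as $child) {
            // AND stops at the first false child, OR at the first true child.
            if ($registry->evaluate($child, $row) !== $this->requireAll) {
                return !$this->requireAll;
            }
        }
        return $this->requireAll;
    }
}

$registry = new HandlerRegistry();
$registry->register('equals', new EqualsHandler());
$registry->register('contains', new ContainsHandler());
$registry->register('allOf', new GroupHandler(true));
$registry->register('anyOf', new GroupHandler(false));

// Rules as they might look after json_decode() on rows of a postgres table.
$rules = [
    ['account' => 'DIVIDEND_REINVESTMENT', 'condition' => ['type' => 'allOf', 'conditions' => [
        ['type' => 'equals', 'field' => 'action', 'value' => 'BUY'],
        ['type' => 'anyOf', 'conditions' => [
            ['type' => 'contains', 'field' => 'description', 'value' => 'REINVEST DIVIDEND'],
            ['type' => 'contains', 'field' => 'description', 'value' => 'CASH MERGER'],
        ]],
    ]]],
    ['account' => 'STOCK_PURCHASE', 'condition' => ['type' => 'equals', 'field' => 'action', 'value' => 'BUY']],
];

function firstMatch(HandlerRegistry $registry, array $rules, array $row): ?string
{
    foreach ($rules as $rule) {
        if ($registry->evaluate($rule['condition'], $row)) {
            return $rule['account']; // stop at the first rule that matches
        }
    }
    return null;
}

$first  = firstMatch($registry, $rules, ['action' => 'BUY',  'description' => 'REINVEST DIVIDEND KO']);
$second = firstMatch($registry, $rules, ['action' => 'BUY',  'description' => 'MARKET ORDER']);
$third  = firstMatch($registry, $rules, ['action' => 'SELL', 'description' => 'MARKET ORDER']);
```

Here `$first` is 'DIVIDEND_REINVESTMENT', `$second` is 'STOCK_PURCHASE' and `$third` is null. Because evaluation stops at the first match, the order of the rules matters: more specific rules have to come before more general ones.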