I have developed a program that auto-generates code, which in turn is used to generate queries in a structured way in Java.
The latest option I have added is to get a result from one table while actually specifying constraints on some other table. The only precondition is that those tables are linked to each other through foreign keys.
I'll only deal with the actual SQL queries here.
This is a valid SQL query which is often used:
SELECT businessPartners.businessPartnerId, businessPartners.name
FROM businessPartners
JOIN BP_emails ON businessPartners.businessPartnerId = BP_emails.businessPartnerId
JOIN emails ON BP_emails.emailId = emails.emailId
WHERE emails.email = 'test@test.com'
It selects business partners based on their e-mail address. businessPartners.businessPartnerId and emails.emailId are both primary keys, and BP_emails holds the foreign keys.
A similar structure is used for invoices and for the links between invoices and emails.
So I have also found (and verified) that it is possible to do this query:
SELECT businessPartners.businessPartnerId, businessPartners.name
FROM businessPartners
JOIN BP_emails ON businessPartners.businessPartnerId = BP_emails.businessPartnerId
JOIN emails ON BP_emails.emailId = emails.emailId
JOIN INV_emails ON emails.emailId = INV_emails.emailId
JOIN invoices ON INV_emails.invoiceId = invoices.invoiceId
WHERE invoices.invoiceId >=1
AND invoices.invoiceId <=1
First of all, I have a hard time figuring out what exactly it means. I think it means something like: give me all business partners for which invoices.invoiceId = 1 and where the email related to the invoice is the same as the email related to the business partner... So it doesn't make much sense, I think.
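To make that reading concrete, here is a minimal sketch that runs the four-join query against a tiny in-memory database. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the real database, and the column types and sample rows are invented for illustration; only the table and column names come from the queries above.

```python
import sqlite3

# In-memory SQLite stands in for the real database; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE businessPartners (businessPartnerId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emails (emailId INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE invoices (invoiceId INTEGER PRIMARY KEY);
CREATE TABLE BP_emails (businessPartnerId INTEGER, emailId INTEGER);
CREATE TABLE INV_emails (invoiceId INTEGER, emailId INTEGER);

INSERT INTO businessPartners VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO emails VALUES (10, 'billing@acme.example'), (20, 'info@globex.example');
INSERT INTO invoices VALUES (1), (2);
INSERT INTO BP_emails VALUES (1, 10), (2, 20);   -- partner -> email links
INSERT INTO INV_emails VALUES (1, 10), (2, 20);  -- invoice -> email links
""")

four_join_query = """
SELECT businessPartners.businessPartnerId, businessPartners.name
FROM businessPartners
JOIN BP_emails ON businessPartners.businessPartnerId = BP_emails.businessPartnerId
JOIN emails ON BP_emails.emailId = emails.emailId
JOIN INV_emails ON emails.emailId = INV_emails.emailId
JOIN invoices ON INV_emails.invoiceId = invoices.invoiceId
WHERE invoices.invoiceId >= 1
AND invoices.invoiceId <= 1
"""

# Only the partner that shares an email address with invoice 1 comes back.
partners_for_invoice_1 = conn.execute(four_join_query).fetchall()
print(partners_for_invoice_1)  # [(1, 'Acme')]
```

With this sample data the query returns the business partners that share an email address with invoice 1, which matches the interpretation above. Whether that relationship is useful depends on what a shared email address means in the schema.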
So the question is: up to what point do multiple joins actually make sense? I already needed two joins in my first example; are there legitimate examples that need 3+ joins?
Any help with this would be appreciated.
3 solutions
#1
1
Your queries look alright. I have had up to 10 joins with no performance problems.
Some fun facts about MySQL performance:
- Always use the MySQL quotes (backticks). I was tasked with improving the performance of a messy query. The first thing I did was arrange the code in a readable fashion and add the quotes. The result was a 10% gain in performance.
- Always join on indexed numeric fields, and never use two conditions in a join unless no other option exists, because it hurts performance.
- In WHERE clauses, always list the conditions in the order that selects the fewest rows, with the indexed ones first. This can bring up to a 99% boost in performance.
Just my two cents.
#2
1
The rule of thumb I've heard is that more than seven tables in a JOIN is too many.
The key thing here is not the number of JOINs, but the proper ordering of the WHERE clauses. SQL is set-based, so if the WHERE clause that excludes the most rows is executed first, you'll save work for the subsequent filters.
Indexing will affect performance, too. Make sure you have indexes on all columns in WHERE clauses.
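One way to check whether an index is actually picked up is to ask the database for its query plan. The sketch below assumes SQLite through Python's sqlite3 module as a stand-in (MySQL has its own EXPLAIN statement for the same purpose), and the index name idx_emails_email is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emails (emailId INTEGER PRIMARY KEY, email TEXT);
-- Hypothetical index on the column used in the WHERE clause.
CREATE INDEX idx_emails_email ON emails (email);
""")

# EXPLAIN QUERY PLAN describes how SQLite intends to run the statement
# without actually executing it.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT emailId FROM emails WHERE email = ?",
    ("test@test.com",),
).fetchall()

# The last column of each plan row is the human-readable description.
plan_details = [row[-1] for row in plan_rows]
for detail in plan_details:
    print(detail)

uses_index = any("idx_emails_email" in detail for detail in plan_details)
```

If the index name shows up in the plan, the filter is resolved through the index rather than a full table scan. The exact wording of the plan text differs between database versions, so the check only looks for the index name.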
It goes without saying that every table must have a primary key, and that's what you should JOIN on.
Sorry, this is stupid:
WHERE invoices.invoiceId >=1
AND invoices.invoiceId <=1
If this is an example of what's auto-generated for you, I'd say you need a better generator.
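As a rough illustration of what a better generator could do here, this sketch collapses an inclusive range into an equality test when both bounds are the same. The generator from the question is not shown, so range_condition and its signature are hypothetical, and Python is used only to keep the example short; the same logic carries over to Java.

```python
def range_condition(column, lower, upper):
    """Build an inclusive range filter as (sql_fragment, bind_parameters).

    Hypothetical helper: when both bounds are equal, emit a single
    equality test instead of a redundant >= / <= pair.
    """
    if lower == upper:
        return f"{column} = ?", [lower]
    return f"{column} >= ? AND {column} <= ?", [lower, upper]


# Equal bounds collapse to one comparison.
print(range_condition("invoices.invoiceId", 1, 1))
# Different bounds keep the range form.
print(range_condition("invoices.invoiceId", 1, 5))
```

Returning bind parameters separately from the SQL fragment keeps the values out of the query string, so the generated text can be used in a prepared statement.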
#3
0
Hard to say, really. I doubt that example is too much of a problem, though admittedly it's somewhat unwieldy. Given what you are doing, the lengthy SQL isn't that much of a problem, as you are hiding it behind some hopefully more expressive presentation. I'd hesitate to impose what amounts to an arbitrary limit on the number of relations you can express. If it turns out to be slow, then that's a schema change and, as far as I can see, out of scope.