Hibernate Criteria by latitude and longitude

Time: 2022-04-28 16:01:56

I have a MySQL table with more than 20 million rows. Is there a way to build a Hibernate Criteria query that returns the rows nearest to a given latitude and longitude?

Using Criteria would be ideal because I also need to apply other filters (price, category, etc.).

Finally, is it possible to get the rows ordered by distance, or are there too many rows for that?

1 Solution

#1


Plan A: With a large number of rows, INDEX(lat) is a non-starter performance-wise, even when the query is restricted to a stripe such as AND lat BETWEEN 65 AND 69. INDEX(lat, lng) is no better, because the optimizer would not use both columns, even with AND lng BETWEEN...

Plan B: Your next choice involves lat and lng plus a subquery, and MySQL 5.6 would be beneficial. It looks something like this (after adding INDEX(lat, lng, id)):

SELECT ... FROM (
    SELECT id FROM tbl
        WHERE lat BETWEEN... 
          AND lng BETWEEN... ) x
    JOIN tbl USING (id)
    WHERE ...;

For various reasons, Plan B is only slightly better than Plan A.
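The BETWEEN bounds in the query above have to come from somewhere. The following is a minimal sketch, not part of the original answer, of how they could be derived from a center point and a search radius. The class `BoundingBox`, its method `around`, and the 111.2 km-per-degree constant are assumptions made for illustration; the sketch treats the Earth as a sphere and ignores the poles and the 180° meridian.

```java
// Hypothetical helper for illustration; not part of Hibernate or MySQL.
final class BoundingBox {
    // Assumed approximate length of one degree of latitude on a spherical Earth.
    static final double KM_PER_DEGREE = 111.2;

    final double minLat, maxLat, minLng, maxLng;

    BoundingBox(double minLat, double maxLat, double minLng, double maxLng) {
        this.minLat = minLat;
        this.maxLat = maxLat;
        this.minLng = minLng;
        this.maxLng = maxLng;
    }

    // Returns the lat/lng rectangle that encloses a circle of radiusKm
    // around (lat, lng). Does not handle the poles or the 180° meridian.
    static BoundingBox around(double lat, double lng, double radiusKm) {
        double dLat = radiusKm / KM_PER_DEGREE;
        // A degree of longitude shrinks by cos(latitude) away from the equator.
        double dLng = radiusKm / (KM_PER_DEGREE * Math.cos(Math.toRadians(lat)));
        return new BoundingBox(lat - dLat, lat + dLat, lng - dLng, lng + dLng);
    }
}
```

The four fields would then be bound to the two BETWEEN clauses. With Hibernate's legacy Criteria API that might be done through `Restrictions.between`, assuming an entity whose properties are named `lat` and `lng`.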

Plan C: With millions of rows, you will need my pizza parlor algorithm. It uses a stored procedure that repeatedly probes the table until it finds enough rows, and it relies on PARTITIONing to get a crude 2D index. The link has reference code that includes filtering on things like category.
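To illustrate only the "probe repeatedly until there are enough rows" idea, here is a small in-memory sketch. It is not the author's stored procedure and involves no partitioning: the class `NearestProbe`, the row type `Place`, and the doubling of the radius are all assumptions chosen for the example, and a plain list stands in for the table. It also shows one way to return the results ordered by distance.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory illustration; a real implementation would probe the table.
final class NearestProbe {
    static final double EARTH_RADIUS_KM = 6371.0; // assumed mean Earth radius

    // Hypothetical row type holding only the columns this sketch needs.
    static final class Place {
        final long id;
        final double lat, lng;
        Place(long id, double lat, double lng) {
            this.id = id;
            this.lat = lat;
            this.lng = lng;
        }
    }

    // Great-circle distance between two points, using the haversine formula.
    static double distanceKm(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLng / 2) * Math.sin(dLng / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Doubles the search radius until at least `wanted` rows fall inside it
    // (or maxKm is reached), then returns up to `wanted` rows, nearest first.
    static List<Place> nearest(List<Place> rows, double lat, double lng,
                               int wanted, double startKm, double maxKm) {
        if (startKm <= 0) {
            throw new IllegalArgumentException("startKm must be positive");
        }
        double radius = startKm;
        List<Place> hits = new ArrayList<>();
        while (true) {
            hits.clear();
            for (Place p : rows) {
                if (distanceKm(lat, lng, p.lat, p.lng) <= radius) {
                    hits.add(p);
                }
            }
            if (hits.size() >= wanted || radius >= maxKm) {
                break;
            }
            radius *= 2; // not enough rows yet: widen the probe and try again
        }
        hits.sort(Comparator.comparingDouble(
                (Place p) -> distanceKm(lat, lng, p.lat, p.lng)));
        return hits.size() > wanted ? new ArrayList<>(hits.subList(0, wanted)) : hits;
    }
}
```

Because every row outside the final radius is farther away than every row inside it, sorting only the hits is enough to get the true nearest rows. Extra filters such as category would be applied inside the probe, alongside the distance test.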

Plans A and B are O(sqrt(N)); Plan C is O(1). That is, for Plans A and B, if you quadruple the number of rows, you double the time taken. Plan C does not get slower as you increase N.
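One way to see where the square root comes from is a rough model, which is my assumption and not something stated in the answer: rows spread uniformly over the map, and a latitude stripe just tall enough that a square box of the same height would hold the wanted number of rows. The class `ScanCost` and the method `stripeRows` are hypothetical names for this sketch.

```java
// Hypothetical back-of-the-envelope model; assumes uniformly distributed rows.
final class ScanCost {
    // Estimated rows inside a full-width latitude stripe whose height equals
    // the side of a square box expected to contain `wanted` rows.
    static double stripeRows(double totalRows, double wanted) {
        // Side of that box as a fraction of the map's side.
        double boxSide = Math.sqrt(wanted / totalRows);
        // The stripe spans the whole width, so it holds totalRows * boxSide rows,
        // which equals sqrt(wanted * totalRows).
        return totalRows * boxSide;
    }
}
```

Under this model, finding 10 rows among 20 million means scanning roughly 14,000 rows in the stripe, and quadrupling the table to 80 million doubles that figure.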
