Ok here's my scenario:
Programming language: Java
I have a MySQL database with around 100,000,000 entries.
I have a list of values in memory, say valueList, with around 10,000 entries.
I want to iterate through valueList and check whether each value in the list has a match in the database.
This means I have to make at least 10,000 database calls, which is highly inefficient for my application. The other way would be to load the entire database into memory once and then do the comparison in memory. This is fast but needs a huge amount of memory.
Could you guys suggest a better approach for this problem?
EDIT :
Suppose valueList consists of values like : {"New","York","Brazil","Detroit"}
From the database, I'll get a match for Brazil and Detroit, but not for New and York, though New York would have matched. So the next step is that, if any non-matched values remain, I combine them to see if they match now. In this case, I combine New and York and then find the match.
In the approach I was following before (one database call per value), this was possible. But with the approach of creating a temp table, this won't be possible.
1 solution
#1
7
You could insert the 10k records into a temporary table with a single insert like this:
insert into tmp_table (id_col)
values (1),
(3),
...
(7);
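Since the question is about Java, here is a minimal sketch of how that single insert might be issued over JDBC. It reuses the placeholder names tmp_table and id_col from the statement above; the class name, the method names and the choice of string values are assumptions for illustration, and the Connection is assumed to come from your own setup.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class TempTableInsert {
    // Builds "insert into tmp_table (id_col) values (?), (?), ..." with one
    // placeholder per row. Table and column names are the placeholders from the answer.
    static String buildInsertSql(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        StringBuilder sql = new StringBuilder("insert into tmp_table (id_col) values ");
        for (int i = 0; i < rowCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append("(?)");
        }
        return sql.toString();
    }

    // Binds every value and sends them all in one statement (one round trip).
    static void insertAll(Connection conn, List<String> values) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(buildInsertSql(values.size()))) {
            for (int i = 0; i < values.size(); i++) {
                ps.setString(i + 1, values.get(i)); // JDBC parameters are 1-based
            }
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Shows the generated statement for three rows.
        System.out.println(buildInsertSql(3));
    }
}
```

Using placeholders rather than concatenating the values avoids quoting problems with values such as city names. If the statement becomes too large for the server's max_allowed_packet setting, the list could be split into a few smaller inserts.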
Then join the 2 tables to get the desired results.
I don't know your table structure, but it could be like this:
select s.*
from some_table s
inner join tmp_table t on t.id_col = s.id
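Regarding the EDIT in the question: the combining step might still work with this approach if it is done in rounds. The sketch below is only one possible reading of that step, not something the question or this answer spells out. It assumes that consecutive unmatched values are joined with a single space, and that the set of matched values has already been read from the join result. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UnmatchedCombiner {
    // Returns the candidates for a second round: each run of two or more
    // consecutive unmatched values, joined with a space (an assumption).
    static List<String> combineUnmatched(List<String> values, Set<String> matched) {
        List<String> candidates = new ArrayList<>();
        List<String> run = new ArrayList<>();
        for (String value : values) {
            if (matched.contains(value)) {
                flush(run, candidates); // a matched value ends the current run
            } else {
                run.add(value);
            }
        }
        flush(run, candidates);
        return candidates;
    }

    // Adds the run as one combined candidate if it has at least two values,
    // then clears it. A single unmatched value is already known not to match.
    private static void flush(List<String> run, List<String> candidates) {
        if (run.size() > 1) {
            candidates.add(String.join(" ", run));
        }
        run.clear();
    }

    public static void main(String[] args) {
        List<String> valueList = Arrays.asList("New", "York", "Brazil", "Detroit");
        Set<String> matched = new HashSet<>(Arrays.asList("Brazil", "Detroit"));
        System.out.println(combineUnmatched(valueList, matched)); // prints [New York]
    }
}
```

The combined candidates could then be inserted into the temporary table and joined again, so the whole check takes two round trips instead of 10,000.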