I have the following SQL:
SELECT id, url
FROM link
WHERE visited = false
ORDER BY id
LIMIT 500;
-- 500 is only an example
I'm building a web crawler, and there is a table of links. This query returns the links to visit, but not all of them, only the quantity defined in the LIMIT clause.
I will use threads. When the first thread executes this query, it should obtain the first 500 links; when the second thread executes the same query, it should obtain the next 500. In other words, the first thread obtains links 1 to 500, the second thread obtains 501 to 1000, the third thread obtains 1001 to 1500, and so on.
It may not even be threads, but different computers running the same application. I don't know whether I need to add a column to the table to mark a row as in use by another thread/application, or whether I can do this with SQL/the DBMS alone. I'm using PostgreSQL.
In other words, I need to lock each row a query returns so that it does not appear in another query's results.
2 answers
#1
0
Have you tried FOR UPDATE / RETURNING? The query below assumes an extra boolean column, visiting, that marks rows already claimed by a worker.
update link
set visiting = true
from (
select id
from link
where visiting = false
and visited = false
limit 500
for update
) as batch
where batch.id = link.id
returning *;
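With a plain FOR UPDATE, a second worker running the same statement can block on the rows the first worker has locked until that transaction ends. On PostgreSQL 9.5 or later, SKIP LOCKED lets each worker pass over rows that are already locked instead of waiting. The following is a minimal sketch of that variant, not a definitive implementation. It assumes the visiting column does not exist yet in the question's table and adds it as a one-time setup step; the column name and its default are assumptions taken from the answer above.

```sql
-- One-time setup (assumption: the original link table has no such column).
ALTER TABLE link ADD COLUMN visiting boolean NOT NULL DEFAULT false;

-- Run by each worker (thread or machine) to claim its own batch.
BEGIN;

UPDATE link
SET visiting = true
FROM (
    SELECT id
    FROM link
    WHERE visiting = false
      AND visited = false
    ORDER BY id               -- claim the lowest ids first
    LIMIT 500                 -- 500 is only an example
    FOR UPDATE SKIP LOCKED    -- ignore rows another worker has locked
) AS batch
WHERE batch.id = link.id
RETURNING link.id, link.url;

COMMIT;
```

Once a link has been crawled, the worker would set visited = true for it (and could reset visiting = false). Rows left with visiting = true by a crashed worker would need some cleanup rule, which is outside this sketch.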
#2
0
Skip 1500 rows and take the next 500
SELECT id, url
FROM link
WHERE visited = false
ORDER BY id
LIMIT 500 OFFSET 1500
http://www.postgresql.org/docs/8.3/interactive/queries-limit.html
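The OFFSET approach requires each worker to know its own position. As a rough sketch, a prepared statement can compute the offset from a zero-based worker index; the statement name next_batch and the idea of a worker index are assumptions for illustration, not part of the question.

```sql
-- $1 = zero-based worker index (hypothetical), $2 = batch size
PREPARE next_batch(int, int) AS
SELECT id, url
FROM link
WHERE visited = false
ORDER BY id
LIMIT $2 OFFSET $1 * $2;

-- Worker 3 skips 3 * 500 = 1500 rows and takes the next 500.
EXECUTE next_batch(3, 500);
```

Note that the window is computed over rows that still have visited = false, so as soon as any worker marks rows as visited, the offsets shift and workers may skip or repeat links. This only behaves predictably if all workers fetch their batches before any row is updated.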