Is there any way, Linux-specific or not, to have POSIX shared memory segments (obtained with shm_open()) removed when no process is using them? That is, have them reference-counted and have the system remove them when the count drops to 0.
A few notes:
- Establishing an atexit() handler to remove them doesn't work if the program crashes.
- Currently (the Linux-specific way), I embed the pid in the segment name and try to find unused segments by walking /dev/shm in an external program. This has the drawback that they must be cleaned up periodically, externally, in a rather hackish way.
- As the program can run multiple copies, using a well-defined segment name that the program reuses when it starts up is not feasible.
7 Answers
#1
5
No, at least on Linux, the kernel doesn't contain anything that can do this. It's up to some application to call shm_unlink() at some point to get rid of a shared memory segment.
#2
4
If there is a well-known point in your program's execution by which all processes that need to open the shared memory segment have already done so, you can safely unlink it. Unlinking removes the object from the global namespace, but it still lingers around as long as at least one process keeps its file descriptor open. If a crash occurs after that point, the file descriptor is automatically closed and the reference count is decremented. Once no open descriptors to the unlinked shared memory block remain, it is deleted.
This is useful in the following scenario: a process creates a shared memory block, unlinks it and then forks. The child inherits the file descriptor and can use the shared memory block to communicate with the parent. Once both processes terminate, the block is automatically removed as both file descriptors get closed.
While unlinked, the shared memory block is unavailable for other processes to open. Meanwhile, if one uses shm_open() with the same name as the unlinked block, a new and completely different shared memory block is created instead.
#3
2
I found a way using a system command and the Linux command fuser, which lists the processes that have a file open. This way, you can check whether the shared memory file (located in /dev/shm) is still in use and delete it if not. Note that the check/delete/create operations must be enclosed in an inter-process critical section using a named mutex, a named semaphore, or a file lock.
std::string shm_file = "/dev/shm/" + service_name + "Shm";
// Note the ';' before the second 'else': without it the shell reports a syntax error.
std::string cmd_line = "if [ -f " + shm_file + " ] ; then if ! fuser -s " + shm_file + " ; then rm -f " + shm_file + " ; else exit 2 ; fi ; else exit 3 ; fi";
int res = system(cmd_line.c_str());
switch (WEXITSTATUS(res)) {
case 0: _logger.warning ("The shared memory file " + shm_file + " was found orphaned and has been deleted"); break;
case 1: _logger.critical("The shared memory file " + shm_file + " was found orphaned and cannot be deleted"); break;
case 2: _logger.trace   ("The shared memory file " + shm_file + " is still in use by live processes"); break;
case 3: _logger.trace   ("The shared memory file " + shm_file + " was not found"); break;
}
#4
1
Such behaviour is possible for shared memory created with the System V API, on Linux only. It is not POSIX shared memory, but it may work for you.
In the book The Linux Programming Interface, one of the possible parameters for shmctl() is described as follows.
IPC_RMID Mark the shared memory segment and its associated shmid_ds data structure for deletion. If no processes currently have the segment attached, deletion is immediate; otherwise, the segment is removed after all processes have detached from it (i.e., when the value of the shm_nattch field in the shmid_ds data structure falls to 0). In some applications, we can make sure that a shared memory segment is tidily cleared away on application termination by marking it for deletion immediately after all processes have attached it to their virtual address space with shmat(). This is analogous to unlinking a file once we’ve opened it. On Linux, if a shared segment has been marked for deletion using IPC_RMID, but has not yet been removed because some process still has it attached, then it is possible for another process to attach that segment. However, this behavior is not portable: most UNIX implementations prevent new attaches to a segment marked for deletion. (SUSv3 is silent on what behavior should occur in this scenario.) A few Linux applications have come to depend on this behavior, which is why Linux has not been changed to match other UNIX implementations.
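A minimal sketch of that technique, assuming Linux. IPC_PRIVATE is used here for brevity; a real application would pass the segment id to the other processes or derive a key with ftok().

```cpp
// Attach a System V segment and mark it IPC_RMID right away, so the
// kernel reclaims it when the last attached process detaches or dies.
#include <sys/ipc.h>
#include <sys/shm.h>
#include <cstring>

char* attach_self_cleaning_segment(size_t size) {
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id == -1) return nullptr;
    void* p = shmat(id, nullptr, 0);
    if (p == reinterpret_cast<void*>(-1)) {
        shmctl(id, IPC_RMID, nullptr);
        return nullptr;
    }
    // Mark for deletion immediately, as the quote suggests: the mapping
    // stays valid until this process detaches or exits, even on a crash.
    shmctl(id, IPC_RMID, nullptr);
    return static_cast<char*>(p);
}
```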
#5
0
Could you not just use a global counting semaphore to reference count? Wrap the attach and detach calls so that the semaphore is incremented when you attach to the memory and decremented when you detach. Release the segment when a detach reduces the semaphore to zero.
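A sketch of this idea using a POSIX named semaphore as the counter. The names are placeholders, and note that this is exactly where the original problem bites: if a process crashes between these calls, the count is never decremented, so the scheme is not crash-safe on its own.

```cpp
#include <semaphore.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

// Illustrative names, not from the answer.
static const char* kShmName = "/demo_refcounted_shm";
static const char* kSemName = "/demo_refcounted_shm_refs";

int attach_segment() {
    sem_t* s = sem_open(kSemName, O_CREAT, 0600, 0);
    if (s == SEM_FAILED) return -1;
    sem_post(s);                                  // ++refcount
    return shm_open(kShmName, O_CREAT | O_RDWR, 0600);
}

void detach_segment(int fd) {
    close(fd);
    sem_t* s = sem_open(kSemName, 0);
    if (s == SEM_FAILED) return;
    sem_trywait(s);                               // --refcount
    int refs = 1;
    sem_getvalue(s, &refs);
    if (refs == 0) {                              // last user removes both names
        shm_unlink(kShmName);                     // NOTE: getvalue + unlink is not
        sem_unlink(kSemName);                     // atomic; a real version needs a lock
    }
    sem_close(s);
}
```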
#6
0
Not sure if the following works or is feasible, but here is my attempt. Why not register a helper program that is executed each time your program crashes?
i.e., set /proc/sys/kernel/core_pattern to |/path/to/Myprogram %p (the leading pipe tells the kernel to pipe the core dump to that program rather than treat it as a filename).
Myprogram is then executed whenever a process crashes, and you can probably explore further from there.
See man 5 core for more information.
Hope this helps to some extent.
#7
0
Let's assume the most complicated case:
- You have several processes communicating via shared memory.
- They can start and finish at any time, even multiple times. That means there is no master process, nor a dedicated "first" process that could initialize the shared memory.
- That means, e.g., there is no point at which you can safely unlink the shared memory, so neither Sergey's nor Hristo's answers work.
I see two possible solutions and would welcome feedback on them because the internet is horribly silent on this question:
- Store the pid (or a more specific process identifier, if you have one) of the last process that wrote to the shared memory inside the shared memory itself, as a lock. Then you could do something like the following pseudocode:
int* pshmem = open shared memory()
while(true)
    nPid = atomic_read(pshmem)
    if nPid = 0
        // your shared memory is in a valid state
        break
    else
        // process nPid holds a lock to your shared memory
        // or it might have crashed while holding the lock
        if process nPid still exists
            // shared memory is valid
            break
        else
            // shared memory is corrupt
            // try to acquire the lock
            if atomic_compare_exchange(pshmem, nPid, my pid)
                // we have the lock
                reinitialize shared memory
                atomic_write(pshmem, 0)   // release lock
            else
                // somebody else got the lock in the meantime
                // continue loop
This verifies that the last writer didn't die while writing. The shared memory still persists longer than any of your processes.
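One possible C++ rendering of that pseudocode, using a std::atomic word at the start of the mapped segment and kill(pid, 0) as the liveness check. The details here are assumptions, not from the answer, and std::atomic<pid_t> must be lock-free for this to be valid across processes.

```cpp
#include <atomic>
#include <cerrno>
#include <csignal>
#include <unistd.h>

// lock_word points at the first word of the shared memory segment.
// Returns once the segment is known to be in a valid state.
void ensure_valid(std::atomic<pid_t>* lock_word) {
    for (;;) {
        pid_t owner = lock_word->load();
        if (owner == 0)
            return;                             // valid and unlocked
        if (kill(owner, 0) == 0 || errno == EPERM)
            return;                             // owner still alive: segment is valid
        // Owner died while holding the lock: segment may be corrupt.
        pid_t expected = owner;
        if (lock_word->compare_exchange_strong(expected, getpid())) {
            // reinitialize_shared_memory();    // application-specific
            lock_word->store(0);                // release the lock
            return;
        }
        // Somebody else took the lock in the meantime; re-check.
    }
}
```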
- Use a reader/writer file lock to find out whether any process is the first to open the shared memory object. The first process may then reinitialize the shared memory:
// try to get an exclusive lock on the lockfile
int fd = open(lockfile, O_RDONLY | O_CREAT | O_EXLOCK | O_NONBLOCK, ...)
if fd == -1
    // didn't work, somebody else has it and is initializing the shared memory
    // acquire a shared lock and wait for it
    fd = open(lockfile, O_RDONLY | O_SHLOCK)
    // open shared memory
else
    // we are the first
    // delete shared memory object
    // possibly delete named mutex/semaphore as well
    // create shared memory object (& semaphore)
    // downgrade the exclusive lock to a shared lock
    flock(fd, LOCK_SH)
File locks seem to be the only (?) mechanism on POSIX systems that is cleaned up automatically when a process dies. Unfortunately, the list of caveats for using them is very, very long. The algorithm assumes flock is supported on the underlying filesystem, at least on the local machine. The algorithm doesn't care whether the locks are actually visible to other processes on NFS filesystems or not; they only have to be visible to all processes accessing the shared memory object.
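One more caveat for that sketch: O_EXLOCK and O_SHLOCK are BSD extensions, so on Linux the same sequence could be written with open() followed by flock(), roughly as below. The lockfile path is the caller's choice, and flock(2) warns that converting a lock may briefly release it, so this is a sketch, not a hardened implementation.

```cpp
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

// Returns the lock fd, held for the life of the process; *first tells the
// caller whether it won the race and should (re)create the shared memory.
int open_init_lock(const char* lockfile, bool* first) {
    int fd = open(lockfile, O_RDONLY | O_CREAT, 0600);
    if (fd == -1) return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        *first = true;            // we are first: recreate the shm object here
        flock(fd, LOCK_SH);       // then downgrade so others can proceed
    } else {
        *first = false;           // initializer at work: block until it downgrades
        flock(fd, LOCK_SH);
    }
    return fd;                    // closing fd -- or crashing -- drops the lock
}
```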