与igraph或其他libaries重叠的社区检测。

时间:2023-02-03 08:26:05

I would like to detect overlapping communities in small networks/graphs. By overlapping, I mean that a node can be included within more than one communities/clusters in the output of the detection algorithm.

我想在小网络/图表中发现重叠的社区。通过重叠,我的意思是,在检测算法的输出中,一个节点可以包含在多个社区/集群中。

I have looked at various community detection algorithms curretly provided by igraph, but I think none of them handles overlapping communities.

我研究了igraph提供的各种社区检测算法,但我认为它们都不能处理重叠的社区。

Ideally, I would like to be able to programmatically utilize some implementation of such algorithm(s) in Python. However, implementation in other languages is OK too.

理想情况下,我希望能够以编程的方式在Python中使用这种算法的一些实现。然而,其他语言的实现也可以。

4 个解决方案

#1


9  

I have implemented the hierarchical link clustering algorithm of Ahn et al a while ago using the Python interface of igraph; see its source code here.

我在之前使用igraph的Python接口实现了Ahn等人的层次链接聚类算法;参见这里的源代码。

Also, implementing CFinder in Python using igraph is fairly easy; this is what I came up with:

此外,使用igraph在Python中实现CFinder相当简单;这就是我想到的:

#!/usr/bin/env python
from itertools import combinations

import igraph
import optparse

parser = optparse.OptionParser(usage="%prog [options] infile")
parser.add_option("-k", metavar="K", default=3, type=int,
        help="use a clique size of K")

options, args = parser.parse_args()

if not args:
    parser.error("Required input file as first argument")

k = options.k
g = igraph.load(args[0], format="ncol", directed=False)
cls = map(set, g.maximal_cliques(min=k))

edgelist = []
for i, j in combinations(range(len(cls)), 2):
    if len(cls[i].intersection(cls[j])) >= k-1:
        edgelist.append((i, j))

cg = igraph.Graph(edgelist, directed=False)
clusters = cg.clusters()
for cluster in clusters:
    members = set()
    for i in cluster:
        members.update(cls[i])
    print "\t".join(g.vs[members]["name"])

#2


2  

If you don't mind using another programming language, you have CFinder (Java), which is based on clique percolation (it basically looks for tightly connected cliques), OSLOM (C++), which optimizes a statistical measure, and certainly others.

如果您不介意使用另一种编程语言,那么您就有了CFinder (Java),它是基于clique percolation(它基本上是寻找紧密连接的cliques), OSLOM (c++),它优化了一个统计度量,当然还有其他的。

Otherwise, if you want to stick to python, you could also apply the method by Evans & Lambiotte '09: 1) transform your graph into a line graph, 2) apply a regular community detection method on it, for example using igraph, and 3) obtain overlapping communities. Converting your graph into a line graph does not look too complicated, and should be fast, provided your original graph is not too large. It will be faster than performing the community detection itself, in any case.

否则,如果您想要坚持python,您也可以使用Evans & Lambiotte '09: 1)方法将您的图转换成线形图,2)应用常规的社区检测方法,例如使用igraph, 3)获得重叠的社区。把你的图形转换成线形图看起来不太复杂,而且应该很快,只要你的原始图形不太大。无论如何,它将比社区检测本身更快。

Note there are alternatives methods to that of Evans & Lambiotte to get overlapping communities from a regular (mutually exclusive communities) method, such as those of Bennet et al. '12 or Wang et al. '09. However, implementing them seems less straightforward.

请注意,Evans & Lambiotte的替代方法可以从常规的(互斥的社区)方法中得到重叠的社区,例如班奈特等人的方法。“12或Wang等人。”09年。然而,实现它们似乎不那么简单。

#3


1  

According to this blog, networkx can now compute for overlapping communities.

根据这个博客,networkx现在可以计算重叠社区。

The code below is for clique percolation method and is available in Networkx 11.6. Github here

下面的代码是clique percolation方法,在Networkx 11.6中可用。Github在这里

import networkx as nx
from itertools import combinations

def get_percolated_cliques(G, k):
perc_graph = nx.Graph()
cliques = list(frozenset(c) for c in nx.find_cliques(G) if len(c) >= k)
perc_graph.add_nodes_from(cliques)

# Add an edge in the clique graph for each pair of cliques that percolate
for c1, c2 in combinations(cliques, 2):
    if len(c1.intersection(c2)) >= (k - 1):
        perc_graph.add_edge(c1, c2)

for component in nx.connected_components(perc_graph):
    yield(frozenset.union(*component))

#4


0  

The CFinder implementing the Clique Percolation Method (CPM). If you use python, Networkx have already implement the same(see this link).

实现Clique Percolation方法(CPM)的CFinder。如果您使用python,那么Networkx已经实现了相同的功能(参见此链接)。

>>> G = nx.complete_graph(5)
>>> K5 = nx.convert_node_labels_to_integers(G,first_label=2)
>>> G.add_edges_from(K5.edges())
>>> c = list(nx.k_clique_communities(G, 4))
>>> list(c[0])
[0, 1, 2, 3, 4, 5, 6]
>>> list(nx.k_clique_communities(G, 6))

#1


9  

I have implemented the hierarchical link clustering algorithm of Ahn et al a while ago using the Python interface of igraph; see its source code here.

我在之前使用igraph的Python接口实现了Ahn等人的层次链接聚类算法;参见这里的源代码。

Also, implementing CFinder in Python using igraph is fairly easy; this is what I came up with:

此外,使用igraph在Python中实现CFinder相当简单;这就是我想到的:

#!/usr/bin/env python
from itertools import combinations

import igraph
import optparse

parser = optparse.OptionParser(usage="%prog [options] infile")
parser.add_option("-k", metavar="K", default=3, type=int,
        help="use a clique size of K")

options, args = parser.parse_args()

if not args:
    parser.error("Required input file as first argument")

k = options.k
g = igraph.load(args[0], format="ncol", directed=False)
cls = map(set, g.maximal_cliques(min=k))

edgelist = []
for i, j in combinations(range(len(cls)), 2):
    if len(cls[i].intersection(cls[j])) >= k-1:
        edgelist.append((i, j))

cg = igraph.Graph(edgelist, directed=False)
clusters = cg.clusters()
for cluster in clusters:
    members = set()
    for i in cluster:
        members.update(cls[i])
    print "\t".join(g.vs[members]["name"])

#2


2  

If you don't mind using another programming language, you have CFinder (Java), which is based on clique percolation (it basically looks for tightly connected cliques), OSLOM (C++), which optimizes a statistical measure, and certainly others.

如果您不介意使用另一种编程语言,那么您就有了CFinder (Java),它是基于clique percolation(它基本上是寻找紧密连接的cliques), OSLOM (c++),它优化了一个统计度量,当然还有其他的。

Otherwise, if you want to stick to python, you could also apply the method by Evans & Lambiotte '09: 1) transform your graph into a line graph, 2) apply a regular community detection method on it, for example using igraph, and 3) obtain overlapping communities. Converting your graph into a line graph does not look too complicated, and should be fast, provided your original graph is not too large. It will be faster than performing the community detection itself, in any case.

否则,如果您想要坚持python,您也可以使用Evans & Lambiotte '09: 1)方法将您的图转换成线形图,2)应用常规的社区检测方法,例如使用igraph, 3)获得重叠的社区。把你的图形转换成线形图看起来不太复杂,而且应该很快,只要你的原始图形不太大。无论如何,它将比社区检测本身更快。

Note there are alternatives methods to that of Evans & Lambiotte to get overlapping communities from a regular (mutually exclusive communities) method, such as those of Bennet et al. '12 or Wang et al. '09. However, implementing them seems less straightforward.

请注意,Evans & Lambiotte的替代方法可以从常规的(互斥的社区)方法中得到重叠的社区,例如班奈特等人的方法。“12或Wang等人。”09年。然而,实现它们似乎不那么简单。

#3


1  

According to this blog, networkx can now compute for overlapping communities.

根据这个博客,networkx现在可以计算重叠社区。

The code below is for clique percolation method and is available in Networkx 11.6. Github here

下面的代码是clique percolation方法,在Networkx 11.6中可用。Github在这里

import networkx as nx
from itertools import combinations

def get_percolated_cliques(G, k):
perc_graph = nx.Graph()
cliques = list(frozenset(c) for c in nx.find_cliques(G) if len(c) >= k)
perc_graph.add_nodes_from(cliques)

# Add an edge in the clique graph for each pair of cliques that percolate
for c1, c2 in combinations(cliques, 2):
    if len(c1.intersection(c2)) >= (k - 1):
        perc_graph.add_edge(c1, c2)

for component in nx.connected_components(perc_graph):
    yield(frozenset.union(*component))

#4


0  

The CFinder implementing the Clique Percolation Method (CPM). If you use python, Networkx have already implement the same(see this link).

实现Clique Percolation方法(CPM)的CFinder。如果您使用python,那么Networkx已经实现了相同的功能(参见此链接)。

>>> G = nx.complete_graph(5)
>>> K5 = nx.convert_node_labels_to_integers(G,first_label=2)
>>> G.add_edges_from(K5.edges())
>>> c = list(nx.k_clique_communities(G, 4))
>>> list(c[0])
[0, 1, 2, 3, 4, 5, 6]
>>> list(nx.k_clique_communities(G, 6))