Community Detection Code

This page provides the code developed for the community detection work and points out to code written by others that was used. Please feel free to contact me if you have any trouble with it or find a bug somewhere. All my code is free and can be reused. Should you make use of it in a publication, please cite the relevant paper.

 


C++ code

The C++ code implementing the work from Fast Multi-Scale Detection of Relevant Communities in Large Scale Networks and Fast Multi-Scale Community Detection based on Local Criteria within a Multi-Threaded Algorithm (see publications) with a selection of six criteria is available for download below. The archive contains the whole framework with algorithms, graph and communities file readers and writers, analysis tools, etc. Please read the Readme.txt file for instructions on how to compile and use the tools.

Latest version (beta):

Previous version:

 


Matlab code

Get all the available Matlab code in an archive file here or see below for a detail of the available files. Don't forget to get the external files from the authors' website (mpower2, Louvain, InfoMap) if you want to use these functions. See below.

 

The work on Fast Multi-Scale Detection of Relevant Communities in Large Scale Networks uses C++ implementations but I also wrote corresponding Matlab implementations as prototypes:

  • fast_mo.m: Fast greedy modularity optimisation algorithm based on the multi-scale algorithm but optimised for modularity (mono-scale).
  • mscd_afg.m: Fast multi-scale community detection algorithm using the criterion from Arenas et al.
  • mscd_hslsw.m: Fast multi-scale community detection algorithm using the criterion from Huang et al.
  • mscd_lfk.m: Fast multi-scale community detection algorithm using the criterion from Lancichinetti et al.
  • mscd_rb.m: Fast multi-scale community detection algorithm using the criterion from Reichardt & Bornholdt.
  • mscd_rn.m: Fast multi-scale community detection algorithm using the criterion from Ronhovde & Nussinov.
  • mscd_so.m: Fast multi-scale community detection algorithm using stability optimisation.

Note that for LFK and HSLSW the Matlab algorithms are older versions compared to the final C++ ones which are better optimised.

Also scale parameters must be given in order of increasing scale. For SO it goes incremental from 0 (e.g. [0:0.1:2]). For the other criteria 0 is the coarsest level and should therefore be last (e.g. [2:-0.1:0]).

 

In the papers Multi-Scale Community Detection using Stability as Optimisation Criterion in a Greedy Algorithm and Multi-Scale Community Detection using Stability Optimisation we used various algorithms implemented in Matlab. I wrote an implementation of the fast Newman algorithm an then adaptations to stability, Reichardt & Bornholdt's, Danon et al.'s and Ronhovde & Nussinov's criteria. (Arenas et al.'s criterion can be computed using Newman's algorithm on the adjacency matrix plus the self-loops.) These algorithms are available below:

  • gso_discrete.m: Greedy stability optimisation using the Markov chain model.
  • gso_continuous.m: Greedy stability optimisation using a time-continuous Markov process.
  • msgso_discrete.m: Multi-step greedy stability optimisation using the Markov chain model.
  • msgso_continuous.m: Multi-step greedy stability optimisation using a time-continuous Markov process.
  • rgso_discrete.m: Randomised greedy stability optimisation using the Markov chain model.
  • rgso_continuous.m: Randomised greedy stability optimisation using a time-continuous Markov process.
  • gso.m: Front-end function calling any of the aforementioned algorithm based on its parameter values. It also enables, if specified, a pre-processing phase removing nodes with a single neighbour and a post-processing phased based on the Kernighan-Lin algorithm adapted for stability.
  • ogso.m: Same as the above function but for overlapping communities. The network is first converted to an edge-graph and then community detection is performed on the edge-graph.
  • fast_newman.m: Greedy modularity optimisation with Newman's fast algorithm.
  • danon.m: Greedy optimisation of Danon et al.'s criterion.
  • louvain_so.m: Uses the Louvain method to optimise stability. This function assumes the louvain function is available and running. See below.
  • reichardt.m: Greedy optimisation of Reichardt & Bornholdt's criterion.
  • ronhovde.m: Greedy optimisation of Ronhovde & Nussinov's criterion.

 

I also wrote some tool functions that are required by some algorithms or are just useful when working with communities:

  • adj2pajek.m: Converts an adjacency matrix to a Pajek file.
  • com2pajek.m: Converts a community partition to a Pajek file.
  • check_coms.m: Checks that a community set contains only connected components.
  • generate_h134.m: Generates the H13-4 network.
  • generate_rb.m: Generates the Ravasz and Barabasi (2003) networks.
  • is_connected.m: Returns true if the given component is connected.
  • get_community_matrix.m: Computes the community matrix 'e' used in the computation of modularity and of some derived criteria.
  • get_indicator_matrix.m: Computes the indicator matrix 'H' used in the computation of stability.
  • gnmi.m: Computes the generalised normalised mutual information between two (overlapping) sets of communities.
  • kernighan_lin.m: Kernighan-Lin refinement algorithm adapted for stability.
  • line_graph: Computes the line-graph (or edge-graph) of a given graph.
  • modularity.m: Computes the modularity of a partition.
  • nmi.m: Computes the normalised mutual information between two partitions.
  • stability.m: Computes the stability of a partition.

Also the code can make use of the fast matrix multiplication function mpower2 provided by J. Tursa available here which replaces the initial Matlab function mpower. The initial Matlab function mpower can perfectly be used but mpower2 provides faster computation.

Regarding the Louvain method it is available here. Download the hybrid C++/Matlab version and compile it for Matlab. You can then use the wrapper louvain.m to interact with it like with the other algorithms.

The InfoMap method available from here can be compiled to a system binary and called from Matlab using the wrapper infomap.m. (Amend in the file the InfoMap binary directory.)

For graph and community visualisation I use MatlabBGL and GraphViz for Matlab. I also use Pajek (see conversion scripts). The more recent software Gephi offers a nice interactive user interface and compatibility with many graph file formats including Pajek's formats.