Skip to the content.

k-Clique Count Estimation

Overview

This guide provides instructions for compiling and running our k-Clique count estimation algorithm SR-kCCE. It estimates the number of k-cliques in a given graph with a given relative error epsilon; the failure probability (i.e., produce an estimation with relative error larger than epsilon) is 1%.

This program also includes the implementation of other algorithms, Pivoter and DPColorPath.

If you are using the code, please cite our paper.

Lijun Chang, Rashmika Gamage, Jeffrey Xu Yu:
Efficient k-Clique Count Estimation with Accuracy Guarantee.
Proc. VLDB Endow. 17(11): 3707-3719 (2024)

Compilation

To compile the code, run the following commands:

$ make clean
$ make

It will generate an executable file named kCliqueC.

Execution

To run the kCliqueC program, use the following command:

$ ./kCliqueC -g {path_to_graph} -k {k_value} -a {algorithm} -e {epsilon}

Parameters:

Examples for reproducing the paper experimental results:

1. Get the exact 12-clique count in com-youtube

$ ./kCliqueC -b -g datasets/com-youtube -k 12 -a Pivoter

The running time and memory usage (reported in Figures 11–13) can be obtained by using the /usr/bin/time command.

2. Get the estimated 12-clique count in com-youtube by our algorithm SR-kCCE with a relative error 0.001

$ ./kCliqueC -b -g datasets/com-youtube -k 12 -a SR-kCCE -e 0.001

By comparing the estimated count with the exact count, we can get the actual relative error achieved by SR-kCCE as reported in Figures 7 and 8

The k-clique density mu (reported in Figure 10) can be obtained as (total_estimate_cnt - exact_cnt)/UB.

The running time and memory usage (reported in Figures 11–13) can be obtained by using the /usr/bin/time command.

3. Get the estimated 12-clique count in com-youtube by the existing algorithm DPColorPath with a relative error 0.001

$ ./kCliqueC -b -g datasets/com-youtube -k 12 -a DPColorPath -e 0.001

By comparing the estimated count with the exact count, we can get the actual relative error achieved by DPColorPath as reported in Figure 8

The k-clique density mu (reported in Figure 10) can be obtained as (total_estimate_cnt - exact_cnt)/UB.

The running time and memory usage (reported in Figures 11–13) can be obtained by using the /usr/bin/time command.

4. Get the estimated 12-clique count in com-youtube by the existing algorithm DPColorPath with a fixed number (i.e., 50,000,000) samples

$ ./kCliqueC -b -g datasets/com-youtube -k 12 -a DPColorPath -t 50000000

By comparing the estimated count with the exact count, we can get the actual relative error achieved by DPColorPath5e7 as reported in Figure 8

5. Run our algorithm SR-kCCE with a fixed number (e.g., 1000) of refinement and relative error 0.001

$ ./kCliqueC -b -g datasets/com-youtube -k 12 -a SR-kCCE -e 0.001 -r 1000

From the output, we can get the results reported in Figure 14

Data Format

The SR-kCCE program supports two data formats for input graphs: the default text format and a more time-efficient binary format.

Default Text Format

Example of edges.txt

5 7
0 1
0 2
1 2
1 3
1 4
2 3
3 4

Binary Format

For faster graph loading, the binary format is recommended. Each graph is represented by two binary files: b_adj.bin and b_degree.bin.

To read the input graph from the binary format, add the -b flag when running the kCliqueC program:

$ ./kCliqueC -g datasets/CA-GrQc -k 9 -a SR-kCCE -e 0.001 -b 

More Information

For more details on the data format, refer to the Cohesive Subgraph Book datasets page.

Bug fixes

A bug in Pivotor_matrix.h -> init_C() was fixed on 21 October 2024.