Skip to content

SoheilaAnsari/PheroLP

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

13 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

ACO-LP: Supervised Link Prediction via Ant-Colony Pheromone Features

Made with MATLAB License: MIT

MATLAB implementation of a supervised missing-link prediction pipeline that uses an Ant-Colony-Optimization-derived pheromone (PH) feature, combined with the classical Common Neighbor (CN) and Node-Link Clustering (NLC) features, on eight real-world complex networks.


What is in this repository

acolp-link-prediction/
β”œβ”€β”€ README.md                     ← you are here
β”œβ”€β”€ CITATION.cff                  ← machine-readable citation
β”œβ”€β”€ LICENSE                       ← MIT (code)
β”œβ”€β”€ .gitignore
β”œβ”€β”€ data/                         ← 8 benchmark networks (see data/README.md)
β”œβ”€β”€ Supervised/                   ← supervised pipeline (28 files)
β”‚   └── Main.m                    ←   entry point
└── Unsupervised/                 ← unsupervised baselines (12 files)
    β”œβ”€β”€ Unsupervised_CN_V1.m      ←   CN baseline
    └── Unsupervised_NLC_V0.m     ←   NLC baseline

The two implementation folders are kept self-contained on purpose: a few filenames are shared between them (adj_gen.m, CN.m, NLC.m, sort_column1.m, …) but the supervised and unsupervised versions are not identical, for example, sort_column1.m sorts ascending for triangle search in the supervised code and descending for top-L ranking in the unsupervised code. Keep them separate when you run experiments.

See Supervised/README.md and Unsupervised/README.md for run instructions.


Method in one paragraph

For each candidate node pair, we compute three structural features: CN (number of common neighbours), NLC (node-link clustering coefficient, Wu et al. 2016), and PH (a pheromone score derived from an Ant-Colony triangle search; see Β§3.3 of the paper). Class imbalance is handled with ADASYN oversampling, the candidate pool is restricted to node pairs at graph distance ≀ 2, and a linear-kernel SVM is trained on a 10-fold cross-validation split. Performance is reported in terms of Recall, F1-measure, and AUP (area under the precision-recall curve).

The PH feature is the main contribution. Its design is inspired by the unsupervised ACO link prediction of Sherkat et al. (2015), but here the pheromone values are not used to score links directly, they are used as a feature that a downstream classifier can combine with other indices. The combined feature vector SVM(NLC, CN, PH) gives the best result on the largest benchmark (PolBlogs, AUP β‰ˆ 0.90).


Quick start

You will need MATLAB R2018b or newer with the Statistics and Machine Learning Toolbox (for fitcsvm, perfcurve, crossvalind) and the graph functions (graph, adjacency, degree).

% From the repo root, add the code and data to the path
addpath(genpath('Supervised'));
addpath(genpath('Unsupervised'));
addpath('data');

% Run the supervised pipeline (default dataset is 'polblogs')
cd Supervised
Main

To switch datasets, edit the dataset variable near the top of Main.m. Supported values:

'Karate', 'Dolphins', 'Football', 'NS',       % small-scale
'Jazz',   'celegans', 'USAir',    'polblogs'  % large-scale

To turn features on/off, edit the three flags in Main.m:

NLC_feat = 1;   CN_feat = 0;   ACO_feat = 1;

Datasets

Eight publicly available real-world networks are bundled in data/. Full attribution, edge counts, and reference papers are listed in data/README.md. Source: SNAP, Newman's network data collection, KONECT, and the network repository.


How to cite

If you use this code or the PH feature in your own work, please cite

@misc{ansari2021acolp_code, author = {Ansari, Soheila}, title = {{ACO-LP}: Supervised Link Prediction via Ant-Colony Pheromone Features (MATLAB)}, year = {2018}, howpublished = {{https://github.com//acolp-link-prediction}} }


A `CITATION.cff` file is provided at the repo root so GitHub will show a
**"Cite this repository"** button automatically.
---

## Acknowledgments

This work builds on two prior ideas that should be cited alongside
the code:

* **Sherkat, E., Rahgozar, M., Asadpour, M.** (2015).
  *Structural link prediction based on ant colony approach in social
  networks.* Physica A 419, 80–94. β€” origin of the (a)/(b) triangle
  sub-graph idea that the PH feature builds on.
* **Wu, Z., Lin, Y., Wan, H., Jamil, W.** (2016).
  *Predicting top-L missing links with node and link clustering
  information in large-scale networks.* J. Stat. Mech. 083202. β€”
  origin of the NLC feature.

The ADASYN oversampling implementation in `Supervised/ADASYN.m`
follows He, Bai, Garcia & Li (2008).

About

🐜 Predicting missing links in complex networks using ant-colony pheromones as a feature for supervised learning.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages