news.py (7695, 2023-03-01)
news_utils.py (743, 2023-03-01)
requirements.txt (56, 2023-03-01)
test.py (1740, 2023-03-01)
test_network.csv (258924, 2023-03-01)
# Node Embeddings Without Similarity Assumptions (NEWS)
### Given a network, create embedding vectors for each node without assuming any similarity measure between unconnected nodes.
Sample run:
$ /usr/bin/python3 -u test.py --dataset test_network.csv --dim 32 --batch_size 500 --lr 0.05 --epochs 75 --save_every=40
This runs 75 epochs and saves the embedding in `test_network_dim_32_bs_500_lr_0.050_news.npz`. Similar files are also written after every 40 epochs, as set by `--save_every`.
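
The sample output file name above appears to encode the dataset name and the run parameters. The following is a minimal sketch of that naming pattern, inferred from this single example only; the helper `expected_output_name` is hypothetical and is not part of `news.py` or `news_utils.py`, and the script's actual formatting may differ.

```python
import os


def expected_output_name(dataset, dim, batch_size, lr):
    """Guess the final .npz file name for a run.

    Assumption: the name is <dataset stem>_dim_<dim>_bs_<batch_size>_lr_<lr
    with 3 decimals>_news.npz, as suggested by the one example in this README.
    """
    # Drop any directory part and the file extension, e.g. "test_network".
    stem = os.path.splitext(os.path.basename(dataset))[0]
    return f"{stem}_dim_{dim}_bs_{batch_size}_lr_{lr:.3f}_news.npz"


# Reproduces the file name from the sample run.
print(expected_output_name("test_network.csv", 32, 500, 0.05))
```

This only helps locate the final file for a given set of arguments; the names of the intermediate files written every `--save_every` epochs are not covered here.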
Most results in the paper use `batch_size=100` and `lr=0.01`.
To load the embedding:
$ jupyter console
> import news_utils
> emb, node_ids = news_utils.get_embedding('test_network_dim_32_bs_500_lr_0.050_news.npz')
`emb` is a matrix where `emb[i]` is the embedding vector for node `node_ids[i]`.
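
Because rows of `emb` and entries of `node_ids` are aligned by position, it is often convenient to look up a vector by node ID rather than by row index. The sketch below shows one way to do this; `build_lookup` is a hypothetical helper, not part of `news_utils`, and the small lists stand in for the values returned by `news_utils.get_embedding` so that the example runs on its own.

```python
def build_lookup(emb, node_ids):
    """Map each node ID to its embedding vector.

    Relies only on the alignment described above: emb[i] belongs to
    node_ids[i]. Works for any indexable pair of equal length.
    """
    if len(emb) != len(node_ids):
        raise ValueError("emb and node_ids must have the same length")
    return {node_id: emb[i] for i, node_id in enumerate(node_ids)}


# Toy stand-in data (assumption: real values come from get_embedding).
emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
node_ids = [17, 42, 99]

lookup = build_lookup(emb, node_ids)
print(lookup[42])  # the row of emb at the position where node_ids holds 42
```

With the real embedding, the same call would be `build_lookup(emb, node_ids)` on the two values loaded above.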
If you use this code, please cite the following paper:
*Avoiding Biases due to Similarity Assumptions in Node Embeddings*, by D. Chakrabarti, in KDD 2022.