Awesome Directed Graph

Oates9895 · October 26, 2020, 2:34am

2022-now

Directed Graph Auto-Encoders | Proceedings of the AAAI Conference on Artificial Intelligence

Hilbert Space

Spectral graph theory for directed graph

digraph neural network

ICLR 21 Directed Acyclic Graph Neural Networks | OpenReview
ICML 21 http://proceedings.mlr.press/v139/beani21a.html
ICML 21 Directed Graph Embeddings in Pseudo-Riemannian Manifolds
ICLR 20 DIMENet
Spectral-based Graph Convolutional Network for Directed Graphs
Directed Graph Convolutional Network
Directed Graph Neural Network

Wavelet for directed graph

Data

	nodes	edges	classes	features
dblp	17716	105734	4	0 (可以自己处理成词向量)
blogs (polblogs)	1490	19090	2	0 (大概)
Email (email-Eu)	1005	25571	42	0
cora (linq)	2708	5429	7	1433
cora-cite	23166	91500	10	0
cora-full	19793	65311	70	8710
citeseer	3312	4732	6	3703
pubmed	19717	44338	3	500
webkb	877	1608	5	1703

GitHub - GRAND-Lab/TriDNR: Tri-Party Deep Network Representation, IJCAI-16
– dblp
https://github.com/abojchevski/graph2gauss/tree/master/data
– core-full core-ml
GitHub - shchur/gnn-benchmark: Framework for evaluating Graph Neural Network models on semi-supervised node classification task
– core-full core-ml photo computer

Other

Learning from Labeled and Unlabeled Data on a Directed Graph (citation 459)
Graph Signal Processing: Overview, Challenges and Applications (citation 427)
A Geometric Framework for Transfer Learning Using Manifold Alignment (citation 18)

Oates9895 · February 2, 2021, 3:40pm

Ref from "NIPS 20 Digraph Inception Convolutional Networks

Transform digraphs to undirected by relaxing its direction structure

T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” arXiv preprint arXiv:1609.02907, 2016.

We assign separate relation nodes r1 and r2 for each entity pair (e1,r,e2) as (e1,r1) and (e2,r2). Entity nodes are described by sparse feature vectors. We extend the number of features in NELL by assigning a unique one-hot representation for every relation node, effectively resulting in a 61,278-dim sparse feature vector per node. The semi-supervised task here considers the extreme case of only a single labeled example per class in the training set. We construct a binary, symmetric adjacency matrix from this graph by setting entries Aij = 1, if one or more edges are present between nodes i and j.
F. Wu, T. Zhang, A. H. d. Souza Jr, C. Fifty, T. Yu, and K. Q. Weinberger, “Simplifying graph convolutional networks,” arXiv preprint arXiv:1902.07153, 2019. ICML

A ∈ R n×n is a symmetric (typically sparse) adjacency matrix
Y. Wang, W. Wang, Y. Liang, Y. Cai, J. Liu, and B. Hooi, “Nodeaug: Semi-supervised node classification with data augmentation,” in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 207–217.

Each of them contains an unweighted adjacency matrix and bag-of-words features.

Looking for structural patterns and reformulate the graph.

learn specific structure by defining motifs [28], inheritance relationship [18] and second-order proximity[39]. （have to stipulate learning templates or rules in advance, and is not capable to deal with complex structures beyond their definitions.）

F. Monti, K. Otness, and M. M. Bronstein, “Motifnet: a motif-based graph convolutional network for directed graphs,” in 2018 IEEE Data Science Workshop (DSW). IEEE, 2018, pp. 225–228.

a symmetric motif adjacency matrix => motif Laplacian
M.Kampffmeyer,Y.Chen,X.Liang,H.Wang,Y.Zhang,andE.P.Xing,“Rethinkingknowledge graph propagation for zero-shot learning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 11 487–11 496.

use two adjacency matrices: Aa ∈ R N×N denotes the connections from nodes to their ancestors, whereas Ad denotes the connections from nodes to their descendants.
Z. Tong, Y. Liang, C. Sun, D. S. Rosenblum, and A. Lim, “Directed graph convolutional network,” arXiv preprint arXiv:2004.13970, 2020.

First-order Proximity Matrix, Second-order Proximity Matrix (adjacency matrix)

Only apply for strongly connected digraph according to their definition

Y. Ma, J. Hao, Y. Yang, H. Li, J. Jin, and G. Chen, “Spectral-based graph convolutional network for directed graphs,” arXiv preprint arXiv:1907.08990, 2019.

$$
1/2(P+P^T)
$$

ICLR21 DIRECTED ACYCLIC GRAPH NEURAL NETWORKS

Oates9895 · February 10, 2021, 2:12pm

Digraph Fourier transform

Spectral methods

The GSO $S ∈ R_{N×N}$ is a matrix whose entry $S_{ji}$ can be non-zero only if $i = j$ or if $(i, j) ∈ E$.

consider a generic asymmetric GSO S, for instance the adjacency matrix A or one of the several generalized Laplacians for digraphs

H. Sevi, G. Rilling, and P. Borgnat, “Harmonic analysis on directed graphs and applications: from Fourier analysis to wavelets”
F. Chung, “Laplacians and the Cheeger inequality for directed graphs”

(transition matrix P of an ergodic random walk X)
Normalized directed graph Laplacian
Random walk Laplacian
Combinatorial Laplacian

GSO

diagonalizable

$S = Vdiag(λ)V−1$ , with $V := [v_1, …, v_N ]$ denoting the (non-orthogonal) eigenvectors of S and $λ := [λ_1, …, λ_N ]^T$ its possibly complex-valued eigenvalues.

GFT: $\hat{x} = V^{−1}x$
1. ——, "Discrete signal processing on graphs: Frequency analysis,”
  
  In the DSPG framework, a graph Fourier basis corresponds to the Jordan basis of the graph adjacency matrix A
non-diagonalizable
- Jordan decomposition of S and use its generalized eigenvectors as the GFT basis
  1. J. A. Deri and J. M. F. Moura, “Spectral projector-based graph Fourier transforms,”
    
    see also [11] for a careful treatment of the non-diagonalizable case which relies on oblique spectral projectors to define the GFT
- GFT approaches rely on projections onto the (non-orthogonal) eigenvectors of a judicious random walk operator on the digraph [9], [10]

Property:
$$
TV_1(x) := ||x - \hat{S}x||_1
$$

frequency ordering: $λi ≻ λj$ if $TV_1(v_i) > TV_1(v_j)$
$S = S/|λ_{max}|$ and $λ_{max}$ is the spectral radius of S
does not ensure that constant signals have zero variation
generalized eigenvectors of asymmetric GSOs need not be orthonormal, implying that Parseval’s identity will not hold and hence the signal power is not preserved across the vertex and dual domains
obtaining the Jordan decomposition → expensive, numerically unstable
uniqueness of the representation (GSO has repeated eigenvalues)

Orthonormal transform learning

based on the minimization of the convex Lovasz extension of the graph cut size
13. S. Sardellitti, S. Barbarossa, and P. Di Lorenzo, “On the graph Fourier transform for directed graphs,”
DGFT
14. R. Shafipour, A. Khodabakhsh, G. Mateos, and E. Nikolova, “A directed graph Fourier transform with spread frequency components,”

Oates9895 · July 23, 2021, 3:06am

Data Set

Citation Networks

CORA-ML
CITESEER
Pubmed
nodes are documents and edges are citation links.
The datasets contain sparse bag-of-words feature vectors for each document and a list of citation links between documents. We treat the citation links as (undirected) edges and construct a binary, symmetric adjacency matrix A. Each document has a class label. For training, we only use 20 labels per class, but all feature vectors.
Cora-Full
DBLP

Amazon Co-purchase Networks

AM-PHOTO
AM-COMPUTER
where nodes represent goods, while edges represent two kinds of goods that are often purchased together.

Knowledge Graph

NELL
(Carlson et al., 2010; Yang et al., 2016) is a bipartite graph dataset extracted from a knowledge graph with 55,864 relation nodes and 9,891 entity nodes.
NELL is a dataset extracted from the knowledge graph introduced in (Carlson et al., 2010). A knowledge graph is a set of entities connected with directed, labeled edges (relations). We follow the pre-processing scheme as described in Yang et al. (2016). We assign separate relation nodes r1 and r2 for each entity pair (e1,r,e2) as (e1,r1) and (e2,r2). Entity nodes are described by sparse feature vectors. We extend the number of features in NELL by assigning a unique one-hot representation for every relation node, effectively resulting in a 61,278-dim sparse feature vector per node. The semi-supervised task here considers the extreme case of only a single labeled example per class in the training set. We construct a binary, symmetric adjacency matrix from this graph by setting entries Aij = 1, if one or more edges are present between nodes i and j.

Co-author networks

Coauthor-CS (short as Co-CS)
Coauthor-Physics (short as Co-Phy)

Reddit

code

github.com/matenure/FastGCN

Training Accuracy much lower than Validation Accuracy

opened 11:53AM - 15 Jan 19 UTC

closed 07:45PM - 30 Jan 19 UTC

Tiiiger

In running the sample code, I found that the training accuracy is much lower tha…n the validation accuracy, which is different from training GraphSAGE on reddit in their repo. Is this normal. For example, the logging I got: Epoch: 0042 train_loss= 1.72849 train_acc= 0.66406 val_loss= 3.17200 val_acc= 0.90848 time per batch= 0.01104 Epoch: 0043 train_loss= 1.84603 train_acc= 0.59375 val_loss= 3.18259 val_acc= 0.90506 time per batch= 0.01108 Epoch: 0044 train_loss= 1.86952 train_acc= 0.60156 val_loss= 3.17415 val_acc= 0.90324 time per batch= 0.01116 Also, the paper reports the F1 measure. How to get F1 score using the codebase?

Blogs

A directed network of hyperlinks among a large set of U.S. political weblogs from before the 2004 election [2]. It includes blog political affiliation as metadata. Links between blogs were automatically extracted from a crawl of the front page of the blog. In addition, the authors drew on various sources (blog directories, and incoming and outgoing links and posts around the time of the 2004 presidential election) and classified 758 blogs as left-leaning and the remaining 732 as right-leaning.
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.653.5324&rep=rep1&type=pdf

Wikipedia

The hyperlink network of Wikipedia pages on editorial norms [9], in 2015. Nodes are Wikipedia entries, and two entries are linked by a directed edge if one hyperlinks to the other. Editorial norms cover content creation, interactions between users, and formal administrative structure among users and admins. Metadata includes page information such as creation date, number of edits, page views and so on. The number of norm categories is also given.

Email

The network was generated using email data from a large European research institution [14]. We have anonymized information about all incoming and outgoing email between members of the research institution. There is an edge (u, v) in the network if person u sent person v at least one email. The emails only represent communication between institution members. The dataset also contains ground-truth community memberships of the nodes. Each individual belongs to exactly one of 42 departments at the research institute.
http://snap.stanford.edu/data/#email

Task

Node classification
predict community structure on Reddit
https://arxiv.org/pdf/1902.07153.pdf

Baseline

spectral-based GNNs (2)
ChebNet [11], GCN [19], SGC [44] , APPNP [21] and InfoMax [41]
spatial-based GNNs (2)
GraphSage [16] and GAT [40]
Digraph GNNs
DGCN [39]
Graph Inception
SIGN [32]