Awesome Directed Graph

Data

dataset            nodes   edges    classes  features  notes
dblp               17716   105734   4        0         (word vectors can be built from the raw text yourself)
blogs (polblogs)   1490    19090    2        0         (approximate counts)
Email (email-Eu)   1005    25571    42       0
cora (linq)        2708    5429     7        1433
cora-cite          23166   91500    10       0
cora-full          19793   65311    70       8710
citeseer           3312    4732     6        3703
pubmed             19717   44338    3        500
webkb              877     1608     5        1703

Spectral graph theory for directed graphs

  1. Signal Processing on Directed Graphs
  2. Analysis of Spectral Space Properties of Directed Graphs using Matrix Perturbation Theory with Application in Graph Partition ICDM 15 (citation 13)
  3. On Spectral Analysis of Directed Graphs with Graph Perturbation Theory
  4. Directed Spectral Graph Theory: Sparsification and Asymmetric Linear System Solver
  5. Simple eigenvalues of graphs and digraphs
  6. Linear Algebraic Primitives Beyond Laplacian Solvers
  7. Laplacians and the Cheeger Inequality for Directed Graphs (citation 224)
  8. Spectral analysis of non-Hermitian matrices and directed graphs
  9. Graph Fourier transform based on directed Laplacian (citation 10)
  10. On the Graph Fourier Transform for Directed Graphs (citation 62)
  11. A Directed Graph Fourier Transform With Spread Frequency Components

Digraph neural networks


Wavelets for directed graphs

  1. Harmonic analysis on directed graphs and applications: from Fourier analysis to wavelets
  2. The Construction of Spectral Graph Wavelet for Directed Graph using Its Laplacian (page=71)
  3. Graph Signal Processing for Directed Graphs Based on the Hermitian Laplacian (ECML-PKDD 2019)
  4. Applied and computational harmonic analysis on graphs and networks (citation 18)
  5. A unified framework for harmonic analysis of functions on directed graphs and changing data (citation 25)
  6. Multiresolution analysis of functions on directed networks
  7. Bipartite Graph Filter Banks: Polyphase Analysis and Generalization
  8. The extended generalized Haar-Walsh transform and applications

Other

  1. Learning from Labeled and Unlabeled Data on a Directed Graph (citation 459)
  2. Graph Signal Processing: Overview, Challenges and Applications (citation 427)
  3. A Geometric Framework for Transfer Learning Using Manifold Alignment (citation 18)

Refs from "Digraph Inception Convolutional Networks" (NeurIPS 2020)

Transforming digraphs into undirected graphs by relaxing their directional structure

  1. T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” arXiv preprint arXiv:1609.02907, 2016.

    We assign separate relation nodes r1 and r2 for each entity pair (e1,r,e2) as (e1,r1) and (e2,r2). Entity nodes are described by sparse feature vectors. We extend the number of features in NELL by assigning a unique one-hot representation for every relation node, effectively resulting in a 61,278-dim sparse feature vector per node. The semi-supervised task here considers the extreme case of only a single labeled example per class in the training set. We construct a binary, symmetric adjacency matrix from this graph by setting entries Aij = 1, if one or more edges are present between nodes i and j.

  2. F. Wu, T. Zhang, A. H. d. Souza Jr, C. Fifty, T. Yu, and K. Q. Weinberger, “Simplifying graph convolutional networks,” arXiv preprint arXiv:1902.07153, 2019. ICML

    A ∈ R n×n is a symmetric (typically sparse) adjacency matrix

  3. Y. Wang, W. Wang, Y. Liang, Y. Cai, J. Liu, and B. Hooi, “Nodeaug: Semi-supervised node classification with data augmentation,” in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 207–217.

    Each of them contains an unweighted adjacency matrix and bag-of-words features.

Looking for structural patterns and reformulating the graph.

These methods learn specific structure by defining motifs [28], inheritance relationships [18], and second-order proximity [39]. (They have to stipulate learning templates or rules in advance, and cannot deal with complex structures beyond their definitions.)

  1. F. Monti, K. Otness, and M. M. Bronstein, “Motifnet: a motif-based graph convolutional network for directed graphs,” in 2018 IEEE Data Science Workshop (DSW). IEEE, 2018, pp. 225–228.

    a symmetric motif adjacency matrix => motif Laplacian

  2. M. Kampffmeyer, Y. Chen, X. Liang, H. Wang, Y. Zhang, and E. P. Xing, “Rethinking knowledge graph propagation for zero-shot learning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 11487–11496.

    use two adjacency matrices: Aa ∈ R N×N denotes the connections from nodes to their ancestors, whereas Ad denotes the connections from nodes to their descendants.

  3. Z. Tong, Y. Liang, C. Sun, D. S. Rosenblum, and A. Lim, “Directed graph convolutional network,” arXiv preprint arXiv:2004.13970, 2020.

    First-order Proximity Matrix, Second-order Proximity Matrix (adjacency matrix)
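
As a minimal sketch of the symmetric motif adjacency matrix idea (ref. 1 above), using the simplest possible motif, a reciprocal edge pair; MotifNet itself uses richer triangle motifs, so this only illustrates the construction (`A`, `M`, `L_motif` are illustrative names):

```python
import numpy as np

# Toy digraph: 0<->1 reciprocal pair, plus a one-way edge 1->2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 0, 0]], dtype=float)
M = A * A.T                  # M_ij = 1 iff both (i,j) and (j,i) exist: symmetric
D = np.diag(M.sum(axis=1))   # motif degree matrix
L_motif = D - M              # combinatorial motif Laplacian
```

Since M is symmetric by construction, the usual undirected spectral machinery (eigendecomposition, Chebyshev filters) applies to L_motif directly.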

These methods apply only to strongly connected digraphs, according to their definitions

  1. Y. Ma, J. Hao, Y. Yang, H. Li, J. Jin, and G. Chen, “Spectral-based graph convolutional network for directed graphs,” arXiv preprint arXiv:1907.08990, 2019.

    1/2(P+P^T)
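
The symmetrization above can be sketched in a few lines (toy example; strong connectivity guarantees every node has positive out-degree, so the row normalization is well defined):

```python
import numpy as np

# Toy strongly connected digraph: 0 -> 1 -> 2 -> 0, plus 2 -> 1.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)  # row-stochastic random-walk matrix
P_sym = 0.5 * (P + P.T)               # symmetric surrogate 1/2 (P + P^T)
```

The symmetric P_sym can then be plugged into standard spectral GCN filters.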

ICLR21 DIRECTED ACYCLIC GRAPH NEURAL NETWORKS

Digraph Fourier transform

Spectral methods

The GSO S ∈ R^{N×N} is a matrix whose entry S_{ji} can be non-zero only if i = j or if (i, j) ∈ E.

consider a generic asymmetric GSO S, for instance the adjacency matrix A or one of the several generalized Laplacians for digraphs

  1. H. Sevi, G. Rilling, and P. Borgnat, “Harmonic analysis on directed graphs and applications: from Fourier analysis to wavelets”

  2. F. Chung, “Laplacians and the Cheeger inequality for directed graphs”

(transition matrix P of an ergodic random walk X)
Normalized directed graph Laplacian
Random walk Laplacian
Combinatorial Laplacian
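
A hedged numpy sketch of the first of these, Chung's normalized directed Laplacian L = I − (Φ^{1/2} P Φ^{−1/2} + Φ^{−1/2} P^T Φ^{1/2})/2, where Φ = diag(φ) and φ is the stationary distribution of the ergodic walk P (the toy graph and iteration count are illustrative):

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)   # toy ergodic digraph
P = A / A.sum(axis=1, keepdims=True)     # transition matrix of the walk
phi = np.ones(len(P)) / len(P)           # stationary distribution via
for _ in range(200):                     # power iteration (assumes ergodicity)
    phi = phi @ P
Phi_h = np.diag(np.sqrt(phi))
Phi_ih = np.diag(1.0 / np.sqrt(phi))
L = np.eye(len(P)) - 0.5 * (Phi_h @ P @ Phi_ih + Phi_ih @ P.T @ Phi_h)
```

By construction L is symmetric even though P is not, which is what makes Chung's definition usable with undirected spectral tools; φ^{1/2} lies in its null space.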

GSO

  • diagonalizable

    S = V diag(λ) V^{-1}, with V := [v_1, ..., v_N] denoting the (non-orthogonal) eigenvectors of S and λ := [λ_1, ..., λ_N]^T its possibly complex-valued eigenvalues.

    GFT: \hat{x} = V^{−1}x

    1. A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs: Frequency analysis”

      In the DSPG framework, a graph Fourier basis corresponds to the Jordan basis of the graph adjacency matrix A

  • non-diagonalizable

    • Jordan decomposition of S and use its generalized eigenvectors as the GFT basis

      1. J. A. Deri and J. M. F. Moura, “Spectral projector-based graph Fourier transforms,”

        see also [11] for a careful treatment of the non-diagonalizable case which relies on oblique spectral projectors to define the GFT

    • GFT approaches rely on projections onto the (non-orthogonal) eigenvectors of a judicious random walk operator on the digraph [9], [10]
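
For the diagonalizable case above, the GFT \hat{x} = V^{−1}x can be sketched directly (toy non-normal digraph; eigenpairs are complex and V is non-orthogonal, illustrative only):

```python
import numpy as np

S = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)  # asymmetric GSO with distinct eigenvalues
lam, V = np.linalg.eig(S)               # S = V diag(lam) V^{-1}, lam complex
x = np.array([1.0, 2.0, 3.0])           # graph signal
x_hat = np.linalg.solve(V, x)           # forward GFT: V^{-1} x
x_rec = (V @ x_hat).real                # inverse GFT recovers x
```

Because V is not unitary here, ||x_hat|| generally differs from ||x||, which is exactly the Parseval failure noted below.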

Property:

TV_1(x) := ||x - \hat{S}x||_1
  • frequency ordering: λi ≻ λj if TV_1(v_i) > TV_1(v_j)

  • \hat{S} := S/|λ_{max}|, where λ_{max} is the spectral radius of S

  • does not ensure that constant signals have zero variation

  • generalized eigenvectors of asymmetric GSOs need not be orthonormal, implying that Parseval’s identity will not hold and hence the signal power is not preserved across the vertex and dual domains

  • obtaining the Jordan decomposition -> expensive, numerically unstable

  • uniqueness of the representation (GSO has repeated eigenvalues)
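
The TV_1-based frequency ordering above, and the fact that a constant signal need not have zero variation, can be checked numerically (toy graph; variable names are illustrative):

```python
import numpy as np

S = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)
lam, V = np.linalg.eig(S)
S_hat = S / np.abs(lam).max()           # \hat{S} = S / |lambda_max|

def tv1(x):
    return np.abs(x - S_hat @ x).sum()  # TV_1(x) = ||x - \hat{S} x||_1

# Order eigenvectors from low to high frequency by their total variation:
order = np.argsort([tv1(V[:, i]) for i in range(len(lam))])
# Rows of S have different sums, so the constant signal has nonzero TV_1:
tv_const = tv1(np.ones(3))
```

This contrasts with undirected Laplacian-based GFTs, where the constant vector is an eigenvector with zero variation.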

Orthonormal transform learning

Data Set

Citation Networks

  • CORA-ML
  • CITESEER
  • Pubmed
    nodes are documents and edges are citation links.
    The datasets contain sparse bag-of-words feature vectors for each document and a list of citation links between documents. We treat the citation links as (undirected) edges and construct a binary, symmetric adjacency matrix A. Each document has a class label. For training, we only use 20 labels per class, but all feature vectors.
  • Cora-Full
  • DBLP
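
The preprocessing quoted above (treat citation links as undirected, build a binary symmetric adjacency matrix) amounts to a few lines; the toy link list here is illustrative:

```python
import numpy as np

links = [(0, 1), (1, 2), (0, 2), (2, 0)]  # toy citation list, one reciprocal pair
n = 3
A = np.zeros((n, n))
for i, j in links:
    A[i, j] = A[j, i] = 1.0  # A_ij = 1 if one or more edges exist between i and j
```

Note that reciprocal citations collapse to a single undirected edge, which is exactly the direction information these relaxation-based methods discard.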

Amazon Co-purchase Networks

  • AM-PHOTO
  • AM-COMPUTER
    where nodes represent goods and edges indicate that two goods are frequently purchased together.

Knowledge Graph

  • NELL
    (Carlson et al., 2010; Yang et al., 2016) is a bipartite graph dataset extracted from a knowledge graph with 55,864 relation nodes and 9,891 entity nodes.
    NELL is a dataset extracted from the knowledge graph introduced in (Carlson et al., 2010). A knowledge graph is a set of entities connected with directed, labeled edges (relations). Pre-processing follows the scheme of Yang et al. (2016) (relation nodes per entity pair, 61,278-dim sparse one-hot features, a single labeled example per class, binary symmetric adjacency), as quoted in full under ref. 1 above.

Co-author networks

  • Coauthor-CS (short as Co-CS)
  • Coauthor-Physics (short as Co-Phy)

Reddit

code

Blogs

A directed network of hyperlinks among a large set of U.S. political weblogs from before the 2004 election [2]. It includes blog political affiliation as metadata. Links between blogs were automatically extracted from a crawl of the front page of the blog. In addition, the authors drew on various sources (blog directories, and incoming and outgoing links and posts around the time of the 2004 presidential election) and classified 758 blogs as left-leaning and the remaining 732 as right-leaning.
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.653.5324&rep=rep1&type=pdf

Wikipedia

The hyperlink network of Wikipedia pages on editorial norms [9], in 2015. Nodes are Wikipedia entries, and two entries are linked by a directed edge if one hyperlinks to the other. Editorial norms cover content creation, interactions between users, and formal administrative structure among users and admins. Metadata includes page information such as creation date, number of edits, page views and so on. The number of norm categories is also given.

Email

The network was generated using email data from a large European research institution [14]. We have anonymized information about all incoming and outgoing email between members of the research institution. There is an edge (u, v) in the network if person u sent person v at least one email. The emails only represent communication between institution members. The dataset also contains ground-truth community memberships of the nodes. Each individual belongs to exactly one of 42 departments at the research institute.
http://snap.stanford.edu/data/#email

Task

Baseline

  • spectral-based GNNs (2)
    ChebNet [11], GCN [19], SGC [44] , APPNP [21] and InfoMax [41]
  • spatial-based GNNs (2)
    GraphSage [16] and GAT [40]
  • Digraph GNNs
    DGCN [39]
  • Graph Inception
    SIGN [32]