Many interesting real-world graphs, encountered in modelling social, transportation, financial transactions, or academic citation networks, are *directed*. The direction of the edges often conveys crucial insights, otherwise lost if one considers only the connectivity pattern of the graph.

In contrast, most Graph Neural Networks (GNNs) that have made remarkable strides in a variety of graph ML applications, operate under the assumption that the input graph is *undirected*. Making the input graph undirected has become so prevalent over the years that PyTorch-Geometric, one of the most popular GNN libraries, includes a general utility function that automatically makes graphs undirected when loading datasets [2].

This inclination towards undirected graphs comes from two “primordial sins” of GNNs. First, undirected graphs have symmetric Laplacians with orthogonal eigenvectors offering a natural generalisation of the Fourier transform, on which early spectral GNNs relied to function properly. Second, early datasets used to benchmark GNNs were predominantly *homophilic* graphs [3], such as Cora and Pubmed [4]. On such datasets disregarding the direction by converting the directed graph into an undirected one appears to be advantageous, early evidence whereof has helped cement the “undirected” paradigm.

We challenge this *status quo* in our recent paper [5], showing that directionality can bring extensive gains in heterophilic settings.

The homophily of a graph is typically measured as the fraction of neighbours with the same label as the node itself, averaged across all nodes (*node homophily*). For directed graphs, we propose the *weighted node homophily*:

h(S) = 1/*n* Σ*ᵤ* ( Σ*ᵥ* *sᵤᵥ* * **I**[*yᵤ* = *yᵥ*] ) / Σ*ᵥ* *sᵤᵥ*

where **I** denotes the indicator function, *n* is the number of nodes, and **S** is a general adjacency matrix, which can be picked up as 𝐀** **or** **𝐀ᵀ, or as higher-order matrices, such as 𝐀𝐀ᵀ** **or 𝐀² for directed graphs, or as the symmetric matrix 𝐀ᵤ** **= (𝐀** **+ 𝐀ᵀ) / 2 and its higher-order counterpart 𝐀ᵤ², if the graph is considered as undirected.

Even when 1-hop neighbours are heterophilic [6], the situation may change when going to farther nodes. Compared to the undirected case, there are four distinct 2-hops in directed graphs represented by the matrices 𝐀²,** **(𝐀ᵀ)², 𝐀𝐀ᵀ, and** **𝐀ᵀ𝐀, which can manifest different levels of (weighted) homophily.

Given that GNNs operate through multiple-hop aggregations, they can leverage the homophily of any 2-hop (or even further hops) of a graph. To have a comprehensive metric capturing the maximum homophily a GNN can leverage in principle, we introduce the notion of *effective homophily*, defined as the maximum weighted node homophily at any hop of the graph.

Empirically, we observe that the effective homophily of directed homophilic datasets is left unchanged when making the graph undirected. In heterophilic graphs, in contrast, this conversion decreases the effective homophily by almost 30% on average.

In particular, we observe that 𝐀𝐀ᵀ and** **𝐀ᵀ𝐀** **consistently appear to be the “most homophilic matrices” for heterophilic graphs.

To provide an intuition about why this is the case, imagine we are trying to predict the publication year of a specific academic paper, for instance, the Kipf & Welling 2016 GCN paper, given the directed citation network and the year of publication of the other papers. Consider two different kinds of 2-hop relationships: one where we look at papers cited by the papers that our paper of interest *v* cites (represented by the *v*th row of the matrix 𝐀²), and another where we look at papers that cite the same sources as our paper (represented by (𝐀𝐀ᵀ)*ᵥ*).

In the first case (𝐀²), let us start from the GCN paper and follow its citations twice. We land on a paper by Frasconi *et al.* from 1998. This older paper does not give us much helpful information about when our GCN paper was published because it is too far in the past.

In the second case (𝐀𝐀ᵀ), we begin with the GCN paper, follow a citation, and then come back to a paper that cites the same source, like the 2017 GAT paper. This paper is much closer to our GCN paper’s publication year and thus provides a better clue. More generally, nodes that share more references, like in our second example, will have higher scores in 𝐀𝐀ᵀ, and thus contribute more to our final prediction.

Now, consider an undirected 2-hop relationship (𝐀ᵤ²), which is just the average of the four possible 2-hop matrices. This includes our first type (like Frasconi et al.), which was not very helpful. Therefore, the highly useful 𝐀𝐀ᵀ matrix gets diluted by less informative matrices, like 𝐀², leading to a less homophilic operator, resulting in a less reliable predictor overall.

While we have used a citation network in our example, this intuition applies more broadly. In a social network, for instance, an influencer’s characteristics are more likely to resemble those of users who share many followers with them, represented by 𝐀ᵀ𝐀. Similarly, in a transaction network, two accounts sending money to the same set of accounts (captured by 𝐀𝐀ᵀ), are likely to exhibit similar behaviour.

In order to leverage directionality effectively, we propose the *Directed Graph Neural Network* (Dir-GNN) framework, which extends MPNNs to directed graphs by performing separate aggregations over the in- and out-neighbours of a node:

**m***ᵤ*⁽ᵏ⁾ᵢₙ = AGGᵢₙ({{**x***ᵥ*⁽ᵏ⁻¹⁾, **x***ᵤ*⁽ᵏ⁻¹⁾) : (*u*,*v*) ∈ E }})

**m***ᵤ*⁽ᵏ⁾ₒᵤₜ = AGGₒᵤₜ({{**x***ᵥ*⁽ᵏ⁻¹⁾, **x***ᵤ*⁽ᵏ⁻¹⁾) : (*u*,*v*) ∈ E }})

**x***ᵤ*⁽ᵏ⁾ = COM(**x***ᵤ*⁽ᵏ⁻¹⁾, **m***ᵤ*⁽ᵏ⁾ᵢₙ, **m***ᵤ*⁽ᵏ⁾ₒᵤₜ)

where the aggregation maps AGGᵢₙ and AGGₒᵤₜ, as well as the combination maps COM are learnable (usually a small neural network). Importantly, AGGᵢₙ and AGGₒᵤₜ can have independent sets of parameters to allow for different aggregations over in- and out-edges.

Interestingly, this procedural pattern resembles the one implemented by a natural extension of the classical Weisefiler-Lehman graph isomorphism test (1-WL) to directed graphs [7]. This connection is instrumental: in terms of discriminative power, we show that Dir-GNN is *strictly more powerful *than standard MPNNs, which either convert the graph to undirected or propagate messages only in the direction of the edges.

Our framework is also flexible: it is easy to define directed counterparts of specific architectures such as GCN, GraphSAGE or GAT. For example, we can define Dir-GCN as:

𝐗⁽ᵏ⁾ = σ(𝐒ₒᵤₜ𝐗⁽ᵏ⁻¹⁾𝐖ₒᵤₜ⁽ᵏ⁾ + (𝐒ₒᵤₜ)ᵀ𝐗⁽ᵏ⁻¹⁾𝐖ᵢₙ⁽ᵏ⁾)

where 𝐒ₒᵤₜ= **D**ₒᵤₜ⁻¹ᐟ² 𝐀 **D**ᵢₙ⁻¹ᐟ² and **D**ᵢₙ and **D**ₒᵤₜ represent the diagonal in- and out-degree matrices, respectively.

We also show that Dir-GNN, when iteratively applied over multiple layers, leads to more homophilic aggregations. Unlike other models, Dir-GNN can access the four 2-hop matrices 𝐀²,** **(𝐀ᵀ)², 𝐀𝐀ᵀ and** **𝐀ᵀ𝐀 and learn to weigh them differently. In contrast, a model operating on the undirected graph has only access to 𝐀ᵤ², while models propagating information exclusively along in- or out-edges are limited to (𝐀ᵀ)² and 𝐀² respectively.

Dir-GNN, thanks to its separate aggregation of the two directions, is therefore the only model operating on 𝐀𝐀ᵀ and** **𝐀ᵀ𝐀, which we have shown to be the most homophilic matrices and therefore the most reliable predictor.

We first compared GraphSAGE and its directed extension (Dir-SAGE) on a synthetic task requiring directionality information. The results confirm that only Dir-SAGE(in+out), with access to both in- and out-edges, is able to almost perfectly solve the task. The model acting on the undirected version of the graph performs no better than chance, while the models acting on only in- or out-edges perform similarly obtaining around 75% accuracy.

We further validated our approach with an ablation study comparing GCN, GraphSAGE and GAT base models with their Dir- extensions. On heterophilic datasets, using directionality brings exceptionally large gains (10% to 20% absolute) in accuracy across all three base GNN models. Moreover, Dir-GNN beats state-of-the-art models designed especially for heterophilic graphs.

These results suggest that, when present, using the edge direction can significantly improve learning on heterophilic graphs. In contrast, discarding it is so harmful that not even complex architectures can make up for this loss of information.

On the other hand, on homophilic datasets using directionality leaves the performance unchanged (or even slightly hurts). This is in line with our findings that using directionality as in our framework generally increases the effective homophily of heterophilic datasets, while leaving it almost unchanged for homophilic datasets.