Research/Blog

CellStrat > Research/Blog > Artificial Intelligence > Graph Neural Networks > Relational GCN for Heterogeneous Graphs

Relational GCN for Heterogeneous Graphs

June 23, 2020
Posted by: vsinghal
Category: Graph Neural Networks Publishing Retail

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #WhereLearningNeverStops

Recently, our AI Lab Researcher Gouthaman Asokan presented an amazing session on Relational Graphical Convolutional Networks (Relational GCN). These are particularly useful for mining heterogeneous groups.

Table of Contents :-

GCN
HOMOGENEOUS AND HETEROGENEOUS GRAPHS
KEY IDEAS OF RELATIONAL GCN
FINDING PAGERANK USING MESSAGE PASSING
BUILDING A RELATIONAL GCN FOR THE ACM DATASET

Intro to GCN :-

GCN (Graphical Convolutional Networks) exploits structural information of a dataset to improve the extraction of node representations. Here, the edges have no value.

*Image credit : https://www.semanticscholar.org/paper/Towards-Machine-Learning-Enabled-Automatic-Design-W%C3%85HLIN/6af7126fea32189b8754ef2f1e1b8477729cc19e*

We can have homogeneous and heteregeneous graphs.

Heterogeneous graphs are graphs that contain different types of nodes and edges. A homogeneous graph is just a special case of a heterograph with only one type of node and edge.

*Image credit : https://guides.co/g/the-network-effects-bible/121732*

*Image credit : https://medium.com/stellargraph/knowing-your-neighbours-machine-learning-on-graphs-9b7c3d0d5896*

Relational GCN vs GCN :-

Knowledge graphs have triplets in the form of subject, relation and the object. Thus, we need to consider the edges for the relations.

*Image credit : https://docs.dgl.ai/en/0.4.x/tutorials/basics/5_hetero.html*

Example:
Barack Obama – Subject
Occupation – Relation
Politician – Object

Application of Relational GCN – Recommender Systems :-

Users and the movies are the two types of nodes in the Recommender Systems. User-Movie interactions can be marked with edges with ratings.

Solving statistical Relational learning tasks :-

Entity classification is done by attaching a softmax classifier at the final embedding of an entity (node). Training is through loss of standard cross-entropy.

Link prediction is done by reconstructing an edge with an autoencoder architecture, using a parameterized score function. Training uses negative sampling.

*Image credit : https://www.analyticsvidhya.com/blog/2020/01/link-prediction-how-to-predict-your-future-connections-on-facebook/*

Key ideas of R-GCN :-

GCN-Hidden representation for each node i at (l+1)th layer:

Relational GCN-Hidden representation for each node i at (l+1)th layer:

Each relation r has a different weight matrix.

Weight Wr(l) is a linear combination of basis transformation Vb(l) with coefficients a(l)

Computing Update for single node :-

*Image credit : https://arxiv.org/pdf/1703.06103.pdf*

Activations (d-dimensional vectors) from neighboring nodes (dark blue) are gathered and then transformed for each relation type individually (for both in- and outgoing edges).

The resulting representation (green) is accumulated in a (normalized) sum and passed through an activation function (such as the ReLU).

Message Passing :-

In DGL, the message passing and feature transformations are user-defined functions (UDFs).

Message functions are expressed as Edge UDFs.

dgl.function.copy_src(src, out) – An edge UDF that computes the output using the source node feature data.

Reduce functions are Node UDFs.

dgl.function.sum(msg, out) – A node UDF that sums the messages in the node’s mailbox.

*Image credit : https://docs.dgl.ai/en/0.4.x/tutorials/basics/3_pagerank.html*

PageRank using Message Passing :-

PageRank algorithm: PageRank is an algorithm used by Google Search to rank web pages in their search engine results.

Below table shows each webpage and its connected links.

*Image credit : https://www.analyticsvidhya.com/blog/2018/11/introduction-text-summarization-textrank-python*

In each iteration of PageRank, every node (web page) first scatters its PageRank value uniformly to its downstream nodes. The new PageRank value of each node is computed by aggregating the received PageRank values from its neighbors, which is then adjusted by the damping factor.

N is the number of nodes in the graph; D(v) is the out-degree of a node v; and N(u) is the neighbor nodes.

Ontology – Graph format :-

Ontologies are semantic data models that define the types of things that exist in our domain and the properties that can be used to describe them.

Knowledge Graph :-

A knowledge graph is created when you apply an ontology (our data model) to a set of individual data points (our book, author, and publisher data).

Ontology+data = knowledge graph

Citation Network – ACM dataset :-

Association for Computing Machinery (ACM) publishes an ACM dataset that contains two million papers, their authors, publication venues, and the other papers that were cited. This information can be represented as a heterogeneous graph.

The tables below show the various ACM Nodes :-

The ACM Edges are depicted below :-

PyTorch implementation :-

Torch.nn.ModuleDict – Holds submodules in a dictionary

Torch.nn.Linear – Applies a linear transformation to the incoming data

Torch.nn.Parameter – A kind of Tensor that is to be considered a module parameter

Torch.nn.init.xavier_uniform_ – Fills the input Tensor with values according to the method described in “Understanding the difficulty of training deep feedforward neural networks – Glorot, X. & Bengio, Y. (2010)”, using a uniform distribution

Torch.nn.ParameterDict – Holds parameters in a dictionary