Pan Chuang AI sharing
Transferred from Xinzhiyuan
Editor: sleepy little salted fish
[Guide] Google uses graph neural networks (GNNs), a powerful tool, in settings such as spam detection, traffic estimation, and YouTube content labeling. On November 18, 2021, Google and DeepMind open-sourced the TensorFlow GNN library to support basic research in areas such as traffic prediction, rumor and fake news detection, disease transmission modeling, and physics simulation.
On November 18, Google and DeepMind released TensorFlow GNN (TF-GNN), a library for building graph neural networks in TensorFlow.
Google already uses early versions of this library in settings such as spam detection, traffic estimation, and YouTube content labeling.
Why use GNNs?
A graph is an abstract data structure that represents relationships between objects. It consists of nodes (also called vertices) and edges: nodes represent the objects, and edges represent the relationships between them.
Graphs are everywhere in the real world and in engineered systems.
For example, a set of objects, places, or people and the connections between them can usually be described as a graph.
More often than not, the data seen in machine learning problems is structured or relational, so it can also be described by graphs.
After decades of basic research, GNNs have led to advances in many fields, such as traffic prediction, rumor and fake news detection, disease transmission modeling, physics simulation, and understanding why molecules smell.
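As a minimal sketch of the node-and-edge description above, a small directed graph can be stored as a list of nodes plus a list of edges and then turned into an adjacency mapping. The names and the example data (`nodes`, `edges`, `build_adjacency`) are illustrative assumptions, not part of TF-GNN.

```python
# Hypothetical example data: people are nodes, "follows" relations are directed edges.
nodes = ["alice", "bob", "carol"]
edges = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]


def build_adjacency(nodes, edges):
    """Map every node to the list of nodes it points to."""
    adjacency = {node: [] for node in nodes}
    for source, target in edges:
        adjacency[source].append(target)
    return adjacency


adjacency = build_adjacency(nodes, edges)
print(adjacency["alice"])  # ['bob', 'carol']
```

The same two lists (objects and relationships) are all that is needed to describe web links, social connections, or chemical bonds; only the meaning of a node and an edge changes.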
Graphs can model the relationships among different kinds of data: web pages (left), social connections (middle), or molecules (right)
With GNNs, we can answer questions about various characteristics of a graph. At the graph level, for example, we can detect certain "shapes" in the graph, such as cycles, which may represent sub-molecules or close social relationships.
In node-level tasks, a GNN can classify the nodes of a graph and predict partitions and affinity within the graph, similar to image classification or segmentation.
In edge-level tasks, a GNN can discover connections between entities, for example by "pruning" edges in a graph to identify the state of objects in a scene.
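The difference between graph-level, node-level, and edge-level tasks can be illustrated with a small plain-Python sketch. It assumes node states have already been computed (as a GNN would do through message passing) and only shows how each task level reads them out. The data, the function names, and the specific readouts (argmax, dot product, mean) are assumptions chosen for illustration.

```python
# Hypothetical node states, standing in for the output of a trained GNN.
node_states = {
    "a": [1.0, 0.0],
    "b": [0.5, 0.5],
    "c": [0.0, 1.0],
}
edges = [("a", "b"), ("b", "c")]


def node_level(states):
    """Node-level task: one prediction per node (index of the largest component)."""
    return {n: max(range(len(s)), key=lambda i: s[i]) for n, s in states.items()}


def edge_level(states, edges):
    """Edge-level task: one score per edge (dot product of the endpoint states)."""
    return {(u, v): sum(x * y for x, y in zip(states[u], states[v])) for u, v in edges}


def graph_level(states):
    """Graph-level task: one vector for the whole graph (mean of all node states)."""
    dim = len(next(iter(states.values())))
    return [sum(s[i] for s in states.values()) / len(states) for i in range(dim)]


print(node_level(node_states))
print(edge_level(node_states, edges))
print(graph_level(node_states))
```

In a real model the readouts would be learned layers rather than fixed formulas, but the shape of the output (per node, per edge, or per graph) is what distinguishes the three task types.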
Structure of TF-GNN
TF-GNN provides building blocks for implementing GNN models in TensorFlow.
Beyond the modeling APIs, TF-GNN also provides extensive tooling for the difficult task of working with graph data: a Tensor-based graph data structure, a data processing pipeline, and some example models to help users get started quickly.
The components of TF-GNN that make up the workflow
The initial version of the TF-GNN library contains many utilities and functions, including:
- A high-level Keras-style API for creating GNN models that can easily be composed with other types of models. GNNs are often used in combination with ranking, deep retrieval (dual encoders), or other types of models (image, text, etc.).
- A GNN API for heterogeneous graphs. Many real-world graph problems involve different types of nodes and edges, so Google chose to provide a simple way to model them.
- A well-defined schema to declare the topology of a graph, and tools to validate it. The schema describes the shape of the training data and guides the other tools.
- A GraphTensor composite tensor type that holds graph data. It can be batched and comes with graph manipulation routines.
- A library of operations on the GraphTensor structure:
  - Various efficient broadcast and pooling operations on nodes and edges, and related tools.
  - A library of standard convolutions that ML engineers and researchers can easily extend.
  - A high-level API for product engineers to quickly build GNN models without worrying about the details.
- An encoding of graph-shaped training data on disk, together with a library for parsing that data into a data structure from which the model can extract its features.
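To make the ideas of a heterogeneous graph and a validated schema from the list above more concrete, here is a minimal plain-Python sketch. It deliberately does not use the TF-GNN schema format or the GraphTensor API; the dictionary layout and the names `schema`, `graph`, and `validate` are hypothetical and only illustrate the principle that a schema declares which node sets exist and which node sets each edge set connects.

```python
# Hypothetical schema: the node sets, and the (source, target) node sets of each edge set.
schema = {
    "node_sets": {"user", "movie", "genre"},
    "edge_sets": {
        "watched": ("user", "movie"),
        "has_genre": ("movie", "genre"),
    },
}

# Hypothetical graph data: node counts per node set, edges as (source_index, target_index).
graph = {
    "node_sets": {"user": 2, "movie": 3, "genre": 2},
    "edge_sets": {
        "watched": [(0, 0), (0, 2), (1, 1)],
        "has_genre": [(0, 0), (1, 1), (2, 1)],
    },
}


def validate(graph, schema):
    """Return a list of problems; an empty list means the graph fits the schema."""
    problems = []
    for name in graph["node_sets"]:
        if name not in schema["node_sets"]:
            problems.append(f"unknown node set: {name}")
    for name, pairs in graph["edge_sets"].items():
        if name not in schema["edge_sets"]:
            problems.append(f"unknown edge set: {name}")
            continue
        source_set, target_set = schema["edge_sets"][name]
        n_source = graph["node_sets"].get(source_set, 0)
        n_target = graph["node_sets"].get(target_set, 0)
        for s, t in pairs:
            # Every edge must point at existing nodes in the declared node sets.
            if not (0 <= s < n_source and 0 <= t < n_target):
                problems.append(f"{name}: edge ({s}, {t}) is out of range")
    return problems


print(validate(graph, schema))  # []
```

Checking data against a declared schema before training catches mismatched node types and dangling edges early, which is the role the text assigns to the schema and its validation tools.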
Usage example
For example, the TF-GNN Keras API can be used to build a model that recommends movies to users based on what they have watched and the genres they like.
ConvGNNBuilder specifies the edge and node configuration: it uses WeightedSumConvolution (defined below) for edges, and on each pass through the GNN the node states are updated through a Dense layer:
```python
import tensorflow as tf
import tensorflow_gnn as tfgnn

# Model hyper-parameters:
h_dims = {'user': 256, 'movie': 64, 'genre': 128}

# Model builder initialization:
gnn = tfgnn.keras.ConvGNNBuilder(
    lambda edge_set_name: WeightedSumConvolution(),
    lambda node_set_name: tfgnn.keras.layers.NextStateFromConcat(
        tf.keras.layers.Dense(h_dims[node_set_name]))
)

# Two rounds of message passing to target node sets:
model = tf.keras.models.Sequential([
    gnn.Convolve({'genre'}),  # sends messages from movie to genre
    gnn.Convolve({'user'}),  # sends messages from movie and genre to users
    tfgnn.keras.layers.Readout(node_set_name="user"),
    tf.keras.layers.Dense(1)
])
```
In some scenarios, a GNN can also use a more powerful custom model architecture.
For example, certain movies or genres can be given more weight in the recommendations.
A more advanced GNN can then be built with custom graph convolutions, in this case with weighted edges.
In the following code, the WeightedSumConvolution class pools edge values as a weighted sum over all edges:
```python
class WeightedSumConvolution(tf.keras.layers.Layer):
    """Weighted sum of source nodes states."""

    def call(self, graph: tfgnn.GraphTensor,
             edge_set_name: tfgnn.EdgeSetName) -> tfgnn.Field:
        messages = tfgnn.broadcast_node_to_edges(
            graph,
            edge_set_name,
            tfgnn.SOURCE,
            feature_name=tfgnn.DEFAULT_STATE_NAME)
        weights = graph.edge_sets[edge_set_name]['weight']
        weighted_messages = tf.expand_dims(weights, -1) * messages
        pooled_messages = tfgnn.pool_edges_to_node(
            graph,
            edge_set_name,
            tfgnn.TARGET,
            reduce_type='sum',
            feature_value=weighted_messages)
        return pooled_messages
```
Although the convolution is written with only source and target nodes in mind, TF-GNN makes sure it is applicable to heterogeneous graphs (with various types of nodes and edges) and works on them seamlessly.
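To make the data flow in WeightedSumConvolution easier to follow, the same three steps (broadcast source states to edges, scale by the edge weight, sum into the target nodes) can be sketched in plain Python without TensorFlow. This is an illustrative re-implementation under simplifying assumptions (Python lists instead of tensors, a single edge set), not the TF-GNN code path; the function name and the example data are hypothetical.

```python
def weighted_sum_convolution(source_states, edges, weights, num_targets):
    """Pool weighted source-node states onto target nodes.

    source_states: list of feature vectors, one per source node.
    edges: list of (source_index, target_index) pairs.
    weights: one scalar weight per edge, aligned with `edges`.
    num_targets: number of nodes in the target node set.
    """
    dim = len(source_states[0])
    pooled = [[0.0] * dim for _ in range(num_targets)]
    for (source, target), weight in zip(edges, weights):
        # Broadcast the source state to the edge and scale it by the edge weight.
        message = [weight * x for x in source_states[source]]
        # Pool the message into the target node by summation.
        for i, value in enumerate(message):
            pooled[target][i] += value
    return pooled


# Hypothetical data: three movies send messages to two genres.
movie_states = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
movie_to_genre = [(0, 0), (1, 1), (2, 1)]
edge_weights = [0.5, 1.0, 2.0]

genre_messages = weighted_sum_convolution(
    movie_states, movie_to_genre, edge_weights, num_targets=2)
print(genre_messages)  # [[0.5, 0.0], [2.0, 4.0]]
```

Because the function only needs one edge set and the states of that edge set's source nodes, it can be applied to each edge set of a heterogeneous graph in turn, which mirrors why a convolution written for a single source and target works across node and edge types.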
Installation instructions
This is currently the only way to install the tensorflow_gnn preview. Using a virtual environment is highly recommended.
- Clone tensorflow_gnn
$> git clone https://github.com/tensorflow/gnn.git tensorflow_gnn
- Install TensorFlow
TF-GNN relies on extension types (TF ExtensionTypes), a feature available as of TensorFlow 2.7.
$> pip install tensorflow
- Install Bazel
Bazel is required to build the TF-GNN source code.
- Install GraphViz
TF-GNN uses GraphViz as a visualization tool. Installation varies by operating system; for example, on Ubuntu:
$> sudo apt-get install graphviz graphviz-dev
- Install tensorflow_gnn
$> cd tensorflow_gnn && python3 -m pip install .
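Since the steps above depend on TensorFlow 2.7 or later, it can be useful to check the installed version before building. The following is a small sketch using only the Python standard library; the helper names are hypothetical, and it assumes TensorFlow was installed under the distribution name `tensorflow` (variants published under other names would not be detected).

```python
from importlib import metadata

MINIMUM = (2, 7)  # minimum TensorFlow version stated in the text


def major_minor(version):
    """Turn a version string such as '2.7.0' into the tuple (2, 7)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version, minimum=MINIMUM):
    """Compare numerically, so that '2.10.0' correctly counts as newer than '2.7.0'."""
    return major_minor(version) >= minimum


def installed_tensorflow_ok():
    """Return True or False for the installed TensorFlow, or None if it is not installed."""
    try:
        return meets_minimum(metadata.version("tensorflow"))
    except metadata.PackageNotFoundError:
        return None


print(installed_tensorflow_ok())
```

Comparing integer tuples rather than raw strings avoids the common mistake of treating "2.10" as older than "2.7".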
References:
https://blog.tensorflow.org/2021/11/introducing-tensorflow-gnn.html?m=1
https://github.com/tensorflow/gnn