ArangoDB performance test

ArangoDB test

Test environment: x-node cluster, x CPUs, x GB memory

Startup

Single-node startup mode: systemctl start arangodb3

--database.directory = /var/lib/arangodb3

Cluster startup mode:

# Start Master
arangodb --starter.data-dir=/root/arangodb/db1 --server.storage-engine=rocksdb start

# Start other nodes
arangodb --starter.data-dir=/root/arangodb/db2 --server.storage-engine=rocksdb --starter.join XX.XX.XX.XXX start
arangodb --starter.data-dir=/root/arangodb/db3 --server.storage-engine=rocksdb --starter.join XX.XX.XX.XXX start

--database.directory = /root/arangodb/db1/dbserver8530/data

Get a random vertex
FOR v IN airports
    SORT RAND()
    LIMIT 1
    RETURN v

Test items

Data loading test
1. Load vertices
arangoimp --file /home/junhu/airports.csv --collection airports --create-collection true --type csv --server.endpoint "http+tcp://XX.XX.XX.XXX:8529"

2. Load edges
arangoimp --file /home/junhu/flights.csv --collection flights --create-collection true --type csv --create-collection-type edge --server.endpoint "http+tcp://XX.XX.XX.XXX:8529"
  1. Load time: run each arangoimp command under time
  2. Disk usage after loading: du -sb /var/lib/arangodb3
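
The two loading measurements can also be collected from a script. The following is a minimal Node.js sketch, not part of the original test: `dirSizeBytes` and `timeSeconds` are hypothetical helper names, and the size helper sums only regular file sizes, so its result can differ slightly from what a disk usage tool reports for the same directory.

```javascript
const fs = require("fs");
const path = require("path");

// Sum the apparent sizes of all regular files under `dir`, recursively.
// Symbolic links are not followed and directory entries are not counted.
function dirSizeBytes(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += dirSizeBytes(full);
    } else if (entry.isFile()) {
      total += fs.statSync(full).size;
    }
  }
  return total;
}

// Run a synchronous step and return its wall-clock duration in seconds.
function timeSeconds(step) {
  const start = process.hrtime.bigint();
  step();
  return Number(process.hrtime.bigint() - start) / 1e9;
}
```

Calling `dirSizeBytes` on the data directory before and after an import gives the growth caused by that import.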
K-degree query test

Starting from a given vertex, count the distinct vertices found at the end of paths of length K.

Randomly select 3 start vertices, run the K-degree query from each, and take the average.
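
The sampling and averaging step could be scripted roughly as follows. This is a sketch under assumptions: `sample` and `averageLatencyMs` are hypothetical helpers, `runQuery` stands for whatever synchronous call executes the query against the database, and the random source and clock are injectable so a run can be repeated.

```javascript
// Pick `count` distinct items uniformly at random (partial Fisher-Yates shuffle).
// `rng` defaults to Math.random and can be replaced for repeatable runs.
function sample(items, count, rng = Math.random) {
  const pool = items.slice();
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Time `runQuery(start)` for each start vertex and return the mean in milliseconds.
function averageLatencyMs(startVertices, runQuery, now = () => Date.now()) {
  if (startVertices.length === 0) throw new Error("need at least one start vertex");
  let totalMs = 0;
  for (const start of startVertices) {
    const t0 = now();
    runQuery(start);
    totalMs += now() - t0;
  }
  return totalMs / startVertices.length;
}
```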

Direction: inbound, outbound, any
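
To make the definition concrete, here is a small in-memory sketch of what a K-degree count computes for each direction. It is plain JavaScript over an edge list and does not talk to ArangoDB; `buildAdjacency` and `kDegreeCount` are hypothetical names, and the rule that a vertex may not repeat within one path is an assumption chosen to mirror the `uniqueVertices: 'path'` option.

```javascript
// Build the neighbor lookup for one traversal direction.
function buildAdjacency(edges, direction) {
  const adj = new Map();
  const link = (a, b) => {
    if (!adj.has(a)) adj.set(a, []);
    adj.get(a).push(b);
  };
  for (const [from, to] of edges) {
    if (direction === "outbound" || direction === "any") link(from, to);
    if (direction === "inbound" || direction === "any") link(to, from);
  }
  return adj;
}

// Count distinct vertices at the end of paths with exactly k edges from `start`.
// A vertex may not appear twice on the same path.
function kDegreeCount(edges, start, k, direction = "any") {
  const adj = buildAdjacency(edges, direction);
  const ends = new Set();
  const onPath = new Set([start]);
  const walk = (vertex, depth) => {
    if (depth === k) {
      ends.add(vertex);
      return;
    }
    for (const next of adj.get(vertex) || []) {
      if (onPath.has(next)) continue;
      onPath.add(next);
      walk(next, depth + 1);
      onPath.delete(next);
    }
  };
  walk(start, 0);
  return ends.size;
}

// Example data (made up for illustration).
const sampleFlights = [
  ["LAX", "SFO"], ["LAX", "JFK"], ["SFO", "SEA"],
  ["JFK", "SEA"], ["SEA", "LAX"], ["ORD", "LAX"],
];
console.log(kDegreeCount(sampleFlights, "LAX", 1, "outbound")); // 2
console.log(kDegreeCount(sampleFlights, "LAX", 2, "any"));      // 3
```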

1. One-degree query test
RETURN LENGTH(
    FOR v IN airports
        FILTER v._id == "airports/LAX"
        FOR i IN 1..1 ANY v flights
            OPTIONS {
                uniqueVertices: 'path',
                uniqueEdges: 'none'
            }
            RETURN DISTINCT i
)

2. Two-degree query test (the same query with depth 2..2)
3. Three-degree query test (the same query with depth 3..3)

Graph algorithm performance test

1. Create the graph
2. Load vertices
arangoimp --file /home/junhu/airports.csv --collection airports --type csv --server.endpoint "http+tcp://XX.XX.XX.XXX:8529"

3. Load edges
arangoimp --file /home/junhu/flights.csv --collection flights --type csv --create-collection-type edge --server.endpoint "http+tcp://XX.XX.XX.XXX:8529"

Connected components

In an undirected graph, if every pair of vertices v_i and v_j is connected by a path, the graph is called a connected graph.

A weakly connected component (WCC) is a maximal set of vertices that can all reach one another, together with the edges between them, when the direction of directed edges is ignored. A WCC query finds and labels every weakly connected component in the graph, which requires traversing every vertex and every edge.
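
As an illustration of the definition, the sketch below labels components of a small in-memory graph with union-find. It is not how the Pregel job is implemented; `weaklyConnectedComponents` is a hypothetical name, and every edge endpoint is assumed to appear in the vertex list.

```javascript
// Group vertices into weakly connected components using union-find.
// Edge direction is ignored: [a, b] and [b, a] join the same two vertices.
function weaklyConnectedComponents(vertices, edges) {
  const parent = new Map(vertices.map((v) => [v, v]));

  // Find the representative of v's set, compressing the path along the way.
  const find = (v) => {
    let root = v;
    while (parent.get(root) !== root) root = parent.get(root);
    while (parent.get(v) !== root) {
      const next = parent.get(v);
      parent.set(v, root);
      v = next;
    }
    return root;
  };

  // Merge the two sets joined by each edge.
  for (const [a, b] of edges) {
    const ra = find(a);
    const rb = find(b);
    if (ra !== rb) parent.set(ra, rb);
  }

  // Collect vertices by representative.
  const components = new Map();
  for (const v of vertices) {
    const root = find(v);
    if (!components.has(root)) components.set(root, []);
    components.get(root).push(v);
  }
  return [...components.values()];
}
```

For vertices A to E with edges A→B, C→B and D→E, this yields two components: A, B, C and D, E.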

arangosh

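var pregel = require('@arangodb/pregel');
// Start the job. With store: false the results are not written back to the collection.
var handle = pregel.start("connectedcomponents", "GraphName", {maxGSS: 10, resultField: "compend", shardKeyAttribute: "_key", store: false});
// Check the state and statistics of the job.
pregel.status(handle);

// Cancel the job and discard its results.
pregel.cancel(handle);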

PageRank

The core idea of PageRank is to judge how reliable a web page is by letting the whole Internet vote: the voting results determine the weight of each page.
PageRank is an iterative algorithm. Each iteration traverses every edge and computes a score for every vertex. After many iterations the scores converge to a steady state. Our test runs 10 iterations.
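
The iteration can be sketched in a few lines of plain JavaScript. This is an illustration on an in-memory edge list, not ArangoDB's implementation: the function name `pageRank`, the damping factor of 0.85, and the even redistribution of rank from vertices without outgoing edges are all assumptions.

```javascript
// Run a fixed number of PageRank iterations over a directed edge list.
// Every edge endpoint is assumed to appear in `vertices`.
function pageRank(vertices, edges, iterations = 10, damping = 0.85) {
  const n = vertices.length;
  const outDegree = new Map(vertices.map((v) => [v, 0]));
  for (const [from] of edges) outDegree.set(from, outDegree.get(from) + 1);

  let rank = new Map(vertices.map((v) => [v, 1 / n]));
  for (let i = 0; i < iterations; i++) {
    // Rank held by vertices without outgoing edges is spread over all vertices.
    let dangling = 0;
    for (const v of vertices) {
      if (outDegree.get(v) === 0) dangling += rank.get(v);
    }
    const base = (1 - damping) / n + (damping * dangling) / n;
    const next = new Map(vertices.map((v) => [v, base]));
    // Each edge passes an equal share of its source's rank to its target.
    for (const [from, to] of edges) {
      next.set(to, next.get(to) + (damping * rank.get(from)) / outDegree.get(from));
    }
    rank = next;
  }
  return rank;
}
```

With this redistribution rule the scores always sum to 1, which makes runs with different iteration counts easy to compare.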

arangosh

var pregel = require('@arangodb/pregel');
var handle = pregel.start("pagerank", "GraphName", {maxGSS:10, resultField: "rank",shardKeyAttribute:"_key",store:false,"gss":10})
pregel.status(handle)

pregel.cancel(handle)
Single-source shortest path

Given a source vertex, calculate the shortest-path distance from it to every other vertex.

var pregel = require("@arangodb/pregel");
pregel.start("sssp", "graphname", {maxGSS:10, source: "vertices/1337", _resultField: "distance"});
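
For reference, the quantity being computed can be sketched with a breadth-first search on an in-memory edge list. This is an illustration only: `shortestDistances` is a hypothetical name, every edge is assumed to have weight 1, and edges are followed in their stored direction.

```javascript
// Breadth-first search: hop distance from `source` to every reachable vertex.
function shortestDistances(edges, source) {
  const adj = new Map();
  for (const [from, to] of edges) {
    if (!adj.has(from)) adj.set(from, []);
    adj.get(from).push(to);
  }
  const distance = new Map([[source, 0]]);
  const queue = [source];
  for (let head = 0; head < queue.length; head++) {
    const vertex = queue[head];
    for (const next of adj.get(vertex) || []) {
      if (distance.has(next)) continue; // already reached by a path that is no longer
      distance.set(next, distance.get(vertex) + 1);
      queue.push(next);
    }
  }
  return distance; // unreachable vertices are absent from the map
}
```
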
Community Detection

A subgraph formed by a subset of closely connected nodes is called a community. Communities whose node sets do not intersect are called disjoint communities; those that do intersect are called overlapping communities. A network that contains communities is said to have community structure, which is a common feature of networks. Given a network, the process of finding its community structure is called community detection.

const pregel = require("@arangodb/pregel");
const handle = pregel.start("labelpropagation", "yourgraph", {maxGSS: 100, resultField: "community"});
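
The idea behind label propagation can be sketched on an in-memory graph: every vertex starts with its own label and repeatedly adopts the label most common among its neighbors. The version below is a simplified illustration, not ArangoDB's implementation. `labelPropagation` is a hypothetical name, vertex IDs are assumed to be strings, edges are treated as undirected, and ties are broken deterministically (keep the current label if it is tied for first, otherwise take the lexicographically largest), whereas practical implementations often break ties at random.

```javascript
// Asynchronous label propagation with deterministic tie-breaking.
// Returns a Map from vertex to community label.
function labelPropagation(vertices, edges, maxRounds = 100) {
  const neighbors = new Map(vertices.map((v) => [v, []]));
  for (const [a, b] of edges) {
    neighbors.get(a).push(b);
    neighbors.get(b).push(a);
  }
  const label = new Map(vertices.map((v) => [v, v]));

  for (let round = 0; round < maxRounds; round++) {
    let changed = false;
    for (const v of vertices) {
      // Count how often each label occurs among v's neighbors.
      const counts = new Map();
      for (const u of neighbors.get(v)) {
        const l = label.get(u);
        counts.set(l, (counts.get(l) || 0) + 1);
      }
      if (counts.size === 0) continue; // isolated vertex keeps its label
      const best = Math.max(...counts.values());
      if (counts.get(label.get(v)) === best) continue; // current label is tied for first
      const candidates = [...counts.keys()].filter((l) => counts.get(l) === best).sort();
      label.set(v, candidates[candidates.length - 1]);
      changed = true;
    }
    if (!changed) break; // converged
  }
  return label;
}
```

The result depends on the visiting order and the tie-breaking rule, so different runs of a randomized implementation may produce different communities on the same graph.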
Vertex Centrality

Centrality is a commonly used concept in graph and network analysis. It expresses how central a vertex is within the whole network. Depending on how it is measured, centrality is divided into degree centrality (further split by direction into in-degree centrality, out-degree centrality, and so on), closeness centrality, betweenness centrality, and others.

Closeness centrality (approximated by the effective closeness algorithm):

const pregel = require("@arangodb/pregel");
const handle = pregel.start("effectivecloseness", "yourgraph", {resultField: "closeness"});
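
For comparison, exact closeness on a small in-memory graph can be sketched with one breadth-first search per vertex. This is an illustration and not how the Pregel algorithm works internally: `closenessCentrality` is a hypothetical name, edges are treated as undirected with weight 1, and closeness is defined here as the number of reachable vertices divided by the sum of their distances, which is one of several common normalizations.

```javascript
// Exact closeness centrality via one BFS per vertex.
// Returns a Map from vertex to score; isolated vertices score 0.
function closenessCentrality(vertices, edges) {
  const neighbors = new Map(vertices.map((v) => [v, []]));
  for (const [a, b] of edges) {
    neighbors.get(a).push(b);
    neighbors.get(b).push(a);
  }
  const closeness = new Map();
  for (const source of vertices) {
    const distance = new Map([[source, 0]]);
    const queue = [source];
    let total = 0; // sum of distances to all reached vertices
    for (let head = 0; head < queue.length; head++) {
      const v = queue[head];
      for (const u of neighbors.get(v)) {
        if (distance.has(u)) continue;
        distance.set(u, distance.get(v) + 1);
        total += distance.get(u);
        queue.push(u);
      }
    }
    closeness.set(source, total === 0 ? 0 : (distance.size - 1) / total);
  }
  return closeness;
}
```

On the path a, b, c the middle vertex b scores 1 and the two end vertices score 2/3, which matches the intuition that b is the most central.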

Betweenness centrality (approximated by the LineRank algorithm):

const pregel = require("@arangodb/pregel");
const handle = pregel.start("linerank", "yourgraph", {resultField: "linerank"});

Keywords: Database

Added by zoidberg on Thu, 10 Feb 2022 15:36:02 +0200