GQL Syntax
GQL has a syntax similar to SQL: a GQL query is built as a sequence of statements, such as the SELECT statement. Unlike SQL, GQL provides the MATCH statement, which expresses graph patterns using an ASCII-art notation.
SELECT x.name, y (1)
FROM "my-graph" (2)
MATCH (x:Person) -[:lives]-> (y:City) (3)
| 1 | Returns the data bound to the selected variables. |
| 2 | The name of the graph data model to query (see Graph Data Model). |
| 3 | A path pattern to match on the graph. |
Element Patterns
A graph pattern is made of node and edge patterns. The element pattern is the building block for expressing nodes or edges. The element pattern is written as a tuple of 3 clauses, all optional:
- A variable name that can then be referenced.
- A label from the data model, which can be either that of an entity or of a relation.
- A filtering clause to constrain the set of nodes/edges to match.
For example, below is an element pattern where the variable is x, the label is Person, and the filtering clause, introduced by the WHERE keyword, is field:value OR field2:value.
x:Person WHERE "field:value OR field2:value"
Node Pattern
A node pattern is expressed with parentheses wrapping an element pattern.
(x:Person) (1)
| 1 | Showing only the MATCH statement part of a GQL query, this is a single node pattern. |
A node pattern can also be empty, i.e., without any clause, and is written as (). This can be useful to express complex graph patterns.
When two node patterns directly follow each other, they are merged and bound to the same node.
In the query below, the two node patterns are merged, thus matching a node from the data model that has both the Student and Professor labels. Filters, if provided, are combined with AND. The SELECT statement would return the same data twice since both variables are selected.
SELECT x, y
FROM "my-graph"
MATCH (x:Student) (y:Professor)
Edge Pattern
An edge pattern is expressed as a sequence of three element patterns:
- The 1st is the source of the edge, with the element pattern wrapped in parentheses.
- The 2nd is the edge itself, with the element pattern wrapped in square brackets [].
- The 3rd is the destination of the edge, with the element pattern wrapped in parentheses.
The direction of the edge is represented with an angle bracket: -> depicts an edge going from the source to the destination, while <- depicts an edge in the opposite direction.
(x:Person) -[:lives]-> (:City) (1)
| 1 | Showing only the MATCH statement part of a GQL query, this is a single edge pattern with Person as source and City as destination, and the edge is labelled lives. This edge is directed to the right, as depicted with ->. |
Similar to the node pattern, clauses of the edge element pattern can be omitted, e.g., to express an edge whose label is not known.
(:City) <- (:Person) (1)
| 1 | This represents an edge from a Person to a City with the label unknown. |
|
Applying a filter on an edge is not yet supported, as we do not currently support edges with properties. See Relations as Entities as the current mechanism for addressing this need.
|
Path pattern
A path pattern consists of a succession of node and edge patterns. The example below shows a path connecting a Person to a Country via an intermediate City node.
SELECT x, y, z
FROM "my-graph"
MATCH (x:Person) -[:lives]-> (y:City) <-[:has_capital]- (z:Country) (1)
| 1 | A path pattern made of two edges. |
The set of path patterns can be further restricted by using a selective path search.
Conjunction of path patterns
A MATCH statement may contain one or more path patterns, separated by commas ,. The path patterns must connect to one another via variables.
SELECT x
FROM "my-graph"
MATCH (:Forum) -> (x:Person) -> (:Tag), (1)
(x) -> (:Company)
| 1 | Two path patterns separated by a comma. |
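The requirement that path patterns connect to one another via variables can be illustrated with a small check. The sketch below is only a model of the rule, not the engine's actual validation: the function name `patterns_are_connected` is hypothetical, and the variable names of each path pattern are assumed to have been collected by hand.

```python
from typing import List, Set


def patterns_are_connected(pattern_vars: List[Set[str]]) -> bool:
    """Return True if all path patterns are linked through shared variables.

    Each entry of `pattern_vars` holds the variable names used by one path
    pattern of a MATCH statement (hypothetical input, extracted by hand).
    """
    if not pattern_vars:
        return True
    # Start from the first pattern and keep absorbing every pattern
    # that shares at least one variable with what has been reached so far.
    reached = set(pattern_vars[0])
    remaining = list(pattern_vars[1:])
    progress = True
    while remaining and progress:
        progress = False
        for variables in list(remaining):
            if reached & variables:
                reached |= variables
                remaining.remove(variables)
                progress = True
    # Any pattern left over is disconnected from the first one.
    return not remaining


# MATCH (:Forum) -> (x:Person) -> (:Tag), (x) -> (:Company)
print(patterns_are_connected([{"x"}, {"x"}]))  # True
# Two path patterns that share no variable
print(patterns_are_connected([{"x"}, {"y"}]))  # False
```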
|
The set of path patterns must form a connected graph, meaning that no path pattern can be disconnected from the others. For example, the query below is not supported because the two path patterns do not connect to each other. Even if both match a
|
Parenthesized Expression
A parenthesized expression is used within a path pattern, and represents a group of one or more node and edge patterns.
SELECT x, y
FROM "my-graph"
MATCH ( (x:Person) -[:lives]-> (y:City) ) (1)
| 1 | A parenthesized expression with a single edge pattern. |
Quantified Pattern
A quantified pattern expresses the number of times that pattern should be repeated when evaluating matches on the graph. This makes pattern matching dynamic, since the shape of the matching patterns can vary. A quantifier represents an interval between two numbers and is written as {n,m}, where n is the lower bound of the interval and m the upper bound. There are several shorthands:
- {n} is equivalent to {n,n}.
- * is equivalent to {0,m}, with m being a cluster-defined maximum value.
- + is equivalent to {1,m}, with m being a cluster-defined maximum value.
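The shorthands above can be read as a normalization to an explicit interval. The following Python sketch illustrates that reading; `normalize_quantifier` and its `cluster_max` parameter are hypothetical names standing in for the cluster-defined maximum, and the parsing is an assumption rather than the actual GQL grammar.

```python
import re
from typing import Tuple


def normalize_quantifier(quantifier: str, cluster_max: int) -> Tuple[int, int]:
    """Turn a quantifier into an explicit (lower, upper) interval.

    `cluster_max` is a stand-in for the cluster-defined maximum value.
    """
    q = quantifier.strip()
    if q == "*":
        return (0, cluster_max)
    if q == "+":
        return (1, cluster_max)
    # Accept {n} and {n,m}, with optional spaces around the numbers.
    match = re.fullmatch(r"\{\s*(\d+)\s*(?:,\s*(\d+)\s*)?\}", q)
    if match is None:
        raise ValueError(f"unsupported quantifier: {quantifier!r}")
    lower = int(match.group(1))
    # {n} is the same as {n,n}.
    upper = int(match.group(2)) if match.group(2) is not None else lower
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return (lower, upper)


print(normalize_quantifier("{2}", cluster_max=10))    # (2, 2)
print(normalize_quantifier("{1,2}", cluster_max=10))  # (1, 2)
print(normalize_quantifier("*", cluster_max=10))      # (0, 10)
print(normalize_quantifier("+", cluster_max=10))      # (1, 10)
```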
|
The dynamic cluster setting |
Quantified edge pattern
A quantifier can be applied to an edge pattern. This can be useful to represent loops or paths of unknown length.
(:Person) ->{1,2} (:Company) (1)
// Possible expansions of the quantified edge pattern
(:Person) -> (:Company) (2)
(:Person) -> () -> (:Company) (3)
| 1 | Showing only the MATCH statement part of a GQL query, the unlabeled edge from Person to Company should appear at least once and at most twice in the matching pattern. |
| 2 | The path of length 1 connecting a Person directly to a Company via an unlabeled edge. |
| 3 | The path of length 2 connecting a Person to a Company via an intermediate unlabeled node. The edges are both unlabeled. |
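The expansions listed above follow a regular shape: a path of length k contains k unlabeled edges and k - 1 empty intermediate node patterns. The sketch below generates those expansions as text, purely as an illustration; `expand_quantified_edge` is a hypothetical helper, and it only handles lower bounds of at least 1 because the zero-repetition case is not described here.

```python
from typing import List


def expand_quantified_edge(source: str, target: str, lower: int, upper: int) -> List[str]:
    """List the concrete patterns covered by `source ->{lower,upper} target`."""
    if lower < 1 or upper < lower:
        raise ValueError("this sketch expects 1 <= lower <= upper")
    expansions = []
    for length in range(lower, upper + 1):
        # A path of `length` edges goes through `length - 1` empty node patterns.
        hops = ["()"] * (length - 1) + [target]
        expansions.append(" -> ".join([source] + hops))
    return expansions


for pattern in expand_quantified_edge("(:Person)", "(:Company)", 1, 2):
    print(pattern)
# (:Person) -> (:Company)
# (:Person) -> () -> (:Company)
```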
Quantified parenthesized expression
It is possible to apply a quantifier to a parenthesized expression. This makes it possible to express complex paths that repeat.
For example, the quantifier {1,2} is applied to a parenthesized expression in the query below. That query matches a single path pattern and returns persons connected via one or two companies.
SELECT x, y
FROM "my-graph"
MATCH (x:Person) -[:knows]->
( (:Person) -> (:Company) -> (:Person) ){1,2} (1)
-[:knows]-> (y:Person)
| 1 | The sequence of patterns (:Person) -> (:Company) -> (:Person) occurs at least once, and at most twice. |
WHERE Clause in Element Patterns
An element pattern has an optional WHERE clause that constrains the set of nodes or edges matching a particular pattern. The filter can be expressed using two different syntaxes: one based on the Lucene query syntax, the other on the Elasticsearch JSON DSL syntax.
Lucene query syntax
The WHERE clause is a string written using the Lucene query syntax.
The text is evaluated as an Elasticsearch query_string query.
The queried index is the one pointed to by the label of the element pattern, as defined in the graph data model.
(:Person WHERE "first_name:alice") (1)
(:Person WHERE "last_name:smith AND (first_name:alice OR first_name:bob)") (2)
| 1 | Showing only the MATCH statement part of a GQL query, this restricts Person nodes to those whose first_name field in the Elasticsearch document contains the term alice. |
| 2 | The Lucene query syntax can express complex conditions, e.g., a boolean combination of terms. |
Elasticsearch JSON DSL syntax
The ES keyword, placed after WHERE, makes it possible to use the Elasticsearch JSON query DSL as a filter.
This is useful when there is a need for querying capabilities that cannot be expressed using the Lucene query syntax.
(:Person WHERE ES {"query":{"term":{"first_name":{"value":"alice"}}}}) (1)
| 1 | Showing only the MATCH statement part of a GQL query, a filtering clause written with the JSON query DSL of Elasticsearch. |
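When the JSON filter is assembled programmatically, serializing it with a JSON library avoids quoting mistakes. The sketch below only builds the text of a node pattern on the client side; `es_filter_pattern` is a hypothetical helper and is not part of GQL or of any client library.

```python
import json


def es_filter_pattern(label: str, es_query: dict) -> str:
    """Build the text of a node pattern filtered with the ES keyword."""
    # Compact separators keep the JSON on one line without extra spaces.
    body = json.dumps(es_query, separators=(",", ":"))
    return f"(:{label} WHERE ES {body})"


query = {"query": {"term": {"first_name": {"value": "alice"}}}}
print(es_filter_pattern("Person", query))
# (:Person WHERE ES {"query":{"term":{"first_name":{"value":"alice"}}}})
```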
SELECT Statement and Variable Binding
It is possible to bind a variable to different patterns of the query, and then retrieve the data associated with the bound pattern using the SELECT statement.
Select a variable bound to a node
The variable bound to a node pattern can be selected. A node from the graph matching that pattern is represented as a list with information detailing the Elasticsearch document it is from.
SELECT y (1)
FROM "my-graph"
MATCH (x:Person) -[:lives]-> (y:City)
| 1 | This selects data from the pattern bound to y. |
When the variable is bound to a node, selecting it returns a list describing the Elasticsearch document that is represented by that node. For example, the list can be ["person", "123"], where the elements mean the following:
- The 1st element, person, is the name of the index that contains the document represented by y.
- The 2nd element, 123, is the _id field of the document represented by y.
Select a node’s property via the bound variable
It is also possible to retrieve the property of a node, which is taken from a field of the Elasticsearch document that is represented by that node.
SELECT x.name (1)
FROM "my-graph"
MATCH (x:Person) -[:lives]-> (y:City)
| 1 | This selects the field name from the node pattern bound to x, labelled Person. |
Select a variable bound to an edge
The variable bound to an edge pattern can be selected. We return the label of the edge in that case. This is useful when the label of an edge in a pattern is unknown.
SELECT u (1)
FROM "my-graph"
MATCH (:Person) -[u]-> (:City)
| 1 | This returns isLocatedIn since u is bound to an edge whose only possible label, based on the graph data model, is isLocatedIn. |
Select the variable bound to a path pattern
A path pattern can also be bound to a variable. This is useful when the components of a path are needed in the result. The variable is then bound to all node and edge patterns declared in the path pattern.
SELECT p (2)
FROM "my-graph"
MATCH p = (:Person) -[:lives]-> (:City) <-[:has_capital]- (:Country) (1)
| 1 | The variable p is bound to the path pattern (:Person) -[:lives]-> (:City) <-[:has_capital]- (:Country). |
| 2 | The variable p is selected. |
The result of selecting p is a list containing the textual representation of the nodes and edges making up the path that matched the pattern.
[
[ "person", "123" ],
[ "lives" ],
[ "city", "Berlin" ],
[ "has_capital" ],
[ "country", "Germany" ]
]
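A client consuming this result may want to tell nodes apart from edges. The sketch below assumes the result has already been parsed from JSON into Python lists, and it infers the element kind from the list length seen in the example above (two elements for a node, one for an edge); `describe_path` is a hypothetical helper.

```python
from typing import Any, Dict, List


def describe_path(path: List[List[str]]) -> List[Dict[str, Any]]:
    """Label each element of a selected path as a node or an edge."""
    described = []
    for element in path:
        if len(element) == 2:
            # Nodes are [index name, document _id].
            index, doc_id = element
            described.append({"kind": "node", "index": index, "id": doc_id})
        elif len(element) == 1:
            # Edges carry only their label.
            described.append({"kind": "edge", "label": element[0]})
        else:
            raise ValueError(f"unexpected path element: {element!r}")
    return described


path = [
    ["person", "123"],
    ["lives"],
    ["city", "Berlin"],
    ["has_capital"],
    ["country", "Germany"],
]
for item in describe_path(path):
    print(item)
```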