Graph Data Model

With its graph API, Siren Federate lets you view existing Elasticsearch indices as property graphs.

A property graph is a data model where pairs of entities (i.e., vertices) are connected by directed relations (i.e., edges). Entities and relations are associated with a label and can have properties.

Edges with properties are not currently supported. If an edge needs to be associated with properties, the current solution is to model that edge as a node. See Relations as Entities.

The graph data model is a JSON object that describes the Entities and Relations in a graph.

{
  "model": {
    "label": "<NAME>", (1)
    "entities": [], (2)
    "relations": [] (3)
  }
}

1	The name of the graph, which is also used to reference the corresponding property graph inside GQL queries (see GQL Syntax).
2	The set of entities to be considered for this graph.
3	The set of relations between entities.

Siren Federate requires a model to be provided as part of each graph API request. The model lets you view existing Elasticsearch documents as nodes of a property graph, with edges defined as joins between the indices that hold those documents.

The Elasticsearch indices and documents below are used to illustrate the concepts of entities and relations described further in this section.

PUT /people
{
  "mappings": {
    "properties": {
      "id": { "type": "integer" },
      "firstName": { "type": "keyword" },
      "age": { "type": "integer" },
      "knows": { "type": "integer" }
    }
  }
}

PUT /forums
{
  "mappings": {
    "properties": {
      "id": { "type": "integer" },
      "title": { "type": "keyword" },
      "members": { "type": "integer" },
      "messages": { "type": "integer" }
    }
  }
}

PUT /messages
{
  "mappings": {
    "properties": {
      "id": { "type": "integer" },
      "content": { "type": "text" },
      "creator": { "type": "integer" }
    }
  }
}

POST _bulk
{ "index" : { "_index" : "people" } }
{ "id": 1, "firstName": "John", "age": 25, "knows": 2 }
{ "index" : { "_index" : "people" } }
{ "id": 2, "firstName": "Paul", "age": 42 }
{ "index" : { "_index" : "forums" } }
{ "id": 1, "title": "BeatlesFans", "members": [1, 2], "messages": 1 }
{ "index" : { "_index" : "messages" } }
{ "id": 1, "content": "Hello, world!", "creator": 2 }

Such a collection of documents can be seen as a property graph like the one below, featuring two nodes with the Person label, one node with the Forum label, and one node with the Message label. One person ("John") knows the other ("Paul"), and both are members of the "BeatlesFans" forum, as shown by the hasMember relations. The forum is the containerOf a "Hello, world!" message, which was written by "Paul", as indicated by the hasCreator relation.

  ┌──────────────────┐
  │      Person      │
  │                  │
  │ id: 1            ◄─────────hasMember───────────┐
  │ firstName: John  │                             │
  │                  │                             │
  └─────────┬────────┘                             │
            │                                      │
            │                                      │
            │                            ┌─────────┴─────────┐                    ┌───────────────────┐
            │                            │       Forum       │                    │      Message      │
            │                            │                   │                    │                   │
          knows                          │ id: 1             ├─────containerOf────► id: 1             │
            │                            │ title: BeatlesFans│                    │ content: "Hello,  │
            │                            │                   │                    │            world!"│
            │                            └─────────┬─────────┘                    └─────────┬─────────┘
            │                                      │                                        │
            │                                      │                                        │
            │                                      │                                        │
  ┌─────────▼────────┐                             │                                        │
  │      Person      │                             │                                        │
  │                  │                             │                                        │
  │ id: 2            ◄─────────hasMember───────────┘                                        │
  │ firstName: Paul  │                                                                      │
  │                  │                                                                      │
  └─────────▲────────┘                                                                      │
            │                                                                               │
            └──────────────────────────────────hasCreator───────────────────────────────────┘

In the context of our example, the model would look like the following:

{
  "model": {
    "label": "my-graph",
    "entities": [
      { "label": "Person", "indices": [ "people" ] },
      { "label": "Forum", "indices": [ "forums" ] },
      { "label": "Message", "indices": [ "messages" ] }
    ],
    "relations": [
      {
        "label": "knows",
        "from": "Person",
        "to": "Person",
        "conditions": {
          "must": [
            { "op": "EQ", "from_key": "knows", "to_key": "id" }
          ]
        }
      },
      {
        "label": "hasMember",
        "from": "Forum",
        "to": "Person",
        "conditions": {
          "must": [
            { "op": "EQ", "from_key": "members", "to_key": "id" }
          ]
        }
      },
      {
        "label": "containerOf",
        "from": "Forum",
        "to": "Message",
        "conditions": {
          "must": [
            { "op": "EQ", "from_key": "messages", "to_key": "id" }
          ]
        }
      },
      {
        "label": "hasCreator",
        "from": "Message",
        "to": "Person",
        "conditions": {
          "must": [
            { "op": "EQ", "from_key": "creator", "to_key": "id" }
          ]
        }
      }
    ]
  }
}

In our example, we can see that three classes of entities can occur in our property graph, i.e., those labelled as "Person", "Forum", or "Message". "Person" entities are represented by documents coming from the "people" index, while "Forum" entities come from the "forums" index, and "Message" entities from the "messages" index.

Similarly, we can see that four classes of relations exist in our example, i.e., those labelled "knows", "hasMember", "containerOf", and "hasCreator". For instance, a "hasMember" relation connects a "Forum"-labelled entity with a "Person"-labelled entity if their corresponding documents satisfy a set of join conditions. In our example, the only condition is that one of the values of the "members" property of the "Forum" entity is equal (i.e., EQ) to one of the values of the "id" property of the "Person" entity. Similar considerations apply to the other classes of relations in our example.
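The join condition described above can be illustrated with a small in-memory sketch in Python. It is only meant to show which edges the example model implies for the sample documents; it says nothing about how Siren Federate actually evaluates joins. The helper names (`values_of`, `satisfies`, `derive_edges`) are hypothetical, and the sketch assumes that a missing field simply yields no edge.

```python
def values_of(doc, field):
    """Return a field's values as a list; a missing field yields no values."""
    value = doc.get(field)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def satisfies(condition, from_doc, to_doc):
    """EQ holds when the two fields share at least one value."""
    if condition["op"] != "EQ":
        raise ValueError("this sketch only handles EQ")
    left = set(values_of(from_doc, condition["from_key"]))
    right = set(values_of(to_doc, condition["to_key"]))
    return bool(left & right)


def derive_edges(relation, docs_by_label):
    """List (from_id, label, to_id) for every pair meeting all `must` conditions."""
    must = relation["conditions"]["must"]
    return [
        (f["id"], relation["label"], t["id"])
        for f in docs_by_label[relation["from"]]
        for t in docs_by_label[relation["to"]]
        if all(satisfies(c, f, t) for c in must)
    ]


# The sample documents from the bulk request, grouped by entity label.
docs_by_label = {
    "Person": [
        {"id": 1, "firstName": "John", "age": 25, "knows": 2},
        {"id": 2, "firstName": "Paul", "age": 42},
    ],
    "Forum": [{"id": 1, "title": "BeatlesFans", "members": [1, 2], "messages": 1}],
    "Message": [{"id": 1, "content": "Hello, world!", "creator": 2}],
}

has_member = {
    "label": "hasMember",
    "from": "Forum",
    "to": "Person",
    "conditions": {"must": [{"op": "EQ", "from_key": "members", "to_key": "id"}]},
}

print(derive_edges(has_member, docs_by_label))
# [(1, 'hasMember', 1), (1, 'hasMember', 2)]
```

The two resulting tuples correspond to the two hasMember arrows in the diagram: the forum with id 1 is connected to both people because its "members" field holds both of their ids.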

Entities

An entity represents a node in the graph with a given label. An entity is defined from a set of indices, and each document in those indices represents a particular node of the graph.

Below is the entity definition for a Person in the example graph who is at least 18 years old.

{
  "label": "Person", (1)
  "indices": [ "people" ], (2)
  "request": { (3)
    "query": {
      "range": {
        "age": {
          "gte": 18
        }
      }
    }
  }
}

1	The `label` of this entity.
2	The `indices` where such entities are stored. An index can be a pattern, e.g., `people*`.
3	An optional `request` object which defines a query to filter the set of possible entities.

It is possible to have several entities defined with the same label.
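When entity definitions are generated by a script rather than written by hand, a small builder can keep the optional `request` object out of the definition unless a filter is actually needed. The following Python sketch is one way to do that; `make_entity` is a hypothetical helper, not part of any Siren Federate client.

```python
import json


def make_entity(label, indices, query=None):
    """Build an entity definition; `request` is added only when a query is given."""
    entity = {"label": label, "indices": list(indices)}
    if query is not None:
        entity["request"] = {"query": query}
    return entity


# The filtered Person entity shown above: people aged 18 or over.
adults = make_entity("Person", ["people"], {"range": {"age": {"gte": 18}}})

# The unfiltered variant used in the full example model.
everyone = make_entity("Person", ["people"])

print(json.dumps(adults, indent=2))
```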

Relations

A relation represents a directed edge in the graph with a given label. A relation is defined as a join from one index to another, based on a set of conditions.

Below is the relation definition for the edge hasMember in the example graph connecting a Forum to a Person.

{
   "label": "hasMember", (1)
   "from": "Forum", (2)
   "to": "Person", (3)
   "conditions": { (4)
     "must": [ (5)
       {
         "op": "EQ", (6)
         "from_key": "members", (7)
         "to_key": "id" (8)
       }
     ]
   }
}

1	The `label` of this relation.
2	The label of the entity that this relation connects `from`.
3	The label of the entity that this relation connects `to`.
4	The `conditions` that define how two entities relate to each other.
5	A boolean clause for the set of conditions. Only `must` is currently supported.
6	The type of condition. Here, `EQ` denotes equality between two fields.
7	The field name in the entity referenced in `from`. The field must appear in one of the indices of that entity.
8	The field name in the entity referenced in `to`. The field must appear in one of the indices of that entity.

It is possible to have several relations defined with the same label.
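Because relations refer to entities by label, a typo in `from` or `to` is an easy mistake to make. The Python sketch below checks a model against the constraints stated in this section before it is sent anywhere: every `from` and `to` must name a defined entity, `must` is the only clause, and each condition carries the three fields shown above. `check_model` is a hypothetical helper, and the checks reflect only what this section describes, not the server's own validation.

```python
def check_model(body):
    """Return a list of problems found in a graph model; empty means none found."""
    model = body["model"]
    labels = {entity["label"] for entity in model["entities"]}
    problems = []
    for relation in model["relations"]:
        name = relation.get("label", "<unlabelled>")
        # Both ends of a relation must name an entity defined in the model.
        for end in ("from", "to"):
            target = relation.get(end)
            if target not in labels:
                problems.append(f"{name}: '{end}' refers to unknown entity {target!r}")
        conditions = relation.get("conditions", {})
        # Only the `must` clause is described as supported.
        for clause in conditions:
            if clause != "must":
                problems.append(f"{name}: unsupported clause {clause!r}")
        # Each condition should carry the fields shown in the callouts above.
        for condition in conditions.get("must", []):
            missing = {"op", "from_key", "to_key"} - set(condition)
            if missing:
                problems.append(f"{name}: condition is missing {sorted(missing)}")
    return problems


example = {
    "model": {
        "label": "my-graph",
        "entities": [
            {"label": "Person", "indices": ["people"]},
            {"label": "Forum", "indices": ["forums"]},
        ],
        "relations": [
            {
                "label": "hasMember",
                "from": "Forum",
                "to": "Persons",  # deliberate typo to trigger a report
                "conditions": {
                    "must": [{"op": "EQ", "from_key": "members", "to_key": "id"}]
                },
            }
        ],
    }
}

for problem in check_model(example):
    print(problem)
```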

Relations as Entities

While the property graph model supports edges with properties, this feature is not currently implemented by Siren Federate. To address this limitation, relations with properties must be modelled as entities.

For example, imagine extending our example by connecting "Paul" and "John" with a new calls relation. This relation represents a phone call and has properties such as the date of the call and its duration.

   ┌──────────┐                    ┌──────────┐
   │  Person  │                    │  Person  │
   │          │                    │          │
   │name: Paul├───────calls────────►name: John│
   │          │  date: 10-12-2025  │          │
   └──────────┘  duration: 3mins   └──────────┘

To represent this property graph with Siren Federate, the calls relation must be modelled with a Call entity containing the properties of the call. This entity must then be connected to "Paul" and "John" with two new relations with no properties (e.g., makes and isReceivedBy) that preserve "Paul" and "John" as source and destination of the phone call.

                    ┌────────────────┐
                    │      Call      │
                    │                │
         ┌──────────►date: 10-12-2025├──────────┐
         │          │duration: 3mins │          │
         │          │                │          │
         │          └────────────────┘          │
         │                                      │
       makes                               isReceivedBy
         │                                      │
    ┌────┴─────┐                           ┌────▼─────┐
    │  Person  │                           │  Person  │
    │          │                           │          │
    │name: Paul│                           │name: John│
    │          │                           │          │
    └──────────┘                           └──────────┘

The corresponding model would look like the following:

{
  "model": {
    "label": "my-graph",
    "entities": [
      { "label": "Person", "indices": [ "people" ] },
      { "label": "Call", "indices": [ "phonecalls" ] },
      ...
    ],
    "relations": [
      {
        "label": "makes",
        "from": "Person",
        "to": "Call",
        "conditions": {
          "must": [
            { "op": "EQ", "from_key": "id", "to_key": "caller_id" }
          ]
        }
      },
      {
        "label": "isReceivedBy",
        "from": "Call",
        "to": "Person",
        "conditions": {
          "must": [
            { "op": "EQ", "from_key": "callee_id", "to_key": "id" }
          ]
        }
      },
      ...
    ]
  }
}

This technique is also known as reification in the RDF community.
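On the data side, this modelling means that each phone call becomes a document of its own. The Python sketch below shows how an edge with properties could be turned into such a document, using the `caller_id` and `callee_id` fields from the model above. The `reify` helper and the shape of the input edge are hypothetical, the ids 2 and 1 are those of "Paul" and "John" in the sample data, and the string formats for `date` and `duration` are copied from the diagram rather than taken from a real mapping.

```python
def reify(edge, source_key, target_key):
    """Turn a property-carrying edge into a document for an intermediate entity."""
    doc = {source_key: edge["source"], target_key: edge["target"]}
    doc.update(edge["properties"])
    return doc


# "Paul" (id 2) calls "John" (id 1).
calls = {
    "source": 2,
    "target": 1,
    "properties": {"date": "10-12-2025", "duration": "3mins"},
}

# A document that could be indexed into the "phonecalls" index.
call_doc = reify(calls, "caller_id", "callee_id")
print(call_doc)
# {'caller_id': 2, 'callee_id': 1, 'date': '10-12-2025', 'duration': '3mins'}
```

With such a document in place, the makes relation matches the "id" of "Paul" against `caller_id`, and the isReceivedBy relation matches `callee_id` against the "id" of "John", so the direction of the original calls edge is preserved.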