Search APIs

Siren Federate introduces two new search actions: /siren/<INDEX>/_search, which replaces the /<INDEX>/_search Elasticsearch action, and /siren/<INDEX>/_msearch, which replaces the /<INDEX>/_msearch Elasticsearch action. Both actions are extensions of the original Elasticsearch actions and therefore support the same API. Any request that contains the join query clause must use these actions, because the original Elasticsearch actions do not support that clause.

Permissions: To use the APIs that are listed in this section, ensure that the cluster-level wildcard action cluster:internal/federate/* is granted by the security system.

Search API

The search API allows you to execute a search query and get back search hits that match the query.

Request

curl -XGET 'http://localhost:9200/siren/<INDEX>/_search'

curl -XPOST 'http://localhost:9200/siren/<INDEX>/_search'

curl -XGET 'http://localhost:9200/siren/_search'

curl -XPOST 'http://localhost:9200/siren/_search'

Path parameter

<INDEX>

(Optional, string) Comma-separated list or wildcard expression of index names used to limit the request.

Permissions: To use this API, ensure that the index-level wildcard action indices:data/read/federate/search* and the indices:data/read/federate/planner/search action are granted by the security system.

Scroll API

The scroll API allows you to paginate search hits. As in Elasticsearch, you pass a scroll parameter to the search API to set the duration of a scroll. Then, to go through each page or to clear a scroll, you use the endpoint /siren/_search/scroll/<SCROLL_ID> instead of the /_search/scroll/<SCROLL_ID> endpoint indicated in the Elasticsearch documentation.

Permissions: In addition to the permissions for the search API, the scroll API also requires the following index-level actions to be granted by the security system:

  • indices:data/read/federate/scroll

  • indices:data/read/federate/scroll/clear
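The two endpoints involved in scrolling can be sketched as follows. This is a minimal illustration, not a client implementation: the helper names, the index name companies, and the 1m duration are hypothetical, and the HTTP methods mentioned in the comments are assumed to mirror the Elasticsearch scroll API.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:9200"  # same host as the curl examples


def start_scroll_url(index, keep_alive="1m"):
    """URL of the first request: the Federate search endpoint plus a scroll duration."""
    return f"{BASE}/siren/{index}/_search?{urlencode({'scroll': keep_alive})}"


def scroll_url(scroll_id):
    """URL for follow-up requests on an open scroll.

    Assumption: as in Elasticsearch, GET/POST fetches the next page and
    DELETE clears the scroll. The ID is percent-encoded because it sits in the path.
    """
    return f"{BASE}/siren/_search/scroll/{quote(scroll_id, safe='')}"


print(start_scroll_url("companies"))
print(scroll_url("abc123=="))
```

The scroll ID passed to scroll_url would be taken from the response to the first request.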

Multi Search API

The multi search API allows you to execute several search requests in a single API call.

Request

curl -XGET 'http://localhost:9200/siren/<INDEX>/_msearch'

curl -XPOST 'http://localhost:9200/siren/<INDEX>/_msearch'

curl -XGET 'http://localhost:9200/siren/_msearch'

curl -XPOST 'http://localhost:9200/siren/_msearch'

Path parameter

<INDEX>

(Optional, string) Comma-separated list or wildcard expression of index names used to limit the request.

Permissions: To use this API, ensure that the index-level wildcard action indices:data/read/federate/search* and the indices:data/read/federate/planner/msearch action are granted by the security system.
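Because the multi search action supports the same API as its Elasticsearch counterpart, its body is expected to be newline-delimited JSON: one header line followed by one body line per search, with a trailing newline. The sketch below builds such a body; the function name and the index name companies are hypothetical, and the queries are placeholders.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, body) pairs into a newline-delimited JSON payload."""
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))  # e.g. {"index": "companies"} or {}
        lines.append(json.dumps(body))    # the search request itself
    # The payload must end with a newline character.
    return "\n".join(lines) + "\n"


payload = build_msearch_body([
    ({"index": "companies"}, {"query": {"match_all": {}}}),
    ({}, {"query": {"match_all": {}}, "size": 0}),
])
print(payload, end="")
```

With Elasticsearch, such a payload is normally sent with the Content-Type header set to application/x-ndjson; the same is assumed to hold for /siren/_msearch.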

Search Request

The syntax for the body of the search request is identical to the one supported by the Elasticsearch search API, with the additional support for the join query clause in the Query DSL.

Parameters

In addition to the parameters supported by the Elasticsearch search API, the Federate search API introduces the following parameters:

task_timeout

The maximum time, in milliseconds, that a task is allowed to run. When the timeout expires, the task returns the values accumulated up to that point. Defaults to no timeout (-1).

debug

Set to true to retrieve debug information from the query planner. Defaults to false.
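As an illustration, the sketch below adds both parameters to a search URL. It assumes that they are passed as URL query-string parameters, like other Elasticsearch search parameters; the function name and the index name companies are hypothetical.

```python
from urllib.parse import urlencode


def search_url_with_params(index, task_timeout_ms=None, debug=False):
    """Build a Federate search URL with optional task_timeout and debug parameters."""
    params = {}
    if task_timeout_ms is not None:
        params["task_timeout"] = task_timeout_ms  # milliseconds
    if debug:
        params["debug"] = "true"
    url = f"http://localhost:9200/siren/{index}/_search"
    # Leave the URL untouched when both parameters keep their defaults.
    return f"{url}?{urlencode(params)}" if params else url


print(search_url_with_params("companies", task_timeout_ms=5000, debug=True))
```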

Taking advantage of the join query cache

The join query cache is responsible for caching the results of a join query clause at the shard level. If an index has one or more replicas, it is recommended that you specify the preference parameter of the search request.

If no preference parameter is specified, the search request is processed against a random selection of shards. In such a scenario, the join query cache on every shard may differ and the chance of having a positive cache hit decreases.

For example, it is common practice to specify a user session ID as the preference, so that the same set of shards is selected across the search requests of the same user.
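A minimal sketch of this practice, assuming the application already holds a session identifier for each user (the function name, the index name, and the value session-42 are hypothetical):

```python
from urllib.parse import urlencode


def search_url_for_session(index, session_id):
    """Reuse the session ID as the preference so that one user keeps hitting the same shards."""
    query = urlencode({"preference": session_id})
    return f"http://localhost:9200/siren/{index}/_search?{query}"


first = search_url_for_session("companies", "session-42")
second = search_url_for_session("companies", "session-42")
print(first)
print(first == second)
```

Every request issued during the same session carries the same preference value, which is what lets the join query cache on the selected shards be reused.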

Search Response

The response returned by Federate’s search API is similar to the response returned by Elasticsearch’s search API. It extends the response with a planner object which includes information about the query plan execution.

is_pruned

The response may have been truncated for several reasons. The is_pruned flag indicates that the search results are incomplete.

query_plan

If the debug parameter is enabled, the planner object also includes detailed information and statistics about the query plan execution in a query_plan object.

If the debug parameter is disabled and the response was truncated, a simplified query plan is displayed with information detailing the causes of the truncation.

[source,json]
-----------------------------------------------------------
{
    "_shards": {
        "failed": 0,
        "skipped": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [],
        "max_score": 0.0,
        "total": 0
    },
    "planner": {
        "is_pruned": true,
        "is_truncated": true,
        "node": "AYex2HdPTu-cwkqwaquH1w",
        "query_plan": {
            "children": [
                {
                    "failures": [
                        {
                            "reason": "Unable to allocate buffer of size 2097152 due to memory limit. Current allocation: 0",
                            "type": "out_of_memory_exception"
                        }
                    ],
                    "type": "SearchTaskBroadcastRequest"
                }
            ],
            "type": "SearchJoinRequest"
        },
        "timestamp": {
            "start_in_millis": 1579776194845,
            "stop_in_millis": 1579776195243,
            "took_in_millis": 398
        },
        "took_in_millis": 398
    },
    "timed_out": false,
    "took": 19
}
-----------------------------------------------------------

The is_pruned flag is deprecated and will be renamed to is_truncated in version 20.0.
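A client can check the planner object before trusting the hits. The sketch below is one way to do so, based only on the shape of the sample response above: it treats either flag as a sign of truncation and walks the query_plan tree to gather the reported failures. The function names are hypothetical, and real query plans may contain fields that this walk does not cover.

```python
def is_incomplete(response):
    """True if the planner flagged the results as truncated (new or deprecated flag)."""
    planner = response.get("planner", {})
    return bool(planner.get("is_truncated") or planner.get("is_pruned"))


def collect_failures(plan_node):
    """Depth-first walk over a query_plan node, gathering every 'failures' entry."""
    if not plan_node:
        return []
    failures = list(plan_node.get("failures", []))
    for child in plan_node.get("children", []):
        failures.extend(collect_failures(child))
    return failures


# Trimmed copy of the sample response shown above.
sample = {
    "planner": {
        "is_pruned": True,
        "is_truncated": True,
        "query_plan": {
            "type": "SearchJoinRequest",
            "children": [
                {
                    "type": "SearchTaskBroadcastRequest",
                    "failures": [{"type": "out_of_memory_exception"}],
                }
            ],
        },
    }
}

if is_incomplete(sample):
    for failure in collect_failures(sample["planner"].get("query_plan")):
        print(failure["type"])
```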

Cancelling a Request

A search or a multi search request can be cancelled explicitly by a user. To do so, you need to pass an X-Opaque-Id header, which is used to identify the request. The endpoint for cancelling a request is /_siren/job/<ID>/_cancel. By default, the cancel request waits for all tasks associated with the search to be cancelled. This can be disabled by passing false to the boolean parameter wait_for_completion.

Permissions: To use this API, ensure that the cluster-level action cluster:admin/federate/job/cancel is granted by the security system.

Usage

Let’s identify a search request with the name my-request:

$ curl -H "Content-Type: application/json" -H "X-Opaque-Id: my-request" 'http://localhost:9200/siren/_search'

Then to cancel it, issue a request as follows:

$ curl -XPOST -H "Content-Type: application/json" 'http://localhost:9200/_siren/job/my-request/_cancel'

If successful, the response will acknowledge the request and give a listing of the cancelled tasks:

{
  "acknowledged" : true,
  "tasks" : [
    {
      "node" : "5ILUA44uSee-VxsBsNbsNA",
      "id" : 947,
      "type" : "transport",
      "action" : "indices:siren/plan",
      "description" : "federate query",
      "start_time_in_millis" : 1524815599457,
      "running_time_in_nanos" : 199131478,
      "cancellable" : true,
      "headers" : {
        "X-Opaque-Id" : "my-request"
      }
    }
  ]
}
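In an application, the identifier and the cancel URL can be derived together. The sketch below is a hedged illustration: generating a unique identifier per request is a suggestion rather than a requirement, the helper names are hypothetical, and wait_for_completion is assumed to be passed as a URL query-string parameter.

```python
import uuid
from urllib.parse import quote, urlencode

BASE = "http://localhost:9200"


def new_request_id(prefix="my-request"):
    """A unique ID per request, so that a cancel call targets only that request."""
    return f"{prefix}-{uuid.uuid4().hex}"


def search_headers(request_id):
    """Headers to send with the search request that may later be cancelled."""
    return {"Content-Type": "application/json", "X-Opaque-Id": request_id}


def cancel_url(request_id, wait_for_completion=True):
    """URL to POST to in order to cancel the request identified by request_id."""
    url = f"{BASE}/_siren/job/{quote(request_id, safe='')}/_cancel"
    if not wait_for_completion:
        url += "?" + urlencode({"wait_for_completion": "false"})
    return url


request_id = new_request_id()
print(search_headers(request_id))
print(cancel_url(request_id, wait_for_completion=False))
```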