Auditing user sessions

If you want to perform internal audits, you can enable the session audit feature. This feature allows you to log the actions that are performed by the logged-in user.

Configuring session logs

To enable the session audit feature, in the investigate.yml file, set the enabled parameter to true in the following code:

siren_audit:
  enabled: true                       # The value is false by default.
  temporary_logs_folder: './optimize' # The folder where the system can store the unsaved entries during shutdown or restart.
                                      # The default value is the './optimize' folder.
  logged_types:                       # The entry types that are logged.
    - ui                              # User actions, such as opening a dashboard or exporting a graph.
    - get                             # Elasticsearch get or mget data requests.
    - search                          # Elasticsearch search or msearch data requests.
    - count                           # Elasticsearch search requests that fetch counts without retrieving any documents (size=0).
    - savedObject                     # Operations on Investigate configuration objects, such as dashboards and visualizations.
    - auth                            # Log in and log out operations.
    - security                        # Security-related events, for example, a change of object ownership.
    - dataExport                      # JSON and CSV data export operations.
    - dev                             # Elasticsearch requests issued from the Dev Tools console.
    - systemSearch                    # Elasticsearch search requests internal to Investigate, for example, ping to check Elasticsearch health.
    - globalSearch                    # Global search requests.
    - globalSearchCount               # Global search requests used to get entity counts.
    - collectionsSearch               # Collections search requests used to get collections in the SirenAPI.
    - other                           # Other audited requests not categorized above. For example, Elasticsearch requests other than search or get.

  logged_saved_object_types:          # By default, enable logging on all saved object types
    - graph
    - relational-graph
    - visualization
    - ontology-model
    - config
    - dashboard
    - sidebargroup
    - eid
    - relation
    - template
    - datasource
    - script
    - search
    - url
    - doc
    - fingerprint
    - sirenapiscript
    - sidebaroptions
  logged_saved_object_body_types: []  # Types of saved objects to log the body, before and after operation. By default, the value is empty.
                                      # You can use any of the types that are listed under `logged_saved_object_types`.
                                      # Note: Be aware that the 'graph' type can contain sensitive information in the body.
                                      # The body of a graph can be very large, depending on the number of nodes it contains.
                                      # For example, 100 nodes is considered a small object, 5k nodes is a large object, and 20k nodes is a very large object.
  log_response_body: false            # If the value is set to 'true', the system logs the response body for Elasticsearch search requests and saved objects requests (excluding responses that contain the body of the object).
  log_header_names: []                # Array of header names to log, in addition to all headers with the "siren-" prefix, which are always logged when auditing is enabled.
  log_remote_ip_from_header_name: 'my-header' # By default, Investigate checks the x-forwarded-for header and extracts the first IP address from it.
                                              # If a non-standard header stores the client IP address,
                                              # use this property to specify the name of that header.
  wait_for_outputs_initialization: false      # When set to true, Investigate does not start until all configured outputs are initialized.
  disable_when_outputs_down: false    # When set to true, Investigate redirects to the status page if any of the configured outputs is down.
  # If you do not define the outputs, the system uses the main Elasticsearch cluster to store audit entries.
  # It is possible to configure more than one remote Elasticsearch output.
  outputs:
    - elasticsearch:
        manage_template: true                 # default true
        max_retries: 3                        # default 3
        retry_delay: 1000                     # default 1000 ms
        queue_size: 1000                      # default 1000
        flush_interval: 10000                 # default 10000 ms
        flush_size: 50                        # default 50
        index_prefix: 'siren-audit-'          # default siren-audit-
        index_interval: monthly               # daily, weekly, or monthly; default monthly
        number_of_shards: 1                   # default 1
        number_of_replicas: 1                 # default 1
        url: https://host:port
        username: username
        password: password
        ssl:
          verificationMode: full              # none, certificate, or full; default full
          certificateAuthorities: [ "/path/to/your/CA.pem" ]
          certificate: /path/to/your/client.crt
          key: /path/to/your/client.key
          keyPassphrase: changeme
        healthCheck:
          delay: 5000
          timeout: 60000


# When set, it is written to the instanceName field of audit entries
# Useful when multiple Investigate instances are configured to log audit entries to the same Elasticsearch cluster
investigate_core.instance_name: "my-instance-01"
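
The comments in the configuration above state that more than one remote Elasticsearch output can be configured. The following is a minimal sketch of how that might look, assuming that each additional output is declared as a further elasticsearch entry in the outputs list. The host names, user names, and passwords are placeholders, and any option that is omitted is assumed to fall back to the default shown above.

```yaml
siren_audit:
  enabled: true
  outputs:
    # First remote audit cluster (placeholder URL and credentials).
    - elasticsearch:
        url: https://audit-host-1:port
        username: username
        password: password
    # Second remote audit cluster (placeholder URL and credentials).
    # Assumption: additional outputs are listed as further entries.
    - elasticsearch:
        url: https://audit-host-2:port
        username: username
        password: password
        index_prefix: 'siren-audit-'
```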

Session log indices

After the feature is enabled, the system automatically creates an index that stores the audit entries. If you use the default value of elasticsearch as the output type, the system creates the index by using the output configuration credentials. If no output is defined, the system creates the index on the main Elasticsearch cluster by using the Siren Investigate System credentials.

Make sure that either the remote cluster user or the Siren Investigate Administrator user has the following security permissions:

Can store an index template.
Can create indices whose names start with the configured prefix (the default prefix is siren-audit-).
Can write to those indices.
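
As an illustration of the permissions listed above, the following sketch shows a role in the Elasticsearch file-based roles format (roles.yml). It assumes that the audit cluster uses the built-in Elasticsearch security features and the default siren-audit- prefix. The role name siren_audit_writer is hypothetical. If the cluster uses a different security plugin, grant the equivalent permissions in that plugin's own format.

```yaml
# roles.yml on the audit cluster (hypothetical role name)
siren_audit_writer:
  cluster:
    - manage_index_templates      # Store the audit index template.
  indices:
    - names: [ 'siren-audit-*' ]  # Indices that start with the configured prefix.
      privileges:
        - create_index            # Create the audit indices.
        - write                   # Write audit entries to those indices.
```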

Session log entries

The system can log the following types of session data:

ui: Actions that are performed by the user as they are interacting with the application. For example, a log entry is created when a user opens a dashboard.
savedObject: Access to system configuration objects. For example, a log entry is created when a user requests a configuration object for a particular dashboard.
get: All data-related document fetches by explicit ID. An Investigate page fetches documents by ID in certain circumstances, for example, when you click the "Show more" button in Siren Search result panels.
search: All data-related search queries. For example, a log entry is created when a visualization on a dashboard is fetching the data to render.
count: All queries that obtain the document count. For example, a log entry is created when a query is used to fetch the count of documents visible on a dashboard.
dataExport: A log entry is created each time a user attempts to export data as JSON or CSV.
dev: A log entry is created each time a user sends a request from the Dev Tools console.
systemSearch: All internal data-related search queries. For example, a log entry is created when Investigate periodically issues a request to fetch Investigate license information.
globalSearch: All search queries made from Siren Search or from the Global Search on the Dashboard page.
globalSearchCount: All search queries made from Siren Search or from the Global Search on the Dashboard page to get entity counts.
collectionsSearch: All queries made by scripts to get collection lists.
other: Other requests that don’t fit the previous categories, such as Elasticsearch requests other than get or search.

Entry filtering

When the log_response_body option is set to true, you can customize the source fields logged from Elasticsearch search response hits by setting the log_response_filters option to a list of filters.

A filter is defined by the following attributes:

type: if set, the filter will only apply to that type of session data. It can be set to either hit or data-export. Defaults to hit.
configuration.includes: a list of field names to include in the audit log entry.
configuration.excludes: a list of field names to exclude from the audit log entry. If a field is excluded it will not be logged even if specified in the includes list.
configuration.pattern: if set, the filter is only applied to hits from a specific Elasticsearch index pattern. Defaults to * (all indices).

Both configuration.includes and configuration.excludes support wildcard matching using the * character.

Excluding or including a field automatically excludes or includes any field nested within that field.

If no filter is defined for an index, all of the fields in hits from that index will be logged.

Examples

Excluding the fields amount and owner from any index

siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - configuration:
      excludes:
      - amount
      - owner

Including only the fields owner and label in hits from the company index

siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - configuration:
      pattern: company
      includes:
      - owner
      - label

Exclude the metadata.tags field and all of its children in any hit

siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - configuration:
      excludes:
      - metadata.tags

Exclude the children of metadata.tags in any hit

siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - configuration:
      excludes:
      - metadata.tags.*

Avoid logging fields for any hit coming from the transactions index

siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - configuration:
      pattern: 'transactions'
      excludes:
      - '*'

Including only the title field from the article index in a data export

siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - type: 'data-export'
    configuration:
      pattern: article
      includes:
      - title

Including only the fields from the article index in a data export that end with the substring label

siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - type: 'data-export'
    configuration:
      pattern: article
      includes:
      - '*label'

Note: If your expression starts with *, you must wrap it in single quotes, because YAML otherwise interprets a leading * as an alias reference.
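
The rules described earlier can also be combined in a single filter. The following sketch relies on the stated behavior that excludes take precedence over includes and that including a field also includes its nested fields. The nested field owner.email is hypothetical and is used only for illustration. Under those rules, hits from the company index would be logged with the owner and label fields, but without owner.email.

```yaml
siren_audit:
  enabled: true
  log_response_body: true
  log_response_filters:
  - configuration:
      pattern: company
      includes:
      - owner          # Also includes fields nested within owner.
      - label
      excludes:
      - owner.email    # Hypothetical nested field; excluded even though owner is included.
```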

Data export

If dataExport is listed under logged_types and log_response_body is set to true, session data from a data export is logged to the audit instance in batches of 100 records by default. You can change the batch size by setting the log_export_batch_size parameter.

Example

siren_audit:
  enabled: true
  log_response_body: true
  log_export_batch_size: 200                        # The value is 100 by default. The range is 1 - 2000.

Increasing log_export_batch_size logs more documents in the response body of each dataExport record in Siren Audit. If you experience an error while searching dataExport records, decrease this parameter to split the data into smaller batches.

If the logged_saved_object_body_types option includes the graph type, potentially sensitive information can be logged in the audit entry. This happens because saved objects of type graph can contain copies of fields from documents in the Elasticsearch indices.

When the log_response_body option is set to true, you must use a different instance of Siren Investigate, one that has auditing disabled, to view the session audit logs. If the same instance is used to view the logs, the log entries quickly grow in size and cause errors. This is caused by recursion: viewing a logged response issues a new request, whose response (which contains the earlier logged response) is logged in turn.
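
A minimal sketch of the investigate.yml file for the separate instance that is used only to view the audit logs. It assumes that the instance needs no other audit settings. The instance name is a placeholder.

```yaml
# investigate.yml on the instance used to view the audit logs
siren_audit:
  enabled: false          # Auditing is off, so viewing the logs creates no new audit entries.

investigate_core.instance_name: "my-audit-viewer"   # Placeholder name.
```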

In the event that the output remote cluster is down, the system can accumulate the log entries in a back-end, in-memory queue. If the queue reaches its maximum size, the system starts to lose the oldest audit entries. To avoid data loss, system administrators must actively monitor the health of remote audit clusters to keep them online. You can disable access to Siren Investigate when one of the audit outputs is unavailable by setting siren_audit.disable_when_outputs_down: true in the configuration file.
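
The following sketch combines the options described in the configuration reference above that are relevant to output outages. It assumes that queue_size is the setting that bounds the in-memory queue. The value 5000 is an arbitrary illustration, not a recommendation, and the URL and credentials are placeholders.

```yaml
siren_audit:
  enabled: true
  wait_for_outputs_initialization: true   # Do not start until all outputs are initialized.
  disable_when_outputs_down: true         # Redirect to the status page if an output is down.
  outputs:
    - elasticsearch:
        url: https://host:port
        username: username
        password: password
        queue_size: 5000                  # Assumed to bound the in-memory queue (default 1000).
        max_retries: 3                    # default 3
        retry_delay: 1000                 # default 1000 ms
```

A larger queue tolerates a longer outage before entries are lost, but uses more memory on the Investigate back end.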