HTTP Interface for Administration and Monitoring

This is an introduction to ArangoDB's HTTP interface for administration and monitoring of the server.

Read global logs from the server

returns the server logs

GET /_admin/log

Query Parameters

  • upto (optional): Returns all log entries up to log level upto. Note that upto must be one of:
    • fatal or 0
    • error or 1
    • warning or 2
    • info or 3
    • debug or 4

    The default value is info.

  • level (optional): Returns all log entries of log level level. Note that the query parameters upto and level are mutually exclusive.

  • start (optional): Returns all log entries such that their log entry identifier (lid value) is greater or equal to start.

  • size (optional): Restricts the result to at most size log entries.

  • offset (optional): Starts to return log entries skipping the first offset log entries. offset and size can be used for pagination.

  • search (optional): Only return the log entries containing the text specified in search.

  • sort (optional): Sort the log entries either ascending (if sort is asc) or descending (if sort is desc) according to their lid values. Note that the lid imposes a chronological order. The default value is asc.
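The query parameters above can be combined into a request URL. The following is a minimal sketch in Python, not part of ArangoDB itself: the helper name build_log_url and the base URL are assumptions for illustration, and the validation only mirrors the rules stated above (upto and level are mutually exclusive, sort is asc or desc).

```python
from urllib.parse import urlencode

# Log levels accepted by `upto` and `level`, by name or by number (see the list above).
LOG_LEVELS = {"fatal": 0, "error": 1, "warning": 2, "info": 3, "debug": 4}
ALLOWED_PARAMS = {"upto", "level", "start", "size", "offset", "search", "sort"}


def build_log_url(base_url="http://localhost:8529", **params):
    """Return the URL for GET /_admin/log with the given query parameters.

    Hypothetical helper: it only builds the URL and does not contact a server.
    """
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unknown query parameters: {sorted(unknown)}")
    if "upto" in params and "level" in params:
        raise ValueError("upto and level are mutually exclusive")
    for key in ("upto", "level"):
        if key in params:
            value = params[key]
            if value not in LOG_LEVELS and value not in LOG_LEVELS.values():
                raise ValueError(f"invalid {key}: {value!r}")
    if params.get("sort", "asc") not in ("asc", "desc"):
        raise ValueError("sort must be 'asc' or 'desc'")
    query = urlencode(params)
    return f"{base_url}/_admin/log" + (f"?{query}" if query else "")


# Third page of warnings and worse, ten entries per page.
url = build_log_url(upto="warning", size=10, offset=20)
```

Rejecting invalid combinations on the client side avoids a round trip that would end in HTTP 400.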

Returns fatal, error, warning or info log messages from the server's global log. The result is a JSON object with the following attributes:

HTTP 200

A JSON document with these properties is returned:

  • lid (string): a list of log entry identifiers. Each log message is uniquely identified by its lid and the identifiers are in ascending order.
  • level: a list of the loglevels for all log entries.
  • timestamp (string): a list of the timestamps as seconds since 1970-01-01 for all log entries.
  • topic: a list of the topics of all log entries
  • text: a list of the texts of all log entries
  • totalAmount: the total amount of log entries before pagination.
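Because the response stores each attribute as a separate list, a client usually has to recombine the lists into one record per log entry. A minimal sketch in Python follows; the function name and the sample payload (including the value types) are made up for illustration and are not taken from a real server.

```python
def entries_from_log_response(payload):
    """Turn the parallel lists of a /_admin/log response into one dict per entry."""
    wanted = ("lid", "level", "timestamp", "topic", "text")
    # Only use the attributes that are actually present in the payload.
    keys = [key for key in wanted if key in payload]
    columns = [payload[key] for key in keys]
    return [dict(zip(keys, row)) for row in zip(*columns)]


# Hypothetical response body, made up for illustration only.
sample = {
    "lid": [12, 13],
    "level": [3, 2],
    "timestamp": [1700000000, 1700000005],
    "topic": ["general", "cluster"],
    "text": ["server started", "heartbeat delayed"],
    "totalAmount": 2,
}
entries = entries_from_log_response(sample)
```

Note that totalAmount is a single number rather than a list, so it is read directly from the payload and not zipped into the entries.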

Return Codes

  • 200: is returned if the request is valid. The response body is the JSON object described above.

  • 400: is returned if invalid values are specified for upto or level.

  • 500: is returned if the server cannot generate the result due to an out-of-memory error.

Return the current server loglevel

returns the current loglevel settings

GET /_admin/log/level

Returns the server's current loglevel settings. The result is a JSON object with the log topics being the object keys, and the log levels being the object values.

Return Codes

  • 200: is returned if the request is valid

  • 500: is returned if the server cannot generate the result due to an out-of-memory error.

Modify and return the current server loglevel

modifies the current loglevel settings

PUT /_admin/log/level

Modifies and returns the server's current loglevel settings. The request body must be a JSON object with the log topics being the object keys and the log levels being the object values.

The result is a JSON object with the adjusted log topics being the object keys, and the adjusted log levels being the object values.

The loglevel of all topics can be set at once by specifying only the loglevel as a string instead of a JSON object.

Possible loglevels are:

  • FATAL - There will be no way out of this. ArangoDB will go down after this message.
  • ERROR - This is an error. You should investigate and fix it. It may harm your production.
  • WARNING - This may be serious application-wise, but we don't know.
  • INFO - Something has happened, take notice, but no drama attached.
  • DEBUG - Output debug messages.
  • TRACE - Output trace messages. Prepare for your log to be flooded; don't use in production.

A JSON object with these properties is required:

  • audit-service: One of the possible loglevels.
  • cache: One of the possible loglevels.
  • syscall: One of the possible loglevels.
  • communication: One of the possible loglevels.
  • audit-authentication: One of the possible loglevels.
  • agencycomm: One of the possible loglevels.
  • startup: One of the possible loglevels.
  • general: One of the possible loglevels.
  • cluster: One of the possible loglevels.
  • audit-view: One of the possible loglevels.
  • collector: One of the possible loglevels.
  • audit-documentation: One of the possible loglevels.
  • engines: One of the possible loglevels.
  • trx: One of the possible loglevels.
  • mmap: One of the possible loglevels.
  • agency: One of the possible loglevels.
  • authentication: One of the possible loglevels.
  • memory: One of the possible loglevels.
  • performance: One of the possible loglevels.
  • config: One of the possible loglevels.
  • authorization: One of the possible loglevels.
  • development: One of the possible loglevels.
  • datafiles: One of the possible loglevels.
  • views: One of the possible loglevels.
  • ldap: One of the possible loglevels.
  • replication: One of the possible loglevels.
  • threads: One of the possible loglevels.
  • audit-database: One of the possible loglevels.
  • v8: One of the possible loglevels.
  • ssl: One of the possible loglevels.
  • pregel: One of the possible loglevels.
  • audit-collection: One of the possible loglevels.
  • rocksdb: One of the possible loglevels.
  • supervision: One of the possible loglevels.
  • graphs: One of the possible loglevels.
  • compactor: One of the possible loglevels.
  • queries: One of the possible loglevels.
  • heartbeat: One of the possible loglevels.
  • requests: One of the possible loglevels.
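A request that changes the loglevel of selected topics could be prepared as sketched below. This is an illustrative Python example under stated assumptions: the helper name build_log_level_request and the base URL are hypothetical, the Content-Type header is a conventional choice rather than a documented requirement, and the request is only built, not sent.

```python
import json
from urllib.request import Request

# The loglevels listed above.
LOG_LEVELS = ("FATAL", "ERROR", "WARNING", "INFO", "DEBUG", "TRACE")


def build_log_level_request(levels, base_url="http://localhost:8529"):
    """Build (but do not send) a PUT /_admin/log/level request.

    `levels` maps log topics (object keys) to loglevels (object values).
    """
    for topic, level in levels.items():
        if level.upper() not in LOG_LEVELS:
            raise ValueError(f"invalid loglevel for {topic!r}: {level!r}")
    body = json.dumps(levels).encode("utf-8")
    return Request(
        f"{base_url}/_admin/log/level",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_log_level_request({"queries": "DEBUG", "requests": "INFO"})
```

The prepared request can then be passed to urllib.request.urlopen; the response body would be the JSON object with the adjusted topics and levels described above.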

Return Codes

  • 200: is returned if the request is valid

  • 400: is returned when the request body contains invalid JSON.

  • 405: is returned when an invalid HTTP method is used.

  • 500: is returned if the server cannot generate the result due to an out-of-memory error.

Reloads the routing information

Reload the routing table.

POST /_admin/routing/reload

Reloads the routing information from the collection routing.

Return Codes

  • 200: Routing information was reloaded successfully.

Read the statistics

return the statistics information

GET /_admin/statistics

Returns the statistics information. The returned object contains the statistics figures grouped together according to the description returned by _admin/statistics-description. For instance, to access the figure userTime from the group system, first select the sub-object stored in the attribute system; within that sub-object, the value is stored in the attribute userTime.

In case of a distribution, the returned object contains the total count in count and the distribution list in counts. The sum (or total) of the individual values is returned in sum.
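The lookup described above amounts to two nested attribute accesses. The sketch below illustrates it in Python, together with one possible use of the count and sum attributes of a distribution. The sample object is hypothetical: its numbers are made up, and the client/totalTime names are assumptions chosen only to show the shape of a distribution.

```python
def get_figure(stats, group, name):
    """Look up one figure: first the group sub-object, then the figure attribute."""
    return stats[group][name]


def distribution_mean(distribution):
    """Average of a distribution figure, computed as sum / count (None if empty)."""
    count = distribution["count"]
    return distribution["sum"] / count if count else None


# Hypothetical excerpt of a /_admin/statistics response, for illustration only.
sample = {
    "system": {"userTime": 1.5},
    "client": {"totalTime": {"count": 4, "sum": 2.0, "counts": [3, 1, 0]}},
}

user_time = get_figure(sample, "system", "userTime")
mean_total_time = distribution_mean(get_figure(sample, "client", "totalTime"))
```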

Return Codes

  • 200: Statistics were returned successfully.

Examples

shell> curl --dump - http://localhost:8529/_admin/statistics

HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff

Statistics description

fetch descriptive info of statistics

GET /_admin/statistics-description

Returns a description of the statistics returned by /_admin/statistics. The returned object contains an array of statistics groups in the attribute groups and an array of statistics figures in the attribute figures.

A statistics group is described by

  • group: The identifier of the group.
  • name: The name of the group.
  • description: A description of the group.

A statistics figure is described by

  • group: The identifier of the group to which this figure belongs.
  • identifier: The identifier of the figure. It is unique within the group.
  • name: The name of the figure.
  • description: A description of the figure.
  • type: Either current, accumulated, or distribution.
  • cuts: The distribution vector.
  • units: Units in which the figure is measured.
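Since each figure refers to its group by identifier, a client may want to index the figures and attach the group name to each one. The following Python sketch shows one way to do this; the function name, the added groupName key, and the sample description are all assumptions for illustration.

```python
def index_figures(description):
    """Map (group, identifier) to the figure description, with the group name attached."""
    group_names = {g["group"]: g["name"] for g in description.get("groups", [])}
    index = {}
    for figure in description.get("figures", []):
        key = (figure["group"], figure["identifier"])
        # "groupName" is a key added by this sketch, not part of the server response.
        index[key] = {**figure, "groupName": group_names.get(figure["group"])}
    return index


# Hypothetical excerpt of a /_admin/statistics-description response.
sample = {
    "groups": [
        {"group": "system", "name": "Process Statistics", "description": "Made-up text."}
    ],
    "figures": [
        {
            "group": "system",
            "identifier": "userTime",
            "name": "User Time",
            "description": "Made-up text.",
            "type": "accumulated",
            "units": "seconds",
        }
    ],
}
figures = index_figures(sample)
```

The (group, identifier) pair is used as the key because the identifier is only unique within its group.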

Return Codes

  • 200: Description was returned successfully.

Examples

shell> curl --dump - http://localhost:8529/_admin/statistics-description

HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff

Return role of a server in a cluster

Get to know whether this server is a Coordinator or DB-Server

GET /_admin/server/role

Returns the role of a server in a cluster. The role is returned in the role attribute of the result. Possible return values for role are:

  • SINGLE: the server is a standalone server without clustering
  • COORDINATOR: the server is a coordinator in a cluster
  • PRIMARY: the server is a primary database server in a cluster
  • SECONDARY: the server is a secondary database server in a cluster
  • AGENT: the server is an agency node in a cluster
  • UNDEFINED: in a cluster, UNDEFINED is returned if the server role cannot be determined.
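A client can use the role attribute to decide whether cluster-only endpoints are worth calling. The Python sketch below classifies the role values listed above; the function name and the choice to return None for UNDEFINED are assumptions made for this example.

```python
CLUSTER_ROLES = frozenset({"COORDINATOR", "PRIMARY", "SECONDARY", "AGENT"})
KNOWN_ROLES = CLUSTER_ROLES | {"SINGLE", "UNDEFINED"}


def is_cluster_member(payload):
    """Return True or False for a known role, or None when the role is UNDEFINED."""
    role = payload["role"]
    if role not in KNOWN_ROLES:
        raise ValueError(f"unexpected role: {role!r}")
    if role == "UNDEFINED":
        return None
    return role in CLUSTER_ROLES


# Hypothetical parsed response bodies of GET /_admin/server/role.
standalone = is_cluster_member({"role": "SINGLE"})
coordinator = is_cluster_member({"role": "COORDINATOR"})
```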

Return Codes

  • 200: Is returned in all cases.

Return id of a server in a cluster

Get to know the internal id of the server

GET /_admin/server/id

Returns the id of a server in a cluster. The request will fail if the server is not running in cluster mode.

Return Codes

  • 200: Is returned when the server is running in cluster mode.

  • 500: Is returned when the server is not running in cluster mode.

Return whether or not a server is available

Return whether or not a server is available

GET /_admin/server/availability

Return availability information about a server.

This is a public API, so it does not require authentication. It is meant to be used only in the context of server monitoring.

Return Codes

  • 200: This API will return HTTP 200 in case the server is up and running and usable for arbitrary operations, is not set to read-only mode and is currently not a follower in case of an active failover setup.

  • 503: HTTP 503 will be returned in case the server is starting up or shutting down, is set to read-only mode or is currently a follower in an active failover setup.
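For monitoring, only the HTTP status code of this endpoint matters. The Python sketch below maps the two documented codes to a boolean and wraps the call with the standard library. The function names, base URL and timeout are assumptions, and treating an unreachable server as unavailable is a design choice of this example rather than documented behavior.

```python
import urllib.error
import urllib.request


def is_available(status_code):
    """Interpret the documented status codes: 200 is available, 503 is not."""
    if status_code == 200:
        return True
    if status_code == 503:
        return False
    raise ValueError(f"undocumented status code: {status_code}")


def check_availability(base_url="http://localhost:8529", timeout=2.0):
    """Query /_admin/server/availability and return True or False."""
    url = f"{base_url}/_admin/server/availability"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_available(response.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for non-2xx responses such as 503.
        return is_available(err.code)
    except OSError:
        # Assumption: connection errors and timeouts count as "not available".
        return False
```

A load balancer or health checker would typically call check_availability periodically and route traffic only to servers for which it returns True.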

Cluster

Queries statistics of DBserver

allows querying the statistics of a DBserver in the cluster

GET /_admin/clusterStatistics

Query Parameters

  • DBserver (required): The ID of the DBserver to query.

Queries the statistics of the given DBserver

Return Codes

  • 200: is returned when everything went well.

  • 400: the parameter DBserver was not given or is not the ID of a DBserver

  • 403: server is not a coordinator.

Queries the health of cluster for monitoring

Returns the health of the cluster as assessed by the supervision (agency)

GET /_admin/cluster/health

Queries the health of the cluster for monitoring purposes. The response is a JSON object, containing the standard code, error, errorNum, and errorMessage fields as appropriate. The endpoint-specific fields are as follows:

  • ClusterId: A UUID string identifying the cluster
  • Health: An object containing a descriptive sub-object for each node in the cluster. Each entry in Health will be keyed by the node ID and contain the following attributes:

    • Endpoint: A string representing the network endpoint of the server.
    • Role: The role the server plays. Possible values are "AGENT", "COORDINATOR", and "DBSERVER".
    • CanBeDeleted: Boolean representing whether the node can safely be removed from the cluster.

    Additionally, if the node is a Coordinator or DBServer, it will also have the following attributes:

    • Status: A string indicating the health of the node as assessed by the supervision (agency). This should be considered the primary source of truth for node health. If the node is responding normally to requests, it is "GOOD". If it has missed one heartbeat, it is "BAD". If it has been declared failed by the supervision, which occurs after missing heartbeats for about 15 seconds, it will be marked "FAILED".
    • SyncStatus: The last sync status reported by the node. This value is primarily used to determine the value of Status. Possible values include "UNKNOWN", "UNDEFINED", "STARTUP", "STOPPING", "STOPPED", "SERVING", "SHUTDOWN".
    • ShortName: A string representing the shortname of the server, e.g. "DBServer1".
    • Timestamp: ISO 8601 timestamp specifying the last heartbeat received.
    • Host: An optional string, specifying the host machine if known.
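A monitoring script would typically scan the Health object for nodes whose Status is not "GOOD". The Python sketch below shows one way to do this; the function name and the sample response (node IDs, endpoints, ClusterId) are made up for illustration.

```python
def unhealthy_nodes(health_response):
    """Return {node_id: status} for nodes whose Status is present and not "GOOD"."""
    result = {}
    for node_id, node in health_response.get("Health", {}).items():
        status = node.get("Status")
        # Per the description above, Status is only listed for Coordinators and
        # DBServers, so entries without it are skipped here.
        if status is not None and status != "GOOD":
            result[node_id] = status
    return result


# Hypothetical response body of GET /_admin/cluster/health.
sample = {
    "ClusterId": "00000000-0000-0000-0000-000000000000",
    "Health": {
        "AGNT-1": {"Endpoint": "tcp://10.0.0.1:8531", "Role": "AGENT", "CanBeDeleted": False},
        "CRDN-1": {"Endpoint": "tcp://10.0.0.2:8529", "Role": "COORDINATOR",
                   "CanBeDeleted": False, "Status": "GOOD"},
        "PRMR-1": {"Endpoint": "tcp://10.0.0.3:8530", "Role": "DBSERVER",
                   "CanBeDeleted": False, "Status": "FAILED"},
    },
}
problems = unhealthy_nodes(sample)
```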

Return Codes

  • 200: is returned when everything went well.