CC-Agency: Advanced Configuration

GPU Nodes

If Nvidia GPUs are available on a node, they can be configured as follows.

controller:
  docker:
    nodes:
      gpu_node1:
        base_url: "tcp://192.168.0.101:2376"
        tls:
          verify: "/home/cc/.docker/machine/machines/gpu_node1/ca.pem"
          client_cert:
            - "/home/cc/.docker/machine/machines/gpu_node1/cert.pem"
            - "/home/cc/.docker/machine/machines/gpu_node1/key.pem"
          assert_hostname: False
        hardware:
          gpus:
            - id: 0
              vram: 1024
            - id: 1
              vram: 1024

This configuration declares two GPUs on the node “gpu_node1”, each with 1024 MB of VRAM. Currently, only Nvidia GPUs are supported. To make the GPUs accessible to Docker, Nvidia-Docker must be installed on each GPU node.

The IDs in the configuration are the Nvidia device IDs, which can be determined with nvidia-smi, for example.
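As a rough illustration of how the id and vram values could be collected, the following Python sketch queries nvidia-smi and converts its CSV output into entries shaped like the gpus list above. The helper names parse_gpu_query and query_gpus are hypothetical and not part of CC-Agency. It assumes that nvidia-smi is on the PATH and that its memory.total value (reported in MiB) is what should be entered as vram.

```python
import subprocess


def parse_gpu_query(csv_text):
    """Convert 'index, memory.total' CSV lines into gpu config entries."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, vram = (field.strip() for field in line.split(","))
        gpus.append({"id": int(index), "vram": int(vram)})
    return gpus


def query_gpus():
    """Run nvidia-smi on the local node and return the parsed GPU list."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.total",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_gpu_query(output)
```

Calling query_gpus() on a GPU node returns a list of dictionaries whose id and vram values can be copied into the hardware section of the node configuration.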

Notification Hooks

To send HTTP notifications when a batch enters a final state (succeeded, failed or cancelled), configure notification hooks in the agency configuration as follows.

controller:
  notification_hooks:
    - url: "http://example.com/notify"

    - url: "http://example.com/auth-notify"
      auth:
        username: "username"
        password: "password"

With this configuration, an HTTP POST request is sent to every given URL when a batch in this agency reaches a new state. The auth field is optional.

The content of a notification is a list of batch information, specifying the batchId and the new state of the batch:

Example:

{
  "batches": [
    {
      "batchId": "5c6sb537h4d0k4353ch27c",
      "state": "succeeded"
    }
  ]
}
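
A receiving endpoint for such notifications might look like the following Python sketch, which uses only the standard library. It is a minimal illustration, not part of CC-Agency: the names parse_notification, NotifyHandler and serve are hypothetical, as are the port and the status codes returned. It assumes the request body is JSON shaped like the example above, and it does not check the optional auth credentials.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body):
    """Return a list of (batchId, state) tuples from a notification body."""
    payload = json.loads(body)
    return [(b["batchId"], b["state"]) for b in payload.get("batches", [])]


class NotifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            batches = parse_notification(self.rfile.read(length))
        except (ValueError, KeyError, TypeError, AttributeError):
            # Body was not valid JSON or did not have the assumed shape.
            self.send_response(400)
            self.end_headers()
            return
        for batch_id, state in batches:
            print("batch {} is now {}".format(batch_id, state))
        self.send_response(204)
        self.end_headers()


def serve(port=8080):
    """Listen for notifications until interrupted (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), NotifyHandler).serve_forever()
```

Running serve() on a host reachable from the agency and listing that host's URL under notification_hooks would print one line per batch contained in each notification.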