Deployment

Congress has two modes for deployment: single-process and multi-process. If you are interested in test-driving Congress or are not concerned about high-availability, the single-process deployment is best because it is easiest to set up. If you are interested in making Congress highly-available you want the multi-process deployment.

In the single-process version, you run Congress as a single operating-system process on one node (i.e. container, VM, physical machine).

In the multi-process version, you start with the 3 components of Congress (the API, the policy engine, and the datasource drivers). You choose how many copies of each component you want to run, how you want to distribute those components across processes, and how you want to distribute those processes across nodes.

Section Configuration Options describes the common configuration options for both single-process and multi-process deployments. After that HA Overview and HA Deployment describe how to set up the multi-process deployment.

Configuration Options

In this section we highlight the configuration options that are specific to Congress. To generate a sample configuration file that lists all available options, along with descriptions run the following commands:

$ cd /path/to/congress
$ tox -egenconfig

The tox command will create the file etc/congress.conf.sample, which has a comprehensive list of options. All options have default values, which means that even if you specify no options Congress will run.

The options most important to Congress are described below, all of which appear under the [DEFAULT] section of the configuration file.

drivers
The list of permitted datasource drivers. Default is the empty list. The list is a comma separated list of Python class paths. For example: drivers = congress.datasources.neutronv2_driver.NeutronV2Driver,congress.datasources.glancev2_driver.GlanceV2Driver
datasource_sync_period
The number of seconds to wait between synchronizing datasource config from the database. Default is 0.
enable_execute_action
Whether or not congress will execute actions. If false, Congress will never execute any actions to do manual reactive enforcement, even if there are policy statements that say actions should be executed and the conditions of those actions become true. Default is True.

One of Congress’s new experimental features is distributing its services across multiple services and even hosts. Here are the options for using that feature.

node_id
Unique ID of this Congress instance. Can be any string. Useful if you want to create multiple, distributed instances of Congress. Appears in the [DSE] section.

Here are the most often-used, but standard OpenStack options. These are specified in the [DEFAULT] section of the configuration file.

auth_strategy
Method for authenticating Congress users. Can be assigned to either ‘keystone’ meaning that the user must provide Keystone credentials or to ‘noauth’ meaning that no authentication is required. Default is ‘keystone’.
verbose
Controls whether the INFO-level of logging is enabled. If false, logging level will be set to WARNING. Default is true. Deprecated.
debug
Whether or not the DEBUG-level of logging is enabled. Default is false.

HA Overview

Some applications require Congress to be highly available. Some applications require a Congress Policy Engine (PE) to handle a high volume of queries. This guide describes Congress support for High Availability (HA) High Throughput (HT) deployment.

Please see the OpenStack High Availability Guide for details on how to install and configure OpenStack for High Availability.

HA Types

Warm Standby

Warm Standby is when a software component is installed and available on the secondary node. The secondary node is up and running. In the case of a failure on the primary node, the software component is started on the secondary node. This process is usually automated using a cluster manager. Data is regularly mirrored to the secondary system using disk based replication or shared disk. This generally provides a recovery time of a few minutes.

Active-Active (Load-Balanced)

In this method, both the primary and secondary systems are active and processing requests in parallel. Data replication happens through software capabilities and would be bi-directional. This generally provides a recovery time that is instantaneous.

Congress HAHT

Congress provides Active-Active for the Policy Engine and Warm Standby for the Datasource Drivers.

Run N instances of the Congress Policy Engine in active-active configuration, so both the primary and secondary systems are active and processing requests in parallel.

One Datasource Driver (DSD) per physical datasource, publishing data on oslo-messaging to all policy engines.

+-------------------------------------+      +--------------+
|       Load Balancer (eg. HAProxy)   | <----+ Push client  |
+----+-------------+-------------+----+      +--------------+
     |             |             |
PE   |        PE   |        PE   |        all+DSDs node
+---------+   +---------+   +---------+   +-----------------+
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| | API | |   | | API | |   | | API | |   | | DSD | | DSD | |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| | PE  | |   | | PE  | |   | | PE  | |   | | DSD | | DSD | |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
+---------+   +---------+   +---------+   +--------+--------+
     |             |             |                 |
     |             |             |                 |
     +--+----------+-------------+--------+--------+
        |                                 |
        |                                 |
+-------+----+   +------------------------+-----------------+
|  Oslo Msg  |   | DBs (policy, config, push data, exec log)|
+------------+   +------------------------------------------+
  • Performance impact of HAHT deployment:
    • Downtime: < 1s for queries, ~2s for reactive enforcement
    • Throughput and latency: leverages multi-process and multi-node parallelism
    • DSDs nodes are separated from PE, allowing high load DSDs to operate more smoothly and avoid affecting PE performance.
    • PE nodes are symmetric in configuration, making it easy to load balance evenly.
    • No redundant data-pulling load on datasources
  • Requirements for HAHT deployment
    • Cluster manager (eg. Pacemaker + Corosync) to manage warm standby
    • Does not require global leader election

Details

  • Datasource Drivers (DSDs):
    • One datasource driver per physical datasource
    • All DSDs run in a single DSE node (process)
    • Push DSDs: optionally persist data in push data DB, so a new snapshot can be obtained whenever needed
    • Warm Standby:
      • Only one set of DSDs running at a given time; backup instances ready to launch
      • For pull DSDs, warm standby is most appropriate because warm startup time is low (seconds) relative to frequency of data pulls
      • For push DSDs, warm standby is generally sufficient except for use cases that demand sub-second latency even during a failover
  • Policy Engine (PE):
    • Replicate policy engine in active-active configuration.
    • Policy synchronized across PE instances via Policy DB
    • Every instance subscribes to the same data on oslo-messaging
    • Reactive Enforcement: All PE instances initiate reactive policy actions, but each DSD locally selects a leader to listen to. The DSD ignores execution requests initiated by all other PE instances.
      • Every PE instance computes the required reactive enforcement actions and initiates the corresponding execution requests over oslo-messaging
      • Each DSD locally picks a PE instance as leader (say the first instance the DSD hears from in the asymmetric node deployment, or the PE instance on the same node as the DSD in a symmetric node deployment) and executes only requests from that PE
      • If heartbeat contact is lost with the leader, the DSD selects a new leader
      • Each PE instance is unaware of whether it is a leader
    • Node Configurations:
      • Congress supports the Two Node-Types (API+PE nodes, all-DSDs) node configuration because it gives reasonable support for high-load DSDs while keeping the deployment complexities low.
    • Local Leader for Action Execution:
      • Local Leader: every PE instance sends action-execution requests, but each receiving DSD locally picks a “leader” to listen to
      • Because there is a single active DSD for a given data source, it is a natural spot to locally choose a “leader” among the PE instances sending reactive enforcement action execution requests. Congress supports the local leader style because it avoids the deployment complexities associated with global leader election. Furthermore, because all PE instances perform reactive enforcement and send action execution requests, the redundancy opens up the possibility for zero disruption to reactive enforcement when a PE instance fails.
  • API:
    • Each node has an active API service
    • Each API service routes requests for the PE to its associated intranode PE
    • Requests for any other service (eg. get data source status) are routed to the Datasource and/or Policy Engine, which will be fielded by some active instance of the service on some node
  • Load balancer:
    • Layer 7 load balancer (e.g. HAProxy) distributes incoming API calls among the nodes (each running an API service).
    • load balancer optionally configured to use sticky session to pin each API caller to a particular node. This configuration avoids the experience of going back in time.
  • External components (load balancer, DBs, and oslo messaging bus) can be made highly available using standard solutions (e.g. clustered LB, Galera MySQL cluster, HA rabbitMQ)

Performance Impact

  • In single node deployment, there is generally no performance impact.
  • Increased latency due to network communication required by multi-node deployment
  • Increased reactive enforcement latency if action executions are persistently logged to facilitate smoother failover
  • PE replication can achieve greater query throughput

End User Impact

Different PE instances may be out-of-sync in their data and policies (eventual consistency). The issue is generally made transparent to the end user by making each user sticky to a particular PE instance. But if a PE instance goes down, the end user reaches a different instance and may experience out-of-sync artifacts.

HA Deployment

This section shows how to deploy Congress with High Availability (HA). Congress is divided to 2 parts in HA. First part is API and PolicyEngine Node which is replicated with Active-Active style. Another part is DataSource Node which is deployed with warm-standby style. Please see the HA Overview for details.

+-------------------------------------+      +--------------+
|       Load Balancer (eg. HAProxy)   | <----+ Push client  |
+----+-------------+-------------+----+      +--------------+
     |             |             |
PE   |        PE   |        PE   |        all+DSDs node
+---------+   +---------+   +---------+   +-----------------+
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| | API | |   | | API | |   | | API | |   | | DSD | | DSD | |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| | PE  | |   | | PE  | |   | | PE  | |   | | DSD | | DSD | |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
+---------+   +---------+   +---------+   +--------+--------+
     |             |             |                 |
     |             |             |                 |
     +--+----------+-------------+--------+--------+
        |                                 |
        |                                 |
+-------+----+   +------------------------+-----------------+
|  Oslo Msg  |   | DBs (policy, config, push data, exec log)|
+------------+   +------------------------------------------+

New config settings for setting the DSE node type:

  • N (>=2 even okay) nodes of PE+API node

    $ python /usr/local/bin/congress-server --api --policy-engine --node-id=<api_unique_id>
    
  • One single DSD node

    $ python /usr/local/bin/congress-server --datasources --node-id=<datasource_unique_id>
    

Nodes which DataSourceDriver runs on takes warm-standby style. Congress assumes cluster manager handles the active-standby cluster. In this document, we describe how to make HA of DataSourceDriver node by Pacemaker .

See the OpenStack High Availability Guide for general usage of Pacemaker and how to deploy Pacemaker cluster stack. The guide has some HA configuration for other OpenStack projects.

Prepare OCF resource agent

You need a custom Resource Agent (RA) for DataSoure Node HA. The custom RA is located in Congress repository, /path/to/congress/script/ocf/congress-datasource. Install the RA with following steps.

$ cd /usr/lib/ocf/resource.d
$ mkdir openstack
$ cd openstack
$ cp /path/to/congress/script/ocf/congress-datasource ./congress-datasource
$ chmod a+rx congress-datasource

Configure RA

You can now add the Pacemaker configuration for Congress DataSource Node resource. Connect to the Pacemaker cluster with the crm configure command and add the following cluster resources. After adding the resource make sure commit the change.

primitive ds-node ocf:openstack:congress-datasource \
   params config="/etc/congress/congress.conf" \
   node_id="datasource-node" \
   op monitor interval="30s" timeout="30s"

Make sure that all nodes in the cluster have same config file with same name and path since DataSource Node resource, ds-node, uses config file defined at config parameter to launch the resource.

The RA has following configurable parameters.

  • config: a path of Congress’s config file
  • node_id(Option): a node id of the datasource node. Default is “datasource-node”.
  • binary(Option): a path of Congress binary Default is “/usr/local/bin/congress-server”.
  • additional_parameters(Option): additional parameters of congress-server