This page explains the different terms used in the Watcher system.
They are sorted in alphabetical order.
An Action is what enables Watcher to transform the current state of a Cluster after an Audit.
An Action is an atomic task which changes the current state of a target Managed resource of the OpenStack Cluster such as:
In most cases, an Action triggers some concrete commands on an existing OpenStack module (Nova, Neutron, Cinder, Ironic, etc.).
An Action has a life-cycle and its current state may be one of the following:
Some default implementations are provided, but it is possible to develop new implementations which are dynamically loaded by Watcher at launch time.
An Action Plan specifies a flow of Actions that should be executed in order to satisfy a given Goal. It also contains an estimated global efficacy alongside a set of efficacy indicators.
An Action Plan is generated by Watcher when an Audit is successful which implies that the Strategy which was used has found a Solution to achieve the Goal of this Audit.
In the default implementation of Watcher, an action plan is composed of a list of successive Actions (i.e., a Workflow of Actions belonging to a unique branch).
However, Watcher provides abstract interfaces for many of its components, allowing other implementations to generate and handle more complex Action Plan(s) composed of two types of Action Item(s):
An Action Plan may be described using standard workflow model description formats such as Business Process Model and Notation 2.0 (BPMN 2.0) or Unified Modeling Language (UML).
To see the life-cycle and description of Action Plan states, visit the Action Plan state machine.
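As a rough illustration of the single-branch case described above, an Action Plan can be pictured as an ordered list of Actions applied one after another. This is only a sketch, not Watcher's implementation: the class names, fields, and action type strings below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    # Hypothetical fields: the kind of change and the resource it targets.
    action_type: str
    resource_id: str


@dataclass
class LinearActionPlan:
    # A single-branch workflow: actions run one after another, in list order.
    goal: str
    actions: list = field(default_factory=list)

    def run(self, execute):
        """Apply `execute` to each action in order and collect the results."""
        return [execute(action) for action in self.actions]


plan = LinearActionPlan(
    goal="server_consolidation",
    actions=[
        Action("migrate", "instance-1"),
        Action("disable_node", "compute-2"),
    ],
)
log = plan.run(lambda a: f"{a.action_type}:{a.resource_id}")
print(log)
```

A more complex implementation would replace the list with a graph of Action Items, which this sketch deliberately leaves out.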
The Administrator is any user who has admin access on the OpenStack cluster. This user is allowed to create new projects for tenants, create new users and assign roles to each user.
The Administrator usually has remote access to any host of the cluster in order to change the configuration and restart any OpenStack service, including Watcher.
In the context of Watcher, the Administrator is a role for users which allows them to run any Watcher commands, such as:
The Administrator is also allowed to modify any Watcher configuration files and to restart Watcher services.
In the Watcher system, an Audit is a request for optimizing a Cluster.
The optimization is done in order to satisfy one Goal on a given Cluster.
For each Audit, the Watcher system generates an Action Plan.
To see the life-cycle and description of Audit states, visit the Audit State machine.
An Audit may be launched several times with the same settings (Goal, thresholds, ...). Therefore it makes sense to save those settings in some sort of Audit preset object, which is known as an Audit Template.
An Audit Template contains at least the Goal of the Audit.
It may also contain some error handling settings indicating whether:
and how many retries should be attempted before failure occurs (the latter can be complex: for example, the scenario in which there are many first-time failures on ultimately successful Actions).
Moreover, an Audit Template may contain some settings related to the level of automation for the Action Plan that will be generated by the Audit. A flag will indicate whether the Action Plan will be launched automatically or will need a manual confirmation from the Administrator.
Last but not least, an Audit Template may contain a list of extra parameters related to the Strategy configuration. These parameters can be provided as a list of key-value pairs.
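The idea of an Audit Template as a reusable preset could be sketched as follows. The field names (`auto_trigger`, `max_retries`, `strategy_parameters`) and the `new_audit` helper are assumptions made for illustration and are not taken from the Watcher API.

```python
from dataclasses import dataclass, field


@dataclass
class AuditTemplate:
    goal: str                     # the only setting a template must contain
    auto_trigger: bool = False    # hypothetical flag: launch the plan automatically?
    max_retries: int = 0          # hypothetical error-handling setting
    strategy_parameters: dict = field(default_factory=dict)  # key-value pairs

    def new_audit(self):
        """Return the settings for one Audit launched from this template."""
        return {
            "goal": self.goal,
            "auto_trigger": self.auto_trigger,
            "max_retries": self.max_retries,
            # Copy so that one Audit cannot alter the shared preset.
            "parameters": dict(self.strategy_parameters),
        }


template = AuditTemplate(
    goal="server_consolidation",
    strategy_parameters={"cpu_threshold": "0.8"},
)
first = template.new_audit()
second = template.new_audit()
print(first == second)
```

Launching the same Audit several times then amounts to calling `new_audit()` repeatedly on the same template.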
Please read the official OpenStack definition of an Availability Zone.
A Cluster is a set of physical machines which provide compute, storage and networking resources and are managed by the same OpenStack Controller node. A Cluster represents a set of resources that a cloud provider is able to offer to its customers.
A data center may contain several clusters.
The Cluster may be divided into one or several Availability Zone(s).
A Cluster Data Model is a logical representation of the current state and topology of the Cluster Managed resources.
It is represented as a set of Managed resources (which may be a simple tree or a flat list of key-value pairs) which enables Watcher Strategies to know the current relationships between the different resources of the Cluster during an Audit and enables the Strategy to request information such as:
In a word, this data model enables the Strategy to know:
In the Watcher project, we aim at providing a generic and basic Cluster Data Model for each Goal, usable in the associated Strategies through a plugin-based mechanism that is directly accessible from the strategy classes in order to:
There may be various generic and basic Cluster Data Models proposed in Watcher helpers, each of them being adapted to achieving a given Goal:
Note however that a developer can use his/her own Cluster Data Model if the proposed data model does not fit his/her needs as long as the Strategy is able to produce a Solution for the requested Goal. For example, a developer could rely on the Nova Data Model to optimize some compute resources.
The Cluster Data Model may be persisted in any appropriate storage system (SQL database, NoSQL database, JSON file, XML File, In Memory Database, ...). As of now, an in-memory model is built and maintained in the background in order to accelerate the execution of strategies.
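A minimal sketch of what an in-memory model of this kind might look like is given below. It only tracks compute nodes and the instances they host; the class and method names are hypothetical and do not correspond to Watcher's own model classes.

```python
class InMemoryClusterModel:
    """Tiny in-memory topology: compute nodes and the instances they host."""

    def __init__(self):
        self._properties = {}  # resource id -> dict of properties
        self._host_of = {}     # instance id -> compute node id

    def add_node(self, node_id, **properties):
        self._properties[node_id] = properties

    def add_instance(self, instance_id, node_id, **properties):
        if node_id not in self._properties:
            raise KeyError(f"unknown compute node: {node_id}")
        self._properties[instance_id] = properties
        self._host_of[instance_id] = node_id

    def instances_on(self, node_id):
        """Return the ids of the instances hosted on a node, sorted by id."""
        return sorted(i for i, n in self._host_of.items() if n == node_id)

    def used_vcpus(self, node_id):
        """Sum the vCPUs of every instance hosted on a node."""
        return sum(
            self._properties[i].get("vcpus", 0)
            for i in self.instances_on(node_id)
        )


model = InMemoryClusterModel()
model.add_node("compute-1", vcpus=8)
model.add_instance("vm-a", "compute-1", vcpus=2)
model.add_instance("vm-b", "compute-1", vcpus=4)
print(model.instances_on("compute-1"), model.used_vcpus("compute-1"))
```

A Strategy could query such a model during an Audit instead of calling the OpenStack services each time it needs a relationship between resources.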
The Cluster History contains all the previously collected timestamped data such as metrics and events associated to any managed resource of the Cluster.
Just like the Cluster Data Model, this history may be used by any Strategy in order to find the most optimal Solution during an Audit.
In the Watcher project, a generic Cluster History API is proposed with some helper classes in order to:
Note however that a developer can use his/her own history management system if the Ceilometer system does not fit his/her needs as long as the Strategy is able to produce a Solution for the requested Goal.
The Cluster History data may be persisted in any appropriate storage system (InfluxDB, OpenTSDB, MongoDB, ...).
A controller node is a machine that typically runs the following core OpenStack services:
In many configurations, Watcher will reside on a controller node even if it can potentially be hosted on a dedicated machine.
Please read the official OpenStack definition of a Compute Node.
A Customer is the person or company which subscribes to the cloud provider offering. A customer may have several Project(s) hosted on the same Cluster or dispatched on different clusters.
In the private cloud context, the Customers are different groups within the same organization (different departments, project teams, branch offices and so on). Cloud infrastructure includes the ability to precisely track each customer’s service usage so that it can be charged back to them, or at least reported to them.
A Goal is a human readable, observable and measurable end result having one objective to be achieved.
Here are some examples of Goals:
Please read the official OpenStack definition of a Host Aggregate.
A running virtual machine, or a virtual machine in a known state such as suspended, that can be used like a hardware server.
A Managed resource is one instance of Managed resource type in a topology with particular properties and dependencies on other Managed resources (relationships).
For example, a Managed resource can be one virtual machine (i.e., an instance) hosted on a compute node and connected to another virtual machine through a network link (represented also as a Managed resource in the Cluster Data Model).
A Managed resource type is a type of hardware or software element of the Cluster that the Watcher system can act on.
Here are some examples of Managed resource types:
It can be any of the official list of available resource types defined in OpenStack for Heat.
An efficacy indicator is a single value that gives an indication on how the solution produced by a given strategy performed. These efficacy indicators are specific to a given goal and are usually used to compute the global efficacy of the resulting action plan.
In Watcher, these efficacy indicators are specified alongside the goal they relate to. When a strategy (which always relates to a goal) is executed, it produces a solution containing the efficacy indicators specified by the goal. Once this solution has been translated by the Watcher Planner into an action plan, its indicators and global efficacy are stored and become accessible through the Watcher API.
An efficacy specification is a contract associated with each Goal that defines the various efficacy indicators a strategy achieving the associated goal should provide within its solution. Indeed, each solution proposed by a strategy will be validated against this contract before calculating its global efficacy.
The Optimization Efficacy is the objective measure of how much of the Goal has been achieved with respect to the constraints and SLAs defined by the Customer.
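The contract idea can be sketched as a check that a solution provides exactly the indicators its goal declares, each with an acceptable type. The function name, the indicator names, and the dictionary layout below are illustrative assumptions, not Watcher's actual validation code.

```python
def validate_indicators(specification, indicators):
    """Check a solution's indicators against a goal's efficacy specification.

    `specification` maps indicator names to accepted Python types;
    `indicators` maps indicator names to the values a strategy produced.
    """
    missing = sorted(set(specification) - set(indicators))
    unexpected = sorted(set(indicators) - set(specification))
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    for name, accepted_types in specification.items():
        if not isinstance(indicators[name], accepted_types):
            raise TypeError(f"indicator {name!r} has the wrong type")
    return True


# Hypothetical specification for a consolidation-style goal.
spec = {
    "released_nodes": int,
    "migration_count": int,
}
ok = validate_indicators(spec, {"released_nodes": 2, "migration_count": 5})
print(ok)
```

Only a solution that passes this kind of check would go on to have its global efficacy computed.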
The way efficacy is evaluated will depend on the Goal to achieve.
Of course, the efficacy will be relevant only as long as the Action Plan is relevant (i.e., the current state of the Cluster has not changed in a way that a new Audit would need to be launched).
For example, if the Goal is to lower the energy consumption, the Efficacy will be computed using several efficacy indicators (KPIs):
All those indicators are computed within a given timeframe, which is the time taken to execute the whole Action Plan.
The efficacy also enables the Administrator to objectively compare different Strategies for the same goal and same workload of the Cluster.
Projects represent the base unit of “ownership” in OpenStack, in that all resources in OpenStack should be owned by a specific project. In OpenStack Identity, a project must be owned by a specific domain.
Please read the official OpenStack definition of a Project.
A Scoring Engine is an executable that has a well-defined input, a well-defined output, and performs a purely mathematical task. That is, the calculation does not depend on the environment in which it is running - it would produce the same result anywhere.
Because there might be multiple algorithms used to build a particular data model (and therefore a scoring engine), the usage of scoring engine might vary. A metainfo field is supposed to contain any information which might be needed by the user of a given scoring engine.
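The "purely mathematical" property can be illustrated with a small stand-alone class whose output depends only on its input. The class name, the weights, and the JSON-based input format are assumptions chosen for the example; they are not Watcher's scoring engine interface.

```python
import json


class WeightedSumEngine:
    """Deterministic scorer: the same input always yields the same output."""

    # Hypothetical model parameters, fixed when the engine is built.
    WEIGHTS = {"cpu": 0.7, "memory": 0.3}

    def metainfo(self):
        """Describe what a caller needs to know to use this engine."""
        return json.dumps({
            "inputs": sorted(self.WEIGHTS),
            "output": "weighted sum of the inputs",
        })

    def calculate_score(self, features_json):
        """Well-defined input (a JSON object) and output (a float)."""
        features = json.loads(features_json)
        return sum(w * features[name] for name, w in self.WEIGHTS.items())


engine = WeightedSumEngine()
score = engine.calculate_score('{"cpu": 0.5, "memory": 1.0}')
print(score, engine.metainfo())
```

Because nothing in `calculate_score` reads the clock, the host, or any external service, the engine would produce the same result anywhere, and the metainfo tells the caller which inputs to supply.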
SLA means Service Level Agreement.
The resources are negotiated between the Customer and the Cloud Provider in a contract.
Most of the time, this contract is composed of two documents:
Note that the SLA is more general than the SLO in the sense that the former specifies what service is to be provided, how it is supported, times, locations, costs, performance, and responsibilities of the parties involved while the SLO focuses on more measurable characteristics such as availability, throughput, frequency, response time or quality.
You can also read the Wikipedia page for SLA which provides a good definition.
A SLA violation happens when a SLA defined with a given Customer could not be respected by the cloud provider within the timeframe defined by the official contract document.
A Service Level Objective (SLO) is a key element of a SLA between a service provider and a Customer. SLOs are agreed as a means of measuring the performance of the Service Provider and are outlined as a way of avoiding disputes between the two parties based on misunderstanding.
You can also read the Wikipedia page for SLO which provides a good definition.
A Solution is the result of execution of a strategy (i.e., an algorithm). Each solution is composed of many pieces of information:
A Solution is different from an Action Plan because it contains the non-scheduled list of Actions which is produced by a Strategy. In other words, the list of Actions in a Solution has not yet been re-ordered by the Watcher Planner.
Note that some algorithms (i.e. Strategies) may generate several Solutions. This gives rise to the problem of determining which Solution should be applied.
Two approaches to dealing with this can be envisaged:
A Strategy is an algorithm implementation which is able to find a Solution for a given Goal.
There may be several potential strategies which are able to achieve the same Goal. This is why it is possible to configure which specific Strategy should be used for each goal.
Some strategies may provide better optimization results but may take more time to find an optimal Solution.
This component is in charge of executing the Action Plan built by the Watcher Decision Engine.
See System Architecture for more details on this component.
This database stores all the Watcher domain objects which can be requested by the Watcher API or the Watcher CLI:
Here, the Watcher domain is “optimization of some resources provided by an OpenStack system”.
See System Architecture for more details on this component.
This component is responsible for computing a set of potential optimization Actions in order to fulfill the Goal of an Audit.
It first reads the parameters of the Audit from the associated Audit Template and knows the Goal to achieve.
It then selects the most appropriate Strategy depending on how Watcher was configured for this Goal.
The Strategy is then executed and generates a set of Actions which are scheduled in time by the Watcher Planner (i.e., it generates an Action Plan).
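The sequence just described (read the Goal, select the configured Strategy, execute it, hand the result to the Planner) can be sketched as a single function. All names here are hypothetical, and the strategy and planner are trivial stand-ins rather than Watcher components.

```python
def consolidate(goal):
    # Hypothetical strategy: returns an unordered list of actions (a Solution).
    return [("disable_node", "compute-2"), ("migrate", "vm-a")]


def migrations_first(solution):
    # Hypothetical planner: order the actions so that migrations come first.
    return sorted(solution, key=lambda action: action[0] != "migrate")


def run_audit(audit_template, strategy_for_goal, strategies, planner):
    """Turn an Audit request into an ordered list of actions."""
    goal = audit_template["goal"]                   # 1. read the Goal
    strategy = strategies[strategy_for_goal[goal]]  # 2. select the Strategy
    solution = strategy(goal)                       # 3. execute the Strategy
    return planner(solution)                        # 4. schedule the Actions


action_plan = run_audit(
    {"goal": "server_consolidation"},
    {"server_consolidation": "basic"},  # configuration: goal -> strategy name
    {"basic": consolidate},
    migrations_first,
)
print(action_plan)
```

Keeping the goal-to-strategy mapping as plain configuration is what lets an operator swap the Strategy used for a Goal without changing the rest of the flow.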
See System Architecture for more details on this component.
The Watcher Planner is part of the Watcher Decision Engine.
This module takes the set of Actions generated by a Strategy and builds the design of a workflow which defines how to schedule those different Actions in time and, for each Action, what the prerequisite conditions are.
It is important to schedule Actions in time in order to prevent overload of the Cluster while applying the Action Plan. For example, it is important not to migrate too many instances at the same time in order to avoid network congestion which may decrease the SLA for Customers.
It is also important to schedule Actions in order to avoid security issues such as denial of service on core OpenStack services.
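One simple way to picture this kind of scheduling is to group migrations into batches of bounded size, so that only a few run at the same time. This is a sketch under stated assumptions: the `schedule` function, the tuple representation of actions, and the `max_parallel` parameter are hypothetical and do not describe the default Watcher planner.

```python
def schedule(actions, max_parallel=2):
    """Group actions into sequential steps, capping migrations per step.

    Each action is an (action_type, resource_id) tuple. Steps run one after
    another; the actions inside a step may run at the same time.
    """
    migrations = [a for a in actions if a[0] == "migrate"]
    others = [a for a in actions if a[0] != "migrate"]
    steps = [
        migrations[i:i + max_parallel]
        for i in range(0, len(migrations), max_parallel)
    ]
    # Non-migration actions run alone, after every migration has finished.
    steps.extend([a] for a in others)
    return steps


steps = schedule([
    ("migrate", "vm-a"),
    ("disable_node", "compute-2"),
    ("migrate", "vm-b"),
    ("migrate", "vm-c"),
])
print(steps)
```

Lowering `max_parallel` trades a longer Action Plan for less simultaneous load on the network.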
Some default implementations are provided, but it is possible to develop new implementations which are dynamically loaded by Watcher at launch time.
See System Architecture for more details on this component.