https://blueprints.launchpad.net/dragonflow/+spec/is-chassis-alive-support
Several features depend on chassis, for example segments support and the router gateway. Dragonflow should provide a way to check whether a chassis is active, so that these features can use it.
Currently, Dragonflow doesn’t provide a way to check if a chassis is active. This causes several problems.
Make the Dragonflow controller report its timestamp to the Dragonflow Northbound Database periodically. Add a method that tells whether a chassis is active.
The implementation relies on the assumption that all nodes in the OpenStack cloud have consistent time. This is a reliable assumption because using NTP (Network Time Protocol) to properly synchronize services among nodes is recommended, according to [1].
[1] https://docs.openstack.org/newton/install-guide-obs/environment-ntp.html
As the distributed controller of an SDN (software-defined network), the Dragonflow controller can be used to monitor and manage other local services, for example the Dragonflow L3 agent, the metadata proxy, and other services in the future. With [2] from OpenStack Neutron, it is easy to start, stop, and check the local services.
[2] neutron.agent.linux.external_process.ProcessManager
The Dragonflow controller can report the status of local services to the Dragonflow Northbound Database when it reports its timestamp.
Services that should run alongside the neutron-server process could also be managed by the ProcessManager mentioned above.
Add a new configuration option, chassis_down_time. A chassis will be considered down if it has not reported itself for this long. The default value of chassis_down_time will be 75 seconds. The value should be more than twice the report_interval described below.
Add a new configuration option, report_interval. The Dragonflow controller will report its timestamp to the Dragonflow Northbound Database at this interval. The default value of report_interval is 30 seconds, which should not have a significant impact on the performance of the Dragonflow Northbound Database.
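The relationship between the two options could be checked when the controller starts. The following is a minimal sketch in plain Python, not Dragonflow's actual option registration. The `ReportingOptions` class and its `validate` method are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportingOptions:
    """Hypothetical container for the two options proposed in this spec."""

    report_interval: int = 30    # seconds between timestamp reports
    chassis_down_time: int = 75  # seconds of silence before a chassis is down

    def validate(self) -> None:
        # The spec asks for chassis_down_time to be more than twice
        # report_interval, so one missed report does not mark a chassis down.
        if self.report_interval <= 0:
            raise ValueError("report_interval must be positive")
        if self.chassis_down_time <= 2 * self.report_interval:
            raise ValueError(
                "chassis_down_time must be more than twice report_interval"
            )
```

With the proposed defaults (30 and 75 seconds) the check passes, while a chassis_down_time of 60 seconds with the same interval would be rejected.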
Add a new field called timestamp to Chassis in the Dragonflow Northbound Database. This field will not be exposed. The Chassis class in Dragonflow Northbound will provide a new method called is_active. The new method will compare the timestamp of the chassis with the current time. If the timestamp is older than the current time and the gap is greater than chassis_down_time, the method will return false; otherwise it will return true.
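The comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual Dragonflow model: the `Chassis` dataclass is a simplified stand-in, the timestamp is assumed to be stored as seconds since the epoch, and the `now` and `down_time` parameters are hypothetical additions that make the check easy to test.

```python
import time
from dataclasses import dataclass
from typing import Optional

CHASSIS_DOWN_TIME = 75  # seconds; the default proposed in this spec


@dataclass
class Chassis:
    """Simplified stand-in for the Northbound Chassis model."""

    id: str
    ip: str
    timestamp: float  # last report time, assumed to be seconds since epoch

    def is_active(self, now: Optional[float] = None,
                  down_time: float = CHASSIS_DOWN_TIME) -> bool:
        # `now` is injectable so the check can be tested deterministically.
        if now is None:
            now = time.time()
        # Inactive only when the last report lies in the past AND the gap
        # exceeds down_time. A timestamp slightly ahead of `now`, as with
        # small clock skew, still counts as active.
        return not (self.timestamp < now and now - self.timestamp > down_time)
```

For a chassis whose last report was at t = 1000, `is_active(now=1030.0)` returns `True` (30 second gap) and `is_active(now=1100.0)` returns `False` (100 second gap).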
Add a new field called service_status to Chassis in the Dragonflow Northbound Database. The status of local services will be recorded in this field.
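A periodic report that writes both fields might look like the sketch below. Everything here is an assumption made for illustration: the Northbound Database is replaced by an in-memory dictionary, `report_chassis` is a hypothetical helper, and service_status is assumed to be a mapping from service name to a boolean, a format this spec does not define.

```python
import time
from typing import Callable, Dict, Optional

# Hypothetical in-memory stand-in for the Chassis table in the Northbound DB.
NbChassisTable = Dict[str, dict]


def report_chassis(table: NbChassisTable,
                   chassis_id: str,
                   service_checks: Dict[str, Callable[[], bool]],
                   now: Optional[float] = None) -> dict:
    """Write the current timestamp and service status for one chassis."""
    record = table.setdefault(chassis_id, {"id": chassis_id})
    record["timestamp"] = time.time() if now is None else now
    # One entry per local service: True if its check reports it as running.
    record["service_status"] = {
        name: bool(check()) for name, check in service_checks.items()
    }
    return record
```

In a real controller this function would be called once every report_interval seconds, with the checks backed by whatever mechanism manages the local services.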
The new DB CLI commands mentioned below will show the Chassis status according to the return value of the is_active method. An administrator can then delete a stale Chassis.
Other features, for example ML2 port binding, can avoid using a stale Chassis by checking the return value of the is_active method. If the return value is false, ML2 can report an error on port binding. The details depend on the implementation of those features; the method only makes this possible.
The Dragonflow controller should silently ignore timestamp updates. In fact, once the virtual tunnel port is implemented in [3], it should only be concerned with IP address changes of a chassis.
[3] https://blueprints.launchpad.net/dragonflow/+spec/virtual-tunnel-port-support
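One way for a controller to ignore timestamp-only updates is to compare just the IP address of the old and new chassis records. The sketch below assumes chassis records are plain mappings with an "ip" key; the `on_chassis_updated` function and its `on_ip_change` callback are hypothetical names, not part of Dragonflow.

```python
from typing import Any, Callable, Mapping


def on_chassis_updated(old: Mapping[str, Any],
                       new: Mapping[str, Any],
                       on_ip_change: Callable[[Any, Any], None]) -> bool:
    """Call on_ip_change only when the chassis IP address differs.

    Returns True if the update was handled, False if it was ignored.
    """
    old_ip, new_ip = old.get("ip"), new.get("ip")
    if old_ip == new_ip:
        # Timestamp-only (or otherwise irrelevant) update: silently ignore.
        return False
    on_ip_change(old_ip, new_ip)
    return True
```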
The Dragonflow DB CLI should provide two commands.
These commands allow the administrator to clean up stale chassis.
None
None
Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License.