Ussuri Series Release Notes

10.4.0-34

New Features

  • Adds support for libvirt SASL authentication. It is enabled by default. LP#1964013

Upgrade Notes

  • The addition of libvirt SASL authentication requires a new password in passwords.yml, libvirt_sasl_password. This may be generated using the existing kolla-genpwd and kolla-mergepwd tooling.

  • The addition of libvirt SASL authentication requires both the nova_libvirt and nova_compute containers to be updated simultaneously, using new images with the necessary Cyrus SASL dependencies, as well as configuration containing the SASL credentials.

Security Issues

  • Explicitly removes the net.ipv4.ip_forward sysctl from /etc/sysctl.conf on hosts with Neutron L3 Agent. In the absence of another source for this sysctl, it should revert to the default of 0 after the next reboot. This is a follow-up to a previous change which stopped setting the sysctl, but left existing systems with the original value of 1 set.

    A deployer looking to more aggressively change the value may set neutron_l3_agent_host_ipv4_ip_forward to 0 using a Yoga release of Kolla Ansible. This option will be removed in future. Any deployments still relying on the previous value may set neutron_l3_agent_host_ipv4_ip_forward to 1. LP#1945453

  • Kolla Ansible used to run Ironic’s tftpd as an (unprivileged) root user. Now, it will explicitly use the nobody user.

  • Fixes an issue where the default configuration of libvirt did not use authentication for the API exposed over TCP on the internal API network. This allowed anyone with access to the internal API network read-write access to libvirt. While the internal API network is typically trusted, other services on this network generally at least require authentication.

    SASL authentication is now enabled for libvirt by default. Kolla Ansible has supported libvirt TLS since the Train release, and this is recommended to provide a higher level of security. LP#1964013

  • Adds mitigation for the Apache Log4j2 Remote Code Execution (RCE) Vulnerability in Elasticsearch - CVE-2021-44228.
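
As an illustration of the net.ipv4.ip_forward note above, a deployment that still relies on forwarding in the default network namespace could pin the previous value. This is a minimal sketch; the file location /etc/kolla/globals.yml is the usual default and is an assumption here.

```yaml
# /etc/kolla/globals.yml (assumed default location)
# Keep the previous behaviour on hosts running the Neutron L3 agent.
# Use 0 instead to change the value more aggressively, as described above.
neutron_l3_agent_host_ipv4_ip_forward: 1
```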

Bug Fixes

  • Removes custom value of max_allowed_secret_in_bytes in barbican.conf. The default maximum size in Barbican was doubled to avoid issues with some certificates. LP#1957795

  • Fixes broken elasticsearch_curator container by adding the necessary “LANG=en_US.UTF-8” to the crontab. LP#1919328

  • Fixes an inability to connect to the Zun console when kolla_enable_tls_external is true. Access to the console of any Zun container failed in that configuration. This fix sets the protocol for the wsproxy base_url in zun.conf according to the value of kolla_enable_tls_external. LP#1957117

  • Fixes LP#1987982, which caused the database log_bin_trust_function_creators variable not to be set back to “OFF” after a Keystone upgrade.

  • Adds back the option to configure the RabbitMQ clustering interface via Kolla. LP#1900160 <https://bugs.launchpad.net/kolla-ansible/+bug/1900160>

  • Fixes an issue seen when using Jinja2 3.1.0.

  • Fixes the configuration option setting the type of endpoint used by Neutron to send requests to Placement. LP#1960503

  • Fixes a configuration issue with Node Exporter causing all file system metrics of a host to be identical. LP#1961438

  • Fixes an issue where a failure of any Nova compute service to register itself would cause only the host querying the nova API to fail. Now, only hosts that fail to register will fail the Kolla Ansible run. Alternatively, to fail all hosts in a cell when any compute service fails to register, set nova_compute_registration_fatal to true. LP#1940119

  • The Prometheus OpenStack exporters are now behind HAProxy, providing a unique time series in the Prometheus database. This also ensures that only one exporter queries the OpenStack APIs in any given time interval. Previously, every OpenStack exporter was scraped at the same time, so all exporters queried the OpenStack APIs simultaneously. This introduced unnecessary load, and duplicate time series in the Prometheus database because the instance label was unique for each exporter. LP#1972818
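
The stricter Nova compute registration behaviour described above can be selected in globals.yml. A minimal sketch, assuming the default behaviour (only failing the hosts that did not register) is not wanted:

```yaml
# globals.yml
# Fail all hosts in a cell when any compute service fails to register.
nova_compute_registration_fatal: true
```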

10.4.0

New Features

  • Adds a new option, prometheus_openstack_exporter_timeout, to override the default scrape_timeout for the OpenStack exporter job.

  • Adds support for the Elasticsearch storage backend with CloudKitty. This feature lets you store CloudKitty rating documents directly in your Elasticsearch cluster.

    If you already have an Elasticsearch cluster running for logging, a new CloudKitty-specific index is created in it. This lets you use Kibana, Grafana or any other interface to browse your rating data, create dashboards, or build a billing service on top of it.

    Adds support for Prometheus as a fetcher/collector for CloudKitty. This feature lets you use Prometheus metrics as your source of rating. Using Prometheus lets you rate pretty much any OpenStack object directly from the Kolla-provided exporters (Openstack_exporter) or your own custom exporters.

  • Adds config parameter haproxy_nova_spicehtml5_proxy_tunnel_timeout to configure the Tunnel TimeOut directive for spicehtml5proxy haproxy service.

  • Adds a new variable, disable_firewall, which defaults to true. If set to false, then the host firewall will not be disabled during kolla-ansible bootstrap-servers.

  • Adds two new variables service_images_pull_retries and service_images_pull_delay which control the behaviour of image pulling tasks. These are useful if your registry is not 100% reliable (usually due to load). The defaults have been set to 3 retries and 5 seconds delay to ensure a better default experience (these are actually Ansible defaults when task retries are enabled).

  • Adds support for configuring the filter and gather_subset arguments for the setup module via kolla_ansible_setup_filter and kolla_ansible_setup_gather_subset respectively. These can be used to reduce the number of facts, which can have a significant effect on performance of Ansible.

  • Adds a new variable, ironic_enable_keystone_integration. It adds Keystone connection information to ironic.conf when connecting to an existing Keystone that is not being deployed at the same time.
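
Several of the variables introduced above are set in globals.yml. The fragment below is a sketch only: the retry, delay and gather_subset values are illustrative assumptions, not recommendations from these notes.

```yaml
# globals.yml
# Image pulls: the defaults are 3 retries and a 5 second delay.
# The values below are illustrative, for a heavily loaded registry.
service_images_pull_retries: 10
service_images_pull_delay: 10

# Passed to the gather_subset argument of the Ansible setup module.
# Illustrative value: skip the facter and ohai fact subsets.
kolla_ansible_setup_gather_subset: "!facter,!ohai"

# Leave the host firewall enabled during kolla-ansible bootstrap-servers.
disable_firewall: false
```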

Upgrade Notes

  • Updates all references to Ansible facts within Kolla Ansible from using individual fact variables to using the items in the ansible_facts dictionary. This allows users to disable fact variable injection in their Ansible configuration, which may provide some performance improvement. Check for facts referenced in local configuration files, and update to use ansible_facts before disabling fact variable injection.

  • Modifies the default value of ceph_nova_user from nova to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. Users who have overridden ceph_nova_keyring to use separate keyrings for Nova and Cinder should also override ceph_nova_user to match the Nova keyring. LP#1934145

  • Modifies the default value of rabbitmq_server_additional_erl_args from an empty string to +S 2:2 +sbwt none +sbwtdcpu none +sbwtdio none.
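
The changed defaults above can be overridden in globals.yml where the previous behaviour is still needed. In this sketch the keyring file name and the my_node_name variable are hypothetical; only the variable names ceph_nova_user, ceph_nova_keyring and rabbitmq_server_additional_erl_args come from these notes.

```yaml
# globals.yml
# Separate keyrings for Nova and Cinder: keep the user in step with the keyring.
ceph_nova_keyring: "ceph.client.nova.keyring"  # hypothetical file name
ceph_nova_user: "nova"

# Restore the previous (empty) Erlang arguments for RabbitMQ.
rabbitmq_server_additional_erl_args: ""

# Hypothetical local variable that referenced an injected fact.
# Before: "{{ ansible_hostname }}"
my_node_name: "{{ ansible_facts.hostname }}"
```

Local references of this kind should be updated before fact variable injection is disabled in the Ansible configuration.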

Security Issues

  • Fixes net.ipv4.ip_forward not to be enabled by Kolla Ansible on the default network namespace. It was enabled on hosts with Neutron L3 Agent (thus in most common setups with OVS and/or Linux Bridge, but not OVN) and allowed, unless users had extra iptables rules to avoid that, any traffic to be accepted for forwarding (as long as it was routable and passed other checks). Users of existing setups are advised to re-evaluate whether they need this sysctl enabled and disable if not necessary. Kolla Ansible will simply no longer try to set this sysctl at all. Neutron L3 Agent handles forwarding enablement per managed namespace. LP#1945453

Bug Fixes

  • Fixes monasca-thresh to correctly submit the topology to Storm. The previous container ran the topology in local mode (within the container), and didn’t use the Storm cloud. The new container handles submitting the topology to Storm and also handles killing and replacing the topology when its configuration has changed. As a result, the monasca-thresh container is only used for submission, and exits after that’s completed. The logs for the topology will now be available in the storm worker-artifact logs. LP#1808805

  • Fixes an issue where configuration in containers could become stale. This prevented containers with updated configuration from being restarted, e.g., if the kolla-ansible genconfig and kolla-ansible deploy-containers commands were used together. LP#1848775

  • Fixes elasticsearch fluentd output being enabled when elasticsearch is not enabled. LP#1927880

  • Fixes an issue seen when customising the Docker Yum repository URL on CentOS, where the docker_yum_gpgkey variable is not used consistently. LP#1934913

  • Fixes an issue where the SPICE console freezes after a while. LP#1938549

  • Fixed broken kolla-toolbox container when RabbitMQ is disabled and IPv6 is used. LP#1939883

  • Fixes mariadb-clustercheck not to run when there is no HAProxy. LP#1944114

  • No longer creates directories for haproxy and swift logs where they are not needed. LP#1945070

  • Fixes an error in the placement role which prevented deployment of the placement service when a custom policy file is used. LP#1948835

  • Fixes missing current Ansible version in the error message. LP#1948979

  • Fixes an issue with Cyborg deployment. LP#1937911

  • Fixes an issue with config.json for neutron-server when a VMware plugin agent is used.

  • Fixes an issue with Neutron linuxbridge ML2 agent when neutron_external_interface includes multiple interfaces. LP#1863935

  • Fixes an issue with Manila configuration which was missing a [glance] section, preventing some drivers from operating.

  • Fixes an issue with default Nova configuration for Ceph where the RBD user is set to nova, but only a cinder keyring is copied. The default value of ceph_nova_user is changed to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. LP#1934145

Other Notes

  • Optimised image pulling to avoid looping over disabled services.

10.3.0

New Features

  • Adds the kolla_sysctl_conf_path variable, which allows customising the path to the sysctl.conf file that is modified by Kolla Ansible plays. The default is /etc/sysctl.conf, as before.

  • Adds a new flag, docker_disable_default_network, which defaults to no. Docker uses 172.17.0.0/16 by default for bridge networking on docker0, and this might cause routing problems for operator networks. Setting this flag to yes will disable Docker’s bridge networking. This feature will be enabled by default from the Wallaby 12.0.0 release.

  • Adds a new HAProxy configuration variable, haproxy_host_ipv4_tcp_retries2, which allows users to modify this kernel option. This option sets the maximum number of times a TCP packet is retransmitted in established state before giving up. The default kernel value is 15, which corresponds to a duration of between approximately 13 and 30 minutes, depending on the retransmission timeout. This variable can be used to mitigate an issue with stuck connections in case of VIP failover, see bug 1917068 for details.

  • Adds the ability to override the automatic detection of fluentd_version and fluentd_binary. These can now be defined as extra variables. This removes the dependency of having docker configured for config generation.

  • Adds support for collecting Prometheus metrics from RabbitMQ. This is enabled by default when Prometheus and RabbitMQ are enabled, and may be disabled by setting enable_prometheus_rabbitmq_exporter to false.

  • Allows skipping and unsetting sysctl variables controlled by Kolla Ansible plays, using the KOLLA_SKIP and KOLLA_UNSET values.
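
The new flags in this release are all set in globals.yml. A minimal sketch follows; the tcp_retries2 value is an illustrative assumption chosen to be lower than the kernel default of 15, not a value taken from these notes.

```yaml
# globals.yml
# Disable Docker's default bridge network (docker0, 172.17.0.0/16).
docker_disable_default_network: "yes"

# Illustrative value: give up on stuck connections sooner after a VIP failover.
haproxy_host_ipv4_tcp_retries2: 6

# Opt out of collecting Prometheus metrics from RabbitMQ.
enable_prometheus_rabbitmq_exporter: false
```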

Bug Fixes

  • Fixes an issue with kolla-ansible bootstrap-servers if Zun is enabled where Zun-specific configuration for Docker was applied to all nodes. LP#1914378

  • Fixes an issue when Swift is deployed with S3 Token Middleware enabled. LP#1862765

  • Fixes the Northbound and Southbound database socket paths in OVN.

  • Fixes a chronyd crash loop when the server is rebooted (Debian). LP#1915528

  • Fixed an issue when Docker was configured after startup on Debian/Ubuntu, which resulted in iptables rules being created - before they were disabled. LP#1923203

  • A bug where sriov_agent.ini wasn’t copied due to Permission denied error was fixed. LP#1923467

  • Fixed an issue where docker python SDK 5.0.0 was failing due to missing six - introduced a constraint to install version lower than 5.x. LP#1928915

  • Fixes more-than-2-node RabbitMQ upgrade failing randomly. LP#1930293.

  • Fixes Swift deploy when TLS enabled. Added the missing handler and corrected the container name. LP#1931097

  • Fixes missing region_name in keystone_auth sections. See bug 1933025 for details.

  • Fixes iscsid failing in current CentOS 8 based images due to pid file being needlessly set. LP#1933033

  • Fixes host bootstrap on Debian not removing the conflicting packages. It now behaves in accordance with the docs. LP#1933122

  • Fixes an issue where kolla-ansible exits with a zero exit code when executed with a bogus command name. LP#1929397

  • Fixes a potential issue with Alertmanager in non-HA deployments. In this scenario, the peer gossip protocol is now disabled and Alertmanager won’t try to form a cluster with other, non-existent instances. LP#1926463

  • Adds a new flag, docker_disable_ip_forward, which defaults to no and can be used (by setting yes) to disable docker’s ip-forward option which makes docker set net.ipv4.ip_forward sysctl to 1. This is to protect from creating all-forwarding hosts. LP#1931615

  • Fixes an issue when generating /etc/hosts during kolla-ansible bootstrap-servers when one or more hosts has an api_interface with dashes (-) in its name. LP#1927357

  • Fixes some configuration issues around Barbican logging. LP#1891343

  • Fixes some configuration issues around Cinder logging. LP#1916752

  • Fixes the incorrect configuration of the ovs-dpdk service, which broke deployment with Kolla Ansible. For more details, see bug 1908850.

  • Fixes an issue with keepalived which was not recreated during an upgrade if configuration is unchanged. LP#1928362

  • Fixes an issue with Magnum when TLS is enabled. LP#781062

  • Fixes an issue with executing kolla-ansible when installed via pip install --user. LP#1915527

  • Fixes an issue where masakari.conf was generated for the masakari-instancemonitor service but not used.

  • Fixes an issue where masakari-monitors.conf was generated for the masakari-api and masakari-engine services but not used.

  • Uses a consistent variable name for container dimensions for masakari-instancemonitor - masakari_instancemonitor_dimensions. The old name of masakari_monitors_dimensions is still supported.

  • Fixes an issue with Octavia deployment when using a custom service auth project. If octavia_service_auth_project is set to a project that does not exist, Octavia deployment would fail. The project is now created. LP#1922100

  • Fixes LP#1892376 by updating deprecated syntax in the Monasca Elasticsearch template.

  • Removes whitespace around equal signs in zookeeper.cfg which were preventing the zkCleanup.sh script from running correctly.

Other Notes

  • Following Cinder upstream, support for using ZFSSA with Cinder has been removed. ZFSSA was unsupported in Train and later removed in Ussuri.

10.2.0

New Features

  • Adds a new flag, docker_disable_default_iptables_rules, which defaults to no. Docker manipulates iptables rules by default to provide network isolation, and this might cause problems if the host already has an iptables-based firewall. A common problem is that Docker sets the default policy of the FORWARD chain in the filter table to DROP. Setting docker_disable_default_iptables_rules to yes will disable Docker’s iptables manipulation. This feature will be enabled by default from the Victoria 11.0.0 release.

  • Improves performance of the common role by generating all fluentd configuration in a single file.

  • Improves performance of the common role by generating all logrotate configuration in a single file.
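
For hosts that already run their own iptables-based firewall, the new Docker flag above could be set as follows. This is a sketch of a globals.yml entry, not a recommended default.

```yaml
# globals.yml
# Stop Docker from manipulating iptables rules on the host.
docker_disable_default_iptables_rules: "yes"
```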

Known Issues

  • Since Ussuri, there is a bug in how Ceph (RBD) is handled with Cinder: the backend_host option is missing from the generated configuration for external Ceph. The symptoms are that volumes become unmanageable until extra admin action is taken. This does not affect the data plane - running virtual machines are not affected.

    There is a related issue regarding active-active cinder-volume services (single-host cinder-volume not affected), which is that they should not have been configured with backend_host in the first place but with cluster and proper coordination instead. Some users might have customised their config already to address this issue.

    The Kolla team is investigating the best way to address this for all its users. In the meantime, please ensure that, before upgrading to Ussuri, the backend_host option is set to its previous value (the default was rbd:volumes) via a config override.

    For more details please refer to the referenced bug. Do note this issue affects both new deployments and upgrades. LP#1904062

Upgrade Notes

  • When deploying Monasca with Logstash 6, any custom Logstash 2 configuration for Monasca will need to be updated to work with Logstash 6. Please consult the documentation.

  • baremetal role now uses CentOS 8 package repository for Docker CE (compared to 7 previously).

  • The Prometheus OpenStack exporter now uses internal endpoints to communicate with OpenStack services, to match the configuration of other services deployed by Kolla Ansible. Using public endpoints can be retained by setting the prometheus_openstack_exporter_endpoint_type variable to public.

  • The default value of REST_API_REQUIRED_SETTINGS was synchronized with Horizon. You may want to review settings exposed by the updated configuration.
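
To keep the Prometheus OpenStack exporter on public endpoints after this upgrade, the variable named above can be set in globals.yml, for example:

```yaml
# globals.yml
# Retain the previous behaviour of using public endpoints.
prometheus_openstack_exporter_endpoint_type: "public"
```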

Security Issues

  • The admin-openrc.sh file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of admin-openrc.sh is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.

Bug Fixes

  • Adds support for using bifrost-deploy behind a proxy. It uses the existing container_proxy variable.

  • Fixes handling of /dev/kvm permissions to be more robust against host-level actions. LP#1681461

  • IPv6 fully-routed topology (/128 addressing) is now allowed (where applicable). LP#1848941

  • When deploying Elasticsearch 6, Logstash 2 was deployed by default which is not compatible with Elasticsearch 6. Logstash 6 is now deployed by default.

  • Fix Castellan (Barbican client) when used with enabled TLS. LP#1886615

  • Fixes --configdir parameter to apply to default passwords.yml location. LP#1887180

  • fluentd is now logging to /var/log/kolla/fluentd/fluentd.log instead of stdout. LP#1888852

  • Fixes deploy-containers action missing for the Masakari role. LP#1889611

  • Fixes an issue where the keystone container would be stuck in a restart loop with a message that the fernet key is stale. LP#1895723

  • Fixes haproxy_single_service_split template to work with default for mode (http). LP#1896591

  • Fixed invalid fernet cron file path on Debian/Ubuntu from /var/spool/cron/crontabs/root/fernet-cron to /var/spool/cron/crontabs/root. LP#1898765

  • Adds with_first_found to the placement role for the placement-api WSGI configuration, allowing users to override it. LP#1898766

  • OVN will no longer schedule SNAT routers on compute nodes when neutron_ovn_distributed_fip is enabled. LP#1901960

  • RabbitMQ services are now restarted serially to avoid a split brain. LP#1904702

  • Fixes LP#1906796 by adding notice and note loglevels to monasca log-metrics drop configuration

  • Fixes Swift’s stop action. It will no longer try to start swift-object-updater container again. LP#1906944

  • Fixes an issue with the kolla-ansible prechecks command with Docker 20.10. LP#1907436

  • Fixes an issue with kolla-ansible mariadb_recovery when the mariadb container does not exist on one or more hosts. LP#1907658

  • Fixes a Freezer deployment failure when kolla_dev_mod is used. LP#1888242

  • Fixes issues with some CloudKitty commands trying to connect to an external TLS endpoint using HTTP. LP#1888544

  • Fixes an issue where Docker may fail to start if iptables is not installed. LP#1899060

  • The admin-openrc.sh file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of admin-openrc.sh is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.

  • Fixes an issue when deleting evacuated instances with encrypted block devices. LP#1891462

  • Fixes an issue where Keystone Fernet key rotation may fail due to permission denied error if the Keystone rotation happens before the Keystone container starts. LP#1888512

  • Fixes an issue with Keystone startup when Fernet key rotation does not occur within the configured interval. This may happen due to one of the Keystone hosts being down at the scheduled time of rotation, or due to uneven intervals between cron jobs. LP#1895723

  • Fixes an issue with Kibana upgrade on Debian/Ubuntu systems. LP#1901614

  • Reverts the arp_responder option setting to the default (‘False’) for the LinuxBridge agent, as this is known to cause problems with l2_population as well as other issues such as not being fully compatible with the allowed-address-pairs extension. LP#1892776

  • Fixes an issue with the Neutron Linux bridge ML2 driver where the firewall driver configuration was not applied. LP#1889455

  • Fixes an issue with Masakari and internal TLS where CA certificates were not copied into containers, and the path to the CA file was not configured. Depends on masakari bug 1873736 being fixed. LP#1888655

  • Fixes an issue where Grafana instances would race to bootstrap the Grafana DB. See LP#1888681.

  • Fixes LP#1892210 where the number of open connections to Memcached from neutron-server would grow over time until reaching the maximum set by memcached_connection_limit (5000 by default), at which point the Memcached instance would stop working.

  • Fixes an issue where, when Kafka default topic creation was used to create a Kafka topic, no redundant replicas were created in a multi-node cluster. LP#1888522. This affects Monasca, which uses Kafka, and was previously masked by the legacy Kafka client used by Monasca, which has since been upgraded in Ussuri. Monasca users with multi-node Kafka clusters should consult the Kafka documentation to increase the number of replicas.

  • Fixes an issue where the br_netfilter kernel module was not loaded on compute hosts. LP#1886796

  • The Prometheus OpenStack exporter now uses internal endpoints to communicate with OpenStack services, to match the configuration of other services deployed by Kolla Ansible.

  • Prevents adding a new Keystone host to an existing cluster when not targeting all Keystone hosts (e.g. due to --limit or --serial arguments), to avoid overwriting existing Fernet keys. LP#1891364

  • Reduces the use of SQLAlchemy connection pooling, to improve service reliability during a failover of the controller with the internal VIP. LP#1896635

  • No longer configures the Prometheus OpenStack exporter to use the prometheus Docker volume, which was never required.

Other Notes

  • Adds trove-guestagent.conf for Trove.

10.1.0

New Features

  • Adds ability to provide a custom elasticsearch config.

Upgrade Notes

  • Changes the default value of kibana_elasticsearch_ssl_verify from false to true. LP#1885110

  • Apache ZooKeeper will now be automatically deployed whenever Apache Storm is enabled.

Bug Fixes

  • Fixes an issue when using IP addresses instead of hostnames in the Ansible inventory. The OpenvSwitch role sets system-id based on inventory_hostname, which in the case of an IP address is the first octet. Such a deployment would result in multiple OVN chassis with a duplicate name, e.g. “10”, connecting to the OVN Southbound database. This generates a high number of create/delete events in the Encap database table, leading to near 100% CPU usage by the OVN/OVS/Neutron processes.

  • Fixes an issue with Manila deployment starting openvswitch and neutron-openvswitch-agent containers when enable_manila_backend_generic was set to False. LP#1884939

  • Fixes the Elasticsearch Curator cron schedule run. LP#1885732

  • Fixes an incorrect configuration for nova-conductor when a custom Nova policy was applied, preventing the nova_conductor container from starting successfully. LP#1886170

  • Fixes an incorrect Ceph keyring file configuration in gnocchi.conf, which prevented Gnocchi from connecting to Ceph. LP#1886711

  • In line with the clients for other services used by Magnum, the Cinder and Octavia clients now also use endpoint_type = internalURL. In the same vein, these clients also use the globally defined openstack_region_name.

  • Fixes the configuration of the etcd service so that its protocol is independent of the value of the internal_protocol parameter. The etcd service is not load balanced by HAProxy, so there is no proxy layer to do TLS termination when internal_protocol is configured to be https.

  • Fixes LP#1885885 where the default chunk size in the Monasca Fluentd output plugin increased from 8MB to 256MB for file buffering which exceeded the limit allowed by the Monasca Log / Unified API.

  • Adds a new variable fluentd_elasticsearch_cacert, which defaults to the value of openstack_cacert. If set, this will be used to set the path of the CA certificate bundle used by Fluentd when communicating with Elasticsearch. LP#1885109

  • Improves error reporting in kolla-genpwd and kolla-mergepwd when input files are not in the expected format. LP#1880220.

  • Fixes Magnum trust operations in multi-region deployments.

  • Deploys Apache ZooKeeper if Apache Storm is enabled explicitly. ZooKeeper would only be deployed if Apache Kafka was also enabled, which is often done implicitly by enabling Monasca.

10.0.0

Prelude

The Kolla Ansible 10.0.0 release is the first release in the Ussuri cycle. Notable changes include:

  • all playbooks and scripts now use Python 3 and support for Python 2 has been dropped

  • CentOS 8 is now supported as a host operating system and container image, and support for CentOS 7 has been dropped

  • Ceph deployment support has been dropped

  • configuration of external Ceph integration has been streamlined

  • initial support for TLS encryption of backend API services, providing end-to-end encryption of API traffic for Barbican, Cinder, Glance, Heat, Horizon, Keystone, Nova and Placement

  • support for deployment of Open Virtual Network (OVN) and integration of it with Neutron

New Features

  • Adds Elasticsearch Curator for managing aggregated log data.

  • Adds configuration variables cron_logrotate_rotation_interval and cron_logrotate_rotation_count to set the logrotate rotation interval and count.

  • Adds a mechanism to customize prometheus.yml. Please read the documentation for more details.

  • Adds support for two new Senlin services: senlin-conductor and senlin-health-manager. Both of these services are required for Senlin to be fully functional starting with the Ussuri release.

  • Adds a mechanism to copy user defined files via the extras directory of prometheus config. This can be useful for certain prometheus config customizations that reference additional files. An example is setting up file based service discovery.

  • Adds a new variable, influxdb_datadir_volume. This allows you to control where the docker volume for InfluxDB is created. For better performance, set this to a path on a high performance flash drive.

  • Adds a new variable, kafka_datadir_volume. This allows you to control where the Kafka data is stored. Generally you will want this to be a spinning disk, or an array of spinning disks.

  • Adds a new container, zun-cni-daemon, for the Zun service. This container is a daemon service for implementing the CNI plugin for Zun.

  • Allows operators to use custom parameters with the ceilometer-upgrade command. This is useful when using the dynamic pollster subsystem, which provides flexibility to create and edit pollster configs, which in turn affects Gnocchi resource-type configurations. However, Ceilometer uses default, hard-coded resource-type configurations, so operators who customize some of its default resource types can run into trouble during upgrades. The only way to work around this is to use the --skip-gnocchi-resource-types flag.

  • Adds new checks to kolla-ansible prechecks that validate that expected Ansible groups exist.

  • Kolla Ansible now checks that the local Ansible Python environment is coherent, i.e. that the Ansible in use can see Kolla Ansible. LP#1856346

  • Adds support for CentOS 8 as a host Operating System and base container image. This is the only major version of CentOS supported from the Ussuri release. The Train release supports both CentOS 7 and 8 hosts, and provides a route for migration.

  • Introduces user modifiable variables instead of fixed names for Ceph keyring files used by external Ceph functionality.

  • Configures all openstack services to use the globally defined Certificate Authority file to verify HTTPS connections. The global CA file is configured by the openstack_cacert parameter.

  • When kolla_copy_ca_into_containers is configured to yes, the certificate authority files in /etc/kolla/certificates/ca will be copied into service containers to enable trust for those CA certificates. This is required for any certificates that are either self-signed or signed by a private CA, and are not already present in the service image trust store. Otherwise, either CA validation will need to be explicitly disabled or the path to the CA certificate must be configured in the service using the openstack_cacert parameter.

  • Adds a prune-images command for Docker image pruning on hosts. See blueprint for details.

  • Fluentd now buffers logs locally to file when the Monasca API is unreachable.

  • Adds configuration options to enable backend TLS encryption from HAProxy to the Keystone, Glance, Heat, Placement, Horizon, Barbican, and Cinder services. When used in conjunction with enabling TLS for service API endpoints, network communication will be encrypted end to end, from client through HAProxy to the backend service.

  • Delegates execution of the Ansible uri module to service containers using kolla_toolbox. This will enable any certificates that are already copied and extracted into the service container to be automatically validated. This is particularly useful in the case that the certificate is either self-signed or signed by a local (private) CA.

  • Introduces external Ceph user IDs as variables to allow non-standard Ceph authentication IDs in OpenStack service configuration without the need to override configuration files.

  • Adds a --clean argument to kolla-mergepwd. It allows cleaning old (no longer used) keys from the passwords file.

  • Adds support for generating self-signed certificates for both the internal and external (public) networks via the kolla-ansible certificates command. If they are the same network, then the certificate files will be the same.

  • Self-signed TLS certificates can be used to test TLS in a development OpenStack environment. The kolla-ansible certificates command will generate the required self-signed TLS certificates. This command has been updated to first create a self-signed root certificate authority. The command then generates the internal and external facing certificates and signs them using the root CA. If backend TLS is enabled, the command will generate the backend certificate and sign it with the root CA.

  • HAProxy - Add the ability to define custom HAProxy services in {{ node_custom_config }}/haproxy/services.d/

  • Adds a new precheck for supported host OS distributions. Currently supported distributions are CentOS/RHEL 8, Debian Buster and Ubuntu Bionic. This check can be disabled by setting prechecks_enable_host_os_checks to false.
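
    For example, the check could be switched off as follows, assuming the variable is set in globals.yml (the file name is an assumption):

    ```yaml
    # Skip the supported host OS distribution precheck.
    prechecks_enable_host_os_checks: false
    ```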

  • Adds support for deployment of OVN and integration of it with Neutron. This includes deployment of:

    • OVN databases (ovn-sb-db and ovn-nb-db)

    • Southbound and Northbound databases connector (ovn-northd)

    • Hypervisor components ovn-controller and neutron-ovn-metadata-agent

  • Add Object Storage service (Swift) support for Ironic.

  • Adds support for managing Ceilometer dynamic pollster configuration in Kolla Ansible. This feature will look for configurations in {{ node_custom_config }}/ceilometer/pollster.d/ by default. Any configuration files found there are copied to the control nodes to configure the Ceilometer dynamic pollster subsystem.

  • Enables Galera node state checking using the clustercheck script, which HAProxy uses to determine whether a node is up or down.

  • Introduces a new configuration variable mariadb_wsrep_extra_provider_options allowing users to set additional WSREP options.
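
    As an illustration only, assuming the variable takes a list of Galera provider option strings and is set in globals.yml (both are assumptions; the option and value shown are examples, not recommendations):

    ```yaml
    # Extra WSREP provider options (assumed list format; example value only).
    mariadb_wsrep_extra_provider_options:
      - "gcache.size=2G"
    ```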

  • Adds support for the Neutron policy file in both .json and .yaml format.

  • Adds a new variable, openstack_tag, which is used as the default Docker image tag in place of openstack_release. The default value is openstack_release, with a suffix set via openstack_tag_suffix. The suffix is empty except on CentOS 8 where it is set to -centos8. This allows for the availability of images based on CentOS 7 and 8.
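
    A rough sketch of how the tag could be pinned explicitly, assuming globals.yml is used and taking "ussuri" as an illustrative release name:

    ```yaml
    # Illustrative release name (assumption).
    openstack_release: "ussuri"

    # By default the tag is openstack_release followed by openstack_tag_suffix,
    # so it would be "ussuri" on most hosts and "ussuri-centos8" on CentOS 8.
    # Setting openstack_tag overrides that default:
    openstack_tag: "ussuri-centos8"
    ```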

  • Prometheus server can now be disabled, allowing the exporters to be deployed without it. The default behaviour of deploying Prometheus server when Prometheus is enabled remains.

Known Issues

  • The Python Requests library will not trust self-signed or privately signed CAs even if they are added into the OS trusted CA folder and update-ca-trust is executed. For services that rely on the Python Requests library, either CA verification must be explicitly disabled in the service or the path to the CA certificate must be configured using the openstack_cacert parameter.

Upgrade Notes

  • Adds a maximum supported version check for Ansible. Kolla Ansible now requires at least Ansible 2.8 and supports up to 2.9. See blueprint for details.

  • Avoids unnecessary fact gathering using the setup module. This should improve the performance of environments using fact caching and the Ansible smart fact gathering policy. See blueprint for details.

  • CentOS 7 is no longer supported as a host Operating System or base container image. CentOS users should migrate to CentOS 8. The Train release supports both CentOS 7 and 8 images, and provides a route for migration.

  • Some images were supported by CentOS 7 but lack suitable packages in CentOS 8, and are no longer supported for CentOS. See Kolla release notes for details.

  • Support for the SCSI target daemon (tgtd) has been removed for CentOS/RHEL 8. The default value of cinder_target_helper is now lioadm on CentOS/RHEL 8, but remains as tgtadm on other platforms.

  • Keyring handling has changed for cinder (cinder-volume and cinder-backup), glance-api and manila. Kolla Ansible no longer copies those keys using wildcards (ceph.*) and instead uses newly introduced variables. Your environment may be rendered unusable after an upgrade if your keys in /etc/kolla/config do not match the default values of the introduced variables.

  • The default migration_interface has changed from network_interface to api_interface, which is treated as the internal, secure network plane in most cases.

  • The gnocchi-statsd daemon is no longer enabled by default. If you are using the daemon, you will need to set enable_gnocchi_statsd: "yes" to continue using it in your deployment.

  • Erlang 22.x dropped support for HiPE so the rabbitmq_hipe_compile variable has been removed.

  • Changes default value of enable_haproxy_memcached to no. Memcached has not been accessed via haproxy since at least the Rocky release. Users depending on haproxy for memcached for other software may want to change this back to yes.
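
    Deployments that still rely on HAProxy in front of memcached could restore the old behaviour along these lines, assuming the setting lives in globals.yml:

    ```yaml
    # Keep the HAProxy frontend for memcached (the previous default).
    enable_haproxy_memcached: "yes"
    ```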

  • Python 2.7 support has been dropped. The last release of Kolla Ansible to support Python 2.7 is OpenStack Train. The minimum version of Python now supported by Kolla Ansible is Python 3.6.

  • The default behavior for generating the cinder.conf template has changed. An rbd-1 section will be generated when external Ceph functionality is used, i.e. cinder_backend_ceph is set to true. Previously it was only included when Kolla Ansible internal Ceph deployment mechanism was used.

  • The rbd section of nova.conf for nova-compute is now generated when nova_backend is set to "rbd". Previously it was only generated when both enable_ceph was "yes" and nova_backend was set to "rbd".

  • The kolla_logs Docker volume is now mounted into the Elasticsearch container to expose logs which were previously written erroneously to the container filesystem. It is up to the user to migrate any existing logs if they so desire and this should be done before applying this fix. LP#1859162

  • The default value for kolla_external_fqdn_cacert has been changed from: “{{ node_config }}/certificates/haproxy-ca.crt” to: “{{ node_config }}/certificates/ca/haproxy.crt”

    and the default value for kolla_internal_fqdn_cacert has been changed from: “{{ node_config }}/certificates/haproxy-ca-internal.crt” to: “{{ node_config }}/certificates/ca/haproxy-internal.crt”

    These variables set the value for the OS_CACERT environment variable in admin-openrc.sh. This has been done to allow these certificates to be copied into containers when kolla_copy_ca_into_containers is true.

  • Replaced kolla_external_fqdn_cacert and kolla_internal_fqdn_cacert with kolla_admin_openrc_cacert, which by default is not set. OS_CACERT is now set to the value of kolla_admin_openrc_cacert in the generated admin-openrc.sh file.
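
    A possible setting is sketched below. It assumes globals.yml is used and that a CA file exists at the location these notes give as the previous default:

    ```yaml
    # Written as OS_CACERT into the generated admin-openrc.sh.
    # Unset by default; the path below is only an example.
    kolla_admin_openrc_cacert: "{{ node_config }}/certificates/ca/haproxy.crt"
    ```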

  • Glance deployment now uses Multi-Store support. Users that have default_stores in their service config overrides for glance-api.conf should remove it and use default_backend if needed.

  • The enable_cadf_notifications variable was removed. CADF is the default notification format in keystone. To enable keystone notifications, users can now set keystone_default_notifications_topic_enabled to yes or enable Ceilometer via enable_ceilometer.
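
    The two options named above might look like this in globals.yml (the file name is an assumption):

    ```yaml
    # Enable keystone notifications directly.
    keystone_default_notifications_topic_enabled: "yes"

    # Or enable Ceilometer instead, which the note lists as the alternative:
    # enable_ceilometer: "yes"
    ```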

  • Removes support for the enable_xtrabackup variable that was deprecated in favour of enable_mariabackup in the Train (9.0.0) release.

  • Support for deploying Ceph has been removed, after it was deprecated in Stein. Please use an external tool to deploy Ceph and integrate it with Kolla Ansible deployed OpenStack by following the external Ceph guide.

  • The octavia user is no longer given the admin role in the admin project. Octavia does not require this role and instead uses the octavia user with the admin role in the service project. During an upgrade the octavia user is removed from the admin project.

    For existing deployments this may cause problems, so an octavia_service_auth_project variable has been added, which may be set to admin to return to the previous behaviour.

    To switch an existing deployment from using the admin project to the service project, it will at least be necessary to create the required security group in the service project, and update octavia_amp_secgroup_list to this group’s ID. Ideally the Amphora flavor and network would also be recreated in the service project, although this does not appear to be necessary for operation, and will impact existing Amphorae.

    See bug 1873176 for details.
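
    The two paths described above can be sketched as follows, assuming globals.yml is used. The security group ID is a placeholder, not a real value:

    ```yaml
    # Option 1: return to the previous behaviour.
    octavia_service_auth_project: "admin"

    # Option 2: stay on the service project and point Octavia at a security
    # group created there (placeholder ID, replace with the real one).
    # octavia_amp_secgroup_list: "<security-group-id-in-service-project>"
    ```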

  • Support for configuration of Neutron related to integration with ONOS has been removed.

  • Support for deployment of OpenDaylight controller and configuration of Neutron related to integration with OpenDaylight have been removed.

  • Neutron Linux bridge and Open vSwitch Agents config has been split out into linuxbridge_agent.ini and openvswitch_agent.ini respectively. Please move your custom service config from ml2_conf.ini into those files.

  • The Monasca Log API has been removed. All logs now go to the unified Monasca API when Monasca is enabled. Any custom Fluentd configuration and inventory files will need to be updated. Any monasca_log_api containers will be removed automatically.

Deprecation Notes

  • Deprecates support for deploying with Hyper-V integrations. In Victoria support for these will be removed from Kolla Ansible.

    This is dictated by lack of interest and maintenance.

    See also the post to openstack-discuss

  • Deprecates support for deploying MongoDB. In Victoria support for deploying MongoDB will be removed from Kolla Ansible. Note CentOS 8 already lost support for MongoDB due to decisions made upstream.

    This affects Panko as it will no longer be possible to get automatic deployment of MongoDB database for it. However, the default, SQL, backend is and will be supported via MariaDB.

    MongoDB lost its position in OpenStack environment after controversial relicensing under their custom SSPL (Server Side Public License) which did not pass OSI (Open Source Initiative) validation.

  • The neutron-fwaas project was deprecated in the Neutron stadium and will be removed from stadium in the Wallaby cycle. The support for neutron-fwaas in the Neutron and Horizon roles is deprecated as of the Ussuri release and will be removed in the Wallaby cycle.

  • Deprecates support for deploying with VMware integrations. In Victoria support for these will be removed from Kolla Ansible.

    This is dictated by lack of interest and maintenance.

    See also the post to openstack-discuss

  • Deprecates support for deploying with XenAPI integrations. In Victoria support for these will be removed from Kolla Ansible.

    This is dictated by lack of interest and maintenance, and by the upstream decision by Nova to deprecate it (for the same reasons).

    See also the post to openstack-discuss and the Nova notice.

  • The congress project is no longer maintained. This has been retired since Victoria and has not been used by other OpenStack services since.

  • Customizing Neutron Linux bridge and Open vSwitch Agents config via ml2_conf.ini is deprecated. The config has been split out for these agents into linuxbridge_agent.ini and openvswitch_agent.ini respectively. In this release (Ussuri), custom service config overrides in ml2_conf.ini will still be used when merging configs, but that functionality will be removed in the Victoria release.

Security Issues

  • Fixes leak of RabbitMQ password into Ansible logs. LP#1865840

Bug Fixes

  • Fixes an issue where the cyborg conductor failed to communicate with placement. See bug 1873717.

  • Fixes an issue where the cyborg agent failed to start the privsep daemon, by adding the privileged capability for the cyborg agent. See bug 1873715.

  • Adds necessary region_name to octavia.conf when enable_barbican is set to true. LP#1867926

  • Adds /etc/timezone to Debian/Ubuntu containers. LP#1821592

  • Fixes an issue with Nova live migration not using migration_interface_address even when TLS was not used. When migrating an instance to a newly added compute host, if addressing depended on /etc/hosts and it had not been updated on the source compute host to include the new compute host, live migration would fail. This did not affect DNS-based name resolution. Similarly, Nova live migration would fail if the address in DNS or /etc/hosts was not the same as migration_interface_address due to user customization. LP#1729566

  • Fixes Kibana deployment with the new E*K stack (6+). LP#1799689

  • Reworks Keystone fernet bootstrap which had tendencies to fail on multinode setups. See bug 1846789 for details.

  • Fix prometheus-openstack-exporter to use CA certificate.

  • Changes Manila cephfs share driver to manila.share.drivers.cephfs.driver.CephFSDriver, as the old driver was deprecated.

  • External Ceph: the cinder keyring is now also copied to nova-compute. Since Train, nova-compute also needs the cinder key when the rbd user is set to Cinder, because volume/pool checks have been moved to use the rbd Python library. Fixes LP#1859408

  • Fix qemu loading of ceph.conf (permission error). LP#1861513

  • Remove /run bind mounts in Neutron services causing dbus host-level errors and add /run/netns for neutron-dhcp-agent and neutron-l3-agent. LP#1861792

  • Fixes an issue where old fluentd configuration files would persist in the container across restarts despite being removed from the node_custom_config directory. LP#1862211

  • Use more permissive regex to remove the offending 127.0.1.1 line from /etc/hosts. LP#1862739

  • Each Prometheus mysqld exporter now points to its local mysqld instance (MariaDB) instead of the VIP address. LP#1863041

  • Cinder Backup now has access to kernel modules, so that it can load e.g. the iscsi_tcp module. LP#1863094

  • Makes RabbitMQ hostname address resolution precheck stronger by requiring uniqueness of resolution to avoid later issues. LP#1863363

  • Fix protocol used by neutron-metadata-agent to connect to Nova metadata service. This possibly affected internal TLS setup. Fixes LP#1864615

  • Fixes haproxy role to avoid restarting haproxy service multiple times in a single Ansible run. LP#1864810 LP#1875228

  • Fixes an issue with deploying Grafana when using IPv6. LP#1866141

  • Fixes elasticsearch deployment in IPv6 environments. LP#1866727

  • Fixes failure to deploy telegraf with monitoring of zookeeper due to wrong variable being referenced. LP#1867179

  • Fixes deployment of fluentd without any enabled OpenStack services. LP#1867953

  • Fix missing glance_ca_certificates_file variable in glance.conf. LP#1869133

  • Adds missing vitrage-persistor service, required by Vitrage deployments for storing data. LP#1869319

  • Fixes designate-worker not to use etcd as its coordination backend because it is not supported by Designate (no group membership support available via tooz). LP#1872205

  • Fixes Octavia in internally-signed (e.g. self-signed) cert TLS deployments by providing path to CA cert file in proper config places. LP#1872404

  • Fixes source-IP-based load balancing for Horizon when using the “split” HAProxy service template.

  • Fixes issue where HAProxy would have no backend servers in its config files when using the “split” config template style.

  • Manage nova scheduler workers through openstack_service_workers variable. LP#1873753

  • Removes the chrony package and AppArmor profile from the Docker host if containerized chrony is enabled. LP#1882513

  • Adds missing “become: true” to some VMware-related tasks: Copying VMware vCenter CA file and Copying over nsx.ini.

  • Fixes a Nova deployment failure when using kolla_dev_mod.

  • Remove the meta field of the Swift rings from the default rsync_module template. Having it by default, undocumented, can lead to unexpected behavior when the Swift documentation states that this field is not processed.

  • Fixes the default CloudKitty configuration, which included the gnocchi_collector and keystone_fetcher options that were deprecated in Stein and removed in Train. See bug 1876985 for details.

  • When etcd is used with cinder_coordination_backend and/or designate_coordination_backend, the config has been changed to use the etcd3gw (aka etcd3+http) tooz coordination driver instead of etcd3 due to issues with the latter’s availability and stability. etcd3 does not work well with eventlet-based services, such as those of cinder and designate. See bugs 1852086 and 1854932 for details. See also tooz change introducing etcd3gw.

  • Adds configuration to set also_notifies within the pools.yaml file when using the Infoblox backend for Designate.

    Pushing a DNS NOTIFY packet to the master does not cause the DNS update to be propagated onto other nodes within the cluster. This means each node needs a DNS NOTIFY packet; otherwise users may be given a stale DNS record if they query any worker node. For details please see bug 1855085.

  • Fixes an issue with Docker client timeouts where Docker reports ‘Read timed out’. The client timeout may be configured via docker_client_timeout. The default timeout has been increased to 120 seconds. See bug for details.
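
    Where 120 seconds is still too short, the timeout could be raised further. A minimal sketch, assuming the value is in seconds and set in globals.yml (the number is an example only):

    ```yaml
    # Docker client timeout in seconds; the note gives 120 as the new default.
    docker_client_timeout: 300
    ```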

  • Fixes an issue with Cinder upgrades that would cause online schema migration to fail. LP#1880753

  • Fixes an issue where the cyborg api container failed to load the api paste file. For details please see bug 1874028.

  • Fix elasticsearch schema in fluentd when kolla_enable_tls_internal is true.

  • Fixes an issue where fernet_token_expiry would fail the pre-checks despite being set to a valid value. Please see bug 1856021 for more details.

  • Fixes an issue with HAProxy prechecks when scaling out using --limit or --serial. LP#1868986.

  • Fixes an issue with the HAProxy monitor VIP precheck when some instances of HAProxy are running and others are not. See bug 1866617.

  • The kolla_logs Docker volume is now mounted into the Elasticsearch container to expose logs which were previously written erroneously to the container filesystem. LP#1859162

  • Fixes MariaDB issues in multinode scenarios which affected deployment, reconfiguration, upgrade and Galera cluster resizing. They usually manifested as WSREP issues in various places and could require recovery of the Galera cluster. Note these issues were due to how MariaDB was handled during Kolla Ansible runs and did not affect the Galera cluster during normal operations unless MariaDB was later touched by Kolla Ansible. Users wishing to run actions on their Galera clusters using Kolla Ansible are strongly advised to update. For details please see the following Launchpad bug records: bug 1857908 and bug 1859145.

  • Fixes an issue with Nova when deploying new compute hosts using --limit. LP#1869371.

  • Adapts Octavia to the latest dual CA certificate configuration. The following files should exist in /etc/kolla/config/octavia/:

    • client.cert-and-key.pem

    • client_ca.cert.pem

    • server_ca.cert.pem

    • server_ca.key.pem

    See the Octavia documentation for details on generating these files.

  • Fixes an issue with RabbitMQ where tags would be removed from the openstack user after deploying Nova. This prevents the user from accessing the RabbitMQ management UI. LP#1875786

  • Fixes an issue where a failure in pulling an image could lead to a container being removed and not replaced. See bug 1852572 for details.

  • Since OpenStack services can now be configured to use TLS-enabled REST endpoints, URLs should be constructed using the {{ internal_protocol }} and {{ external_protocol }} configuration parameters.

  • Construct service REST API URLs using kolla_internal_fqdn instead of kolla_internal_vip_address. Otherwise SSL validation will fail when certificates are issued using domain names.
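
    The URL construction described in the two notes above might be sketched like this. The variable names example_internal_endpoint, example_public_endpoint and example_service_port are hypothetical, and kolla_external_fqdn is assumed to be the public counterpart of kolla_internal_fqdn:

    ```yaml
    # Hypothetical port variable for illustration.
    example_service_port: 8080

    # Internal endpoint: protocol from internal_protocol, host from the FQDN
    # rather than kolla_internal_vip_address, so certificate names match.
    example_internal_endpoint: "{{ internal_protocol }}://{{ kolla_internal_fqdn }}:{{ example_service_port }}"

    # Public endpoint built the same way with the external parameters.
    example_public_endpoint: "{{ external_protocol }}://{{ kolla_external_fqdn }}:{{ example_service_port }}"
    ```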

  • Fixes an issue with the kolla-ansible stop command where it may fail trying to stop non-existent containers. LP#1868596.

  • Fixes Swift volume mounting failing on kernel 4.19 and later due to removal of nobarrier from XFS mount options. See bug 1800132 for details.

  • Fixes an issue with fluentd parsing of WSGI logs for Aodh, Masakari, Qinling, Vitrage and Zun. See bug 1720371 for details.

  • Fixes gnocchi-api script name for Ubuntu/Debian binary deployments. LP#1861688

  • Fixes glance_api to run as privileged and adds missing mounts so it can use an iscsi cinder backend as its store. LP#1855695

  • When upgrading from Rocky to Stein HAProxy configuration moves from using a single configuration to assembling a file from snippets for each service. Applying the HAProxy tag to the entire play ensures that HAProxy configuration is generated for all services when the HAProxy tag is specified. For details please see bug 1855094.

  • Fixes an issue with the ironic_ipxe container serving instance images. See bug 1856194 for details.

  • Fixes an issue with Kibana deployment when openstack_cacert is unset. See bug 1864180 for details.

  • Fixes an issue with Monasca deployment where an invalid variable (monasca_log_dir) is referenced. See bug 1864181 for details.

  • Fixes an issue where host configuration tasks (sysctl, loading kernel modules) could be performed during the kolla-ansible genconfig command. See bug 1860161 for details.

  • Fixes an issue with port prechecks for the Placement service. See bug 1861189 for details.

  • Fixes templating of Prometheus configuration when Alertmanager is disabled. In a deployment where Prometheus is enabled and Alertmanager is disabled, templating the Prometheus configuration would fail because the variable prometheus_alert_rules does not contain the key files. LP#1854540

  • Removes the [http]/max-row-limit = 10000 setting from the default InfluxDB configuration, which resulted in the CloudKitty v1 API returning only 10000 dataframes when using InfluxDB as a storage backend. See bug 1862358 for details.

  • Skydive’s API and the web UI now rely on Keystone for authentication. Only users in the Keystone project defined by skydive_admin_tenant_name will be able to authenticate. See LP#1870903 (https://launchpad.net/bugs/1870903) for more details.

  • Fixes an issue where Elasticsearch API requests made during Kibana, Elasticsearch and Monasca deployment could have an invalid body. See bug 1864177 for details.

  • masakari-monitor will now use the internal API to reach masakari-api. LP#1858431

  • Switch endpoint_type from public to internal for octavia communicating with the barbican service. See bug 1875618 for details.