Current Series Release Notes

28.0.0-89

Prelude

The naming convention for containers in the Ansible inventory has changed to match the requirements defined by RFC1034. From now on, newly added containers will not contain the underscore (_) symbol in either their inventory_hostname or their container_name. This change does not alter containers that already exist in the inventory. To apply the new naming convention to an existing environment, you need to delete the containers on the host and from the inventory. After that, a new container name will be generated and the container can be created again.

RabbitMQ Quorum Queues are enabled by default, along with other significant improvements to oslo.messaging, such as the Queue Manager, Quorum usage for Transient Queues, and Stream Queues for fanout. More details on these options can be found in the oslo.messaging release notes.

Added support for deploying the Skyline dashboard.

New Features

  • RabbitMQ policies now support the apply_to parameter, e.g. to have a policy target only classic_queues, exchanges, …

  • Added new variable blazar_nova_aggregate_name that controls the Nova aggregate name for Blazar. When it is not False (default value ‘freepool’), the aggregate in question will be created during deployment.

  • Implemented variable blazar_manager_plugins that allows configuring the list of enabled plugins for Blazar.

  • Blazar now uses memcached for token caching. The list of memcached servers can be controlled using the blazar_memcached_servers variable.

  • Added extra variables to control Blazar integration with Nova:

    • nova_blazar_enabled (bool) - Control if Blazar integration should be configured

    • nova_blazar_scheduler_filters (list) - Extra filters that will be enabled in Nova scheduler

    • nova_blazar_pip_packages (list) - Extra Python packages that will be installed on Nova scheduler hosts
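
    As a rough illustration, the Blazar variables above could be combined in user_variables.yml as sketched below. The values are examples rather than role defaults: ‘freepool’ and virtual.floatingip.plugin come from these notes, while physical.host.plugin is an assumed plugin name that should be checked against the Blazar documentation.

    # Illustrative values only
    blazar_nova_aggregate_name: freepool
    blazar_manager_plugins:
      - physical.host.plugin        # assumed plugin name
      - virtual.floatingip.plugin
    nova_blazar_enabled: true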

  • The default Ceph release deployed in the openstack-ansible all-in-one is switched from Quincy to Reef. This integration remains primarily a test fixture, and the recommendation for production deployments is to deploy an independent Ceph cluster.

  • Added variable nova_cell_force_update that can be set to True during runtime to force update cell records. This might be useful in case of password rotation for DB users for cell0 or any other changes in connection configuration.

  • Added variable galera_install_method to control whether external repositories should be configured or the role should attempt to install from the distro default repositories instead.

  • Added property protection configuration, managed with new variables: glance_property_protection_file_overrides (must be set for the configuration to take effect; see Default variables for an example), glance_property_protection_rule_format and glance_property_protection_file.
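
    A possible shape for these overrides is sketched below. It assumes that the overrides mapping is rendered into Glance's property protections file, where each section name is a regular expression matching property names and each operation lists the permitted roles. The property pattern and role names are hypothetical; the role's Default variables remain the authoritative example.

    # Hypothetical rule: only admin may create, update or delete x_billing_code_* properties
    glance_property_protection_rule_format: roles
    glance_property_protection_file_overrides:
      "^x_billing_code_.*":
        create: admin
        read: admin,member
        update: admin
        delete: admin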

  • haproxy_pki_create_certificates was implemented. It allows users to explicitly disable certificate generation with the PKI role while still using the role for certificate distribution.

  • New variables are added to allow user-defined playbooks to be specified that run before and after the existing code in setup-hosts, setup-infrastructure and setup-openstack. OpenStack-Ansible deployments may be extended with additional user-defined functions using these hooks, which will have full access to the OSA inventory and host variables.

  • Horizon is now co-installable with the Skyline dashboard. In case both Horizon and Skyline are deployed, Horizon will be served under the /horizon URI.

  • Added variable ironic_user_driver_types that allows extending the default ironic_driver_types as well as overriding existing default records.

  • Added extra variables to Neutron role for Availability Zones configuration:

    • neutron_default_availability_zones - defines a list of AZs where l3/dhcp agents or OVN routers/ports will be scheduled when no hint is provided in a request to the Neutron API.

    • neutron_availability_zone - Availability Zone of the current component. It is recommended to leverage group/host_vars for setting this variable. For OVN this variable can contain multiple values separated by a colon.
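
    A minimal sketch of how the two variables might be split between global and per-host settings. The AZ names az1 and az2 are hypothetical.

    # user_variables.yml: AZs used when a request carries no AZ hint
    neutron_default_availability_zones:
      - az1
      - az2

    # group_vars or host_vars of the hosts located in az1
    neutron_availability_zone: az1
    # For OVN, several zones may be given separated by a colon, e.g. az1:az2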

  • Added openstack-resources.yml playbook that aims to handle creation and further adjustment of OpenStack resources. It relies heavily on the Ansible collection for OpenStack. The main goal of the playbook is to provide a unified method of creating and managing common resources, like images, flavors, compute aggregates, networks, etc. The playbook can consume the following variables, which are passed to the similar ones in the openstack.osa.openstack_resources role:

    • openstack_user_identity

    • openstack_user_compute

    • openstack_user_network

    • openstack_user_image

    • openstack_user_coe

    Please refer to the role documentation and examples for more details.

  • Added new variables to the os_neutron role that allow adjusting the connection to the OVS manager:

    • neutron_ovsdb_manager_host: defaults to 127.0.0.1

    • neutron_ovsdb_manager_port: defaults to 6640

    • neutron_ovsdb_manager_proto: defaults to tcp

    • neutron_ovsdb_manager_connection: Combines proto, host and port into a valid connection string for neutron plugins.
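
    The relationship between these variables can be sketched as follows. The first three values are the documented defaults; the composition of the connection string is an assumption based on the usual OVSDB proto:host:port format and may differ from the role's actual default.

    neutron_ovsdb_manager_host: 127.0.0.1
    neutron_ovsdb_manager_port: 6640
    neutron_ovsdb_manager_proto: tcp
    # Assumed composition; with the values above it renders as tcp:127.0.0.1:6640
    neutron_ovsdb_manager_connection: "{{ neutron_ovsdb_manager_proto }}:{{ neutron_ovsdb_manager_host }}:{{ neutron_ovsdb_manager_port }}"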

  • Implemented new variables to control new oslo.messaging behaviour:

    • oslomsg_rabbit_stream_fanout: Enabled if oslomsg_rabbit_quorum_queues is also set to True

    • oslomsg_rabbit_transient_quorum_queues: Enabled if oslomsg_rabbit_stream_fanout is True

    • oslomsg_rabbit_qos_prefetch_count: Must be set to a positive value if oslomsg_rabbit_stream_fanout is True

    • oslomsg_rabbit_queue_manager: Disabled by default. It is recommended to enable only for containerized deployments. Please check oslo.messaging bug report for more details of why it should not be used for metal deployments.

    Each service also has corresponding variables prefixed with the service name, like <service>_oslomsg_rabbit_stream_fanout, to control them separately.
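
    A sketch of a user_variables.yml fragment that satisfies the dependencies described above. The prefetch values are illustrative, and the nova_ prefixed variable is inferred from the <service>_ naming pattern rather than quoted from the role.

    oslomsg_rabbit_quorum_queues: true
    oslomsg_rabbit_stream_fanout: true
    oslomsg_rabbit_transient_quorum_queues: true
    oslomsg_rabbit_qos_prefetch_count: 10     # must be positive when stream fanout is enabled
    oslomsg_rabbit_queue_manager: false       # recommended only for containerized deployments
    # Per-service override, name inferred from the <service>_ pattern
    nova_oslomsg_rabbit_qos_prefetch_count: 20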

  • Added the ability to configure the logging options of rabbitmq-server with the rabbitmq_log variable using key-value pairs. The default values for journald (true) and file (false) are kept, but more options (see https://www.rabbitmq.com/logging.html) can now be configured.

  • OpenStack-Ansible can now be used to deploy Skyline, an alternative dashboard. New example files have been added to env.d and conf.d to support the Skyline infrastructure, and a playbook named os-skyline-install.yml has been added to deploy the API and console service.

  • A new override, skyline_client_max_body_size, has been introduced to support large image uploads via the Skyline dashboard. The default value of 1100M supports upstream Ubuntu and Rocky Linux images, but can be increased to support larger images or decreased to encourage the use of the CLI.
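
    For example, a deployment that needs to upload larger images through the dashboard might raise the limit in user_variables.yml. The value below is hypothetical and simply follows the size notation of the 1100M default.

    # Hypothetical larger limit for image uploads via Skyline
    skyline_client_max_body_size: 4096M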

  • Trove role introduced variables to independently configure RPC/Notification communication for Guest Agent:

    • trove_guest_oslomsg_rabbit_quorum_queues

    • trove_guest_oslomsg_rpc_port

    • trove_guest_oslomsg_rpc_userid

    • trove_guest_oslomsg_rpc_password

    • trove_guest_oslomsg_rpc_vhost

  • Implemented installation of extra Python packages inside the Ansible virtual environment. By default, extra requirements should be defined in the /etc/openstack_deploy/user-ansible-venv-requirements.txt file. The path to the requirements file can be overridden using the USER_ANSIBLE_REQUIREMENTS_FILE environment variable.

Known Issues

  • With recent changes to the config_template module, it is no longer possible to use variables as dictionary keys in overrides. The example below will not be rendered properly:

    config_overrides:
      "{{ inventory_hostname }}":
        cruel: world
    

    This limitation boils down to Ansible design and will be true for any other module as well. In order to overcome it, you can transform the dictionary to a Jinja2 format:

    config_overrides: |-
      {{
        {
          inventory_hostname: {
            'cruel': 'world'
          }
        }
      }}
    

Upgrade Notes

  • Additional variables are available when MariaDB is configured to use TLS, which is enabled by setting galera_use_ssl to true. galera_require_secure_transport requires that all client connections are encrypted and defaults to false. galera_tls_version provides a list of accepted TLS protocols and defaults to ‘TLSv1.2,TLSv1.3’.
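
    A minimal sketch of a stricter-than-default TLS setup in user_variables.yml. The values are illustrative and reuse the comma-separated format of the default shown above.

    galera_use_ssl: true
    galera_require_secure_transport: true   # reject unencrypted client connections
    galera_tls_version: "TLSv1.3"           # narrower than the ‘TLSv1.2,TLSv1.3’ default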

  • Floating IP plugin for Blazar (virtual.floatingip.plugin) is now enabled by default. Use blazar_manager_plugins variable to change the list of enabled plugins if needed.

  • Ensure that in openstack_user_config.yml / conf.d cloudkitty_hosts are replaced with rating_hosts. For deployments with LXC containers: after upgrade is completed make sure that Cloudkitty is not running on your LXC hosts anymore, after which you should be able to remove LXC hosts from cloudkitty_all group in inventory.
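
    The rename might look as follows in openstack_user_config.yml or a conf.d file. The host name and IP address are hypothetical placeholders.

    # Before the upgrade:
    # cloudkitty_hosts:
    #   infra1:
    #     ip: 172.29.236.11

    # After the upgrade:
    rating_hosts:
      infra1:
        ip: 172.29.236.11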

  • For deployments with nova_backend_ssl: True, TLS certificates for Nova API backends will be re-generated during upgrade. From now on they will be suffixed with _api.

  • Format of magnum_glance_images has changed to the one compatible with the openstack_resources role. Please refer to the os_magnum role documentation for a relevant example.

  • Services that were marked as Inactive (Murano, Senlin, Sahara) will not be upgraded, as they were not released for 2024.1 (Caracal). To keep managing these services on older versions, you will need to take the following actions:

    • Create file /etc/openstack_deploy/user-role-requirements.yml with the following content:

      - name: os_<service>
        scm: git
        src: https://opendev.org/openstack/openstack-ansible-os_<service>
        version: master
        trackbranch: master
      
    • Playbooks for managing services can be found inside role examples, e.g. /etc/ansible/roles/os_<service>/examples/playbook.yml

    • Services will remain in your previously generated inventory until you explicitly remove them.

  • With enablement of RabbitMQ Quorum Queues by default, all vhosts will be re-created and re-named without leading slash (‘/’). For instance, /nova vhost will be renamed to nova. This might affect configured monitoring alert rules or statistics gathering.

    The renaming process will also lead to prolonged downtime of services during upgrade, lasting from the vhost renaming until the service role execution completes. This might be especially important for services like Nova and Neutron, where the role runtime may take a while to cover all hosts.

    You can disable usage of Quorum Queues and use the previous default of HA Queues by defining oslomsg_rabbit_quorum_queues: False in your user_variables.yml.

    Please check the RabbitMQ maintenance documentation for more information on how to switch between these two modes with reduced downtime.

  • When using RabbitMQ in a high availability cluster (non-quorum queues), transient ‘reply_’ queues are now included in the HA policy where they previously were not. Note that this will increase the load on the RabbitMQ cluster, particularly for deployments with large numbers of compute nodes.

  • It is highly recommended to explicitly disable trove_guest_oslomsg_rabbit_quorum_queues during upgrade in case oslomsg_rabbit_quorum_queues: True, which is default behaviour since 2024.1 (Caracal). Migration to Quorum queues for Trove Guestagent is not supported and might be troublesome, as already spawned agents will not reload configuration. New deployments though may utilize quorum queues from the very beginning safely.
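
    In user_variables.yml this recommendation could be expressed as in the sketch below, using only the variables named in these notes.

    oslomsg_rabbit_quorum_queues: true                # default since 2024.1 (Caracal)
    trove_guest_oslomsg_rabbit_quorum_queues: false   # already spawned guest agents will not reload configuration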

Deprecation Notes

  • Format of client key inside ceph_extra_components variable has been deprecated in favor of a mapping with one required attribute name. Having client key defined as a simple list is kept for backwards compatibility but will be removed in future releases.
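
    The difference between the two formats of the client key is sketched below. The component and client names are hypothetical, the component key is an assumption about the surrounding structure, and other keys of the entry are omitted.

    ceph_extra_components:
      - component: example_service      # hypothetical entry, other keys omitted
        # Deprecated format, a simple list:
        # client:
        #   - example_client
        # New format, a mapping with the required name attribute:
        client:
          - name: example_client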

  • Variable heat_deferred_auth_method has been deprecated and has no effect. Default behaviour is to use trusts as deferred method, and that is the only working option as of today. Usage of password is broken with keystonev3 and should not be used.

  • The following roles are not going to be part of the 2024.1 release due to the services being transferred to the Inactive state:

    • Murano

    • Senlin

    • Sahara

    Playbooks for managing these services were also removed from the tree and can be found only under the examples folder of the specific role.

  • Variables controlling systemd-networkd default filename templating when one is not supplied have been deprecated and have no effect from now on.

    • systemd_networkd_filename

    • systemd_networkd_filename_alt

    It is highly recommended to provide filename parameter explicitly whenever you define systemd_netdevs or systemd_networks structures.
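
    A sketch of what explicit filenames might look like. The interface name and filename are hypothetical, and the layout of the entries (a NetDev mapping for netdevs, an interface key for networks) is an assumption about the role's structures that should be checked against the systemd_networkd role documentation.

    systemd_netdevs:
      - NetDev:
          Name: br-example
          Kind: bridge
        filename: 10-br-example      # explicit filename instead of the deprecated template
    systemd_networks:
      - interface: br-example
        filename: 10-br-example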

  • Generation of SSH keypairs for Ironic users has been deprecated and removed. A variable ironic_recreate_keys has been removed and has no effect.

  • Variable neutron_ovs_socket_path has been deprecated and will be silently ignored. Please use neutron_ovsdb_manager_connection in order to override connection to OVS.

Bug Fixes

  • Blazar endpoints are now versioned and suffixed with /v1 by default.

  • Blazar service authentication was fixed

  • Fixes the user-collection-requirements bootstrap process when a collection defined by the deployer uses “git+file” as a source scheme. Previously, an unexpected version of the collection could get installed when using the “git+file” scheme.

  • Backwards compatibility of client key inside ceph_extra_components variable has been fixed to support both a list and a list of mappings.

  • Due to an issue in the env.d definition for Cloudkitty, the service was installed not only inside LXC containers but also on all LXC hosts, which was not intended. This was fixed in the env.d definition for the service, and the group was renamed from cloudkitty_hosts to rating_hosts, which should be reflected in your openstack_user_config.yml or conf.d files.

  • Fixes the format of the ceph_conf_overrides_rgw variable by converting the override dictionary to Jinja2 format to work around the Ansible limitation on usage of variables as keys in a dictionary.

  • Explicitly adding localhost to the inventory could result in an FQDN change, because a record for localhost was added to the managed block inside the /etc/hosts file. This is now fixed, and the record for 127.0.0.1 will be removed from the Ansible-managed blocks inside the /etc/hosts file.

  • PKI role idempotence has been fixed for the metal scenario when nova-compute was placed on the same hosts as nova-api. Previously, certificates were re-generated each run due to non-unique names.

  • Change of horizon_webroot variable is now respected and will be reflected in Apache configuration to serve static files and define wsgi path accordingly.

  • The limitation on group naming in the physical_skel section of env.d files regarding usage of the underscore symbol has been lifted.

  • Multiple routes can be supplied to the systemd network and they will be placed in a separate configuration file /etc/systemd/network/{{ filename }}.d/routes.conf

    Previously, defining multiple routes would result in squashing them together under the same section name, while for them to work properly each described route must be placed in its own section.

  • Due to a missing parameter, Nova cell0 used to be configured to not use TLS for MySQL communication even when nova_galera_use_ssl was explicitly enabled. It is fixed now and cell0 should be updated on the next playbook run.

Other Notes

  • Glance uses uWSGI when Ceph is used as storage.

  • Tags matching service names, for example ‘nova’, ‘neutron’, ‘cinder’, etc., have been removed from os-<service>-install.yml playbooks. These tags were only useful in conjunction with setup-openstack.yml, but were found to have unexpected behaviour, potentially causing important tasks to be skipped.

  • When Skyline is deployed with the built-in HAProxy server it will, by default, listen on port 80 when ssl is disabled and port 443 when ssl is enabled. The Skyline backend, in its turn, will listen on port 9999.

    When Skyline is deployed together with Horizon, Skyline will take precedence by serving on port 80/443. Meanwhile, Horizon will be available in the “subdirectory” /horizon.