Train Series (12.2.0 - 13.0.x) Release Notes

13.0.7-30

Known Issues

  • TinyCore Linux 10.x, which powers the TinyIPA ramdisk in the Ussuri and Train releases of OpenStack, is no longer able to be built due to certificate verification issues as time moves forward. We have embedded a fix for use by Ironic’s CI only. As a general reminder, TinyIPA should not be used in production deployments.

Upgrade Notes

  • Operators upgrading from earlier versions using PXE should explicitly set the [pxe]ipxe_bootfile_name, [pxe]uefi_ipxe_bootfile_name, and possibly [pxe]ipxe_bootfile_name_by_arch settings, as well as an iPXE-specific [pxe]ipxe_config_template override, if required.

    Setting the [pxe]ipxe_config_template to no value will result in the [pxe]pxe_config_template being used. The default value points to the supplied standard iPXE template, so only highly customized operators may have to tune this setting.

  • Automated detection of an IPMI BMC hardware vendor has been added to appropriately handle IPMI BMC variations. Ironic will now query the vendor and save the value, if not already set, in order to avoid querying for every single operation. Operators upgrading should expect an elongated first power state synchronization for nodes with the ipmi hardware type.

  • On the Train release, to use a certificate file for HTTPS connections, the iRMC driver requires python-scciclient version >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0 and packaging >=16.5.

  • Operators may need to check their /httpboot/redfish folder permissions if using redfish-virtual-media. The conductor was previously creating the folder with incorrect permissions.

  • A permission setting has been added for the redfish-virtual-media boot interface, which allows for explicit file permission setting when the driver is being used. The default for the new [redfish]file_permission setting is 0o644, or 644 if manually changed using chmod on the command line. Operators MAY need to adjust this if they were running the conductor with a specific umask to work around the permission setting defect.
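
As an illustration of why an explicit mode is more predictable than relying on the process umask, the following Python sketch writes a file under a restrictive umask and then applies an explicit mode. The function names, the file name, and the constant are hypothetical and are not taken from Ironic; the constant is only assumed to mirror the documented default of 644 (octal).

```python
import os
import stat
import tempfile

# Assumption: mirrors the documented [redfish]file_permission default.
FILE_PERMISSION = 0o644


def write_with_explicit_mode(path, data, mode=FILE_PERMISSION):
    """Write data, then set the mode explicitly so the umask is irrelevant."""
    with open(path, "wb") as f:
        f.write(data)
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


def demo():
    """Create a file under a restrictive umask and return its final mode."""
    old_umask = os.umask(0o077)  # would otherwise yield 0o600 on creation
    try:
        with tempfile.TemporaryDirectory() as tmp:
            return write_with_explicit_mode(os.path.join(tmp, "boot.iso"), b"x")
    finally:
        os.umask(old_umask)
```

Note that the octal literal 0o644 and the decimal integer 644 are different numbers in Python, which is one reason the octal form matters when the value is read as an integer.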

Security Issues

  • Modifies the irmc hardware type to include a capability to control enforcement of HTTPS certificate verification. By default this is enforced. The python-scciclient version must be >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0, or certificate verification will not occur.
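
The version requirement above is a union of two half-open ranges. The following Python sketch shows one way an operator might check an installed python-scciclient version string against those ranges using only the standard library. The function names are hypothetical, and the parser assumes plain dotted numeric versions (no pre-release suffixes).

```python
# Ranges taken from the note above: >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0
SUPPORTED_RANGES = [
    ((0, 8, 2), (0, 9, 0)),
    ((0, 9, 5), (0, 10, 0)),
]


def parse_version(version):
    """Turn '0.9.5' into (0, 9, 5); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def supports_cert_verification(version):
    """Return True if the version falls inside either supported range."""
    parsed = parse_version(version)
    return any(low <= parsed < high for low, high in SUPPORTED_RANGES)
```

For example, 0.9.4 falls in the gap between the two ranges and would not qualify, while 0.9.5 would.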

Bug Fixes

  • Fixes Ironic integration with Cinder because of changes which resulted as part of the recent Security related fix in bug 2004555. The work in Ironic to track this fix was logged in bug 2019892. Ironic now sends a service token to Cinder, which allows for access restrictions added as part of the original CVE-2023-2088 fix to be appropriately bypassed. Ironic was not vulnerable, but the restrictions added as a result did impact Ironic’s usage. This is because Ironic volume attachments are not on a shared “compute node”, but instead mapped to the physical machines and Ironic handles the attachment life-cycle after initial attachment.

  • Addresses the lack of an ability to explicitly set different bootloaders for iPXE and PXE based boot operations via their respective ipxe and pxe boot interfaces.

  • Fixes a problem with the grub2 config file. Some higher versions of grub2 (e.g. 2.05 or 2.06-rc1) use grub.cfg-01-MAC, while other lower versions of grub2 (e.g. 2.04) use MAC.conf, so we generate both paths in order to be compatible with both.

  • Fixes the idrac-wsman management interface set_boot_device method, which would fail deployment when there were existing jobs present, with the error “Failed to change power state to ‘’power on’’ by ‘’rebooting’’. Error: DRAC operation failed. Reason: Unfinished config jobs found: <list of existing jobs>. Make sure they are completed before retrying.”. Now there can be non-BIOS jobs present during deployment. This will still fail for cases when there are BIOS jobs present. In such cases, operators should consider moving to idrac-redfish, which does not have this limitation when setting the boot device.

  • Fixes issues when UEFI boot mode has been requested with persistent boot to DISK where some versions of ipmitool do not properly handle multiple options being set at the same time. While some of this logic was addressed in upstream ipmitool development, new versions are not released and vendors maintain downstream forks of the ipmitool utility. When considering vendor specific selector differences along with the current stance of new versions from the upstream ipmitool community, it only made sense to handle this logic within Ironic. In part this was because, if already set, the selector value would not be updated. Now ironic always transmits the selector value for UEFI.

  • Fixes handling of Supermicro UEFI supporting BMCs with the ipmi hardware type such that an appropriate boot device selector value is sent to the remote BMC to indicate boot from local storage. This is available for both persistent and one-time boot applications. For more information, please consult story 2008241.

  • Fixes handling of the ipmi hardware type where UEFI boot mode and “one-time” boot to PXE has been requested. As Ironic now specifically transmits the raw commands, this setting should be properly applied, where previously PXE boot operations may have occurred in Legacy BIOS mode.

  • Fixes virtual disk creation by changing the PERC H740P controller mode from Enhanced HBA to RAID in the delete_configuration clean step. PERC H740P controllers support both RAID mode and Enhanced HBA mode. When the controller is in Enhanced HBA mode, it creates single-disk RAID0 virtual disks from NON-RAID physical disks. Hence a request for VD creation with a supported RAID level fails because no physical disk is available. This patch converts PERC H740P RAID controllers to RAID mode if Enhanced HBA mode is found enabled. See bug 2007711 for more details.

  • Fixes the idrac-wsman power interface to wait for the hardware to reach the target state before returning. For systems where soft power off at the end of deployment to boot to instance failed and forced hard power off was used, this left the node successfully deployed but in the off state without any errors. This broke other workflows expecting the node to be powered on and booted into the OS at the end of deployment. Additional information can be found in story 2009204.

  • Adds the driver_info/irmc_verify_ca option to specify a certificate file. The default value of driver_info/irmc_verify_ca is False.

  • Adds handling of Redfish BMCs which lack a BootSourceOverrideMode flag, such that it is no longer a fatal error for a deployment if the BMC does not support this field. This is most common on BMCs which feature only a partial implementation of the ComputerSystem resource boot, but may also be observable on some older generations of BMCs which received updates to have partial Redfish support.

  • Fixes connection caching issues with Redfish BMCs where AccessErrors were previously not disqualifying the cached connection from being re-used. Ironic will now explicitly open a new connection instead of using the previous connection in the cache. Under normal circumstances, the sushy redfish library would detect and refresh sessions, however a prior case exists where it may not detect a failure and contain cached session credential data which is ultimately invalid, blocking future access to the BMC via Redfish until the cache entry expired or the ironic-conductor service was restarted. For more information please see story 2009719.

  • Fixes the redfish-virtual-media and related drivers to utilize an explicit file permission instead of relying upon the ironic-conductor umask, which may be incorrect. This can be tuned with the [redfish]file_permission setting.

  • Fixes an issue where the /httpboot/redfish folder used by the redfish-virtual-media driver was being created with incorrect permissions.

  • The fix for story 2008252 synced the boot mode after changing the boot device because Supermicro nodes reset the boot mode if not included in the boot device set. However this can cause a problem on Dell nodes when changing the mode uefi->bios or bios->uefi, see story 2008712 for details. Restrict the syncing of the boot mode to Supermicro.

  • When Ironic configures the BootSourceOverrideTarget setting via Redfish, on Supermicro BMCs it must always configure BootSourceOverrideEnabled or that will revert to default (Once) on the BMC, see story 2008547 for details. This is different than what is currently implemented for other BMCs in which the BootSourceOverrideEnabled is not configured if it matches the current setting (see story 2007355).

    This requires that node.properties[‘vendor’] be ‘supermicro’ which will be set by Ironic from the Redfish system response or can be set manually.

  • Introduces lazy-loading of ports, portgroups, volume connections and volume targets in task manager to fix performance issues. For periodic tasks which create a task manager object but don’t require the aforementioned data (e.g. power sync), this change should reduce the number of database interactions by around two thirds, speeding up overall execution.

  • Fixes an issue of powering off with the idrac-wsman management interface while the execution of a clear job queue cleaning step is proceeding. Prior to this fix, the clean step would fail when powering off a node.
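
The lazy-loading change described above can be pictured with a small Python sketch. The classes below are hypothetical stand-ins, not Ironic's task manager or database API; they only show how deferring a query until first access avoids work for callers that never touch the data.

```python
class FakeDB:
    """Hypothetical database stub that counts how many queries are issued."""

    def __init__(self):
        self.queries = 0

    def get_node(self, node_id):
        self.queries += 1
        return {"id": node_id}

    def get_ports(self, node_id):
        self.queries += 1
        return ["port-a", "port-b"]


class LazyTask:
    """Loads the node eagerly but defers ports until they are first read."""

    def __init__(self, db, node_id):
        self._db = db
        self.node = db.get_node(node_id)
        self._ports = None  # not loaded yet

    @property
    def ports(self):
        if self._ports is None:
            self._ports = self._db.get_ports(self.node["id"])
        return self._ports
```

A periodic task that only inspects the node (for example a power sync) would issue one query here, while repeated reads of `ports` issue only one additional query in total.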

Other Notes

  • The [conductor]power_state_change_timeout default value has been extended to 60 seconds from 30 seconds. This is because some API interfaces with Redfish may cache the power state and thus may take longer than thirty seconds to update after a change has been requested. Please see here for more information.

  • Adds a detect_vendor management interface method to the ipmi hardware type. This method is being promoted as a higher level interface, as the fundamental need for logic to be aware of the hardware vendor arises with vendor agnostic drivers where slight differences require slightly different behavior.
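
To see why a longer power state change timeout helps with BMCs that report a cached power state, consider the following Python sketch of a polling loop. It is a generic illustration, not Ironic's implementation; the function names and the 45 second staleness figure are assumptions chosen for the example.

```python
import time


def wait_for_power_state(get_state, target, timeout=60, interval=1,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns target or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if get_state() == target:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def simulate(timeout, ready_after=45):
    """Run the loop against a fake clock; no real sleeping happens.

    The fake BMC keeps reporting 'power off' until ready_after seconds
    have passed, imitating a cached power state.
    """
    now = [0]

    def clock():
        return now[0]

    def sleep(seconds):
        now[0] += seconds

    def get_state():
        return "power on" if now[0] >= ready_after else "power off"

    return wait_for_power_state(get_state, "power on", timeout=timeout,
                                clock=clock, sleep=sleep)
```

With a simulated BMC that takes 45 seconds to refresh, a 30 second timeout gives up while a 60 second timeout succeeds.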

13.0.7

Known Issues

  • Some ipmitool builds, in particular on machines running Red Hat Enterprise Linux 8.2, have changed the default cipher suite being offered which can cause ipmitool to completely fail to negotiate a connection with the BMC. Operators who encounter this situation should use the ipmi_cipher_suite parameter in the driver_info field to override and directly assert the required cipher. Because of potential security implications of attempting second level auto-negotiation and known BMC vendor behaviors, this must be identified by the operator and explicitly set as logic to attempt to navigate through situations like this may have undesirable results.

Bug Fixes

  • Fixes an issue with the ansible deployment interface where automatic root device selection would accidentally choose the system CD-ROM device, which was likely to occur when the ansible deployment interface was used with virtual media boot. The ansible deployment interface now ignores all Ramdisks, Loopbacks, CD-ROMs, and floppy disk devices.

  • Fixes a bug in the idrac hardware type where, when creating one or more virtual disks on a RAID controller that supports passthru mode (PERC H730P), the cleaning step would finish before the job to create the virtual disks actually completed. This could result in the client attempting to perform another action against the iDRAC that creates a configuration job, and that action would fail since the job to create the virtual disk would still be executing. This patch fixes this issue by only allowing the cleaning step to finish after the job to create the virtual disk completes. See bug 2007285 for more details.

  • Certain RAID controllers (PERC H730P) require physical disks to be switched from non-RAID (JBOD) mode to RAID mode to be included in a virtual disk. When this conversion happens, the available free space on the physical disk is reduced due to some space being allocated to RAID mode housekeeping. If the user requests a virtual disk (a RAID 1 for example) with a size close to the max size of the physical disks when they are in JBOD mode, then creation of the virtual disk following conversion of the physical disks from JBOD to RAID mode will fail since there is not enough space due to the space used by RAID mode housekeeping. This patch works around this issue by recalculating the RAID volume size after physical disk conversion has completed and the free space on the converted drives is known. Note that this may result in a virtual disk that is slightly smaller than the requested size, but still the max size that the drives can support. See bug 2007359 for more details.

  • Fixes a potential race in the hash ring code that could result in the hash rings never being updated after their initial load.

  • Fixes the deprecated idrac hardware interface implementation __init__ methods to call their base class __init__ methods before emitting a log message warning about their deprecation. For more information, see story 2008197.

  • Allows configuring IPMI cipher suite via the new driver_info parameter ipmi_cipher_suite to enable operators to navigate ipmitool behavior changes around supported ciphers.

  • After changing the boot device via Redfish, check that the boot mode being reported matches what is configured and, if not, set it to the configured value. Some BMCs change the boot mode when the device is set via Redfish, see story 2008252 for details.
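
The boot mode check described in the last item can be sketched in Python as follows. The FakeBMC class is a hypothetical stand-in that imitates firmware which resets the boot mode whenever the boot device is set; none of these names come from Ironic or the sushy library.

```python
class FakeBMC:
    """Hypothetical BMC that resets its boot mode when the device is set."""

    def __init__(self):
        self.boot_mode = "uefi"
        self.boot_device = None

    def set_boot_device(self, device):
        self.boot_device = device
        self.boot_mode = "bios"  # simulated firmware quirk

    def set_boot_mode(self, mode):
        self.boot_mode = mode


def set_boot_device_and_sync_mode(bmc, device, configured_mode):
    """Set the boot device, then restore the boot mode if the BMC changed it.

    Returns True if a corrective boot mode update was needed.
    """
    bmc.set_boot_device(device)
    if bmc.boot_mode != configured_mode:
        bmc.set_boot_mode(configured_mode)
        return True
    return False
```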

13.0.6

New Features

  • Adds a new [ipmi]use_ipmitool_retries option. When set to True and timing is supported by ipmitool, the number of retries and command interval will be passed to ipmitool so that ipmitool will do the retries. When set to False, ironic will do the retries. Default is True.
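
As a rough illustration of the decision this option controls, the Python sketch below builds the retry-related ipmitool arguments. The function name is hypothetical. The -R (retry count) and -N (retransmission interval in seconds) flags are ipmitool's lan/lanplus options; returning no flags when ironic handles the retries is an assumption made for the sketch, not a description of what Ironic actually passes in that case.

```python
def build_ipmitool_retry_args(use_ipmitool_retries, timing_supported,
                              retries, interval):
    """Return ipmitool retry flags, or nothing if the caller retries itself."""
    if use_ipmitool_retries and timing_supported:
        # Delegate retries to ipmitool.
        return ["-R", str(retries), "-N", str(interval)]
    # Assumption: the caller (ironic) loops and retries on its own.
    return []
```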

Known Issues

  • Some BMCs do not support the Channel Cipher Suites command that newer versions of ipmitool use. These versions of ipmitool will resend this command for each ipmitool retry, resulting in long response times. Setting [ipmi]use_ipmitool_retries to false will avoid this situation by implementing retries on the ironic level.

Bug Fixes

  • Allows deleting nodes with a broken driver unless they require stopping serial console.

  • Fixes json_rpc client connections always using HTTP even if use_ssl was set to True.

  • Hardware type idrac converts physical drives from JBOD to RAID mode before building RAID on them.

  • Hardware type idrac converts physical drives from RAID to JBOD mode after RAID delete_configuration cleaning step through raid interface. This ensures that the individual disks freed by deleting the virtual disks are visible to the OS.

  • When Ironic is doing IPMI retries the configured min_command_interval should be used instead of a default value of 1, which may be too short for some BMCs.

  • Fixes an issue where ironic-conductor initialization could return a NodeNotLocked error for requests requiring locks when the conductor was starting. This was due to the conductor removing locks after beginning accepting new work. The lock removal has been moved to after the Database connectivity has been established but before the RPC bus is initialized.

13.0.5

Bug Fixes

  • Fixes a rare issue when agent successfully powers off a node after deployment, but ironic never learns about it and does another reboot.

  • Cleans up nodes stuck in the deleting state on conductor restart.

  • Fixes fast-track deployments with the direct deploy interface that used to hang previously.

  • Fixes a workaround for hardware that does not support persistent boot device setting with the redfish or idrac-redfish management interface implementation. When such a situation is detected, ironic falls back to one-time boot device setting, restoring it on every reboot or power on.

    For more information, see story 2007733.

  • No longer tries to set local_gb to MAX when building RAID with the root disk using MAX for its size.

13.0.4

New Features

  • For baremetal operations on DHCPv6-stateful networks multiple IPv6 addresses can now be allocated for neutron ports created for provisioning, cleaning, rescue or inspection. The new parameter [neutron]/dhcpv6_stateful_address_count controls the number of addresses to allocate (Default: 1).

  • The management interface of the redfish hardware type no longer changes the boot frequency if the currently set value is the same as the desired one. The goal is to avoid touching a potentially faulty BMC option whenever possible.

  • Adds a new [ipmi]debug option that allows users to explicitly turn IPMI command debugging on, as opposed to relying upon the system debug setting [DEFAULT]debug. Users wishing to continue to log this output should set [ipmi]debug to True in their ironic.conf.

Known Issues

  • Some redfish-enabled hardware is known not to support persistent boot device setting that is used by the Bare Metal service for deployed instances. The redfish hardware type tries to work around this problem, but rebooting such an instance in-band may cause it to boot incorrectly. A predictable boot order should be configured in the node’s boot firmware to avoid issues and at least metadata cleaning must be enabled. See this mailing list thread for technical details.

Upgrade Notes

  • The minimum version of Ansible for use with the ansible deploy_interface is now 2.5.

  • The version of sushy can now be updated to 3.2.0 or later to address issues with managing the persistent boot mode setting with Redfish Baseboard Management Controllers.

  • If [DEFAULT]force_raw_images is set to true, then MD5 will not be utilized to recalculate the image checksum. This requires the ironic-python-agent ramdisk to be at least version 3.4.0.

  • Debug logging control has been moved to the [ipmi]debug configuration setting as opposed to the “conductor” [DEFAULT]debug setting, as the existing ipmitool output can be extremely misleading for users. Operators who wish to continue to log ipmitool verbose output in their logs should explicitly set the [ipmi]debug option to True.

Security Issues

  • When images are forced to raw images, the image checksum is now recalculated using SHA3-256 if MD5 was selected. This is now unconditional.
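
The substitution described above can be sketched with Python's standard hashlib module. The function name and the way the algorithm is passed in are hypothetical; the sketch only shows replacing a requested MD5 with SHA3-256 when a checksum is recalculated.

```python
import hashlib


def recalculated_checksum(data, requested_algo):
    """Recompute a checksum, substituting SHA3-256 when MD5 was requested.

    Returns a (algorithm_name, hex_digest) tuple.
    """
    algo = "sha3_256" if requested_algo.lower() == "md5" else requested_algo
    return algo, hashlib.new(algo, data).hexdigest()
```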

Bug Fixes

  • Fixes an issue with the agent client code where checks of the agent command status had no logic to prevent an intermittent or transient connection failure from causing the entire operation to fail.

  • Fixes the default disk detection in the ansible deploy interface with Python 3. Previously a random disk was used, resulting in boot failures; now the first disk is used (as intended). This required bumping the minimum required Ansible version to 2.5.

  • Fixes ‘Invalid parameter value for SpanLength’ when configuring RAID using Python 3. This passed incorrect data type to iDRAC, e.g., instead of 2 it passed 2.0. See story 2004265.

  • Fixes RAID configuration using idrac-wsman RAID interface where node remains in ‘clean wait’ provisioning state forever. See story 2007567.

  • Fixes an issue where a node may be locked from changes if a conductor’s hostname case is changed before restarting the conductor service.

  • Improves interoperability with Redfish BMCs by untying node boot mode change from other boot parameters change (such as boot device, boot frequency). This fix requires a newer version of the sushy library, version 3.2.0.

  • Fixes vague node last_error field reporting upon deploy step failure by providing the exception error message in addition to the step that failed.

  • The ‘no address available’ problem seen when network booting on DHCPv6-stateful networks is fixed with the support for allocating multiple IPv6 addresses. See bug: 1861032.

  • Fixes an agent command issue in the bootloader installation process that can present itself as a connection timeout under heavy IO load conditions. Now installation commands have an internal timeout which is double the conductor wide [agent]command_timeout. For more information, see bug 2007483.

  • Fixes a bug where rebooting a node managed by the idrac hardware type when using the WS-MAN power interface sometimes fails with a “The command failed to set RequestedState” error. See bug 2007487 for details.

  • To provide a workaround for incorrect boot order problems on some hardware, the redfish hardware type now supports the noop management interface, similarly to IPMI and SNMP.

  • Rebooting a node with the redfish power interface is now implemented via a power off request followed by power on to avoid returning success when a node stays powered on after the reboot request.

  • Provides a workaround for hardware that does not support persistent boot device setting with the redfish hardware type. When such a situation is detected, ironic will fall back to one-time boot device setting, restoring it on every reboot.
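
The SpanLength fix above concerns a value being passed as 2.0 instead of 2 under Python 3. A plausible cause, shown here only as an illustrative sketch and not as the actual Ironic or python-dracclient code, is that the / operator performs true division in Python 3 and returns a float even when the result is whole. The function name below is hypothetical.

```python
def span_length(num_disks, span_depth):
    """Number of disks per span, as an integer.

    In Python 3, num_disks / span_depth would yield a float such as 2.0;
    floor division keeps the result an int such as 2.
    """
    return num_disks // span_depth
```

For four disks and a span depth of two, `str(4 / 2)` is "2.0" while `str(span_length(4, 2))` is "2".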

Other Notes

  • Ramdisk logs are now collected during cleaning the same way as during deployment.

13.0.3

Security Issues

  • Prevents additional updates of an agent callback_url through the agent heartbeat /v1/heartbeat/<node_uuid> endpoint, as the callback_url should remain stable through the cleaning, provisioning, or rescue processes. Should anything such as an unexpected agent reboot cause the callback_url to change, heartbeat operations will now be ignored. More information can be found at story 2006773.
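
A minimal Python sketch of the rule above, assuming the node is represented as a plain dictionary and the first callback URL is stored under an agent_url key. The function name and data layout are hypothetical and do not reflect Ironic's API code.

```python
def accept_heartbeat(node, callback_url):
    """Record the first callback URL; ignore heartbeats that try to change it.

    Returns True if the heartbeat is accepted, False if it is ignored.
    """
    known = node.get("agent_url")
    if known is None:
        node["agent_url"] = callback_url
        return True
    return known == callback_url
```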

Bug Fixes

  • Proper flags are now passed during clean up of iPXE boot environments, so that nothing is left behind after node tear down.

  • Use SHA256 for comparing file contents instead of MD5. This improves FIPS compatibility.

  • Corrects logic in the entry path of node cleaning and deployment processes to prohibit agent_url from being preemptively removed if fast_track is enabled and in use. This allows fast track cleaning and deployment operations to succeed.

  • Fixes an issue where, when the ipxe interface is in use with [pxe]ipxe_enabled set to false, the PXE configuration is not handled properly, which prevents the machine from performing a successful iPXE boot.

  • Fixes the path used for the virtual media ISO when it is served over a local HTTP server ([redfish]use_swift=false).

  • Fixes an issue with fast track where a recent security related change, made to prevent the agent_url field from being updated in a node, functionally prevented fast_track from succeeding, as the node would fail with an exception indicating the agent_url could not be found. The required agent_url value is now preserved when the fast track feature is enabled as the running ramdisk is not shut down.

  • Adds a timeout when querying the agent for command status. Without it, a node can lock up for quite a long time, during which ironic will not allow any operations to be performed on it.

  • When installing a whole disk image using iscsi, set up the bootloader even if a root partition can not be found. The bootloaders will be located on the disk.

13.0.2

Security Issues

  • Node secrets (such as BMC credentials) are no longer logged when JSON RPC is used and DEBUG logging is enabled.

Bug Fixes

  • Fixes a bug in the idrac hardware type where a race condition can occur on a host that has a mix of controllers where some support realtime mode and some do not. The approach is to use realtime mode only if all controllers support realtime. This removes the race condition. See bug 2006502 https://storyboard.openstack.org/#!/story/2006502 for details.

  • Fixes an issue where the resource list API returned results with the requested fields only until the API MAX_LIMIT. After the API MAX_LIMIT was reached, the API started ignoring user requested fields. This fix makes sure that the next url generated by the pagination code includes the user requested fields as a query parameter.

  • Fixes drive sensor information collection in the redfish management interface. Prior to this fix, the wrong Redfish schema was used for the Drive resource, which caused an exception and ultimately a sensor data collection failure.

  • Fixes a possible console lockup issue in the case where the PID file has not yet been created although the daemon start call has already returned a success return code.

  • Fixes a bug in the idrac hardware type where, when executing the clear_job_queue clean step, pending non-BIOS config jobs (e.g. create/delete virtual disk) were not being deleted before job execution.

    See bug 2006580 https://storyboard.openstack.org/#!/story/2006580 for details.

  • Fixes a bug with the grub ramdisk boot template handling, such that the template now properly references the user provided kernel and ramdisk. Previously the deployment ramdisk and kernel were referenced in the template.
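
The pagination fix in this list (carrying the requested fields into the next url) can be sketched in Python with the standard library. The function name, the parameter names, and the example host are hypothetical and only illustrate the idea; they are not Ironic's API code.

```python
from urllib.parse import urlencode


def next_link(base_url, resource, marker, limit, fields=None):
    """Build a 'next' pagination link that preserves the requested fields."""
    params = {"limit": limit, "marker": marker}
    if fields:
        # Keep the user's field selection on every subsequent page.
        params["fields"] = ",".join(fields)
    return "%s/%s?%s" % (base_url, resource, urlencode(params))
```

Without the `fields` entry, a client following the link would receive full resource representations from the second page onward, which is the behaviour the fix addresses.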

13.0.1

Bug Fixes

  • Fixes a bug in the idrac hardware type where configuration job for RAID delete_configuration cleaning step gets created even when there are no virtual disks or hotspares/dedicated hotspares present on any controller. See bug 2006562 https://storyboard.openstack.org/#!/story/2006562 for details.

13.0.0

Prelude

“Choooooo! Choooooo!” The Train is now departing the station. The OpenStack Bare Metal as a service team is proud to announce the release of Ironic 13.0.0. This release brings the long desired feature of software RAID configuration, Redfish virtual media boot support, sensor data improvements, and numerous bug fixes. We hope you enjoy your ride on the OpenStack Ironic Train.

New Features

  • Adds support for deploy steps to the idrac-wsman raid interface. The methods apply_configuration and delete_configuration can be used as deploy steps.

  • Adds a new delete_existing argument to the create_configuration clean step on the idrac-wsman raid interface which can be used to delete existing virtual disks. The default for this argument is False.

  • Adds support for deploy steps to bios interface of ilo hardware type. The methods factory_reset and apply_configuration can be used as deploy steps.

  • Adds support for deploy steps to the management interface of the ilo hardware type. The methods reset_ilo, reset_ilo_credential, reset_bios_to_default, reset_secure_boot_keys_to_default, clear_secure_boot_keys and update_firmware can be used as deploy steps.

  • Adds support for deploy steps to raid interface of ilo5 hardware type. The methods apply_configuration and delete_configuration can be used as deploy steps.

  • Adds support for deploy steps to bios interface of redfish hardware type. The methods factory_reset and apply_configuration can be used as deploy steps.

  • Adds virtual media boot interface to redfish hardware type supporting virtual media boot. The redfish-virtual-media boot interface operates on the same kernel/ramdisk as, for example, PXE boot interface does, however redfish-virtual-media boot interface can additionally require EFI system partition image (ESP) when performing UEFI boot. Either the [conductor]bootloader configuration option or the [driver_info]/bootloader node attribute can be used to convey ESP location to ironic. Bootable ISO images can be served to BMCs either from Swift or from an HTTP server running on an ironic conductor machine. This is controlled by the [redfish]use_swift ironic configuration option.

  • Adds sensor data collector to redfish management interface. Temperature, power, cooling and drive health metrics are collected.

  • Add target_raid_config data to ironic variable under raid_config top-level key which will expose the RAID configuration to the ansible driver. See story 2006417 for details.

  • Adds a clear_job_queue cleaning step to the idrac-wsman management interface. The clear_job_queue cleaning step clears the Lifecycle Controller job queue including any pending jobs.

  • Adds an ilo-ipxe boot interface to ilo hardware type which allows for instance level iPXE enablement as opposed to conductor-wide enablement of iPXE. To perform iPXE boot with ilo-ipxe boot interface:

  • Adds power state change callbacks of an instance to the Compute service by performing API notifications. This feature is enabled by default and can be disabled via the new [nova]send_power_notifications configuration option.

    Whenever there is a change in the power state of a physical instance, the Bare Metal service will send a power-update external event to the Compute service which will cause the power state of the instance to be updated in the Compute database. It also adds the possibility of bringing up/down a physical instance through the Bare Metal service API even if it was put down/up through the Compute service API.

  • The deploy and/or rescue kernel and ramdisk can now be configured via the new configuration options deploy_kernel, deploy_ramdisk, rescue_kernel and rescue_ramdisk respectively.

  • Adds a new configuration option [drac]boot_device_job_status_timeout that specifies the maximum amount of time (in seconds) to wait for the boot device configuration job to transition to the scheduled state to allow a reboot or power on action to complete.

  • Adds initial idrac hardware type support of interface implementations that utilize the Redfish out-of-band (OOB) management protocol and are compatible with the integrated Dell Remote Access Controller (iDRAC) baseboard management controller (BMC), presently those of the management and power hardware interfaces. They are named idrac-redfish.

    Introduces a new name for the idrac interface implementations, idrac-wsman, and deprecates idrac. They both use the Web Services Management (WS-Man) OOB management protocol.

    The idrac hardware type declares support for those new interface implementations, in addition to all interface implementations it has been supporting. The priority order of supported interfaces remains the same. Interface implementations which rely on WS-Man continue to have the highest priority, and the new idrac-wsman is listed before the deprecated idrac. It now supports the following interface implementations, which are listed in priority order from highest to lowest:

    • bios: no-bios

    • boot: ipxe, pxe

    • console: no-console

    • deploy: iscsi, direct, ansible, ramdisk

    • inspect: idrac-wsman, idrac, inspector, no-inspect

    • management: idrac-wsman, idrac, idrac-redfish

    • network: flat, neutron, noop

    • power: idrac-wsman, idrac, idrac-redfish

    • raid: idrac-wsman, idrac, no-raid

    • rescue: no-rescue, agent

    • storage: noop, cinder, external

    • vendor: idrac-wsman, idrac, no-vendor

    For more information, see story 2004592.

  • Adds idrac hardware type support of an inspect interface implementation that utilizes the Redfish out-of-band (OOB) management protocol and is compatible with the integrated Dell Remote Access Controller (iDRAC) baseboard management controller (BMC). It is named idrac-redfish.

    The idrac hardware type declares support for that new interface implementation, in addition to all inspect interface implementations it has been supporting. The highest priority inspect interfaces remain the same, those which rely on the Web Services Management (WS-Man) OOB management protocol. The new ‘idrac-redfish’ immediately follows those. It now supports the following inspect interface implementations, listed in priority order from highest to lowest: idrac-wsman, idrac, idrac-redfish, inspector, and no-inspect.

  • Adds functionality to perform an out-of-band sanitize disk-erase operation for iLO5 based HPE ProLiant servers. Management interface ilo5 has been added to the ilo5 hardware type. A clean step erase_devices has been added to management interface ilo5 to support this operation.

  • Adds support for the Intel IPMI Hardware with a new hardware type intel-ipmitool. This hardware type is the same as the ipmi hardware type with additional support of Intel Speed Select Performance Profile Technology. It uses the intel-ipmitool management interface, which supports setting the desired configuration level for Intel SST-PP.

  • Ironic API service now supports HTTP proxy headers parsing with the help of oslo.middleware package, enabled via new option [oslo_middleware]/enable_proxy_headers_parsing (False by default).

    This enables more complex setups of Ironic API service, for example when the same service instance serves both internal and public API endpoints via separate proxies.

    When proxy headers parsing is enabled, the value of [api]/public_endpoint option is ignored.

  • Allows retrying PXE/iPXE boot during deployment, cleaning and rescuing. This feature is disabled by default and can be enabled by setting [pxe]boot_retry_timeout to the timeout (in seconds) after which the boot should be retried.

    The new option [pxe]boot_retry_check_interval defines how often to check the nodes for timeout and defaults to 90 seconds.

  • Adds support for software RAID via the generic hardware manager when using a Train release ironic-python-agent deployment or cleaning ramdisk.

    This may be used by means of the target_raid_config: either a single RAID-1, or one RAID-1 plus one RAID-N, can be configured (where N can be 0, 1, or 1+0). The RAID is created and deleted during manual cleaning. Note that this initial implementation will use all available devices for the setup of the software RAID device(s). More information is available in the Ironic Administrator documentation.

  • Foreign drives and global and dedicated hot spares will be freed up during the RAID delete_configuration cleaning step.
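As an illustration of the software RAID note above, the following is a minimal sketch of a target_raid_config with one RAID-1 device plus one RAID-0 device, together with a small check of the layout rules stated in that note. The field names (logical_disks, size_gb, raid_level, controller) are assumed to follow the Ironic RAID documentation; the check_software_raid_config helper and the sizes are hypothetical and are not part of Ironic.

```python
import json

# RAID levels the note allows for the optional second device.
SECOND_DEVICE_LEVELS = {"0", "1", "1+0"}


def check_software_raid_config(target_raid_config):
    """Return a list of problems; an empty list means the layout fits the note.

    Hypothetical helper for illustration only, not an Ironic API.
    """
    disks = target_raid_config.get("logical_disks", [])
    if len(disks) not in (1, 2):
        return ["expected one or two logical disks"]
    problems = []
    if str(disks[0].get("raid_level")) != "1":
        problems.append("first logical disk must be RAID-1")
    if len(disks) == 2 and str(disks[1].get("raid_level")) not in SECOND_DEVICE_LEVELS:
        problems.append("second logical disk must be RAID-0, RAID-1 or RAID-1+0")
    return problems


# Example layout: a 100 GB RAID-1 device plus a RAID-0 device using the rest.
example = {
    "logical_disks": [
        {"size_gb": 100, "raid_level": "1", "controller": "software"},
        {"size_gb": "MAX", "raid_level": "0", "controller": "software"},
    ]
}

print(json.dumps(example, indent=2))
print(check_software_raid_config(example))
```

A configuration of this shape would be set as the node's target RAID configuration before running manual cleaning; consult the Ironic Administrator documentation for the authoritative format.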

Upgrade Notes

  • In order to support power state change callbacks to nova, the [nova] section must be configured in the Bare Metal service configuration. As the functionality to process the event is new to nova’s Train release, the [nova]send_power_notifications option should only be set to True in ironic once ALL nova-compute instances have been upgraded to the Train release of nova.

  • The Cisco cisco-ucs-managed and cisco-ucs-standalone hardware types and cimc and ucsm hardware interfaces which were deprecated in the 12.1.0 release have now been removed.

    After upgrading, if any of these hardware types or interfaces are specified in ironic’s configuration options, the ironic-conductor service will fail to start. Any existing ironic nodes with these hardware types or interfaces will become inoperable via ironic after the upgrade. If these hardware types or interfaces are being used, either change the affected nodes to use other hardware types or interfaces, or install these hardware types (and interfaces) separately from elsewhere. For more information, see story 2005033.

  • The deprecated configuration options enabled and service_url from the [inspector] section have been removed.

  • The python-ironic-inspector-client package is no longer required for the inspector inspect interface (openstacksdk is used instead).

  • The deprecated options url, url_timeout and auth_strategy from the [neutron] section have been removed. Use endpoint_override, timeout and auth_type respectively.

  • When a failure occurs during cleaning, nodes will no longer be shut down. The behaviour was changed to prevent harm and allow for an admin intervention when sensitive operations, such as firmware upgrades, are performed and fail during cleaning.

  • The deprecated options glance_api_servers, glance_api_insecure, glance_cafile and auth_strategy from the [glance] section have been removed. Please use the corresponding keystoneauth options instead.

  • The do_disk_erase, has_disk_erase_completed and get_available_disk_types interfaces of the proliantutils library have been enhanced to support the out-of-band sanitize disk-erase operation for the ilo5 hardware type. To leverage this feature, the proliantutils library needs to be upgraded to version 2.9.0.

  • Users of the irmc hardware type with iPXE should switch to the ipxe boot interface from the deprecated [pxe]ipxe_enabled option.

  • Explicit support for CoreOS Ironic Python Agent images has been removed. If you use a ramdisk based on CoreOS, you may want to re-add coreos.configdrive=0 to your PXE templates; see story 1433812 for the background.

  • The deprecated ironic/api/app.wsgi script has been removed. The automatically generated ironic-api-wsgi script must be used instead.

  • Support for elilo has been removed as support was deprecated and elilo has been dropped by most Linux distributions. Users should migrate to another PXE loader.
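Before upgrading, it can help to scan an existing configuration file for the options that the notes above list as removed. The following is a minimal sketch using Python's standard configparser; the find_removed_options helper, the sample configuration, and the example URL are hypothetical, and the option list is limited to the [inspector], [neutron] and [glance] options named in this section.

```python
import configparser

# Options named as removed in the upgrade notes above.
REMOVED_OPTIONS = {
    "inspector": ["enabled", "service_url"],
    "neutron": ["url", "url_timeout", "auth_strategy"],
    "glance": [
        "glance_api_servers",
        "glance_api_insecure",
        "glance_cafile",
        "auth_strategy",
    ],
}


def find_removed_options(conf_text):
    """Return (section, option) pairs from conf_text that were removed."""
    # Use an unused default section so that [DEFAULT] values are not
    # inherited by other sections and reported as false positives.
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None, strict=False
    )
    parser.read_string(conf_text)
    found = []
    for section, options in REMOVED_OPTIONS.items():
        if not parser.has_section(section):
            continue
        for option in options:
            if parser.has_option(section, option):
                found.append((section, option))
    return found


# Hypothetical pre-upgrade configuration used only for illustration.
sample_conf = """
[DEFAULT]
url = not-a-neutron-option

[neutron]
url = http://neutron.example:9696
timeout = 30

[glance]
num_retries = 3
"""

print(find_removed_options(sample_conf))
```

Each reported pair should be replaced by the option named in the corresponding note, for example endpoint_override in place of url in the [neutron] section.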

Deprecation Notes

  • The configuration option [glance]glance_num_retries has been renamed to [glance]num_retries. The old name will be removed in a future release.

  • The idrac interface implementation name is deprecated in favor of a new name, idrac-wsman, and may be removed in a future release. A deprecation warning will be logged for every loaded idrac interface implementation. Use idrac-wsman instead.

  • The ironic-lib configuration option [disk_utils]iscsi_verify_attempts has been deprecated in favor of:

    • [iscsi]verify_attempts to specify the number of attempts to establish an iSCSI connection.

    • [disk_utils]partition_detection_attempts to specify the number of attempts to find a newly created partition.
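The renames in the deprecation notes above can be expressed as a small mapping from old option names to their replacements. The sketch below is illustrative only: the migrate_options helper and the (section, option) dictionary shape are hypothetical, and carrying the old iscsi_verify_attempts value over to both replacement options is an assumption, not something the notes prescribe.

```python
# Old (section, option) -> list of replacement (section, option) pairs,
# taken from the deprecation notes above.
RENAMES = {
    ("glance", "glance_num_retries"): [("glance", "num_retries")],
    ("disk_utils", "iscsi_verify_attempts"): [
        ("iscsi", "verify_attempts"),
        ("disk_utils", "partition_detection_attempts"),
    ],
}


def migrate_options(settings):
    """Return a copy of settings with deprecated names replaced.

    settings maps (section, option) tuples to values. A value that is
    already set under a new name wins over the deprecated one.
    """
    migrated = {key: value for key, value in settings.items() if key not in RENAMES}
    for old_key, new_keys in RENAMES.items():
        if old_key in settings:
            for new_key in new_keys:
                migrated.setdefault(new_key, settings[old_key])
    return migrated


old_settings = {
    ("glance", "glance_num_retries"): "3",
    ("disk_utils", "iscsi_verify_attempts"): "5",
    ("iscsi", "verify_attempts"): "7",  # already migrated by the operator
}

print(migrate_options(old_settings))
```

Letting an explicitly set new option take precedence keeps the sketch safe to run on a configuration that has been partially migrated by hand.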

Bug Fixes

  • Fixes an issue where ironic abandons an introspection attempt for a node if there is a pending BIOS configuration job in the job queue, causing the overall introspection to fail.

  • Allows deleting unbound ports on an active node. See story 2006385 for details.

  • Fixes a confusing AttributeError if an adapter returns None for the bare metal API.

  • Prevents the adapter configuration options from getting ignored if a matching endpoint cannot be found. An error is now raised.

  • By immediately conveying power state changes of a node through external events to the Compute service, the Bare Metal service becomes the source of truth about the node’s power state, preventing the Compute service from forcing wrong power states on instances during the periodic power state synchronization between the Compute and Bare Metal services.

    Note

    There is a possibility of a race condition due to the nova-ironic power sync task happening during or right before the power state change event is received from the Bare Metal service, in which case the instance state will be forced on the baremetal node.

  • Fixes an issue in the discovery playbook for the ansible deploy interface that prevented gathering WWN and serial numbers under Python 3.

  • Fixes an issue with using serial number as root device hints with the ansible deploy interface.

  • Fixes an issue regarding the ansible deploy interface, where the configdrive partition could not be correctly built if the node root device was set to some logical device (like an md array, /dev/md0). https://storyboard.openstack.org/#!/story/2006334

  • Fixes deploying non-public images using the ansible deploy interface.

  • Currently Ironic allows entering deployment or cleaning for nodes in maintenance mode. However, heartbeats do not cause any actions for such nodes, so deployment or cleaning will never finish if the nodes are not moved out of maintenance. A new configuration option [conductor]allow_provisioning_in_maintenance (defaulting to True) is added to configure this behavior. If it is set to False, deployment and cleaning will be prevented for nodes in maintenance mode.

  • Fixes an issue with asynchronous deploy steps that poll for completion where the step could fail to execute. The deployment_polling and cleaning_polling flags may be used by driver implementations to signal that the driver is polling for completion. See story 2003817 for details.

  • Fixes an issue in the idrac hardware type where a configuration job does not transition to the correct state and start execution during a power on or reboot operation. If the boot device is being changed, the system might complete its POST before the job is ready, leaving the job in the queue, and the system will boot from the wrong device. See bug 2004909 for details.

  • Fixes a bug where ironic would shut a node down upon cleaning failure. Now, the node stays powered on (as documented and intended).

  • Fixes an issue where baremetal node deployment would fail on clouds with a high number of security groups. Listing the security groups took too long. Instead of listing all security groups, a query filter was added to list only the security groups to be used for the network. (See bug 2006256.)

  • Fixes an issue where a node was locked for longer than [console]subprocess_timeout seconds when the shellinabox process failed to start before the specified timeout elapsed.

  • Fixes a bug when executing the create_configuration cleaning step for disks of a PERC H740P controller: the first disks were created, but the controller then refused to create the next few disks because it was busy.

  • Fixes an issue wherein asynchronous out-of-band deploy steps in a deployment template fail to execute. See story 2006342 for details.

  • Fixes an issue where users attempting to leverage non-iPXE UEFI booting would experience failures when their dhcp_provider was set to none.

  • Fixes a bug in iLO UEFI iSCSI boot where it fails if a server has multiple NIC adapters. ProLiant servers are limited to four iSCSI NIC sources, and the existing implementation tried to create more than that and failed accordingly.

  • Adds the missing ipxe boot interface to the irmc hardware type. It is supposed to be used instead of the deprecated [pxe]ipxe_enabled configuration option.

  • Fixes an issue where clean steps of redfish BIOS interface do not boot up the IPA ramdisk after cleaning reboot. See story 2006217 for details.

  • Fixes an issue in ISO creation for UEFI boot mode when an efiboot.img file is provided and the directory containing the grub.cfg file, set via [DEFAULT]/grub_config_path, is not the same as the directory containing the efiboot.img file. See story 2006218 for details.

  • Fixes an issue where updating firmware using the update_firmware_sum clean step from the management interface of the ilo hardware type failed with an error stating that it was unable to connect to the iLO address due to an authentication failure. See story 2006223 for details.

  • Fixes an issue when powering on a server with the ilo hardware type. The server failed to report success for the power-on operation if no bootable device was found. See story 2006288 for details.

  • Fixes an issue in creation of RAID if none of the ‘logical_disks’ in ‘target_raid_config’ have ‘controller’ parameter. See story 2006316 for details.

  • Fixes an issue in the creation of RAID for the ilo5 RAID interface wherein a second RAID creation attempt fails. See story 2006321 for details.

  • Provides an opt-in fix to change the default port attachment behavior for deployment and cleaning operations through a new configuration option, [neutron]add_all_ports. This option causes ironic to transmit all port information to neutron as opposed to only a single physical network port. This enables operators to successfully operate static Port Group configurations with Neutron ML2 drivers, where previously configuration of networking would fail.

    When these ports are configured with pxe_enabled set to False, neutron will be requested not to assign an IP address to the port. This is to prevent additional issues that may occur depending on physical switch configuration with static Port Group configurations.

  • Fixes an issue during provisioning network attachment where neutron ports were being created with the same data structure being re-used.
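The effect of the [neutron]add_all_ports option described above can be pictured with a short sketch. This is not Ironic's implementation: the build_port_requests helper and the port dictionary shape are hypothetical, the default behavior is simplified here to "only PXE-enabled ports are sent", and the use of an empty fixed_ips list to ask neutron not to assign an IP address is an assumption about how that request is expressed.

```python
def build_port_requests(ports, add_all_ports):
    """Build simplified neutron port request bodies for a node's ports.

    ports is a list of dicts with 'address' and 'pxe_enabled' keys
    (a hypothetical shape used only for this illustration).
    """
    requests = []
    for port in ports:
        if not port["pxe_enabled"] and not add_all_ports:
            # Assumed default behavior: ports that are not PXE-enabled
            # are not transmitted to neutron at all.
            continue
        body = {"mac_address": port["address"]}
        if not port["pxe_enabled"]:
            # Only reached when add_all_ports is True: request that no
            # IP address be assigned to this port.
            body["fixed_ips"] = []
        requests.append(body)
    return requests


node_ports = [
    {"address": "52:54:00:00:00:01", "pxe_enabled": True},
    {"address": "52:54:00:00:00:02", "pxe_enabled": False},
]

print(build_port_requests(node_ports, add_all_ports=False))
print(build_port_requests(node_ports, add_all_ports=True))
```

With the option disabled only the first port is included; with it enabled both ports are sent, and the second carries the empty address list.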

Other Notes

  • This release allows configuring retryable ipmitool exceptions via [ipmi]additional_retryable_ipmi_errors so that, depending on the environment, operators can allow retrying ipmitool commands whose error output contains the specified substrings.
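The substring matching described in the note above can be sketched as follows. The is_retryable helper and the sample error strings are hypothetical and serve only to show the idea; they do not reflect Ironic's actual code or real ipmitool messages.

```python
def is_retryable(error_output, additional_retryable_errors):
    """Return True if error_output contains any configured substring.

    Empty strings are skipped so that a blank configuration entry does
    not make every error retryable.
    """
    return any(
        substring in error_output
        for substring in additional_retryable_errors
        if substring
    )


# Hypothetical value for [ipmi]additional_retryable_ipmi_errors.
configured = ["BMC busy", "session timed out"]

print(is_retryable("Error: BMC busy, try again later", configured))
print(is_retryable("Error: invalid password", configured))
```

Skipping empty entries is a deliberate choice in this sketch, since the empty string is a substring of every message.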