OpenStack-Helm 1.0 Requirements

Topic: osh-1.0-requirements

Problem Description

OpenStack-Helm has undergone rapid development and maturation over its lifetime, and is nearing the point of real-world readiness. This spec details the functionality that must be implemented in OpenStack-Helm for it to be considered ready for a 1.0 release, as well as for general use.

Use case

This spec describes a point-in-time readiness for OpenStack-Helm 1.0, after which it will be for historical reference only.

Proposed Change

The proposed requirements for a 1.0 release are as follows:

Gating

A foundational requirement of 1.0 readiness is the presence of robust gating that will ensure functionality, backward compatibility, and upgradeability. This will allow development to continue and for support for new versions of OpenStack to be added post-1.0. The following gating requirements must be met:

Helm test for all charts

Helm test is the building block for all gating. Each chart must integrate a helm-test script which validates proper functionality. This is already a merge criterion for new charts, but a handful of older charts still need for helm test functionality to be added. No additional charts will be merged prior to 1.0 unless they meet this requirement (and others in this document).

Resiliency across reboots

All services should survive node reboots, and their functionality validated following a reboot by a gate.

Upgrades

Gating must prove that upgrades from each supported OpenStack version to the next operate flawlessly, using the default image set (LOCI). Specifically, each OpenStack chart should be upgraded from one release to the next, and each infrastructure service from one minor version to the next. Both the container image and configuration must be modified as part of this upgrade. At minimum, Newton to Ocata upgrade must be validated for the 1.0 release.

Code Completion and Refactoring

A number of in-progress and planned development efforts must be completed prior to 1.0, to ensure a stable OpenStack-Helm interface thereafter.

Charts in the appropriate project

All charts should migrate to their appropriate home project as follows:

  • OpenStack-Helm for OpenStack services

  • OpenStack-Helm-Infra for supporting services

  • OpenStack-Helm-Addons for ancillary services

In particular, these charts must move to OpenStack-Helm-Infra:

  • ceph

  • etcd

  • ingress

  • ldap

  • libvirt

  • mariadb

  • memcached

  • mongodb

  • openvswitch

  • postgresql

  • rabbitmq

Combined helm-toolkit

Currently both OpenStack-Helm and OpenStack-Helm-Infra have their own parallel versions of the Helm-Toolkit library chart. They must be combined into a single chart in OpenStack-Helm-Infra prior to 1.0.

Standardization of manifests

Work is underway to refactor common manifest patterns into reusable snippets in Helm-Toolkit. The following manifests have yet to be combined:

  • Database drop Job

  • Prometheus exporters

  • API Deployments

  • Worker Deployments

  • StatefulSets

  • CronJobs

  • Etc ConfigMaps

  • Bin ConfigMaps

Standardization of values

OpenStack-Helm has developed a number of conventions around the format and ordering of charts’ values.yaml file, in support of both reusable Helm-Toolkit functions and ease of developer ramp-up. For 1.0 readiness, OpenStack-Helm must cement these conventions within a spec, as well as the ordering of values.yaml keys. These conventions must then be gated to guarantee conformity. The spec in progress can be found here [1].

Inclusion of all core services

Charts for all core OpenStack services must be present to achieve 1.0 releasability. The only core service outstanding at this time is Swift.

Split Ceph chart

The monolithic Ceph chart does not allow for following Ceph upgrade best practices, namely to upgrade Mons, OSDs, and client services in that order. The Ceph chart must therefore be split into at least three charts (one for each of the above upgrade phases) prior to 1.0 to ensure smooth in-place upgradability.

Values-driven config files

In order to maximize flexibility for operators, and to help facilitate upgrades to newer versions of containerized software without editing the chart itself, all configuration files will be specified dynamically based on values.yaml and overrides. In most cases the config files will be generated based on the YAML values tree itself, and in some cases the config file content will be specified in values.yaml as a string literal.

Documentation

Comprehensive documentation is key to the ability for real-world operators to benefit from OpenStack-Helm, and so it is a requirement for 1.0 releasability. The following outstanding items must be completed from a documentation perspective:

Document version requirements

Version requirements for the following must be documented and maintained:

  • Kubernetes

  • Helm

  • Operating system

  • External charts (Calico)

Document Kubernetes requirements

OpenStack-Helm supports a “bring your own Kubernetes” paradigm. Any particular k8s configuration or feature requirements must be documented.

  • Hosts must use KubeDNS / CoreDNS for resolution

  • Kubernetes must enable mount propagation (until it is enabled by default)

  • Helm must be installed

Examples of how to set up the above under KubeADM and KubeSpray-based clusters must be documented as well.

OpenStack-Helm release process

The OpenStack-Helm release process will be somewhat orthogonal to the OpenStack release process, and the differences and relationship between the two must be documented in a spec. This will help folks quickly understand why OpenStack-Helm is a Release-Independent project from an OpenStack perspective.

Release notes

Release notes for the 1.0 release must be prepared, following OpenStack best practices. The criteria for future changes that should be included in release notes in an ongoing fashion must be defined / documented as well.

  • values.yaml changes

  • New charts

  • Any other changes to the external interface of OpenStack-Helm

LMA Operations Guide

A basic Logging, Monitoring, and Alerting-oriented operations guide must be in place, illustrating for operators (and developers) how to set up and use an example LMA setup for OpenStack and supporting services. It will include instructions on how to perform basic configuration and how to access and use the user interfaces at a high level. It will also link out to more detailed documentation for the LMA tooling itself.

Process and Tooling

To facilitate effective collaboration and communication across the OpenStack-Helm community team, work items for the enhancements above will be captured in Storyboard. Therefore, migration from Launchpad to Storyboard must be accomplished prior to the 1.0 release. Going forward, Storyboard will be leveraged as a tool to collaboratively define and communicate the OpenStack-Helm roadmap.

Security Impact

No impact

Performance Impact

No impact

Alternatives

This spec lays out the criteria for a stable and reliable 1.0 release, which can serve as the basis for real-world use as well as ongoing development. The alternative approaches would be to either iterate indefinitely without defining a 1.0 release, which would fail to signal to operators the point at which the platform is ready for real-world use; or, to define a 1.0 release which fails to satisfy key features which real-world operators need.

Implementation

This spec describes a wide variety of self-contained work efforts, which will be implemented individually by the whole OpenStack-Helm team.

Assignee(s)

Primary assignee:

Work Items

See above for the list of work items.

Testing

See above for gating requirements.

Documentation Impact

See above for documentation requirements.

References