Ussuri Series Release Notes

10.1.0-3

New Features

  • Support hyperkube_prefix label which defaults to k8s.gcr.io/. Users now have the option to define an alternative hyperkube image source, since the default source has discontinued publication of hyperkube images for kube_tag greater than 1.18.x. Note that if the container_infra_prefix label is defined, it still takes precedence over this label.
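
The precedence between the two labels can be sketched as follows. This is an illustrative helper rather than Magnum's own code: the function name resolve_hyperkube_image and the "<prefix>hyperkube:<kube_tag>" image layout are assumptions made for the example, as is the mirror registry name.

```python
DEFAULT_HYPERKUBE_PREFIX = "k8s.gcr.io/"  # default stated in the note above


def resolve_hyperkube_image(labels):
    """Return the hyperkube image reference implied by a dict of labels.

    Hypothetical helper: the '<prefix>hyperkube:<kube_tag>' layout is an
    assumption used only for illustration.
    """
    # container_infra_prefix, when defined, wins over hyperkube_prefix.
    prefix = labels.get("container_infra_prefix") or labels.get(
        "hyperkube_prefix", DEFAULT_HYPERKUBE_PREFIX
    )
    return f"{prefix}hyperkube:{labels['kube_tag']}"


if __name__ == "__main__":
    print(resolve_hyperkube_image({"kube_tag": "v1.18.2"}))
    print(resolve_hyperkube_image({
        "kube_tag": "v1.19.1",
        "hyperkube_prefix": "registry.example.com/mirror/",  # assumed mirror
    }))
```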

10.1.0

New Features

  • Users can enable or disable master_lb_enabled when creating a cluster.

  • The default 10-second health polling interval was too frequent for most cases, so it has been changed to 60s. A new config option health_polling_interval makes the interval configurable. Cloud admins can disable health polling entirely by setting a negative value for this option.

  • Expose autoscaler Prometheus metrics on the pod port named metrics (8085).

  • Add a new label named master_lb_allowed_cidrs to control the IP ranges which can access the k8s API and etcd load balancers of the master. To use this feature, the minimum version of Heat is stable/ussuri and the minimum version of Octavia is stable/train.

  • A new boolean flag is introduced in the Cluster and Nodegroup create API calls. Using this flag, users can override label values when clusters or nodegroups are created without having to specify all the inherited values. To do that, users have to specify the labels with their new values and use the flag --merge-labels. At the same time, three new fields are added in the cluster and nodegroup show outputs, showing the differences between the actual and the inherited labels.

  • Magnum now cascade-deletes all load balancers before deleting the cluster, including not only the load balancers for the cluster services and ingresses but also those for the Kubernetes API/etcd endpoints.

  • Support Helm v3 client to install helm charts. To use this feature, users will need to use helm_client_tag>=v3.0.0 (default helm_client_tag=v3.2.1). All existing charts that used to depend on Helm v2, e.g. nginx ingress controller, metrics server, prometheus operator and prometheus adapter, are now also installable using the v3 client. Also introduce helm_client_sha256 and helm_client_url, which users can specify to install a non-default helm client version (https://github.com/helm/helm/releases).

  • Cloud admin users can now perform a rolling upgrade on behalf of an end user, so that urgent security patching can be done when necessary.

  • Add the cluster_uuid label to the metrics exported by Prometheus federation.
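
The label-merging behaviour described above for the merge-labels flag can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Magnum's implementation: the function name apply_labels and the names overridden, added and skipped for the three difference fields are hypothetical, and the no-merge branch assumes that user-supplied labels replace the inherited set wholesale.

```python
def apply_labels(inherited, requested, merge=False):
    """Combine inherited and requested labels and report the differences.

    Returns (effective, overridden, added, skipped). The three difference
    dicts are hypothetical stand-ins for the new fields in the show output.
    """
    if merge:
        # Requested values override inherited ones; the rest is kept.
        effective = {**inherited, **requested}
    else:
        # Assumption: without the flag, requested labels replace everything.
        effective = dict(requested) if requested else dict(inherited)

    overridden = {k: v for k, v in effective.items()
                  if k in inherited and inherited[k] != v}
    added = {k: v for k, v in effective.items() if k not in inherited}
    skipped = {k: v for k, v in inherited.items() if k not in effective}
    return effective, overridden, added, skipped


if __name__ == "__main__":
    template = {"kube_tag": "v1.18.2", "master_lb_enabled": "true"}
    wanted = {"kube_tag": "v1.18.9", "selinux_mode": "enforcing"}
    for flag in (True, False):
        print(flag, apply_labels(template, wanted, merge=flag))
```

With merging enabled, the inherited master_lb_enabled value survives even though the user did not repeat it; without merging, it would show up as skipped.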

Upgrade Notes

  • If a 10s health polling interval is still preferred for Kubernetes clusters, it can be set with the config option health_polling_interval under the kubernetes section.

  • The default admission controller list is now “NodeRestriction, PodSecurityPolicy, NamespaceLifecycle, LimitRanger, ServiceAccount, ResourceQuota, TaintNodesByCondition, Priority, DefaultTolerationSeconds, DefaultStorageClass, StorageObjectInUseProtection, PersistentVolumeClaimResize, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, RuntimeClass”.

  • Default tiller_tag is set to v2.16.7. The charts remain compatible but helm_client_tag will also need to be set to the same value as tiller_tag, i.e. v2.16.7. In this case, the user will also need to provide helm_client_sha256 for the helm client binary intended for use.

  • Bumped prometheus-operator chart tag to 8.12.13. Added container_infra_prefix to missing prometheusOperator images.
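
For the health_polling_interval note above, the following sketch shows what restoring the 10s interval might look like in an INI-style config fragment, and how the value could be interpreted. It is only an illustration: the fragment is parsed with Python's standard configparser rather than Magnum's own config machinery, and the helper name polling_interval is hypothetical.

```python
import configparser

# Example fragment: the section and option names come from the notes above.
SAMPLE_CONF = """
[kubernetes]
health_polling_interval = 10
"""

DEFAULT_INTERVAL = 60  # new default stated in the release notes


def polling_interval(conf_text):
    """Return the polling interval in seconds, or None if polling is disabled."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    interval = parser.getint(
        "kubernetes", "health_polling_interval", fallback=DEFAULT_INTERVAL
    )
    # A negative value disables health polling entirely.
    return None if interval < 0 else interval


if __name__ == "__main__":
    print(polling_interval(SAMPLE_CONF))
    print(polling_interval("[kubernetes]\nhealth_polling_interval = -1\n"))
```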

Deprecation Notes

  • Support for the Helm v2 client will be removed in the X release.

Bug Fixes

  • Deploy traefik from the heat-agent

    Use kubectl from the heat agent to apply the traefik deployment. The previous behaviour was to create a systemd unit to send the manifests to the API.

    This way we will have only one way for applying manifests to the API.

    This change was triggered to address the kubectl change [0], after which kubectl no longer uses 127.0.0.1:8080 as the default kubernetes API.

    [0] https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#kubectl

  • Fixes an edge case where patching a cluster that has additional nodegroups with health_status and health_status_reason caused the default-worker nodegroup to be resized.

  • Fixes a regression which left behind trustee user accounts and certificates when a cluster is deleted.

  • The label fixed_network_cidr has been renamed to fixed_subnet_cidr, and it can now be passed in and set correctly.

  • Fix an issue where private clusters got stuck in CREATE_IN_PROGRESS status when floating_ip_enabled=True is set in the cluster template but disabled when the cluster is created.

  • Fixes database migrations with SQLAlchemy 1.3.20.

  • Prometheus server now scrapes metrics from the traefik proxy and from the cluster autoscaler.

  • Scrape metrics from kube-{controller-manager,scheduler}. Disable PrometheusRule for etcd.

  • Fixes an issue with cluster deletion if load balancers do not exist. See story 2008548 <https://storyboard.openstack.org/#!/story/2008548> for details.

10.0.0

New Features

  • Added calico_ipv4pool_ipip label for configuring calico network_driver IPIP Mode to use for the IPv4 POOL created at start up. Allowed values: Always, CrossSubnet, Never, Off.

  • Add cinder_csi_enabled label to support out of tree Cinder CSI.

  • Expose traefik prometheus metrics.

  • Add fedora coreos driver. To deploy clusters with fedora coreos, operators or users need to add os_distro=fedora-coreos to the image. The scripts to deploy kubernetes on top are the same as for fedora atomic. Note that this driver has selinux enabled.

  • Added label heapster_enabled to control heapster installation in the cluster.

  • Installs the metrics-server service, which replaces the deprecated kubernetes heapster as a cluster-wide metrics reporting service used by scheduling, HPA and others. This service is installed and configured using helm, so the tiller_enabled flag must be True. The label metrics_server_chart_tag can be used to specify the stable/metrics-server chart tag to be used. The label metrics_server_enabled is used to enable or disable the installation of the metrics server (default: true).

  • Added label helm_client_tag to allow users to specify the helm client container version.

  • Added custom.metrics.k8s.io API installer by means of the stable/prometheus-adapter helm chart. The label prometheus_adapter_enabled (default: true) controls configuration. You can also use prometheus_adapter_chart_tag to select the helm chart version, and prometheus_adapter_configmap if you would like to set up your own metrics (specifying anything other than the default overwrites the default configuration). This feature requires the label monitoring_enabled=true.

  • Along with the recently released kubernetes version upgrade support, Magnum now supports upgrading the operating system of the k8s cluster (including master and worker nodes). It’s an in-place upgrade leveraging the atomic/ostree upgrade capability.

  • A new config option post_install_manifest_url is added to support installing cloud provider/vendor specific manifests after the k8s cluster has booted. It’s a URL pointing to the manifest file. For example, a cloud admin can put their specific storageclass into this file, and it will be set up automatically after the cluster is created.

  • Add selinux_mode label. By default, selinux_mode=permissive with Fedora Atomic driver and selinux_mode=enforcing with Fedora CoreOS.

  • The Fedora CoreOS driver now supports sha256 verification for the hyperkube image when bootstrapping the Kubernetes cluster.

  • The original design of the k8s cluster health status allowed the health status to be updated by the Magnum control plane. However, that doesn’t work when the cluster is private. Now Magnum supports updating the k8s cluster health status via the Magnum cluster update API, so that a controller (e.g. magnum-auto-healer) running inside the k8s cluster can call the Magnum update API to update the cluster health status.

  • Cluster upgrade API supports upgrading specific nodegroups in kubernetes clusters. If a user chooses a default nodegroup to be upgraded, then both of the default nodegroups will be upgraded since they are in one stack. For non-default nodegroups, users are allowed to use only the cluster template already set in the cluster. This means that the cluster (default nodegroups) has to be upgraded first. For now, the only label that is taken into consideration during upgrades is the kube_tag. All other labels are ignored.

  • Add the use_podman label to choose whether the system containers (etcd, kubernetes and the heat-agent) will be installed with podman or atomic. This label is relevant for k8s_fedora drivers.

    k8s_fedora_atomic_v1 defaults to use_podman=false, meaning atomic will be used, pulling containers from docker.io/openstackmagnum. use_podman=true is accepted as well, which will pull containers from k8s.gcr.io.

    k8s_fedora_coreos_v1 defaults to and accepts only use_podman=true.

    Note that, to use a kubernetes version greater than or equal to v1.16.0 with the k8s_fedora_atomic_v1 driver, you need to set use_podman=true. This is necessary since v1.16 dropped the --containerized flag in kubelet. https://github.com/kubernetes/kubernetes/pull/80043/files
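
The use_podman rules above can be summarised in a small validation sketch. It only restates the constraints from these notes; the function names check_use_podman and _version_tuple are hypothetical, and the kube_tag parsing assumes tags of the form v1.16.0.

```python
import re


def _version_tuple(kube_tag):
    """Parse a tag such as 'v1.16.0' into (1, 16, 0)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", kube_tag)
    if match is None:
        raise ValueError(f"unrecognised kube_tag: {kube_tag!r}")
    return tuple(int(part) for part in match.groups())


def check_use_podman(driver, kube_tag, use_podman=None):
    """Return the effective use_podman value, or raise ValueError."""
    if driver == "k8s_fedora_coreos_v1":
        # CoreOS defaults to, and accepts only, use_podman=true.
        if use_podman is False:
            raise ValueError("k8s_fedora_coreos_v1 accepts only use_podman=true")
        return True
    if driver == "k8s_fedora_atomic_v1":
        # Atomic defaults to use_podman=false.
        effective = False if use_podman is None else use_podman
        # v1.16 dropped the kubelet flag that the atomic path relies on.
        if not effective and _version_tuple(kube_tag) >= (1, 16, 0):
            raise ValueError("kube_tag >= v1.16.0 requires use_podman=true")
        return effective
    raise ValueError(f"unknown driver: {driver}")


if __name__ == "__main__":
    print(check_use_podman("k8s_fedora_atomic_v1", "v1.15.12"))
    print(check_use_podman("k8s_fedora_atomic_v1", "v1.16.0", use_podman=True))
```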

Known Issues

  • The startup of the heat-container-agent uses a workaround to copy the SoftwareDeployment credentials to /var/lib/cloud/data/cfn-init-data. The fedora coreos driver requires heat train to support ignition.

  • The Fedora CoreOS driver now supports using a docker storage driver; only overlay2 is supported.

Upgrade Notes

  • Python 2.7 support has been dropped. The last release of magnum to support py2.7 is OpenStack Train. The minimum version of Python now supported by magnum is Python 3.6.

  • nginx-ingress-controller QoS changed from Guaranteed to Burstable. Priority class ‘system-cluster-critical’ or higher for nginx-ingress-controller.

  • The default version of Kubernetes dashboard has been upgraded to v2.0.0 and metrics-server is supported by k8s dashboard now.

  • Bump up default versions for fedora-coreos driver:

    kube_tag: v1.18.2
    autoscaler_tag: v1.18.1
    cloud_provider_tag: v1.18.0
    cinder_csi_plugin_tag: v1.18.0
    k8s_keystone_auth_tag: v1.18.0
    magnum_auto_healer_tag: v1.18.0
    octavia_ingress_controller_tag: v1.18.0

  • The default Calico version has been upgraded from v3.3.6 to v3.13.1. Calico v3.3.6 is still a valid option.

  • The default CoreDNS version has been upgraded to 1.6.6 and it can now be scheduled to master nodes.

  • Upgrade etcd to v3.4.6 and use quay.io/coreos/etcd, since the tags there follow the same format as https://github.com/etcd-io/etcd/releases, whereas k8s.gcr.io modifies the canonical version tag. Users will need to pay attention to the format of etcd_tag, e.g. v3.4.5 is valid whereas 3.4.5 is not. Existing cluster templates and clusters which use the latter will fail to complete.

  • Upgrade flannel version to v0.12.0-amd64 for Fedora CoreOS driver.
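
Since an etcd_tag without the leading "v" now causes failures, a quick pre-flight check on cluster templates may be useful. The sketch below is an assumption-laden illustration: the helper name is_valid_etcd_tag is hypothetical, and the pattern only covers plain vMAJOR.MINOR.PATCH tags such as the v3.4.5 example above.

```python
import re

# Assumed pattern: a leading "v" followed by three dot-separated numbers.
ETCD_TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def is_valid_etcd_tag(tag):
    """Return True if the tag has the 'v3.4.5' form rather than '3.4.5'."""
    return ETCD_TAG_PATTERN.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ("v3.4.5", "3.4.5", "v3.4.6"):
        print(candidate, is_valid_etcd_tag(candidate))
```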

Deprecation Notes

  • Heapster is phased out in favor of metrics-server. The last openstack/magnum version to include heapster as standard is magnum train.

Bug Fixes

  • For k8s_coreos set REQUESTS_CA for heat-agent. The heat-agent as a python service needs to use the ca bundle of the host.

  • Fixed the usage of cert_manager_api=true making cluster creation fail due to a logic lock between kubemaster.yaml and kubecluster.yaml

  • A regression in downloading images has been fixed. Both the Fedora Atomic driver and the Fedora CoreOS driver now support using a proxy in the template to create a cluster.

  • nginx-ingress-controller requests.memory increased to 256MiB. This is a result of tests that showed the pod getting oom killed by the node on a relatively generic use case.

  • The proxy issue in the Prometheus/Grafana script has been fixed.

  • The taint of the master node kubelet has been improved so that the conformance test (sonobuoy) passes.

  • In a multi availability zone (AZ) environment, if Nova doesn’t support cross-AZ volume mounts, cluster creation may fail because Nova cannot mount a volume in a different AZ. This issue only impacts the Fedora Atomic and Fedora CoreOS drivers. It is now fixed by passing in the AZ info when creating volumes.

  • k8s-keystone-auth now uses the upstream k8scloudprovider docker repo instead of the openstackmagnum repo.

  • There was a corner case where the cluster template had floating_ip_enabled=False, master_lb_enabled=True and master_lb_floating_ip_enabled=False, but floating_ip_enabled=True was set when creating the cluster, which caused the IP address to be missing from the api_address of the cluster. The issue has now been fixed.

  • Fixes the next url in the list nodegroups API response.

  • Bump up prometheus operator chart version to 8.2.2 so that it is compatible with k8s 1.16.x.

  • Bump up traefik to 1.7.19 for compatibility with Kubernetes 1.16.x.

  • core-podman: Mount os-release properly. To display the node OS-IMAGE in k8s properly we need to mount /usr/lib/os-release; /etc/os-release is just a symlink.