Introduction

StarlingX is a fully integrated edge cloud software stack that provides everything needed to deploy an edge cloud on one, two, or up to 100 servers.

Key features of StarlingX include:

  • Provided as a single, easy-to-install package that includes an operating system, storage and networking components, and all the cloud infrastructure needed to run edge workloads.

  • Optimized software that meets edge application requirements.

  • Designed with pre-defined configurations to meet a variety of edge cloud deployment needs.

  • Tested and released as a complete stack, ensuring compatibility among open source components.

  • Includes fault management and service management capabilities, which provide high availability for user applications.

  • Optimized by the community for security, ultra-low latency, extremely high service uptime, and streamlined operation.

Download the StarlingX ISO image from the StarlingX mirror.

Projects

StarlingX contains multiple sub-projects that include additional edge cloud support services and clients. API documentation and release notes for each project are found on the specific project page:

For additional information about project teams, refer to the StarlingX wiki.

New features in StarlingX 9.0

The sections below provide a detailed list of new features and links to the associated user guides (if applicable).

Kubernetes up-version

StarlingX 9.0 supports Kubernetes versions v1.24 through v1.27.

Platform Application Components Revision

The following applications have been updated to a new version in StarlingX Release 9.0. These up-versions keep the platform applications current and address security vulnerabilities in older versions.

  • app-sriov-fec-operator: 2.7.1

  • cert-manager: 1.11.1

  • metric-server: 1.0.18

  • nginx-ingress-controller: 1.9.3

  • oidc-dex: 2.37.0

  • vault: 1.14.8

  • portieris: 0.13.10

  • istio: 1.19.4

  • kiali: 1.75.0

FluxCD Maintenance

FluxCD helm-controller is upgraded from v0.27.0 to v0.35.0 and is compatible with Helm versions up to v3.12.1 and Kubernetes v1.27.3.

FluxCD source-controller is upgraded from v0.32.1 to v1.0.1 and is compatible with Helm versions up to v3.12.1 and Kubernetes v1.27.3.

Helm Maintenance

Helm has been upgraded to v3.12.2 in StarlingX Release 9.0.

Support for Silicom TimeSync Server Adaptor

The Silicom network adapter provides local time sync support via a local GNSS module which is based on the Intel Columbiaville device.

  • cvl-4.10 Silicom driver bundle
    • ice driver: 1.10.1.2

    • i40e driver: 2.21.12

    • iavf driver: 4.6.1

    Note

    cvl-4.10 is only recommended if the Silicom STS2 card is used.

Kubernetes Upgrade Optimization - AIO-Simplex

Configure Kubernetes Multi-Version Upgrade Cloud Orchestration for AIO-SX

You can configure a Kubernetes multi-version upgrade orchestration strategy using the sw-manager command. This feature is available from StarlingX release 8.0 onward and is supported only on AIO-SX systems.

See: Configure Kubernetes Multi-Version Upgrade Cloud Orchestration for AIO-SX

Manual Kubernetes Multi-Version Upgrade in AIO-SX

AIO-SX now supports multi-version Kubernetes upgrades. In this model, applications are disabled, Kubernetes is upgraded by two or more versions, and the applications are then enabled again. This is faster than upgrading Kubernetes one version at a time. The upgrade can also be aborted and reverted to the original version. This feature is supported only for AIO-SX.
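As a rough illustration of what "two or more versions" means, the sketch below lists the minor versions between a starting and a target Kubernetes release. It assumes the upgrade passes through each intermediate minor version in order; the function name minor_version_path is hypothetical and is not part of any StarlingX tool.

```python
def minor_version_path(current: str, target: str) -> list[str]:
    """List each minor version after `current`, up to and including `target`."""
    # Accept forms such as "v1.24" or "v1.24.4"; only major.minor is used.
    cur_major, cur_minor = (int(p) for p in current.lstrip("v").split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.lstrip("v").split(".")[:2])
    if cur_major != tgt_major or tgt_minor <= cur_minor:
        raise ValueError("target must be a later minor version of the same major release")
    return [f"v{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


# Versions covered by one multi-version upgrade from v1.24 to v1.27.
print(minor_version_path("v1.24", "v1.27"))
```

In the multi-version model described above, applications would be disabled once before the first step and enabled once after the last, rather than once per version.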

See: Manual Kubernetes Multi-Version Upgrade in AIO-SX

Platform Admin Network Introduction

The newly introduced admin network is an optional network that is used to monitor and control internal StarlingX traffic between the subclouds and system controllers in a Distributed Cloud environment. This function is performed by the management network in the absence of an admin network. However, the admin network is more easily reconfigured to handle subnet and IP address network parameter changes after initial configuration.

In deployment configurations, static routes from the management or admin interface of the subcloud controller nodes to the system controller’s management subnet must be present. This ensures that the subcloud comes online after deployment.

Note

The admin network is optional. The default management network will be used if it is not present.

On a subcloud, you can manage an optional admin network that provides IP connectivity to the system controller management network and whose IP addresses can be changed.

L3 Firewalls for all StarlingX Platform Interfaces

StarlingX incorporates default firewall rules for the platform networks (OAM, management, cluster-host, pxeboot, admin, and storage). You can configure additional Kubernetes Network Policies to augment or override the default rules.

app-sriov-fec-operator upgrade to FEC operator 2.7.1

FEC Operator v2.7.1 (for all Intel hardware accelerators) is now supported. It adds igb_uio support, makes the accelerator resource names configurable, and enables accelerator device configuration using the igb_uio driver when secure boot is enabled in the BIOS.

Note

FEC operator is now running on the StarlingX platform core.

See: Configure Intel Wireless FEC Accelerators using SR-IOV FEC operator

Redundant System Clock Synchronization

The phc2sys application can be configured to accept multiple source clock inputs. The quality of these sources is compared to user-defined priority values, and the best available source is selected to set the system time.

The phc2sys application continuously monitors the quality of the configured sources and selects a new best source if the current source degrades or if another source becomes higher quality.
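The selection behavior can be sketched in a few lines of Python. This is an illustration only, not the phc2sys implementation: the ClockSource fields, the numeric quality score (lower is better), and the rule that priority only breaks quality ties are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ClockSource:
    name: str
    quality: int    # hypothetical score: lower means a better clock
    priority: int   # user-defined: higher is preferred (assumed convention)
    available: bool = True


def select_best(sources: list[ClockSource]) -> Optional[ClockSource]:
    """Pick the available source with the best quality; priority breaks ties."""
    candidates = [s for s in sources if s.available]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.quality, -s.priority))


sources = [
    ClockSource("ptp-nic1", quality=6, priority=200),
    ClockSource("ptp-nic2", quality=6, priority=100),
]
print(select_best(sources).name)   # equal quality, so the higher priority wins

# Re-run the selection after the first source degrades.
degraded = [replace(sources[0], quality=248), sources[1]]
print(select_best(degraded).name)
```

Calling select_best again whenever a quality value changes mirrors the continuous monitoring described above.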

See: Redundant System Clock Synchronization.

Configure Intel E810 NICs using Intel Ethernet Operator

You can install and use Intel Ethernet operator to orchestrate and manage the configuration and capabilities provided by Intel E810 Series network interface cards (NICs).

See: Configure Intel E810 NICs using Intel Ethernet Operator.

AppArmor Support

AppArmor is a Mandatory Access Control (MAC) system built on Linux’s LSM (Linux Security Modules) interface. In practice, the kernel queries AppArmor before each system call to know whether the process is authorized to do the given operation. Through this mechanism, AppArmor confines programs to a limited set of resources.

AppArmor helps administrators run a more secure Kubernetes deployment by restricting the operations that containers and pods are allowed to perform, and by providing better auditing through system logs. The access needed by a container or pod is configured through profiles tuned to allow access such as Linux capabilities, network access, and file permissions.
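As a hedged illustration of how a profile is attached to a container, the sketch below builds a pod manifest that carries the upstream Kubernetes AppArmor annotation used by the Kubernetes versions in this release. The pod name, container name, image, and profile name are placeholders, and the profile is assumed to be already loaded on the node; the StarlingX guide describes the supported procedure.

```python
import json

# Upstream Kubernetes (beta) annotation prefix for per-container AppArmor profiles.
APPARMOR_ANNOTATION_PREFIX = "container.apparmor.security.beta.kubernetes.io/"


def pod_with_apparmor(pod_name: str, container_name: str, image: str, profile: str) -> dict:
    """Build a pod manifest whose container is confined by a node-local profile."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": pod_name,
            "annotations": {
                # "localhost/<profile>" refers to a profile loaded on the node.
                APPARMOR_ANNOTATION_PREFIX + container_name: f"localhost/{profile}",
            },
        },
        "spec": {"containers": [{"name": container_name, "image": image}]},
    }


# All four argument values are placeholders for illustration.
manifest = pod_with_apparmor("demo-pod", "app", "registry.example/app:1.0", "demo-profile")
print(json.dumps(manifest, indent=2))
```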

See: About AppArmor.

Support for Vault

This release re-introduces support for Vault, which had been intermittently unavailable in StarlingX. The supported versions are vault: 1.14.8 or later, vault-k8s: 1.2.1, and helm-chart: 0.25.0, following the helm-v3 up-version to 3.6+.

StarlingX integrates the optional open source Vault containerized security application into the StarlingX solution. Vault requires PVCs to be enabled as a storage backend.

See: Vault Overview.

Support for Portieris

StarlingX now supports Portieris version 0.13.10. Portieris is an open source Kubernetes admission controller that ensures only policy-compliant images, such as signed images from trusted registries, can run. The Portieris application uses images from the icr.io registry. You must configure service parameters for the icr.io registry before applying the Portieris application; see About Changing External Registries for StarlingX Installation. For Distributed Cloud deployments, the images must be present on the System Controller registry.

See: Portieris Overview.

Configurable Power Manager

Configurable Power Manager focuses on containerized applications that use power profiles individually by the core and/or the application.

StarlingX can regulate the frequency of the entire processor. However, this control operates on the classification of cores, distinguishing only between application and platform cores. Consequently, a user who needs to control an individual core, such as Core 10 in a 24-core CPU, must apply the adjustment to all cores collectively. Containerized workloads need finer-grained configuration: each container is assigned the power configuration it requires, which means providing a specific power configuration to each core or group of cores.

See: Configurable Power Manager.

Technology Preview - Install Power Metrics Application

The Power Metrics app deploys two containers, cAdvisor and Telegraf, that collect metrics about hardware usage.

See: Install Power Metrics Application.

Install Node Feature Discovery (NFD) StarlingX Application

Node Feature Discovery (NFD) version 0.15.0 detects hardware features available on each node in a Kubernetes cluster and advertises those features using Kubernetes node labels. This procedure walks you through the process of installing the NFD StarlingX Application.

See: Install Node Feature Discovery Application.

Partial Disk (Transparent) Encryption Support via Software Encryption (LUKS)

A new encrypted filesystem using Linux Unified Key Setup (LUKS) is created automatically on all hosts to store security-sensitive files. It is mounted at ‘/var/luks/stx/luks_fs’, and the files kept in the ‘/var/luks/stx/luks_fs/controller’ directory are replicated between controllers.

K8s API/CLI OIDC (Dex) Authentication with Local LDAP Backend

StarlingX offers LDAP commands to create and manage LDAP Linux groups as part of a StarlingX local LDAP server (serving the local StarlingX cluster and, in the case of Distributed Cloud, the entire Distributed Cloud system).

StarlingX provides procedures to configure the oidc-auth-apps OIDC Identity Provider (Dex) system application to use the StarlingX local LDAP server (in addition to, or in place of, the already supported remote Windows Active Directory) to authenticate users of the Kubernetes API.

Create LDAP Linux Groups

StarlingX offers LDAP commands to create and manage LDAP Linux groups as part of the ldapscripts library.

StarlingX OpenStack now supports Antelope

stx-openstack has been updated and now deploys OpenStack services based on the Antelope release.

Pod Security Policy

Pod Security Policy (PSP) applies only when running Kubernetes v1.24 or earlier. PSP was deprecated in Kubernetes v1.21 and removed in Kubernetes v1.25. Instead of using PSP, you can enforce similar restrictions on pods using the Pod Security Admission (PSA) controller.

Since its introduction, PSP has had usability problems. The way PSPs are applied to pods has proven confusing in practice: it is easy to accidentally grant broader permissions than intended, and difficult to inspect which PSPs apply in a given situation. Kubernetes offers a built-in PSA controller that replaces PSPs.
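For orientation, upstream Kubernetes configures Pod Security Admission per namespace through labels. The sketch below builds such a namespace manifest in Python. The label keys and the three level names come from upstream Kubernetes; the namespace name and the chosen levels are placeholders, not StarlingX defaults.

```python
import json

# Pod Security Standards levels defined by upstream Kubernetes.
PSA_LEVELS = ("privileged", "baseline", "restricted")


def namespace_with_psa(name: str, enforce: str = "baseline", warn: str = "restricted") -> dict:
    """Build a Namespace manifest labeled for Pod Security Admission."""
    for level in (enforce, warn):
        if level not in PSA_LEVELS:
            raise ValueError(f"unknown Pod Security level: {level}")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                # Pods that violate the "enforce" level are rejected.
                "pod-security.kubernetes.io/enforce": enforce,
                # Violations of the "warn" level only produce a client warning.
                "pod-security.kubernetes.io/warn": warn,
            },
        },
    }


# "demo-apps" is a placeholder namespace name.
print(json.dumps(namespace_with_psa("demo-apps"), indent=2))
```

Unlike PSP, the restriction here is visible directly on the namespace, which makes it easier to see which policy applies to a given pod.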

WAD users sudo and local linux group assignment

StarlingX 9.0 supports and provides procedures for centrally configured Windows Active Directory (WAD) users with sudo access and local Linux group assignments, that is, using only WAD configuration changes.

Subcloud Error Root Cause Correction Action

This feature provides a root cause analysis of subcloud deployment and upgrade failures. It includes:

  • The existing ‘deploy_status’ attribute, which reports progress through the phases of subcloud deployment and, on error, the phase that failed

  • A new ‘deploy_error_desc’ attribute, which provides a summary of the key deployment/upgrade errors

  • Additional text appended to the ‘deploy_error_desc’ error message, with information on:

    • troubleshooting commands

    • root cause of the errors

    • suggested recovery action

See: Manage Subclouds Using the CLI

Patch Orchestration Phase Operations

The distributed cloud patch orchestration has the option to separate the upload from the apply, remove, install and reboot operations. This facilitates performing the upload operations outside of the system maintenance window to reduce the total execution time during the patch activation that occurs during the maintenance window. With the separation of operations, systems can be prestaged with the updates prior to applying the changes to the system.

See: Distributed Cloud Guide

Long Latency Between System Controller and Subclouds

The rehoming procedure for a subcloud that has been powered off for a long period of time differs from the regular rehoming procedure. Depending on how long the subcloud has been offline, the platform certificates may have expired and will need to be regenerated.
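The decision between the two procedures comes down to comparing certificate expiry times with the current time. The sketch below shows that comparison only; the certificate names, dates, and function name are hypothetical, and reading the real expiry dates from the platform certificates is out of scope here.

```python
from datetime import datetime, timedelta, timezone


def needs_certificate_regeneration(cert_not_after: dict[str, datetime], now: datetime) -> bool:
    """Return True if any certificate's expiry time is at or before `now`."""
    return any(expiry <= now for expiry in cert_not_after.values())


# Hypothetical expiry data for a subcloud that has been offline for a while.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
certs = {
    "cert-a": now + timedelta(days=30),   # still valid
    "cert-b": now - timedelta(days=5),    # expired while the subcloud was offline
}

if needs_certificate_regeneration(certs, now):
    print("use the rehoming procedure for expired certificates")
else:
    print("use the regular rehoming procedure")
```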

See: Rehoming Subcloud with Expired Certificates

GEO Redundancy

StarlingX may be deployed across a geographically distributed set of regions. A region consists of a local Kubernetes cluster with local redundancy and access to high-bandwidth, low-latency networking between hosts within that region.

StarlingX Distributed Cloud GEO redundancy configuration supports the ability to recover from a catastrophic event that requires subclouds to be rehomed away from the failed system controller site to the available site(s) that have enough spare capacity. This way, even if the failed site cannot be restored in a short time, the subclouds can still be rehomed to available peer system controller(s) for centralized management.

In this release, the following items are addressed:

  • 1+1 GEO redundancy

    • Active-Active redundancy model

    • Total number of subclouds should not exceed 1K

  • Automated operations

    • Synchronization and liveness check between peer systems

    • Alarm generation if peer system controller is down

  • Manual operations

    • Batch rehoming from alive peer system controller

See: GEO Redundancy

Redfish Virtual Media Robustness

Redfish virtual media operations have been observed to fail frequently with transient errors. While the conditions for those failures are not always known (network, BMC timeouts, etc.), it has been observed that if the subcloud install operation is retried, the operation is successful.

To alleviate the transient conditions, the robustness of the Redfish virtual media controller (RVMC) is improved by introducing additional error handling and retry attempts.
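The general pattern of retrying on transient errors can be sketched as follows. This is a generic illustration, not the RVMC code: the TransientError class, the retry helper, the attempt count, and the delay schedule are all assumptions made for the example.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # give up and surface the last error
            sleep(delay_seconds * attempt)  # wait a little longer each time


# Simulated operation that fails twice before succeeding.
attempts_seen = []


def flaky_install():
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise TransientError("simulated BMC timeout")
    return "installed"


# The sleep function is replaced so the demonstration does not actually wait.
result = retry(flaky_install, attempts=5, sleep=lambda _seconds: None)
print(result, len(attempts_seen))
```

Only errors classified as transient are retried; any other exception propagates immediately, so genuine configuration problems are not hidden by the retry loop.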

See: Install a Subcloud Using Redfish Platform Management Service

New features in StarlingX 8.0

See: https://docs.starlingx.io/r/stx.8.0/releasenotes/index.html#release-notes

New features in StarlingX 7.0

See: https://docs.starlingx.io/r/stx.7.0/releasenotes/index.html#new-features-and-enhancements