3. Results of measuring performance of Kargo

Abstract

This document presents performance test results for Kargo as a Kubernetes deployment solution. All tests were performed in accordance with Measuring performance of Kargo.

Kargo sets up Kubernetes in the following way:

  • master: Calico, Kubernetes API services

  • minion: Calico, Kubernetes minion services

  • etcd: etcd service

Kargo deploys a Kubernetes cluster with the following mapping of hostnames to roles:

  • node1: minion+master+etcd

  • node2: minion+master+etcd

  • node3: minion+etcd

  • all other nodes: minion
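The hostname-to-role mapping above can be sketched as a small shell helper. This is only an illustration of the list; the function name roles_for_node is hypothetical and is not part of Kargo or fuel-ccp-installer.

```shell
#!/bin/bash
# Hypothetical helper: map a 1-based node index to the roles listed above.
roles_for_node() {
    local index="$1"
    case "$index" in
        1|2) echo "minion+master+etcd" ;;   # node1 and node2
        3)   echo "minion+etcd" ;;          # node3
        *)   echo "minion" ;;               # all other nodes
    esac
}

# Print the roles for a few example nodes.
for i in 1 2 3 4 50; do
    echo "node${i}: $(roles_for_node "$i")"
done
```

With this layout the cluster has two masters and three etcd members regardless of the total number of nodes.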

3.1. Environment description

3.1.1. Hardware configuration of each server

Description of servers hardware

Group     Parameter          Configuration 1                               Configuration 2
server    name               node-{1..500}                                 node-{1..500}
          role               kubernetes cluster                            kubernetes cluster
          vendor,model       Dell, R630                                    Lenovo, RD550-1U
          operating_system   4.4.0-36-generic, Ubuntu-xenial, x86_64       4.4.0-36-generic, Ubuntu-xenial, x86_64
CPU       vendor,model       Intel, E5-2680 v3                             Intel, E5-2680 v3
          processor_count    2                                             2
          core_count         12                                            12
          frequency_MHz      2500                                          2500
RAM       vendor,model       Hynix HMA42GR7MFR4N-TF                        Samsung M393A2G40DB0-CPB
          amount_MB          262144                                        262144
NETWORK   interface_name     bond0                                         bond0
          vendor,model       Intel, X710 Dual Port                         Intel, X710 Dual Port
          interfaces_count   2                                             2
          bandwidth          10G                                           10G
STORAGE   dev_name           /dev/sda                                      /dev/sda
          vendor,model       raid1 PERC H730P Mini, 2 disks Intel S3610    raid1 MegaRAID 3108, 2 disks Intel S3610
          SSD/HDD            SSD                                           SSD
          size               800GB                                         800GB

3.1.2. Network scheme and part of configuration of hardware network switches

Network scheme of the environment:

[Figure: Network scheme of the environment]

Below is the relevant part of the switch configuration for each switch port that belongs to a server's bond0 interface:

show run int et1
interface Ethernet1
   description - r02r13c33
   switchport trunk native vlan 4
   switchport trunk allowed vlan 4
   switchport mode trunk
   channel-group 133 mode active
   lacp port-priority 16384
   spanning-tree portfast

show run int po1
interface Port-Channel1
   description osscr02r13c21
   switchport trunk native vlan 131
   switchport trunk allowed vlan 130-159
   switchport mode trunk
   port-channel lacp fallback static
   port-channel lacp fallback timeout 30
   mlag 1

3.1.3. Software configuration of Kargo

3.1.3.1. Setting up Kargo:

Kargo was installed on bare-metal Ubuntu Xenial servers. Kargo requires a dedicated non-root user to exist on the target nodes. The script from the Launcher script section was used to configure and launch Kargo.

Versions of some software

Software             Version
Ubuntu               Ubuntu 16.04.1 LTS
fuel-ccp-installer   6b26170f70e523fb04bda8d6f15077d461fba9de
kargo                016b7893c64fede07269c01cac31e96c8ee0d257

3.1.3.2. Test tool:

We used the Dstat utility as the main tool for collecting timing and system performance data during the tests. The Script for parsing collected metrics was used to process the performance metrics after the installation tests.

3.1.3.3. Operating system configuration:

The contents of the /etc directory from one of the target servers where the K8S cluster was deployed are available here: etc_tarball_of_node1

3.2. Testing process

  1. The Kargo launcher script was set up and executed on the node1 server as described in the Setting up Kargo: section.

  2. During the Kargo run, the dstat tool was launched on node1 with the following options:

root@node1:~# dstat --nocolor --time --cpu --mem --net -N bond0 --io --output /root/dstat.csv
  3. After the Kargo run finished, the resulting “dstat.csv” files were parsed with the Script for parsing collected metrics.
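Steps 1 and 2 can be combined into one wrapper that starts the collector in the background, runs the deployment, and stops the collector afterwards. The following is a minimal sketch, not the procedure actually used in the tests; the function name run_with_collector and the file name ./launcher.sh are hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command while a metrics collector runs in the background.
# $1: collector command line (a single string); remaining arguments: the command to measure.
run_with_collector() {
    local collector="$1"; shift
    local rc=0
    # "exec" makes the helper shell become the collector, so $! is the collector's own PID.
    bash -c "exec $collector" >/dev/null 2>&1 &
    local collector_pid=$!
    # Run the measured command and remember its exit status.
    "$@" || rc=$?
    # Stop the collector and reap it quietly.
    kill "$collector_pid" 2>/dev/null || true
    wait "$collector_pid" 2>/dev/null || true
    return "$rc"
}

# Intended use on node1 (assumes dstat is installed; ./launcher.sh is a placeholder name):
# run_with_collector \
#     "dstat --nocolor --time --cpu --mem --net -N bond0 --io --output /root/dstat.csv" \
#     bash -xe ./launcher.sh

# Harmless demonstration with stand-in commands:
run_with_collector "sleep 30" true && echo "measured command finished"
```

Returning the exit status of the measured command lets a calling script tell a failed deployment apart from a completed one even though the collector is always terminated.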

The above steps were repeated with the following numbers of nodes: 50, 150, and 350.

This produced the following CSV files:

METRICS(NUMBER_OF_NODES=50)

METRICS(NUMBER_OF_NODES=150)

METRICS(NUMBER_OF_NODES=350)

3.3. Results

After simple processing of the results, the following plots were created. They show the performance metrics collected during provisioning of the nodes as a function of time:

Number of nodes   Plot CPU(TIME)                             Plot RAM(TIME)
50                CPU_USAGE(TIME, NODES=50)                  RAM_USAGE(TIME, NODES=50)
150               ../../../../_images/150_nodes_-_CPU.png    ../../../../_images/150_nodes_-_RAM.png
350               ../../../../_images/350_nodes_-_CPU.png    ../../../../_images/350_nodes_-_RAM.png

Number of nodes   Plot NET(TIME)                             Plot DISK(TIME)
50                NET_USAGE(TIME, NODES=50)                  DISK_USAGE(TIME, NODES=50)
150               ../../../../_images/150_nodes_-_net.png    ../../../../_images/150_nodes_-_disk.png
350               ../../../../_images/350_nodes_-_net.png    ../../../../_images/350_nodes_-_disk.png

The following table shows how the performance metrics and the deployment time depend on the number of nodes.

number of nodes               50            150            350
deployment time               2049.00       3922.00        13065.00
cpu_usage_max                 99.0210       99.56          99.06
cpu_usage_min                 0             0              0
cpu_usage_average             7.2920        10.03          12.63
cpu_usage_percentile 90%      19.6495       24.92          29.12
ram_usage_max                 4466.10       13859.56       112079.57
ram_usage_min                 1061.51       1033.32        1075.16
ram_usage_average             2121.20       4335.69        31288.94
ram_usage_percentile 90%      2876.33       8570.32        79915.96
net_all_max                   3864760.75    20996615.75    60130883.88
net_all_min                   0             0              0
net_all_average               70602.55      102913.32      177943.40
net_all_percentile 90%        253590.90     263933.25      180409.81
dsk_io_all_max                3503          3196           3470
dsk_io_all_min                0             0              0
dsk_io_all_average            26            37             56
dsk_io_all_percentile 90%     58            14             8
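To compare the runs, the deployment time can be divided by the number of nodes. The sketch below does this for the three values in the table; the helper name per_node_time is hypothetical, and the result is in the same time unit as the table, which the report does not state.

```shell
#!/bin/bash
# Hypothetical helper: deployment time divided by the number of nodes.
# $1: number of nodes, $2: deployment time taken from the table above.
per_node_time() {
    LC_ALL=C awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", t / n }'
}

per_node_time 50 2049.00     # prints 40.98
per_node_time 150 3922.00    # prints 26.15
per_node_time 350 13065.00   # prints 37.33
```

The per-node figure drops between 50 and 150 nodes and rises again at 350 nodes, so total deployment time does not grow in simple proportion to the node count.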

3.4. Issues that have been found during the tests

During testing we found several issues that prevented us from obtaining test results at scale:

Issue                                     Link
etcd list sometimes hangs                 https://github.com/kubespray/kargo/pull/448
K8S DNS services not working correctly    https://github.com/kubespray/kargo/pull/458
Calico creates extra pool during run      https://github.com/kubespray/kargo/pull/462
Timeout to quay.io to fetch etcd image    https://github.com/kubespray/kargo/pull/481
Downloading images doesn’t scale well     https://github.com/kubespray/kargo/pull/488
Kargo is too slow on scale                https://github.com/kubespray/kargo/issues/478

3.5. Applications

3.5.1. Launcher script

#!/bin/bash -xe

if [[ -d ./fuel-ccp-installer ]] ; then
    rm -rf ./fuel-ccp-installer
fi

git clone https://review.openstack.org/openstack/fuel-ccp-installer

export ENV_NAME="kargo-test"
export DEPLOY_METHOD="kargo"
# Use $HOME here: a "~" inside quotes is not expanded by the shell.
export WORKSPACE="${HOME}/workspace"
export ADMIN_USER="vagrant"
export ADMIN_PASSWORD="kargo"

# for 50 nodes
#export SLAVES_COUNT=50
#export ADMIN_IP="10.3.58.66"
#export SLAVE_IPS="10.3.58.66 10.3.58.33 10.3.58.30 10.3.58.27 10.3.58.32 10.3.58.28 10.3.58.34 10.3.58.35 10.3.58.29 10.3.58.31 10.3.58.51 10.3.58.41 10.3.58.43 10.3.58.53 10.3.58.45 10.3.58.54 10.3.58.55 10.3.58.38 10.3.58.40 10.3.58.48 10.3.58.42 10.3.58.46 10.3.58.36 10.3.58.37 10.3.58.52 10.3.58.50 10.3.58.39 10.3.58.10 10.3.58.58 10.3.58.7 10.3.57.254 10.3.58.4 10.3.57.255 10.3.58.1 10.3.58.3 10.3.58.57 10.3.58.23 10.3.58.13 10.3.58.12 10.3.58.21 10.3.58.5 10.3.58.22 10.3.58.9 10.3.58.24 10.3.58.15 10.3.58.19 10.3.58.16 10.3.56.6 10.3.56.7 10.3.56.83"

# for 150 nodes:
#export SLAVES_COUNT=150
#export ADMIN_IP="10.3.56.3"
#export SLAVE_IPS="10.3.56.3 10.3.56.254 10.3.56.4 10.3.56.6 10.3.56.7 10.3.56.83 10.3.56.82 10.3.56.84 10.3.56.86 10.3.56.87 10.3.56.89 10.3.56.12 10.3.56.11 10.3.56.13 10.3.56.15 10.3.56.16 10.3.56.17 10.3.56.18 10.3.56.20 10.3.56.21 10.3.56.22 10.3.56.23 10.3.56.25 10.3.56.26 10.3.56.27 10.3.56.29 10.3.56.30 10.3.56.31 10.3.56.32 10.3.56.34 10.3.56.33 10.3.56.37 10.3.56.38 10.3.56.39 10.3.56.41 10.3.56.43 10.3.56.45 10.3.56.46 10.3.56.47 10.3.56.48 10.3.56.50 10.3.56.49 10.3.56.51 10.3.56.52 10.3.56.133 10.3.56.135 10.3.56.137 10.3.56.136 10.3.56.113 10.3.56.139 10.3.56.141 10.3.56.148 10.3.56.142 10.3.56.117 10.3.56.143 10.3.56.145 10.3.56.123 10.3.56.122 10.3.56.128 10.3.56.144 10.3.56.250 10.3.56.251 10.3.56.126 10.3.56.180 10.3.56.181 10.3.56.184 10.3.56.182 10.3.56.185 10.3.56.183 10.3.56.188 10.3.56.191 10.3.56.192 10.3.56.187 10.3.56.195 10.3.56.190 10.3.56.199 10.3.56.193 10.3.56.204 10.3.56.207 10.3.56.205 10.3.56.206 10.3.56.201 10.3.56.202 10.3.56.208 10.3.56.217 10.3.56.216 10.3.56.209 10.3.56.210 10.3.56.215 10.3.56.218 10.3.56.212 10.3.56.213 10.3.56.214 10.3.56.211 10.3.56.221 10.3.56.224 10.3.56.227 10.3.56.149 10.3.56.219 10.3.56.223 10.3.56.231 10.3.56.228 10.3.56.235 10.3.56.236 10.3.56.230 10.3.56.233 10.3.56.229 10.3.56.232 10.3.56.234 10.3.59.95 10.3.59.92 10.3.59.88 10.3.59.96 10.3.59.111 10.3.59.115 10.3.59.116 10.3.56.146 10.3.59.119 10.3.59.117 10.3.59.112 10.3.59.110 10.3.59.109 10.3.59.120 10.3.59.137 10.3.59.136 10.3.59.133 10.3.59.132 10.3.59.138 10.3.59.134 10.3.59.135 10.3.59.139 10.3.59.131 10.3.59.130 10.3.59.74 10.3.59.80 10.3.59.73 10.3.59.77 10.3.59.84 10.3.59.105 10.3.59.82 10.3.59.83 10.3.59.81 10.3.59.98 10.3.59.108 10.3.59.106 10.3.59.102 10.3.59.107 10.3.59.86 10.3.58.66 10.3.58.33"

# for 350 nodes:
#export SLAVES_COUNT=350
#export ADMIN_IP="10.3.56.3"
#export SLAVE_IPS="10.3.56.3 10.3.56.254 10.3.56.4 10.3.56.6 10.3.56.7 10.3.56.83 10.3.56.82 10.3.56.84 10.3.56.86 10.3.56.87 10.3.56.89 10.3.56.12 10.3.56.11 10.3.56.13 10.3.56.15 10.3.56.16 10.3.56.17 10.3.56.18 10.3.56.20 10.3.56.21 10.3.56.22 10.3.56.23 10.3.56.25 10.3.56.26 10.3.56.27 10.3.56.29 10.3.56.30 10.3.56.31 10.3.56.32 10.3.56.34 10.3.56.33 10.3.56.37 10.3.56.38 10.3.56.39 10.3.56.41 10.3.56.43 10.3.56.45 10.3.56.46 10.3.56.47 10.3.56.48 10.3.56.50 10.3.56.49 10.3.56.51 10.3.56.52 10.3.56.133 10.3.56.135 10.3.56.137 10.3.56.136 10.3.56.113 10.3.56.139 10.3.56.141 10.3.56.148 10.3.56.142 10.3.56.117 10.3.56.143 10.3.56.145 10.3.56.123 10.3.56.122 10.3.56.128 10.3.56.144 10.3.56.250 10.3.56.251 10.3.56.126 10.3.56.180 10.3.56.181 10.3.56.184 10.3.56.182 10.3.56.185 10.3.56.183 10.3.56.188 10.3.56.191 10.3.56.192 10.3.56.187 10.3.56.195 10.3.56.190 10.3.56.199 10.3.56.193 10.3.56.204 10.3.56.207 10.3.56.205 10.3.56.206 10.3.56.201 10.3.56.202 10.3.56.208 10.3.56.217 10.3.56.216 10.3.56.209 10.3.56.210 10.3.56.215 10.3.56.218 10.3.56.212 10.3.56.213 10.3.56.214 10.3.56.211 10.3.56.221 10.3.56.224 10.3.56.227 10.3.56.149 10.3.56.219 10.3.56.223 10.3.56.231 10.3.56.228 10.3.56.235 10.3.56.236 10.3.56.230 10.3.56.233 10.3.56.229 10.3.56.232 10.3.56.234 10.3.59.95 10.3.59.92 10.3.59.88 10.3.59.96 10.3.59.111 10.3.59.115 10.3.59.116 10.3.56.146 10.3.59.119 10.3.59.117 10.3.59.112 10.3.59.110 10.3.59.109 10.3.59.120 10.3.59.137 10.3.59.136 10.3.59.133 10.3.59.132 10.3.59.138 10.3.59.134 10.3.59.135 10.3.59.139 10.3.59.131 10.3.59.130 10.3.59.74 10.3.59.80 10.3.59.73 10.3.59.77 10.3.59.84 10.3.59.105 10.3.59.82 10.3.59.83 10.3.59.81 10.3.59.98 10.3.59.108 10.3.59.106 10.3.59.102 10.3.59.107 10.3.59.86 10.3.59.93 10.3.59.100 10.3.59.87 10.3.59.99 10.3.59.97 10.3.59.89 10.3.59.46 10.3.59.35 10.3.59.40 10.3.59.47 10.3.59.55 10.3.59.51 10.3.59.48 10.3.59.63 10.3.59.56 10.3.59.68 10.3.59.32 10.3.59.43 10.3.59.36 10.3.59.54 10.3.59.53 10.3.59.71 10.3.59.57 10.3.59.62 10.3.59.69 10.3.59.65 10.3.59.70 10.3.59.72 10.3.59.66 10.3.59.76 10.3.59.75 10.3.59.79 10.3.59.78 10.3.59.64 10.3.59.25 10.3.59.22 10.3.59.16 10.3.59.24 10.3.59.15 10.3.59.11 10.3.59.10 10.3.58.241 10.3.59.12 10.3.59.42 10.3.59.31 10.3.59.28 10.3.59.34 10.3.59.37 10.3.59.27 10.3.59.30 10.3.59.29 10.3.59.58 10.3.59.52 10.3.59.38 10.3.59.61 10.3.59.59 10.3.59.49 10.3.59.39 10.3.58.176 10.3.58.178 10.3.58.251 10.3.58.179 10.3.58.188 10.3.58.184 10.3.58.181 10.3.58.194 10.3.58.196 10.3.58.205 10.3.58.201 10.3.58.192 10.3.58.197 10.3.58.193 10.3.58.254 10.3.58.186 10.3.58.180 10.3.58.198 10.3.58.252 10.3.58.189 10.3.58.253 10.3.58.195 10.3.58.200 10.3.58.210 10.3.58.183 10.3.58.199 10.3.58.182 10.3.58.208 10.3.58.209 10.3.58.100 10.3.58.127 10.3.58.146 10.3.58.136 10.3.58.118 10.3.58.132 10.3.58.142 10.3.58.131 10.3.58.144 10.3.58.121 10.3.58.123 10.3.58.134 10.3.58.120 10.3.58.129 10.3.58.135 10.3.58.137 10.3.58.117 10.3.58.125 10.3.58.155 10.3.58.162 10.3.58.154 10.3.58.153 10.3.58.148 10.3.58.159 10.3.58.171 10.3.58.167 10.3.58.166 10.3.58.165 10.3.58.164 10.3.58.156 10.3.58.147 10.3.58.170 10.3.58.149 10.3.58.168 10.3.58.160 10.3.58.172 10.3.58.157 10.3.58.71 10.3.58.59 10.3.58.70 10.3.58.67 10.3.58.69 10.3.58.79 10.3.58.64 10.3.58.73 10.3.58.77 10.3.58.65 10.3.58.86 10.3.58.63 10.3.58.80 10.3.58.75 10.3.58.62 10.3.58.84 10.3.58.74 10.3.58.76 10.3.58.85 10.3.58.78 10.3.58.60 10.3.58.72 10.3.58.81 10.3.58.61 10.3.58.82 10.3.58.87 10.3.58.66 10.3.58.33 10.3.58.30 10.3.58.27 10.3.58.32 10.3.58.28 10.3.58.34 10.3.58.35 10.3.58.29 10.3.58.31 10.3.58.51 10.3.58.41 10.3.58.43 10.3.58.53 10.3.58.45 10.3.58.54 10.3.58.55 10.3.58.38 10.3.58.40 10.3.58.48 10.3.58.42 10.3.58.46 10.3.58.36 10.3.58.37 10.3.58.52 10.3.58.50 10.3.58.39 10.3.58.10 10.3.58.58 10.3.58.7 10.3.57.254 10.3.58.4 10.3.57.255 10.3.58.1 10.3.58.3 10.3.58.57 10.3.58.23 10.3.58.13 10.3.58.12 10.3.58.21 10.3.58.5 10.3.58.22 10.3.58.9 10.3.58.24 10.3.58.15 10.3.58.19 10.3.58.16"

export CUSTOM_YAML='docker_version: 1.12
hyperkube_image_repo: "quay.io/coreos/hyperkube"
hyperkube_image_tag: "v1.3.5_coreos.0"
etcd_image_repo: "quay.io/coreos/etcd"
etcd_image_tag: "v3.0.1"
calicoctl_image_repo: "calico/ctl"
#calico_node_image_repo: "calico/node"
calico_node_image_repo: "l23network/node"
calico_node_image_tag: "v0.20.0"
calicoctl_image_tag: "v0.20.0"
kube_apiserver_insecure_bind_address: "0.0.0.0"'

mkdir -p "$WORKSPACE"
echo "Running on $NODE_NAME: $ENV_NAME"
cd ./fuel-ccp-installer

bash -xe "./utils/jenkins/run_k8s_deploy_test.sh"

3.5.2. Script for parsing collected metrics

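#!/bin/bash -e

# Both arguments are required: $1 = kargo_env_name, $2 = csv file path.
if [[ -z "$1" ]] || [[ -z "$2" ]] ; then
    echo "\$1 = kargo_env_name, \$2 = csv file path"
    exit 1
fi

# Use $HOME here: a "~" inside quotes is not expanded by the shell.
WORKDIR="${HOME}/worked_up_results/"
cur_dir="${WORKDIR}kargo_${1}"
csv_name=$(basename "$2")
mkdir -p "$cur_dir"

# Skip the 7 header lines of the dstat CSV, print a new header, then convert each sample.
# Column positions assume the dstat options shown in the Testing process section:
# $4 = CPU idle, $8 = used memory, $12/$13 = net recv/send, $14/$15 = I/O read/write.
awk -F "," 'BEGIN {getline;getline;getline;getline;getline;getline;getline;
    print "time,cpu_usage,ram_usage,net_recv,net_send,net_all,dsk_io_read,dsk_io_writ,dsk_all"}
    {printf "%s,%0.3f,%0.3f,%0.3f,%0.3f,%0.3f,%d,%d,%d\n", $1,100-$4,$8/1048576,$12/8,$13/8,($12+$13)/8,$14,$15,$14+$15 }' "$2" > "$cur_dir/${csv_name}"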
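The max, min, average and 90th percentile rows of the results table can be derived from a column of the parsed CSV. The report does not say how these statistics were actually computed, so the following is only a sketch: the function name column_stats is hypothetical, the percentile uses the nearest-rank method as an assumption, and the sample data is made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical helper: print max, min, average and 90th percentile of one CSV column.
# $1: parsed CSV file (first line is a header), $2: 1-based column index.
column_stats() {
    local file="$1" col="$2"
    # Drop the header, extract the column, sort numerically, then aggregate in awk.
    tail -n +2 "$file" | cut -d, -f"$col" | LC_ALL=C sort -n | LC_ALL=C awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit 1
            # Nearest-rank percentile (an assumption): rank = ceil(0.9 * NR).
            rank = int((NR * 90 + 99) / 100)
            printf "max=%.2f min=%.2f average=%.2f p90=%.2f\n", v[NR], v[1], sum / NR, v[rank]
        }'
}

# Demonstration on made-up data: column 2 plays the role of cpu_usage.
sample="$(mktemp)"
{
    echo "time,cpu_usage"
    for i in 1 2 3 4 5 6 7 8 9 10; do echo "t${i},${i}"; done
} > "$sample"
column_stats "$sample" 2    # prints max=10.00 min=1.00 average=5.50 p90=9.00
rm -f "$sample"
```

In the CSV written by the parsing script, column 2 is cpu_usage, column 3 is ram_usage, column 6 is net_all and column 9 is dsk_all.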