-------------------------------------------------------------------
Tue Feb 13 16:10:46 UTC 2018 - containers-bugowner@suse.de

- Commit 45217ed by Alvaro Saurin alvaro.saurin@gmail.com
  Make sure we do not crash on pillars that are not properly
  formatted.

-------------------------------------------------------------------
Tue Feb 13 09:34:32 UTC 2018 - containers-bugowner@suse.de

- Commit 413c5fa by Rafael Fernández López ereslibre@ereslibre.es
  When executing a highstate of `apiserver` make sure that we check
  the local `apiserver` instance

  When executing the highstate make sure the `apiserver` we are
  checking is the local one, not *any* master through haproxy.

  Make haproxy more reliable.
  - Let it redispatch requests.
  - Really restart the service when the config changes.
  - Apply configuration before highstates with a small batch, so we
    control the restarts.

  Wait for the apiserver to be up and responding behind HAProxy

  Remove `addons` and `dex` as a trait of the `kube-master` role.
  Instead, deploy them as part of the orchestration.

  Fixes: bsc#1079460

-------------------------------------------------------------------
Thu Feb 8 13:08:40 UTC 2018 - containers-bugowner@suse.de

- Commit 25c660f by Flavio Castelli fcastelli@suse.com
  Mark the haproxy as critical pod

  Flag the haproxy pods providing connectivity to the API server as
  critical ones. This should force kubelet and the scheduler to never
  ever get rid of them. If these pods are killed to make more space
  for other ones, the node would not be able to talk with the API
  server making it useless.
  More details inside upstream doc:
  https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/

  Signed-off-by: Flavio Castelli

-------------------------------------------------------------------
Wed Jan 31 15:49:53 UTC 2018 - containers-bugowner@suse.de

- Commit db59d4c by Rafael Fernández López ereslibre@ereslibre.es
  Bump dex version

  (cherry picked from commit 1863c0678ef4d554296a361a70aa18c202fc7ca4)

-------------------------------------------------------------------
Thu Jan 25 14:08:27 UTC 2018 - containers-bugowner@suse.de

- Commit f722758 by Federico Ceratto federico.ceratto@suse.de
  Add swap disabling

  Backport of 9e358bbbbf154f700728fb9b9037558dfd187700

  Fixes: bsc#1075001

-------------------------------------------------------------------
Thu Jan 25 09:47:45 UTC 2018 - containers-bugowner@suse.de

- Commit d484a1e by Rafael Fernández López ereslibre@ereslibre.es
  Catch any exceptions on `caasp_retriable` state, and treat it as a
  regular failure.

  Explicitly wait for `caasp_retriable` states.

  Fixes: bsc#1070989

-------------------------------------------------------------------
Thu Jan 25 08:59:23 UTC 2018 - containers-bugowner@suse.de

- Commit 7f14596 by Rafael Fernández López ereslibre@ereslibre.es
  Early mark nodes requiring update reboot as update in progress.

  This will allow us to reduce the timeframe in which the
  update-etc-hosts orchestration can pop up, eventually running
  states on minions effectively taking their lock and making this
  orchestration fail.

  We don't want the update-etc-hosts orchestration to interfere with
  the main update orchestration. We'll release minion per minion
  grain when they are done, but let's block all of them at the very
  beginning.
  Fixes: bsc#1077086

-------------------------------------------------------------------
Wed Jan 24 15:17:33 UTC 2018 - containers-bugowner@suse.de

- Commit 5ca372a by Rafael Fernández López ereslibre@ereslibre.es
  Retry certificate generation

  This will make the certificate request to the CA more resilient to
  transient errors, in case of overload or any other reasons that
  make the CA slow when creating new requested certificates.

  Fixes: bsc#1070989

-------------------------------------------------------------------
Thu Jan 18 16:25:58 UTC 2018 - containers-bugowner@suse.de

- Commit d3ec313 by Maximilian Meister mmeister@suse.de
  Configure docker via config file, not args

  docker can be configured via /etc/docker/daemon.json

  registries can be configured there too, but need to be in their
  own dedicated pillar as we need to map certificates to the
  registry names

  Signed-off-by: Maximilian Meister

  Commit 40fff30 by Alvaro Saurin alvaro.saurin@gmail.com
  Additional certificates for repositories for Docker (bsc#1039877)

-------------------------------------------------------------------
Tue Jan 9 11:02:09 UTC 2018 - containers-bugowner@suse.de

- Commit cc10959 by Rafael Fernández López ereslibre@ereslibre.es
  Only uncordon nodes that were cordoned because of our own
  processes

  Fix kubelet highstate to uncordon the node only if we did cordon
  it by one of our processes (like an update).

  Without this patch, adding new nodes or performing an update would
  uncordon all nodes unconditionally, without taking into account if
  a user had a node cordoned for some reason (e.g. hardware failures
  or other reasons). Do not uncordon those nodes, keep them
  cordoned.
  Fixes: bsc#1050017

-------------------------------------------------------------------
Mon Dec 18 17:25:29 UTC 2017 - containers-bugowner@suse.de

- Commit 24ca219 by Alvaro Saurin alvaro.saurin@gmail.com
  Do not try to create clusterrolebindings when they are already
  there

  bsc#1072005

-------------------------------------------------------------------
Mon Dec 18 15:29:04 UTC 2017 - containers-bugowner@suse.de

- Commit b7a054b by Rafael Fernández López ereslibre@ereslibre.es
  Add beacon to notify network changes only on the default network
  interface

  Fixes: bsc#1063709

-------------------------------------------------------------------
Wed Dec 13 12:32:39 UTC 2017 - containers-bugowner@suse.de

- Commit ee16ef7 by Alvaro Saurin alvaro.saurin@gmail.com
  In the certs macros, do not assume "names" are always names and
  "ips" are always IPs: just filter with the "is_ip" filter. Minor
  shortcuts in the arguments.

  Fixes: bsc#1069205

  Backport to 2.0 of https://github.com/kubic-project/salt/pull/311

-------------------------------------------------------------------
Wed Nov 29 11:19:49 UTC 2017 - containers-bugowner@suse.de

- Commit 17a1f5b by Rafael Fernández López ereslibre@ereslibre.es
  Include `Internal Dashboard FQDN/IP` value in the LDAP certificate

  Since Dex will connect to LDAP using this FQDN/IP, make sure that
  the TLS handshake will succeed by regenerating the certificate
  early in the orchestration, so it includes this FQDN/IP in the SAN
  extensions of the LDAP certificate.

  Fixes: bsc#1069175

-------------------------------------------------------------------
Mon Nov 27 16:59:22 UTC 2017 - containers-bugowner@suse.de

- Commit 909edea by Alvaro Saurin alvaro.saurin@gmail.com
  Use some Jinja macros for getting the default interface's IP.
  (bsc#1058079)

  Get rid of our custom grain.
-------------------------------------------------------------------
Wed Nov 22 15:42:47 UTC 2017 - containers-bugowner@suse.de

- Commit 1031b83 by Maximilian Meister mmeister@suse.de
  only set service entries for localhost on kube-master

  also explain in a comment why we need to set the apiserver for
  127.0.0.1 on all hosts (bsc#1067219)

  Signed-off-by: Maximilian Meister

  (cherry picked from commit 610f65e57c7daa96d9f7ad51addcb0e919fba837)

-------------------------------------------------------------------
Fri Nov 10 15:05:05 UTC 2017 - containers-bugowner@suse.de

- Commit ae0f1dc by Rafael Fernández López ereslibre@ereslibre.es
  Disable container-feeder before rebooting.

  This will allow us to control when container-feeder starts to load
  new images from the filesystem. Due to some possible docker
  configuration changes it might be restarted while container-feeder
  is working (if we keep it enabled). Force to disable the service
  before rebooting.

  Fixes: bsc#1066653

-------------------------------------------------------------------
Mon Oct 30 10:25:53 UTC 2017 - containers-bugowner@suse.de

- Commit 1e6320d by Alvaro Saurin alvaro.saurin@gmail.com
  Increase worker threads and backlog length (bsc#1065018)

-------------------------------------------------------------------
Mon Oct 30 09:43:13 UTC 2017 - containers-bugowner@suse.de

- Commit 8a7ba88 by Flavio Castelli fcastelli@suse.com
  Retry all iptables states

  Retry all iptables states to prevent failures like seen with
  bsc#1064186.

  Signed-off-by: Flavio Castelli

  Commit 152afbf by Flavio Castelli fcastelli@suse.com
  Introduce caasp_retriable

  Provide a generic way to retry any kind of salt state.
  Signed-off-by: Flavio Castelli

-------------------------------------------------------------------
Wed Oct 25 14:43:58 UTC 2017 - containers-bugowner@suse.de

- Commit 3a04f89 by Michal Jura mjura@suse.com
  Synchronize /etc/hosts on velum after container restarts,
  bsc#1062728

-------------------------------------------------------------------
Fri Oct 20 10:26:29 UTC 2017 - containers-bugowner@suse.de

- Commit 74d9cb6 by Kiall Mac Innes kiall@macinnes.ie
  Correctly handle FQDN `dashboard` values in Velum cert

  Ensure we correctly handle FQDN values for the `dashboard` pillar
  when generating the Velum TLS certificate.

  Fixes bsc#1064284

-------------------------------------------------------------------
Wed Oct 18 13:20:52 UTC 2017 - containers-bugowner@suse.de

- Commit b9281e0 by Kiall Mac Innes kiall@macinnes.ie
  Manage the Velum TLS cert

  This ensures that the dashboard_external_fqdn is registered within
  the velum TLS certificate.

  bsc#1063998

-------------------------------------------------------------------
Wed Oct 18 10:33:50 UTC 2017 - containers-bugowner@suse.de

- Commit 8029758 by Michal Jura mjura@suse.com
  Add floating network to cloud-provider integration with OpenStack

  We would like to add a new pillar value, floating, which will be
  used to configure the floating network for cloud provider
  integration with OpenStack. If this option is specified, it will
  create a floating IP for the load balancer automatically.

  (cherry picked from commit 497891de69f904f9de8cb87e7bb186926af990ce)

-------------------------------------------------------------------
Tue Oct 17 18:15:27 UTC 2017 - containers-bugowner@suse.de

- Commit b82cc56 by Michal Jura mjura@suse.com
  Keep updated /etc/hosts on velum-dashboard container, bsc#1062728

  We would like to keep the /etc/hosts file updated for
  velum-dashboard with the Admin host. Velum needs to know the
  external name of the Kube API, which will be used to register in
  the Dex service.
  The problem was discovered and described in bug 1062728

-------------------------------------------------------------------
Tue Oct 17 18:09:32 UTC 2017 - containers-bugowner@suse.de

- Commit 44cc863 by Michal Jura mjura@suse.com
  Allow to deploy haproxy pod on Admin node, bsc#1062728

  To get the kubeconfig, Velum connects to the Dex service on the
  kube-master nodes. For a multimaster configuration it needs
  haproxy installed because all Kube-API domains will be bound to
  localhost 127.0.0.1

  (cherry picked from commit 7ef721b6bc61a5c6c20e1a027ef18aa2507ffc16)

-------------------------------------------------------------------
Tue Oct 17 13:33:23 UTC 2017 - containers-bugowner@suse.de

- Commit 91a72c2 by Kiall Mac Innes kiall@macinnes.ie
  Revert K8S to use etcd2 storage format

  With etcd3, the kubernetes api server will sit in a (slow) restart
  loop when multimaster is enabled, logging a stacktrace and then
  restarting. This will manifest as, most commonly, "Unable to
  connect to the server: unexpected EOF" from kubectl.

  This will break bootstrap as we need to talk to K8S API to deploy
  dex, kube-dns, and tiller.

  bsc#1063235 bsc#1063285 bsc#1063543

-------------------------------------------------------------------
Tue Oct 17 05:28:24 UTC 2017 - containers-bugowner@suse.de

- Commit 16112bd by Kiall Mac Innes kiall@macinnes.ie
  Revert K8S to a storage media type of json

  With the default value (protobuf), the kubernetes api server will
  sit in a (slow) restart loop when multimaster is enabled, logging
  a stacktrace and then restarting. This will manifest as, most
  commonly, "Unable to connect to the server: unexpected EOF" from
  kubectl.

  This will break bootstrap as we need to talk to K8S API to deploy
  dex, kube-dns, and tiller.
  bsc#1063235 bsc#1063285 bsc#1063543

-------------------------------------------------------------------
Wed Oct 11 16:12:57 UTC 2017 - containers-bugowner@suse.de

- Commit 00261cc by Rafael Fernández López ereslibre@ereslibre.es
  Fix missing requirement during the upgrade process.

  Fixes: bsc#1062824

-------------------------------------------------------------------
Wed Oct 11 15:49:41 UTC 2017 - containers-bugowner@suse.de

- Commit beabee8 by Kiall Mac Innes kiall@macinnes.ie
  Allow Dex to redirect to the Dashboard's external FQDN

  Some scenarios where the admin node's private IP is not accessible
  to the outside world require that we use an end-user-provided
  FQDN, as is the case on OpenStack and possibly other cloud
  environments. Allow redirections to this FQDN.

  Part of bsc#1062291

-------------------------------------------------------------------
Mon Oct 9 17:25:51 UTC 2017 - containers-bugowner@suse.de

- Commit 75e85a0 by Nikhil Manchanda SlickNik@gmail.com
  Update tiller deployment to use sles-based docker image

  Currently the tiller image being used for the tiller deployment is
  from the upstream registry at gcr.io. We should be using the SLES
  based docker image instead of the upstream one.

  Fixes: bsc#1062380

-------------------------------------------------------------------
Sat Oct 7 08:46:43 UTC 2017 - containers-bugowner@suse.de

- Commit bbdd7a3 by Kiall Mac Innes kiall@macinnes.ie
  Update VERSION file to 2.0.0

-------------------------------------------------------------------
Fri Oct 6 15:29:20 UTC 2017 - containers-bugowner@suse.de

- Commit b4e9d6d by Rafael Fernández López ereslibre@ereslibre.es
  Set frontend settings: `dir` and `theme`.
-------------------------------------------------------------------
Fri Oct 6 15:09:23 UTC 2017 - containers-bugowner@suse.de

- Commit bf67143 by Michal Jura mjura@suse.com
  Remove duplicated storage-backend option for Kubernetes API,
  bsc#1061810

  The storage-backend option is provided twice in the Kubernetes API
  configuration. We have to keep only one option, with the value
  provided from the pillar.

  (cherry picked from commit c4b42e689a0044a73715ff9e3619f709b3bca982)

-------------------------------------------------------------------
Fri Oct 6 14:42:40 UTC 2017 - containers-bugowner@suse.de

- Commit 496a16f by Kiall Mac Innes kiall@macinnes.ie
  Dex: Wait for Dex to be fully up and running

  We shouldn't allow a bootstrap to complete without Dex being up
  and running, so let's wait for the Dex API to start responding.

-------------------------------------------------------------------
Fri Oct 6 10:37:34 UTC 2017 - containers-bugowner@suse.de

- Commit 1585c87 by Robert Roland robert.roland@suse.com
  Add a URL off Velum as a valid OIDC redirect URI

  This will make it so that Dex will be happy to redirect you to
  velum

-------------------------------------------------------------------
Tue Sep 26 17:28:06 UTC 2017 - jmassaguerpla@suse.com

- Branch project to release-2.0 branch. This means the tarball is
  no longer called master.tar.gz but release-2.0.tar.gz

-------------------------------------------------------------------
Thu Sep 21 13:56:59 UTC 2017 - containers-bugowner@suse.de

- Commit 50f84f4 by Rafael Fernández López ereslibre@ereslibre.es
  Add `caasp_service.running_stable`

  This new state will allow us to make sure that a service is
  running in a stable manner. It also waits in case systemd retries
  in the background, which avoids the instant failure that salt
  reports with a regular `service.running`.
  Fixes: bsc#1059105

-------------------------------------------------------------------
Thu Sep 21 13:13:22 UTC 2017 - containers-bugowner@suse.de

- Commit 408ab7a by Kiall Mac Innes kiall@macinnes.ie
  Allow custom options to be passed to the Salt Master

  Rename the salt master configurations, so that custom options can
  be loaded after the stock options, allowing an override.

  bsc#1059724

-------------------------------------------------------------------
Thu Sep 21 10:14:09 UTC 2017 - containers-bugowner@suse.de

- Commit 60e6a69 by Alvaro Saurin alvaro.saurin@gmail.com
  Do not access infra machines through the proxy (bsc#1053739)

-------------------------------------------------------------------
Thu Sep 21 09:57:03 UTC 2017 - containers-bugowner@suse.de

- Commit f730743 by Kiall Mac Innes kiall@macinnes.ie
  Ensure cluster-service labels are consistent

  These were inconsistent, with some services using the labels, and
  others not. Within services, some of the resources the label
  should be applied to did not have it, even though other parts of
  the same service did have the label applied.

  Commit 6520870 by Kiall Mac Innes kiall@macinnes.ie
  Add CriticalAddonsOnly tolerations

  Add CriticalAddonsOnly toleration to dex/kube-dns/tiller, this
  syncs them with upstream, and allows for masters to be flagged as
  suitable for running these critical containers if desired.

  Commit 6cde454 by Kiall Mac Innes kiall@macinnes.ie
  Remove Kube addonmanager references

  As Kubernetes addonmanager is not used to deploy these, we should
  not apply the addonmanager labels. Should an end user deploy kube
  addonmanager, it will believe these pods are under its control and
  potentially remove or change them.
  bsc#1059516

-------------------------------------------------------------------
Thu Sep 21 09:12:20 UTC 2017 - containers-bugowner@suse.de

- Commit 7184f5e by Kiall Mac Innes kiall@macinnes.ie
  Prevent update-etc-hosts conflicting with bootstrap

  Fix another case where the etc hosts update orchestration would
  otherwise conflict with the bootstrap / add node orchestration.

  bsc#1059577

-------------------------------------------------------------------
Wed Sep 20 09:51:41 UTC 2017 - containers-bugowner@suse.de

- Commit 8865d73 by Robert Roland rob.roland@gmail.com
  Making the service account key the same on all nodes (#230)

  The kube-apiserver and kube-controller-manager must agree on what
  the private key is for service account generation. In a
  multi-master scenario, where an api server starts on one machine,
  and the controller-manager on another machine becomes primary,
  pods cannot be created because kube-controller-manager cannot
  communicate with the apiserver.

  So, now, we generate the service account key on the ca minion and
  store it in the mine, so that it's generated once.

  Fixes bsc#1059398

-------------------------------------------------------------------
Tue Sep 19 22:27:46 UTC 2017 - containers-bugowner@suse.de

- Commit 6868ea5 by Alvaro Saurin alvaro.saurin@gmail.com
  Set a default external fqdn

-------------------------------------------------------------------
Tue Sep 19 22:26:44 UTC 2017 - containers-bugowner@suse.de

- Commit 2df25a0 by Aishwarya Thangappa aishwarya.thangappa@gmail.com
  Fix the race condition that occurs when starting Kube-DNS

  KubeDNS may fail to apply due to a race condition within `kubectl
  apply`, this mitigates that issue.
-------------------------------------------------------------------
Fri Sep 15 10:36:14 UTC 2017 - containers-bugowner@suse.de

- Commit 5d0e520 by Kiall Mac Innes kiall@macinnes.ie
  Update paths to match SLES based Dex container

  The SLES based dex container does not put dex in /usr/local/bin,
  additionally, we install the web content in
  /usr/share/caasp-dex/web.

  Part of bsc#1058833

-------------------------------------------------------------------
Wed Sep 13 12:59:55 UTC 2017 - containers-bugowner@suse.de

- Commit e966106 by Michal Jura mjura@suse.com
  Add OpenStack block storage version as an option

-------------------------------------------------------------------
Wed Sep 13 12:58:52 UTC 2017 - containers-bugowner@suse.de

- Commit 8e90c5c by Kiall Mac Innes kiall@macinnes.ie
  Include kube-apiserver in the dex role

  Without this, we're seeing an error post-bootstrap, so deployments
  look green, but fail with:

    The following requisites were not found:
      require:
        id: kube-apiserver

-------------------------------------------------------------------
Wed Sep 13 10:03:30 UTC 2017 - containers-bugowner@suse.de

- Commit cc32e39 by Robert Roland robert.roland@suse.com
  Switch to the sles12/caasp-dex image

-------------------------------------------------------------------
Wed Sep 13 08:54:40 UTC 2017 - containers-bugowner@suse.de

- Commit 6c2b47a by Michal Jura mjura@suse.com
  Add orchestration for etcd storage 'etcd2' to 'etcd3'

  In Kubernetes v1.7 the default storage backend for the apiserver
  is 'etcd3'. We need to orchestrate the migration between versions
  'etcd2' and 'etcd3'.
-------------------------------------------------------------------
Wed Sep 13 08:52:38 UTC 2017 - containers-bugowner@suse.de

- Commit c26d987 by Robert Roland rob.roland@gmail.com
  Role-based access control (#192)

  Adding role-based access control based on CoreOS Dex and OpenLDAP

-------------------------------------------------------------------
Tue Sep 12 14:27:59 UTC 2017 - containers-bugowner@suse.de

- Commit 2b5dd9b by Nikhil Manchanda SlickNik@gmail.com
  Add cluster role binding for tiller

  Tiller requires a cluster role binding to work correctly with the
  new RBAC changes. Add this cluster role binding so that helm
  commands work correctly.

-------------------------------------------------------------------
Tue Sep 12 09:03:03 UTC 2017 - containers-bugowner@suse.de

- Commit efd8877 by Rafael Fernández López ereslibre@ereslibre.es
  Set etcd3 as default backend storage

-------------------------------------------------------------------
Sat Sep 9 09:01:51 UTC 2017 - containers-bugowner@suse.de

- Commit 3e9bcd6 by Kiall Mac Innes kiall@macinnes.ie
  Move External FQDN to 127.0.0.1 address

  This was added to ensure Dex was always reachable, however, with
  multi masters, this name was assigned to 3 different lines in
  /etc/hosts. Most consumers of /etc/hosts do not deal with this as
  they would a round-robin DNS entry which returns multiple IPs.
  When the "selected" master is powered off, this name continues to
  resolve to the same dead IP address.

  As Dex uses a NodePort service, putting this to 127.0.0.1 works as
  we expect it to.
-------------------------------------------------------------------
Fri Sep 8 12:46:25 UTC 2017 - containers-bugowner@suse.de

- Commit 5e89d99 by Alvaro Saurin alvaro.saurin@gmail.com
  Refactor the wait-for-apiserver so it can be used in some other
  parts of the code

-------------------------------------------------------------------
Fri Sep 8 12:45:44 UTC 2017 - containers-bugowner@suse.de

- Commit 5a13bbc by Kiall Mac Innes kiall@macinnes.ie
  Ensure systemd is reloaded after units are changed

  Ensure systemd is reloaded as soon as a unit is changed, rather
  than relying on a task later within the orchestration to execute.

  Fixes bsc#1057641

-------------------------------------------------------------------
Fri Sep 8 11:37:54 UTC 2017 - containers-bugowner@suse.de

- Commit a601b38 by Kiall Mac Innes kiall@macinnes.ie
  Include short hostname for masters

  The short hostname for masters was not being set, as it was for
  both the admin node and worker nodes.

  Fixes bsc#1057794

-------------------------------------------------------------------
Fri Sep 8 11:09:21 UTC 2017 - containers-bugowner@suse.de

- Commit 755ad7c by Sam Leavens rbwsam@gmail.com
  Adding optional addon for Helm's tiller

-------------------------------------------------------------------
Fri Sep 8 10:23:47 UTC 2017 - containers-bugowner@suse.de

- Commit e0727d2 by Kiall Mac Innes kiall@macinnes.ie
  Combine etcd and etcd-proxy formulas

  The base etcd formula is never used on its own, so let's remove
  this unnecessary complexity.
-------------------------------------------------------------------
Thu Sep 7 13:23:50 UTC 2017 - containers-bugowner@suse.de

- Commit c0bbaba by Kiall Mac Innes kiall@macinnes.ie
  Include both v2 and v3 flags in etcdctl vars

-------------------------------------------------------------------
Tue Sep 5 17:13:10 UTC 2017 - containers-bugowner@suse.de

- Commit c1c851c by Robert Roland rob.roland@gmail.com
  Role-based access control (#192)

  Adding role-based access control based on CoreOS Dex and OpenLDAP

-------------------------------------------------------------------
Wed Aug 30 09:29:40 UTC 2017 - containers-bugowner@suse.de

- Commit 66b0de2 by Aishwarya Thangappa aishwarya.thangappa@gmail.com
  Update docker images for KubeDNS to ones based on SLES from the
  rpms in MicroOS

-------------------------------------------------------------------
Tue Aug 29 15:55:29 UTC 2017 - containers-bugowner@suse.de

- Commit 67846f6 by Kiall Mac Innes kiall@macinnes.ie
  Fix flannel config for 0.8.0

  Flannel in 0.8.0 rejects the "-logtostderr" flag we were
  providing, this doesn't seem to have ever been an option, however
  it was silently ignored in the past.

-------------------------------------------------------------------
Tue Aug 29 14:48:45 UTC 2017 - containers-bugowner@suse.de

- Commit 5c4bf44 by Michal Jura mjura@suse.com
  Set kube-apiserver storage backend as option

  Parametrize Kubernetes apiserver storage backend. This will be
  used in future for migration process from storage etcd2 to etcd3.
-------------------------------------------------------------------
Fri Aug 25 17:50:59 UTC 2017 - containers-bugowner@suse.de

- Commit 0a8f3e2 by Michal Jura mjura@suse.com
  Add cloud provider integration for OpenStack Storage

  Commit 885cc4d by Michal Jura mjura@suse.com
  Add cloud provider integration for OpenStack LoadBalancer

-------------------------------------------------------------------
Tue Aug 22 10:42:22 UTC 2017 - containers-bugowner@suse.de

- Commit 6ac7ffb by Kiall Mac Innes kiall@macinnes.ie
  Use haproxy to load balance Kube API requests

  Now that we can have multiple masters, we need a way for the
  various services and end-users to be load balanced over the set of
  kube-api servers.

  We install haproxy on each node, inside a docker container,
  configured to load balance requests over all the cluster masters.

  This haproxy is configured to listen on 0.0.0.0 on the masters,
  and 127.0.0.1 on the workers. This is to allow the minions to
  simply "talk" to 127.0.0.1, and be routed to an active kube-api
  server.

-------------------------------------------------------------------
Mon Aug 21 14:22:13 UTC 2017 - containers-bugowner@suse.de

- Commit 2269176 by Kiall Mac Innes kiall@macinnes.ie
  Use apply instead of create for addons

  kubectl apply is generally idempotent, while kubectl create is
  not.

  With multi-master now enabled, if two masters execute this script
  at once, one of them is likely to fail given the check+set (C+S)
  race within this script. Switching to apply removes part of this
  C+S race.

  The second part of this race is the client-side decision by apply
  to create or update. By retrying the command once if it fails, we
  can ensure that when two masters run this script at the same time,
  for the first time, the C+S race will be avoided here too.
-------------------------------------------------------------------
Mon Aug 21 08:43:16 UTC 2017 - containers-bugowner@suse.de

- Commit b470a20 by Kiall Mac Innes kiall@macinnes.ie
  Ensure k8s_etcd.get_cluster_size works for multi-master

  If we had enough masters to form an etcd cluster, we would end up
  returning "None" from this method, preventing the cluster
  formation.

-------------------------------------------------------------------
Mon Aug 21 08:34:11 UTC 2017 - containers-bugowner@suse.de

- Commit 06033b3 by Alvaro Saurin alvaro.saurin@gmail.com
  Wait for the API server after starting the service.

-------------------------------------------------------------------
Mon Aug 21 08:01:52 UTC 2017 - containers-bugowner@suse.de

- Commit af41306 by Alvaro Saurin alvaro.saurin@gmail.com
  Do not generate an empty --proxy line in curlrc

-------------------------------------------------------------------
Fri Aug 18 14:56:14 UTC 2017 - containers-bugowner@suse.de

- Commit bdd9b9c by Kiall Mac Innes kiall@macinnes.ie
  Grow flannel CIDR to accommodate 1024 workers

  Flannel was set up such that 150 workers could obtain a subnet
  before there were none left. By growing this range, and the size
  of the individual allocations, we allow for up to 1024 workers
  with 510 pods on each.

  bsc#1047847

-------------------------------------------------------------------
Thu Aug 17 16:43:08 UTC 2017 - containers-bugowner@suse.de

- Commit 4b40d4c by Aishwarya Thangappa aishwarya.thangappa@gmail.com
  Add kube-dns service account

-------------------------------------------------------------------
Thu Aug 17 14:38:54 UTC 2017 - containers-bugowner@suse.de

- Commit e1d5650 by Kiall Mac Innes kiall@macinnes.ie
  Disable Salt's Job Cache

  Salt's job cache is buggy, causing random failures to lookup mine
  data, which in turn causes our deployments to fail.
  Fixes bsc#1054256

-------------------------------------------------------------------
Thu Aug 17 13:53:02 UTC 2017 - containers-bugowner@suse.de

- Commit 7c47d63 by Alvaro Saurin alvaro.saurin@gmail.com
  Properly wait for an HTTP endpoint

-------------------------------------------------------------------
Wed Aug 16 18:14:24 UTC 2017 - containers-bugowner@suse.de

- Commit a4a049e by Kiall Mac Innes kiall@macinnes.ie
  Kube-API: Set storage-backend to etcd2

  In our current configuration, kube-api logs a series of errors
  unless this is set.

-------------------------------------------------------------------
Wed Aug 9 12:03:51 UTC 2017 - containers-bugowner@suse.de

- Commit 6caa9fa by Robert Roland robert.roland@suse.com
  Dedicated certificate for kube-controller-manager

  Commit 5e5dfb5 by Robert Roland robert.roland@suse.com
  Dedicated certificate for kube-proxy

  Commit afe4f63 by Robert Roland robert.roland@suse.com
  Dedicated certificate for kubelet

  Commit 8acea7c by Robert Roland robert.roland@suse.com
  Dedicated certificate for kube-scheduler

  Commit e59670e by Robert Roland robert.roland@suse.com
  Adapting kube-apiserver wait fix into this branch

  Commit c4eef4d by Robert Roland robert.roland@suse.com
  eliminated the kubernetes-master formula

  the daemons are all separate now, so it's controlled by role
  membership in the top.sls file

  moved addons to a separate salt formula

  Commit 9232705 by Robert Roland robert.roland@suse.com
  kube-proxy as a separate salt formula

  Commit 15ff190 by Robert Roland robert.roland@suse.com
  kubelet as a separate salt formula

  Commit 4412b9d by Robert Roland robert.roland@suse.com
  kube-scheduler as its own formula

  fixing a bug where we uncordon master nodes. but we should never
  do that.
  Commit 4662dd1 by Robert Roland robert.roland@suse.com
  kube-controller-manager as a separate formula

  Commit ee9fb0b by Robert Roland robert.roland@suse.com
  kube-apiserver as a separate formula

  Makes a dedicated formula for the kube-apiserver

  Generates a cert specifically for the kube-apiserver

-------------------------------------------------------------------
Mon Aug 7 20:34:39 UTC 2017 - containers-bugowner@suse.de

- Commit 65b9e9c by Robert Roland robert.roland@suse.com
  can't talk to 6443 without a client cert

  talk to the insecure-bind-address instead.

  Commit 5c6d2e1 by Kiall Mac Innes kiall@macinnes.ie
  Wait for Kube-API before installing Kube-DNS

-------------------------------------------------------------------
Thu Aug 3 16:51:38 UTC 2017 - containers-bugowner@suse.de

- Commit 3a6869d by Aishwarya Thangappa aishwarya.thangappa@gmail.com
  Install Kube-DNS by default

  1. Removed the skydns template files and added kubedns template
     files. We will be using deployments instead of replication
     controllers.
  2. Modified the deploy script to check for the existence of
     kube-dns deployment, kube-dns service and config map before
     creating one.
  3. Turned on the addon:dns flag so as to install KubeDNS by
     default.

-------------------------------------------------------------------
Wed Aug 2 22:34:57 UTC 2017 - containers-bugowner@suse.de

- Commit d1abfaa by Thomas Hipp thipp@suse.de
  update k8s version

  Signed-off-by: Thomas Hipp

-------------------------------------------------------------------
Tue Aug 1 14:27:32 UTC 2017 - containers-bugowner@suse.de

- Commit bc3adf7 by Robert Roland robert.roland@suse.com
  Explicit dependency ordering

  Commit 1086ebf by Robert Roland robert.roland@suse.com
  Run kubelet and kube-proxy on the master node

  A standard Kubernetes installation runs a kubelet and kube-proxy
  on every node, and then you decide where to run apiserver,
  controller-manager and scheduler.

  This change is required to support RBAC, DaemonSets and many other
  changes.
Requires an updated kubernetes-client package that contains: https://build.opensuse.org/request/show/494998 ------------------------------------------------------------------- Thu Jul 20 15:15:54 UTC 2017 - containers-bugowner@suse.de - Commit 5df94da by Kiall Mac Innes kiall@macinnes.ie Delay reboots during upgrade by 15 seconds Even with backgrounding the call, salt-minion sometimes still does not have enough time to respond before systemd shuts down salt-minion on some environments. By adding a 15-second delay, we give salt-minion much more time than it should need in a healthy cluster to respond. Additionally, switch from the deprecated syntax for supplying bg=True, to the newer syntax which no longer logs a warning. Follow-up fix for bsc#1049200 ------------------------------------------------------------------- Thu Jul 20 12:34:09 UTC 2017 - containers-bugowner@suse.de - Commit 4920c7a by Rafael Fernández López ereslibre@ereslibre.es Do not publish the `ca.crt` from the `ca` SLS, use `mine_functions` We will be publishing these contents when the `ca` minion starts, so there's no need to do this during the orchestration. `mine.send` is not reliable enough since we cannot confirm that the contents are there yet, and waiting a random amount of time is not appropriate as we are just hiding the real problem. In the near future we can do an active wait for the content to be there using `retry`, but for now we just publish the contents of the `ca.crt` using `mine_functions`, so it is sent when the `ca` minion starts. There's no need to refresh the mine, as this was just hiding the real problem when we were publishing these contents during the orchestration phase.
Fixes: bsc#1049137 Fixes: bsc#1048548 ------------------------------------------------------------------- Wed Jul 19 14:55:03 UTC 2017 - containers-bugowner@suse.de - Commit 3e5cf9f by Kiall Mac Innes kiall@macinnes.ie Add extra requisites to the update orchestration These additional requisites enforce a stricter ordering of tasks during the upgrade. In some cases, "-set-update-grain" would not execute in the right place, potentially leading to a failed upgrade. bsc#1045381 ------------------------------------------------------------------- Wed Jul 19 11:40:34 UTC 2017 - containers-bugowner@suse.de - Commit d97a24e by Kiall Mac Innes kiall@macinnes.ie Don't wait for minion responses when rebooting When we instruct a minion to reboot, we can't reliably wait for the response from salt-minion letting us know that the "systemctl reboot" command succeeded, as systemd may choose to shut down the salt-minion service before it can send out the "Yes, that worked" response. Salt does not make any attempt to finish in-progress tasks when it receives a SIGTERM, leaving us with few other viable choices for this. Fixes bsc#1049200 ------------------------------------------------------------------- Tue Jul 18 10:11:10 UTC 2017 - containers-bugowner@suse.de - Commit 0692dbf by Rafael Fernández López ereslibre@ereslibre.es Explicitly refresh the mine on all minions after the `ca` has published the `ca.crt` We will explicitly force all minions to refresh the mine after the `ca` minion has published the `ca.crt` certificate on the mine, to avoid rendering problems with later SLS being executed. It might happen that a minion was missing this information on its mine, so the rendering of the SLS failed, effectively stopping the whole orchestration process.
Fixes: bsc#1048548 ------------------------------------------------------------------- Mon Jul 17 12:55:09 UTC 2017 - containers-bugowner@suse.de - Commit 219b7d5 by Kiall Mac Innes kiall@macinnes.ie Upgrade: Wait longer for minions to reboot Wait 1200 seconds (20 minutes) for minions to reboot, instead of the default 300 seconds (5 minutes). We increase this to cover cases where slower-to-boot physical hardware is used. 20 minutes was chosen because I've seen physical hardware take 10-12 minutes in the past, and someone likely has something that is slower to reboot. bsc#1048683 ------------------------------------------------------------------- Fri Jul 14 15:59:05 UTC 2017 - containers-bugowner@suse.de - Commit 1e41512 by Alvaro Saurin alvaro.saurin@gmail.com Add some extra names to the API server certificate (bsc#1033671) ------------------------------------------------------------------- Fri Jul 14 14:46:02 UTC 2017 - containers-bugowner@suse.de - Commit 6b146d5 by Maximilian Meister mmeister@suse.de make branch safe by transforming slashes to dashes Signed-off-by: Maximilian Meister Commit 588b834 by Maximilian Meister mmeister@suse.de packaging: make branch configurable Signed-off-by: Maximilian Meister ------------------------------------------------------------------- Fri Jul 14 13:45:02 UTC 2017 - containers-bugowner@suse.de - Commit 6b146d5 by Maximilian Meister mmeister@suse.de make branch safe by transforming slashes to dashes Signed-off-by: Maximilian Meister Commit 588b834 by Maximilian Meister mmeister@suse.de packaging: make branch configurable Signed-off-by: Maximilian Meister ------------------------------------------------------------------- Fri Jul 14 08:26:11 UTC 2017 - containers-bugowner@suse.de - Commit c59070d by Rafael Fernández López ereslibre@ereslibre.es Fix `ca` key path This was a leftover from the previous implementation.
Now the ca key is present under `/etc/pki/private` in the ca container too (as it mounts `/etc/pki`) ------------------------------------------------------------------- Thu Jul 13 19:29:28 UTC 2017 - containers-bugowner@suse.de - Commit b6281ae by Kiall Mac Innes kiall@macinnes.ie Ensure grains are always refreshed periodically Salt's grains_refresh_every configuration param does not quite do what we need it to, it's failing to refresh grains from the `grains` file - leading to updates going undetected. This change adds a slightly modified version of what this config param internally does, adding the force_refresh: True argument, ensuring we correctly refresh. bsc#1048583 ------------------------------------------------------------------- Tue Jul 11 14:57:50 UTC 2017 - containers-bugowner@suse.de - Commit 88e9ff9 by Rafael Fernández López ereslibre@ereslibre.es Keep `job_cache: True` as it's discouraged to disable it Our deployment is also failing probably due to the fact that we were disabling the salt `job_cache`. Commit b0547af by Miquel Sabaté Solà msabate@suse.com Set MySQL as the job cache for the Salt master First of all, we can specify an external job cache. If we don't do that, then the `keep_jobs` option only applies to the local cache. This means that Salt will not clean up jobs, events and returns older than the specified `keep_jobs` value (default: 24h) for the MySQL returner that we have already configured. Moreover, since we'd already be using MySQL as a job cache, we don't have to use the local system (/var/cache/salt/master/jobs/) as a cache (note that Salt would still be using this directory to avoid JID collisions). The documentation also says that the local cache can be a burden for large deployments. 
See bsc#1044133 Signed-off-by: Miquel Sabaté Solà ------------------------------------------------------------------- Tue Jul 11 14:06:57 UTC 2017 - containers-bugowner@suse.de - Commit 31ad98d by Michal Jura mjura@suse.com Don't duplicate log level argument for k8s services, bsc#1046407 ------------------------------------------------------------------- Tue Jul 11 12:54:53 UTC 2017 - containers-bugowner@suse.de - Commit fcbfd6b by Michal Jura mjura@suse.com Make log level configurable for dockerd service, bsc#1046407 Set the logging level for dockerd, possible values are: [ debug, info, warn, error, fatal ] ------------------------------------------------------------------- Tue Jul 11 10:18:31 UTC 2017 - containers-bugowner@suse.de - Commit e3c9c21 by Kiall Mac Innes kiall@macinnes.ie Add Jenkinsfile The Jenkinsfile in each repo, if we adopt Jenkins in the end, will be very thin, including just a single library load, and a single method call. This prevents us from needing to keep each projects Jenkinsfile in sync as CI changes are made. ------------------------------------------------------------------- Mon Jul 10 20:59:33 UTC 2017 - containers-bugowner@suse.de - Commit 08a0960 by Kiall Mac Innes kiall@macinnes.ie Revert "Set MySQL as the job cache for the Salt master" This reverts commit de22c660a99bc1425295c86be7d7dc3e79089845. ------------------------------------------------------------------- Mon Jul 10 12:57:44 UTC 2017 - containers-bugowner@suse.de - Commit de22c66 by Miquel Sabaté Solà msabate@suse.com Set MySQL as the job cache for the Salt master First of all, we can specify an external job cache. If we don't do that, then the `keep_jobs` option only applies to the local cache. This means that Salt will not clean up jobs, events and returns older than the specified `keep_jobs` value (default: 24h) for the MySQL returner that we have already configured. 
Moreover, since we'd already be using MySQL as a job cache, we don't have to use the local system (/var/cache/salt/master/jobs/) as a cache (note that Salt would still be using this directory to avoid JID collisions). The documentation also says that the local cache can be a burden for large deployments. See bsc#1044133 Signed-off-by: Miquel Sabaté Solà ------------------------------------------------------------------- Fri Jul 7 09:34:03 UTC 2017 - containers-bugowner@suse.de - Commit d2df0ed by Rafael Fernández López ereslibre@ereslibre.es When generating the certificate use the pillar path Since we added the minion certificate location to the pillar, also take the public key location from the pillar, or the certificate generation will fail if the pillar value changes. ------------------------------------------------------------------- Fri Jul 7 09:31:58 UTC 2017 - containers-bugowner@suse.de - Commit ce45c56 by Rafael Fernández López ereslibre@ereslibre.es Remove unneeded signing policies These signing policies were used when the CA wasn't containerized, when we containerized it, they were moved to `caasp-container-manifests`, and the CA container is mounting it from there. If we uncontainerize the CA in the future we can move it back if needed, but let's keep this clean so it's not misleading. ------------------------------------------------------------------- Fri Jul 7 08:11:40 UTC 2017 - containers-bugowner@suse.de - Commit 871a9dc by Michal Jura mjura@suse.com Fix JINJA escaping for docker_opts in docker state module ------------------------------------------------------------------- Thu Jul 6 12:59:46 UTC 2017 - containers-bugowner@suse.de - Commit 2bd42f5 by Rafael Fernández López ereslibre@ereslibre.es Add prerequisite for key to be present on `cert` sls Add a specific dependency for the key to be present when generating the certificate for the minion. 
------------------------------------------------------------------- Thu Jul 6 12:57:41 UTC 2017 - containers-bugowner@suse.de - Commit eb852df by Rafael Fernández López ereslibre@ereslibre.es Add kubectl client certificate This certificate will be served by Velum when downloading the `kubeconfig` file, and is specific for that usage. Fixes: bsc#1046963 ------------------------------------------------------------------- Fri Jun 30 10:24:19 UTC 2017 - containers-bugowner@suse.de - Commit 9950702 by Kiall Mac Innes kiall@macinnes.ie Ensure bootstrap_complete grain is set At the time this if block is called, the mine / grains sync hasn't happened yet. This reverts a change from commit fc8347c (bsc#1043589) ------------------------------------------------------------------- Fri Jun 30 10:16:13 UTC 2017 - containers-bugowner@suse.de - Commit 5e7c46f by Michal Jura mjura@suse.com Define etcdctl config file with SSL variables Let's add /etc/sysconfig/etcdctl with paths to the client server TLS files and endpoint. This will make possible to run etcdctl command in easy way, e.g. source /etc/sysconfig/etcdctl etcdctl cluster-health fixes bsc#1046818 ------------------------------------------------------------------- Fri Jun 30 09:34:50 UTC 2017 - containers-bugowner@suse.de - Commit 15748cd by Flavio Castelli fcastelli@suse.com Handle curl proxy settings YaST is also configuring proxy settings inside of `/root/.curlrc`, this is needed because zypper is using libcurl. So if you run zypper from a cronjob or `su`, the `/etc/sysconfig/proxy` variables are not parsed and set in the environment. Which means, zypper will not use the proxy and fail. With `/root/.curlrc`, libcurl will use the proxies configured there. 
Signed-off-by: Flavio Castelli ------------------------------------------------------------------- Thu Jun 29 17:11:15 UTC 2017 - containers-bugowner@suse.de - Commit fc8347c by Rafael Fernández López ereslibre@ereslibre.es Enable TLS on the salt-api service Fixes: bsc#1043589 ------------------------------------------------------------------- Thu Jun 29 16:26:45 UTC 2017 - containers-bugowner@suse.de - Commit 465a4d6 by Kiall Mac Innes kiall@macinnes.ie Add proxy state to admin node Installs proxies onto the admin node - bsc#1043538 Commit a16c19e by Kiall Mac Innes kiall@macinnes.ie Disable rebootmgr on admin node Once the system bootstraps, we now disable rebootmgr on the admin node. This allows the velum initiated updates to take over and prevent any unexpected surprises. bsc#1046602 Commit ef8ba5b by Kiall Mac Innes kiall@macinnes.ie Render /etc/hosts on admin node Render the /etc/hosts file on the admin node, so nodes are reachable via their internal FQDNs everywhere. Additionally, include the admin node in the /etc/hosts files. bsc#1045186 ------------------------------------------------------------------- Thu Jun 29 13:04:10 UTC 2017 - containers-bugowner@suse.de - Commit eadd8e1 by Kiall Mac Innes kiall@macinnes.ie Increase salt-master timeout When dealing with a large number of minions, timeouts are visible when using the default value of 5 seconds. Increasing the CPU/RAM resources allocated to the master helps, but given that it's short bursts of heavy usage (bootstrap and upgrade), this shouldn't be necessary. We increase the timeout from 5 to 20 seconds, allowing tasks to take longer yet still succeed.
------------------------------------------------------------------- Wed Jun 28 15:36:29 UTC 2017 - containers-bugowner@suse.de - Commit 3f2c44b by Graham Hayes graham.hayes@suse.com bsc#1045381 Ensure updates do not conflict with etc-hosts This ensures that the etc-hosts orchestration does not run during an upgrade, as this can cause conflicts on the nodes, which cause salt to fail to complete an `orch.update` run. ------------------------------------------------------------------- Tue Jun 27 10:45:02 UTC 2017 - containers-bugowner@suse.de - Commit 5f492f9 by Graham Hayes graham.hayes@suse.com Turn off `auto_accept` ------------------------------------------------------------------- Mon Jun 26 18:56:16 UTC 2017 - containers-bugowner@suse.de - Commit 197d164 by Michal Jura mjura@suse.com Enable etcd authentication based on client certificates Enable ETCD_CLIENT_CERT_AUTH and ETCD_PEER_CLIENT_CERT_AUTH in etcd-proxy state module. - Enable client cert authentication ETCD_CLIENT_CERT_AUTH="true" - Enable peer client cert authentication. ETCD_PEER_CLIENT_CERT_AUTH="true" Commit 970a590 by Michal Jura mjura@suse.com Use Kubernetes API server etcd ssl Commit 776bf33 by Michal Jura mjura@suse.com Enable https for flanneld service Commit b762959 by Michal Jura mjura@suse.com Add ssl pillar profile Commit 07a5652 by Michal Jura mjura@suse.com Enable https for etcd-proxy services All of these fix bsc#1043595 ------------------------------------------------------------------- Fri Jun 23 13:42:40 UTC 2017 - containers-bugowner@suse.de - Commit a567814 by Kiall Mac Innes kiall@macinnes.ie Ensure CA fields are static (bsc#1045766) As the DHCP domain name can change, we should avoid using it in our CA cert in order to prevent it being unnecessarily regenerated.
Fixes bsc#1045766 ------------------------------------------------------------------- Thu Jun 22 16:40:48 UTC 2017 - containers-bugowner@suse.de - Commit 9e20d89 by Alvaro Saurin alvaro.saurin@gmail.com Option for using the proxy settings system-wide (bsc#1036627) ------------------------------------------------------------------- Wed Jun 21 14:47:21 UTC 2017 - containers-bugowner@suse.de - Commit 5042479 by Rafael Fernández López ereslibre@ereslibre.es Do not run etcd discovery on every orchestration run, only the first time When adding new nodes, the `orch.kubernetes` orchestration was failing because etcd is refusing to start since the etcd discovery mechanism was already used when bootstrapping the cluster. With this change we ensure that we use the discovery mechanism only when we are bootstrapping the cluster. ------------------------------------------------------------------- Tue Jun 20 16:21:12 UTC 2017 - containers-bugowner@suse.de - Commit e51791e by Kiall Mac Innes kiall@macinnes.ie Set etcd batch size to 3 nodes Currently, we never ask for more than 3 members. Setting this to 3 ensures we don't let more than 3 members attempt etcd discovery before a cluster has been fully formed. If we have fewer than 3, this will still succeed, as the exact number of members we expect will also end up attempting discovery at the same time. ------------------------------------------------------------------- Tue Jun 20 13:35:24 UTC 2017 - containers-bugowner@suse.de - Commit a13010e by Rafael Fernández López ereslibre@ereslibre.es Do not fail if `salt.function` has no minions to target Currently, `update-etc-hosts` orchestration fails because `update_mine` `salt.function` cannot target any minions at the beginning, and since this is a prerequisite for other states, the Reactor orchestration fails. Only call these `salt.function` if there are any minions to target.
------------------------------------------------------------------- Fri Jun 16 11:50:44 UTC 2017 - containers-bugowner@suse.de - Commit d2f8840 by Rafael Fernández López ereslibre@ereslibre.es Add missing `tgt_type` so we target the minions we intend to This last step on the orchestration was returning a `False` result because no targets were found to execute the grain set. ------------------------------------------------------------------- Fri Jun 16 08:52:34 UTC 2017 - containers-bugowner@suse.de - Commit 9ddaa5a by Flavio Castelli fcastelli@suse.com salt-api: listen to localhost [bsc#1043589] Do not expose the salt-api to the entire world. This is needed only by Velum to trigger salt actions. Given both the containers use the same network namespace we can just bind this service to localhost. By doing that we are going to reduce the attack surface. This fixes one of the two issues reported by bsc#1043589 Signed-off-by: Flavio Castelli ------------------------------------------------------------------- Thu Jun 15 13:56:56 UTC 2017 - containers-bugowner@suse.de - Commit a99d516 by Aishwarya Thangappa aishwarya.thangappa@gmail.com Making the cluster-dns and cluster-domain arguments default Right now, caasp doesn't support kube-dns out of the box. If customers want DNS support, they have to bring it up on their own by using `kubectl create -f kubedns.yaml`. But this will not work until you add the cluster-dns and cluster-domain arguments to kubelet args and restart the kubelet. Doing this manually on every node is one pain point; in addition, salt will try to bring the node back to its original state, meaning that the changes you made to the kubelet args will no longer be there. So, unless you bring up the caasp cluster with the addon set to true, you cannot have kube-dns working reliably on the cluster. This change will make it a little easier, by having these arguments by default on every node.
------------------------------------------------------------------- Wed Jun 14 18:46:11 UTC 2017 - containers-bugowner@suse.de - Commit 706837b by Graham Hayes graham.hayes@suse.com Ensure that reactor states only run on completed nodes This ensures that we do not run reactor orchestrations on nodes that have not completed bootstrapping. This ensures that a node cannot have 2 states applied to it at the same time. ------------------------------------------------------------------- Wed Jun 14 17:10:03 UTC 2017 - containers-bugowner@suse.de - Commit e44cf82 by Kiall Mac Innes kiall@macinnes.ie Remove concurrent=True from orchestrations Salt's documentation calls this option out as dangerous, stating that the state must be able to be run concurrently. This is not something we can reasonably ensure works, so let's not use it. From Salt's documentation: This flag is potentially dangerous. It is designed for use when multiple state runs can safely be run at the same time. Do not use this flag for performance optimization. ------------------------------------------------------------------- Wed Jun 14 17:09:04 UTC 2017 - containers-bugowner@suse.de - Commit 3fd0d08 by Kiall Mac Innes kiall@macinnes.ie Refresh grains at the start of orchestration Additionally, refresh pillars at the start of update-etc-hosts.sls for consistency. ------------------------------------------------------------------- Wed Jun 14 10:41:27 UTC 2017 - containers-bugowner@suse.de - Commit 7d0a037 by Graham Hayes graham.hayes@suse.com Update transactional-update to use "salt" option This will ensure that the transactional-update code will write a grain (`tx_update_reboot_needed:true`) on the node instead of rebooting the node.
This also allows for increasing the frequency of the snapshots being built ------------------------------------------------------------------- Tue Jun 13 15:42:26 UTC 2017 - containers-bugowner@suse.de - Commit 91d649f by Alvaro Saurin alvaro.saurin@gmail.com React to IP changes by using beacons ------------------------------------------------------------------- Mon Jun 12 14:00:19 UTC 2017 - containers-bugowner@suse.de - Commit 53e389f by Rafael Fernández López ereslibre@ereslibre.es Only run `service.dead` on salt minions that we know support it. The `ca` container was reporting this error during the orchestration: ``` service.dead { "__run_num__": 0, "_stamp": "2017-06-12T10:33:29.009340", "changes": {}, "comment": "State 'service.dead' was not found in SLS 'rebootmgr' Reason: 'service' __virtual__ returned False: No service execution module loaded: check support for service management on SLES-12 ", "name": "rebootmgr", "result": false, "retcode": 2 } ``` Also, the overall result of the orchestration was not successful (despite individual highstates reporting success) because of this. Containers don't have `systemctl` available, so `salt` doesn't know how to handle this. Right now, rely on our roles for doing this (although we could have used the `virtual` grain -- but for some reason a container reports `physical`, which doesn't help) -- at least with the `salt` version we are currently using. The orchestration result overall looks like this with this change: ``` "outputter": "highstate", "retcode": 0 }, "success": true, "user": "saltapi" } ``` ------------------------------------------------------------------- Mon Jun 12 10:43:45 UTC 2017 - containers-bugowner@suse.de - Commit 0cd2559 by Graham Hayes graham.hayes@suse.com Batch runs of the `cert` state This allows more nodes to be deployed without causing timeouts and failed runs on the `cert` state.
Also, remove concurrency from the etcd member and proxy to ensure members are created before proxies bsc#1038814 ------------------------------------------------------------------- Fri Jun 9 16:50:30 UTC 2017 - containers-bugowner@suse.de - Commit 9b3652a by Kiall Mac Innes kiall@macinnes.ie Revert "Add module for removing etcd cluster members" - bsc#1043676 This reverts commit 27a4e81c331dc345e56266a57c5dcd86d1c1a177 Commit befe0b5 by Kiall Mac Innes kiall@macinnes.ie Revert "Add etcd_info salt grain module" - bsc#1043676 This reverts commit da17af3f0f9cb89a9057618b7561074a4e35818e. ------------------------------------------------------------------- Wed Jun 7 14:15:15 UTC 2017 - containers-bugowner@suse.de - Commit 4132fa9 by Rafael Fernández López ereslibre@ereslibre.es Remove hardcoded secrets ------------------------------------------------------------------- Wed Jun 7 08:31:10 UTC 2017 - containers-bugowner@suse.de - Commit 27a4e81 by Michal Jura mjura@suse.com Add module for removing etcd cluster members ------------------------------------------------------------------- Tue Jun 6 21:17:34 UTC 2017 - containers-bugowner@suse.de - Commit 40d8e9b by Robert Roland robert.roland@suse.com Fixing broken build Need to remove a reference to /var/lib/etcd if salt isn't managing it anymore ------------------------------------------------------------------- Tue Jun 6 15:38:43 UTC 2017 - containers-bugowner@suse.de - Commit 1100cfe by Graham Hayes graham.hayes@suse.com Stop managing /var/lib/etcd in salt This dir is created by the etcd rpm, and permissions are maintained by etcd when it is running. Salt and etcd disagree on what these permissions are, causing extra "changed" entries. As etcd is changing them to what it needs, and the directory is created by etcd (and its RPM), we should not try to manage it.
------------------------------------------------------------------- Tue Jun 6 11:40:55 UTC 2017 - containers-bugowner@suse.de - Commit 26fa83b by Jordi Massaguer Pla jmassaguerpla@suse.de use git revision in package version this way zypper sees each new commit as an update Otherwise, using the date, will create a conflict if 2 commits are from the same day Signed-off-by: Jordi Massaguer Pla ------------------------------------------------------------------- Fri Jun 2 19:43:18 UTC 2017 - containers-bugowner@suse.de - Commit e706873 by Michal Jura mjura@users.noreply.github.com Enable https for all services and create dedicated ssl pillar profile (#86) * Enable https for etcd-proxy services * Enable https for flanneld service * Add ssl pillar profile * Use Kubernetes API server etcd ssl ------------------------------------------------------------------- Fri Jun 2 18:47:43 UTC 2017 - containers-bugowner@suse.de - Commit da17af3 by Michal Jura mjura@suse.com Add etcd_info salt grain module To maintain the etcd cluster configuration with salt, we need to get the etcd status about members and their roles in the etcd cluster. This etcd_info grain module provides the following information: - 'etcd_module' - return "available" if python-etcd module is installed - 'members_all' - return list of all members in etcd cluster - 'member_type' - return role of local etcd service, possible values "proxy", "member", "leader" - 'member_id' - return unique id of local etcd service in the cluster This grain module will be used by salt_delete state module for removing etcd nodes from the cluster.
Running this module requires the following packages to be installed: - python-etcd - python-urllib3 - python-dnspython ------------------------------------------------------------------- Fri Jun 2 15:34:22 UTC 2017 - containers-bugowner@suse.de - Commit 7031d71 by Victor Palade vpalade@suse.com disable reboot manager when orchestration happens ------------------------------------------------------------------- Fri Jun 2 09:27:08 UTC 2017 - containers-bugowner@suse.de - Commit 9815b3b by Rafael Fernández López ereslibre@ereslibre.es Ensure our states are idempotent - Adapt some `cmd.run` to use `onchanges`, so they only execute when their `watched` states change. - Add `stateful: True` to some `cmd.run`s, so following the salt protocol for this we ensure that the command didn't change anything in the system state. - Move `ca-cert` to its own SLS, so `cert` will only now generate the `/etc/pki/minion.{key,crt}` files. - The `cert` SLS will now be the only one responsible for generating certificates depending on the role of the machine. This way we ensure that, regardless of how this SLS is included, it behaves in the same way under all conditions. We might want to use a certificate for different services, but that will need some extra changes. - Change some `module.run` to `module.wait` so they only execute when the `watched` states change. - Remove cleanups that make it impossible to have idempotent states. ------------------------------------------------------------------- Fri Jun 2 07:40:33 UTC 2017 - containers-bugowner@suse.de - Commit c0667e3 by Kiall Mac Innes kiall@macinnes.ie Don't change the system hostname Operators don't want us to change the system hostname, which we previously did to account for environments which don't provide unique DHCP hostnames. We'll undo this change, as we have now removed our reliance on the system default hostname.
Fixes bsc#1041789 ------------------------------------------------------------------- Thu Jun 1 11:23:18 UTC 2017 - containers-bugowner@suse.de - Commit 86ae430 by Alvaro Saurin alvaro.saurin@gmail.com Update the /etc/hosts by using a loop, so the file does not grow indefinitely. Do not set the IP address for API server in the API servers to 127.0.0.1 Commit acb76f3 by Alvaro Saurin alvaro.saurin@gmail.com Make the kubelet port configurable with a Pillar variable Open the kubelet port in the firewall ------------------------------------------------------------------- Thu Jun 1 11:14:15 UTC 2017 - containers-bugowner@suse.de - Commit 8bc25b2 by Kiall Mac Innes kiall@macinnes.ie Add a caasp_fqdn grain and migrate to it This adds a caasp_fqdn grain and migrates usage of fqdn to it. This is needed because the fqdn grain has proved unreliable, where we know *exactly* what we want, and salt's detection will be broken by an upcoming change. Partial fix for bsc#1041789 ------------------------------------------------------------------- Thu Jun 1 09:29:01 UTC 2017 - containers-bugowner@suse.de - Commit 7f7d9aa by Graham Hayes graham.hayes@suse.com Initial framework of update orchestration ------------------------------------------------------------------- Thu Jun 1 09:28:05 UTC 2017 - containers-bugowner@suse.de - Commit 631ea1d by Kiall Mac Innes kiall@macinnes.ie Allow for clean shutdown of nodes Add a stop SLS for each service we wish to shut down cleanly, doing any necessary pre-stop actions such as draining kubelet. ------------------------------------------------------------------- Tue May 30 15:51:55 UTC 2017 - containers-bugowner@suse.de - Commit d8ce355 by Rafael Fernández López ereslibre@ereslibre.es Do not include etcd-proxy on this last action This triggers a chain reaction when the reboot sls is called directly (salt-call state.apply reboot) on the last step of the orchestration, since etcd-proxy includes etcd, and etcd includes cert.
Cert sls will generate a new certificate overriding the current one with all the correct DNS names and IP addresses, by one that only contains `fqdn` as the only dns name. Fixes: bsc#1040858 ------------------------------------------------------------------- Mon May 29 15:25:59 UTC 2017 - containers-bugowner@suse.de - Commit daadead by Rafael Fernández López ereslibre@ereslibre.es - Make cert always include `fqdn` - - The only component that was adding `fqdn` to the list of dns names of SAN - certificates is the `kube-master` role. - - However, depending on the size of the cluster and other possible reasons it - might happen that a etcd member falls in a `kube-minion` instance, where the - certificate is missing local ip addresses, as well as the `fqdn` of the - machine. With this change, we are enforcing `cert` to always generate this - information automatically, while we still allow to extend it, in case that's - still necessary (for example, as kubernetes-master still requires). - - Check https://bugzilla.novell.com/show_bug.cgi?id=1039269#c9 for further - information. 
- - Fixes: bsc#1039269 ------------------------------------------------------------------- Fri May 26 14:52:45 UTC 2017 - containers-bugowner@suse.de - Commit ce5954e by Alvaro Saurin alvaro.saurin@gmail.com - Minor changes in etcd: do not remove /var/lib/etcd and close some ports we - don't really need ------------------------------------------------------------------- Thu May 25 11:12:23 UTC 2017 - containers-bugowner@suse.de - Commit 7317ca8 by Miquel Sabaté Solà msabate@suse.com - docker: reload container-feeder after starting docker - - See bsc#1040579 - - Signed-off-by: Miquel Sabaté Solà ------------------------------------------------------------------- Tue May 23 06:57:51 UTC 2017 - containers-bugowner@suse.de - Commit 6013d74 by Robert Roland rob.roland@gmail.com - Update etcd.conf - - Stray + character was causing this line to not execute, and I ended up with a - cluster with both folders present, preventing etcd from starting. ------------------------------------------------------------------- Mon May 22 16:34:25 UTC 2017 - containers-bugowner@suse.de - Commit 824101b by Alvaro Saurin alvaro.saurin@gmail.com - Fix some problems with Docker when HTTP proxy vars are empty ------------------------------------------------------------------- Thu May 18 20:18:33 UTC 2017 - containers-bugowner@suse.de - Commit 4f664e1 by PI-Victor palade.ionut@gmail.com - revert changes to etcd systemd drop-in unit ------------------------------------------------------------------- Thu May 18 15:45:05 UTC 2017 - containers-bugowner@suse.de - Commit bace710 by Rafael Fernández López ereslibre@ereslibre.es - Add apiserver main hostname - - Fixes: bsc#1039437 ------------------------------------------------------------------- Thu May 18 14:58:30 UTC 2017 - containers-bugowner@suse.de - Commit 88c1434 by Michal Jura mjura@suse.com - Configure ETCD_INITIAL_ADVERTISE_PEER_URLS only with FQDN - - We have to remove IP based ETCD_INITIAL_ADVERTISE_PEER_URLS, because they use - HTTPS, 
which is failing for IP URLs with the following error - - health check for peer 100fbbb05571e58f could not connect: x509: - cannot validate certificate for 10.17.3.176 because it doesn't contain any - IP SANs ------------------------------------------------------------------- Thu May 18 10:49:25 UTC 2017 - containers-bugowner@suse.de - Commit fcc6f23 by Alvaro Saurin alvaro.saurin@gmail.com - Handle proxies in the docker daemon ------------------------------------------------------------------- Tue May 16 11:54:01 UTC 2017 - containers-bugowner@suse.de - Use colons as nesting instead of dots ------------------------------------------------------------------- Tue May 16 10:16:21 UTC 2017 - containers-bugowner@suse.de - Do a deeper cleanup before restarting etcd Some etcd deps Take flannel setup out of the master Perform flannel setup before k8s master setup ------------------------------------------------------------------- Thu May 11 16:21:50 UTC 2017 - containers-bugowner@suse.de - bump number of worker threads * to avoid minion calls to master timing out * fixes https://github.com/kubic-project/salt/issues/62 ------------------------------------------------------------------- Mon May 8 12:01:03 UTC 2017 - containers-bugowner@suse.de - Initial config files for the reactor, with an example sls for presence ------------------------------------------------------------------- Tue May 2 16:27:17 UTC 2017 - containers-bugowner@suse.de - Renamed docker registry variable ------------------------------------------------------------------- Tue May 2 13:54:42 UTC 2017 - containers-bugowner@suse.de - Update etcd member count logic ------------------------------------------------------------------- Tue May 2 11:17:43 UTC 2017 - containers-bugowner@suse.de - Cleanup the docker options ------------------------------------------------------------------- Thu Apr 27 16:06:38 UTC 2017 - containers-bugowner@suse.de - Set Hostname to match machine-id
------------------------------------------------------------------- Thu Apr 27 15:30:49 UTC 2017 - containers-bugowner@suse.de - Fix Jinja2 syntax error in kubelet.jinja ------------------------------------------------------------------- Thu Apr 27 15:22:23 UTC 2017 - containers-bugowner@suse.de - Fix Jinja2 syntax error in kubeconfig.jinja ------------------------------------------------------------------- Thu Apr 27 14:26:13 UTC 2017 - containers-bugowner@suse.de - Use some constant names for the API server ------------------------------------------------------------------- Thu Apr 27 14:12:12 UTC 2017 - containers-bugowner@suse.de - Use machine ID and domain as kubelet hostname ------------------------------------------------------------------- Thu Apr 27 14:09:10 UTC 2017 - containers-bugowner@suse.de - Update default etcd cluster size to match number of masters ------------------------------------------------------------------- Thu Apr 27 08:50:26 UTC 2017 - containers-bugowner@suse.de - Configure kube-{scheduler/controller-manager} leader elections ------------------------------------------------------------------- Tue Apr 25 12:20:40 UTC 2017 - containers-bugowner@suse.de - [WIP] Use machine ID as kubelet hostname ------------------------------------------------------------------- Mon Apr 24 16:00:34 UTC 2017 - containers-bugowner@suse.de - Replace the SVGs by PNGs ------------------------------------------------------------------- Mon Apr 24 15:55:29 UTC 2017 - containers-bugowner@suse.de - Some docs ------------------------------------------------------------------- Wed Apr 19 15:17:16 UTC 2017 - containers-bugowner@suse.de - Cleanup ------------------------------------------------------------------- Wed Apr 19 10:55:32 UTC 2017 - containers-bugowner@suse.de - Do not assume minion_id is hostname/fqdn ------------------------------------------------------------------- Tue Apr 18 09:43:21 UTC 2017 - containers-bugowner@suse.de - Allow the kubelet to run on 
Kubernetes 1.6 ------------------------------------------------------------------- Mon Apr 10 08:47:59 UTC 2017 - containers-bugowner@suse.de - Bug 1032379 - Must install flanneld on the kubernetes master node ------------------------------------------------------------------- Wed Mar 29 08:24:13 UTC 2017 - containers-bugowner@suse.de - Actually use `grains.get` default value ------------------------------------------------------------------- Tue Mar 28 18:17:25 UTC 2017 - containers-bugowner@suse.de - Always set `CN`. Even if no grains are set (because the domain could not be inferred), set the default dns domain from the pillar. ------------------------------------------------------------------- Tue Mar 28 16:13:12 UTC 2017 - containers-bugowner@suse.de - Fix etcd deps ------------------------------------------------------------------- Tue Mar 28 13:41:42 UTC 2017 - containers-bugowner@suse.de - Make etcd state a requirement for states that need etcd running on localhost ------------------------------------------------------------------- Mon Mar 27 15:53:38 UTC 2017 - containers-bugowner@suse.de - Do not indent (it's not a mine_function) ------------------------------------------------------------------- Mon Mar 27 14:10:38 UTC 2017 - containers-bugowner@suse.de - Fixed the infra container path for CaaSP ------------------------------------------------------------------- Mon Mar 27 13:00:42 UTC 2017 - containers-bugowner@suse.de - Do not set certificate `CN` if domain was not specified by a grain ------------------------------------------------------------------- Thu Mar 23 09:36:30 UTC 2017 - containers-bugowner@suse.de - Added parameters for passing extra arguments ------------------------------------------------------------------- Tue Mar 21 13:39:37 UTC 2017 - containers-bugowner@suse.de - Renamed API server vars ------------------------------------------------------------------- Mon Mar 20 15:48:33 UTC 2017 - containers-bugowner@suse.de - fix infra container 
image (=pause image) for opensuse ------------------------------------------------------------------- Mon Mar 20 12:32:44 UTC 2017 - containers-bugowner@suse.de - pod_infra_container_image is not optional anymore ------------------------------------------------------------------- Mon Mar 20 12:06:18 UTC 2017 - containers-bugowner@suse.de - Revert 6bae304 and fe1677c ------------------------------------------------------------------- Mon Mar 20 12:01:16 UTC 2017 - containers-bugowner@suse.de - fix etcd proxy instance failure on restart ------------------------------------------------------------------- Mon Mar 20 09:46:20 UTC 2017 - containers-bugowner@suse.de - Renamed API server vars ------------------------------------------------------------------- Fri Mar 17 10:15:57 UTC 2017 - containers-bugowner@suse.de - packaging: fix name of tarball directory ------------------------------------------------------------------- Fri Mar 17 09:56:00 UTC 2017 - containers-bugowner@suse.de - packaging: fix name of tarball directory ------------------------------------------------------------------- Fri Mar 17 09:02:45 UTC 2017 - containers-bugowner@suse.de - packaging: fix name of tarball directory ------------------------------------------------------------------- Thu Mar 9 12:33:22 UTC 2017 - jmassaguerpla@suse.com - Disable service as it needs to be this way in the final repo ------------------------------------------------------------------- Fri Mar 3 15:49:42 UTC 2017 - alvaro.saurin@suse.com - Updated for CaaSP ------------------------------------------------------------------- Thu Feb 23 11:47:37 UTC 2017 - alvaro.saurin@suse.com - Updated for k8s 1.5.3 ------------------------------------------------------------------- Thu Feb 23 10:09:27 UTC 2017 - alvaro.saurin@suse.com - Initial version