Are you curious whether your Kubernetes cluster can meet performance needs when new business requirements arrive? Recently, our Kubernetes cluster has evolved to handle a large-scale mix of long-running workloads and offline big data/ML training jobs. This has allowed the cluster to reach 15k nodes, making it one of the largest clusters in the community. In this talk, we will present methods for managing an extremely large-scale Kubernetes cluster to meet business needs. We identify performance bottlenecks through real traffic analysis, simulation, and performance testing. Based on those findings, we optimize Kubernetes apiserver performance and reduce list/create/update/delete response times to meet the SLO. We'll share some improvements we've made on the apiserver side as well as on the client side, e.g. in different operators. We'll also cover some aspects of etcd performance.