
Virtual
December 9-10

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon + Open Source Summit China 2021 - Virtual to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in China Standard Time (UTC +8). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change.
Friday, December 10 • 12:10 - 12:45
Vivo's AI Computing Platform on Kubernetes - Ziyang Wu, Vivo

Vivo is one of the biggest smartphone companies in the world. Hundreds of engineers and researchers in its AI Lab work in areas such as NLP, CV, recommendation, and speech, which produce varied and complicated model training and serving cases. The AI computing platform is built to address two major challenges: 1. Provide efficient resource scheduling for massively distributed model training and serving. 2. Achieve high utilization of computing resources, especially expensive GPU devices. Today the platform has several clusters in production, thousands of GPU nodes and hundreds of GPU nodes. Hundreds of services are deployed and hundreds of ML jobs are run every day. This session will cover how the platform is built with Kubernetes, kube-batch, Kubeflow, and other OSS. It will also cover the issues the team ran into, their hard-earned best practices, and the contributions they made to the open-source community.

Speakers
Ziyang Wu

Staff Engineer, vivo
Ziyang is a staff engineer at vivo AI Lab, where he leads the engineering effort on the vivo AI computing platform. Prior to vivo, Ziyang worked for Rancher and Oracle. He is active in the cloud native community and is a contributor to kube-batch, tf-operator, and other projects.


Friday December 10, 2021 12:10 - 12:45 CST
KubeCon + CloudNativeCon Lecture Hall