Non-Invasive Performance Intelligence for High-Speed Network Systems
Details
Abstract: Modern cloud and telecom networks increasingly run high-speed packet processing as software on commodity servers. While flexible, these systems often suffer from unpredictable throughput and latency degradation under shared-resource contention, yet traditional monitoring based on packet inspection and in-band instrumentation is costly and operationally heavy. This talk presents a non-invasive performance intelligence approach that uses low-level hardware and system telemetry (e.g., CPU/cache counters and OS signals) to infer service-level KPIs without touching the dataplane. I will describe a two-layer architecture: an analytics layer with pluggable surrogate models for instantaneous KPI estimation and short-horizon forecasting, and an MLOps layer that automates drift detection, model selection, retraining, and safe rollout. The design is built to adapt across deployment environments, including settings where hardware counter access is limited, and to scale model lifecycle operations through distributed execution. The ultimate goal is to combine lightweight telemetry with tailored data-driven analytics to derive performance insights into high-speed network systems.
Speaker: Tianzhu Zhang is a research scientist at Nokia Bell Labs. He received his B.S. degree from Huazhong University of Science and Technology, China, in 2012, and his M.S. and Ph.D. degrees from Politecnico di Torino, Italy, in 2014 and 2017, respectively. From 2017 to 2019, he was a postdoctoral researcher at Telecom ParisTech. His research interests center on applied AI/ML for network systems.
