Mixture-of-Experts (MoE) is a powerful approach that leverages the strengths of multiple specialized models to tackle complex problems, offering improved performance and efficiency across a range of machine learning applications. In this talk, Dr. Lin will cover the basics of MoE and its recent developments in large language models. He will also briefly introduce the theory behind MoE under different setups, shedding light on its performance in practice.
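
For readers unfamiliar with the idea, the sketch below illustrates a minimal top-k gated MoE layer; it is an illustration only, not material from the talk, and the expert architecture, dimensions, and routing scheme are assumptions chosen for the example.

```python
# Minimal sketch of a top-k gated MoE layer (illustrative assumptions only:
# expert design, sizes, and routing are not taken from the talk).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])
        # The gate scores each token against every expert.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x):  # x: (num_tokens, d_model)
        scores = self.gate(x)                           # (tokens, experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # route each token to its top-k experts
        weights = F.softmax(weights, dim=-1)            # normalize gate weights over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

x = torch.randn(8, 64)       # 8 tokens, model dimension 64
print(MoELayer()(x).shape)   # torch.Size([8, 64])
```

Because only the top-k experts are evaluated per token, the layer's total parameter count can grow with the number of experts while the compute per token stays roughly fixed, which is the efficiency argument usually made for MoE in large language models.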