DataTalks HFA #17: Efficiency and Generalization in Modern Neural Networks

Hosted By
Bar Eini P. and Tom B.

Details

📢 Dive into the enigmatic world of modern neural networks at our upcoming Datahack Haifa meetup! Hosted by our great friends at Gav-Yam MATAM, this event features talks by leading experts Prof. Daniel Soudry and Dr. Brian Chmiel. They will share their research and practical insights in two areas: efficiency, how to reduce the computational demands of optimizing neural networks; and generalization, how deep networks manage to perform well on unseen data despite their complexity.

  • 15:30 MPB (mingling, pizza, beer)
  • 16:00 Prof. Daniel Soudry
  • 16:45 Dr. Brian Chmiel

♦ Location: Tech&Talk MATAM, Haifa
♦ Language: The event will be held in Hebrew
♦ Background: Basic knowledge of data science and machine learning is recommended
------------------------------------------------------------------------------------------------
Abstracts:

🚀 Why do typical deep networks generalize well?
Deep neural network models keep growing larger. However, it is not clear how such large ("over-parameterized") models can work at all, since classical ("worst-case") theory tells us they should overfit and fail to generalize to unseen data. Interestingly, we prove that random models that fit the data typically generalize well. This remains true even when they are over-parameterized, as long as the labels are generated by a "narrow teacher".
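For intuition, here is a minimal toy experiment in Python. It is our own illustrative sketch, not the construction from the talk, and the architecture, sizes, and sampling scheme below are all arbitrary assumptions. We rejection-sample random, wide "student" networks, keep only those that happen to fit the labels a narrow "teacher" assigns to a few training points, and measure how well the accepted students agree with the teacher on held-out points:

```python
# Toy illustration (an assumption-laden sketch, not the talk's proof):
# random over-parameterized "students" that happen to fit a narrow
# "teacher" tend to agree with that teacher on unseen data.
import numpy as np

rng = np.random.default_rng(0)
d, width, n_train, n_test = 10, 200, 8, 500

# Narrow teacher: a single random linear threshold unit.
w_teacher = rng.normal(size=d)
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
y_train = np.sign(X_train @ w_teacher)
y_test = np.sign(X_test @ w_teacher)

def random_student():
    """A wide, random one-hidden-layer ReLU network with a sign readout."""
    W1 = rng.normal(size=(d, width))
    w2 = rng.normal(size=width)
    return lambda X: np.sign(np.maximum(X @ W1, 0.0) @ w2)

# Rejection-sample students until 20 of them interpolate the training set.
test_accs = []
for _ in range(50_000):
    f = random_student()
    if np.all(f(X_train) == y_train):  # student fits the data exactly
        test_accs.append(np.mean(f(X_test) == y_test))
        if len(test_accs) == 20:
            break

print(f"found {len(test_accs)} random interpolators; "
      f"mean test agreement with teacher: {np.mean(test_accs):.2f}")
```

If the flavor of the result carries over to this toy setting, the accepted random interpolators should agree with the teacher well above the 50% chance level.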

Daniel Soudry is an associate professor and Schmidt Career Advancement Chair in AI in the Electrical and Computer Engineering Department at the Technion, working in the areas of machine learning and neural networks. His recent work focuses on resource efficiency and implicit bias in neural networks. He is a member of Israel's Young Academy and the recipient of the Gruss Lipper Fellowship, the Goldberg Award, an ERC Starting Grant, and Intel's Rising Star Faculty Award.

🤖 Overcoming 4-bit quantization challenges in large language model training
In recent years, natural language processing (NLP) has been transformed by large language models (LLMs), which excel in contextual understanding and reasoning as their size increases. However, the growth of these models comes with significant computational demands, making training and inference highly resource-intensive. To address this, quantization has emerged as a key technique, reducing memory usage by lowering the bit widths of model components without sacrificing performance. Despite its potential, standard 4-bit quantization formats have struggled with LLM training due to the long-tail distribution of neural gradients and the need for unbiased gradients for effective stochastic gradient descent (SGD). To overcome this, we propose a modified 4-bit format that addresses these challenges, enabling successful LLM training with comparable results to higher-bit representations.
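To make the unbiasedness point concrete, here is a small Python sketch using a plain uniform 4-bit grid; it illustrates the general failure mode, not the modified format proposed in the talk. With long-tailed gradients, one large outlier sets the quantization scale, so most gradients sit far below a single quantization step: round-to-nearest flushes them all to zero (a systematic bias), while stochastic rounding preserves their mean, keeping the gradient estimate unbiased as SGD requires:

```python
# Illustrative sketch (uniform 4-bit grid, not the talk's modified format):
# why unbiased (stochastic) rounding matters for long-tailed gradients.
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, n_bits=4, stochastic=False):
    """Uniform symmetric quantization to a signed n_bits grid."""
    n_levels = 2 ** (n_bits - 1) - 1       # 7 positive levels for 4 bits
    scale = np.max(np.abs(x)) / n_levels   # the largest value sets the scale
    y = x / scale
    if stochastic:
        # Round up with probability equal to the fractional part,
        # so E[quantize(x)] == x (unbiased).
        lo = np.floor(y)
        y = lo + (rng.random(y.shape) < (y - lo))
    else:
        y = np.round(y)                    # round-to-nearest: biased near zero
    return np.clip(y, -n_levels, n_levels) * scale

# Long tail: one huge gradient plus many tiny ones.
grads = np.concatenate([[100.0], np.full(100_000, 0.03)])

q_det = quantize(grads, stochastic=False)
q_sto = quantize(grads, stochastic=True)
print("true mean of the small grads:", grads[1:].mean())   # 0.03
print("round-to-nearest mean:       ", q_det[1:].mean())   # 0.0, all flushed
print("stochastic rounding mean:    ", q_sto[1:].mean())   # ~0.03 on average
```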

Brian is a researcher at Habana Labs-Intel, specializing in deep learning and optimization. He has authored several papers at top conferences such as NeurIPS, ICLR, and ICML, garnering hundreds of citations. Brian earned his PhD from the Technion, where he was supervised by Prof. Daniel Soudry and Prof. Alex Bronstein.
