***ATTENTION***: You need to register at the link below to attend the meetup.
Register here: CLOSED
We are delighted to announce that our next meetup will be in cooperation with Maven Securities.
Special thanks to our sponsor for F&B.
Special Episode: This event comes with a certificate for 2 hours of CPT.
----------------------------------------------------------------------
Speaker 1: Yunbo Ni
Topic: Beyond Generic Coding Agents: Agentic Code Review for Critical Software Systems
Summary: Modern software infrastructure increasingly depends on large, complex, and fast-evolving codebases. Compilers are a representative example: they sit underneath trading systems, networking stacks, AI infrastructure, and many other performance-critical applications, yet subtle compiler bugs can silently propagate into downstream systems. In this talk, I will use compiler reliability as a case study to discuss how agentic code review can be designed for real-world, high-stakes software systems.
I will introduce Archer, an agentic code review framework for LLVM. Instead of relying on a general-purpose coding agent, Archer combines domain-specific knowledge, compiler-aware reasoning, and a customized validation harness to inspect compiler changes and produce reproducible reports. In our evaluation, this specialized design is significantly more effective than general-purpose commercial review tools such as CodeRabbit and Greptile at identifying potential semantic issues in compiler optimizations.
Beyond compilers, this talk aims to share broader lessons on building effective agents for large systems: how to decompose expert workflows, how to design tool-using agents, how to build reliable validation harnesses, and why domain-specific agent infrastructure can outperform generic AI coding assistants.
Bio: He is a PhD student in Computer Science at The Chinese University of Hong Kong, advised by Prof. Shaohua Li. His research focuses on software engineering, programming languages, and AI-assisted developer tools. He is particularly interested in applying large language models and program analysis techniques to improve the reliability of complex software systems. His work has been published in top-tier international venues, including TOSEM and OOPSLA.
----------------------------------------------------------------------
Speaker 2: Serafim Petrov
Topic: Leveraging XGBoost for Hierarchical Time Series Forecasting: A Practical Approach for Corporate Decision-Making
Summary: Accurate forecasting is of paramount importance for a top-tier corporate client in Hong Kong. Our team was tasked with replacing their manual forecasting process with a modern machine learning framework that combines an automated workflow, robust predictive models, and an actionable dashboard seamlessly integrated into their existing infrastructure.
In this presentation, we will walk through the key stages of a corporate data science project, including data preparation, model development and fitting, and the consumption layer. We will also discuss the role of MLOps in ensuring reliable model deployment and ongoing maintenance, as well as potential future integrations to enhance business impact.
Bio: He is a senior data analytics consultant with over a decade of experience applying AI and machine learning to solve real-world challenges across consulting, finance, government, and research in Europe, the US, and Hong Kong. His expertise spans predictive modeling, generative AI, and automation, with a focus on leveraging AI/ML platforms to convert complex datasets into strategic decision-making tools.
----------------------------------------------------------------------
Speaker 3: Zhai Yu
Topic: Grounding LLMs to Mitigate Hallucination: Retrieval-Augmented Generation for Adverse Drug Reaction Normalization in Social Media Data
Summary: Social media data contains large amounts of unstructured, patient-reported adverse drug reactions (ADRs), which can benefit pharmacovigilance. However, mapping colloquial language to standardized terminologies such as MedDRA using large language models (LLMs) often produces hallucinated terms or codes.
This talk presents a framework to mitigate hallucinations by grounding LLM outputs in external knowledge and rigid constraints. The resulting methodology provides a reproducible and reliable approach to ADR detection and normalization, demonstrating how retrieval-augmented generation (RAG), dense reranking, and output-level constraints can together overcome the hallucination problem.
Bio: A PhD student in Computational Linguistics and Natural Language Processing, with research interests in knowledge engineering, ontology, and LLMs in healthcare.