Nithin Movva will be presenting the paper "Exploiting Statistics on Query Expressions for Optimization".
We hope that you'll read the paper (http://research.microsoft.com/pubs/74106/sits.pdf) before the meetup, but don't stress if you can't.
Statistics play an important role in influencing the plans produced by a query optimizer. Traditionally, optimizers use statistics built over base tables and assume independence between attributes while propagating statistical information through the query plan. This approach can introduce large estimation errors, which may result in the optimizer choosing inefficient execution plans. In this paper, we show how to extend a generic optimizer so that it also exploits statistics built on expressions corresponding to intermediate nodes of query plans. We show that in some cases, the quality of the resulting plans is significantly better than when only base-table statistics are available. Unfortunately, even moderately-sized schemas may have too many relevant candidate statistics. We introduce a workload-driven technique to identify a small subset of statistics that can provide significant benefits over just maintaining base-table statistics. Finally, we present experimental results on an implementation of our approach in Microsoft SQL Server 2000.