- Inconvenient Mathematics - Why Data Scientists Can't Give A Free Lunch
Abstract: Data science, the 'sexiest job of the 21st century', has seen incredible growth in the past few years. With an explosion of available data, tools, software packages, and traditional and online course offerings, it seems that nothing is impossible. Current tools, such as Microsoft's Machine Learning Server, allow for complex machine learning algorithms to be built, configured, and utilized in a few lines of code. While such advancements have streamlined data science, there are still mathematical limitations to predictive modelling, including:
• Bias-Variance Trade-off
• Ugly Duckling Theorem
• No Free Lunch Theorem
Examples of these will be demonstrated on publicly available data on Melbourne house prices in terms of solving regression and classification problems.
Speaker Bio: Dr Craig Savage has worked in a number of technical fields, including rocket science and bringing sight to the blind. He has experience in Credit Risk at NAB and ANZ, and is currently a Specialist Master at Deloitte. His professional interests include data analysis, data visualization and, more importantly, driving evidence-based decisions and actions.
- Dealing with 'unusual' data in Machine Learning and Statistical Modelling.
Topics covered:
• The theory of 'unusual-ness'
• The effect of these values on your model performance
• Diagnosing this problem in practice
• (More) robust models
• Dealing with univariate and multivariate cases
Methods in R and Python are shown throughout this meetup.
Speaker Bio: Andrew Worsley, National Data Science Lead at Velrada. Andrew has expertise in the areas of machine learning, statistics and cloud computing. He has also co-authored and contributed to academic publications in the area of medicine and population health, and teaches programming and data science.
- Microsoft Double Act: David Smith and Hong Ooi
We are privileged to have two R gurus present to us: David Smith and Hong Ooi, both previously at Revolution Analytics and now at Microsoft.
Agenda:
5:30 - Networking, pizzas and drinks.
6:00-6:45 - Not Hotdog: Image recognition with R and the Custom Vision API (David Smith)
6:45-7:00 - Short break
7:00-7:45 - AzureR: talking to Azure from R (Hong Ooi)
7:45-8:00 - More networking and close.
- Modelling with Impact: What you need to know as an analytics professional!
The demand for data analysis and advanced visualizations has increased substantially. For analytics professionals, this is great news: a growing appetite for the application of analytics, demand for approaching problems using the scientific method, and the ability to have a bigger impact on the final result. But therein lies a big problem. Surely being able to conduct more sophisticated analysis would lead to better outcomes? Industry practitioners would tell you otherwise. This talk will use the recent accounting standard IFRS 9 as an example: a new standard that calls for a level of mathematical modelling not previously required, with literally billions of dollars at stake. You will walk out with a better appreciation of the current challenges faced by industry practitioners, and of potential solutions.
Note: Drinks and pizzas will be provided!
About Speaker: Craig Savage has worked in a number of technical fields, including rocket science and bringing sight to the blind. He has experience in Credit Risk at NAB and ANZ, and is currently an Analytics Consultant at Connected Analytics. His professional interests include data analysis, data visualisation and, more importantly, driving evidence-based decisions and actions.
- Email Intent Classification in the Real-World: Challenges and Approaches
In this talk we will share our experience in developing a Machine Learning approach to email intent classification. We will discuss practical challenges in text classification, such as consistent labelling of training data for ML modelling, and also talk about approaches to alleviate some of those challenges.
Jin Yu is a senior Data Scientist with Microsoft. She leads the execution of advanced analytics projects with Microsoft customers. Jin has worked with companies across broad industry verticals including telco, healthcare, global financial services, and state government agencies to develop analytics solutions for their most critical business problems.
Note: Snacks and drinks will be provided!
- Pitfalls of Machine Learning (aka "What you should learn from my mistakes")
Recently, there has been increased attention to the successes of machine learning, including beating (human) world champions at games of chess and Go. While the successes of such algorithms make fine headlines, the challenges associated with such algorithms make for less interesting reading but very interesting jobs! This talk will explore some of the practicalities of machine learning algorithms and cultures in multiple aspects, including development, validation, and governance. For each aspect, I'll consider problem statements and tools that I have applied in my experience in different industries, including defence, biomedical engineering and credit risk.
Speaker Bio: Craig Savage has worked in a number of technical fields, including rocket science and bringing sight to the blind. He is currently a Manager of Models Management at ANZ, focusing on process improvements throughout the life cycle of Credit Risk Models. His professional interests include data analysis, data visualisation and, more importantly, driving evidence-based decisions and actions.
Session Time: The session will start at 6pm and run until around 7:30pm. Pizza will be available from 5:30pm for those who want to network and share ideas beforehand.
- Galaxy Classification: a data science workflow with R Server 9
Microsoft R Server 9.0 includes several new features for data scientists. These include MicrosoftML, an advanced machine learning package; sqlrutils, for operationalising R models into SQL Server; and mrsdeploy, the next-generation DeployR tool for creating REST APIs from R. I’ll talk about these features and demonstrate them as part of a data science workflow: classifying 240,000 galaxies from the Galaxy Zoo project using deep neural networks.
About the Speaker (Hong Ooi): Hong Ooi is a senior data scientist with Microsoft, based in Melbourne. Prior to joining Microsoft, he had analytics roles in a number of large organisations, including Pivotal, ANZ Bank and IAG. He is a veteran R user and programmer, having started with S-Plus in the 1990s. His background is in statistics and actuarial science, and he holds a PhD in computational statistics from the Australian National University. He still can’t quite get used to referring to himself in the third person.
- Gold Coast: Microsoft Ignite Australia
Microsoft Ignite is the conference for Developers and IT Pros. It’s for curious minds hungry to connect with and learn from the brightest folk. And it’s for the innovators who want to make the most of what they have and find smarter ways of doing things. Microsoft Ignite Australia happens to attract some of the world’s brightest minds. It’s your chance to meet them, extend your professional network and deepen personal connections. Thoughts will be sparked. Questions will be answered. Fun will be had. Register online: https://msftignite.com.au/
- Breaking the Bottleneck: From fast programs to fast projects
Talk description: Advances in computer hardware and software have greatly reduced the processing time needed to solve complex problems. However, this does not readily translate into faster completion of software projects; instead, the bottleneck has moved elsewhere. In this talk, I'll present my personal experiences with identifying and addressing the limitations brought to the foreground by faster computing, and means of mitigating them with open source tools. Finally, potential areas of further integration will be presented for consideration.
Speaker Bio: Craig Savage has worked in a number of technical fields, including rocket science and bringing sight to the blind. He is currently a Manager of Models Management at ANZ, focusing on process improvements throughout the life cycle of Credit Risk Models. His professional interests include data analysis, data visualisation and, more importantly, driving evidence-based decisions and actions.
- Creating Effective Graphs with R and Power BI
The R Project is a system for statistical computation and graphics, ideal for Data Scientists. It has been run as a global open-source project for over 20 years, and is wildly popular – this year R moved up to 5th place in the IEEE language rankings of all programming languages, behind only C, Java, Python and C++. Microsoft’s Power BI platform can be extended by integration with R, supporting both R data integration scripts and embedding R graphics in Power BI reports and dashboards. As this presentation will show, this combination makes it easy to produce dynamic, interactive R graphics and also share them through the powerbi.com site, without writing any extra code. For R users, the key advantage is interactive filtering, which produces unlimited variations of your R graphics without changing a line of code. You can also share your R content easily via the polished and modern Power BI web service. For Power BI users, R greatly extends the graphical options available, and makes a vast array of proven statistical packages available to enhance your data.
About the presenter – Mike Honey: Mike is the Tech Lead at Manga Solutions, a Melbourne-based data integration and data visualisation consultancy, and Power BI Partner. He is helping a range of clients – from large corporates to small consultancies and not-for-profits – to tell their data stories using Power BI and associated tools.