Microsoft Double Act: David Smith and Hong Ooi


We are privileged to have two R gurus present to us: David Smith and Hong Ooi, both previously at Revolution Analytics and now at Microsoft.

This event is being run in conjunction with Microsoft's Citizen Data Scientists meetup: please RSVP either here or there, but not both.

Many thanks to Microsoft for hosting and catering for this event.


5:30pm - Networking, pizzas and drinks.
6:00-6:45pm - Not Hotdog: Image recognition with R and the Custom Vision API (David Smith)
6:45-7:00pm - Short break
7:00-7:45pm - AzureR: talking to Azure from R (Hong Ooi)
7:45-8:00pm - More networking and close.


First presentation:

Not Hotdog: Image recognition with R and the Custom Vision API.

Building an application that can recognize a specific type of object from scratch is possible with tools like convolutional neural networks, but it's not easy: you may need many thousands of labelled representative and unrepresentative images, and training such a model may consume many expensive GPU core-hours.

A simpler yet effective approach is transfer learning: take a standard neural network already trained to recognize general objects, and reuse the features it has already learned to recognize a new set of objects. With this method, you need far fewer novel images, and the training process is much faster.

In this talk, I'll use R in conjunction with the Microsoft Custom Vision API to train and use a custom vision recognizer. I'll use an example motivated by the TV series "Silicon Valley", and with just a couple of hundred images of food, create a function in R that can detect whether or not a given image contains a hot dog.
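
As a rough illustration of the transfer-learning idea described in this abstract, the sketch below keeps a stand-in "pretrained" feature extractor frozen and trains only a small classifier on top of it, using a handful of labelled examples. It is written in plain Python purely so that it is self-contained; the talk itself uses R. It is not the speaker's code and does not call the Custom Vision API. The function names, the toy "images" (short lists of pixel brightness values) and the hand-written feature extractor are all assumptions made for illustration.

```python
import math


def pretrained_features(image):
    # ASSUMPTION: stand-in for a frozen network already trained on general
    # images. A real one would return hundreds of learned features; this toy
    # version returns two hand-written ones and is never updated in training.
    mean = sum(image) / len(image)
    contrast = max(image) - min(image)
    return [mean, contrast]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(feature_rows, labels, epochs=2000, learning_rate=0.5):
    # Train only the small "head" (logistic regression) on the frozen features,
    # using plain batch gradient descent on the log loss.
    weights = [0.0] * len(feature_rows[0])
    bias = 0.0
    n = len(feature_rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(feature_rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def is_hot_dog(image, weights, bias):
    # Classify a new image: frozen features first, then the trained head.
    x = pretrained_features(image)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) > 0.5


# Hypothetical toy data: "hot dog" images are bright, other food is dark.
hot_dogs = [[0.9, 0.8, 0.7, 0.95], [0.8, 0.85, 0.9, 0.75], [0.7, 0.9, 0.8, 0.85]]
other_food = [[0.1, 0.2, 0.15, 0.05], [0.2, 0.1, 0.3, 0.25], [0.3, 0.2, 0.1, 0.15]]

images = hot_dogs + other_food
labels = [1] * len(hot_dogs) + [0] * len(other_food)
weights, bias = train_head([pretrained_features(im) for im in images], labels)

print(is_hot_dog([0.85, 0.9, 0.8, 0.75], weights, bias))
print(is_hot_dog([0.1, 0.15, 0.2, 0.1], weights, bias))
```

Only the weights of the small head are learned here, which is why a few examples are enough; the expensive part, the feature extractor, is reused as-is.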

David Smith is a developer advocate at Microsoft, with a focus on data science and the R community. With a background in Statistics, he writes regularly about applications of R at the Revolutions blog, and is a co-author of “An Introduction to R”, the R manual. Follow David on Twitter as @revodavid.


Second presentation:

AzureR: talking to Azure from R

AzureR is a family of packages for working with Microsoft’s Azure cloud platform from R. The core of the family is AzureRMR, which provides an R interface to Azure Resource Manager. It handles authentication, allows you to manage subscriptions and resource groups, and lets you create, update and delete individual resources and templates. This is a lightweight package that interfaces directly with the Resource Manager REST API, requiring neither PowerShell nor Python. It is also easily extensible to handle specific Azure services; current packages that extend AzureRMR include the following.

AzureStor is a package for Azure storage services. On the admin side, it provides the ability to create and delete storage accounts; on the client side, it provides an interface to blob storage and file storage: upload and download files, create and delete shares and containers, list files, etc. Both authenticated and public (anonymous) access are supported. Support for other storage types is coming soon.

AzureVM is a package for managing virtual machines. Deploy, start up, shut down, and delete VMs (and clusters of VMs), run scripts, etc. It comes with a selection of templates based on the Azure Data Science Virtual Machine (both Ubuntu and Windows flavours), or you can supply your own templates.

AzureContainers lets you work with Azure Container Registry (ACR), Container Instances (ACI) and Kubernetes Service (AKS). Upload images to a Docker registry hosted in Azure (or anywhere), and then deploy containers and services to a Kubernetes cluster.

Hong Ooi is a senior data scientist with the Azure CAT team. In addition to working with customers to solve analytics problems, he hacks with R, something he has been doing for several years (and with S-Plus before that). Prior to joining Microsoft, he had stints at Pivotal, ANZ Bank in Melbourne, and NRMA Insurance in Sydney.