• DSS-2019-04: LUKE METCALFE and YAN MOISEEV

    Commonwealth Bank of Australia

    Data Science Sydney proudly presents our speakers for April 2019: LUKE METCALFE: "Why generalists win at data science (and why everyone else loses)" YAN MOISEEV: “Building Robotic DJ 101” 200 seats available, first come, first served for members on the RSVP-yes list. Please ensure that you keep your RSVP up to date. If you cannot make it, please make your spot available for others as soon as possible. Subscribe to our YouTube channel: https://www.youtube.com/channel/UCMNZrokQNSm2UQ_T5VfOLgg To comply with CBA Security we need your FIRST and LAST NAME before the event. If these are not your profile name, please enter them when you register. Members who do not provide a first and last name will be removed from the guest list and will not be able to attend. Registration opens at 5:30pm and closes at 6:15pm sharp. Food and beverages will be served between 6pm and 6:15pm, and latecomers cannot be admitted. --- LUKE METCALFE: "Why generalists win at data science (and why everyone else loses)" About the Talk: Data scientists today are in the ironic position of being much in demand yet poorly utilised. Luke talks about why it's important to always view data in a wider context: in terms of the business, the end customer and the consumers of your insights and models. This not only makes your findings more digestible but also makes your analysis more accurate and more in tune with the fundamental drivers of your client's ROI. About the speaker: Luke Metcalfe has been a data scientist and serial entreprenerd for the past 18 years. He has built and sold two data businesses and now consults in the data space, specialising in getting a 360 view of the customer and operations. ----- YAN MOISEEV: “Building Robotic DJ 101” About the talk: Are you tired of your DJ always being drunk, late, and trying to steal your girlfriend/boyfriend? We have a perfect solution for you: DJBax.
Accenture Liquid Studio, along with our internal Amazon Business Group, has taken a 7-year-old robot designed for pick-and-place tasks and given it a more exciting job. With its first gig having already taken place at Amazon Re:Invent in Las Vegas at the end of 2018, we will talk about the hows and whys in case you decide to build one yourself. In this talk, we will walk you through our journey of giving Baxter a second life, from ideation to the actual performance. We will touch on topics such as AI creativity and how to mimic it, to what degree we can let AI react to our environment without being creepy, and how we can combine Deep Learning with good ol’ if-else statements. About the speaker: Yan is a Data Scientist at Accenture Liquid Studio focusing on taking on all of the latest buzzwords and making them commercially viable. He has diverse experience in a wide range of industries, from a stint living in a mining camp to designing massively scalable recommender systems and building autonomous Robotic DJs. His current projects include building a framework for Explainable AI and writing Homomorphic Encryption-compatible Machine Learning algorithms.

  • Kaggle Team challenge! And presentation by the previous winners!

    This event aims to bring together people who want to work with each other to improve their data science skills. Please only register if you are able and willing to try a Kaggle challenge over the 4 weeks after April 10 (there is no weekly Meetup; rather, each team is encouraged to organize its own regular working sessions). Some of the winners and competitors of the previous challenge (https://www.meetup.com/Data-Science-Sydney/events/qtfdlqyxpbcb/) will be sharing their process and methods with you tonight. Afterwards we will form new teams for three new challenges. You will build a portfolio of projects while tackling real-world machine learning challenges. Most materials are available online for free, but doing it alone can be hard and/or boring; doing it together is much more fun. The group is open to all data science enthusiasts regardless of background. The main objectives are data science education and networking. The purpose of this meeting is primarily to form teams of three to five like-minded data scientists, based on your preference for one of three Kaggle challenges. The three challenges will be selected in the coming days and posted on this Meetup, as well as described at the event itself. Your team will then have the following four weeks to further discuss the challenge, form a strategy and implement it. The outcomes will be presented four weeks later in a separate event, where a winner will be declared by popular vote. Because you will form groups of your own, there is no minimum or maximum level of experience, and you certainly don't have to be an experienced Kaggler. The important part is your willingness to do a team-based Kaggle competition in the next four weeks.
PROGRAM:
6 pm - Intro: case study (amazing results from joining a Kaggle competition); short introduction & grab pizza and a drink; intro challenge. Pizza and beer sponsor: The DataSchool DownUnder (MIP)
6:15 pm - Winners, PubG prediction: Swetansu Pattnaik (leader), Beau Bellamy, Sunilkumar Arumugam, Shinto Jose
6:35 pm - Winners, Google Store prediction (hard challenge): Geoff (leader), Javed, Yuri, Abner
6:55 pm - A new challenge (see comments for a sneak peek)! Start forming groups and making plans.
7:30 pm - Share plans and share more thoughts
Afterwards: drink (another) beer/wine/soda somewhere nearby
The location is at WeWork this time; please go to level 14. If you're on the waiting list, you can improve your chances of participating by giving a cool reply to the question asked. There will be pizza for the hungry and beer and soft drinks for the thirsty. Let me know if you have any questions, otherwise see you there! Charles de Leau

  • Unlocking the secrets in your DNA using Machine learning and Cloud-computing

    Genomics produces more data than astronomy, Twitter, and YouTube combined, which has caused research in this discipline to leapfrog to the forefront of cloud technology. Dr. Denis Bauer provides an insider’s view into the development of a Spark-based machine learning framework that is able to find disease genes in the 3 billion letters of the genome. She will also cover serverless, which is tipped to become an $8 billion market for its ability to accelerate software development, akin to how pre-fabrication has sped up the construction sector compared with bricklaying. Her serverless “search engine for the genome” enables researchers to use genome engineering for next-generation medicines. Speaker bio: Dr Denis Bauer is head of cloud computing, Bioinformatics, at Australia’s government research agency. She is an internationally recognised expert in machine learning and cloud-based genomics, having presented at AWS Summit, Canberra, 2018 and Open Data Science Conference, India, 2018. Her achievements include developing open-source machine-learning cloud services that accelerate disease research, which are used by 10,000 researchers annually. Signup link: https://yn201904syddenisbauer.eventbrite.com.au

  • DSS-2019-03: JAMES ROSS and VARUN NAYYAR

    Commonwealth Bank of Australia

    Data Science Sydney proudly presents our speakers for March 2019: VARUN NAYYAR: "The Why, How and What of Bayesian Inference" JAMES ROSS: "Measuring Financial Wellbeing" 200 seats available, first come, first served for members on the RSVP-yes list. Please ensure that you keep your RSVP up to date. If you cannot make it, please make your spot available for others as soon as possible. Subscribe to our YouTube channel: https://www.youtube.com/channel/UCMNZrokQNSm2UQ_T5VfOLgg To comply with CBA Security we need your FIRST and LAST NAME before the event. If these are not your profile name, please enter them when you register. Members who do not provide a first and last name will be removed from the guest list and will not be able to attend. Registration opens at 5:30pm and closes at 6:15pm sharp. Food and beverages will be served between 6pm and 6:15pm, and latecomers cannot be admitted. --- VARUN NAYYAR: "The Why, How and What of Bayesian Inference" About the Talk: Deep learning has provided an answer for many problems once thought unsolvable, but its performance isn't great when the datasets are small and/or we wish to understand causality. More generally, when datasets are a few thousand rows, even cross validation becomes less helpful for analysis, as we are still subject to the 1/sqrt(N) dependence of sampling deviation on sample size. These are the Data Science problems of tomorrow, and Bayesian Inference is the best framework to approach such questions. In fact, if you've looked at Probabilistic Graphical Models, you've already looked at Bayesian Inference. Most Bayesian discussions get stuck on theory, so this talk will focus on building on the theory to show how current ML procedures link to issues of frequentism and show a few real-world examples of Bayesian Inference outperforming frequentism, before concluding with some resources and ideas to help you bring Bayesianism into your work.
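As a rough illustration of the 1/sqrt(N) relationship mentioned in the abstract (this is not taken from the talk itself), the Python sketch below simulates many datasets of size N and measures how much the sample mean varies between them. The function name `standard_error_of_mean`, the choice of standard-normal data, and the trial count are assumptions made purely for illustration.

```python
import random
import statistics


def standard_error_of_mean(n, trials=500, seed=0):
    """Empirical spread of the sample mean over many simulated datasets of size n.

    Assumption for illustration: each data point is a standard-normal draw,
    so the theoretical standard error of the mean is 1 / sqrt(n).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


# Compare the simulated spread with the 1/sqrt(N) prediction.
for n in (25, 100, 400):
    simulated = standard_error_of_mean(n)
    predicted = 1 / n ** 0.5
    print(f"N={n:4d}  simulated={simulated:.4f}  predicted={predicted:.4f}")
```

Quadrupling the dataset size only halves the spread of the estimate, which is one way to see why a few thousand rows can still leave noticeable uncertainty.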
About the speaker: Varun has been a data scientist since before the title was coined. Drawn to the mix of computer science and mathematics, he was so enamoured that he completed two honours theses in Bayesian Inference, which had him programming GPUs before TensorFlow. He has since had a varied career working in startups, tech, HFT and even a brief stint with parliament, dealing with very diverse questions, data sizes and goals. Additionally, Varun is active in the open source community for Python & R, and if you ever draw a histogram in Python, you're likely running some of his code. ----- JAMES ROSS: "Measuring Financial Wellbeing" About the talk: A collaboration between Melbourne Institute and Commonwealth Bank of Australia has developed the CBA-MI Reported and Observed Financial Wellbeing Scales. These are two first-of-their-kind measures that combine self-reports of people’s financial experiences with bank-record indicators of wellbeing outcomes. The scales will help CBA, other organisations and policymakers to design policies, products and services to improve Australians’ financial wellbeing. In this presentation, we will investigate the analytical process used to transform the multifaceted notion of financial wellbeing into a robust pair of scales. We will also explore some unique insights produced as a result of the work. About the speaker: James is an associate data scientist at the Commonwealth Bank of Australia. He co-authored the first technical report published by Melbourne Institute and Commonwealth Bank of Australia. At work, James develops models for transactional sequence data. He also enjoys research in computational combinatorics with a focus on hypergraphs. Otherwise, he is busy perfecting his home brew or hiking in the mountains.

  • The Data Literacy Revolution - Dr. Eugene Dubossarsky

    Note: to register for this event, do not register on the current event page. RSVP here: https://www.meetup.com/FutureShapers/events/259375797/ The popularity and ubiquity of data science, data analytics and AI, and the trend towards digital transformation, have led to massive, repeated failures in many businesses. Despite billions spent, hundreds of Ph.D.s hired, and much boasting in conference presentations, many enterprises are still struggling to leverage the value of these new technologies. The missing ingredient is the literacy of the rest of the organisation, particularly senior management. This presentation will describe this new literacy, “data literacy”, the analogy with computer literacy, and reasons why this skill set will soon be as essential to all professionals as computer literacy is today. It will address issues of automation, the advent of decision making as the key managerial activity and the resulting democratisation of AI and analytics, while still maintaining a class of data science and analytics experts. The presentation will address issues of mindset, as well as skill set, and the ways in which management engagement with data analytics must change to leverage its value. Speaker Profile: Eugene Dubossarsky has been a commercial data scientist, financial trader, software developer and professional community leader for over 20 years. He is the principal founder of The Australia New Zealand Data Analytics Network, a community of over 15,600 members with regular meetups across the region. His data science training business Presciient has been in operation for over 9 years. He is also the Chief Data Scientist and a Managing Partner of AlphaZetta, a global data analytics training and consulting business spanning 44 countries. He is also a Founding Partner of Advantage Data, a data science technology company; Aurum Data, which leverages AI to value data sets; and risk.earth, an innovative cyclone catastrophe risk modelling company.
Please do NOT register for this event on this page. Instead, RSVP here: https://www.meetup.com/FutureShapers/events/259375797/

  • DSS-2019-03(I): YANIR SEROUSSI: A Day in the Life of a Remote Data Scientist

    Sydney Mechanics' School of Arts

    Data Science Sydney proudly presents a special speaker for March 2019: YANIR SEROUSSI: A Day in the Life of a Remote Data Scientist NOTE: the event is not at our usual location at CBA but at SMSA (280 Pitt St). Please ensure that you keep your RSVP up to date and make your spot available for others as soon as possible. Registration opens at 5:45pm with the talk starting at 6:00pm. --- About the Talk: Despite the recent rise in popularity of remote and distributed work, some people still believe that effective data scientists should spend most or all of their time in an office. This belief rests on tradition rather than on solid evidence. In reality, many data science tasks require spending quiet time alone with a computer, but many modern offices are noisy and full of distractions. There's no reason to keep good data scientists chained to desks for eight hours (or more!) every day. In this talk, Yanir will discuss his experience working remotely as a data scientist with Automattic, the company behind WordPress.com and one of the world's largest fully-distributed employers. He will demonstrate that remote data science work is not only possible – it can often be more productive and successful than the archaic approach of full-time on-site work. About the Speaker: Yanir has been working as a remote data scientist with Automattic since 2017. Prior to that, he lived in Sydney and Melbourne and worked as a data scientist with small startups, and as a data science consultant with bigger clients. He accidentally became a data scientist in 2012 after getting his PhD from Monash University – all he really wanted was to do more interesting software engineering after working with tech giants Google, Qualcomm, and Intel. His data science experience and interests include machine learning engineering, recommender systems, Bayesian modelling, and causal inference. You can often find him in or around the ocean in some corner of the world or in Ballina, where he now resides.

  • Introduction to Bayesian Inference with Stan - Michael Betancourt - Stan Core

    To register for this course, and for payment and location details, please do not use this Meetup page. Go to: https://presciient.com/event/stan-syd-feb-2019/ Despite the promise of big data, inferences are often limited not by the size of data but rather by its systematic structure. Only by carefully modeling this structure can we take full advantage of the data: big data must be complemented with big models and the algorithms that can fit them. Stan is a platform for facilitating this modeling, providing an expressive modeling language for specifying bespoke models and implementing state-of-the-art algorithms to draw subsequent Bayesian inferences. In this three-day course, we will introduce how to implement a robust Bayesian workflow in Stan, from constructing models to analyzing inferences and validating the underlying modeling assumptions. The course will emphasize interactive exercises run through RStan, the R interface to Stan, and PyStan, the Python interface to Stan. We will begin by surveying probability theory, Bayesian inference, Bayesian computation, and a robust Bayesian workflow in practice, culminating in an introduction to Stan and the implementation of that workflow. With a solid foundation we will continue with a discussion of regression modeling techniques along with their efficient implementation in Stan, spanning linear regression, discrete regression, and homogeneous and heterogeneous logistic regression. Time permitting, we will consider the practical implementation of advanced modeling techniques at the state of the art of applied statistics research, such as Gaussian process priors and the horseshoe prior. Prerequisites: The course will assume familiarity with the basics of calculus and linear algebra. To participate in the interactive exercises, attendees must provide a laptop with the latest version of RStan or PyStan installed. Users are encouraged to report any installation issues at the Stan forum as early as possible.
The instructor: Michael Betancourt Michael Betancourt is a research scientist with Symplectomorphic, where he develops theoretical and methodological tools to support practical Bayesian inference. He is also a core developer of Stan, where he implements and tests these tools. In addition to hosting tutorials and workshops on Bayesian inference with Stan, he also collaborates on analyses in epidemiology, pharmacology, and physics, among others. Before moving into statistics, Michael earned a BS from the California Institute of Technology and a PhD from the Massachusetts Institute of Technology, both in physics. Find out more at Michael’s website. https://betanalpha.github.io/consulting/ To read what students are saying about Michael’s courses, please scroll to the bottom of his consulting page. https://betanalpha.github.io/consulting/

  • DSS-2019-02: MICHAEL BETANCOURT (STAN CORE)

    Commonwealth Bank of Australia

    Data Science Sydney proudly presents our speaker for February 2019: MICHAEL BETANCOURT: SCALABLE BAYESIAN INFERENCE WITH HAMILTONIAN MONTE CARLO 200 seats available, first come, first served for members on the RSVP-yes list. Please ensure that you keep your RSVP up to date and make your spot available for others as soon as possible. To comply with CBA Security we need your FIRST and LAST NAME before the event. If these are not your profile name, please enter them when you register. Members who do not provide a first and last name will be removed from the guest list and will not be able to attend. Registration opens at 5:30pm and closes at 6:15pm sharp. Food and beverages will be served between 6pm and 6:15pm, and latecomers cannot be admitted. --- About the Talk: Despite the promise of big data, inferences are often limited not by sample size but rather by systematic effects. Only by carefully modeling these effects can we take full advantage of the data -- big data must be complemented with big models and the algorithms that can fit them. One such algorithm is Hamiltonian Monte Carlo, which exploits the inherent geometry of the posterior distribution to admit full Bayesian inference that scales to the complex models of practical interest. In this talk I will present a conceptual discussion of the challenges inherent to Bayesian computation and the foundations of why Hamiltonian Monte Carlo is uniquely suited to surmount them. About the Speaker: Michael Betancourt is the principal research scientist with Symplectomorphic, LLC, where he develops theoretical and methodological tools to support practical Bayesian inference. He is also a core developer of Stan, where he implements and tests these tools. In addition to hosting tutorials and workshops on Bayesian inference with Stan, he also collaborates on analyses in epidemiology, pharmacology, and physics, amongst others. Before moving into statistics, Michael earned a B.S. from the California Institute of Technology and a Ph.D. from the Massachusetts Institute of Technology, both in physics. Website: https://betanalpha.github.io Twitter: @betanalpha
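For readers who have not met Hamiltonian Monte Carlo before, the following is a minimal textbook-style sketch in plain Python. It is not the speaker's code and not Stan's implementation: the one-dimensional standard-normal target, the fixed step size, the fixed number of leapfrog steps, and all function names (`leapfrog`, `hmc_sample`, `log_prob`, `grad_log_prob`) are assumptions chosen only to show how the gradient of the log density guides each proposal.

```python
import math
import random


def leapfrog(x, p, grad_log_prob, step_size, n_steps):
    """Approximate Hamiltonian dynamics with the leapfrog integrator."""
    p = p + 0.5 * step_size * grad_log_prob(x)  # initial half step for momentum
    for step in range(n_steps):
        x = x + step_size * p  # full step for position
        if step < n_steps - 1:
            p = p + step_size * grad_log_prob(x)  # full step for momentum
    p = p + 0.5 * step_size * grad_log_prob(x)  # final half step for momentum
    return x, p


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_steps=20, seed=0):
    """Draw samples from a 1-D density using basic HMC with a unit mass."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
        x_new, p_new = leapfrog(x, p, grad_log_prob, step_size, n_steps)
        # Hamiltonian = potential energy (-log density) + kinetic energy.
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's discretisation error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Hypothetical target for illustration: a standard normal, up to a constant.
def log_prob(x):
    return -0.5 * x * x


def grad_log_prob(x):
    return -x


samples = hmc_sample(log_prob, grad_log_prob, x0=0.0, n_samples=2000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f} variance={var:.3f}")
```

Practical tools such as Stan adapt the step size and trajectory length automatically; fixing them by hand here keeps the sketch short.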

  • INTELLIGENCE-DRIVEN BUSINESS

    Intelligence is well understood and valued in the national security context, but its application in the private sector is nascent. Pockets of intelligence expertise exist in the private sector at the enterprise level, but they largely remain unconnected and narrowly employed. Reperceiving intelligence as a core enabling capability for the enterprise creates a new driver of business value and the capacity to support a wide range of business decisions, ranging from operational to strategic. This seminar will explore the proposition of ‘intelligence-driven business’ through four sessions: an opening address and keynote address to set the scene; case-based presentations describing domains of intelligence practice; case-based presentations describing how to better enable intelligence practice; and a panel discussion to synthesise challenges and opportunities for strengthening intelligence-driven business. ‘Intelligence-Driven Business’ will be of interest to private sector managers and employees wanting to grow their intelligence capability, academics concerned with the curriculum design and development needed to service the growth of intelligence in the private sector, and business students considering careers in the private sector. Tickets: General Admission: AUD $70; Student Admission: AUD $20. Refunds and cancellations: Registration fees will be refunded in full on request when the cancellation is made before 15 February 2019. No refunds will be issued after this date, except in the case of extenuating circumstances. This event is hosted by Macquarie University in collaboration with the Australian Institute of Professional Intelligence Officers (AIPIO). Please do not RSVP to Meetup for this event. Register here: https://www.mq.edu.au/about/events/view/intelligence-driven-business/
