Data Science at River House


Details
DESCRIPTION
An evening to showcase Data Science as it's being used by resident companies in River House.
LOCATION
Club Lounge (1st Floor), Clockwise Offices, River House
SPONSORS
This is a community-led event, organised by residents of River House. We'd like to thank Clockwise for providing the Club Lounge and Signifyd for providing refreshments on this occasion.
FORMAT
18:00 - 18:30 Networking and Refreshments
18:30 - 20:15 Short 15-minute presentations on technical topics
20:15 - 21:00 Networking
21:00 onwards The National Bar
DETAILS
Framework for detecting anomalies in network data - Shane Toner, Qarik
Cyber security is growing in importance because of the damage breaches do to companies. Generic security tools use a rule-based approach to find anomalies in network logs; the rules can be complex and have to be managed manually. Machine learning can automate anomaly detection and make it more scalable as networks grow.
Network logs can be used to form time series that have a normal structure. Anomalous activity is defined as activity that deviates from this norm, on the assumption that an attacker who penetrates the network will change its normal behaviour.
This framework sets out methods to detect anomalies in network logs, first by creating a time series of events per unit time for each IP/Computer in network logs, then grouping these time series to train autoencoders. The autoencoders learn the structure and can detect deviations from normal data when tested with anomalies. A Risk Score is introduced to quantify the strength of identified anomalies to reduce the false-positive rate.
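The first step described above, turning raw log records into a time series of event counts per unit time for each IP, could be sketched roughly as follows. This is a minimal illustration only and not the speaker's implementation: the function name `events_per_bucket`, the `(timestamp, ip)` record layout and the one-hour bucket size are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def events_per_bucket(records, bucket_seconds=3600):
    """Count events per time bucket for each IP.

    records: iterable of (iso_timestamp, ip) pairs (an assumed log layout).
    Returns {ip: [count_in_bucket_0, count_in_bucket_1, ...]}.
    Buckets are anchored at the earliest event in the input.
    """
    parsed = [(datetime.fromisoformat(ts), ip) for ts, ip in records]
    if not parsed:
        return {}
    start = min(t for t, _ in parsed)
    end = max(t for t, _ in parsed)
    n_buckets = int((end - start).total_seconds() // bucket_seconds) + 1

    # Every IP gets a series of the same length, zero-filled by default.
    series = defaultdict(lambda: [0] * n_buckets)
    for t, ip in parsed:
        idx = int((t - start).total_seconds() // bucket_seconds)
        series[ip][idx] += 1
    return dict(series)


# Made-up sample records, purely for illustration.
logs = [
    ("2020-01-01T00:05:00", "10.0.0.1"),
    ("2020-01-01T00:20:00", "10.0.0.1"),
    ("2020-01-01T01:10:00", "10.0.0.2"),
    ("2020-01-01T02:30:00", "10.0.0.1"),
]
series = events_per_bucket(logs)
for ip, counts in sorted(series.items()):
    print(ip, counts)
```

Each IP ends up with a series of equal length, which keeps the later grouping and model-training steps simple.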
Dynamic Time Warping and HDBSCAN were shown to work well for grouping the time series, and an autoencoder was then trained for each group/cluster. Increased sensitivity and the computed Risk Scores contributed to accurate classification of simulated anomalies. The output of the framework is an IP, a time bucket and a Risk Score that can be passed to a person or another process to check the logs for the anomalous activity.
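For readers unfamiliar with Dynamic Time Warping, the sketch below shows the textbook dynamic-programming form of the distance and how a pairwise distance matrix might be built from it. It is an assumption-laden illustration, not the framework's code: the function names, the absolute-difference local cost and the sample series are all hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Uses absolute difference as the local cost (an assumption) and the
    standard recurrence over insertion, deletion and match steps.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def pairwise_dtw(series_list):
    """Symmetric matrix of DTW distances between all series."""
    k = len(series_list)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(series_list[i], series_list[j])
    return dist


# Made-up series: the second is the first with a delayed start.
example = [
    [0, 1, 2, 1, 0],
    [0, 0, 1, 2, 1, 0],
    [5, 5, 5, 5, 5],
]
matrix = pairwise_dtw(example)
for row in matrix:
    print(row)
```

Because DTW lets one series stretch in time to match another, the first two series are at distance zero despite the delay, while the flat third series is far from both. A matrix like this could then be handed to a clustering method that accepts precomputed distances, such as HDBSCAN.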
Getting started with machine learning: classification - Jim McCann, Serafim
This talk is for beginners in data analytics. It explains what is meant by machine learning and supervised learning, and describes some techniques for classification problems. A case study shows how this is done in practice. No prior knowledge of mathematics/statistics is assumed.
"What the [redacted] did you do?": On reproducible research - Rob King, Signifyd
The what, why and how of reproducible research in data science.
CattleEye Autonomous Livestock Management - Terry Canning, Cattle Eye
Following a recent seed investment, CattleEye plan to deliver the world's first autonomous livestock management system. CattleEye is a cloud-based deep learning platform designed to interpret visual imagery of livestock from industry-standard web cameras and potentially other vision detection equipment. It will autonomously identify the animals and extract insights, including gait measurements, mobility scores and physiological signs related to breeding. These insights will be delivered back to the farmer in an actionable format.
T&Cs
Photos of this event will be shared on social media and may be used in promotional material by River House/Clockwise/resident companies. If you prefer not to appear in these photographs, please make yourself known to Brian Douglas/Clockwise staff on the evening.
