DESCRIPTION:
In this talk, we will build an end-to-end AI/ML pipeline for natural language processing with SageMaker. The talk is broken down into 7 parts as follows:

1) Ingest the dataset
2) Analyze and visualize the dataset
3) Transform the raw dataset into machine learning features
4) Train a model with our features
5) Optimize model training using hyper-parameter tuning
6) Deploy and test our model both online (real-time) and offline (batch)
7) Automate the entire process with a SageMaker pipeline

**Agenda**
[15 mins] Ingest Data
[15 mins] Explore Data
[15 mins] Prepare Data
[15 mins] Train Model
[15 mins] Optimize Model
[15 mins] Deploy Model
[15 mins] Create Pipeline
[15 mins] Q&A / Wrap Up

**Key Learning Points**
Attendees will learn how to:
* Ingest data into S3 using Amazon Athena and the Parquet data format
* Visualize data with pandas and matplotlib on SageMaker notebooks and AWS Data Wrangler
* Analyze data with the Deequ library, Apache Spark, and SageMaker Processing Jobs
* Perform feature engineering on a raw dataset using Scikit-Learn and SageMaker Processing Jobs
* Train a custom BERT model using TensorFlow, Keras, and SageMaker Training Jobs
* Find the best hyper-parameters using SageMaker Hyper-Parameter Optimization Service (HPO)
* Deploy a model to a REST Inference Endpoint using SageMaker Endpoints
* Perform batch inference on a model using SageMaker Batch Transform
* Automate the entire process using Step Functions, EventBridge, and S3 Triggers

RESOURCES:
* https://datascienceonaws.com
* https://github.com/data-science-on-aws/workshop

BIOS:

1. Antje Barth (@anbarth) https://twitter.com/anbarth
Antje is a Developer Advocate for AI and Machine Learning at Amazon Web Services (AWS) based in Düsseldorf, Germany. She is co-author of the O'Reilly book "Data Science on AWS." Antje is also co-founder of the Düsseldorf chapter of Women in Big Data. She frequently speaks at AI and Machine Learning conferences and meetups around the world, including the O'Reilly AI and Strata conferences. Besides ML/AI, Antje is passionate about helping developers leverage Big Data, containers, and Kubernetes platforms in the context of AI and Machine Learning. Previously, Antje worked in technical evangelism and solutions engineering at MapR and Cisco, where she helped many companies build and deploy cloud-based AI solutions using AWS and Kubernetes.

2. Chris Fregly (@cfregly) https://twitter.com/cfregly
Chris is a Developer Advocate for AI and Machine Learning at Amazon Web Services (AWS) based in San Francisco, California. He is co-author of the O'Reilly book "Data Science on AWS." Chris is also the founder of many global meetups focused on Apache Spark, TensorFlow, and Kubeflow. He regularly speaks at AI and Machine Learning conferences across the world, including O'Reilly AI & Strata, Open Data Science Conference (ODSC), and GPU Technology Conference (GTC). Previously, Chris was founder of PipelineAI, where he worked with many AI-first startups and enterprises to continuously deploy ML/AI pipelines using Apache Spark ML, Kubernetes, TensorFlow, Kubeflow, Amazon EKS, and Amazon SageMaker.

**Note:** If you attend live, you will receive AWS swag and possibly credits. More details coming soon.