Note: ML Model Versioning, Deployment, and Monitoring are core themes of Scale By the Bay 2019 (https://scale.bythebay.io), 11/14-15, Oakland. Reserve your seat today using the code MEETSFFINTECH15 for 15% off all passes, including the complete Serverless workshop!

Joint meetup -- please RSVP at http://bay.area.ai!

(1) MODEL VERSIONING: WHY, WHEN, AND HOW

Models are the new code. While machine learning models are increasingly being used to make critical product and business decisions, the process of developing and deploying ML models remains ad hoc. In the “wild west” of data science and ML tools, versioning, management, and deployment of models are massive hurdles in making ML efforts successful. As creators of ModelDB, an open-source model management solution developed at MIT CSAIL, we have helped manage and deploy a host of models ranging from cutting-edge deep learning models to traditional ML models in finance. In each of these applications, we have found that the key to enabling production ML is an often-overlooked but critical step: model versioning. Without a means to uniquely identify, reproduce, or roll back a model, production ML pipelines remain brittle and unreliable. In this talk, we draw upon our experience with ModelDB and Verta to present best practices and tools for model versioning, and to show how a robust versioning solution (akin to Git for code) can streamline DS/ML, enable rapid deployment, and ensure high quality of deployed ML models.

Speakers: Manasi Vartak, CEO, Verta.ai; Conrado Miranda, CTO, Verta.ai

Manasi Vartak is the founder and CEO of Verta.ai (www.verta.ai), an MIT spinoff building software to enable high-velocity machine learning. Manasi previously worked on deep learning for content recommendation as part of the feed-ranking team at Twitter and on dynamic ad-targeting at Google.

Conrado Miranda is the CTO at Verta.ai. Conrado has a PhD in Machine Learning and a focus on building platforms for AI. He was the tech lead for the Deep Learning platform at Twitter’s Cortex, where he designed and led the implementation of TensorFlow for model development and PySpark for data analysis and engineering. He also led efforts on NVIDIA’s self-driving car initiative, including the Machine Learning platform, large-scale inference for the Drive stack, and build and CI for Deep Learning models.

(2) Model Monitoring in Production

Machine learning models in production continuously encounter data patterns they never saw during training and testing. The best offline experiment can lose in production. The most accurate model is not always tolerant of minor data drift or adversarial input. Neither prodops, data science, nor engineering teams are equipped to detect, monitor, and debug model degradation. Real mission-critical AI systems require an advanced monitoring and model observability ecosystem that enables continuous and reliable delivery of machine learning models into production.

Common production incidents include:
- Data anomalies
- Data drifts, new data, wrong features
- Vulnerability issues, adversarial attacks
- Concept drifts, new concepts, expected model degradation
- Domain drift
- Biased training set

In this demo-based talk we discuss algorithms for monitoring text and image use cases as well as classical tabular datasets. The demo will cover the full cycle of a machine learning model in production:
- Model training and deployment with Kubeflow pipelines
- Production traffic simulation
- Model monitoring metrics configuration
- Data drift detection
- Drift exploration and monitoring metadata mining
- New training dataset generation from production feature store
- Model retraining and redeployment

Stepan Pushkarev is the CTO of Hydrosphere.io, a model management platform, and co-founder of Provectus, an AI solutions provider and consultancy and the parent company of Hydrosphere.io.
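To make the "data drift detection" step of the monitoring talk concrete, here is a minimal sketch of one common approach for a single numeric feature: compare the training-time distribution with production values using the two-sample Kolmogorov-Smirnov statistic. This is an illustration only, not the algorithm used by Hydrosphere.io or shown in the demo; the function names and the 0.2 alert threshold are assumptions made for this example.

```python
import bisect
import random


def ks_statistic(reference, production):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 for identical samples, 1.0 for fully separated ones.
    """
    ref = sorted(reference)
    prod = sorted(production)
    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at sample
    # points, so checking every observed value finds the maximum gap.
    for x in ref + prod:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_prod = bisect.bisect_right(prod, x) / len(prod)
        max_gap = max(max_gap, abs(cdf_ref - cdf_prod))
    return max_gap


def drift_detected(reference, production, threshold=0.2):
    """Flag drift when the KS statistic exceeds a (hypothetical) threshold."""
    return ks_statistic(reference, production) > threshold


# Simulated feature values: training data, then production traffic whose
# mean has shifted by one standard deviation.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(1000)]
stable_traffic = [rng.gauss(0.0, 1.0) for _ in range(1000)]
shifted_traffic = [rng.gauss(1.0, 1.0) for _ in range(1000)]

print("stable traffic drifted: ", drift_detected(training, stable_traffic))
print("shifted traffic drifted:", drift_detected(training, shifted_traffic))
```

In practice the threshold would be tuned per feature (or replaced by a p-value from a statistics library), and text or image inputs would need a different comparison, for example on embeddings.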

  • Scale By the Bay 2018


    Dear Friends -- we are proud to announce the program of Scale By the Bay 2018, our sixth year of the flagship, and by now iconic, independent developer conference By the Bay. (Tl;dr: get your spot at http://scale.bythebay.io while supplies last, especially while Early Bird is in effect, until August 31.)

    The conference follows the established three-day, three-track structure, hosted for the third year in a row by Twitter HQ in its wonderful modern building, with all of its spacious tracks, community spaces, cozy booths, and the commons area where so many connections are made during the hallway track. This year, Martin Odersky, the creator of Scala, opens the main conference on November 15. Neha Narkhede, the co-creator of Kafka and cofounder of Confluent, is keynoting day 2.

    The three tracks are:
    -- Functional and Thoughtful Programming
    -- Reactive Microservices and Streaming Architectures
    -- End-to-end Data Pipelines all the way up to Machine Learning and AI

    The 100 sessions include technology leaders such as Twitter, IBM, Microsoft, Salesforce, Fauna, DataStax, Databricks, Confluent, Credit Karma, Sumo Logic, GoPro, Buoyant, Workday, Zignal Labs, and many more. We cover your tools with JetBrains, your shopping with Best Buy and Target, your vacations with HomeAway, your listening with Spotify, your viewing with Netflix, your reading with Medium, and your banking with JP Morgan Chase. The list goes on and on and on: we have most of the advanced stacks and approaches employed by the best that Silicon Valley offers to the world at scale, shared as best practices, with code, yours to learn, take home, and build upon.

    Our speakers span the whole spectrum from first-time presenters with leading companies to veterans of SBTB going all the way back to 2013, evolving their craft before our eyes. You can follow their progress by watching their previous talks on http://functional.tv and the photos of the past conferences at https://meetup.bythebay.photo/Conferences/Scale-By-the-Bay

    The three panels, closing each day, are:
    -- Thoughtful Software Engineering
    -- Data Engineering for AI
    -- Cloud, Edge, and Silver Lining

    Each day begins with a hot breakfast, which starts an uninterruptible supply of Philz coffee through the whole day, and lunch is provided. On the first two days, the closing panels are followed by our signature happy hours, with great drinks, food, and conversation. The hallway tracks are legendary.

    SBTB is famous for its bespoke, all-day, build-yourself-a-company training. This year, we double it. Cliff Click, the legend of software engineering, is teaching a full-day Advanced Software Engineering workshop on 11/13, followed by Ryan Knight, now of Fauna, leading cloud-native data pipelines on 11/14. The workshops are limited to 80 participants each. As last year, we plan an unconference track for those who want to share their ideas in an intimate setting for joint brainstorming.

    The only thing moderate about SBTB is its size: we cap at 600 attendees to preserve the immediate and direct nature of the communication that happens, sparks that fly, and serendipity that always occurs. We are always sold out by the time the conference begins in November, so reserve your seat early at http://scale.bythebay.io! And enjoy the Early Bird that is in effect until August 31.

  • Rethink Trust -- Amsterdam, June 29

    Beurs van Berlage

    We’re super excited to share some awesome news: we’re expanding to Europe and introducing our newest conference, Blockchain: Rethink Trust (http://RethinkTrust.org), taking place at Beurs van Berlage, Amsterdam, on June 29th! (Scroll down for 15% off.)

    Blockchain: Rethink Trust is a gathering of top experts in engineering and ecosystem-minded leaders, focused on reengineering enterprise trust networks through technology. It will expand your understanding of blockchain, trust mechanisms, and how the corporate world uses them. Rethink Trust is a By the Bay conference, brought to you by the creators of Scale By the Bay, AI By the Bay, and Data By the Bay, engineering events held in San Francisco for over five years in partnership with IBM, ING, Apple, Twitter, Salesforce, and dozens of innovative startups and enterprises in the Bay Area and around the world. Dr. Alexy Khrabrov, Founder, By the Bay, is the Program Chair, and he wrote about the philosophy of Rethink Trust on his Medium blog: http://chief.sc/rethinktrust2018-intro

    We invite C-level executives, senior engineers, and technical leaders who want to master the best practices in blockchain to the famous Beurs van Berlage, “the third stock exchange” of Amsterdam. True to the spirit of all of our previous events, Blockchain: Rethink Trust is laser-focused on learning, open-source excellence, and industry-oriented approaches that work.

    SPEAKERS

    We have the world’s top technology leaders speaking at the event. The keynote speakers include:
    -- Christopher Ferris, IBM CTO Open Technology, Chair of the Hyperledger Technical Steering
    -- Mariana Gómez de la Villa, Global Program Manager, ING DLT

    You will also hear from:
    -- Clara Durodie, Founder and CEO, Cognitive Finance Group
    -- Roman Shaposhnik, VP Product & Strategy, Co-founder, ZEDEDA
    -- Yonatan Sompolinsky, Co-founder & Scientist, DAGlabs
    -- Michael Egorov, CTO, NuCypher
    -- Roberto Mancone, Chief Operating Officer at we.trade Innovation DAC, the company developing, deploying, and distributing we.trade, the blockchain-based trade finance platform, reporting to the Board of Directors of the 9 European shareholder banks (Deutsche Bank, HSBC, KBC, Natixis, Nordea, Rabobank, Santander, SocGen, Unicredit)
    -- Christopher Georgen, Founder and CEO, Topl, the company developing blockchain solutions for the developing world
    … and more!

    TOPICS

    We will cover a variety of topics at the intersection of engineering and business management, including:
    -- Crypto protocols of today and tomorrow and the software engineering process required to deploy them for enterprise customers at scale;
    -- Technology behind the key blockchain deployments in FinTech and IoT;
    -- Rigorous software engineering practices required for safe, correct, and performant implementation of blockchain applications and platforms;
    -- Key aspects of hardware-software codevelopment crucial for IoT+blockchain;
    -- ... and more

    Explore the program of the conference at http://rethinktrust.org

    WORKSHOPS

    We’ll have three workshops in the workshop track, available to all attendees. You can freely switch between the main track and the workshop track.
    -- Hyperledger workshop taught by Arnaud Le Hors, core Hyperledger team
    -- Implementation workshop by IntellectSoft
    -- Scala Blockchain -- secure and type-safe -- by Topl, the developing world blockchain company

    YOU WILL LEARN
    -- Discover ways to implement blockchain technology for reengineering trust in key business verticals
    -- Understand best practices of enterprise adoption of the ledger consensus approaches
    -- Conduct strategic partnerships for a consortium of trusted and trustless systems
    -- Invite key developers to collaborate on your blockchain ecosystem

    TICKETS

    For a limited time, we’re offering an additional 15% off to the By the Bay community. To claim the offer, use the code BYTHEBAYOFFER15 at http://rethinktrust.org

  • Scale By the Bay 2018 CFP is Open until May 31


    It's the sixth year that we are organizing our flagship Scale By the Bay conference, and it's a truly spectacular tech event many of you know very well. For those who are new to SBTB, I would love to invite you to attend. And if you'd like to present, we'd like to see your talk! The CFP for SBTB 2018 is now open through May 31: http://scale.bythebay.io/cfp.html Give it your best shot, or two, as the rate of high-quality submissions is already very high.

    At Scale By The Bay, returning to Twitter HQ in San Francisco on November 15-17, 2018, you can connect with fellow senior software engineers, CTOs, VPs/Directors of Engineering, developers and technical founders who never stop learning. Embrace the whole end-to-end software stacks and the infrastructure running them, put together your own SMACK Stack, operationalize reactive microservices and data pipelines, build streaming data infrastructure for actionable, real-time insights, and deep-dive into practical aspects of full-stack architectures and developer productivity.

    We'll have a stellar program:
    * the full-stack Scala and Functional Programming conference with world authorities on practical FP, beginning with Martin Odersky, the creator of Scala, who comes back to keynote SBTB!
    * the fast data pipelines done right, with Neha Narkhede, the co-creator of Apache Kafka and co-founder of Confluent, keynoting
    * FP+ML: Functional Programming for Machine Learning, a topic even more current today when Swift for TensorFlow has been unveiled

    Throughout the three-track, three-day event, we'll weave the themes of open-source development, type safety, and full-stack architectures together with the emerging areas of ML and AI, so that you can learn all about them if you want to. At the same time, we'll make sure we're still, and always, the best in the software engineering realm with solid understanding of distributed systems, from operations up to services to streaming algorithms. We firmly believe that thoughtful software engineering with the right reusable abstractions and best practices around development is key to everything. We want to link this approach to more things and see more use cases. We especially welcome FP+ML talks this year.

    Please note that we go forward at Scale. We welcome production use cases of all thoughtfully designed software stacks, including Scala, Haskell, Swift, Rust, Clojure, F#, and so on. We welcome Java, C++, Go, and other systems, especially in the microservice, polyglot environment. Our SMACK 2.0 plan, unveiled at the Index conference, calls for Streaming, in-Memory architectures, API-centric, Containerized and running on Kubernetes. We welcome submissions on all levels of these new systems, starting with orchestration.

    No matter where you are along the full-stack spectrum, you need thoughtful software engineering, reactive and streaming architectures, manageable microservices, and scalable data pipelines that can work together with modern ML frameworks for immediate customer insights. Join the SBTB family at Twitter HQ again this year, see how companies like Twitter are built in software, build your own, and share your findings with others! See you in November at Twitter HQ!

    Dr. Alexy Khrabrov, Program Chair, By the Bay

    PS. If you are in Europe and can't wait, By the Bay comes to Amsterdam as RethinkTrust.org, our first signature engineering take on enterprise trust systems with blockchain and hyperledger in energy, fintech, IoT, and other real-world use cases. Our tech includes Swift and Scala, scalability and security of trust systems, their performance and enterprise stacks integration: topics rarely, if ever, covered at general blockchain events. Use the code TRUSTBYTHEBAY for 15% off and join us in Amsterdam!

  • [joint with sfscala.org] Bitcoin in Scala


    This is a joint meetup with SF Scala (https://www.meetup.com/preview/SF-Scala/events/244151821). Scale By the Bay (http://scale.bythebay.io) will add a deep dive into Bitcoin Functional Programming. Following ScalingBitcoin.org (http://scalingbitcoin.org/), held at Stanford, 11/4-5, which gathers the world's best developers on the first, longest, most highly capitalized blockchain in history, we're fortunate to present the top Scala teams implementing Bitcoin in Scala.

    Bitcoin-s (https://github.com/bitcoin-s) is an implementation of the bitcoin protocol. In this talk I will walk through the various data structures of the bitcoin protocol. I will show how I've leveraged scalacheck to thoroughly test this code base. Next I will demonstrate how algebraic data types can be used to represent bitcoin's standardized contract types. Finally I will demonstrate how to use bitcoind via the bitcoin-s-rpc-client package to show direct interaction with the bitcoin peer-to-peer network.

    Chris is the founder of SuredBits -- a cryptocurrency company. Chris has been doing cryptocurrency development since 2014. He is the primary author of bitcoin-s -- an implementation of the bitcoin protocol in Scala. He is also a contributor to bitcoin core -- the reference implementation of bitcoin. Chris is an active contributor to cryptocurrency protocol research along with implementation of that research.

  • Scorex -- the Smallest Codebase for Blockchain, in Scala; and Topl


    This is a joint engineering meetup with SF Scala (http://sfscala.org). If you want to learn more about Scala, Scale By the Bay, the SF Scala flagship engineering conference, begins on November 15 (http://scale.bythebay.io). Reserve your seat now!

    We have two talks: on Scorex and Topl, a startup using it.

    Dmitry Meshkov, one of the founders of Ergo Platform (http://www.ergoplatform.org) and contributor to Scorex (https://github.com/ScorexFoundation/Scorex), will give a code-centric deep dive into Bitcoin mechanics from an engineering viewpoint.

    Zihe Huang is the technical lead at Topl (http://topl.co), an open-source project whose goal is to make investing in foreign small and medium enterprises just as easy as investing in domestic companies. Topl is built on Scorex 2 and believes the power of blockchain can foster economic development and growth in developing countries. In this talk, I will be talking about why we chose Scala and Scorex and our development progress so far.

    We expect to go over code blocks and be able to follow them in the REPL. Please take this into consideration when RSVPing. Please answer the RSVP question; we will deprioritize blank answers in case of (most certainly) limited capacity.

  • Yaron Minsky: Better Models through Metaprogramming


    This is a joint engineering meetup with SF Scala (https://www.meetup.com/SF-Scala/events/243125720/). Note: Functional Programming for Machine Learning is one of the key directions explored by this year's Scale By the Bay (http://scale.bythebay.io/) conference held at Twitter, November 16-18.

    We need a venue for this meetup! Host us if you're in SF, can provide food, drinks, no NDAs, fit at least 100 people, and are FP-friendly.

    As a trading firm, Jane Street (https://www.janestreet.com/) needs the ability to work collaboratively on quantitative models, with many users defining, modifying, debugging, and viewing the results in real time. These models need to be scalable, both in terms of performance and in terms of the understandability of the resulting system. We will discuss some traditional approaches to this problem in the financial industry, as well as the system we built for this case, called Webs. Webs uses a graph-based computational model specified using OCaml as a metalanguage to allow for concise description of programs, while still making the models easy to inspect, customize and debug. This is coupled with a parallel, incremental evaluation engine that delivers real-time results efficiently.

    Yaron Minsky joined Jane Street back in 2002, and claims the dubious honor of having convinced the firm to start using OCaml. He also spends way too much time teaching his kids how to program.
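For readers unfamiliar with incremental evaluation over a computation graph, the general idea can be sketched in a few lines. This is not Webs and says nothing about how Jane Street implemented it: the talk's system is specified in OCaml and evaluates in parallel, while the sketch below is a sequential Python illustration, and the Node class, its methods, and the pricing example are all hypothetical.

```python
class Node:
    """A value in a dependency graph: an input or a function of other nodes."""

    def __init__(self, compute=None, deps=(), value=None):
        self.compute = compute          # None for input nodes
        self.deps = list(deps)
        self.dependents = []
        self.value = value
        self.dirty = compute is not None  # computed nodes start uncomputed
        self.evaluations = 0              # how many times compute() ran
        for dep in self.deps:
            dep.dependents.append(self)

    def set(self, value):
        """Change an input and mark everything downstream as stale."""
        if self.compute is not None:
            raise ValueError("only input nodes can be set")
        if value != self.value:
            self.value = value
            self._invalidate_dependents()

    def _invalidate_dependents(self):
        for node in self.dependents:
            if not node.dirty:  # a dirty node's dependents are already dirty
                node.dirty = True
                node._invalidate_dependents()

    def get(self):
        """Return the value, recomputing only if an upstream input changed."""
        if self.dirty:
            self.value = self.compute(*(dep.get() for dep in self.deps))
            self.evaluations += 1
            self.dirty = False
        return self.value


# A tiny model: total = price * quantity + fee
price = Node(value=10.0)
quantity = Node(value=3)
fee = Node(value=1.5)
notional = Node(lambda p, q: p * q, [price, quantity])
total = Node(lambda n, f: n + f, [notional, fee])

first = total.get()   # computes notional, then total
fee.set(2.0)          # only total depends on fee
second = total.get()  # recomputes total; notional is reused from cache
```

Because every node is an ordinary object, intermediate values such as `notional.value` stay available for inspection, which loosely mirrors the abstract's point about keeping models easy to inspect and debug.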