Natural Language Translation and Apache Spark Test Driven Development

Hosted by Arseny Chernov

Details

Spring is when you feel like a spark flying! Come over to Shopee's offices, where we'll go through two really cool topics presented by Apache Spark and Deep Learning practitioners:

  • "Natural Language Translation at Shopee’s Data Science" - this project, built on TensorFlow, PyTorch and OpenNMT, supports Shopee's cross-border business with automated translation, saving the cost of using third-party vendors. The talk covers some state-of-the-art Deep Learning models from current Machine Translation research. Learn how the Shopee data science team implemented it!

  • "Data Integration using Guzzle and Test Automation" - the resilient, enterprise-grade, YAML-driven ETL framework that Just Analytics is known for wouldn't exist without Cucumber, daily builds and regression testing. Batch, Streaming, Near Real-Time. Jenkins, Docker, Apache Atlas, Kafka - everything is rigorously scrutinised to ensure that changes are introduced without any disruption to the customer's existing performance and resilience SLAs.

Speaker 1: Shao Hongxin (Shopee)
Data Scientist at Shopee and a part-time PhD student at Nanyang Technological University. His research interests are sequential data and natural language processing.
#lifeatshopee

Speaker 2: Umesh Kakkad (Just Analytics)
Umesh is a Co-founder of Just Analytics (JA), a specialized IT consulting firm focusing on the data and analytics space. He has over 16 years of experience in big data, data warehousing and analytics, spanning a wide range of industry domains. As Delivery and R&D Head at JA, he manages a team of over 40 consultants in Singapore and the region and oversees the delivery of data lake, data warehouse, BI and analytics projects. He is also leading the design and build of JA’s flagship product Guzzle, a data integration workbench that simplifies building, managing, orchestrating and monitoring data engineering jobs and uses Apache Spark as its runtime.

Databricks Meetup Group
Shopee Singapore
5 Science Park Dr · Singapore