Scraping and Sourcing Data with Python

Name: Scraping and Sourcing Data with Python
Start: 2025-11-18T13:00:00-08:00
End: 2025-11-18T16:00:00-08:00

Hosted by Raju S.

## Details

Enroll in this training and receive a one-month complimentary e-learning subscription with access to 40+ courses.
This course is provided by Big Data Trunk for the Stanford Technology Training Program, but a limited number of seats are available to the public.

Students in this class may have the opportunity to be considered for an internship with Big Data Trunk.

In this class, we will:

Explore many sources and repositories for valuable data acquisition such as open government and university datasets
Explore popular social APIs (e.g., Facebook, Spotify, Twitter) and domain-specific APIs (e.g., healthcare, news, science and math) that store a wealth of data
Discuss methods to query web servers, and request and parse data to extract the information you need
Explore scraping various types of data from websites, reading and extracting text from documents (e.g., PDF, Word), and methods to clean and store sourced and scraped data
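
To give a flavor of the "request and parse" step mentioned above, here is a minimal sketch that pulls links out of a page using only Python's standard-library `html.parser`. The listing does not say which libraries the class uses, so this choice is an assumption; `LinkExtractor`, `extract_links`, and the sample HTML are hypothetical names and data made up for illustration. In practice the HTML string would come from an HTTP response rather than a literal.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <h1>Open datasets</h1>
  <ul>
    <li><a href="/data/air-quality.csv">Air quality</a></li>
    <li><a href="/data/transit.csv"> Transit ridership </a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Parse an HTML string and return its links as (href, text) tuples."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


links = extract_links(SAMPLE_HTML)
for href, text in links:
    print(f"{text}: {href}")
```

Anchors without an `href` attribute are skipped here, and link text is stripped of surrounding whitespace, which is a small example of the cleaning step that usually follows scraping.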

Learning Objectives
During this course, you will have the opportunity to:

Topic Outline
Overview of Data Sourcing

Introduction to the Python Programming Language

Using Public APIs (Application Programming Interfaces)

Explore Popular and Domain-specific APIs
Common Conventions
Parsing JSON
Milestone 3 Learning Exercise: Access a public API (e.g., Facebook, Twitter, Google)
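
As a rough illustration of the "Parsing JSON" topic, the sketch below decodes a JSON response body with the standard-library `json` module and flattens it into simple records. The response shape, the field names (`results`, `source`, `next_page`), and the `extract_records` helper are all hypothetical; real APIs define their own schemas, and the class exercise may use a different approach.

```python
import json

# Hypothetical response body; a real API would return its own field names.
SAMPLE_RESPONSE = """
{
  "results": [
    {"id": 1, "title": "Ocean temperatures", "source": {"name": "Example Agency"}},
    {"id": 2, "title": "Transit ridership", "source": null}
  ],
  "next_page": null
}
"""


def extract_records(body):
    """Decode a JSON string and flatten each result into a plain dict."""
    payload = json.loads(body)
    records = []
    for item in payload.get("results", []):
        # "source" may be missing or null, so fall back to an empty dict.
        source = item.get("source") or {}
        records.append(
            {
                "id": item["id"],
                "title": item["title"],
                "source_name": source.get("name"),
            }
        )
    return records


records = extract_records(SAMPLE_RESPONSE)
for record in records:
    print(record)
```

Guarding against missing or null nested fields, as done for `source` here, tends to matter with public APIs, where optional fields are common.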

Extracting Text from Documents

Overview of Data Scraping

Cleaning Scraped Data

Conclusion: Next steps

Prerequisites
Learners should have an understanding of basic Python programming.

Big Data

Data

Data Analytics

Data Visualization

New Technology