Time Series Prediction: Weather data Madrid, using ML and Deep Learning


Details
Join us for our 5th community coding adventure in Deep Learning! Just bring your curiosity and get ready to meet our growing community. We are using ML and Deep Learning to perform time series prediction on weather data from Madrid, Spain!
Join Zoom Meeting:
https://us02web.zoom.us/j/84402592502?pwd=d1lVSkxQZE1sSGljR3dXaEZwYmNEdz09
Phone: +1 929 205 6099 US
Meeting ID: 844 0259 2502
Agenda:
- Introductions and get to know our community
- Deep Learning YouTube recordings, feel free to share and subscribe:
https://bit.ly/deep-learning-tf
https://bit.ly/deep-learning-tf-coding
- Deep Learning Adventures - Coding Presentation:
https://docs.google.com/presentation/d/1XXSLSTDOUnlYK1ksA4p3Kym-sDd7Nzj2r7__2fYqkxo/edit?usp=sharing
- Code repository for our Deep Learning Adventures:
https://github.com/georgezoto/Deep-Learning-Adventures
- Join us on Slack:
https://join.slack.com/t/deeplearninga-nmk8930/shared_invite/zt-gpqxpg6u-U7TRpIRE3NgsAum6BC2IZQ
- Spread the word about our meetup
- Coding session on real world data - Time Series
- Step 1
Explore and learn more about this multivariate time series data from this Kaggle dataset:
https://www.kaggle.com/juliansimon/weather_madrid_lemd_1997_2015.csv
- Step 2
Explore the data: look for missing dates and missing values, and check how each field is distributed and how it correlates with the other fields.
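The checks in Step 2 could look roughly like the sketch below. It runs on a tiny synthetic table instead of the real CSV so it works without a download; the 'Humidity' column, the dropped day and the blanked value are made up for illustration. Swap in the DataFrame you load from Kaggle to try it on the real data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Kaggle CSV (assumption): daily rows with a 'CET'
# date column and two numeric fields. 'Humidity' is a hypothetical column name.
dates = pd.date_range('2004-02-01', periods=10, freq='D')
dataset = pd.DataFrame({
    'CET': dates,
    'Mean TemperatureC': np.arange(10, dtype=float),
    'Humidity': np.arange(10, dtype=float)[::-1] * 5.0,
})
dataset = dataset.drop(index=4).reset_index(drop=True)  # simulate a missing day
dataset.loc[2, 'Humidity'] = np.nan                      # simulate a missing value

# Missing dates: compare the dates we have against a complete daily calendar.
full_range = pd.date_range(dataset['CET'].min(), dataset['CET'].max(), freq='D')
missing_dates = full_range.difference(pd.DatetimeIndex(dataset['CET']))

# Missing values per field.
missing_values = dataset.isna().sum()

# Distribution and pairwise correlation of the numeric fields.
numeric = dataset.select_dtypes(include='number')
summary = numeric.describe()
correlation = numeric.corr()

print(missing_dates)
print(missing_values)
print(summary)
print(correlation)
```

On the real data, the list of missing dates is what motivates picking a start date after the gaps.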
- Step 3
Let's get coding! We will use historical daily data to predict weather/temperature up to 3 years from a given date!
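import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Download the CSV from Kaggle and read it (dataset_path is its local path)
dataset = pd.read_csv(dataset_path, parse_dates=['CET']).rename(columns={'CET': 'Date'})  # assumed rename: Step 5 looks up 'Date'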
- Step 4
# Keep only records from '2004-02-01' onward because earlier records have missing days.
# See also the "extra things" to consider at the end.
# As a benchmark, we look only at 'Mean TemperatureC' (average temperature in Celsius) for univariate time series prediction.
series = np.array(dataset['Mean TemperatureC'])
time = np.array(dataset.index)
plt.figure(figsize=(16, 9))
plot_series(time, series)  # plot_series is a plotting helper that is not defined in this post
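Step 4 describes the 2004-02-01 cutoff in a comment but does not show the filter itself. One way it might be done is sketched below on a small synthetic table with a 'Date' column (an assumption, matching the name used in Step 5). Resetting the index after filtering matters because the later steps treat index values as array positions.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the loaded data (assumption): daily rows with a
# 'Date' column, starting a few days before the 2004-02-01 cutoff.
dataset = pd.DataFrame({
    'Date': pd.date_range('2004-01-25', periods=14, freq='D'),
    'Mean TemperatureC': np.linspace(2.0, 15.0, 14),
})

# Keep records from 2004-02-01 onward, then renumber rows from 0 so that
# index labels match positions in the NumPy arrays built below.
dataset = dataset[dataset['Date'] >= '2004-02-01'].reset_index(drop=True)

series = np.array(dataset['Mean TemperatureC'])
time = np.array(dataset.index)

print(dataset['Date'].iloc[0].date(), len(series), time[:3])
```

Without reset_index(drop=True), the index would keep its pre-filter labels and the index[0] + 1 lookups in Step 5 would point at the wrong positions.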
- Step 5
# As a group, we will use the following Train, Validation (optional) and Test split.
# This means that we are keeping data:
# from 2004-02-01 to 2010-12-31 for training (about 7 years of daily data)
# from 2011-01-01 to 2012-12-31 for validation (2 years of daily data; optional, can be combined with the training set)
# from 2013-01-01 to 2015-12-31 for test (3 years of daily data)
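# index[0] + 1 is used as an array position, so dataset needs a default 0..n-1 index
split_time = dataset[dataset['Date'] == '2012-12-31'].index[0] + 1
time_train = time[:split_time]
x_train = series[:split_time]
time_test = time[split_time:]
x_test = series[split_time:]
validation_split_time = dataset[dataset['Date'] == '2010-12-31'].index[0] + 1
time_valid = time_train[validation_split_time:]
x_valid = x_train[validation_split_time:]
# To hold the validation years out, end training at 2010-12-31 (skip these two lines to train on both)
time_train = time_train[:validation_split_time]
x_train = x_train[:validation_split_time]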
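A quick way to sanity-check the split is to run the same logic on a gap-free synthetic daily series and look at the sizes of the three pieces. This is only a sketch: the sine-wave temperatures and the position_after helper are made up for illustration, and the real data may have a different row count if any days are missing.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in (assumption): one row per day, no gaps, default 0..n-1 index.
dataset = pd.DataFrame({'Date': pd.date_range('2004-02-01', '2015-12-31', freq='D')})
dataset['Mean TemperatureC'] = 15.0 + 10.0 * np.sin(2 * np.pi * dataset.index / 365.25)
series = np.array(dataset['Mean TemperatureC'])


def position_after(frame, date):
    """Hypothetical helper: position of the first row after `date`."""
    return frame[frame['Date'] == date].index[0] + 1


validation_split_time = position_after(dataset, '2010-12-31')
split_time = position_after(dataset, '2012-12-31')

# Three non-overlapping pieces: train, validation, test.
x_train = series[:validation_split_time]
x_valid = series[validation_split_time:split_time]
x_test = series[split_time:]

print(len(x_train), len(x_valid), len(x_test))
print(dataset['Date'].iloc[split_time].date())  # first day of the test period
```

If the three lengths do not add up to len(series), or the first test day is not 2013-01-01, the index and the array positions have drifted apart.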
Due to space limitation, the next steps are posted as 3 extra comments below!
Source:
https://www.kaggle.com/juliansimon/weather_madrid_lemd_1997_2015.csv
