Details

Join us for our 5th community coding adventure in Deep Learning! Just bring your curiosity and get ready to meet our growing community 😀 We are using ML and Deep Learning to perform time series prediction of weather data in Madrid, Spain!

Join Zoom Meeting:
https://us02web.zoom.us/j/84402592502?pwd=d1lVSkxQZE1sSGljR3dXaEZwYmNEdz09

Phone: +1 929 205 6099 US
Meeting ID: 844 0259 2502

Agenda:

#Download and read data from Kaggle
#dataset_path is the local path to the downloaded CSV file
import pandas as pd
dataset = pd.read_csv(dataset_path, parse_dates=['CET'])

  • Step 4 😀
    #Keep only records from '2004-02-01' onward, because earlier records have missing days.
    #See also the "extra things" to consider at the end.
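
Step 4 could be sketched as follows. This is a minimal illustration on a small made-up frame, not the group's actual notebook: the toy values and the `start_date` name are hypothetical, and renaming 'CET' to 'Date' and resetting the index are assumptions, made only because the later split code looks up a 'Date' column and slices by position.

```python
import pandas as pd

# Toy stand-in for the Kaggle data: five daily rows around the cut-off date.
dataset = pd.DataFrame({
    "CET": pd.date_range("2004-01-30", periods=5, freq="D"),
    "Mean TemperatureC": [6.0, 7.0, 8.0, 9.0, 10.0],
})

start_date = "2004-02-01"  # hypothetical name for the cut-off

# Keep only records from the cut-off date forward.
dataset = dataset[dataset["CET"] >= start_date]

# Assumption: rename the date column and reset the index so that the
# row labels run 0, 1, 2, ... and can double as positions later on.
dataset = dataset.rename(columns={"CET": "Date"}).reset_index(drop=True)

print(dataset)
```

Without the index reset, the first kept row would keep its original label, and label-based lookups would no longer line up with positions in the NumPy arrays built from the frame.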

#As a benchmark, we will look only at 'Mean TemperatureC' (average temperature in Celsius) for univariate time series prediction

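import numpy as np
import matplotlib.pyplot as plt
series = np.array(dataset['Mean TemperatureC'])
time = np.array(dataset.index)
plt.figure(figsize=(16, 9))
#plot_series(time, series) is a plotting helper that must be defined before this call
plot_series(time, series)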
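
The call to plot_series above relies on a helper whose definition is not part of this post. One possible definition is sketched here; the `fmt` and `label` parameters, the axis labels, and the toy sine data are all assumptions for illustration, not the group's actual helper.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_series(time, series, fmt="-", label=None):
    """Draw one series against its time axis on the current figure."""
    plt.plot(time, series, fmt, label=label)
    plt.xlabel("Time")
    plt.ylabel("Value")
    if label is not None:
        plt.legend()
    plt.grid(True)


# Toy data standing in for the temperature series (one seasonal cycle).
time = np.arange(365)
series = 15 + 10 * np.sin(2 * np.pi * time / 365)

plt.figure(figsize=(16, 9))
plot_series(time, series, label="toy seasonal series")
```

In a script, follow this with plt.show() to display the figure; notebooks usually render it automatically.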

  • Step 5 😀
    #As a group, we will use the following Train, Validation (optional) and Test split.
    #This means that we are keeping data:
    #from 2004-02-01 to 2010-12-31 for training (about 7 years of daily data)
    #from 2011-01-01 to 2012-12-31 for validation (2 years of daily data; optional, can be combined with the training set)
    #from 2013-01-01 to 2015-12-31 for test (3 years of daily data)

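#Assumes the date column is named 'Date' (the CSV above is read with the column 'CET')
#and that the index runs 0..n-1 after filtering, so index labels match array positions.
split_time = dataset[dataset['Date'] == '2012-12-31'].index[0] + 1
time_train = time[:split_time]
x_train = series[:split_time]

time_test = time[split_time:]
x_test = series[split_time:]

#x_train still contains the validation years; train on x_train[:validation_split_time] when validating.
validation_split_time = dataset[dataset['Date'] == '2010-12-31'].index[0] + 1
time_valid = time_train[validation_split_time:]
x_valid = x_train[validation_split_time:]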
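
As a cross-check on the split above, the same boundaries can be expressed as date masks instead of index positions. The sketch below runs on a synthetic, gap-free daily calendar rather than the real Kaggle file; the 'Date' column name, the mask variables, and `x_train_only` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily frame covering the full period, with no missing days.
dates = pd.date_range("2004-02-01", "2015-12-31", freq="D")
dataset = pd.DataFrame({"Date": dates, "Mean TemperatureC": np.zeros(len(dates))})

# Boolean masks built directly from the agreed date boundaries.
train_mask = dataset["Date"] <= "2010-12-31"
valid_mask = (dataset["Date"] >= "2011-01-01") & (dataset["Date"] <= "2012-12-31")
test_mask = dataset["Date"] >= "2013-01-01"

series = dataset["Mean TemperatureC"].to_numpy()
x_train_only = series[train_mask.to_numpy()]  # training years without validation
x_valid = series[valid_mask.to_numpy()]
x_test = series[test_mask.to_numpy()]

print(len(x_train_only), len(x_valid), len(x_test))
```

On this gap-free calendar the three parts hold 2526, 731 and 1095 days. The real dataset may give different counts if any days are missing.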

Due to space limitations, the next steps are posted as 3 extra comments below!

Source:
https://www.kaggle.com/juliansimon/weather_madrid_lemd_1997_2015.csv
