Description
Instructions:
In this project, you are given a dataset collected by an actual IoT system (see description below) and asked to use the dataset to build a forecasting model. You have to answer a set of questions, as well as propose your own interesting questions.
1. Form teams of 4 students and select a name for your team. Be creative! Please email me your group members and team name.
2. Complete Question 1. Use a Jupyter notebook (ipynb file) to do the analysis and answer all parts of the question. Submit (i) a PDF (print preview) of your Jupyter notebook and (ii) the Jupyter notebook itself (ipynb file). Zip both files into a single archive named GroupName Question 1.zip and upload it to the appropriate LumiNUS folder.
3. Do the same for Question 2. Please name your file GroupName Question 2.zip and upload to LumiNUS.
4. Do the same for Question 3. For Question 3, please include a detailed description of your proposed work. Please name your file GroupName Question 3.zip and upload to LumiNUS.
5. The project carries a total of 40 marks: 30 marks for technical contributions (10 marks for each question), and 10 marks for presentation.
Data File:
The data file is available in the IVLE workbin under the directory "Project Details".
Data Description:
<Timestamp (localtime)> <MeterID (dataid)> <meter reading (meter_value)>
Questions:
1. Exploring the Data (10 marks)
EE4211 Data Science for IoT, Project Description
1.2 Generate hourly readings from the raw data. Select one month from the 6-month study interval and plot the hourly readings (time-series) for that month. Hint: You will have to decide what to do if there are no readings for a certain hour.
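One possible starting point for Question 1.2 is sketched below using pandas. It assumes the raw data has been loaded into a DataFrame with the columns localtime, dataid and meter_value from the Data Description, and that readings are cumulative. The sample values, the meter ID 35, the helper name hourly_readings and the forward-fill policy for empty hours are illustrative assumptions, not part of the project specification.

```python
import pandas as pd

# Tiny made-up sample in the documented column layout (not real project data).
raw = pd.DataFrame({
    "localtime": pd.to_datetime([
        "2015-10-01 00:10:00", "2015-10-01 00:50:00",
        "2015-10-01 01:20:00", "2015-10-01 03:05:00",
    ]),
    "dataid": [35, 35, 35, 35],
    "meter_value": [100, 102, 104, 110],
})


def hourly_readings(df):
    """Return one column per meter holding the last reading seen in each hour."""
    # One column per meter, indexed by timestamp.
    wide = df.pivot_table(index="localtime", columns="dataid",
                          values="meter_value", aggfunc="last")
    # Keep the last reading inside each 60-minute bin; empty bins become NaN.
    hourly = wide.resample("60min").last()
    # Assumed policy for hours without readings: carry the last known
    # cumulative value forward. Other choices (interpolation, dropping
    # the hour) are equally defensible and should be justified.
    return hourly.ffill()


hourly = hourly_readings(raw)
print(hourly)
```

In this sample the hour starting at 02:00 has no readings, so it inherits the value from 01:00. If matplotlib is installed, a single month could then be plotted with something like hourly.loc["2015-10"].plot(), where the month shown is only an example.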
1.3 Intuitively, we expect gas consumption from different homes to be correlated. For example, many homes would experience higher consumption levels in the evening when meals are cooked. For each home, find the top five homes with which it shows the highest correlation.
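A minimal sketch of the correlation ranking in Question 1.3 follows, assuming an hourly table with one column per meter. The three meters a, b and c and their values are made up, and top_correlated is a hypothetical helper name. The sketch correlates hourly differences rather than the cumulative readings themselves, since cumulative series all trend upward; this is a design choice to be justified, not a requirement of the question.

```python
import pandas as pd

# Made-up hourly consumption for three hypothetical meters.
usage = pd.DataFrame(
    {
        "a": [1, 3, 2, 5, 4, 6],
        "b": [2, 6, 4, 10, 8, 12],   # same shape as "a", scaled by 2
        "c": [6, 4, 5, 1, 3, 0],     # roughly the opposite pattern
    },
    index=pd.date_range("2015-10-01", periods=6, freq="60min"),
)
hourly = usage.cumsum()  # cumulative readings, as a meter would report them


def top_correlated(hourly, k=5):
    """Map each meter to the k other meters with the highest Pearson correlation."""
    # Differences of cumulative readings give consumption per hour.
    per_hour = hourly.diff().dropna(how="all")
    corr = per_hour.corr()
    top = {}
    for meter in corr.columns:
        # Exclude the meter itself and any undefined correlations.
        others = corr[meter].drop(meter).dropna()
        top[meter] = others.nlargest(k).index.tolist()
    return top


top = top_correlated(hourly, k=2)
print(top["a"])
```

With the real data, k=5 returns the top five homes for each meter. Meters with constant readings produce undefined correlations and are skipped by the dropna call.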
2. Forecasting (10 marks)
2.2 Build a linear regression model to forecast the hourly readings in the future (next hour). Generate two plots: (i) Time series plot of the actual and predicted hourly meter readings and (ii) Scatter plot of actual vs predicted meter readings (along with the line showing how good the fit is).
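One way to frame Question 2.2 is as a regression on lagged values: the previous few hourly readings are the features and the next hour is the target. The sketch below fits ordinary least squares with NumPy on a made-up sinusoidal series. The helper names make_lagged, fit_linear and predict_linear, the choice of two lags and the 80/20 chronological split are all assumptions for illustration.

```python
import numpy as np


def make_lagged(series, n_lags=2):
    """Build (X, y): each row of X holds n_lags consecutive values, y the value after them."""
    values = np.asarray(series, dtype=float)
    X = np.column_stack(
        [values[i:len(values) - n_lags + i] for i in range(n_lags)]
    )
    y = values[n_lags:]
    return X, y


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply the fitted intercept and coefficients to new lagged rows."""
    return coef[0] + X @ coef[1:]


# Made-up hourly series with a daily cycle (not real meter data).
hours = np.arange(24 * 7)
series = 5.0 + 3.0 * np.sin(2 * np.pi * hours / 24)

X, y = make_lagged(series, n_lags=2)
split = int(0.8 * len(y))  # chronological split: train on the past, test on the future
coef = fit_linear(X[:split], y[:split])
pred = predict_linear(coef, X[split:])
rmse = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))
print(f"test RMSE: {rmse:.6f}")
```

The arrays pred and y[split:] are the inputs for both requested plots: plotted against time for the time series plot, and against each other (with a y = x reference line) for the scatter plot. A noiseless sinusoid is fitted almost exactly by this model; real meter data will not be.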
2.3 Do the same as Question 2.2 above but use support vector regression (SVR).
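For Question 2.3, the same lagged-feature setup can be passed to scikit-learn's SVR. The sketch below is one possible configuration, not a recommended one: the RBF kernel, C=10.0, epsilon=0.1, three lags, the feature scaling step and the made-up sinusoidal series are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Made-up hourly series with a daily cycle (not real meter data).
hours = np.arange(24 * 7)
series = 5.0 + 3.0 * np.sin(2 * np.pi * hours / 24)

# Lagged features: each row holds n_lags consecutive values,
# and the target is the value in the following hour.
n_lags = 3
X = np.column_stack(
    [series[i:len(series) - n_lags + i] for i in range(n_lags)]
)
y = series[n_lags:]

# Chronological split so the model is only evaluated on later hours.
split = int(0.8 * len(y))

# SVR is sensitive to feature scale, so standardize inside a pipeline;
# the scaler is fitted on the training rows only.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X[:split], y[:split])

pred = model.predict(X[split:])
rmse = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))
print(f"test RMSE: {rmse:.3f}")
```

Hyperparameters such as C, epsilon and the kernel usually need tuning on the real data, for example with a time-ordered validation split, before comparing against the linear model.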
3. Student Proposal (10 marks)
3.1 At this point, you understand the data quite well. Propose and carry out additional analysis using the dataset given. Please be sure to justify why this additional analysis is useful and interesting.
Additional Information about Data Collection:
1. Gas flow meters have a sensor that is used to measure the volume of gas that passes through a pipe. Different meters use different sensors (e.g. ultrasonic sensors, synthetic diaphragm with rotating valve, etc.). The meters check on the sensors periodically to get a reading of the current consumption value. This is what is meant in the sentence above: "The gas meters measure the cumulative gas consumption at a frequency of 15 seconds."