COMS4111 – (Solution)

Homework 4: All Tracks
Overview
There are two parts to HW 4:
4a: Written questions.
4b: A common set of practical tasks for both the programming and non-programming tracks.
HW 4 does not have separate assignments for the programming and non-programming tracks.
Homework 4b has the following tasks:
1. Create a new schema <uni>_S22_classic_models_star. Replace <uni> with your UNI.
2. Create a star schema using the data from your Classic Models database.
   The fact in the fact table is of the form (productCode, quantityOrdered, priceEach, orderDate, customerNumber). The dimensions are:
   date_dimension: year, quarter, month, day of the month.
   location_dimension: region, country, city. The zip file contains a file country_region.csv that provides the mapping of countries to regions.
   product_dimension: product_scale, product_line, product_vendor.
3. Write queries that demonstrate:
   A slice of the data.
   A dice of the data.
   A drill-down.
   A roll-up.
Setup
In [2]: import pandas as pd
In [3]: %load_ext sql
In [4]: %sql mysql+pymysql://root:dbuserdbuser@localhost
In [5]: country_region = pd.read_csv('./country_region.csv')
In [6]: country_region
Out[6]:
     Country        Region
0    France         EMEA
1    USA            NaN
2    Australia      APAC
3    Norway         EMEA
4    Poland         EMEA
5    Germany        EMEA
6    Spain          EMEA
7    Sweden         EMEA
8    Denmark        EMEA
9    Singapore      APAC
10   Portugal       EMEA
11   Japan          APAC
12   Finland        EMEA
13   UK             EMEA
14   Ireland        EMEA
15   Canada         NaN
16   Hong Kong      APAC
17   Italy          EMEA
18   Switzerland    EMEA
19   Netherlands    EMEA
20   Belgium        EMEA
21   New Zealand    APAC
22   South Africa   EMEA
23   Austria        APAC
24   Philippines    APAC
25   Russia         EMEA
26   Israel         EMEA
Schema
Execute your SQL statements for creating the schema, table and constraints for the fact and dimension tables in the following cells.

* mysql+pymysql://root:***@localhost
(pymysql.err.ProgrammingError) (1007, "Can't create database 'classicmodels_star'; database exists")
[SQL: create schema classicmodels_star;]
(Background on this error at: https://sqlalche.me/e/14/f405)
In [8]: from sqlalchemy import create_engine

database_user_id = "root"
database_pwd = "dbuserdbuser"

database_url = "mysql+pymysql://" + database_user_id + ":" + database_pwd + "@localhost"
database_url

sqla_engine = create_engine(database_url)

country_region.to_sql(
    "country_region", con=sqla_engine, if_exists="replace", index=False, schema="classicmodels_star")
Out[8]: 27
In [9]: %%sql
use classicmodels_star
* mysql+pymysql://root:***@localhost
0 rows affected.
Out[9]: []
In [10]: %%sql

UPDATE country_region SET region="NA"
WHERE region IS NULL;
* mysql+pymysql://root:***@localhost
2 rows affected.
Out[10]: []
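The two affected rows correspond to the two NaN regions (USA and Canada) in the country_region output. A plausible cause, stated here as an assumption, is that the CSV stores the region as the literal text NA, which pandas.read_csv treats as a missing value by default (passing keep_default_na=False would keep the text). The sketch below reproduces the NULL-filling UPDATE on an in-memory SQLite table instead of MySQL; the four sample rows are copied from the output above.

```python
import sqlite3

# Four rows from country_region; None mirrors the NaN values shown in the notebook.
rows = [("France", "EMEA"), ("USA", None), ("Australia", "APAC"), ("Canada", None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country_region (country TEXT, region TEXT)")
conn.executemany("INSERT INTO country_region VALUES (?, ?)", rows)

# Same idea as the notebook cell: fill missing regions with the text 'NA'.
cur = conn.execute("UPDATE country_region SET region = 'NA' WHERE region IS NULL")
updated = cur.rowcount

remaining_nulls = conn.execute(
    "SELECT COUNT(*) FROM country_region WHERE region IS NULL"
).fetchone()[0]
print(updated, remaining_nulls)
```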
In [11]: %%sql

DROP TABLE IF EXISTS sales_facts;
CREATE TABLE sales_facts AS
SELECT
    CONCAT(orderNumber,'-',orderLineNumber) AS fact_id,
    orderNumber, productCode, quantityOrdered, priceEach, orderDate, customerNumber
FROM
    classicmodels.orderdetails NATURAL LEFT OUTER JOIN classicmodels.orders;

ALTER TABLE sales_facts
    DROP COLUMN orderNumber,
    ADD PRIMARY KEY (fact_id);
* mysql+pymysql://root:***@localhost
0 rows affected.
2996 rows affected.
0 rows affected.
Out[11]: []
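As a small illustration of how the fact table gets one row per order line with a composite key, the sketch below builds the same shape in an in-memory SQLite database. It is not the notebook's MySQL code: SQLite uses || instead of CONCAT, the join key is named explicitly with USING, and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (orderNumber INTEGER PRIMARY KEY, orderDate TEXT, customerNumber INTEGER);
CREATE TABLE orderdetails (
    orderNumber INTEGER, orderLineNumber INTEGER, productCode TEXT,
    quantityOrdered INTEGER, priceEach REAL
);
-- Illustrative sample data (not taken from the notebook output).
INSERT INTO orders VALUES (10100, '2003-01-06', 363);
INSERT INTO orderdetails VALUES (10100, 1, 'P1', 49, 35.29);
INSERT INTO orderdetails VALUES (10100, 2, 'P2', 50, 55.09);

-- One fact row per order line; the key joins order number and line number.
CREATE TABLE sales_facts AS
SELECT d.orderNumber || '-' || d.orderLineNumber AS fact_id,
       d.productCode, d.quantityOrdered, d.priceEach,
       o.orderDate, o.customerNumber
FROM orderdetails AS d
LEFT JOIN orders AS o USING (orderNumber);
""")

fact_ids = [r[0] for r in conn.execute("SELECT fact_id FROM sales_facts ORDER BY fact_id")]
print(fact_ids)
```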
In [12]: %%sql

DROP TABLE IF EXISTS location_dimension;

CREATE TABLE location_dimension AS
WITH loc_interim AS
    (SELECT
        customerNumber, country, city
    FROM
        (SELECT
            customerNumber,
            RTRIM(country) as country,
            city
        FROM
            classicmodels.customers) AS A)
SELECT
    customerNumber, region, country, city
FROM
    loc_interim
    NATURAL LEFT OUTER JOIN country_region;

ALTER TABLE location_dimension
    ADD PRIMARY KEY (customerNumber);
* mysql+pymysql://root:***@localhost
0 rows affected.
122 rows affected.
0 rows affected.
Out[12]: []
In [13]: %%sql

DROP TABLE IF EXISTS date_dimension;

CREATE TABLE date_dimension AS
SELECT OrderDate,
    YEAR(OrderDate) AS Year,
    QUARTER(OrderDate) AS Quarter,
    MONTH(OrderDate) AS Month,
    DAY(OrderDate) AS "Day of the Month"
FROM
    (SELECT DISTINCT
        orderDate
    FROM classicmodels.orders) as A;

ALTER TABLE date_dimension
    ADD PRIMARY KEY (OrderDate);
* mysql+pymysql://root:***@localhost
0 rows affected.
265 rows affected.
0 rows affected.
Out[13]: []
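The date dimension holds one row per distinct order date, with the calendar attributes derived from that date. A minimal Python sketch of the same derivation follows; the function name date_attributes and the three sample dates are hypothetical, and the quarter is computed as (month - 1) // 3 + 1, which matches the usual calendar-quarter convention.

```python
from datetime import date

def date_attributes(d: date) -> dict:
    """Return the date_dimension attributes for one order date."""
    return {
        "Year": d.year,
        "Quarter": (d.month - 1) // 3 + 1,  # months 1-3 -> 1, 4-6 -> 2, and so on
        "Month": d.month,
        "Day of the Month": d.day,
    }

# Hypothetical order dates; duplicates collapse because the dimension is keyed by date.
order_dates = [date(2003, 1, 6), date(2004, 11, 24), date(2005, 5, 31), date(2005, 5, 31)]
dimension_rows = {d.isoformat(): date_attributes(d) for d in sorted(set(order_dates))}
print(dimension_rows["2005-05-31"])
```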
In [14]: %%sql

DROP TABLE IF EXISTS product_dimension;
CREATE TABLE product_dimension AS
SELECT productCode,
    productScale as product_scale,
    productLine as product_line,
    productVendor as product_vendor
FROM classicmodels.products;

ALTER TABLE product_dimension
    ADD PRIMARY KEY (productCode);
* mysql+pymysql://root:***@localhost
0 rows affected.
110 rows affected.
0 rows affected.
Out[14]: []
In [15]: %%sql
ALTER TABLE sales_facts
ADD FOREIGN KEY (customerNumber) REFERENCES location_dimension(customerNumb
ADD FOREIGN KEY (orderDate) REFERENCES date_dimension(OrderDate),
ADD FOREIGN KEY (productCode) REFERENCES product_dimension(productCode);
* mysql+pymysql://root:***@localhost
2996 rows affected.
Out[15]: []
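The foreign keys tie every fact row to an existing row in each dimension. The sketch below shows the effect of such a constraint on a toy fact table and a single toy dimension in SQLite (an assumption for portability; the notebook uses MySQL). Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and that the product codes are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE product_dimension (productCode TEXT PRIMARY KEY, product_line TEXT);
CREATE TABLE sales_facts (
    fact_id TEXT PRIMARY KEY,
    productCode TEXT REFERENCES product_dimension(productCode),
    quantityOrdered INTEGER
);
INSERT INTO product_dimension VALUES ('P1', 'Classic Cars');
INSERT INTO sales_facts VALUES ('1-1', 'P1', 10);
""")

# A fact that points at a product missing from the dimension should be rejected.
try:
    conn.execute("INSERT INTO sales_facts VALUES ('1-2', 'P404', 5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

fact_count = conn.execute("SELECT COUNT(*) FROM sales_facts").fetchone()[0]
print(rejected, fact_count)
```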
In [16]: %%sql

DROP VIEW IF EXISTS classic_cube;

CREATE VIEW classic_cube AS
SELECT *
FROM sales_facts
    NATURAL JOIN location_dimension
    NATURAL JOIN date_dimension
    NATURAL JOIN product_dimension;
* mysql+pymysql://root:***@localhost
0 rows affected.
0 rows affected.
Out[16]: []
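The view flattens the star schema into one wide table so that later queries can group by any dimension attribute. NATURAL JOIN matches on every column name the tables share, so one possible variation, sketched below on toy SQLite tables with invented rows, is to name each join key with USING so that only the intended columns are matched. This is an alternative under stated assumptions, not the notebook's own definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_facts (fact_id TEXT, productCode TEXT, customerNumber INTEGER, quantityOrdered INTEGER);
CREATE TABLE location_dimension (customerNumber INTEGER PRIMARY KEY, region TEXT, country TEXT);
CREATE TABLE product_dimension (productCode TEXT PRIMARY KEY, product_vendor TEXT);
INSERT INTO sales_facts VALUES ('1-1', 'P1', 7, 10), ('1-2', 'P2', 7, 3);
INSERT INTO location_dimension VALUES (7, 'EMEA', 'France');
INSERT INTO product_dimension VALUES ('P1', 'Vendor A'), ('P2', 'Vendor B');

-- USING names each join key, so only the intended columns are matched.
CREATE VIEW classic_cube AS
SELECT *
FROM sales_facts
JOIN location_dimension USING (customerNumber)
JOIN product_dimension USING (productCode);
""")

cube_rows = conn.execute(
    "SELECT fact_id, region, product_vendor FROM classic_cube ORDER BY fact_id"
).fetchall()
print(cube_rows)
```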
In [17]: %%sql

SELECT * FROM classic_cube LIMIT 5;
* mysql+pymysql://root:***@localhost
5 rows affected.
Out[17]: productCode orderDate customerNumber fact_id quantityOrdered priceEach region countr

Data Loading
Enter and execute your SQL for loading the data into the fact and dimension tables. The source of the information is the Classic Models data.
Queries
In each of the sections below, define what your query produces, provide the query, and execute it to produce the results.
Slice
Explanation: Given the cube of the data with dimensions product_vendor, year and region, we slice on year=2005.
In [18]: %%sql
SELECT
    product_vendor, year, region, count(*) as count
FROM classic_cube
GROUP BY product_vendor, year, region
HAVING year=2005;
* mysql+pymysql://root:***@localhost
39 rows affected.

Motor City Art Classics 2005 EMEA 18
Gearbox Collectibles 2005 APAC 9
Highway 66 Mini Classics 2005 APAC 5
Autoart Studio Design 2005 APAC 10
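A slice fixes one dimension to a single value and leaves the others free. The sketch below shows this on a toy cube in SQLite; the table name cube and its rows are hypothetical. It filters with WHERE before grouping, which returns the same groups as a HAVING filter on year whenever year is one of the grouping columns.

```python
import sqlite3

# Hypothetical cube rows: (product_vendor, year, region)
rows = [
    ("Vendor A", 2004, "EMEA"), ("Vendor A", 2005, "EMEA"),
    ("Vendor A", 2005, "EMEA"), ("Vendor B", 2005, "APAC"),
    ("Vendor B", 2003, "NA"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cube (product_vendor TEXT, year INTEGER, region TEXT)")
conn.executemany("INSERT INTO cube VALUES (?, ?, ?)", rows)

# Slice: fix one dimension (year) to a single value, keep the others free.
slice_2005 = conn.execute("""
    SELECT product_vendor, year, region, COUNT(*) AS count
    FROM cube
    WHERE year = 2005
    GROUP BY product_vendor, year, region
    ORDER BY product_vendor
""").fetchall()
print(slice_2005)
```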
Dice
Explanation: Given the cube of the data with dimensions product_vendor, year and region, we dice on year IN (2003,2004), product_vendor IN ('Min Lin Diecast','Red Start Diecast') and region IN ('NA','EMEA').
In [19]: %%sql
SELECT
    product_vendor, year, region,
    count(*) as count
FROM classic_cube
GROUP BY product_vendor, year, region
HAVING
    product_vendor IN ('Min Lin Diecast','Red Start Diecast') AND
    year IN (2003,2004) AND
    region IN ('NA','EMEA');
* mysql+pymysql://root:***@localhost
8 rows affected.
Out[19]: product_vendor Year region count

Red Start Diecast 2003 NA 28
Red Start Diecast 2003 EMEA 27
Min Lin Diecast 2003 EMEA 33
Min Lin Diecast 2003 NA 29
Red Start Diecast 2004 EMEA 36
Min Lin Diecast 2004 EMEA 47
Min Lin Diecast 2004 NA 40
Red Start Diecast 2004 NA 36
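A dice restricts several dimensions at once, each to a set of allowed values, which carves a smaller sub-cube out of the full one. A minimal sketch on a toy SQLite table follows; the table name cube, the vendor names, and the rows are all hypothetical.

```python
import sqlite3

# Hypothetical cube rows: (product_vendor, year, region)
rows = [
    ("Vendor A", 2003, "NA"), ("Vendor A", 2003, "NA"), ("Vendor A", 2004, "EMEA"),
    ("Vendor B", 2004, "NA"), ("Vendor B", 2005, "EMEA"), ("Vendor C", 2003, "APAC"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cube (product_vendor TEXT, year INTEGER, region TEXT)")
conn.executemany("INSERT INTO cube VALUES (?, ?, ?)", rows)

# Dice: restrict every dimension to a subset of its values.
dice = conn.execute("""
    SELECT product_vendor, year, region, COUNT(*) AS count
    FROM cube
    WHERE product_vendor IN ('Vendor A', 'Vendor B')
      AND year IN (2003, 2004)
      AND region IN ('NA', 'EMEA')
    GROUP BY product_vendor, year, region
    ORDER BY product_vendor, year, region
""").fetchall()
print(dice)
```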
Roll Up
Explanation: Starting from the full data, we roll up to the dimensions product_vendor, year and region, with the number of fact rows in each group as count.
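In OLAP terms, a roll-up moves from a finer grouping to a coarser one, and the counts at the coarser level are the sums of the finer counts. The sketch below illustrates that relationship on a toy SQLite table by grouping first by (product_vendor, year, region) and then by year alone; the table name cube and its rows are hypothetical, and the choice of year as the coarser level is only an example.

```python
import sqlite3

# Hypothetical cube rows: (product_vendor, year, region)
rows = [
    ("Vendor A", 2004, "EMEA"), ("Vendor A", 2004, "NA"), ("Vendor B", 2004, "EMEA"),
    ("Vendor B", 2005, "APAC"), ("Vendor B", 2005, "APAC"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cube (product_vendor TEXT, year INTEGER, region TEXT)")
conn.executemany("INSERT INTO cube VALUES (?, ?, ?)", rows)

# Finer level: one group per (vendor, year, region).
detail = conn.execute("""
    SELECT product_vendor, year, region, COUNT(*) AS count
    FROM cube
    GROUP BY product_vendor, year, region
    ORDER BY product_vendor, year, region
""").fetchall()

# Rolled-up level: vendor and region are aggregated away, leaving year only.
rolled_up = conn.execute(
    "SELECT year, COUNT(*) AS count FROM cube GROUP BY year ORDER BY year"
).fetchall()
print(detail)
print(rolled_up)
```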
In [20]: %%sql
SELECT
    product_vendor, year, region, count(*) as count
FROM classic_cube
GROUP BY product_vendor, year, region;
* mysql+pymysql://root:***@localhost
117 rows affected.
Out[20]:
Motor City Art Classics 2005 APAC 10
Carousel DieCast Legends 2005 APAC 8
Welly Diecast Productions 2005 APAC 12
Min Lin Diecast 2005 APAC 7
Exoto Designs 2005 APAC 10
Classic Metal Creations 2005 APAC 12
Carousel DieCast Legends 2005 EMEA 14
Second Gear Diecast 2005 EMEA 15
Motor City Art Classics 2005 EMEA 18
Gearbox Collectibles 2005 APAC 9
Highway 66 Mini Classics 2005 APAC 5
Autoart Studio Design 2005 APAC 10
Drilldown
Explanation: Given the roll-up from the previous question, we drill down from year to month and from region to country, which adds more fields to our cube.
In [21]: %%sql
SELECT
    product_vendor, year, month, region, country, count(*) as count
FROM classic_cube
GROUP BY product_vendor, year, month, region, country;
* mysql+pymysql://root:***@localhost
1533 rows affected.
Out[21]:
Studio M Art Models 2005 5 APAC Austria 1
Red Start Diecast 2005 5 APAC Austria 1
Gearbox Collectibles 2005 5 APAC Austria 2
Min Lin Diecast 2005 5 APAC Austria 1
Highway 66 Mini Classics 2005 5 APAC Austria 1
Unimax Art Galleries 2005 5 APAC Austria 1
Second Gear Diecast 2005 5 APAC Austria 1
Gearbox Collectibles 2005 5 APAC Australia 1
Carousel DieCast Legends 2005 5 APAC Australia 1
Red Start Diecast 2005 5 APAC Australia 1
Welly Diecast Productions 2005 5 APAC Australia 1
Studio M Art Models 2005 5 APAC Australia 1
Classic Metal Creations 2005 5 APAC Australia 1
Carousel DieCast Legends 2005 5 EMEA Belgium 1
Red Start Diecast 2005 5 EMEA Belgium 1
Min Lin Diecast 2005 5 EMEA Belgium 1
Second Gear Diecast 2005 5 EMEA Belgium 1
Exoto Designs 2005 5 EMEA Belgium 1
Studio M Art Models 2005 5 EMEA Spain 2
Highway 66 Mini Classics 2005 5 EMEA France 2
Carousel DieCast Legends 2005 5 EMEA France 2
Motor City Art Classics 2005 5 EMEA France 2
Second Gear Diecast 2005 5 EMEA France 1
Gearbox Collectibles 2005 5 EMEA France 1
Exoto Designs 2005 5 EMEA France 1
Classic Metal Creations 2005 5 EMEA France 1
Unimax Art Galleries 2005 5 EMEA France 2
Autoart Studio Design 2005 5 EMEA France 1
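A drill-down is the reverse of a roll-up: each coarse group is split into finer groups by adding lower-level attributes of the same dimensions. The sketch below shows this on a toy SQLite table, going from (year, region) to (year, month, region, country); the table name cube and its rows are hypothetical.

```python
import sqlite3

# Hypothetical cube rows: (year, month, region, country)
rows = [
    (2005, 4, "EMEA", "France"), (2005, 5, "EMEA", "France"), (2005, 5, "EMEA", "Spain"),
    (2005, 5, "APAC", "Australia"), (2005, 5, "EMEA", "France"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cube (year INTEGER, month INTEGER, region TEXT, country TEXT)")
conn.executemany("INSERT INTO cube VALUES (?, ?, ?, ?)", rows)

# Coarse level: one group per (year, region).
coarse = conn.execute("""
    SELECT year, region, COUNT(*) AS count
    FROM cube GROUP BY year, region ORDER BY region
""").fetchall()

# Drill-down: add month under year and country under region.
fine = conn.execute("""
    SELECT year, month, region, country, COUNT(*) AS count
    FROM cube
    GROUP BY year, month, region, country
    ORDER BY month, region, country
""").fetchall()
print(coarse)
print(fine)
```

Each coarse count equals the sum of the fine counts that share its year and region, which is a quick consistency check when drilling down.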