CMPE321 – (Solution)

Ask project-related questions in the Moodle forum!
1 Introduction
The study of drugs or chemicals and the effects they have on a living organism is called pharmacology. A drug is a chemical that interacts with a target molecule (such as a protein), which in turn results in a change in the molecule’s or the corresponding cell’s behavior.
To cope with the tremendous number of drugs and targets (e.g., proteins that bind to drugs), online databases are created and commonly used in the literature. Online databases allow easy and fast access to information and most of them focus on a specific field. For instance, DrugBank contains information on drugs and drug targets, UniProt contains proteins and their structures, SIDER contains drugs and their side effects, whereas BindingDB curates the interactions between drugs and targets. Thus, a free online resource that integrates these databases would benefit drug discovery researchers, chemists, pharmacists, students, and the general public.
2 Project Description
In this project, you will design a unified Drug-Target database called DTBank. You will begin with a detailed description of the content. You will then work systematically through parts of the standard database design process you learned in class, including conceptual design, logical design, and schema refinement.
2.1 DTBank: Structure
The DTBank should contain the following information:
1. User includes the following attributes: username, institute name, and password. A given combination of username and institute identifies exactly one user. There can be an unlimited number of users.
2. Database manager includes the following attributes: username and password. A given username identifies exactly one database manager. At most 5 database managers can be registered in the system.
3. DrugBank includes DrugBank ID, drug name, description, and interaction with other drugs. By definition, each DrugBank ID is unique.
4. SIDER includes UMLS CUI (side effect IDs), DrugBank ID, and side effect name. By definition, each UMLS CUI is unique.
5. BindingDB includes Reaction ID, DrugBank ID, UniProt ID, target (protein) name, SMILES (chemical notation of the drug), affinity in nM (the strength of the binding interaction between the drug and the target), the measure of the interaction (Ki, Kd, or IC50), DOI (a web link that identifies the article or document mentioning the drug-target interaction, e.g., https://doi.org/10.1093/bioinformatics/bty593), authors of the article or document, the username of the first author, and the institution of the first author. By definition, each Reaction ID is unique and the first author is a user of DTBank.
6. UniProt includes UniProt ID and amino acid sequence of the corresponding protein. By definition, each UniProt ID is unique.
2.2 DTBank: Real data
username | institution | password
Charpentier | CIRD GALDERMA | charpentier99!.
Diaz | TBA | diaz.p25
Morgan | Amgen Inc | morgan.re.123
Table 1: Sample data from User

username | password
selen.parlar | selen.parlar
riza.ozcelik | riza.ozcelik0
arzucan ozgur | arzucan 135
Table 2: Sample data from Database Manager

umls cui | drugbank id | side effect name
C0085624 | DB00600 | Burning sensation
C1325847 | DB00600 | Sensitisation
C0152030 | DB00600 | Skin irritation
C0521491 | DB00210 | Application site pain
C0085624 | DB00210 | Burning sensation
C0009763 | DB00210 | Conjunctivitis
C0036572 | DB00210 | Convulsion
C0011603 | DB00210 | Dermatitis
C0152030 | DB00210 | Skin irritation
Table 3: Sample data from SIDER

uniprot id | sequence
P15207 | MEVQLGLGR…KPIYFHTQ
P10826 | MTTSGHACP…VSQSPLVQ
Table 4: Sample data from UniProt

Table 5: Sample data from DrugBank
Table 6: Sample data from BindingDB
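The SIDER sample rows can help when thinking about keys: the same UMLS CUI (e.g., C0085624) appears with more than one DrugBank ID, so a flat table holding all three columns cannot use UMLS CUI alone as its key. The following is a minimal sketch, not part of the assignment, that lists which column combinations are duplicate-free in the sample. The column names with underscores and the helper names `is_unique` and `unique_column_sets` are assumptions made for illustration. Sample data can only rule a candidate key out; it cannot prove that one holds in general.

```python
from itertools import combinations

# Rows copied from the SIDER sample data: (umls_cui, drugbank_id, side_effect_name).
SIDER_ROWS = [
    ("C0085624", "DB00600", "Burning sensation"),
    ("C1325847", "DB00600", "Sensitisation"),
    ("C0152030", "DB00600", "Skin irritation"),
    ("C0521491", "DB00210", "Application site pain"),
    ("C0085624", "DB00210", "Burning sensation"),
    ("C0009763", "DB00210", "Conjunctivitis"),
    ("C0036572", "DB00210", "Convulsion"),
    ("C0011603", "DB00210", "Dermatitis"),
    ("C0152030", "DB00210", "Skin irritation"),
]
# Hypothetical column names (the sample header, with underscores added).
COLUMNS = ("umls_cui", "drugbank_id", "side_effect_name")


def is_unique(rows, indices):
    """Return True if no two rows agree on all of the given column indices."""
    projected = [tuple(row[i] for i in indices) for row in rows]
    return len(set(projected)) == len(projected)


def unique_column_sets(rows, columns):
    """List every column combination that has no duplicates in the rows."""
    result = []
    for size in range(1, len(columns) + 1):
        for indices in combinations(range(len(columns)), size):
            if is_unique(rows, indices):
                result.append(tuple(columns[i] for i in indices))
    return result
```

Calling `unique_column_sets(SIDER_ROWS, COLUMNS)` on this sample reports no single column as duplicate-free, which is the kind of observation worth carrying into the functional dependency analysis of Part 3.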
2.3 Part 1: Conceptual database design
Your task in Part 1 is to perform the Conceptual Database Design (or ER Design) – draw ER diagrams to capture all the information, following the approach described in lectures. While there are many ER-model variants, for this project, we expect you to use the ER notation from the textbook and lecture.
2.4 Part 2: Logical database design
For the second part of the project, your task is to convert the ER diagrams into relational tables, using the simple conversion rules described in the textbook and in lectures. You should provide the schema of each relation, including the relation name, attribute names, and attribute domains.
2.5 Part 3: Schema refinement and normalization
For the third part of the project, your task is to analyze your design from Part 2 in terms of functional dependencies (FDs) and normal forms, and then refine your design if needed. You should explicitly list all of the non-trivial FDs. Then, for each relation, you should determine whether it is in Boyce-Codd Normal Form (BCNF) and explain how the requirements of BCNF are met (or not met) in terms of FDs. If a relation is not in BCNF, you should check whether it is in 3NF and explain how the requirements of 3NF are met (or not met). You should then either decompose the relation into BCNF relations or justify your decision not to decompose it. If you decompose a relation, you should explain whether the decomposition is lossless-join and dependency-preserving. It is possible that your initial schema is already in BCNF. If this is the case, you still need to explain how the requirements of BCNF are met in terms of FDs for each of your relations.
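The BCNF check described above can be sketched with an attribute-closure computation: a non-trivial FD X → Y violates BCNF when the closure of X does not cover the whole relation, i.e., X is not a superkey. The code below is an illustrative sketch only. The flat relation `R`, the FD `umls_cui → side_effect_name`, and the function names are assumptions chosen for the example, not the expected answer for DTBank.

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a list of FDs.

    Each FD is a (lhs, rhs) pair of frozensets of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the non-trivial FDs whose left side is not a superkey of relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not relation <= closure(lhs, fds)
    ]


# Hypothetical flat relation and FD, assumed for illustration only.
R = frozenset({"umls_cui", "drugbank_id", "side_effect_name"})
FDS = [(frozenset({"umls_cui"}), frozenset({"side_effect_name"}))]

# One possible decomposition on the violating FD.
R1 = frozenset({"umls_cui", "side_effect_name"})  # keeps the FD, umls_cui is its key
R2 = frozenset({"umls_cui", "drugbank_id"})       # no non-trivial FDs remain
```

Here `bcnf_violations(R, FDS)` reports the assumed FD, while `bcnf_violations(R1, FDS)` and `bcnf_violations(R2, [])` report none. The FDs that apply to each decomposed relation were projected by hand in this sketch; in a real analysis you would need to derive them from the full FD set.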
2.6 Part 4: Write SQL statements for the normalized schema
You are required to write SQL DDL statements that create the tables you designed in the previous parts. You should specify all the constraints, such as PK, FK, UNIQUE, and NOT NULL, as well as other general constraints with CHECK. You should turn in two files:
1. createTables.sql
2. dropTables.sql
Make sure that you include a comment in each file.
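One way to try out DDL statements before submitting them is to run them against an in-memory SQLite database from Python. The sketch below is a hedged illustration covering only a small part of the data: the table names `Drug`, `Protein`, and `Reaction`, the column names, and the chosen domains are all assumptions, and SQLite's dialect may differ from the DBMS used in the course. It shows PK, FK, NOT NULL, and a CHECK that restricts the measure to Ki, Kd, or IC50.

```python
import sqlite3

# Hypothetical subset of a schema; names and domains are assumptions.
CREATE_TABLES = """
-- createTables.sql (sketch): creates three illustrative tables.
CREATE TABLE Drug (
    drugbank_id TEXT PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE Protein (
    uniprot_id TEXT PRIMARY KEY,
    sequence   TEXT NOT NULL
);
CREATE TABLE Reaction (
    reaction_id INTEGER PRIMARY KEY,
    drugbank_id TEXT NOT NULL REFERENCES Drug(drugbank_id),
    uniprot_id  TEXT NOT NULL REFERENCES Protein(uniprot_id),
    affinity_nm REAL NOT NULL,
    measure     TEXT NOT NULL CHECK (measure IN ('Ki', 'Kd', 'IC50'))
);
"""

DROP_TABLES = """
-- dropTables.sql (sketch): drops the referencing table first.
DROP TABLE IF EXISTS Reaction;
DROP TABLE IF EXISTS Protein;
DROP TABLE IF EXISTS Drug;
"""


def open_database():
    """Create an in-memory database with the sketch schema loaded."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(CREATE_TABLES)
    return conn
```

With a connection from `open_database()`, inserting a `Reaction` row whose measure is not one of the three allowed values, or whose `drugbank_id` does not exist in `Drug`, raises `sqlite3.IntegrityError`. Running `conn.executescript(DROP_TABLES)` removes the tables again, with the referencing table dropped before the tables it points to.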
3 Submission & Remarks
