CSC343 Assignment 1: Relational Algebra (Solution)

1 Constraints
For each of the following constraints give a one sentence explanation of what the constraint implies, and why it is required.
• π_{species}(Artifact) − π_{species}(Species) = ∅.
• π_{rank}(Staff) ⊆ {'technician', 'student', 'pre-tenure', 'tenure'}.
All staff ranks must fall under technician, student, pre-tenure or tenure, otherwise a staff member could have an unknown rank.
• π_{family}(Genus) − π_{family}(COL) = ∅.
Every genus belongs to a family name that is known in the Catalogue of Life, as per proper taxonomic practice.
• π_{genus}(Species) ⊆ π_{genus}(Genus).
All species belong to a known genus in the Catalogue of Life, as per proper taxonomic practice.
• π_{CID}(Collected) = π_{CID}(Collection).
The collections that artifacts belong to are known entire collections from field trips.
• π_{AN}(Artifact) = π_{AN}(Collected).
All artifacts belong to collections from field trips.
• π_{SID}(Collection) ⊆ π_{SID}(Staff).
• π_{SID}(Artifact) ⊆ π_{SID}(Staff).
Personnel who maintain an artifact must be a member of the institute scientific staff, ensuring that the artifact is safely stored.
• π_{type}(Artifact) ⊆ {'tissue', 'image', 'model', 'live'}.
All artifact types must fall under tissue, image, model or live.
• π_{AN}(Published) ⊆ π_{AN}(Artifact).
Artifacts mentioned in scholarly publications must be known single artifacts collected in the field.
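The inclusion constraints above can be checked mechanically on a small instance. The following is a minimal sketch, assuming relations are stored as lists of dictionaries; the attribute names follow the constraints, while the sample values and the helper names `project` and `violations` are invented for illustration.

```python
# Toy instances. Attribute names follow the constraints above; the values are invented.
artifact = [
    {"AN": 1, "species": "lupus", "type": "tissue", "SID": 10},
    {"AN": 2, "species": "latrans", "type": "image", "SID": 11},
]
published = [{"journal": "J1", "AN": 1}, {"journal": "J2", "AN": 3}]


def project(relation, *attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in relation}


def violations(lhs, rhs):
    """Tuples that break lhs ⊆ rhs, i.e. the difference lhs − rhs."""
    return lhs - rhs


# π_{AN}(Published) ⊆ π_{AN}(Artifact) holds exactly when this set is empty.
dangling = violations(project(published, "AN"), project(artifact, "AN"))
print(dangling)  # {(3,)}
```

Here artifact 3 is cited in a publication but does not appear in Artifact, so the constraint is violated on this instance.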
2 Queries
Write relational algebra expressions for each of the queries below.
1. Rationale: Performance reviews include seeing how current the work is of staff who have held their current rank for a long time.
• We begin by determining the SIDs of staff who have not held the same rank for the longest time:
longestRankStaff := π_{SID}(Staff) − notLongestRankStaff
• Determine the collections corresponding to the longest-rank staff from the above line:
tempCollection := σ_{Collection.SID = longestRankStaff.SID}(Collection)
newestCollectionDate := π_{date}(Collection) − notNewestCollection
2. Rationale: Staff who maintain every artifact in some collection should be considered favourably in performance reviews.
Query: Find all staff who maintain all artifacts in at least one collection.
• We first determine the CIDs of collections and link them with the staff members who perform maintenance:
CollectionStaff := π_{CID, SID}(Collected ⋈ Artifact)
• We then find the collections that are maintained by more than one staff member:

MultiStaffCollection := π_{C1.CID}(σ_{C1.CID = C2.CID ∧ C1.SID ≠ C2.SID}(ρ_{C1}CollectionStaff × ρ_{C2}CollectionStaff))
• We now find collections maintained by only one staff member:
SingleStaffCollection := π_{CID}(Collection) − MultiStaffCollection
• Finally we get the SIDs of the staff who maintain all artifacts in at least one collection:
AllArtifactStaff := π_{SID}(SingleStaffCollection ⋈ CollectionStaff)
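The single-maintainer approach can be tried on a toy instance. This is a minimal sketch, assuming Collected is reduced to (AN, CID) pairs and Artifact to (AN, SID) pairs; the data and variable names are invented, and the last step reads the maintainer from the CID/SID pairs rather than from Collection.

```python
# (AN, CID) pairs from Collected and (AN, SID) pairs from Artifact; values are invented.
collected = {(1, "c1"), (2, "c1"), (3, "c2"), (4, "c2")}
maintains = {(1, "s1"), (2, "s1"), (3, "s1"), (4, "s2")}

# CollectionStaff: natural join on AN, then projection onto (CID, SID).
collection_staff = {
    (cid, sid)
    for (an, cid) in collected
    for (an2, sid) in maintains
    if an == an2
}

# MultiStaffCollection: self-product, keep CIDs paired with two different SIDs.
multi = {
    c1
    for (c1, s1) in collection_staff
    for (c2, s2) in collection_staff
    if c1 == c2 and s1 != s2
}

# SingleStaffCollection: every CID minus the multi-staff ones.
# (Taking CIDs from collection_staff assumes every collection has an artifact.)
single = {cid for (cid, _) in collection_staff} - multi

# Staff who maintain every artifact of at least one collection.
all_artifact_staff = {sid for (cid, sid) in collection_staff if cid in single}
print(all_artifact_staff)  # {'s1'}
```

Collection c2 is split between s1 and s2, so only c1 counts, and its sole maintainer is s1.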
3. Query: Find all artifacts that were collected by the same staff who maintains them.
We find the artifact numbers where the maintenance staff is the same as the collection staff.
π_{AN}(Collection ⋈ Collected ⋈ Artifact)
4. Rationale: Identify multi-talented field workers.
Query: Find all staff who have collected at least 3 artifacts from every species in some family.
• We first construct a relation that ties together ANs, CIDs and SIDs. This will allow us to isolate for staff who have collected at least three unique artifacts under the same species:
Triples := π_{AN, species}(Artifact) ⋈ Collected ⋈ π_{CID, SID}(Collection)
• We will now generate all possible combinations of three artifacts, and project the relevant columns of AN, species and SID:
AllCombinations := π_{A1.AN, A1.species, A1.SID, A2.AN, A2.species, A2.SID, A3.AN, A3.species, A3.SID}(ρ_{A1}Triples × ρ_{A2}Triples × ρ_{A3}Triples)
• We construct a relation that links all species with their families:
SpecFam := π_{species, family}(Species ⋈ Genus)
• We now select only the rows where the same staff member has collected at least three different artifacts under the same species, and we add the appropriate family to each row via a theta join:
TripleSpecies := π_{A1.SID, A1.species, SpecFam.family}(σ_{A1.AN ≠ A2.AN ∧ A1.AN ≠ A3.AN ∧ A2.AN ≠ A3.AN ∧ A1.species = A2.species ∧ A1.species = A3.species ∧ A2.species = A3.species ∧ A1.SID = A2.SID ∧ A2.SID = A3.SID}(AllCombinations) ⋈_{A1.species = SpecFam.species} SpecFam)
• Each SID that has at least 3 artifacts from one species in the associated family:
StaffFam := π_{SID, family}(TripleSpecies)
• Find all the possible species a staff member would need to collect three artifacts from in order to complete a family. This only considers the families from which they have already collected at least three artifacts from at least one species, not all possible families.
AllPossibleStaffSpeciesFamilies := StaffFam ⋈ SpecFam
• We form a difference to remove all confirmed tuples where the staff member has collected at least three artifacts of the corresponding species. The remaining tuples represent missing species that a staff member has failed to collect from the associated family.
MissingSpeciesFromFamily := AllPossibleStaffSpeciesFamilies − TripleSpecies
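• Finally, remove the staff-family pairs that still have a missing species and project the SIDs that remain:
multiTalentedStaff := π_{SID}(StaffFam − π_{SID, family}(MissingSpeciesFromFamily))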
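The "every species in some family" logic can be traced on a small instance. The sketch below is one possible rendering, assuming Triples is reduced to (AN, species, SID) and SpecFam to (species, family) pairs; all data is invented. It swaps the three-way self-product for a `Counter`, which is not available in relational algebra but produces the same "at least three distinct artifacts" test.

```python
from collections import Counter

# (species, family) pairs and (AN, species, collector SID) triples; values are invented.
spec_fam = {("lupus", "Canidae"), ("latrans", "Canidae"), ("catus", "Felidae")}
triples = {
    (1, "lupus", "s1"), (2, "lupus", "s1"), (3, "lupus", "s1"),
    (4, "latrans", "s1"), (5, "latrans", "s1"), (6, "latrans", "s1"),
    (7, "lupus", "s2"), (8, "lupus", "s2"), (9, "lupus", "s2"),
    (10, "latrans", "s2"),
    (11, "catus", "s3"), (12, "catus", "s3"), (13, "catus", "s3"),
}

# Count distinct artifacts per (SID, species); ANs are unique within the set.
counts = Counter((sid, species) for (_, species, sid) in triples)

# TripleSpecies: (SID, species, family) where the staff member has >= 3 artifacts.
triple_species = {
    (sid, sp, fam)
    for (sid, sp), n in counts.items()
    if n >= 3
    for (sp2, fam) in spec_fam
    if sp2 == sp
}

staff_fam = {(sid, fam) for (sid, _, fam) in triple_species}

# Every species the staff member would need in order to complete the family.
all_possible = {
    (sid, sp, fam)
    for (sid, fam) in staff_fam
    for (sp, fam2) in spec_fam
    if fam2 == fam
}

missing = all_possible - triple_species
incomplete = {(sid, fam) for (sid, _, fam) in missing}
multi_talented = {sid for (sid, _) in staff_fam - incomplete}
```

On this data s2 has only one latrans artifact, so the pair (s2, Canidae) is incomplete and s2 is excluded, while s1 and s3 remain.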
5. Rationale: Which publications might have some specialized niche focus?
Query: Find all publications that have used exactly 2 of our artifacts.

• We first determine the publications where at least two of the artifacts have been used:
atLeastTwo := π_{P1.journal}(σ_{P1.journal = P2.journal ∧ P1.AN ≠ P2.AN}(ρ_{P1}Published × ρ_{P2}Published))
• We now determine the publications where at least three of the artifacts have been used:
atLeastThree := π_{P1.journal}(σ_{P1.journal = P2.journal ∧ P2.journal = P3.journal ∧ P1.AN ≠ P2.AN ∧ P1.AN ≠ P3.AN ∧ P2.AN ≠ P3.AN}(ρ_{P1}Published × ρ_{P2}Published × ρ_{P3}Published))
• Now, we isolate for the publications which have used exactly two artifacts:
exactlyTwo := atLeastTwo − atLeastThree
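The "exactly two equals at least two minus at least three" pattern can be sketched as follows, assuming Published is reduced to (journal, AN) pairs with invented values. Both intermediate sets are projected onto the journal so that the difference is taken between sets of the same shape.

```python
from itertools import product

# (journal, AN) pairs; values are invented.
published = {
    ("J1", 1), ("J1", 2),
    ("J2", 1), ("J2", 2), ("J2", 3),
    ("J3", 4),
}

# Journals that used at least two different artifacts (self-product of two copies).
at_least_two = {
    j1
    for (j1, a1), (j2, a2) in product(published, repeat=2)
    if j1 == j2 and a1 != a2
}

# Journals that used at least three pairwise different artifacts (three copies).
at_least_three = {
    j1
    for (j1, a1), (j2, a2), (j3, a3) in product(published, repeat=3)
    if j1 == j2 == j3 and len({a1, a2, a3}) == 3
}

exactly_two = at_least_two - at_least_three
print(exactly_two)  # {'J1'}
```

J2 uses three artifacts and J3 uses one, so only J1 remains.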
6. Rationale: Identify motherlode locations.
Query: Find all locations where at least one artifact from every family has been collected.
• We begin by finding the families that have been found at each location:
familiesAtLocations := π_{location, family}(Artifact ⋈ Species ⋈ Genus)
• Next, we determine all possible combinations of locations and families:
allFamilyLocationCombinations := π_{location}(familiesAtLocations) × COL
• We then isolate all the locations that do not have at least one artifact from every family collected there:
notMotherlode := π_{location}(allFamilyLocationCombinations − familiesAtLocations)
• Finally, we remove these non-motherlode locations from all possible locations to find the ones which are motherlode locations:
motherlodeLocations := π_{location}(Artifact) − notMotherlode
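This is the standard "for every" (division) pattern: build all required pairs, subtract the pairs that exist, and remove whatever is left over. A minimal sketch, assuming familiesAtLocations is already available as (location, family) pairs and COL as a set of family names; the values are invented.

```python
# Family names from COL and observed (location, family) pairs; values are invented.
families = {"Canidae", "Felidae"}
families_at_locations = {
    ("north", "Canidae"), ("north", "Felidae"),
    ("south", "Canidae"),
}

locations = {loc for (loc, _) in families_at_locations}

# Every (location, family) pair a motherlode location would need.
all_combinations = {(loc, fam) for loc in locations for fam in families}

# Locations with at least one required pair missing.
not_motherlode = {loc for (loc, _) in all_combinations - families_at_locations}

motherlode = locations - not_motherlode
print(motherlode)  # {'north'}
```

The pair (south, Felidae) is missing, so south is removed and only north qualifies.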
7. Query: Find all staff who have collected only tissue samples.
• We first determine any staff who have collected non-tissue samples:
notTissueStaff := π_{SID}(σ_{type = 'image' ∨ type = 'model' ∨ type = 'live'}(Artifact))
• We then remove these staff to isolate the tissue-only collectors:
tissueStaff := π_{SID}(Artifact) − notTissueStaff
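The "only" pattern (select the offenders, then subtract them) can be sketched as below, assuming Artifact is reduced to (AN, SID, type) triples with invented values.

```python
# (AN, SID, type) triples; values are invented.
artifact = {
    (1, "s1", "tissue"), (2, "s1", "tissue"),
    (3, "s2", "tissue"), (4, "s2", "image"),
}

NON_TISSUE = {"image", "model", "live"}

# Staff associated with at least one non-tissue artifact.
not_tissue_staff = {sid for (_, sid, kind) in artifact if kind in NON_TISSUE}

# Everyone who appears in Artifact, minus the staff found above.
tissue_staff = {sid for (_, sid, _) in artifact} - not_tissue_staff
print(tissue_staff)  # {'s1'}
```

Selecting type = 'tissue' directly would wrongly keep s2; the subtraction is what enforces "only".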
8. Rationale: Collection staff who should be encouraged to diversify their network.
Query: Find all staff pairs who have worked only with each other on collections.
• Create a relation that contains the SID of each staff member and which artifacts they have collected:
artifactsCollected := Collection ⋈ Collected
• Next, determine all collectors who have worked with others:
collaborativeCollectors := π_{artifactsCollected.SID}(σ_{artifactsCollected.AN = Artifact.AN ∧ artifactsCollected.SID ≠ Artifact.SID}(artifactsCollected × Artifact))
• Next, determine all maintainers who have worked with others:
collaborativeMaintainers := π_{Artifact.SID}(σ_{artifactsCollected.AN = Artifact.AN ∧ artifactsCollected.SID ≠ Artifact.SID}(artifactsCollected × Artifact))
• We then determine staff that have only worked alone:
aloneStaff := π_{SID}(Staff) − collaborativeCollectors − collaborativeMaintainers
• Now we find all the staff who have worked with at least two other people on a collection. We combine two copies of Artifact (A1, A2) with two copies of artifactsCollected (C1, C2). This gives us four SID columns along with the artifact numbers they worked on. We require that the SIDs in C1 and C2 are the same, that A1 and C1 refer to the same artifact, and that A2 and C2 refer to the same artifact. We then keep a tuple only if the collector (C1 and C2) and the two maintainers (A1 and A2) are three distinct staff members. The remaining tuples contain SIDs of staff who have worked with at least two other people, and we project each SID column in turn:
atLeastTwoOthers1 := π_{C1.SID}(σ_{C1.SID = C2.SID ∧ A1.AN = C1.AN ∧ A2.AN = C2.AN ∧ A1.SID ≠ C1.SID ∧ A2.SID ≠ C2.SID ∧ A1.SID ≠ A2.SID}(ρ_{A1}Artifact × ρ_{C1}artifactsCollected × ρ_{A2}Artifact × ρ_{C2}artifactsCollected))
atLeastTwoOthers2 := π_{A1.SID}(σ_{C1.SID = C2.SID ∧ A1.AN = C1.AN ∧ A2.AN = C2.AN ∧ A1.SID ≠ C1.SID ∧ A2.SID ≠ C2.SID ∧ A1.SID ≠ A2.SID}(ρ_{A1}Artifact × ρ_{C1}artifactsCollected × ρ_{A2}Artifact × ρ_{C2}artifactsCollected))
atLeastTwoOthers3 := π_{A2.SID}(σ_{C1.SID = C2.SID ∧ A1.AN = C1.AN ∧ A2.AN = C2.AN ∧ A1.SID ≠ C1.SID ∧ A2.SID ≠ C2.SID ∧ A1.SID ≠ A2.SID}(ρ_{A1}Artifact × ρ_{C1}artifactsCollected × ρ_{A2}Artifact × ρ_{C2}artifactsCollected))
• We now have all staff who have only worked with themselves and all staff who have worked with at least two other people. We can subtract these from all staff to get the SIDs of those who have worked exclusively with each other.
exclusiveStaff := π_{SID}(Staff) − aloneStaff − atLeastTwoOthers1 − atLeastTwoOthers2 − atLeastTwoOthers3
• Now that we have all exclusive staff SIDs, we need to put them in pairs. We create two copies of a relation to get the artifacts that exclusive staff have worked on.
exclusiveStaffArtifacts1 := Artifact ⋈ exclusiveStaff
exclusiveStaffArtifacts2 := Artifact ⋈ exclusiveStaff
• Finally, we need to find pairs of staff who have worked on the same artifact. Since we have only selected artifacts that exclusive staff have worked on, if two unique people have worked on the same artifact, then they must be an exclusive pair.
exclusiveStaffPairs := π_{E1.SID, E2.SID}(ρ_{E1}exclusiveStaffArtifacts1 ⋈_{E1.SID ≠ E2.SID ∧ E1.AN = E2.AN} ρ_{E2}exclusiveStaffArtifacts2)
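The intended result can be cross-checked outside relational algebra. The sketch below is not a translation of the expressions above: it assumes a hypothetical, precomputed set `worked_together` of staff pairs (for example, the collector and maintainer of one artifact) and tests directly that each member of a pair has the other as their only partner.

```python
# Hypothetical precomputed pairs of staff who worked together; values are invented.
worked_together = {("s1", "s2"), ("s3", "s4"), ("s4", "s5")}

# Build the symmetric "partners" map: SID -> set of everyone they worked with.
partners = {}
for a, b in worked_together:
    partners.setdefault(a, set()).add(b)
    partners.setdefault(b, set()).add(a)

# A pair is exclusive when each member's only partner is the other member.
exclusive_pairs = {
    (a, b)
    for a, a_partners in partners.items()
    for b in a_partners
    if a_partners == {b} and partners[b] == {a}
}
```

On this data s4 has two partners, so neither (s3, s4) nor (s4, s5) qualifies, and the result holds s1 and s2 in both orders, matching the two-column shape of the projection onto E1.SID and E2.SID.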
9. Rationale: Track the influence of a given staff member.
Query: Staff member SID1 is influenced by staff member SID2 if (a) they have ever worked together on a collection or (b) if SID1 has ever worked with a staff member who is influenced by SID2. Find SIDs of staff members influenced by SID 42.
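• Cannot be expressed. The definition of influence is recursive, so the query needs a transitive closure of unbounded depth, and a relational algebra expression can only apply a fixed number of joins.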
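To see what the query asks for, the influence relation can be computed in a general-purpose language with a loop that runs until nothing new is found. This is a minimal sketch, assuming a hypothetical precomputed set of pairs of staff who worked together; the function name `influenced_by` and the data are invented.

```python
def influenced_by(worked_together, source):
    """Return every SID reachable from source through one or more
    'worked together' links, following rules (a) and (b) of the query."""
    # Working together is symmetric, so index each pair in both directions.
    neighbours = {}
    for a, b in worked_together:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    reached = set()
    frontier = {source}
    # Repeat until an iteration adds nothing new. The number of rounds
    # depends on the data, which is what a fixed expression cannot do.
    while frontier:
        next_frontier = set()
        for sid in frontier:
            next_frontier |= neighbours.get(sid, set())
        next_frontier -= reached
        reached |= next_frontier
        frontier = next_frontier
    return reached


# Invented sample data: 42 worked with 7, 7 worked with 9, and 1 with 2.
pairs = {(42, 7), (7, 9), (1, 2)}
result = influenced_by(pairs, 42)
```

Under the recursive definition, 42 also ends up in its own result here, because 42 worked with 7 and 7 is influenced by 42.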

3 Constraints
1. No species is also a genus.
σ_{species = genus}(Species) = ∅
2. No genus belongs to more than one family.
ρ_{G1}Genus ⋈_{G1.genus = G2.genus ∧ G1.family ≠ G2.family} ρ_{G2}Genus = ∅
3. All publications must be published after all artifacts they use have been collected.
jointCID := Collection ⋈ Collected
σ_{type = 'live' ∧ rank = 'student'}(Artifact ⋈ Staff) = ∅
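Constraints of the form "expression = ∅" can be tested by computing the expression and checking that it is empty. The sketch below does this for the rule that no genus belongs to more than one family, assuming Genus is reduced to (genus, family) pairs; the values are invented and deliberately include one violation.

```python
# (genus, family) pairs; values are invented, with one deliberate violation.
genus = {("Canis", "Canidae"), ("Felis", "Felidae"), ("Canis", "Felidae")}

# Theta join of Genus with itself: same genus, different family.
violations = {
    (g1, f1, f2)
    for (g1, f1) in genus
    for (g2, f2) in genus
    if g1 == g2 and f1 != f2
}

constraint_holds = not violations
offending_genera = {g for (g, _, _) in violations}
print(constraint_holds, offending_genera)  # False {'Canis'}
```

Each violation appears twice, once per ordering of the two families, which does not matter for the emptiness test.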
