28
1 One thing in one place Data, Schemas and Applications Week 8

One thing in one place

Embed Size (px)

DESCRIPTION

One thing in one place. Data, Schemas and Applications Week 8. Data Normalisation. Formal technique to reduce complex relations to a set of simpler relations which are better structured and reduce redundancy - PowerPoint PPT Presentation

Citation preview

Page 1: One thing in one place

1

One thing in one place

Data, Schemas and Applications

Week 8

Page 2: One thing in one place

2

Data Normalisation

• Formal technique to reduce complex relations to a set of simpler relations which are better structured and reduce redundancy

• We shall look at it informally on this module. On a full database module this would be covered in more theoretical depth.

Page 3: One thing in one place

3

Facilities Table

FacilityNo Facility Type Facility Unit Price per session1 Squash court 1 £4.502 Squash court 2 £4.503 Squash court 3 £4.504 Tennis court 1 £4.005 Tennis court 2 £4.006 Tennis court 3 £4.007 Astroturf pitch 1 £8.008 Astroturf pitch 1 £8.009 Hall court 1 £6.00

10 Hall court 2 £6.0011 Skittles alley £7.50

Page 4: One thing in one place

4

Potential anomalies

• UPDATE the price per session of facility 1– now different rates for squash courts - suspect business rule is same rate for all facilities for a ‘sport’

• ADD a new tennis court – must also ensure the same rate is used as for

other tennis courts

• DELETE facility 11– lose the rate per session for skittles

Page 5: One thing in one place

5

Two better tables

FacilityFacilityType

Ftypeno name price1 Squash £4.502 Tennis £4.003 Astroturf £8.004 Hall £6.005 Skittles £7.50

FacilityNo FtypenoFacility Unit1 1 court 12 1 court 23 1 court 34 2 court 15 2 court 26 2 court 37 3 pitch 18 3 pitch 19 4 court 1

10 4 court 211 5 alley

FacilityNoFaciltyUnit

FtypeNoNamePrice

Page 6: One thing in one place

6

Normalisation• Must ensure no data has been lost in this process.

• Can recover the Original data using join:select * from facility natural join facilityType

• Informally we feel that the second solution is ‘better’ than the first.– Less redundancy– Update anomalies avoided– Simpler tables reusable

• Queries are slower because a join needed– original table:

select price from facility where facilityNo=6– normalised tables:

select price from facility natural join facilityType where faciltyNo = 6

Page 7: One thing in one place

7

Exercise

• Normalise this data on outstanding video rentals• Describe your result as an ER model

(relationships not foreign keys)

Source: www.gc.maricopa.edu/business/sylvester/cis164/norm-rev.htm

Page 8: One thing in one place

8

1st Normal form - 1NF

• the data must be in the form of valid relations - i.e. with uniquely named columns, each column containing only atomic (indivisible) values

• each row unique - identify the primary key

Page 9: One thing in one place

9

2nd Normal Form

• Non-key fields are fully dependent on the primary key

• ‘Dependent’ – Column A is dependent on Column B if, in the

data, knowing the value of B will determine the value of A

– E.g. E#, Age• Age is dependent on E# • E# is not dependent on Age ( several people with

the same age , different employee numbers)

Page 10: One thing in one place

10

3rd Normal Form 3NF

• A relation R is in 3NF if and only if– the non-key fields are fully dependent on the

primary key i.e is 2NF and– the non-key fields are mutually independent– e.g R(A,B, C)– then

• B dependent on A• C dependent on A• B and C independent

Page 11: One thing in one place

11

Mnemonic

• “the key, the whole key and nothing but the key”

• 2NF – “the key” : non-key fields depend on the key– “the whole key”: non-key fields depend fully on

the key

• 3NF– “nothing but the key”: non-key fields are not

dependent on each other

Page 12: One thing in one place

12

Summary : Data Normalisation• Place common relations in separate tables and join them when

required• Normalisation is bottom-up – given a mess of data in a table (a

spreadsheet for example), how to restructure it to reduce redundancy

• More names required – name for each relation at least – more concepts but more understandable

• Normalised tables may be reusable• Reduced performance due to extra joins – may have to de-

normalise to get better performance • Improved performance due to smaller database• Normalisation is a bottom-up process – starting from some mess –

like a spreadsheet• Top-down models are usually normalised anyway

Page 13: One thing in one place

13

Code ‘Normalisation’

• For very similar reasons, we want to write our code (PHP, HTML, Javascript..) so it is ‘normalised’ too – common, recurrent code in factored into one place.– To avoid update anomalies – update in one

place– To increase readability – less code to read – To create reusable components– But may reduce performance – but who cares

– maybe later if you need to optimize-

To factor (maths) – to divide into the basic units

68 = 2*2*17

Page 14: One thing in one place

14

Recurrent Values

• If the same value is used in several places, assign it to a variable (constant) and use the variable name where it is needed.

• If this common value has to be changed, it can be changed in one place only.

• The variable name provides a understandable explanation of what this value is for.

Page 15: One thing in one place

15

Silly example: colorheads.php<?phpprint "<h1 style='color:4AFF2F;'>This one is OK</h1>";print "<h1 style='color:FF1052;'>This one is BAD</h1>";?>

<p> here is some text </p>

<?phpprint "<h1 style='color:FF1052;'>This one is BAD too</h1>";print "<h1 style='color:4AFF2F;'>This one is OK too</h1>";?>

Page 16: One thing in one place

16

Revision 1 : colorheads2.php<?php

$goodColour = "4AFF2F";$badColour = "FF1052";

print "<h1 style='color:$goodColour;'>This one is OK</h1>";print "<h1 style='color:$badColour;'>This one is BAD</h1>";

?>

<p> here is some text </p>

<?php

print "<h1 style='color:$badColour;'>This one is BAD too</h1>";print "<h1 style='color:$goodColour;'>This one is OK too</h1>";

?>

Page 17: One thing in one place

17

Revision 2 : colorheads3.php<html><style>.good {color:4AFF2F;}.bad {color:FF1052;}</style></head><body>

<h1 class="good">This one is OK</h1><h1 class="bad">This one is BAD</h1><p> here is some text </p>

<h1 class="bad">This one is BAD</h1><h1 class="good">This one is OK</h1></body></html>

Page 18: One thing in one place

18

Common code

• A common piece of code can be put in on place and linked into the programme

• Suppose you have an HTML footer to go on every page (it might include to call Google Analytics for example)

• We want to put this in one file and then include it in the scripts for each of the pages

Page 19: One thing in one place

19

<hr/><p style="font-size: 10pt; color: blue">This page is generated on <?php print date("l, d/m/Y") ?> at <?php print date("H:i") ?> from a script written by Chris Wallace.</p>

Sample page (page1.php)

<html><head><title>First Page</title></head><body><p style="color:silver; font-size:15pt;">This is the main body of page 1 which is linked to <a href="page2.php">page 2</a></p><?php include("footer.php") ?></body></html>

Footer code (footer.php)

php function to include php script from another file

Page 20: One thing in one place

20

Modularisation

• Modularisation as a common pattern

• In PHP there are several slightly different ways to include code:• include(), include-once()• require(), require-once()

• Since PHP is a scripting language, code can’t be pre-compiled as in Java.

Page 21: One thing in one place

21

Functions

• Common code but for a few variations

• Variations can be expressed as parameters

• Create a function

• Call the function where required

Page 22: One thing in one place

22

Function declaration and use makefooter.php

<?php

// define the function function footer($author, $size) {return "<hr/><p style='font-size: {$size}pt; color: blue'>This page is generated on ". date("l, d/m/Y")." at ".date("H:i"). " from a script written by $author.</p>";}

//call the functionprint footer("Chris Wallace", 20);

//call with different parametersprint footer("Mark Butler", 30);

?>

Two parameters

{ } means evaluate

expression

Can’t write

$sizept

. Dot for concatenate

strings

Page 23: One thing in one place

23

Functions and Modules

• Common to combine these two ideas

• Place the function definitions in a separate file

• Include file in every program which needs these functions

• Call the function wherever required.

• E.g. page3.php

Page 24: One thing in one place

24

<html><head><title>Third Page</title></head><?php include("functs.php") ?><body><p style="color:silver; font-size:15pt;">This is the main body of page 3.</a></p><?php print footer("Chris Wallace", 25) ?></body></html>

<?php// generate a footer function footer($author, $size) {return "<hr/><p style='font-size: {$size}pt; color: blue'>This page is generated on ". date("l, d/m/Y")." at ".date("H:i"). " from a script written by $author.</p>";}?>

Function library functs.php

Use of function

Page 25: One thing in one place

25

Code normalisation : Summary

• Breaking a programming task cleanly into into simpler, re-useable units

• The sign of a good programmer• Good structure is arrived at iteratively

– You notice common parts as the code is written– You re-factor the code to put the common parts in

one place

• Avoid merely copying code because this creates update anomolies – an error has to be corrected in all copies

Page 26: One thing in one place

26

One thing in one place• An idea which is common to both data and code normalisation

– In database it means splitting big data sets into smaller, independent and reusable ones with no redundancy

– In programming it means splitting a large amount of code into smaller reusable items – variables, modules, functions and classes

• Where else do we see this principle?– A class or type separates the fixed parts of a set of things (every

employee has a name) from the variable (Empno ‘1111’ has name ‘Fred’)

– In learning, it is about identifying common themes or patterns, separating the general from the particular :

• String concept implemented differently in each language• Code and Annotations in one script• Several Languages in one script

Page 27: One thing in one place

27

Tradeoffs and judgement

• Normalisation requires judgement– Are these two things really the same?– Will they always be the same?

• Normalisation incurs a performance penalty – because the separate bits have to be put back together in execution – joins in a database, calls in code

• Data and code design involves trade-offs

Page 28: One thing in one place

28

Hints for debugging PHP

• Use the isa.cems.uwe.ac.uk host to show errors

• Add additional print statements– Comment out in case needed in the future

• Use print_r($array) to print a complete array (such as a row returned from the database)