WEEK 2: INTRODUCING THE RELATIONAL APPROACH TO DATA

Data organisation and some basic relational ideas

In this week, we will cover the following topics:

• Data: that you can design the structure of data, including the relationships between data.
• The importance of being able to uniquely identify a datum within a set of data.
• How to accumulate relationships between data, as data, stored in tables.
• A brief look at a real-world set of data known as Code-Point Open, and its limitations.
• ‘Programming’: the distinction between declarative programming and algorithmic programming.

…and will result in the following learning outcomes:

• An initial appreciation that it is good to be systematic in how data is represented.
• That data ‘keys’ allow us to access a specific datum.
• Knowledge that tables can be used to store data and relationships between data.
• That real-world data is available, but not necessarily perfectly organised.
• A feeling for the kind of programming relevant for database interactions.

Table of Contents

2.1 Designing for data
  2.1.1 Relationships between data/datum
  2.1.2 Two initial problems
2.2 Relational solutions to the problems
  2.2.1 Unique identification
  2.2.2 Accumulating relational data
2.3 A real-world, file-based dataset: Code-Point® Open
  2.3.1 A critique of Code-Point® Open
2.4 ‘Declarative’ programming v ‘Algorithmic’ programming
2.5 Summary

Figure 1: Basic relationships seen both ways
Figure 2: Website mock-up examples for ‘new customer’ website form
Figure 3: A view of what a relational table represents
Figure 4: Structure of Code-Point Open directories
Figure 5: The GCU postcode

Table 1: Accumulated customers
Table 2: Customer table version B – non-unique names
Table 3: Products
Table 4: Customer table version C
Table 5: Some key definitions regarding relational databases
Table 6: Ordered table
Table 7: Non-verbose Code-Point® Open header/column fields
Table 8: Verbose Code-Point® Open header/column fields
Table 9: Extract from file <codepo_gb/Data/CSV/g.csv>
Table 10: Application program pseudocode to retrieve data from file
Table 11: Code to retrieve data from file v database declaration

2.1 Designing for data

2.1.1 Relationships between data/datum

During week 1, we discussed the idea of a ‘legacy’ data management system, seen as a consequence of there being no enforced rules for data entries. The order (we have seen that this has important aspects to it concerning the actual delivery) has other important information attached to it. Namely, the order relates a product to a customer. We might think of this situation, then, as two physical entities (Product and Customer) being connected by an ‘event’, the ‘order’:

• Customer ‘ordered’ Product.

• Product ‘ordered by’ Customer.

Figure 1: Basic relationships seen both ways

Consider an associated data entry example, not for data entered into a spreadsheet this time, but through a website; here we might imagine a retailer website form designed to receive data concerning the customer.

Figure 2: Website mock-up examples for ‘new customer’ website form.

We present two choices (see Figure 2). The first choice is designed to receive 5 entries, i.e., First Name, Surname, Address, E-mail, and Tel (short for telephone number). The second choice is designed to receive 6 entries, i.e., First Name, Surname, Address, E-mail, Tel, and Product. We do not need to question the design of the forms themselves. The forms, for our current purposes, merely provide an interface through which data is input to ‘the system’.

Let us take a look at what the actual data might ‘look like’ (in storage), after a company has had a number of new customers added. We will assume that the second form was adopted by the company, the one with all of the customer details in addition to the first customer-chosen Product.

First_Name  Surname  Address      E-mail          Tel      Product
John        Davis    Salmon Road  jd@cloudme.com  077**08  Soap Powder
Aileen      McManus  East Avenue  am@lmail.com    0131**9  Table
David       Smith    North Close  jd@wohoo.com    0141**1  Wood Polish
Sarah       Jones    Queen Road   jd@geewizz.com  0131**9  Nails

Table 1: Accumulated customers.

From the point of view of a customer’s first order, maybe this kind of data makes sense? If you look on most retail websites, however, you will find that the customer is required to register an account separately to the order. Nevertheless, as we mentioned, we are not concerned here about the design of the web interface, although the following issues are worth highlighting in relation to the above example:

1. What happens if a customer, named John Davis, registers himself, ordering the product Soap Powder, as above, then three months later somebody else with the same name registers and orders a Bike?

First_Name  Surname  Address      E-mail          Tel      ProductName
John        Davis    Salmon Road  jd@cloudme.com  077**08  Soap Powder
Aileen      McManus  East Avenue  am@lmail.com    0131**9  Table, TV
David       Smith    North Close  jd@wohoo.com    0141**1  Wood Polish
Sarah       Jones    Queen Road   jd@geewizz.com  0131**9  Nails
John        Davis    West Close   jd@cloudme.com  077**08  Bike

Table 2: Customer table version B – non-unique names.

2. How well does the Product column ‘cover’ the relevant data now and in the future?

Point 1: Assuming that this does happen, if one of the ‘John Davis’ customers then orders a Table, how do we know which John Davis this is? Perhaps we can identify them uniquely by their phone number or e-mail address?

Point 2: If we look again at the table, during the time the second John Davis registered, Aileen McManus ordered a TV. Is storing the collection of products in this way acceptable? It certainly does not seem consistent. The original meaning of the Product category was initial Product.

2.1.2 Two initial problems

Although we are now looking at a different example, one involving a hypothetical web-based system, we are stuck in the same sort of file-based, spreadsheet mind-set introduced in week 1, but we have introduced two new problems:

1. The problem of allowing unique access to a set of data.

2. The problem of how to best represent accumulative changes/additions to data, but without allowing structural changes to be made, i.e., without allowing change to the overall structure of the data itself.
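The two problems can be made concrete with a small sketch. It is only an illustration, written in Python rather than in a language used in these notes; the variable names and the use of plain tuples for storage are assumptions made for brevity, and the rows are those of Table 2.

```python
# Rows from Table 2, stored as plain tuples (no unique identifier yet).
customers = [
    ("John", "Davis", "Salmon Road", "jd@cloudme.com", "077**08", "Soap Powder"),
    ("Aileen", "McManus", "East Avenue", "am@lmail.com", "0131**9", "Table, TV"),
    ("David", "Smith", "North Close", "jd@wohoo.com", "0141**1", "Wood Polish"),
    ("Sarah", "Jones", "Queen Road", "jd@geewizz.com", "0131**9", "Nails"),
    ("John", "Davis", "West Close", "jd@cloudme.com", "077**08", "Bike"),
]

# Problem 1: looking a customer up by name does not give unique access.
matches = [row for row in customers if (row[0], row[1]) == ("John", "Davis")]
print(len(matches))  # 2

# Problem 2: a second product has been squeezed into the same field,
# so the product value is no longer a single datum and must be picked apart.
aileen = next(row for row in customers if row[1] == "McManus")
products = aileen[5].split(", ")
print(products)  # ['Table', 'TV']
```

In this sketch nothing stops a third or fourth product being appended to the same field, which is exactly the kind of informal structural change the second problem describes.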

2.2 Relational solutions to the problems

As a way of introducing some of the basic ideas of relational databases, we will now re-design how our data is stored, expanding the examples above, to address our two initial problems. Let us make the data a little bit more believable, even if not necessarily ‘realistic’¹. For example, a retailer is likely to have a larger set of data that relates to a specific Product; what is important from a customer-facing point of view – the name of the product – would then be one column in this larger set of values. Examples of what might be important to the retailer:

• Where the product comes from – the Supplier.
• The unit cost of purchase – PurchaseCost.
• The unit price at point of sale – SalePrice.
• Product code – ProductCode.
• Product category – Category.
• Product name – ProductName.

ProductName  Category    ProductCode  SalePrice  PurchaseCost  Supplier
Soap Powder  Home        02147698     4.50       2.50          15
TV           Electrical  05070012     999.99     650.00        05
Table        Furniture   08608376     299.00     100.00        09
Wood Polish  Home        02447701     2.50       1.00          11
Nails        Hardware    01570627     3.00       50.0          15
Bike         Leisure     09377700     680.00     400.00        155

Table 3: Products

2.2.1 Unique identification

We are now going to re-define Table 2 (Customer table version B). We do this to provide a unique identifier for each customer, which we will call CustomerId, and we maintain all of the other columns as defined previously, in Table 4 (assume that the E-mail entries exist – they are omitted for ease of visualization).

CustomerId  First_Name  Surname  Address      E-mail  Tel      ProductName
0001        John        Davis    Salmon Road  .       077**08  Soap Powder
1298        Aileen      McManus  East Avenue  .       0131**9  Table, TV
0032        David       Smith    North Close  .       0141**1  Wood Polish
0099        Sarah       Jones    Queen Road   .       0131**9  Nails
9834        John        Davis    West Close   .       077**08  Bike

Table 4: Customer table version C

¹ Real-world data is often inappropriate to use in lecture notes and textbooks and most of the time you will be looking at illustrative data only. However, later on in the course, especially in the accompanying tutorials, we will try to use some real data-sets.

You might have noticed a potential problem with the ProductName column in Table 2 and Table 4. Again, there is potential for the products to ‘overlap’ in terms of the ProductName column – the names are, again, too generic to act as unique identifiers. For example, a customer might choose between many different kinds of Soap Powder, so we cannot use such product names as unique identifiers (they are not unique!). There are two rules for unique identifiers:

• They must be unique!
• They should not change value!
  o This is why it is a bad idea to use phone numbers or e-mail addresses as unique identifiers.

Therefore, in order to uniquely identify a customer, we use the CustomerId. In order to uniquely identify a product, we use the ProductCode. The unique identifier is simply a value that is unique to a given row (rows within a given table cannot have the same unique identifier). The ‘unique identifier’, although that phrase is an accurate description, is also referred to as the ‘primary key’ because it provides the primary means of row-wise access to data.

Relational databases: key definitions

• Schema: the entire structure and description of data. Importantly, the schema is thought of, and packaged, as part of the database. In this way a database is ‘self-describing’.
• Primary key: the column attribute of a table, whose row value is unique in the set of rows.
• Table: the axiomatic representational form employed in relational databases.
• Row: accumulative; rows are added as data accumulates.
• Columns: schematic; the columns are fixed by the structure of the table.
• Relations: typically, the ‘links’ made between data. In the area of relational databases, collections of such relations are represented as tables.
• Tuples: a set of data that is ordered according to the columns. Any given row defines a tuple. If the number of columns is n, then each row is an n-tuple. So, each row in Table 4 is a 7-tuple, whereas each row in Table 3 is a 6-tuple.

• Attributes: refer to the set of columns in a table; these are thus the table’s attributes and at the same time the attributes of each row of data.
• Entity type: the specific name given to an entity, which is essentially a collection of columns/tuples etc., associated with that type.

Table 5: Some key definitions regarding relational databases.
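As a rough illustration of the primary key and tuple definitions, the sketch below declares a cut-down Customer table and shows that two customers with the same name remain individually reachable. It is a sketch under stated assumptions: it uses Python with the built-in SQLite database rather than the database system used later in the course, it keeps only four of the seven columns of Table 4, and the ‘Jane Doe’ row is a hypothetical entry invented purely to trigger the duplicate-key error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Customer (
        CustomerId TEXT PRIMARY KEY,   -- the unique identifier
        First_Name TEXT,
        Surname    TEXT,
        Address    TEXT
    )
    """
)
rows = [
    ("0001", "John", "Davis", "Salmon Road"),
    ("9834", "John", "Davis", "West Close"),
]
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", rows)

# Two customers share a name, yet each is reachable through its key.
second_john = conn.execute(
    "SELECT * FROM Customer WHERE CustomerId = ?", ("9834",)
).fetchone()
print(second_john)       # one value per column
print(len(second_john))  # 4 columns, so each row is a 4-tuple

# Re-using an existing key is rejected by the database.
try:
    conn.execute(
        "INSERT INTO Customer VALUES ('0001', 'Jane', 'Doe', 'Elm Road')"  # hypothetical row
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The uniqueness rule is enforced by the table definition itself, not by the application code, which is one sense in which the schema makes the database ‘self-describing’.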

2.2.2 Accumulating relational data

In the previous subsection we solved the problem of unique identification of data by defining, and giving an example of, the use of primary keys. Now we will show how to store relations between data in the style of the relational approach.

CustomerId  ProductCode  Date
0001        02147698     02-01-2016
9834        09377700     02-02-2016
0032        02447701     23-02-2016
1298        08608376     05-03-2016
1298        05070012     05-03-2016
0099        01570627     21-03-2016
0032        08608376     19-03-2016
0032        01570627     19-03-2016
.           .            .

Table 6: Ordered table.

Note that the second problem of storing numerous products per customer is also easily solved by this table; we simply add a new row each time an order is placed – no need to add the new product to the same row as other products. Accumulative changes are represented without structural changes being made to the table, such as adding a new column for a new product.

In Figure 3 we present (on the left) a representation of the relationships specified by Table 6, along with the tables that contain the primary keys. Notice that the circles in Figure 3 contain the values of attributes pertaining to the primary keys. In other words, we have chosen a reader-friendly attribute of each row indicated by the primary key, rather than representing the primary keys themselves. This is a little bit like saying:

• Select from the Customer table the First Name and Surname according to the CustomerIds in the Order table.

Figure 3: A view of what a relational table represents.
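The idea of accumulating orders as rows, and then swapping each key for a reader-friendly attribute, can be sketched as follows. This is an illustrative sketch only: it uses Python with the built-in SQLite database as a stand-in, it loads just a few rows from Tables 3, 4 and 6, and the table name Orders is an assumption (chosen because ORDER is itself an SQL keyword).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (CustomerId TEXT PRIMARY KEY, First_Name TEXT, Surname TEXT);
    CREATE TABLE Product  (ProductCode TEXT PRIMARY KEY, ProductName TEXT);
    CREATE TABLE Orders   (CustomerId TEXT, ProductCode TEXT, Date TEXT);
    """
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [("1298", "Aileen", "McManus"), ("0032", "David", "Smith")],
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [("08608376", "Table"), ("05070012", "TV"), ("02447701", "Wood Polish")],
)
# Each order is one new row; the structure of the table never changes.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        ("0032", "02447701", "23-02-2016"),
        ("1298", "08608376", "05-03-2016"),
        ("1298", "05070012", "05-03-2016"),
    ],
)

# Replace each key in Orders with a readable attribute of the row it identifies.
readable = conn.execute(
    """
    SELECT c.First_Name, c.Surname, p.ProductName, o.Date
    FROM Orders AS o
    JOIN Customer AS c ON c.CustomerId = o.CustomerId
    JOIN Product  AS p ON p.ProductCode = o.ProductCode
    ORDER BY o.rowid
    """
).fetchall()
for row in readable:
    print(row)
```

Aileen McManus appears twice in the result, once per order, without any change to the Customer table or to the columns of the Orders table.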

2.3 A real-world, file-based dataset: Code-Point® Open

Instead of working with ‘fake’ data for illustrative purposes, in this section we are going to take a look at a real-world dataset. The dataset is the set of UK postcode data known as Code-Point® Open (Ordnance Survey, 2015). Postcodes are grouped combinations of numbers and letters, which are associated with a postal area. An example postcode is:

G40BA

The ‘G’ part of the postcode stands for Glasgow. The ‘4’ part of the postcode represents an area in Glasgow, and the 0 represents an area within area ‘4’. So, as we read across the string G40BA we effectively ‘zoom in’ to quite a small region. Postcodes are used to deliver letters, and are used to sort mail, to make the delivery of mail more efficient. Imagine if a postman were to randomly deliver letters in the UK. They would, for example, deliver a letter in Birmingham, followed by one in Aberdeen, some 680 kilometers North! So, postcodes help sort mail, which in turn helps organize its efficient delivery. A list of the general postcode areas is available on wikipedia.org:

https://en.wikipedia.org/wiki/List_of_postcode_districts_in_the_United_Kingdom

We will see later in the course that postcodes, and the associated data within Code-Point® Open, along with some other technology we introduce later, can be useful for other things, too. For now, though, we want to focus on the data, to see how it is structured and to see what we might want to do to put it into a database.

Firstly, Code-Point® Open can be downloaded from:

https://www.ordnancesurvey.co.uk/opendatadownload/products.html

Code-Point® Open (Ordnance Survey, 2015) is an almost-comprehensive list of the entire set of postcodes in the UK. That is, it has the whole of the UK’s postcodes listed, excluding Northern Ireland. The database contains approximately 1.7 million UK postcodes, with some 25 million adjoining addresses; therefore, each separate postcode serves about 15 postal addresses, on average (15 x 1.7 = approx. 25).

Figure 4: Structure of Code-Point Open directories.

The data is distributed by the Ordnance Survey as a set of files. In Figure 4, we present the structure of the folders and where the data files etc. can be found. The top-level folder is <codepo_gb>, which contains two subfolders, <Doc> and <Data>, and inside <Data> we have the folder <CSV>. We are only interested in the ‘.csv’ files. The ‘.csv’ postfix stands for comma separated values, and ‘.csv’ files are typically text files, which can be opened in any standard text editor. Most of the files are contained in the <CSV> directory:

• codepo_gb/Data/CSV/*.csv: this set of 120 files (the three asterisks ‘*’ in Figure 4 hide 102 files that fit, alphabetically, between <ca.csv> and <wn.csv>) contains the actual data.

But there is another .csv file here:

• codepo_gb/Doc/Code-Point_Open_Column_Headers.csv: this single file defines the headers to the data.

Given the structure of some of the public data that is available (which can be very messy), we should be reasonably happy with this file-based organization of Code-Point® Open. Of course, there is the problem that the data is stored in files, and if we wanted to create a program that needed to use postcodes, the files would take a long time to read into memory. However, the files are relatively well organised.

PC  PQ  EA  NO  CY  RH  LH  CC  DC  WC

Table 7: Non-verbose Code-Point® Open header/column fields.

Postcode  Positional_quality_indicator  Eastings  Northings  CY  RH  LH  CC  DC  WC

Table 8: Verbose Code-Point® Open header/column fields.

Let us take a closer look at the header file. This file contains 10 ‘columns’ and 2 ‘rows’. In the text file, the columns are coded by the insertion of a comma ‘,’. The rows are just specified by being placed on separate lines, using the <Enter> key on the keyboard. The way ‘enter’ is coded in a text file is by using ‘\n’ (this is not necessarily visible in text editors), which is otherwise known as the newline (or line feed) character; on some systems it is preceded by a carriage return, ‘\r’. The header/column fields are presented in a relatively non-verbose way in Table 7, whereas Table 8 contains relatively verbose labels. Don’t worry at the moment what these fields mean – we are not even interested in knowing what all of this means, but we will look at the most important (from our point of view) fields in an example of the data. So, the first column contains the actual postcode. Then, there is something known as the positional quality indicator, followed by entries for the eastings and northings. Let us take a look at an extract from a data file, seen in Table 9.

"G40AJ",10,260044,665214,"S92000003","","S08000021","","S12000046","S13002649"
"G40AL",10,260044,665214,"S92000003","","S08000021","","S12000046","S13002649"
"G40AN",10,259725,664965,"S92000003","","S08000021","","S12000046","S13002649"
"G40AP",10,260044,665214,"S92000003","","S08000021","","S12000046","S13002649"
"G40AQ",10,260044,665214,"S92000003","","S08000021","","S12000046","S13002649"
"G40AX",10,259349,666274,"S92000003","","S08000021","","S12000046","S13002650"
"G40BA",10,259296,666080,"S92000003","","S08000021","","S12000046","S13002650"

Table 9: Extract from file <codepo_gb/Data/CSV/g.csv>

Look at how the designers of the data have decided to represent different types of data within each column. They define a String type in the file by enclosing it within double quotation marks “”. For example, the right-hand column in the final row (S13002650) is a String type by virtue of the fact that it is represented between the commas as “S13002650”. On the other hand, numbers are stored as is, without being enclosed in quotes.

Let us take a look at some example numbers and what they represent, taking the final line of the extract (highlighted in yellow in the original) as an example. This row is the entry for the G40BA postcode, mentioned above, which is the postcode for Glasgow Caledonian University. The ‘positional quality indicator’ (column 2) has a value of 10, the ‘easting’ (column 3) a value of 259296, and the ‘northing’ (column 4) a value of 666080. Eastings and northings are a form of geographical coordinate. Therefore, a location on the surface of the earth can be found with an <easting, northing> pair.

So let us check how accurate the eastings and northings data is. Perhaps we can use a leading maps provider to do this? Not necessarily! If we plug these into Google Maps we will get an error, because Google Maps (and many other web services, come to that) represent geo-coordinates differently, using latitude and longitude. Nevertheless, it is possible to check the information by using a web site (http://www.gridreferencefinder.com) coded to handle <easting, northing>-style coordinates. If you go to this site and enter the values, a pin will be dropped inside the boundary of the GCU campus! Indeed, as we see in Figure 5, the coordinate <259296, 666080> is situated at a specific point within the campus, actually just next to the Saltire Centre, which is one of the university’s focal points for learning.

Figure 5: The GCU postcode.
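The quoting convention described above (strings in double quotes, numbers stored as is) can be read back mechanically. The following is a minimal sketch, assuming Python and its standard csv module rather than any tool used in the course; the row is the G40BA line copied from Table 9, and the variable names are illustrative.

```python
import csv
import io

# The final row of Table 9, exactly as it appears in the extract.
line = '"G40BA",10,259296,666080,"S92000003","","S08000021","","S12000046","S13002650"\n'

# QUOTE_NONNUMERIC tells the reader that unquoted fields are numbers
# (it converts them to float) and that quoted fields are strings.
reader = csv.reader(io.StringIO(line), quoting=csv.QUOTE_NONNUMERIC)
row = next(reader)

postcode = row[0]
easting, northing = int(row[2]), int(row[3])
print(postcode, easting, northing)
print([type(value).__name__ for value in row[:4]])
```

Note that the type of each field is inferred here from the punctuation inside the data file itself; nothing outside the file declares that column 3 holds a number.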

2.3.1 A critique of Code-Point® Open

We are not interested here in critiquing the actual raw data as a resource. As we have mentioned, the resource is very valuable in itself. We do, however, want to briefly reflect on: the folder/file structure of the resource, and the structure of the raw data itself. Strictly there is no ‘right’ way to structure a data resource, but you should try and notice some inconsistencies with the organization of the files. Take a look again at Figure 4. For example:

1. Reflecting on the folder/file structure of the resource:

   a. The file <codepo_gb/Doc/Code-Point_Open_Column_Headers.csv> contains the headings of the data. Should this not be part of the data itself, perhaps kept inside the codepo_gb/Data folder somewhere? Furthermore, this file has a ‘.csv’ extension. Why, then, is it not in the CSV directory?

   b. Metadata is data about data. It is, therefore, data. Why is codepo_gb/Doc/metadata.txt not in the Data directory then?

   c. All of the files in the Data directory are ‘.csv’ type files. Is the naming of this folder redundant?

2. The structure of the raw data itself:

   a. As mentioned, Strings require additional coding (“String”) within the data file itself! Shouldn’t the type of the data be declared outside of the actual raw data entries? The answer to that is yes.

   b. How do we write an application program to get a specific data entry, or set of entries, out of a file, or set of files? And how do we do this quickly? The answer to this is that it can be quite cumbersome to code access to specific values. The algorithms required to do this need coding in the application. Furthermore, reading files from disk is inherently a slow procedure.

Many of the issues we mention here, concerning file-based representations, simply seem to disappear when using databases because we are working within the well-defined constraints of a database.

2.4 ‘Declarative’ programming v ‘Algorithmic’ programming

Before we finish this week, we want to pick up on point 2b, and on the fact that in writing an application program we would need to write algorithms that provide access to the data, or use someone else’s code that does this. Table 10 provides a very simple algorithm, and one that would potentially be very slow, to access the easting and northing data from the file <g.csv>. Table 11 provides the equivalent database-style declaration designed to retrieve the same data from a database. An important point to note is that Table 10 is a highly simplified set of pseudocode, whereas Table 11 is a full MySQL statement which, if you ran it within a database environment holding such a table, would actually work. We will see later on in the course, using a programming language known as Java™, that within application programs you can run SQL statements. Rather than writing application code to access a file-based system, running SQL statements/queries exposes all of the advantages of databases to the application program.

Program style code

    // Create a file instance in memory from g.csv
    File f = new File("g.csv");
    // Open the file g.csv
    f.open();
    // Access line G40BA
    for each line in the file {
        if (line element 1 is equal to G40BA) {
            targetLine = current_line
        }
    }
    // Get the information you want from that line
    Easting e = targetLine.getElement(3);
    Northing n = targetLine.getElement(4);

Table 10: Application program pseudocode to retrieve data from file.

Database style declaration

    -- Select required data from the database
    SELECT Eastings, Northings FROM `database` WHERE Postcode LIKE 'G40BA';

Table 11: Code to retrieve data from file v database declaration.
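To see the two styles side by side in runnable form, the sketch below implements both lookups for the same postcode. It is a sketch under stated assumptions rather than the course’s own code: it is written in Python, it uses the built-in SQLite database instead of MySQL, three rows from Table 9 stand in for the file g.csv, and the table name Postcodes and both function names are hypothetical.

```python
import csv
import io
import sqlite3

# Three rows from Table 9 stand in for the file g.csv.
G_CSV = (
    '"G40AJ",10,260044,665214,"S92000003","","S08000021","","S12000046","S13002649"\n'
    '"G40AX",10,259349,666274,"S92000003","","S08000021","","S12000046","S13002650"\n'
    '"G40BA",10,259296,666080,"S92000003","","S08000021","","S12000046","S13002650"\n'
)


def find_algorithmic(text, postcode):
    """Scan line by line until the postcode is found (the style of Table 10)."""
    for fields in csv.reader(io.StringIO(text)):
        if fields[0] == postcode:
            return int(fields[2]), int(fields[3])
    return None


def find_declarative(conn, postcode):
    """State which data is wanted and let the database find it (the style of Table 11)."""
    return conn.execute(
        "SELECT Eastings, Northings FROM Postcodes WHERE Postcode = ?",
        (postcode,),
    ).fetchone()


# Load the same rows into an in-memory database (hypothetical table name).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Postcodes (Postcode TEXT PRIMARY KEY, Eastings INTEGER, Northings INTEGER)"
)
conn.executemany(
    "INSERT INTO Postcodes VALUES (?, ?, ?)",
    [(f[0], int(f[2]), int(f[3])) for f in csv.reader(io.StringIO(G_CSV))],
)

print(find_algorithmic(G_CSV, "G40BA"))
print(find_declarative(conn, "G40BA"))
```

Both functions return the same pair of coordinates. The difference lies in who decides how the search is carried out: the loop in the first function is the programmer’s responsibility, whereas the second only states the condition the row must satisfy.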

What is the basic difference between algorithmic programming and declarative programming?

Algorithmic programming: here the control (and therefore the responsibility) and the design of the algorithms are in the hands of the programmer. All of the steps need defining and implementing in code. This can be a highly rewarding process as the programmer designs code for re-use, or speed, or some trade-off between the two.

Declarative programming: this is not ‘programming’ in the same sense. A declarative ‘programming’ language states which data is needed, from which database. The enjoyment of working with databases is often related to the comparative simplicity of retrieving data from a database as compared with having to access it from a file-based system, but is also related to the process of actually designing the structure of the data itself.

Neither form of programming is in any sense better than the other. As we have suggested, the two styles of programming are absolutely complementary.

[…] own creativity and resourcefulness. You might not yet be capable of implementing your ideas but, if you work hard, this will come with time.

2.5 Summary

• Database development should not just be undertaken by jumping straight into creating a database. Important processes of design must take place first.

• We introduced the relational approach and listed some basic relational database definitions.

• We then considered a real-world file-based dataset known as Code-Point Open. The reason we did this was to develop an appreciation for an existing file-based system, and the often-used comma separated values (.csv) format.

• We then considered how we might write a program to access the data within such a system.

• This led us to the important distinction between declarative programming and algorithmic programming. Query languages are very much based on the idea of declarative programming.
