Btree project4 discussionpages.cs.wisc.edu/~jignesh/cs564/notes/BadgerDB_Btree_project.pdf ·...

Preview:

Citation preview

BadgerDB::Btree• Goal:BuildkeycomponentsofaRDBMS

– Firsthandexperiencebuildingtheinternalsofasimpledatabasesystem

– Andhavesomefundoingso!

• Twoparts– Buffermanager[✔]– B+tree(DueDate:Mar27by2PM)

• FirstclassdayaftertheSpringbreak

Allprojectsareindividualassignments

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 1

StructureofDatabase

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 2

Queryoptimizerandexecution

Relationaloperators

Fileandaccessmethods

Buffermanager

I/Omanager

Planfortoday

• ReviewofC++templatesandhelpfulfunctions– memset,memcpy andreinterpret_cast

• B+tree:insertion• <break>• BadgerDB::Btree

– Projectspecifications– Code

• Q&A

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 3

C++templates

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 4

Humanmiseryofduplicatecodes

• SupposeyouwriteafunctionprintData:

• lateryouwanttoprintdoubleandstd::string

void printData(int value) {std::cout << "The value is "<< value;

}

void printData(double value) {std::cout << "The value is "<< value;

}

void printData(std::string value) { std::cout << "The value is "<< value;

}2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 5

AndStroustrup said- lettherebetemplates

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 6

template<typename T>void printData(T value) {

std::cout << "The value is ” << value;}

AndStroustrup said- lettherebetemplates

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 7

template<typename T>void printData(T value) {

std::cout << "The value is ” << value;}

Templatesemantics

• Thesyntaxissimple:template< typename name ó class name >

• Functiontemplates• Classtemplates

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 8

Functiontemplates

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 9

template<typename T>void func() {}

int main() { func<int>();func<double>();

}

Functiontemplates

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 10

template<typename T>void func() {}

int main() { func<int>();func<double>();

}

template<typename T>void func(T value) {} template<typename T, typename U>T func2(U value) {

return T(value); }

int main() { // T=intfunc(3);// T=doublefunc(3.5);

// T=int, U=double func2(3.5);// T=std::vector, U=intfunc2<std::vector> (5); // specify both T and U // T=std::vector, U=intfunc2<std::vector, int>(5.7);

}

Classtemplates• Alsoworksonstructs

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 11

template<typename T, int i>struct FixedArray {

T data[i]; };

FixedArray<int, 3> a; // array of 3 integers

Classtemplates• Alsoworksonstructs

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 12

template<typename T, int i>struct FixedArray {

T data[i]; };

FixedArray<int, 3> a; // array of 3 integers

template<typename T>class MyClass { };

template<typename T1, typename T2=int> class MyClass{};

// specify all parameters MyClass<double, std::string> mc1;

// default value for T2MyClass<int> mc4;

Templaterequirements• Templatesimplicitlyimposerequirementsontheirparameters

• TypeThastobe:– Copy-Constructible if

T a(b);– Assignablei.e. definesoperator=() if:

a=b;– etc

• Forthisproject:operationssuchasa < b couldmeandifferentforint andstd::string

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 13

Pointersandarrays

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 14

Pointersandarrays

• arraysworkverymuchlikepointerstotheirfirstelementsint myarray [20];int * mypointer;

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 15

Pointersandarrays

• arraysworkverymuchlikepointerstotheirfirstelementsint myarray [20];int * mypointer;

Canyoudo?mypointer = myarray;

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 16

Pointersandarrays• arraysworkverymuchlikepointerstotheirfirstelementsint myarray [20];int * mypointer;

Canyoudo?mypointer = myarray;

Can you do?myarray = mypointer;

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 17

Pointersandarrays• arraysworkverymuchlikepointerstotheirfirstelementsint myarray [20];int * mypointer;

Canyoudo?mypointer = myarray; ç Yes

Can you do?myarray = mypointer; ç No

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 18

Example// more pointers#include <iostream>using namespace std;int main () {

int numbers[5];int * p;p = numbers; *p = 10;p++; *p = 20;p = &numbers[2]; *p = 30;p = numbers + 3; *p = 40; p = numbers; *(p+4) = 50;for (int n=0; n<5; n++)

cout << numbers[n] << ", ";return 0;

}2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 19

Example// more pointers#include <iostream>using namespace std;int main () {

int numbers[5];int * p;p = numbers; *p = 10;p++; *p = 20;p = &numbers[2]; *p = 30;p = numbers + 3; *p = 40; p = numbers; *(p+4) = 50;for (int n=0; n<5; n++)

cout << numbers[n] << ", ";return 0;

}

Prints: 10, 20, 30, 40, 50,

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 20

Pointersandstringliteral

• const char*foo="hello";

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 21

Pointersandstringliteral

• const char*foo="hello";

• Foocontainsthevalue1702,andnot'h',nor"hello“• Whatistheoutputof?

*(foo+4)foo[4]

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 22

C/C++helpfulfuctions

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 23

memset• void * memset ( void * ptr, int value, size_t num );• Fillblockofmemory

– Setsthefirst num bytesoftheblockofmemorypointedby ptr tothespecified value (interpretedasan unsignedchar)

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 24

int main () { char str[] = "almost every programmer should know memset!"; memset (str,'-',6); print(str);return 0;

}

memcpy• void*memcpy (void*dest,const void*source,size_t num );• Copyblockofmemory

– Copiesthevaluesof num bytesfromthelocationpointedtoby source directlytothememoryblockpointedtoby destination.

• std::memcpy ismeanttobethefastestlibraryroutineformemory-to-memorycopy(usuallymoreefficientthan std::strcpy)

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 25

struct { char name[40]; int age; } person, person_copy;

int main () { char myname[] = ”Bucky Badger"; /* using memcpy to copy string: */memcpy ( person.name, myname, strlen(myname)+1 ); /* using memcpy to copy structure: */ memcpy ( &person_copy, &person, sizeof(person) ); return 0;

}

reinterpret_cast<new_type>(expr)• reinterpret_cast convertsanypointertypetoanyotherpointer

type,evenofunrelatedclasses.• Allpointerconversionsareallowed:neitherthecontentpointed

northepointertypeitselfischecked.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 26

int main () { int i = 7;int* p1 = reinterpret_cast<int*>(&i);assert(p1 == &i);// type aliasing through pointerchar* p2 = reinterpret_cast<char*>(&i);// type aliasing through referencereinterpret_cast<unsigned int&>(i) = 42; std::cout << i << '\n';

}

B+tree

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 27

B+tree

• Occupancy(d)– Minimum50%occupancy(exceptforroot)– Eachnodecontainsd<=m<=2dentries.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 281/29/17 CS 564: Database Management Systems, Jignesh M. Patel 5

(Ubiquitous)B+Tree• Height-balanced(dynamic)treestructure• Insert/deleteatlogF Ncost(F=fanout,N=#leafpages)• Minimum50%occupancy(exceptforroot).

Eachnodecontainsd <=m <=2d entries.Theparameterd iscalledtheorder ofthetree.

• Supportsequalityandrange-searchesefficiently.

Index Entries(Direct search)

Data Entries

Data EntriesEntries in the leaf pages:

(search key value, recordid)

Index EntriesEntries in the index (i.e. non-leaf) pages:

(search key value, pageid)

1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 5

(Ubiquitous)B+Tree• Height-balanced(dynamic)treestructure• Insert/deleteatlogF Ncost(F=fanout,N=#leafpages)• Minimum50%occupancy(exceptforroot).

Eachnodecontainsd <=m <=2d entries.Theparameterd iscalledtheorder ofthetree.

• Supportsequalityandrange-searchesefficiently.

Index Entries(Direct search)

Data Entries

Data EntriesEntries in the leaf pages:

(search key value, recordid)

Index EntriesEntries in the index (i.e. non-leaf) pages:

(search key value, pageid)

1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 5

(Ubiquitous)B+Tree• Height-balanced(dynamic)treestructure• Insert/deleteatlogF Ncost(F=fanout,N=#leafpages)• Minimum50%occupancy(exceptforroot).

Eachnodecontainsd <=m <=2d entries.Theparameterd iscalledtheorder ofthetree.

• Supportsequalityandrange-searchesefficiently.

Index Entries(Direct search)

Data Entries

Data EntriesEntries in the leaf pages:

(search key value, recordid)

Index EntriesEntries in the index (i.e. non-leaf) pages:

(search key value, pageid)

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 291/29/17 CS 564: Database Management Systems, Jignesh M. Patel 9

B+-Tree:InsertingaDataEntry• FindcorrectleafL.• PutdataentryontoL.

– IfLhasenoughspace,done!– Else,mustsplit L(intoLandanewnodeL2)

• Redistributeentriesevenly,copyup middlekey.• InsertindexentrypointingtoL2intoparentofL.

• Thiscanhappenrecursively– Tosplitnon-leafnode,redistributeentriesevenly,but

pushingup themiddlekey.(Contrastwithleafsplits.)• Splits“grow”tree;rootsplitincreasesheight.

– Treegrowth:getswider oroneleveltallerattop.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 301/29/17 CS 564: Database Management Systems, Jignesh M. Patel 10

Inserting8*intoB+TreeRoot

17 24 30

2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

13

Entry to be inserted in parent nodeCopied up (and continues to appear in the leaf)

2* 3* 5* 7* 8*

5

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 311/29/17 CS 564: Database Management Systems, Jignesh M. Patel 10

Inserting8*intoB+TreeRoot

17 24 30

2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

13

Entry to be inserted in parent nodeCopied up (and continues to appear in the leaf)

2* 3* 5* 7* 8*

5

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 32

1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 11

Inserting8*intoB+Tree

Insert in parent node.Pushed up (and only appears once in the index)

5 24 30

17

13

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 331/29/17 CS 564: Database Management Systems, Jignesh M. Patel 12

2* 3*

Root17

24 30

14*16* 19*20*22* 24*27*29* 33*34*38*39*

135

7*5* 8*

Inserting8*intoB+Tree

• Rootwassplit:heightincreasesby1• Couldavoidsplitbyre-distributingentrieswithasibling

– Sibling:immediatelytoleftorright,andsameparent

5minsbreak

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 34

https://www.youtube.com/watch?v=AxSdWhkMB_A

BadgerDB:Btree

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 35

Afteryouuntar theproject:• btree.h:Addyourownmethodsandstructuresasyouseefitbut

don’tmodifythepublicmethodsthatwehavespecified.• btree.cpp:Implementthemethodswespecifiedandanyothers

youchoosetoadd.• file.h(cpp):ImplementsthePageFile andBlobFile classes.• main.cpp:Usetotestyourimplementation.Addyourowntests

hereorinaseparatefile.ThisfilehascodetoshowhowtousetheFileScan andBTreeIdnex classes.

• page.h(cpp):ImplementsthePageclass.• buffer.h(cpp),bufHashTbl.h(cpp):Implementationofthebuffer

manager.• Exceptions/*:Implementationofexceptionclassesthatyou

mightneed.• Makefile – makefile forthisproject.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 36

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 371/29/17 CS 564: Database Management Systems, Jignesh M. Patel 7

B+-treePageFormatLe

af P

age

R1 K 1 R2 K 2 K n P n+1

data entries

record 1 record 2

Next PagePointer

Rn

record n

P0

Prev PagePointer

Non

-leaf

Pa

geP1 K 1 P 2 K 2 P 3 K m P m+1

index entries

Pointer to apage with Values < K1

Pointer to a pagewith values s.t.K1≤ Values < K2

Pointer to apage with values ≥Km

Pointer to a pagewith values s.t., K2≤ Values < K3

Pm

Index• theindex willstoredataentriesintheform<key, rid> pair• storedinafilethatisseparatefromthedatafile• i.e.theindexfile“pointsto”thedatafilewheretheactualrecords

arestored

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 38

• storesalltherelations(actualdata)aswedidinthebuffermanagerassignment

• Youdon’tactuallyusethisoneforthisproject

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 39

File

PageFile BlobFile• pagesinthefilearenotlinkedby

prevPage/nextPage links• treatsthepagesasblobsof8KB

sizei.e doesnotrequirethesepagestobevalidobjectsofthePageclass

• usetheBlobFile tostoretheB+indexfile

• everypageinthefileisanodefromtheB+tree

• wecanmodifythesepagestosuittheparticularneedsoftheB+treeindex

FileScan class• TheFileScan classisusedtoscanrecordsinafile.• FileScan(const std::string&relationName,BufMgr *bufMgr)

– TheconstructortakestherelationName andbuffermanagerinstance• ~FileScan()

– Shutsdown thescanandunpinsanypinnedpages.• void scanNext(RecordId& outRid)

– Returns(viatheoutRid parameter)theRecordId ofthenextrecordfromtherelationbeingscanned.ItthrowsEndOfFileException()whentheendofrelationisreached.

• std::string getRecord() – Returnsapointertothe“current”record.Therecordistheoneina

precedingscanNext()call.• void markDirty()

– Youdon’tneedthisforthisassignment

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 40

BadgerDB:B+Tree IndexSimplifications:• assumethatallrecordsinafilehavethesamelength(soforagivenattributeitsoffsetintherecordisalwaysthesame).

• onlyneedstosupportsingle-attributeindexing• theindexedattributemaybeoneofthreedatatypes:integer,double,orstring

• inthecaseofastring,youcanusethefirst10charactersasthekeyintheB+-tree

• wewillneverinserttwodataentriesintotheindexwiththesamekeyvalue2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 41

B+Tree Index:Constructor

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 42

const string&relationName

Thenameoftherelationonwhichtobuildtheindex.Theconstructorshouldscanthisrelation(usingFileScan)andinsertentriesforallthetuplesinthisrelationintotheindex

String&outIndexName

Thenameoftheindexfile;determinethisnameintheconstructorasshownabove,andreturnthename.

BufMgr *bufMgrIn Theinstanceoftheglobalbuffermanager.

const intattrByteOffset

Thebyteoffsetoftheattributeinthetupleonwhichtobuildtheindex.Forinstance,ifwearestoringthefollowingstructureasarecordintheoriginalrelation:

And,wearebuildingtheindexoverthedoubled,thentheattrByteOffset valueis0+offsetof(RECORD,i),whereoffsetof istheoffsetpositionprovidedbythestandardC++library“offsetoff”.

const DatatypeattrType

Thedatatypeoftheattributeweareindexing.NotethattheDatatypeenumeration{INTEGER,DOUBLE,STRING}isdefinedinbtree.h

If the index file already exists, open the file. Else, create a new index fileParameters:

B+Tree Index:insertEntry

const void*key Apointertothevalue(integer/double/string)wewanttoinsert.

const RecordId &rid Thecorrespondingrecordidofthetupleinthebaserelation.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 43

insertEntryinserts a new entry into the index using the pair <key, rid>.Input to this function:

B+Tree Index:insertEntry

const void*key Apointertothevalue(integer/double/string)wewanttoinsert.

const RecordId &rid Thecorrespondingrecordidofthetupleinthebaserelation.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 44

insertEntryinserts a new entry into the index using the pair <key, rid>.Input to this function:

You will be spending bulk of your time /code in this method

B+Tree Index:startScan

const void*lowValue

Thelowvaluetobetested.

const OperatorlowOp

Theoperationtobeusedintestingthelowrange.YoushouldonlysupportGTandGTEhere;anythingelseshouldthrowBadOpcodesException.

const void*highValue

Thehighvaluetobetested.

const OperatorhighOp

Theoperationtobeusedintestingthehighrange.YoushouldonlysupportLTandLTEhere;anythingelseshouldthrowBadOpcodesException.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 45

startScanThis method is used to begin a “filtered scan” of the index. For e.g. if the method is called using arguments (“a”,GT,”d”,LTE), the scan should seek all entries greater than “a” and less than or equal to “d”.Input to this function:

B+Tree Index:scanNext

• scanNext• fetchestherecordidofthenexttuplethatmatchesthescan

criteria.Ifthescanhasreachedtheend,thenitshouldthrowtheexceptionIndexScanCompletedException

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 46

RecordId &outRid

outputvaluethisistherecordidofthenextentrythatmatchesthescanfiltersetinstartScan.

B+Tree Index:endScan

• endScan• terminatesthecurrentscanandunpins allthepagesthathavebeenpinnedforthepurposeofthescan

• throwsScanNotInitializedException ifcalledbeforeasuccessfulstartScan call.

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 47

Implementationnotes• callthebuffermanagertoread/writepages• don’tkeepthepagespinnedinthebufferpoolunlessyouneedto

• Forthescanmethods,youwillneedtorememberthe“state”ofthescanspecifiedduringthestartScan

• insert doesnot needtoredistributeentries• Attheleaflevel,youdonotneedtostorepointerstobothsiblings.Theleafnodesonlypointtothe“next”(theright)sibling

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 48

FAQs• HowdoIgetstarted?

iftheindexfiledoesnotexist:createnewBlobFile.allocate newmetapageallocate newrootpagepopulate'IndexMetaInfo'withtherootpage numscanrecordsandinsertintotheBTree

elsereadthefirstpagefromthefile- whichisthemetanodegettherootpagenum fromthemetanodereadtherootpage(bufManager->readPage(file,

rootpageNum,out_root_page)onceyouhavetherootnode,youcantraversedownthetree

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 49

FAQs

• howtocheckwhetheranindexfileexists?– Seefile.h:staticboolexists(const std::string&filename)

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 50

FAQs

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 51

• How do I write a node to disk as a Page? e.g. how to write IndexMetaInfonode to the file?

Þ You first need to allocate a Page using the bufferManager.Page* metaPage;bufManager->allocatePage(..., metaPage);

Then you can either cast it as IndexMetaInfo* and update it's parameter. Another way is to first create and populate MetaIndexInfo node.Then you can allocate a new Page* using bufferManager as above. Then use 'memcpy' to copy tothe new Page:

memcpy(metaPage, &metaInfo,sizeof(IndexMetaInfo));

But you do not need to write it back as page to disk explicitly. Buffer Manager does it for you.Remember project 2?

FAQs

• howtoconvertPage - thatyoureadfromthefiletoNode?– youcancaste.g.usingreinterpret_cast

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 52

Suggestions

• Startearly– 1000+linesofcodes

• Trytofinishbeforethespringbreak– NoTAhoursduringthebreak

• Makeincrementalprogress– Testaggressively

2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 53

Recommended