Ice 2 Layout

  • 8/12/2019 Ice 2 Layout

    The IceFileSystem 2.x Disk Layout

    By Leif Salomonsson (c) 2011

    Last updated: June 2 2011.

    Contents:

    Introduction. Conventions. Meta Headers. Meta Objects. Extents. Admin Space. Constants. Functions.

    Introduction

    The IceFileSystem layout is extent based and 64bit, with checksummed meta data and a meta level journal.

    The main space allocator is based on the TLSF algorithm, adapted for on-disk storage.

    All metadata except extent headers is located in special extents called pools, which use a local bitmap to keep track of free meta space. Metadata is very compact.

    Design goals and priorities

    A 64bit disk filesystem supporting a high number of useful features, with a design prioritised as follows:

    1 Reliability
    2 Scalability
    3 Efficiency
    4 Speed

    Features

    64bit file/partition/extent sizes (actually very close to 2^63 bytes).
    No self imposed fragmentation of large files unless there is not enough contiguous space available.
    All metadata on disk is checksummed. This means any errors on disk will be detected a lot quicker.
    Meta level journalling.
    Hardlinks (directory and file), softlinks, file comments.
    Supports blocksizes from 512 bytes to 32 KiB.
    Filesystem does not get slower for larger partitions (scales very well), or when heavily fragmented.
    No limit on the number of files/dirs in a partition or directory.
    Recycle directory.
    Automatically truncated log files and file change log.
    Wastes only 128 KiB (and preallocates another 128 KiB for meta space) for filesystem administration data, regardless of partition size.

    Conventions

    All meta data stores information in big endian format.

    All on-disk pointers express a byte offset relative to the start of the volume.

    All fields in all structures described are unsigned integers with sizes ranging from 1 to 8 bytes, or arrays of unsigned integers.

    Field types:

    p64 - 64bit pointer
    p32 - 32bit pointer
    i64 - 64bit integer
    i32 - 32bit integer
    i16 - 16bit integer
    i8  - 8bit integer
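The field types above map directly onto fixed-width, big endian, unsigned integers. The following is a minimal Python sketch of that mapping using the standard `struct` module. The names `FIELD_FORMATS`, `layout_format`, `layout_size` and the `EXAMPLE` field list are illustrative assumptions and do not come from the specification.

```python
import struct

# Field type names from the list above mapped to struct format characters.
# Pointers and integers of the same width share the same encoding.
FIELD_FORMATS = {
    "p64": "Q", "i64": "Q",
    "p32": "I", "i32": "I",
    "i16": "H",
    "i8": "B",
}

def layout_format(fields):
    """Build a big endian struct format from (name, type) pairs."""
    # ">" selects big endian with no implicit padding between fields.
    return ">" + "".join(FIELD_FORMATS[ftype] for _name, ftype in fields)

def layout_size(fields):
    """Size in bytes of a structure described by (name, type) pairs."""
    return struct.calcsize(layout_format(fields))

# Hypothetical structure, used only to exercise the helpers.
EXAMPLE = [("ptr", "p64"), ("count", "i16"), ("kind", "i8")]
packed = struct.pack(layout_format(EXAMPLE), 0x1000, 2, 1)
```

Describing each structure as a list of (name, type) pairs keeps the code close to the tables used throughout this document.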

    Meta Headers

    All meta objects start with a meta header structure, the "meta" object:

    selfadr   p64
    checksum  i32
    metasize  i16
    metarsrvd i8
    metatag   i8

    SIZEOF 16

    selfadr: pointer to ourselves.

    checksum: total sum of all 32bit words (with the "checksum" field itself cleared) that make up the whole object (size of object in the "metasize" field).

    metasize: total size of the meta object.

    metarsrvd: reserved, zero.

    metatag: integer describing the type of meta object.
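The checksum rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's implementation: the sum is assumed to wrap to 32 bits (the field is an i32), "metasize" is assumed to be a multiple of 4, and the helper names `meta_checksum`, `seal` and `verify` are invented here.

```python
import struct

META_FORMAT = ">QIHBB"    # selfadr p64, checksum i32, metasize i16, metarsrvd i8, metatag i8
META_SIZE = struct.calcsize(META_FORMAT)   # 16, matching SIZEOF above
CHECKSUM_OFFSET = 8       # the checksum follows the 8-byte selfadr
METASIZE_OFFSET = 12

def meta_checksum(obj: bytes) -> int:
    """Sum all big endian 32bit words of the object with the checksum field cleared."""
    metasize = struct.unpack_from(">H", obj, METASIZE_OFFSET)[0]
    body = bytearray(obj[:metasize])
    body[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 4] = b"\x00" * 4
    words = struct.unpack_from(">%dI" % (metasize // 4), body, 0)
    # Assumption: the total is truncated to 32 bits to fit the i32 field.
    return sum(words) & 0xFFFFFFFF

def seal(obj: bytearray) -> None:
    """Store the computed checksum in the object's header."""
    struct.pack_into(">I", obj, CHECKSUM_OFFSET, meta_checksum(bytes(obj)))

def verify(obj: bytes) -> bool:
    """Return True if the stored checksum matches the object's contents."""
    stored = struct.unpack_from(">I", obj, CHECKSUM_OFFSET)[0]
    return stored == meta_checksum(obj)
```

A reader would call `verify` on every meta object it loads. A writer would call `seal` as the last step before the object goes to disk.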

    Additionally, all meta objects stored in meta pools have one extra "bmapadr" field. The "pmeta" object:

    selfadr   p64
    checksum  i32
    metasize  i16
    metarsrvd i8
    metatag   i8
    bmapadr   p64

    SIZEOF 24

    bmapadr: pointer to the "bitmap" object.

    Meta Objects

    The "filesys" object:

    selfadr        p64
    checksum       i32
    metasize       i16
    metarsrvd      i8
    metatag        i8
    creation       i64
    totalsize      i64
    rootentry      p32
    rootnode       p32
    roothashtab    p32
    recyclednode   p32
    logfilenode    p32
    fsversion      i32
    fsrevision     i32
    blocksize      i32
    reserved2      [2] i32
    hashtype       i32
    extinfo        p32
    journal        p32
    startofextents p32
    endofextents   p64
    powertabs      [64] p32
    orig_selfadr   p64
    reserved       [19] i64

    metasize: 512

    metatag: MTAG_FILESYS

    creation: time of creation expressed in microseconds since start of time

    [1..63 are actually used but all powers should still be initialised.

    pad: zero.

    table: Free extents are chained together in a doubly linked list, and the table contains pointers to the first extent in each list, or zero if empty.
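The text does not spell out how an extent size selects a power and a list within its table. The sketch below shows the textbook TLSF two-level mapping and should be read as an assumption, not as the filesystem's confirmed behaviour. The constants are as read from the Constants section (MINPOWER 6, MAXPOWER 63, RANGESPERPOWER 64, RANGESPERPOWERSHIFT 6), and `tlsf_index` is a hypothetical helper name.

```python
MINPOWER = 6
MAXPOWER = 63
RANGESPERPOWER = 64
RANGESPERPOWERSHIFT = 6

def tlsf_index(extsize: int):
    """Return (power, range) for a free extent of extsize bytes.

    Assumption: standard TLSF mapping, where the power is floor(log2(size))
    and the range is taken from the next RANGESPERPOWERSHIFT bits.
    """
    if extsize < (1 << MINPOWER) or extsize >= (1 << (MAXPOWER + 1)):
        raise ValueError("extent size outside the supported powers")
    power = extsize.bit_length() - 1
    # Drop the low bits, then mask off the leading one bit to get the range.
    rng = (extsize >> (power - RANGESPERPOWERSHIFT)) & (RANGESPERPOWER - 1)
    return power, rng
```

With this mapping, every extent whose size falls in the same 1/64th slice of a power-of-two interval lands in the same doubly linked list, so finding a suitable free extent needs only a table lookup and no list scan.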

    The "entry" object:

    selfadr   p64
    checksum  i32
    metasize  i16
    metarsrvd i8
    metatag   i8
    bmapadr   p64
    link      p64
    parent    p64
    nameext   p64
    nexthlink p64
    type      i8
    flags     i8
    namelen   i16
    dirnext   p64
    dirprev   p64
    hashnext  p64
    hashprev  p64
    name      [36

    parent: pointer to parent entry. zero if root.

    nameext: pointer to "nameext" object if the filename exceeds 36 characters, else zero.

    nexthlink: next hard link. pointer used to link the hard links for an entry together. The original entry is not part of this linkage.

    type: ETYPE_X. either ETYPE_HARDENTRY or ETYPE_SOFTENTRY.

    flags: EFLAG_X. only one flag for now: EFLAG_HIDDEN.

    namelen: total length of filename in characters.

    dirnext: pointer to next entry in directory or zero if last.

    dirprev: pointer to previous entry in directory or zero if first.

    hashnext: pointer to next entry in hash table linkage.

    hashprev: pointer to previous entry in hash table linkage.

    name: array containing the (first 36 characters of) filename.

    The "nameext" object:

    selfadr   p64
    checksum  i32
    metasize  i16
    metarsrvd i8
    metatag   i8
    bmapadr   p64
    str       [1000

    selfadr   p64
    checksum  i32
    metasize  i16
    metarsrvd i8
    metatag   i8
    bmapadr   p64
    softtime  p64
    str       [992

    Either NTYPE_FILE or NTYPE_DIR.

    rsrvd: zero.

    flags: Defined flags are NFLAG_LOGFILE and NFLAG_RECYCLEDIR.

    protbits: Protection bits. Like Amiga protection bits but _without_ the inversion of the lower nibble.

    reserved: zero.

    ownerinfo: Like Amiga ownerinfo.

    origentry: pointer to the "original" entry.

    addentries: pointer to the first hard link entry or zero.

    modified: modification timestamp expressed in microseconds since start of time.

    data_first: pointer to first filedata extent or zero.

    data_last: pointer to last filedata extent or zero.

    dir_first: pointer to first entry in directory or zero.

    dir_last: pointer to last entry in directory or zero.

    extleft: number of bytes left in last extent.

    numentries: number of entries in directory. not really used by anything but should be updated anyway.

    filesize: the current filesize if NTYPE_FILE.

    hash: pointer to "hashtab" object if NTYPE_DIR.

    comment: pointer to "comment" object or zero.

    rsrvd3: zero.

    The "hashtab" object:

    selfadr   p64
    checksum  i32
    metasize  i16
    metarsrvd i8
    metatag   i8
    bmapadr   p64
    parent    p64
    table     [124

    Extents

    All extents start with the "extent" object header, which in turn starts with the "meta" object header.

    The "extent" object:

    selfadr     p64
    checksum    i32
    metasize    i16
    metarsrvd   i8
    metatag     i8
    extsize     i64
    ext_prev    p64
    data_next   p64
    data_prev   p64
    dataofs     i32
    data_extnum i32
    data_node   p64
    /* unions */
    range_next  p64  data_next
    range_prev  p64  data_prev
    reuse_next  p64  data_next
    reuse_prev  p64  data_prev
    bitmap      p64  dataofs

    metasize: 64

    selfadr: always blocksize aligned.

    metasize: always 64

    metatag: Type of extent (ETAG_X). There are three types of extents:
    ETAG_FREE: free space.
    ETAG_METAPOOL: 128KiB meta pool.
    ETAG_FILEDATA: file data extent.

    extsize: size of this extent in bytes. always blocksize aligned.

    ext_prev: pointer to previous extent by placement on disk.

    #if metatag ETAG_FILEDATA

    data_next: pointer to next filedata extent.

    data_prev: pointer to previous filedata extent.

    dataofs: Start of filedata relative to start of extent, in bytes. Either SIZEOF extent (64) for files that fit in (blocksize - SIZEOF extent), or blocksize for larger files.

    data_extnum: filedata extent number, starting at 0.

    data_node: pointer back to the "node" object.

    #endif
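The "dataofs" rule can be written as a one-line decision. This is a minimal Python sketch. It assumes that "fits in" is inclusive, so a file of exactly blocksize - 64 bytes still shares the header's block, and the function name `data_offset` is made up for illustration.

```python
SIZEOF_EXTENT = 64   # size of the "extent" object header

def data_offset(filesize: int, blocksize: int) -> int:
    """Offset of file data from the start of its extent, in bytes."""
    # Assumption: a file of exactly (blocksize - SIZEOF_EXTENT) bytes still fits.
    if filesize <= blocksize - SIZEOF_EXTENT:
        return SIZEOF_EXTENT   # small file: data directly follows the header
    return blocksize           # larger file: data starts on the next block boundary
```

Small files therefore occupy a single block together with their extent header, while larger files keep their data block aligned.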

    #if metatag ETAG_FREE

    range_next: pointer to next free extent in this range.

    range_prev: pointer to previous free extent in this range.

    #endif

    #if metatag ETAG_METAPOOL

    reuse_next: pointer to next non full metapool extent.

    reuse_prev: pointer to previous non full metapool extent.

    bitmap: pointer to "bitmap" object. ALWAYS selfadr + 64.

    #endif
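The meta pool constants suggest simple shift-and-mask arithmetic for locating meta space inside a pool. The sketch below is a guess at how that arithmetic fits together and is not a documented procedure. It assumes meta objects are allocated in whole 128-byte meta blocks counted from the start of the pool extent, with one bitmap bit per block. The constant values are as read from the Constants section, and the helper names are hypothetical.

```python
METAPOOLSIZE = 1024 * 128
METABLOCKSIZE = 128
METABLOCKMASK = 127
METABLOCKSHIFT = 7
SIZEOF_EXTENT = 64

def bitmap_address(pool_selfadr: int) -> int:
    """The pool's bitmap object always sits right after the extent header."""
    return pool_selfadr + SIZEOF_EXTENT

def blocks_needed(metasize: int) -> int:
    """Number of meta blocks covering metasize bytes (rounded up)."""
    return (metasize + METABLOCKMASK) >> METABLOCKSHIFT

def block_index(pool_selfadr: int, meta_adr: int) -> int:
    """Index of the meta block at meta_adr within its pool.

    Assumption: blocks are numbered from the start of the pool extent.
    """
    offset = meta_adr - pool_selfadr
    if not 0 <= offset < METAPOOLSIZE or offset & METABLOCKMASK:
        raise ValueError("address is not a meta block inside this pool")
    return offset >> METABLOCKSHIFT
```

Under these assumptions a pool holds 1024 meta blocks, which is exactly 32 longs of 32 bits each.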

    Admin Space

    The size of adminspace is 128KiB. It is located at the beginning of the partition and starts with the "icestart" structure:

    ice0    i32
    fsys    i32
    filesys p32

    ice0: the magic word ID_ICEFS_DISK

    fsys: the magic word "FSYS"

    filesys: pointer to the "filesys" object.
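A volume probe only needs the first 12 bytes of the partition. The following Python sketch checks both magic words and returns the "filesys" pointer. It assumes the second magic word is the four ASCII bytes "FSYS" stored big endian like every other field, and takes the ID_ICEFS_DISK value from the Constants section. The function name `parse_icestart` is invented for this example.

```python
import struct

ID_ICEFS_DISK = 0x49434502                    # value from the Constants section
FSYS_MAGIC = int.from_bytes(b"FSYS", "big")   # assumption: magic stored as ASCII bytes

def parse_icestart(admin: bytes) -> int:
    """Validate the "icestart" structure and return the filesys pointer."""
    ice0, fsys, filesys = struct.unpack_from(">III", admin, 0)
    if ice0 != ID_ICEFS_DISK:
        raise ValueError("not an IceFileSystem 2.x volume")
    if fsys != FSYS_MAGIC:
        raise ValueError("second magic word does not match")
    return filesys   # byte offset of the "filesys" object from the volume start
```

The returned pointer is a byte offset relative to the start of the volume, so the caller can seek there directly and read the 512-byte "filesys" object.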

    Meta objects in admin space:

    1 Filesys
    1 Extinfo
    64 Powers
    1 Journal

    Backed up objects.

    A volume ends with a copy of the root hashtable object. As of revision 3, the filesys object is also backed up, directly in front of the root hashtable backup. Both the hashtable and filesys backups have their "selfadr" field changed to the backup location. The "filesys" object keeps a backup of the original "selfadr" in its "orig_selfadr" field.

    Constants

    ID_ICEFS_DISK 0x49434502

    MTAG_NONE 0

    ETAG_FREE 1
    ETAG_METAPOOL 2
    ETAG_FILEDATA 3
    ETAG_RESERVED 4

    MTAG_FILESYS 5
    MTAG_ENTRY 6
    MTAG_NODE 7
    MTAG_NAMEEXT 8
    MTAG_COMMENT 9
    MTAG_SOFTNAME 10
    MTAG_HASHTAB 11
    MTAG_NOTUSED 12
    MTAG_POWER 13
    MTAG_JOURNAL 14
    MTAG_BITMAP 15
    MTAG_EXTINFO 16
    MTAG_RESERVED 17

    METAPOOLSIZE 1024 * 128
    ADMINSPACE 1024 * 128

    METABLOCKSIZE 128
    METABLOCKMASK 127
    METABLOCKSHIFT 7

    MAXBITMAPINDEX 31
    BITMAPLONGS 32
    BITMAPRESERVED 12

    ETYPE_HARDENTRY 0
    ETYPE_SOFTENTRY 1

    EFLAG_HIDDEN 1

    MAXENTRYNAMEBYTES 36

    NTYPE_FILE 0
    NTYPE_DIR 1

    NFLAG_LOGFILE 1
    NFLAG_RECYCLEDIR 2

    HASHTABENTRIES 124

    MAXNAMEEXTSTRBYTES 1000
    MAXSOFTNAMESTRBYTES 992
    MAXCOMMENTSTRBYTES 1000

    MINPOWER 6
    MAXPOWER 63

    RANGESPERPOWER 64
    RANGESPERPOWERSHIFT 6

    FILESYSRESERVED 19

    HASHTYPE_NEW 2
    HASHTYPE_NEW_CASE 3

    EXTINFORESERVED 21

    MAXJOURNALSIZE 32768

    MAXMETASIZE 1024

    Functions

    Functions are written in the E language but should not be too hard to translate into C or something else.

    The directory entry name hashing algorithm:

    Case sensitive:

    PROC hashDiskNameNewCase124(name:PTR TO CHAR)
    DEF i, v=0:LONG, c

    -> get a small seed from number of characters
    WHILE c := name[v