Data goodness
Mostly in black and white
By Dom
You must love your data!
• Current imaging data in BRIC cost ~£5.1M, just for scanning costs! (2011)
• Lost data:
  – no research, no publications
  – no jobs, no PhDs! (Sad Dom)
• Look after your data!
  – It looks after you (Happy Dom)
Data Storage
• Home directories:
  – ISIS home, U Home
    » Not for large amounts of imaging data
• Projects directory:
  – ISIS, V: Big stuff goes here
• If you require large amounts of space (e.g. > 50 GB)
  – LET ME KNOW IN ADVANCE!
Server goodness
• Why is the server a good place to store data?
  – Mirror and parity: if some errors occur, data can be easily recovered
  – BACKUPS:
    • Tape backups, daily, 1 month retention
    • If you have funding, processed data can be mirrored off site
    • Raw data is always mirrored off site (ECDF) by default
  – Desktop PCs
    • Not reliable: no mirroring, no parity; if some errors occur, data is lost (often all of it)
    • Network backups often fail
      – Machines turned off, network busy
      – Moving to a new system when I get time!
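The slides do not say which mirroring or parity scheme the server uses, so the following is only a generic sketch of why parity makes recovery possible. It assumes a simple single-parity layout (XOR across equal-sized blocks); the block contents and the helper name `xor_blocks` are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per disk (illustrative values only).
data = [b"scan", b"brai", b"n001"]

# The parity block is stored on a separate disk.
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

A desktop PC with a single disk has no such extra block, so a failed disk usually means the data on it is gone.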
Data love
• Curation: Do this as you work!
• Plan your data use
  – Use meaningful folder names
  – Make 'README.txt' files with dates, names of students/employees involved, references to software, scripts and versions, purpose of experiment/processing.
  – Be tidy with your data: tidy up occasionally
    » Friday afternoon: quick tidy up
    » Big tidy up at end of experiment/project/phase/year
• BE CAREFUL, don't rush
• Data, spreadsheets, databases
  – Anonymisation
  – *** Repatriation keys ***
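As one possible way to make the 'README.txt' habit cheap, here is a minimal sketch that builds a README with the fields listed above (date, people, purpose, software and versions). The function names `readme_text` and `write_readme`, the field layout, and the example values are assumptions for illustration, not an existing BRIC tool.

```python
from datetime import date
from pathlib import Path

def readme_text(people, purpose, software, when=None):
    """Build README contents; `software` maps a tool or script name to its version."""
    when = when or date.today()
    lines = [
        f"Date: {when.isoformat()}",
        f"People: {', '.join(people)}",
        f"Purpose: {purpose}",
        "Software, scripts and versions:",
    ]
    lines += [f"  - {name} {version}" for name, version in software.items()]
    return "\n".join(lines) + "\n"

def write_readme(folder, people, purpose, software, when=None):
    """Write README.txt into an existing `folder` and return its path."""
    path = Path(folder) / "README.txt"
    path.write_text(readme_text(people, purpose, software, when), encoding="utf-8")
    return path

# Hypothetical example values.
print(readme_text(["A. Student"], "Pilot segmentation run", {"my_script.py": "v0.3"}))
```

Calling `write_readme` once when a folder is created, and again after each processing step, keeps the record next to the data it describes.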
Code and Scripts
• Coding:
  • Testing
    – Make sure that the software you are using does exactly what you think it does!
      » Check every step for every image!
    – Do not use hard-coded paths
• Use versioning software (ECDF)
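One way to avoid hard-coded paths is to pass the project folder in from outside the script. The sketch below assumes a command-line option `--project-dir` with an environment variable `PROJECT_DIR` as fallback; both names, the `processed/<subject>/` layout, and the helper functions are hypothetical choices, not a prescribed convention.

```python
import argparse
import os
from pathlib import Path

def resolve_project_dir(argv=None):
    """Return the project folder from --project-dir or the PROJECT_DIR variable."""
    parser = argparse.ArgumentParser(description="Process one project folder.")
    parser.add_argument(
        "--project-dir",
        default=os.environ.get("PROJECT_DIR"),
        help="root folder of the project (or set PROJECT_DIR)",
    )
    args = parser.parse_args(argv)
    if args.project_dir is None:
        parser.error("give --project-dir or set PROJECT_DIR")
    return Path(args.project_dir)

def output_path(project_dir, subject, filename):
    """Build an output path relative to the project folder (assumed layout)."""
    return Path(project_dir) / "processed" / subject / filename

if __name__ == "__main__":
    project = resolve_project_dir()
    print(output_path(project, "sub01", "mask.nii"))
```

The same script then runs unchanged when the data moves to another drive or server, and the path used for each run can be recorded in the README.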
Safe data is Happy data!