Qualitative Data Management for Interdisciplinary Research
KRISTAL JONES, RESEARCH SCIENTIST, NATIONAL SOCIO-ENVIRONMENTAL SYNTHESIS CENTER (SESYNC), UNIVERSITY OF MARYLAND-COLLEGE PARK
STEVEN M. ALEXANDER, MITACS SCIENCE POLICY FELLOW AND SCIENCE ADVISOR, FISHERIES AND OCEANS CANADA
DataONE webinar, April 10, 2018
What is qualitative data?
Examples pictured: National Museum of Natural History; NOAA Voices from the Fisheries; US Congressional Record; author's photos; The New York Times; Twitter (#BearsEars)
What role does qualitative data play in interdisciplinary research?
Qualitative data can…
• Provide temporal insights
• Form case comparisons
• Scale up patterns
• Scale down interpretations
• Broaden the evidence base
What are the benefits of sharing and re-using qualitative data?
Scientific
• Transparency and triangulation
Descriptive
• Expansive, inclusive, and varied aspects of a phenomenon
• Opportunities for teaching & learning
Material
• Reduce the burden on individuals and communities
• Increased return on investment for funders and institutions
• Access for knowledge users outside of research institutions
Is qualitative data currently being shared?
What challenges exist to accelerating qualitative data sharing and re-use?
Practical challenges
• Resources: Time, expertise, financial support
• Infrastructure: Where to deposit?
Ethical
• Confidentiality, representation, consent
Epistemological
• Spectrum from pure positivism to pure constructionism, with lots of pragmatic space in the middle
Research Data Lifecycle
• Plan & Design
• Collect & Capture
• Interpret & Analyze
• Manage & Preserve
• Release & Publish
• Discover & Reuse
Source: UC Curation Center | California Digital Library
Who plays a role in addressing these challenges?
• Researchers
• Research institutions
• Journals and publishers
• Funders
• Data repositories
Want a summary of the benefits, challenges, resources and recommendations?
SESYNC Qualitative Data Initiative
What resources exist to address qualitative data management challenges?
Primary data lifecycle: Plan and Design
• Data management planning: https://qdr.syr.edu/guidance/managing/planning-data-management
• IRB: https://qdr.syr.edu/node/20260/
• Develop shared protocols: https://www.atkinson.cornell.edu/collaborations/oxfam-cu.php
• Teaching qualitative data management webinar from IASSIST: https://www.youtube.com/watch?v=aATIKsF96Ro&feature=youtu.be
Primary data lifecycle: Collect and Capture, Interpret and Analyze
• Myriad resources from environmental anthropology and sociology, human geography
• Communicate methodologies, approaches and assumptions: Cox, M. (2015). A basic guide for empirical environmental social science. Ecology and Society 20(1): 63. http://dx.doi.org/10.5751/ES-07400-200163
• Sharing code from qualitative data software: RQDA (https://cran.r-project.org/web/packages/RQDA/RQDA.pdf); codebooks from Atlas.ti, NVivo, MAXQDA
Primary data lifecycle: Manage and Preserve
• IRB and de-identification: https://qdr.syr.edu/node/20260/
• File formats: https://qdr.syr.edu/guidance/managing/formatting-data
• DDI metadata standards: https://www.ddialliance.org/sites/default/files/AQualitativeDataModelForDDI.pdf
[Figure: UML class diagram of a qualitative data model (Qudex Archive). Classes shown include Collection, Document, Source, MemoSource, ResourceComponent, Segment (TextSegment with src, startOffset, endOffset; ClipSegment; ImageSegment; XMLSegment), Code (authority), Memo, Category, and ObjectRelation (sourceObject, targetObject, relationName), along with IdentifiableArtefact (id). Source: UK Data Archive]
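As a rough illustration of how the segment, code, and relation classes in the diagram might fit together, the following Python sketch models a text segment and a code linked to it. The attribute names id, src, startOffset, endOffset, authority, sourceObject, targetObject, and relationName come from the diagram; everything else (the `label` field, the `extract` helper, character-based end-exclusive offsets, the "applies_to" relation name, and the file name) is an assumption made for illustration and is not part of the published model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSegment:
    id: str            # IdentifiableArtefact.id in the diagram
    src: str           # document the segment points into
    start_offset: int  # startOffset; assumed here to be a character index
    end_offset: int    # endOffset; assumed here to be exclusive

    def extract(self, text: str) -> str:
        """Return the span of `text` covered by this segment (hypothetical helper)."""
        return text[self.start_offset:self.end_offset]


@dataclass(frozen=True)
class Code:
    id: str
    authority: str  # attribute from the diagram; assumed to name who defined the code
    label: str      # hypothetical field, not in the diagram


@dataclass(frozen=True)
class ObjectRelation:
    source_object: str  # id of the source artefact
    target_object: str  # id of the target artefact
    relation_name: str


transcript = "We stopped fishing after the storm."
start = transcript.index("fishing")
segment = TextSegment("seg-1", "interview_01.txt", start, start + len("fishing"))
code = Code("code-1", "project codebook", "livelihood change")
link = ObjectRelation(code.id, segment.id, "applies_to")

print(segment.extract(transcript))
```

Keeping the relation as a separate record, rather than nesting codes inside segments, mirrors the diagram's ObjectRelation class and lets one segment carry several codes.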
Primary data lifecycle: Release and Publish, Discover and Re-use
What is unique about publishing qualitative data for re-use in interdisciplinary research?
Secondary data lifecycle: Plan and Design, Collect and Capture
What is unique about interdisciplinary research using secondary qualitative data?
Secondary data lifecycle: Interpret and Analyze
• Overview of synthesis methods: Dixon-Woods et al. (2005). Synthesising qualitative and quantitative evidence: A review of possible methods. Journal of Health Services Research and Policy 10(1): 45-53. http://dx.doi.org/10.1258/1355819052801804
• Meta-analysis: Cox, M. (2014). Understanding large social-ecological systems: introducing the SESMAD project. International Journal of the Commons 8(2): 265-276. http://doi.org/10.18352/ijc.406
• Text mining: http://tm.r-forge.r-project.org/
• Regular expressions: https://sesync-ci.github.io/text-mining-lesson/2016/09/14/
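As a small illustration of the regular-expression approach listed above, the sketch below counts how often a pattern occurs in each document of a collection. It uses Python's standard `re` module rather than the R tools linked above; the function name, the two sample documents, and the pattern are all made up for the example.

```python
import re


def count_matches(documents, pattern):
    """Count case-insensitive matches of `pattern` in each document."""
    regex = re.compile(pattern, re.IGNORECASE)
    return {name: len(regex.findall(text)) for name, text in documents.items()}


# Hypothetical documents standing in for secondary qualitative data.
documents = {
    "policy_a": "The fishery closed in 2010. Local fisheries reopened later.",
    "interview_b": "We talked about water quality.",
}

# Match "fishery" or "fisheries" as whole words.
counts = count_matches(documents, r"\bfisher(?:y|ies)\b")
print(counts)
```

A count like this is only a first pass; deciding what a match means in context still calls for the interpretive work described elsewhere in the webinar.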
Secondary data lifecycle: Manage and Preserve, Release and Publish
What is unique about publishing qualitative data for re-use in interdisciplinary research?
From a researcher's perspective, how do we operationalize sharing (preserving and publishing) oriented toward re-use (discover and design)?
[Figure: the primary data lifecycle and the secondary data lifecycle, each cycling through Plan & Design, Collect & Capture, Interpret & Analyze, Manage & Preserve, Release & Publish, and Discover & Reuse]
Questions to consider:
• What kind of metadata is necessary?
• Has the data been cleaned and made anonymous?
• Will the data be discoverable?
• What counts as data?
• How does epistemology shape what can be shared and re-used?
Linking qualitative data sharing and re-use: Levels of access
Level of access: Definition
• A - Open: Data is freely available for use in accordance with the general use agreement of the repository and standard citation practices
• B - Restricted: Data is available for use when the user meets standard criteria set by the data repository to ensure ethical use of data (could include a use agreement, obtaining IRB approval, or accessing data through a virtual environment)
• C - Controlled: Data is available for use by trained users in a controlled environment (access could depend on secondary research questions and intended analysis; controls on access method and amount of data shared are decided by the original researcher)
• D - Closed: Data deposit and citation exist for archival purposes but no data are currently available (could be embargoed until publication of results, change in a sensitive situation, death of a participant, or a certain duration of time from collection)
Linking qualitative data sharing and re-use: Levels of processing
Level of processing: Definition
• 0 - Raw data: Full text, image or audio; no redaction (all identifiers included); no aggregation or analysis; no additional information about context and methodology
• 1: Full text, image or audio; redaction for direct identifiers; no aggregation or analysis; idiosyncratic information about context and methodology
• 2: Full text, image or audio; redaction for direct and indirect identifiers; no aggregation or analysis; standardized information about context and methodology
• 3: Excerpted text, image or audio; redaction for direct and indirect identifiers; thematic or topical aggregation; standardized information about context and methodology
• 4 - Research output: Summarized text, image or audio; redaction for direct and indirect identifiers; thematic or topical analysis; summarized information about context and methodology
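To make the step from level 0 to level 1 concrete, here is a minimal Python sketch of redacting direct identifiers from a transcript. It assumes the researcher already has a list of the names and places to remove; the function name, placeholder text, and sample transcript are invented for the example. It does not attempt to find indirect identifiers, which would still need human review.

```python
import re


def redact(text, identifiers, placeholder="[REDACTED]"):
    """Replace each known direct identifier with a placeholder.

    Matching is case-insensitive and limited to whole words.
    """
    if not identifiers:
        return text
    # Try longer identifiers first so a full name wins over a first name.
    ordered = sorted(identifiers, key=len, reverse=True)
    alternatives = "|".join(re.escape(item) for item in ordered)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(placeholder, text)


# Hypothetical transcript and identifier list.
transcript = "Maria Lopez said the harbor at Port Example flooded; Maria left in 2012."
redacted = redact(transcript, ["Maria Lopez", "Maria", "Port Example"])
print(redacted)
```

A fixed identifier list is the simplest option and keeps the redaction auditable, but it will miss misspellings and nicknames, so the output should be checked before deposit.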
Level of access (columns): A [open], B [restricted], C [controlled], D [closed]
Level of processing (rows); example cells in each row are listed left to right:
0 [raw]
• Public policy documents
• Raw interview transcripts or field notes
1
• Public policy documents with search terms as metadata
• Interview transcripts with names and locations redacted
2
• Public policy documents with code for web scraping
• Interview transcripts with names and locations redacted and metadata about setting of interviews (appears in two columns)
3
• Public policy documents organized by theme and with code for thematic analysis
• Interview excerpts with names and locations redacted and metadata including thematic codes
4 [Research outputs]
• Descriptive summary of themes within policies with methodology explained
• Summary of thematic analysis of interview transcripts with methodology explained (appears in two columns)
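One way to record where a deposit sits on the two scales is to tag it with both levels. The Python sketch below is only an illustration: the enum and class names are invented, the short names for processing levels 1 to 3 are paraphrases of the definitions, and the access level paired with each example deposit is an assumption, since the column positions in the table did not survive extraction.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class AccessLevel(Enum):
    OPEN = "A"
    RESTRICTED = "B"
    CONTROLLED = "C"
    CLOSED = "D"


class ProcessingLevel(IntEnum):
    RAW = 0
    DIRECT_IDENTIFIERS_REDACTED = 1
    ALL_IDENTIFIERS_REDACTED = 2
    EXCERPTED = 3
    RESEARCH_OUTPUT = 4


@dataclass(frozen=True)
class Deposit:
    title: str
    processing: ProcessingLevel
    access: AccessLevel

    def label(self) -> str:
        """Short tag such as '0A' combining processing and access level."""
        return f"{int(self.processing)}{self.access.value}"


def with_access(deposits, level):
    """Return the deposits held at the given access level."""
    return [d for d in deposits if d.access is level]


# Access levels below are assumed pairings, for illustration only.
deposits = [
    Deposit("Public policy documents", ProcessingLevel.RAW, AccessLevel.OPEN),
    Deposit(
        "Interview transcripts with names and locations redacted",
        ProcessingLevel.DIRECT_IDENTIFIERS_REDACTED,
        AccessLevel.CONTROLLED,
    ),
]

labels = [d.label() for d in deposits]
print(labels)
```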
Takeaways
• Researchers should consider data sharing and re-use across all stages of the lifecycle
  ➢ This often requires more planning at the outset (IRB, metadata documentation, etc.) for qualitative data than quantitative data
• Interdisciplinary research requires working across systems, vocabularies, tools, etc.
  ➢ There are many resources out there, but they often aren't used in an integrated workflow
Support for this work comes from the National Socio-Environmental Synthesis Center (SESYNC), which is supported under funding received from the National Science Foundation DBI-1052875.
SESYNC Qualitative Data Initiative: https://www.sesync.org/for-you/cyberinfrastructure/research-and-tools/qualitative-data-initiative
Contact information: