A Domain Model for Digital Curation Stephen Abrams UC Berkeley School of Information, Friday, October 16, 2015 UC Curation Center California Digital Library

  • Published on
    13-Dec-2015

Transcript

  • Slide 1

A Domain Model for Digital Curation
Stephen Abrams
UC Berkeley School of Information, Friday, October 16, 2015
UC Curation Center, California Digital Library
www.accademia.org/explore-museum/artworks/michelangelos-david

  • Slide 2

A domain model for digital curation
Justification
sept
Roadmap
    Transition from ad hoc and idiosyncratic to rigorous and systematic analysis, planning, deployment, and assessment
    Build up incrementally from first principles
    Synthesize and extend prior community efforts
    View curation as an inherently semiotic activity
Benefits
    Better understand and express nuanced curation intentions and outcomes
    Set realistic stakeholder expectations
    Gain greater confidence that activities are comprehensive

  • Slide 3

Programmatic change and aging infrastructure
Imminent retirement of executive and program directors; mounting technical debt
genesis
An opportunity for strategic reassessment and planning
Hoping for a short background paper to guide analysis
The more investigation, the less confidence
The more questions, the fewer persuasive answers

  • Slide 4

Pragmatic advancement, but no robust and comprehensive conceptual underpinnings
Two decades of progress
state-of-the-art
There is a model, explicit or tacit, underlying all of these:
    Fedora, OAIS, Portico, DIAS, LOCKSS, JHOVE, PREMIS, Dioscuri, TRAC, Plato, 4C, DPN, DSpace, PRONOM, AIHT, PDF/A, Chronopolis, iRODS, Ace, NDSA, FIDO, Olive, BitCurator, PCDM
Do they all fit together?
Are we thinking about the right things and defining them properly?
There is no more overloaded and under-formalized term of practice than "digital object"

  • Slide 5

Prior object modeling
crosswalk
Models compared: Sender/receiver, Buckland, Kahn-Wilensky, FRBR, NAA, OAIS, PREMIS, BRM, ICO
Crosswalk terms:
    source; info-as-knowledge; work; essence; intellectual entity; propositional content; intellectual entity
    encoding; info-as-thing; data; expression; source; data object / digital object; bitstream / filestream; symbol structure; manifestation; file / representation; item; bits; patterned matter/energy; information carrier
    frame-of-reference; key-metadata; representation information; auxiliary information
    channel; info-as-process; process; projection; signal; performance; sensory impression
    content; knowledge base
    decoding
    effect; info-as-knowledge; essence; intellectual entity; propositional content; intellectual entity
Not fully populated
Fineness of granularity

  • Slide 6

Sept object modeling
crosswalk
Models compared: Sender/receiver, Buckland, Kahn-Wilensky, FRBR, NAA, OAIS, PREMIS, BRM, ICO
Crosswalk terms:
    source; info-as-knowledge; work; essence; intellectual entity; propositional content; intellectual entity
    encoding; info-as-thing; data; expression; source; data object / digital object; bitstream / filestream; symbol structure; manifestation; file / representation; item; bits; patterned matter/energy; information carrier
    frame-of-reference; key-metadata; representation information; auxiliary information
    channel; info-as-process; process; projection; signal; performance; sensory impression
    content; knowledge base
    decoding
    effect; info-as-knowledge; essence; intellectual entity; propositional content; intellectual entity
Sept: message; structure; form; carrier; annotation; behavior; stimuli; ground; interpretation; experience

  • Slide 7

Digital Curation Centre
Maintaining, preserving and adding value to digital research data throughout its lifecycle
digital curation
Cui bono?
Process-centric
Explains the what, but not the why or for whom
www.dcc.ac.uk

  • Slide 8

UC Curation Center
Complex of actors, policies, practices, and technologies enabling successful consumer engagement with authentic content of interest across space and time
digital curation
www.cdlib.org/uc3

  • Slide 9

UC Curation Center
Complex of actors, policies, practices, and technologies enabling successful consumer engagement with authentic content of interest across space and time
digital curation
Distinguishable through consumer criteria
www.cdlib.org/uc3

  • Slide 10

UC Curation Center
Complex of actors, policies, practices, and technologies enabling successful consumer engagement with authentic content of interest across space and time
digital curation
Distinguishable through consumer criteria
Is what it purports to be
www.cdlib.org/uc3

  • Slide 11

UC Curation Center
Complex of actors, policies, practices, and technologies enabling successful consumer engagement with authentic content of interest across space and time
digital curation
Distinguishable through consumer criteria
Is what it purports to be
Spanning production, management, and exploitation
www.cdlib.org/uc3

  • Slide 12

UC Curation Center
Complex of actors, policies, practices, and technologies enabling successful consumer engagement with authentic content of interest across space and time
digital curation
Distinguishable through consumer criteria
Is what it purports to be
Spanning production, management, and exploitation
Use is feasible and beneficial
www.cdlib.org/uc3

  • Slide 13

UC Curation Center
Complex of actors, policies, practices, and technologies enabling successful consumer engagement with authentic content of interest across space and time
digital curation
Distinguishable through consumer criteria
Is what it purports to be
Spanning production, management, and exploitation
Use is feasible and beneficial
Equally dependent on human competencies, institutional mission and resources, and technology
www.cdlib.org/uc3

  • Slide 14

Communication with the future
A digital object is the unit of communication
digital curation
An object encapsulates the message to be communicated, but not its meaning
Meaning is an emergent epistemic state of the consumer
Content is realized by physical stimuli
Perceived by a sense modality
Interpreted in a context
Experienced as cognitive meaning or psychological affect
The final, crucial transition from perception to cognition is an inherently semiotic act

  • Slide 15

Signs and systems of signification
Charles Sanders Peirce (1839-1914)
semiotics
A sign is something that stands in for something else, for someone, in some manner.
Semiosis is a triadic relation between a representation, its referent, and its experiential effect
Interpretation takes place with respect to a subjective contextual ground
Diagram labels: referent, representation, ground, effect; stands in for, contextualizes, stimulates, (re)presents; objective, subjective

  • Slide 16

Object-mediated communication
Modes of understanding
semiosis
Denotative, Connotative
Diagram labels: object, curatorial presentation, contextual ground, frames-of-reference, experienced meaning, intended meaning, interpretation, codification; owner, curator, creator, consumer; objective, subjective; feasible, beneficial; stimulus

  • Slide 17

Object-mediated communication
Modes of understanding
semiosis
Denotative, Connotative
Diagram labels: object, curatorial presentation, contextual ground, frames-of-reference, experienced meaning, intended meaning, interpretation, codification; owner, curator, creator, consumer; objective, subjective; feasible, beneficial; contextual noise, channel noise

  • Slide 18

Object modeling
Dimensions
analysis
    Semantics: Meaning
    Syntactics: Symbolic expression
    Empirics: Physical representation
    Pragmatics: Realizing behavior
    Diplomatics: Evidential authenticity
    Dynamics: Persistence and evolution
SSEPDD

  • Slide 19

Object modeling
Dimensions
analysis
    Semantics: Meaning
    Syntactics: Symbolic expression
    Empirics: Physical representation
    Pragmatics: Realizing behavior
    Diplomatics: Evidential authenticity
    Dynamics: Persistence and evolution
SEPT

  • Slide 20

Object modeling
Dimensions
analysis
Semantics, Syntactics, Empirics, Pragmatics, Diplomatics, Dynamics
A subsidiary group or division of an extended family or clan
SEPT

  • Slide 21

Object modeling
Components
analysis
Semantics, Syntactics, Empirics, Pragmatics, Diplomatics, Dynamics
Diagram labels: carrier, message, behavior, encoding, inscribed, realized, expressed, describes, semantics, syntactics, empirics, object, pragmatics, verification, intervention, diplomatics, dynamics, annotation

  • Slide 22

Object typology
Types
analysis
    Empirics: Blob, bits (SSD)

  • Slide 23

Object typology
Types
analysis
    Empirics: Blob, bits (SSD)
    Syntactics (morphology): Artifact, identity: File

  • Slide 24

Object typology
Types
analysis
    Empirics: Blob, bits (SSD)
    Syntactics (morphology): Artifact, identity: File
    Syntactics (structure): Exemplar, type: .pptx file

  • Slide 25

Object typology
Types
analysis
    Empirics: Blob, bits (SSD)
    Syntactics (morphology): Artifact, identity: File
    Syntactics (structure): Exemplar, type: .pptx file
    Semantics: Product, description: Topical presentation

  • Slide 26

Object typology
Types
analysis
    Empirics: Blob, bits (SSD)
    Syntactics (morphology): Artifact, identity: File
    Syntactics (structure): Exemplar, type: .pptx file
    Semantics: Product, description: Topical presentation
    Pragmatics: Asset, behavior: Presentation (in PowerPoint)

  • Slide 27

Object typology
Types
analysis
    Empirics: Blob, bits (SSD)
    Syntactics (morphology): Artifact, identity: File
    Syntactics (structure): Exemplar, type: .pptx file
    Semantics: Product, description: Topical presentation
    Pragmatics: Asset, behavior: Presentation (in PowerPoint)
    Diplomatics: Record, verification: Presentation (really)

  • Slide 28

Object typology
Types
analysis
    Empirics: Blob, bits (SSD)
    Syntactics (morphology): Artifact, identity: File
    Syntactics (structure): Exemplar, type: .pptx file
    Semantics: Product, description: Topical presentation
    Pragmatics: Asset, behavior: Presentation (in PowerPoint)
    Diplomatics: Record, verification: Presentation (really)
    Dynamics: Heirloom, intervention: Presentation (tomorrow)

  • Slide 29

Object typology
analysis
Types: Blob; Artifact; Exemplar; Product; Asset; Record; Heirloom
Differentia
    Dimension: empirics; syntactics; semantics; pragmatics; diplomatics; dynamics
    Mode: formative; informative; performative; evaluative; reformative
    Act: inscription; identification; characterization; description; realization; verification; intervention
    Concern: media; (outer) encoding; (inner) encoding; meaning / affect; experience; authenticity; persistence
    Abstraction: carrier; form; structure; message; behavior; evidence; action
    Quality: existential; intentional; purposeful; interpretable; useful; trustworthy; resilient
    Utility: nascent; incipient; potential; theoretical; practical; assured; enduring
    Annotation: provenancial / administrative / permissive; provenancial / relational / associational; structural; intellectual; instrumental; provenancial

  • Slide 30

Modes of engagement
Continuum, not lifecycle
continuum
Lifecycle implies a prescribed progression through well-demarcated and distinguishable states
Continuum allows adaptive navigation among overlapping and interdependent activities
    Role: Creator, Curator, Consumer, Owner
    Locus: Production, Management, Exploitation
    Concern: Origination, Organization, Pluralization

  • Slide 31

Modes of engagement
continuum
Locus: Production
    Originate: observe, simulate, create, derive
    Organize: identify, classify, clean, annotate, package
    Pluralize: license, submit, publish, cite, aggregate
Locus: Management
    Originate: appraise, select, harvest, collect
    Organize: normalize, characterize, arrange, annotate, store, index, plan, watch, intervene, administer
    Pluralize: replicate, audit, notify, syndicate, resolve, authorize, report
Locus: Exploitation
    Originate: search, discover, retrieve, subselect
    Organize: analyze, correlate, synthesize, interpret, transform, annotate
    Pluralize: summarize, validate, assert, refute

  • Slide 32

Policy and strategy
Imperatives
rubric
    Predilect: Decide what you intend
    Collect: Obtain (or do) what you intend
    Protect: Preserve (or sustain) what you obtain
    Introspect: Know what you protect
    Project: Offer what you know
    Connect: Deliver what you offer
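The object typology built up on Slides 22 to 28 can be read as a cumulative data structure: each type adds one annotation on top of the previous one. The following is a minimal Python sketch of that reading, not something taken from the talk. The function name `classify`, the dictionary-based representation, and the rule that a type is reached only when every earlier annotation is also present are all assumptions made for illustration.

```python
from typing import Optional

# (type, dimension, annotation field), in the order the slides introduce them.
TYPOLOGY = [
    ("Blob", "empirics", "bits"),
    ("Artifact", "syntactics (morphology)", "identity"),
    ("Exemplar", "syntactics (structure)", "type"),
    ("Product", "semantics", "description"),
    ("Asset", "pragmatics", "behavior"),
    ("Record", "diplomatics", "verification"),
    ("Heirloom", "dynamics", "intervention"),
]


def classify(annotations: dict) -> Optional[str]:
    """Return the highest type reached (hypothetical helper).

    Assumption: types are strictly cumulative, so the walk stops at the
    first annotation that is missing or empty.
    """
    reached = None
    for type_name, _dimension, field in TYPOLOGY:
        if not annotations.get(field):
            break
        reached = type_name
    return reached


# Example values follow the slide deck's own running example.
deck = {
    "bits": "SSD",
    "identity": "File",
    "type": ".pptx file",
    "description": "Topical presentation",
}
print(classify(deck))  # Product
```

Adding a `behavior` entry to `deck` would move it up to Asset under the same assumption; an object with no recorded annotations is not classified at all.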

  • Slide 33

Policy and strategy
rubric
Imperative: Predilect
    Blob: service level agreement
    Artifact: disaster recovery / business continuity
    Exemplar: format action plans
    Product: collection development policy
    Asset: outreach and training
    Record: evidentiary standards
    Heirloom: sustainability / succession planning
Imperative: Collect
    Blob: annotation
    Artifact: packaging, submission
    Exemplar: normalization / canonicalization
    Product: workflow / tool integration
    Asset: code / workflow repositories, aggregation
    Record: chain of custody
    Heirloom: preservation planning
Imperative: Protect
    Blob: environmental control, redundancy, media refresh
    Artifact: administrative control, fixity audit, malware detection / sanitation
    Exemplar: technical control, migration
    Product: bibliographic control
    Asset: access control, emulation
    Record: archival control
    Heirloom: change control, preservation watch
Imperative: Introspect
    Blob: forensic characterization
    Artifact: morphological characterization, PID minting
    Exemplar: structural characterization, ontologies, format registries
    Product: intellectual characterization, entity extraction, sentiment analysis, PID binding
    Asset: behavioral characterization, software registries, analytics
    Record: archival characterization, master registry
    Heirloom: provenance, annotation
Imperative: Project
    Blob: media inventory
    Artifact: file inventory, PID resolution
    Exemplar: object index
    Product: work catalog
    Asset: transcoding, syndication, discovery
    Record: documentary form
    Heirloom: versioned change history
Imperative: Connect
    Blob: legacy / emulated computational environments
    Artifact: file delivery
    Exemplar: format-aware processing
    Product: disciplinary-specific processing
    Asset: search / browse, hosted tools, annotation
    Record: authenticity-dependent workflows
    Heirloom: consortial collaboration

  • Slide 34

Policy and strategy
rubric
Imperative: Predilect
    Blob: service level agreement
    Artifact: disaster recovery / business continuity
    Exemplar: format action plans
    Product: collection development policy
    Asset: outreach and training
    Record: evidentiary standards
    Heirloom: sustainability / succession planning
Imperative: Collect
    Blob: annotation
    Artifact: packaging, submission
    Exemplar: normalization / canonicalization
    Product: workflow / tool integration
    Asset: code / workflow repositories, aggregation
    Record: chain of custody
    Heirloom: preservation planning
Imperative: Protect
    Blob: environmental control, redundancy, media refresh
    Artifact: administrative control, fixity audit, malware detection / sanitation
    Exemplar: technical control, migration
    Product: bibliographic control
    Asset: access control, emulation
    Record: archival control
    Heirloom: change control, preservation watch
Imperative: Introspect
    Blob: forensic characterization
    Artifact: morphological characterization, PID minting
    Exemplar: structural characterization, ontologies, format registries
    Product: intellectual characterization, entity extraction, sentiment analysis, PID binding
    Asset: behavioral characterization, software registries, analytics
    Record: archival characterization, master registry
    Heirloom: provenance, annotation
Imperative: Project
    Blob: media inventory
    Artifact: file inventory, PID resolution
    Exemplar: object index
    Product: work catalog
    Asset: transcoding, syndication, discovery
    Record: documentary form
    Heirloom: versioned change history
Imperative: Connect
    Blob: legacy / emulated computational environments
    Artifact: file delivery
    Exemplar: format-aware processing
    Product: disciplinary-specific processing
    Asset: search / browse, hosted tools, annotation
    Record: authenticity-dependent workflows
    Heirloom: consortial collaboration

  • Slide 35

A domain model for digital curation
Next steps
sept
Respond to feedback
Continue development
Strategic planning for program and services
Use case analysis and requirements gathering for next generation repository

  • Slide 36

A domain model for digital curation
Summary
sept
Curation enables communication
    Objects carry messages, not meanings
    Consumer interpretation and experience are inherently subjective
Progress towards greater rigor in conceptualizing digital curation
    Terminology for expressing nuanced intentions, actions, and outcomes
    Object modeling concerns span six analytic dimensions
    Object typology of increasing utility
    Engagement entails a continuum of roles, activities, and concerns
    Rubric for strategic and policy imperatives

  • Slide 37

Thank you Stephen...