1 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/Lh4zI6
British Library Labs
What is British Library Labs and what have we learned over the last four years?
1305 – 1400 and 1500 - 1530, 3 April 2017
Learning the Lessons of working with the British Library’s Digital Content and Data for your research
University of Wolverhampton
https://goo.gl/Lh4zI6
Mahendra Mahey, Manager of British Library Labs
@BL_Labs and @[email protected]
It’s all about you…jobs
It’s all about you…subjects
Please complete / correct sheet that is going round
The British Library
St Pancras, London, UK
Legal Deposit Library – Reference only
Inside the British Library: space for 1200 readers, around 400,000 visitors per year
Many books are stored 4 storeys below the building
Document Supply and Storage at Boston Spa
Uses low oxygen and robots
Reading room and delivery to London
Stockton-on-Tees
Authors’ right to payment each time their books are borrowed from public libraries.
Living Knowledge Vision (2015 – 2023)
Custodianship Research Business
Culture Learning International
To make our intellectual heritage accessible to everyone, for research, inspiration and enjoyment and be the most open, creative
and innovative institution of its kind by 2023.
Document: http://goo.gl/h41wW7 Speech: https://goo.gl/Py9uHK
Roly Keating (Chief Executive Officer of the British Library)
Collections – not just books!
> 180* million items
> 0.8* m serial titles
> 8* m stamps
> 14* m books
> 3* m sound recordings
> 4* m maps
> 1.6* m musical scores
> 0.3* m manuscripts
> 60* m patents
King’s Library
*Estimates
http://www.bl.uk/projects/british-library-labs
Funded by the Andrew W. Mellon Foundation
Wider…not just Researchers
Researchers https://goo.gl/WutNyi
Artists http://goo.gl/nNKhQ2
Librarians
Curators
https://goo.gl/9NWZUW
Software Developers https://goo.gl/7QQ5Tf
Archivists https://goo.gl/x7b4tg
Educators
https://goo.gl/qh01Mi
Digital research methods
Visualisations
Application Programming Interfaces for datasets e.g. Metadata, Images
Annotation
Location based searching & Geo-tagging
Crowdsourcing
Human Computation
How are we doing this?
Competition: Tell us your ideas of what to do with our digital content
Awards: Show us what you have already done with our digital content in research, artistic, commercial and learning and teaching categories
Projects: Talk to us about working on collaborative projects
Why are we doing this?
• Working closely with and listening to those who want to use our digital collections and data for their work
• We can learn how we are and should be supporting them:
– Access to digital collections?
– Advice, guidance, technical support, training
– Services, Tools and Processes?
– Many more reasons…
• Where are the gaps between what users want and what we can give?
• How do we build the bridges to overcome the gaps?
Born digital
Data all around us!
Knowledge Quarter London: 55 knowledge organisations within 1 mile radius of Kings Cross, http://www.knowledgequarter.london
https://goo.gl/pGO7QY
Digitisation
1-2%* digitised
* estimate
Partnerships: Commercial & Other Organisations
Amount increasing rapidly
Bias in digitisation
Sample Generator http://goo.gl/bR9UJL
Have you got X?
https://upload.wikimedia.org/wikipedia/commons/5/50/Real_wuerzburg.jpg
Looking for Physical Content in the British Library
Have you got X digitised?
http://www.yorkmix.com/wp-content/uploads/2014/04/mr-simms-sweet-shoppe-york.jpg
Looking for Digitised Content in the BL
So little digitised?
• Digitisation costs time and resources…
• Still…over 650 Digital Collections but not all found through Google or even online
• Dialogue is either:
– you are ‘lucky’ and we have the digital content relevant to your research
– we don’t have exactly what you’re looking for, but is there anything of interest? Let’s talk…
• Artists find this dialogue easier and we tend to attract researchers with ‘fuzzier’ research boundaries
• Access easier for openly licensed content
• More challenging for on-site and in-copyright contemporary material
Challenges of access to Digital Collections
British Library
online and open
online behind paywall
only in Reading Rooms due to ©
only on site due to © or ethical etc
not online / available – various storage devices, personal data
The Story of the Digital Collection…
Digital Collection
Curator
Who paid for the digitisation?
Who did the digitisation?
Technology used
Born digital?
Published
Unpublished
Where is it?
Can it still be accessed?
Generates income
Reputational Risk
Legalities
Political
Ego
Surprises
Metadata
Old format not supported
What media was the digitisation done from?
Documentation
No Metadata
Messy Metadata
Still there?
Good to know the background of a Digital collection if you want to use it for research…
Open Licensed Digital Content?
15%* Openly Licensed
85%* Available onsite
Around 10%* available online
Working through
Breakdown by collection*
Manuscripts 59%
Books 9%
Maps and Views 7%
Newspapers 3%
Archives and Records 3%
Paintings, Prints and Drawings 2%
*Based on digitisation projects
Largest proportion of funding: Public / Private Partnership
*Estimates
How do we give access to onsite-only Digital Collections (85% of our Digital Collections)?
READING ROOM
ON SITE
NOT ONLINE
OPEN
British Library
£
Labs Residency Model
Challenges of access to Digital Collections
Accessing digital collections onsite
OPEN £
• Have to be ‘onsite’
• Need to be security cleared for some collections
– Hence ‘Researcher in Residence Model’
• Permission required (depending on ‘story’ of collection)
• Content on various media formats
• 20% re-use of material for non-commercial research for some collections
• We are learning ‘pathways’ so that providing onsite access becomes ‘everyday’ in the future
Digital collections and Datasets
Playbills, Books, Newspapers (includes OCR)
British National Bibliography
http://bnb.data.bl.uk
http://sounds.bl.uk
http://dml.city.ac.uk/
Music (Recordings & Sheet) & Sounds
http://goo.gl/frSMJt
Broadcast News (TV and Radio)
http://goo.gl/cwThHw
http://goo.gl/pBkisZ
http://goo.gl/E8aRyQ
Usage data
EThOS
Images, Manuscripts & Maps
Qatar Digital Library http://www.qdl.qa/
International Dunhuang Project http://idp.bl.uk/
Maps http://www.bl.uk/maps/
Hebrew Manuscripts http://goo.gl/4sbCp9
Flickr & Wikimedia Commons
https://goo.gl/LZRmaZ
Open Cultural Heritage Datasets
Collection Guides
Datasets about our collections: Bibliographic datasets relating to our published and archival holdings
Datasets for content mining: Content suitable for use in text and data mining research
Datasets for image analysis: Image collections suitable for large-scale image-analysis-based research
Datasets from UK Web Archive: Data and API services available for accessing UK Web Archive
Digital mapping: Geospatial data, cartographic applications, digital aerial photography and scanned historic map materials
https://data.bl.uk
Discussion list: http://www.jiscmail.ac.uk/CULTURAL-HERITAGE-DATASETS
What did people actually do?
Typical pattern of research for Labs
• Finding invisible things in ‘messy’ historical data
• Unearthing / unlocking hidden histories and data to stimulate new research
• Celebrating hidden histories / data creatively through events, art and performance
Finding things in messy OCR text
Mrs Folly
• Clean up some manually
• Get human ‘ground truth’
• Write code to find things reliably in it automatically
• Try code on messy content
• Tweak if necessary
• Digital ‘lasso’ around content
• Human sift through
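The workflow above (write code to find things, try it on messy content, check against human ‘ground truth’, tweak) can be sketched roughly as follows. This is an illustration only, not the code used by Labs: the sample lines are invented, the fuzzy matching uses Python's standard `difflib`, and the 0.8 threshold is an assumed starting point that would be tweaked against the ground truth.

```python
from difflib import SequenceMatcher


def fuzzy_find(target, text, threshold=0.8):
    """Return (word_index, matched_text, score) for word windows resembling target."""
    target = target.lower()
    words = text.split()
    n = len(target.split())
    hits = []
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        score = SequenceMatcher(None, target, window.lower()).ratio()
        if score >= threshold:
            hits.append((i, window, round(score, 2)))
    return hits


def precision_recall(found, truth):
    """Compare automatically found items with a human-checked ground truth set."""
    found, truth = set(found), set(truth)
    true_pos = len(found & truth)
    precision = true_pos / len(found) if found else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


# Invented lines standing in for messy OCR text (not real newspaper data)
ocr_lines = [
    "the part of Mrs Fo1ly was played with great spirit",
    "a folly of the age, as the preacher said",
    "Mrs. Fol1y took her benefit on Tuesday",
    "prices of corn at the market this week",
]
# Line numbers a human reader marked as genuinely mentioning Mrs Folly
ground_truth = {0, 2}

# The digital 'lasso': every line with at least one fuzzy hit
found = {i for i, line in enumerate(ocr_lines) if fuzzy_find("Mrs Folly", line)}
precision, recall = precision_recall(found, ground_truth)
print(found, precision, recall)
```

Lowering the threshold widens the ‘lasso’ and leaves more for the human sift; raising it misses more badly garbled mentions.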
Code: Machine Learning / Reading
• Analogies to how humans read / learn
• Machines acquire ‘knowledge’ / data and use that knowledge / data to make sense / identify patterns
• Labs doing this on a case by case basis so methods can vary
• Need computational AND human effort
• Legalities of this process being ‘ironed’ out with publishers
• Often a misunderstood area…
• Computers look for ‘patterns’ or the ‘essence’ of something
Smell of soup & Machine Learning
Thanks to Memo Akten (@memotv on twitter) for the inspiration!
https://goo.gl/toq4Bo
Nasreddin, 13th Century Turkish Sufi
http://web2.uvcs.uvic.ca/elc/studyzone/330/reading/smell1.htm
Victorian Meme Machine (2014)
https://goo.gl/HMqDt3
Bob Nicholson
http://victorianhumour.tumblr.com/
Bob Nicholson interviewed on BBC Radio 4 Making History Programme: http://goo.gl/fmV9ep
And telling jokes to the public: http://goo.gl/xIDRhz
Bob obtained further funding from his university
Looking for more collaborations
https://www.youtube.com/watch?v=-GRgj7Q5OM0
Rob Walker, Victorian Mother-in-law Jokes
Victorian Comedy Night, 7 Nov 2016
Katrina Navickas (2015) Political Meetings Mapper
http://politicalmeetingsmapper.co.uk
https://goo.gl/Qq78Oa
Labs Symposium 2015
https://goo.gl/BSA3be
Interview 2015
The Chartist Newspaper http://goo.gl/vOLSnH
Chartist Monster Meeting
Chartists Walking Tour and Re-enactment London
Black Abolitionist Performances & their Presence in Britain (2016) – Hannah-Rose Murray
Frederick Douglass
Ellen Craft
Josiah Henson
Ida B Wells
A Performance by Joe Williams & Martelle Edinborough
http://frederickdouglassinbritain.com/
Data-mining verse in 18th Century newspapers
BL Labs Project 16-17, Jennifer Batt
https://goo.gl/5Akthd
Slides courtesy Jennifer Batt
Jennifer Batt @ the BL on World Poetry Day
What thoj' among ourrelves, with too much Heat, or t W: fweutimes.wongle, wvhen we Ihould debate, W – (A confequential Ill which Freedom drawvs, fl t A bad Efficf, but from a noble Caufe) t We can with univeifal Zcal advance, to To cutb the faithlefs Arrogancccof V rance. hi
Dublin Journal 10-14 September, 1745
Slides courtesy Jennifer Batt
Verse: 81% lines begin with initial capital
Prose: 52% lines begin with initial capital
Westminster Journal 3 March 1745
Slides courtesy Jennifer Batt
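The verse / prose difference in line-initial capitals suggests a simple heuristic, sketched below. This is an assumption-laden illustration rather than the project's actual method: the sample lines are invented, the 0.7 threshold is just a value chosen between the 52% and 81% figures, and the function looks at the first alphabetic character so that leading OCR noise (stray punctuation or digits) does not hide a capital.

```python
def initial_capital_ratio(lines):
    """Fraction of lines whose first alphabetic character is upper case."""
    considered = 0
    capitalised = 0
    for line in lines:
        first = next((ch for ch in line if ch.isalpha()), None)
        if first is None:
            continue  # skip blank lines and lines with no letters at all
        considered += 1
        if first.isupper():
            capitalised += 1
    return capitalised / considered if considered else 0.0


def looks_like_verse(lines, threshold=0.7):
    """Flag a block of lines as candidate verse (threshold is an assumed value)."""
    return initial_capital_ratio(lines) >= threshold


# Invented samples, not taken from the newspapers
verse_sample = [
    "The morning bells ring out across the town,",
    "And ships come slowly up the tide,",
    "while gulls wheel over mast and spar,",
    "The market wakes on either side.",
]
prose_sample = [
    "Yesterday the committee met at the usual",
    "hour and resolved that the accounts be",
    "printed. The treasurer reported a small",
    "surplus for the quarter.",
]

print(initial_capital_ratio(verse_sample), looks_like_verse(verse_sample))
print(initial_capital_ratio(prose_sample), looks_like_verse(prose_sample))
```

A flag like this only narrows the search; candidate blocks would still need a human read, since headings and lists of names also start with capitals.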
Use of Overproof / OCR Correction?
Re-OCR with ABBYY FineReader?
https://www.abbyy.com/en-gb/
http://overproof.projectcomputing.com/
Virtual Infrastructure for OCR text
OCR text scraped from digitised newspapers and stored in the cloud
Jupyter notebook: write Python code and see results in browser
http://jupyter.org
Access available for researchers ‘in residence’
Other experiments with images
Creative uses of images
Mechanical Curator
http://goo.gl/qPPgxX
Snipping out images from 65,000 Digitised Books*
Work @ BL by Ben O’Steen, Labs and Digital Research Team
*Matt Prior - http://goo.gl/j29Tnx
Tumblr
http://mechanicalcurator.tumblr.com
Posts image every 30 minutes
Flickr
http://www.flickr.com/photos/britishlibrary/
Since Dec 2013
1,020,418 images need tagging!
>600,000,000 views
>20,000,000 tags
https://goo.gl/FgZ4HM
Face recognition
Worked better for female faces than men’s
Press
Using other platforms to host BL collections
Links back to Library & community engagement
British Library Flickr Commons Tags
You can purchase a ‘High Res’ Copy
View in the Library Item Viewer
Download .pdf
All illustrations in book
Other illustrations in books published in same year
View the item in the Library Catalogue
Tags auto generated
User generated Tag
Grouping for image
Same on Wikimedia commons
Tagging, Tagging, Tagging…
Tagging a million images
Iterative Crowdsourcing
http://goo.gl/j6fxac
Cardiff University’s Lost Visions Project
http://www.metadatagames.org/
Metadata Games
Top British Library Flickr Commons Taggers
James Heald
Mario Klingemann
Chico 45
Use computational methods
Human Tagger
Special Jury’s Prize (2015)
James Heald – Wikimedia and Map work
https://goo.gl/WYZCB2
http://goo.gl/HNQq5e
https://goo.gl/VPgffL
https://commons.wikimedia.org/
https://goo.gl/djtm1b
Labs Symposium (2015)
Geotagging maps
54,000 Maps found in Flickr 1 million
Human & Computational Tagging & Community engagement
Adam Crymble (2015)
Crowdsource Arcade
What if crowd sourcing looked like this?
http://goo.gl/LBfJ4W
http://goo.gl/OH9pOZ
https://goo.gl/7z0j8p
30 mins talk, Labs Symposium (2015)
https://goo.gl/SSRsdd
5 min interview (2015)
http://goo.gl/0APpE8
Game Jam
Using Arcade Games to help Tag images
SherlockNet: Competition Winner 2016
Karen Wang, Luda Zhao and Brian Do
Using Convolutional Neural Networks to Automatically Tag and Caption the British Library Flickr Commons 1 million Image Collection
bit.ly/sherlocknet
12 categories
>20 million tags added
>100,000 captions
Pooled surrounding OCR text on page from similar images
Used Microsoft COCO (photographs) & British Museum Prints and Drawings collections as training sets.
Tags Captions
Artistic / Creative Works
http://goo.gl/dM8ieA
Mario Klingemann (2015)
https://www.youtube.com/watch?v=Q3SBxO34Zlc
David Normal 2014 and 2015
http://goo.gl/bNxGZZ
Kris Hoffman (2016)
https://goo.gl/QilqqT
Jiayi Chong 2016
Ling Low 2016
https://www.youtube.com/watch?v=bcOP1E5bRE0
https://www.facebook.com/RealmlandStory/
Paul Rand Pierce 2016
A Hat on the Ground Spells Trouble
Tragic Looking Women
44 Men who Look 44
(Notice the direction of the faces)
Mario Klingemann 2016
https://www.youtube.com/watch?v=xgnxnmqnR7Y
Google Arts and Culture Lab – Experiments with Machine Learning
https://artsexperiments.withgoogle.com/
Imaginary Cities – BL Labs Project 16-17
Michael Takeo Magruder
https://goo.gl/4ARwTy
An artistic exploration seeking to create provocative fictional cityscapes for the Information Age from the British Library’s digital collection of historic urban maps
Lessons Learned & Challenges… (1)
• Start with a conversation (external and internal): our data isn’t all on Google (yet!) and is not easy to find, so we need to create and embrace serendipity and opportunities for use by talking!
• Sometimes need to have several conversations with several stakeholders and tap into their tacit knowledge, which isn’t always written down, to progress ideas.
• Often misunderstandings because of jargon & different meanings of words.
• Learn the story of the collection
• Expectations change when researchers actually see the data, systems and experience the ‘culture’ of the organisation.
• Opening collections requires some people to let go of their emotional and psychological connection to them
Lessons Learned & Challenges… (2)
• Embrace dirty data, it may never be perfect!
• We tend to work with researchers who can be ‘flexible’ with their research questions and are willing to embrace challenges.
• Many researchers have the domain knowledge but lack the technical / digital skills to use Digital Research methods. Should they be teamed up with those that want to solve problems (computer science) or get trained?
• Identifying / bridging gaps for researchers to use data, help them ‘navigate’ through the Library to get the data they want (sometimes).
• Huge appetite to use digital content & data (e.g. Flickr Commons stats).
• Stimulate the imagination, work fast, give it energy
Labs mindset…
1. Start a conversation and try to support ideas
2. Start with small experiments, but think big!
3. Fail faster (don’t be afraid)
4. Reject perfectionism
5. Good enough is sometimes Good enough
6. Celebrate the uses of collections
The Magic of Openness!
• If digitised / digital collections are not used, what is the point of digitising / keeping them?
• Opening up our digital collections offers new ways for the Library’s content to be remixed and re-imagined
• Opening up our digital collections ‘re-energises’ them and the Library
• Generates plenty of examples to inspire use by others
The Future of BL Labs
• Continue to engage with researchers, learn what they want to do and collect evidence of demand
• Develop Business Model and Support process to make ‘Business as Usual’ at the British Library
• Help to create pathway to developing a Digital Research Suite at the British Library
Taking a peek at our Open Data
A digitised book…
002819694
Optical Character Recognition (OCR) generated text
Scanned Page
Image on Flickr Commons
https://goo.gl/AC43vs
OCR XML generated by ABBYY FineReader
Taking a peek at our on-site only accessible data
A digitised newspaper
Accessing digitised newspapers onsite at the BL
1
Windows 7
External access possible through Citrix Server
Results of digitisation exist on Windows file shares!
Accessing digitised newspapers onsite at the BL (JISC 1)
2
12 Volumes, each with terabytes of data
Accessing digitised newspapers onsite at the BL (screenshots 3 to 12, no captions)
Accessing digitised newspapers onsite at the BL (screenshots 13 to 15c, with captions):
13: Accessing original ‘master’ image (not cropped or post processed), or ‘service’ copy (post processed) and results of OCR available as ALTO XML
14a: Accessing original ‘master’ image (not cropped or post processed) in .TIFF format
14b: Accessing original ‘master’ image (not cropped or post processed)
15a: Accessing ‘service’ copy (post processed) and results of OCR available as ALTO XML
15b: Accessing ‘service’ copy (post processed)
15c: Accessing OCR as ALTO XML
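For readers who have not met ALTO XML, the sketch below shows one way to pull plain text out of it with Python's standard library. It relies only on the general ALTO layout (`TextLine` elements containing `String` elements with a `CONTENT` attribute and an optional `WC` word-confidence attribute). The sample document is invented, and whether the Library's own files carry `WC` values is an assumption that would need checking onsite.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return one string per TextLine, joining the CONTENT of its String elements."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [child.get("CONTENT", "") for child in element
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return lines


def low_confidence_words(xml_text, threshold=0.5):
    """Return words whose WC (word confidence) is below threshold, if WC is present."""
    root = ET.fromstring(xml_text)
    words = []
    for element in root.iter():
        if local_name(element.tag) == "String" and element.get("WC") is not None:
            if float(element.get("WC")) < threshold:
                words.append(element.get("CONTENT", ""))
    return words


# Invented sample, shaped like ALTO but not taken from a real newspaper file
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Dublin" WC="0.95"/><SP/><String CONTENT="Journal" WC="0.91"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Septemher" WC="0.42"/><SP/><String CONTENT="1745" WC="0.88"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

print(alto_lines(SAMPLE))
print(low_confidence_words(SAMPLE))
```

Listing low-confidence words is one quick way to judge how ‘messy’ a page is before deciding whether re-OCR or manual correction is worth the effort.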
Accessing digitised newspapers through Gale Interface (subscription): screenshots 1 and 2
It’s all about you…
Please complete / correct sheet that is going round
Explore or Imagine Our Data!
Smaller datasets
• CSV of Metadata https://data.bl.uk/digbks/dig19cbooks-mdata-csv.csv
• 19th Century Books - Book Metadata - 01/09/2013. https://data.bl.uk/digbks/db21.html
• Digitised Books - Flickr Tag History - Dec 2013 to March 2016. TSV https://data.bl.uk/digbks/db15.html
• Digitised Hebrew Manuscripts - Metadata https://data.bl.uk/hebrewmanuscripts/heb1.html
• Digitised Hebrew Manuscripts: Or 2210 - Or 2364 https://data.bl.uk/hebrewmanuscripts/heb8.html
• Theatrical playbills from Britain and Ireland (OCR text only) https://data.bl.uk/playbills/pb2.html
• Portraits of actors, views of theatres and playbills (covering 1750 - 1821 in a single volume) https://data.bl.uk/singlesheet/por1.html
• Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements. 1660-1840. https://data.bl.uk/singlesheet/ad1.html
https://data.bl.uk
• Have a look at the data.
• Data Quality
• Issues
Or an idea you have thought of what to do with the data!
http://labs.bl.uk/Ideas+for+Labs
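One quick way to ‘have a look at the data’ and its quality is to count the empty cells in each column of a metadata CSV. The sketch below uses only the standard library and an invented sample; the column names (`identifier`, `title`, `author`, `date`) are hypothetical and may not match the headers in the files on data.bl.uk.

```python
import csv
import io


def empty_cell_counts(csv_text):
    """Return (row_count, {column: number_of_empty_cells}) for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            if not (row.get(name) or "").strip():
                counts[name] += 1
    return rows, counts


# Invented sample with hypothetical column names
SAMPLE = """identifier,title,author,date
000000001,A Tour in Wales,,1810
000000002,Poems,J. Smith,
000000003,,,[1850?]
"""

rows, counts = empty_cell_counts(SAMPLE)
print(rows, counts)
```

For a downloaded file, the same function could be called on the result of `open(path, newline="", encoding="utf-8").read()`, where the path and encoding are again assumptions. Counts like these only show what is missing; uncertain values such as a bracketed or queried date need a closer look.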
Contact us
Mahendra Mahey
Manager of BL Labs
[email protected]@bl.uk