Data Cleansing and Matching
The Agenda
An Introduction to VisionWare
The Fundamental Elements of Data Cleansing and Matching
Case Study: Clackmannanshire Council
Open Discussion
The Benefits
About VisionWare plc
Public Sector Pedigree
- 60+ Established Public Sector Clients
- Customer References
- CRM/SSA/ICS/Citizen Account/Smart Card Initiatives
Rapid, Deep Integration
- Broad and Deep Integration Capability
- Back-end Legacy & Front-end CRM
- Systems/Applications/Data/Functions/Services
Thought Leadership
- Products: MultiVue/relate/E-Forms
- Non-Prescriptive Trusted Data
- Customisable Framework
Strategic Alliances
- Microsoft, ITNET, Parity, Solidsoft, Deloittes, Capita
A Selection of VisionWare Public Sector Customers
HEALTH TRUSTS
• Ayrshire & Arran
• Fife Primary Care
• Highlands Acute Hospitals
• NHS Highland
• Inverclyde Hospitals
• Yorkhill
• Lanarkshire Primary Care
• Fife Acute
• Peterborough
• Royal Wolverhampton
• West Suffolk Hospitals
• Weston Area Healthcare
• United Bristol
• Great Ormond Street Hospital
• North Cheshire
• Croydon PCT
LOCAL GOVERNMENT
• Aberdeen City
• East Renfrewshire
• Glasgow City
• Moray
• North Ayrshire
• North Lanarkshire
• West Lothian
• South Lanarkshire
• Renfrewshire
• West Dunbartonshire
• East Lothian
• Clackmannanshire
• Wansbeck
• Leicestershire
• Sutton
• Midlothian
• Merton
• Newham
• Croydon
• Luton
• Tower Hamlets
• North Tyneside
• Windsor & Maidenhead
• Wiltshire
• Blackburn with Darwen
• Calderdale
• East Sussex
• Inverclyde
• Cambridgeshire
• Bedfordshire Consortium
The Fundamental Elements of Data Cleansing and Matching
Evaluate the Quality and Quantity of Data
Cleanse the Data
Match the Data
Maintain and Synchronise the Data
The Operational Challenge
The use and administration of data within Public Sector organisations has grown:
- Electronic Service Delivery
- Modernising Government Initiatives
Each vertical departmental system stores demographic data and information relating to their functional area
- This creates silos of information across the organisation
We need to deliver services designed around the citizen, NOT around the departmental function
We must therefore Join-Up Data to deliver Joined-Up Services
What underpins these initiatives?
- Information Sharing
- Trusted Source of Unified Data
Evaluate the Quality and Quantity of Data
Identity information is held within each of the organisation’s line of business applications
Each identity will vary in terms of:
- Quality
- Accuracy
- Quantity
Need to be able to:
- Report on the variance of both data quality and data quantity across the departmental systems
- Match and rationalise the information
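As an illustration of the reporting step, the sketch below counts records per system (quantity) and the share of populated identity fields (a rough proxy for quality). The system names, field names and sample records are all hypothetical and are not taken from any VisionWare product.

```python
# Hypothetical extracts from two departmental systems; every value is invented.
SYSTEMS = {
    "council_tax": [
        {"forename": "Ann", "surname": "Smith", "dob": "1970-01-01", "postcode": "FK10 1AA"},
        {"forename": "Bob", "surname": "Jones", "dob": None, "postcode": "FK10 2BB"},
    ],
    "housing": [
        {"forename": "Ann", "surname": "Smith", "dob": None, "postcode": ""},
    ],
}
FIELDS = ("forename", "surname", "dob", "postcode")


def completeness_report(systems):
    """Return, per system, the record count and the fraction of records
    in which each identity field is populated."""
    report = {}
    for name, records in systems.items():
        filled = {field: 0 for field in FIELDS}
        for rec in records:
            for field in FIELDS:
                if rec.get(field):  # None and "" both count as missing
                    filled[field] += 1
        total = len(records)
        report[name] = {
            "records": total,
            "completeness": {
                field: (filled[field] / total if total else 0.0) for field in FIELDS
            },
        }
    return report


report = completeness_report(SYSTEMS)
for system, stats in report.items():
    print(system, stats["records"], stats["completeness"])
```

Completeness is only one facet of quality; accuracy would need comparison against a trusted source.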
Evaluate the Quality and Quantity of Data: On a National Scale
Scotland population: 5,057,400
Annual demographic change factors:
- 52,395 birth registrations
- 58,326 death registrations
- 30,651 marriages recorded
- 10,484 divorces recorded
- 125,000 annual address changes
- Unquantifiable job and circumstance changes
With these demographic changes on a yearly basis, how can we ensure the quality of our data?
[Chart: Births, Deaths, Marriages, Divorces, Address Changes, Unchanged]
• This represents over 5% of the population.
• In 2 years, at least 10% of data could be out of date
• In 5 years, at least 30% of data could be out of date
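The "over 5%" figure can be checked with simple arithmetic on the numbers quoted above. The sketch below assumes each recorded event affects one distinct person's record, which is a simplification (a marriage involves two people, and one person can have several events). It ignores the unquantifiable job and circumstance changes, so the slide's multi-year figures are not reproduced exactly.

```python
POPULATION = 5_057_400

# Annual event counts as quoted on the slide.
ANNUAL_EVENTS = {
    "births": 52_395,
    "deaths": 58_326,
    "marriages": 30_651,
    "divorces": 10_484,
    "address_changes": 125_000,
}


def annual_change_share(events, population):
    """Fraction of the population touched by a recorded change in one year,
    assuming one affected record per event."""
    return sum(events.values()) / population


def cumulative_share(events, population, years):
    """Naive linear accumulation over several years, capped at 100%."""
    return min(1.0, years * annual_change_share(events, population))


share = annual_change_share(ANNUAL_EVENTS, POPULATION)
print(f"Changes per year: {sum(ANNUAL_EVENTS.values()):,}")
print(f"Share of population per year: {share:.1%}")
print(f"After 2 years (linear): {cumulative_share(ANNUAL_EVENTS, POPULATION, 2):.1%}")
```

The listed events total 276,856 per year, or about 5.5% of the population.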
Evaluate the Quality and Quantity of Data: Modernising Government Initiatives
[Diagram: the SmartCard, SSA, ICS, CRM and LLPG datasets, with the labels Leisure, Library, Travel, School; Vulnerable, Elderly; Children, Carers; People, Property, Places, Spaces, Objects, Assets]
Each Initiative generates its own dataset
Existing LOB Applications do not participate and ADD to the problem
Between 150-250 LOB Applications containing Customer Data Elements
Data Cleansing and Matching
Integration of various identities will invariably lead to a series of data contentions:
- Multiple names, multiple addresses, inconsistent dates of birth, incorrect (false) demographic information and duplicate information
This needs to be resolved before we can provide a unified view of trusted data relating to either a person or property.
Need to be able to:
- Resolve contentious data issues
- Aggregate all non-contentious information
- Provide a composition that retains information of the highest quality and quantity by:
  - Matching the records
  - Merging the information
  - Managing the duplicated data
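A minimal sketch of the merge step, assuming the records have already been matched to one person: fields with a single distinct value are aggregated, and fields with conflicting values are set aside as contentions for resolution. The function name, field names and sample values are hypothetical.

```python
def merge_records(records, fields):
    """Aggregate non-contentious values and flag contentious ones.

    Returns (merged, contentions): `merged` maps each field with exactly one
    distinct non-empty value to that value; `contentions` maps each field
    with conflicting values to the list of candidates, in first-seen order.
    """
    merged, contentions = {}, {}
    for field in fields:
        values = []
        for rec in records:
            value = rec.get(field)
            if value and value not in values:
                values.append(value)
        if len(values) == 1:
            merged[field] = values[0]
        elif len(values) > 1:
            contentions[field] = values
    return merged, contentions


# Two invented records believed to describe the same person.
matched = [
    {"surname": "Pilaski", "dob": "1980-03-02", "postcode": "EH7 5AA"},
    {"surname": "Pilaski", "dob": "1980-02-03", "phone": "0131 000 0000"},
]
merged, contentions = merge_records(matched, ["surname", "dob", "postcode", "phone"])
print(merged)
print(contentions)
```

Here the two dates of birth disagree, so `dob` is reported as a contention rather than silently overwritten; how contentions are resolved (trusted source, most recent update, manual review) is a policy decision for the organisation.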
The Real Challenge: A Plethora of Systems, Silos of Information
[Diagram: NEWHAM APPLICATION SYSTEMS MAP (Geoff Connell, CICT Development, 11/09/2003), showing the line of business systems of CEX & Corporate, Registrar of BD&M, Leisure, Education, Environment, Housing and Social Services, with the live and planned interfaces between them]
Key
Interface type: 1. Payment Details, 2. Direct Debits, 3. Reconciliation, 4. Account Balances
Interface frequency: D. Daily, W. Weekly, M. Monthly, A. Ad-Hoc, O. On-Line, P. Pending
Identification of £1.5m of Benefit Fraud
580,000 records relating to 200,000 people: 3:1 Ratio of Duplicates
Data Matching: Look at it this way…
Antonia Marie Pilaski
Alias: Toni Pilaski, Marie Pilaski
Address: 33 2 Prince Regent Street EDINBURGH
Antonia Marie Pilaski
Alias: Toni Marie
Address: 45 Dunfermline Av EDINBURGH
Mark Baker
Address: 24 6 Montgomery Street EDINBURGH
Mark Ritchie
Address: 24 6 Montgomery Street EDINBURGH
Antonia Ritchie
Address: 24 6 Montgomery Street EDINBURGH
WHO THEN IS ANTONIA RITCHIE?
Lives with parents
Moves to own flat
Gets engaged
Fiancé changes name
Gets married & moves in with husband
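The example can be sketched in code to show why exact matching alone does not answer the question. The records below are those from the slide; the `id` values and function names are hypothetical. Grouping on exact name links the two Pilaski records but leaves Antonia Ritchie on her own, and a weaker forename search only produces candidates that need further evidence (for instance a marriage registration) before they can be trusted.

```python
from collections import defaultdict

RECORDS = [
    {"id": 1, "name": "Antonia Marie Pilaski",
     "address": "33 2 Prince Regent Street EDINBURGH"},
    {"id": 2, "name": "Antonia Marie Pilaski",
     "address": "45 Dunfermline Av EDINBURGH"},
    {"id": 3, "name": "Mark Baker", "address": "24 6 Montgomery Street EDINBURGH"},
    {"id": 4, "name": "Mark Ritchie", "address": "24 6 Montgomery Street EDINBURGH"},
    {"id": 5, "name": "Antonia Ritchie", "address": "24 6 Montgomery Street EDINBURGH"},
]


def group_by_exact_name(records):
    """Deterministic match: records are linked only if the full name is identical."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["name"].lower()].append(rec["id"])
    return dict(groups)


def forename_candidates(target, records):
    """Weaker match: ids of other records sharing the target's first given name."""
    forename = target["name"].split()[0].lower()
    return [
        rec["id"]
        for rec in records
        if rec["id"] != target["id"] and rec["name"].split()[0].lower() == forename
    ]


groups = group_by_exact_name(RECORDS)
candidates = forename_candidates(RECORDS[4], RECORDS)
print(groups)
print("Candidates for Antonia Ritchie:", candidates)
```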
Maintenance and Synchronisation
The maintenance of identity information is:
- Time consuming
- An inefficient manual process
Potential risk involved in latency of updates
Possible inconsistencies within the datasets
Need to be able to:
- Implement a mechanism that enables information to be passed and shared between the departmental systems
- Notify each connected application of any validated changes
The Benefits
- Ensures a consistent view of an individual
- Level of data latency can be controlled
- The risk of utilising redundant information is managed
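One way to picture such a mechanism is a small publish/subscribe hub: a change is validated once, then passed to every connected system except the one that raised it. This is a sketch under stated assumptions, not a description of how MultiVue works; `ChangeHub`, the validation rule and the system names are all hypothetical.

```python
from typing import Callable, Dict, List

Change = Dict[str, str]


class ChangeHub:
    """Passes validated changes to every subscribed system except the source."""

    def __init__(self, validate: Callable[[Change], bool]) -> None:
        self._validate = validate
        self._subscribers: Dict[str, Callable[[Change], None]] = {}

    def subscribe(self, system: str, handler: Callable[[Change], None]) -> None:
        self._subscribers[system] = handler

    def publish(self, source: str, change: Change) -> List[str]:
        """Return the names of the systems notified (empty if validation fails)."""
        if not self._validate(change):
            return []
        notified = []
        for system, handler in self._subscribers.items():
            if system != source:
                handler(change)
                notified.append(system)
        return notified


# Each system's "inbox" is just a list here; a real handler would update the system.
received = {"housing": [], "social_work": [], "council_tax": []}
hub = ChangeHub(validate=lambda c: bool(c.get("person_id")) and bool(c.get("new_address")))
for name, inbox in received.items():
    hub.subscribe(name, inbox.append)

notified = hub.publish(
    "council_tax",
    {"person_id": "P1", "new_address": "24 6 Montgomery Street EDINBURGH"},
)
rejected = hub.publish("housing", {"person_id": "", "new_address": "unknown"})
print(notified, rejected)
```

Delivering synchronously, as here, keeps latency minimal; queuing changes for batch delivery would be the alternative where a controlled delay is acceptable.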
The Fundamental Elements of Data Cleansing and Matching
[Diagram: Customer Contact Channels (One Number, Contact Centre); Community Relationship Management (CRM) System (Front End Integration); Middleware (Back End Systems Integration); Shared Infrastructure with Mediated Access; Synchronisation; Portals of Functionality; Trusted Data Scaling Across the Line of Business Applications; Data Volumes and Data Value. Connected systems: Council Tax System, Housing System, Planning System, Social Work System, Education, Central Government, Health Systems, Benefits Agency, Police]
Case Study Presentation
Brian Forbes
Modernising Government Strategy Manager
Clackmannanshire Council
Open Discussion
VisionWare plc
Willie Clinton, Director
Campbell McNeill, Consultant
Clackmannanshire Council
Brian Forbes, Modernising Government Strategy Manager
Alexis Easton, Head of IT Services
Topics for Discussion
- How do you change the culture to ensure that staff maintain quality data?
- How do you measure data quality?
- Do we need to define national standards for the Public Sector?
- What are the difficulties matching citizen data with limited information?
- What resources are required to match data?
How do you change the culture to ensure that staff maintain quality data?
The Structure of the Organisation:
- Silos of information exist across departmental systems
- Each departmental system holds demographic information about entities (person, property, assets)
- Should each department manage its own data?
- Should the organisation have a corporate-wide strategy?
- Should we consider a Centralised Repository of Information, for example, The Citizen Account?
- Data quality has to be improved by changing business processes and working practices
How do you measure data quality?
Data Quality
- Data Quality = How accurate is the information?
- Data Latency = How up-to-date is the information?
- Data Quantity = multiple systems, silos of information
- Other areas to consider
Information Audit
Technology
Public Enquiry, at worst
- Some systems have more valuable data than others
- How can these systems support the “weaker systems”?
Do we need to define data standards?
We have existing standards:
- eGIF
- Citizen Account Dataset
- BS8766 (Name)
- BS7666 (Addressing)
- BS7799 (Security)
- Data Protection
How do you stop standards from stifling innovation or impacting on, for example, Data Protection?
What are the difficulties matching citizen data with limited information?
Limited Information
- Does the organisation know what information it holds?
- Are forename, surname and address a limited dataset?
- Does limited data come from the imposition of Data Protection and Information Sharing?
Leverage the best of what we have
- The process has got to be evolutionary, not revolutionary
- Dependent upon the Quality, Quantity and Latency of Information
What resources are required to match data?
People and Technology
Manual Process
- Build a level of trust in the data
Automatic Process
- Probabilistic matching to deterministic matching
- Parameters set by the organisation
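The sketch below illustrates how organisation-set parameters might drive automatic matching. It uses a simplified weighted string-similarity score as a stand-in for a full probabilistic model; the weights, thresholds, field names and sample records are all assumptions chosen for illustration. Pairs scoring above the upper threshold are accepted automatically, pairs in the middle band go to manual review, and the rest are rejected. Deterministic matching corresponds to requiring exact equality on every field.

```python
from difflib import SequenceMatcher

# Illustrative parameters that an organisation would tune for its own data.
WEIGHTS = {"forename": 0.2, "surname": 0.3, "dob": 0.3, "postcode": 0.2}
AUTO_MATCH = 0.90  # at or above this score: accept automatically
REVIEW = 0.70      # at or above this score: refer for manual review


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]; missing values score 0."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec1, rec2):
    """Weighted sum of per-field similarities."""
    return sum(
        weight * similarity(rec1.get(field, ""), rec2.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def classify(rec1, rec2):
    score = match_score(rec1, rec2)
    if score >= AUTO_MATCH:
        return "match"
    if score >= REVIEW:
        return "review"
    return "no match"


# Invented sample records.
ann = {"forename": "Ann", "surname": "Smith", "dob": "1970-01-01", "postcode": "FK10 1AA"}
ann_as_bob = dict(ann, forename="Bob")  # same person details, different forename
stranger = {"forename": "Bob", "surname": "Jones", "dob": "1985-12-30", "postcode": "EH7 5QQ"}

print(classify(ann, ann), classify(ann, ann_as_bob), classify(ann, stranger))
```

Lowering `REVIEW` sends more pairs to manual checking, which fits the slide's point that the manual process is how trust in the data is built before more is automated.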
The Benefits
Data Management
Trusted Data Source
Joined-Up Services
Multi-Agency Working
Some Case Study Examples
Columns: Customer, CRM, Citizen Account, Data Management, Master Address Database, Children, Elderly
Clackmannanshire
East Lothian
Fife Constabulary
Glasgow City Council
Inverclyde
Midlothian
North Ayrshire
Renfrewshire
South Lanarkshire
West Dunbartonshire
West Lothian
MultiVue is the Key to Joined-Up Data
VisionWare specialises in the provision of trusted data with MultiVue Identification Server, an enterprise-wide data integration tool.
Public Sector departments and multiple agencies can now share accurate and reliable information on every citizen.
Thank you!
Willie Clinton
Director
VisionWare plc
0141 285 7150
www.visionwareplc.com