High Level Requirements Specification

Commonwealth Big Open Data (BOD)

353-2016


1. Background
1.1. Commonwealth Secretariat
1.2. IT Services
2. Overview
3. Key Business Requirements
4. Methodology
5. Functionality
5.1. Key Functional Requirements
5.2. Functional Component Requirements
6. Services to which the cloud environment can facilitate access
6.1. Internal authentication
6.2. Open Data interface
6.3. Enterprise Search Functionality
6.4. Taxonomies
7. Service Quality and Delivery
7.1. Hosting and Availability
7.2. Reliability
7.3. Performance
7.4. Maintainability
7.5. Scalability
7.6. Backup & Recovery
7.7. Client Technology
7.8. Secure Architecture
7.9. Maintenance & Support
8. Standards & Compliance
9. Contractor/Vendor Profile


1. Background

1.1. COMMONWEALTH SECRETARIAT

The Commonwealth Secretariat provides guidance on policy making, technical assistance and

advisory services to Commonwealth member countries. We support governments to help achieve

sustainable, inclusive and equitable development.

Our work promotes democracy, rule of law, human rights, good governance and social and economic

development. We are a voice for small states and a champion for youth empowerment.

Priority areas of work are agreed at Commonwealth Heads of Government Meetings, which occur

every two years. The next summit is in the United Kingdom in 2018.

Our vision is to help create and sustain a Commonwealth that is mutually respectful, resilient,

peaceful and prosperous and that cherishes equality, diversity and shared values.

1.2. IT SERVICES

IT Services (ITS) works to support the Commonwealth Secretariat to achieve its strategic goals by

providing and maintaining high quality and reliable IT services and their enabling infrastructure. This

allows staff to securely access and manage their information and control its dissemination to other

interested parties and stakeholders.

The IT Services Section is currently implementing a Transformation program to:

• Enable the delivery of IT Solutions as a Service allowing flexibility in the rapid deployment, usage and enhancement of systems

• Enhance technology solutions to provide the organisation with the tools for embarking on industry leading initiatives

• Deliver desired business agility and enhanced service levels through new IT service models

• Deploy an environment supported by enterprise tools for the provision of organisation-wide knowledge sharing and collaboration

• Enable technology and solutions to contribute to business transformation

The Big Open Data initiative forms a part of this Transformation program.


2. Overview

The Commonwealth Secretariat has embarked upon an enterprise-wide transformation program

which seeks to leverage the power of current state-of-the-art technologies to realise the goals

outlined in The Commonwealth Strategic Plan 2013/14 – 2016/17. A major enabler to realising these

goals is the establishment of an open data platform which would greatly simplify the collation,

analysis and publication of structured data that is produced across our organisational departments.

This would increase insight into the work done within the Secretariat and bring significant value to

the corpus of work produced by the Secretariat.

Ultimately such a platform should enable the Secretariat to:

• Build trust and engage with its stakeholders globally

• Become a Knowledge Centre and resource for all its stakeholders globally

• Reduce the cost and inherent inefficiencies of multiple technical environments for similar data-centric projects

• Provide global access to data sets, combined and collected by various projects and provide the tools and interface to allow individuals and institutions to access and use those data sets to build meaningful reports and gain insight from the data.

This platform would enable the Secretariat to produce a number of engaging infographics,

visualisations, interactive maps, data stories and non-linear insights which should increase the

appeal of the Secretariat’s work to the general public and, equally importantly, justify the value of

our work to our stakeholders. Further to this, it would enable the Secretariat to organise and

manage the vast corpus of data that has been generated by our internal Divisions over the past

five decades. This offers a tremendous opportunity and challenge to:

1) Leverage our work to not only add value internally but to also contribute to the work of

international organisations. Once achieved, the Secretariat may be second to none in this

domain.

2) Become possibly the only instance of an international organisation of this stature creating an

open data platform that spans its entire business domain. This opens up a number of potential

opportunities for member states and bodies that each have their own unique data analysis needs.

To date, a significant number of Commonwealth data assets are stored and managed within

disparate and isolated systems. The general approach to building these assets involves the

employment of external resources which are assigned to a specific departmental context and

which usually have little or no knowledge of the wider information ecosystem. Naturally this has

proven to be expensive, inefficient and confusing. Furthermore, any changes to the data

underpinning these assets necessitate the re-engagement of these external resources to

perform the asset modifications needed in order to accommodate them. The Secretariat

therefore has ownership of neither its data nor the management process. This approach is

unsustainable and impractical with ever increasing levels of data collated.

The Big Open Data platform is the Secretariat’s solution to this problem. By focusing on

centralising data and leveraging data management tools, Divisional partners and project

managers alike would be able to create, update and publish material within a controlled

environment with the assurance that it is secure. Partners and unrelated actors on the platform

would be able to consume this data and recast it from their viewpoint – gaining previously

unforeseen insight into pre-existing data. This single managed platform would serve as the base

upon which the Secretariat can generate meaningful information and value to a specific project

domain.


3. Key Business Requirements

Our goal is to build a robust, accessible, scalable, Open Data platform for use by projects throughout

the Commonwealth Secretariat.

This platform will provide a centralised data resource for all data driven projects. A successful

implementation of this project would:

• allow ‘open’ data (data that has been approved for publishing) to be accessed by anyone for use and re-use

• benefit the Commonwealth as an Organisation through:

o Increased transparency: better availability and accessibility of data about the performance of the Secretariat, e.g., budgetary data or public contracts data, specialised project data sets, health, governance, education, etc.

o Improved public relations and attitudes towards the Secretariat, and better-informed Commonwealth stakeholders. This can help to build trust and understanding among all associated citizens and organisations.

o Increased reputation of the civil society initiatives and publications of the Commonwealth, allowing greater openness and transparency of our institution.

o Better understanding and management of data within the Secretariat.

o The cataloguing, collection, analysis and publication of maintained data sets to promote better understanding of data assets for related and funded projects.

o Support for re-use, from which new value can be drawn.

o Outputs and results: an open data platform will promote re-use of knowledge assets, increasing the value of the data.

This technology will be used by progressive applications, fostering innovation in this sector.

It is therefore necessary that we agree upon and document the requirements for

this platform, which is the purpose of this document.

4. Methodology

The aim is to develop a platform for the management of information flow with respect to the

authenticity of the content and the participating stakeholders. The goal is to achieve a new level

of transparency, information dissemination and collaboration in the Secretariat.

This will be achieved by taking a Division-centric approach to data management and ownership.

Data uploaded by a specific Division will be owned and managed by that Division – this data will

only be shared and exposed as per the limits set by said Division. The extent of these parameters


will be encoded into the design of the platform. This guarantees the integrity of data as well as

enshrining the principles of data governance and custodial responsibility.

Whilst no specific methodology is prescribed, we have nonetheless distinguished

between five (5) classes of data (Figure 1).

They are:

1) Source Data (Data Source)

2) Raw data

3) Data asset

4) Information set

5) Data Consumer (Data Sink)

Raw data is fed into the Open Data Platform from a data source. The raw data is stored

in a predefined structure within the platform, yielding a data asset, which is maintained

by the platform.

Specific interpretations of a data asset can be represented as information sets. An

information set is provided to a data sink using appropriate formats – e.g. presenting a

tabular dataset to a map based platform to depict regional hotspots. Data sinks use the

information over the natural course of their particular business processes – it is important

to note that they might also contribute the information as data to the platform by way of

a feedback loop.
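The flow described above (data source, raw data, data asset, information set, data sink) can be illustrated with a minimal Python sketch. The class names, the `acquire` and `total_by_region` helpers, the column names and the sample rows are all hypothetical and chosen only for illustration; they are not part of the specification.

```python
from dataclasses import dataclass, field


@dataclass
class RawData:
    """Input exactly as delivered by a data source."""
    source: str        # hypothetical identifier of the data source
    rows: list[dict]   # records as received, not yet structured


@dataclass
class DataAsset:
    """Raw data stored in a predefined structure and maintained by the platform."""
    owner_division: str
    columns: tuple[str, ...]
    records: list[tuple] = field(default_factory=list)


@dataclass
class InformationSet:
    """One specific interpretation of a data asset, shaped for a data sink."""
    description: str
    values: dict[str, float]


def acquire(raw: RawData, owner_division: str, columns: tuple[str, ...]) -> DataAsset:
    """Store raw rows in the predefined column structure, yielding a data asset."""
    records = [tuple(row.get(col) for col in columns) for row in raw.rows]
    return DataAsset(owner_division, columns, records)


def total_by_region(asset: DataAsset) -> InformationSet:
    """Interpret the asset as totals per region, e.g. for a map-based data sink."""
    region_idx = asset.columns.index("region")
    value_idx = asset.columns.index("value")
    totals: dict[str, float] = {}
    for record in asset.records:
        region = record[region_idx]
        totals[region] = totals.get(region, 0.0) + float(record[value_idx])
    return InformationSet("total value per region", totals)


# Made-up sample rows standing in for a real data source.
raw = RawData("example-source", [
    {"region": "Region A", "value": "3"},
    {"region": "Region B", "value": "5"},
    {"region": "Region A", "value": "4"},
])
asset = acquire(raw, "Division A", ("region", "value"))
info = total_by_region(asset)
print(info.values)  # {'Region A': 7.0, 'Region B': 5.0}
```

A data sink that feeds results back to the platform would, in this sketch, simply wrap its output in a new `RawData` object and pass it through `acquire` again.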


During the overall data processing steps, there are a number of interpretation activities

that introduce subjective input into the processes (e.g., deciding on the structuring of

the data assets, the choice of linked data that yields the information sets, the assignment

of semantics to the data assets). It is therefore necessary to provision additional analysis

processes that use the interpreted information sets to build up knowledge to assess the

authenticity of a data source. Necessary to this approach would be interoperability by way of

web-based APIs and secure communication channels. For example, it would be

necessary to closely manage meta-data to leverage data-analytics. Referring to one of

the key aims of the project, which is to provide non-linear insight into basic data, we

would need to provide value chain/consumption flow mapping of data. This would enable

end users to not only identify the source of the data but to also invoke data at any point

in the value chain to gain greater insight as per their requirements.

Figure 1 Data Classification and Information Flow Concept

Such functionality requires end-to-end data tracking, high scalability and a robust API

framework. Equally important is to be able to guarantee the integrity of data held on the

platform.

Although trust management is considered a part of security management, there are notable

differences when compared to conventional security approaches. Conventional security

systems usually employ mechanisms that aim at preventing users from executing certain

operations or accessing certain information. Trust-based mechanisms do not prohibit users

from performing access or execution functions, but continuously assess the behaviour of

both parties, as well as interpretation processes, to decide on the authenticity of each.

This information can then be used in conjunction with conventional security mechanisms

(e.g., refined access controls via the use of policies or the use of encryption for message

confidentiality) to achieve higher levels of system security.

A trust-based mechanism will be especially useful in this situation where the user group

may be too large or too diverse to be sufficiently controlled using a closed mechanism.
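As an illustration only, the idea of continuously assessing behaviour and combining the result with conventional access control could be sketched as follows. The scoring rule (an exponential moving average), the neutral starting score, the weight and the threshold are assumptions made for this example; the specification does not prescribe any particular trust model.

```python
from dataclasses import dataclass


@dataclass
class TrustRecord:
    """Running trust estimate for one actor (a data source or a data sink)."""
    score: float = 0.5  # assumed neutral starting point in the range [0, 1]

    def observe(self, outcome_ok: bool, weight: float = 0.2) -> None:
        """Blend the latest observed behaviour into the score (moving average)."""
        target = 1.0 if outcome_ok else 0.0
        self.score = (1.0 - weight) * self.score + weight * target


def may_publish(has_permission: bool, trust: TrustRecord, threshold: float = 0.6) -> bool:
    """Apply the conventional permission check, then refine it with the trust score."""
    return has_permission and trust.score >= threshold


source = TrustRecord()
for ok in (True, True, True):  # three observations of acceptable behaviour
    source.observe(ok)

allowed_after_good_history = may_publish(True, source)

source.observe(False)          # one observation of questionable behaviour
allowed_after_incident = may_publish(True, source)
```

In this sketch the actor is never blocked from being observed; the trust score only informs the policy decision, which is the distinction the text draws between trust-based and conventional mechanisms.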

The platform should follow a Service-Oriented Architecture (SOA). Platform components

(data classes) may belong to separate Divisions, may be globally distributed (member

states), may fail at any time, or may need to be replaced. To support this architectural

style, the platform components should be loosely coupled. For example, this may require

components to rely on messages in order to communicate (or exchange) information. We

envision the platform will be based on web technologies and as such, services will be

accessed using HTTP based mechanisms (such as RESTful interfaces, Web Service

endpoints, or SOAP messaging).
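The loose coupling described above can be sketched with a minimal in-process publish/subscribe channel. This is an illustrative stand-in for real HTTP-based messaging; the `MessageBus` class, the topic name and the message shape are hypothetical.

```python
from collections import defaultdict
from typing import Callable

Message = dict  # assumed message shape: a plain key/value payload


class MessageBus:
    """Minimal publish/subscribe channel; components only know topic names."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Message], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Message], None]) -> None:
        """Register a component's handler for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> int:
        """Deliver to every handler and return how many deliveries succeeded."""
        delivered = 0
        for handler in self._subscribers[topic]:
            try:
                handler(message)
                delivered += 1
            except Exception:
                continue  # a failed component does not affect the publisher
        return delivered


received: list[Message] = []


def broken_component(message: Message) -> None:
    """Simulates a component that has failed or is unreachable."""
    raise RuntimeError("component unavailable")


bus = MessageBus()
bus.subscribe("asset.updated", received.append)
bus.subscribe("asset.updated", broken_component)
count = bus.publish("asset.updated", {"asset_id": "a-1"})
```

Because the publisher depends only on the topic name, a component can fail or be replaced without changes to the other components.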


Secretariat Data Management

We identified nine processes employed for data management (DM), as depicted in Figure 2.

Control

The control process ensures that the overall

operation of the DM system is working correctly.

It regulates the system behaviour in accordance

with the operational policies as set forth by a

system operator and the responsible legal

authorities.

Definition

The definition process encompasses the

identification of data sources and definition of

data types, structure and technologies used to

store the data assets. The outputs of this process

are clearly defined structures for data assets and

the data sources used for acquiring the raw data.

Figure 2 Data Management Concept

Figure 3 Data Management Process

Acquisition

During the acquisition process, raw data is entered into the structure created by the

definition process, resulting in data assets. Based on type and source of the raw data this

process uses conversion operations to transform the data from the raw format into a

suitable storage format. The outputs of the process are data assets, stored in pre-defined

data structures.
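A minimal sketch of such a conversion operation is given below, assuming the raw data arrives as CSV text. The `DEFINITION` mapping, the column names and the sample values are invented for illustration and do not come from any Secretariat data set.

```python
import csv
import io

# Hypothetical output of the definition process:
# column name -> converter that turns raw text into the storage type.
DEFINITION = {"country": str, "year": int, "rate": float}


def acquire_csv(raw_text: str, definition: dict) -> list[dict]:
    """Convert raw CSV text into records matching the predefined structure."""
    reader = csv.DictReader(io.StringIO(raw_text))
    records = []
    for line_no, row in enumerate(reader, start=2):  # line 1 holds the header
        try:
            records.append(
                {col: convert(row[col].strip()) for col, convert in definition.items()}
            )
        except (KeyError, ValueError, AttributeError) as exc:
            # Reject rows that do not fit the defined structure.
            raise ValueError(f"line {line_no}: cannot convert row {row!r}") from exc
    return records


# Made-up sample input.
raw = "country,year,rate\nCountry A,2015,97.2\nCountry B,2015,98.9\n"
assets = acquire_csv(raw, DEFINITION)
```

Rows that cannot be converted raise an error here rather than being stored, which is one possible way to protect the integrity of the resulting data asset.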

Organisation

The organisation process constructs information sets from the data assets by constructing

meaningful links between individual data values or sets. The outputs of this process are

information sets.
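One simple way to picture the linking step is a join of two data assets on a shared key. The sketch below assumes each data asset is a list of dictionaries; the `link_assets` helper and the sample records are hypothetical.

```python
def link_assets(left: list[dict], right: list[dict], key: str) -> list[dict]:
    """Link records from two data assets that share the same key value."""
    index: dict = {}
    for record in right:
        index.setdefault(record[key], []).append(record)

    linked = []
    for record in left:
        for match in index.get(record[key], []):
            # On a column name clash, the value from the left asset is kept.
            linked.append({**match, **record})
    return linked


# Made-up sample assets owned by two different Divisions.
projects = [
    {"country": "Country A", "project": "P1"},
    {"country": "Country B", "project": "P2"},
]
indicators = [{"country": "Country A", "indicator": 0.7}]

information_set = link_assets(projects, indicators, "country")
```

Only records with a matching key appear in the result, so `information_set` holds a single linked record for Country A.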

Provision

The provision process enables a distribution of the linked information created by the

organisation process. The output of this process is an information set, in a format suitable


for consumption by clients of the platform. The output (i.e., the information) may also

serve as input data to the acquisition process.

Archival

The archival process is used at the end of the lifetime of an information set to guarantee

a required retention period, where the information is not provided through the platform

anymore, but still available to a system operator. This process has no output.
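The retention rule can be sketched as a date comparison. The three states and the `classify` function below are illustrative assumptions; in particular, the specification does not say what happens once the retention period has elapsed, so the `EXPIRED` state is hypothetical.

```python
from datetime import date, timedelta
from enum import Enum


class Visibility(Enum):
    PUBLISHED = "published"  # provided to clients through the platform
    ARCHIVED = "archived"    # withheld from clients, available to the operator
    EXPIRED = "expired"      # assumed state once the retention period is over


def classify(end_of_life: date, retention_days: int, today: date) -> Visibility:
    """Classify an information set by where 'today' falls in its lifetime."""
    if today < end_of_life:
        return Visibility.PUBLISHED
    if today < end_of_life + timedelta(days=retention_days):
        return Visibility.ARCHIVED
    return Visibility.EXPIRED


end_of_life = date(2020, 1, 1)
state_mid_retention = classify(end_of_life, 365, date(2020, 6, 1))
```

Here `state_mid_retention` is `Visibility.ARCHIVED`: the information set is past its end of life but still within the 365-day retention period used in this example.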

Maintenance

The maintenance process encompasses all of the activities necessary to guarantee the

proper operation of the platform. This includes activities like backup, restore and data

validation.

Interpretation

The interpretation process is a vertical process that encompasses activities where

subjective input can influence five other DM processes, namely, definition, acquisition,

organisation, provision, and archival. This does not only refer to the processes themselves,

but also to the artefacts shared between processes. Typical interpretation activities are

visualisation, transformation, or computation using the prescribed artefacts. The output

of the interpretation process can be used to assess the system’s operation.

Analysis

The analysis process evaluates the operation of the system and assesses the

interpretation of the stored data assets and information sets. The results of the analysis

process can be used to influence the interpretation process and therefore, the platform’s

operation.


5. Functionality

5.1. KEY FUNCTIONAL REQUIREMENTS

The Open Data Platform would be delivered as a highly scalable service-based application. The

following are key functional requirements of this system:

Scalable and flexible: The Secretariat seeks an Open Data Platform delivered as a service

which can scale to a large number of datasets and users without a significant overhead in

maintenance or performance. The preference is for it to be hosted in a cloud based platform,

however internally hosted solutions can be considered provided that maintainability and

performance are not negatively impacted.

Simple interface for managing data from the front-end: The platform should be simple and

intuitive to use and manage, allowing users with low technical skills to analyse and

interpret data without the need for ‘heavy’ additional tools. The platform should include

a rich range of out-of-the-box (not requiring new development, OOTB) dataset management

capabilities.

Simple interface for analysing and visualising data from the front-end: these OOTB data

manipulation and visualisation tools should allow users to manipulate data in interactive

charts and maps. Additionally, they should be versatile enough to allow savvy users to

combine otherwise unrelated datasets to create higher level data visualisations.

Easy Integration with 3rd Party Tools: the solution should be simple enough to enable

non-technical users to embed visualisations on other websites and blogs. It should also provide

more experienced users with a richly defined API that provides the capability to combine

data, charts, maps, text and other 3rd party content. This API must conform to industry and

market standards.

Advanced management capabilities: the platform should provide a flexible and simple to

use workflow with different levels of authorisation and access to data and data manipulation

features.

Open standards and developer friendly environment: The Secretariat is keen to encourage

developers to use its published data in their own applications. In order to achieve this, the

Open Data Platform needs to be able to publish our data using open standards and allow

developers to query the data using the most popular development tools.

Data APIs on datasets: The Secretariat seeks to empower third parties (members, citizens) to

use the Secretariat data in ways that matter to them. Automatic APIs on every tabular

dataset OOTB, including automatic-generation of context-sensitive API documentation with

code snippets on each dataset API, are essential.


Advanced analytics: The Secretariat wants to understand how users are interacting with the

data presented, not limited to how many times it is viewed. We seek to leverage the power

of Open Data to gain insight into how our data is interpreted and modified and the ways in

which it contributes to a given topic or field of interest. To this end, the platform and

solution must provide capabilities to track usage, referrers, traffic, viewpoints and

contributors on all pages and sites where the data is referenced. This information should be

fed back to the Secretariat in a meaningful way, one that will provide us with insight into

the value added by the provision of the original dataset.

Ambitious development roadmap: The Secretariat is aware that the Open Data field is fast

paced and technology is constantly changing. It is very important that the platform keeps

pace with new developments in this field and is capable of implementing changes quickly

without compromising on availability, performance or introducing significant additional costs.

Understanding the supplier's roadmap for adding new capabilities, and the process for

prioritising and delivering them, will be an important criterion.

5.2. FUNCTIONAL COMPONENT REQUIREMENTS

The above-mentioned functional requirements (5.1) are expected to encapsulate the following central tenets of

the system:

1. An open and published API

2. The capability to Publish, Discover, Contextualise, and Distribute significant amounts of data

3. Specifically designed for the Open Data objectives

4. The system must be available 24/7

5. The system must support data formats for CSV, Excel, TSV, Shapefiles, KML, KMZ and XML files

6. The system must use Open Data architecture - Open Data Protocol

7. The system must support CRUD - Create, Read, Update and Delete

8. The system must support KML - Keyhole Mark-up Language, XML notation representing geographic

annotations used for presenting data on maps in browsers.

9. The system must support REST - Representational State Transfer and use the HTTP methods GET,

PUT, POST and DELETE for searching and reading data in XML, JSON or RDF.

10. The system must include an out-of-the-box interface with a variety of data sources, so as to

ensure seamless import of updated data from databases.

11. The system must have an API driven service to export data into XML and/or JSON formats.

12. The system must support the monitoring of global indicators and defined development

effectiveness indicators. The Open Data platform should ensure that the indicators for

monitoring can be embedded.

13. The system must have ability to create and publish data assets alongside web page assets

14. The system should be able to spatially represent data on an interactive map.

15. The system should be capable of interfacing with existing enterprise level systems and

applications e.g. Project Management Systems.

16. The system should provide functionality that allows internal and external partners the ability to

monitor and track progress of relevant data in a workflow-like fashion. Such functionality should,

for example, allow for the consolidation of key statistical data in order to facilitate analysis

and empower decision-makers.

17. The system must have a user-friendly interface.

18. The system must be publicly accessible via the internet on all modern standards-compliant

browsers as well as mobile devices such as tablets and smart phones.

19. The system should allow for the design, viewing and printing of reports in flexible and modifiable

formats. This includes the ability to generate various types of reports based on the data stored and

meta-information.

20. The reporting feature should be intuitive and user-friendly. It should include but not be limited to

the following formats: CSV, XLS, TXT, JSON, XML, XBRL, GeoJSON, MS Word, MS Excel and PDF.

21. The system should provide optimised keyword search on data.

22. The system should support a data publishing API that supports append, replace and upsert

(update and insert) operations.

23. The system should allow publishers the ability to create multiple derived ‘views’ on the same

dataset thus enabling them to derive insights into data that were otherwise not obvious. Access

to these views should be controllable in such a way that they can be pointed to a specific

audience if necessary.

24. The system should allow for multiple user experiences for both internal and external web users

on a single digital platform.

25. The data platform must allow for real time modification and update of data.

26. The data platform must require no coding skills or staff to implement, maintain, or operate.

27. The data platform must provide mobile-optimised APIs.

28. Ideally, the system should provide a turnkey, secure, cloud-based managed-service that scales on

demand with no legacy constraints.

29. The system should include an interface that allows for seamless export of activities.

30. The system must have a web interface, including the possibility of decentralised data entry by

remote users.

31. The system must provide SLAs with 99.95% uptime.

32. The system should provide real-time usage analytics allowing data publishers and site

administrators to measure data consumption, distribution and user engagement.

33. The system should easily integrate with Google Web Analytics.

34. The system should track and report usage and performance of key indicators and metrics across

the platform to measure success.

35. The system must have connectors, SDKs and libraries for Google Android, DataSync SDK,

Apple iOS, Java, PHP, Ruby, Scala, Swift, PhpSoda, .NET, Julia, and Python. Additionally, all

hardware and network requirements and costs associated with these interfaces must be clearly

defined and documented for the Secretariat.

36. Where coding is required, it must adhere to industry standard software development principles

and quality assurance methodologies. Furthermore, full documentation of source code and

internal functions of the system must be handed over to the Secretariat.

37. The system must be fully documented for end users. This documentation must be handed over to

the Secretariat at the end of the development cycle. The Contractor will be required to explicitly

authorise staff in the Secretariat and provide training/certification for them to manipulate the

system in whatever manner they see fit, in order to meet evolving requirements of the system.


38. There must be at least six (6) months of warranty period after production implementation and

sign off.
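Requirement 22 above names three publishing operations: append, replace and upsert. The following Python sketch shows one plausible reading of their semantics on an in-memory dataset. The `publish` function and the choice of `id` as the key column are assumptions made for illustration; an actual platform API may define these operations differently.

```python
def publish(dataset: list[dict], rows: list[dict], mode: str, key: str = "id") -> list[dict]:
    """Return a new dataset after applying 'rows' with the given mode."""
    if mode == "replace":
        # Discard the existing rows entirely.
        return [dict(r) for r in rows]
    if mode == "append":
        # Keep the existing rows and add the new ones, without matching keys.
        return [dict(r) for r in dataset] + [dict(r) for r in rows]
    if mode == "upsert":
        # Update rows whose key already exists, insert the others.
        merged = {r[key]: dict(r) for r in dataset}
        for r in rows:
            merged[r[key]] = {**merged.get(r[key], {}), **r}
        return list(merged.values())
    raise ValueError(f"unknown mode: {mode!r}")


current = [{"id": 1, "value": 10}, {"id": 2, "value": 20}]
incoming = [{"id": 2, "value": 25}, {"id": 3, "value": 30}]

upserted = publish(current, incoming, "upsert")
# upserted == [{'id': 1, 'value': 10}, {'id': 2, 'value': 25}, {'id': 3, 'value': 30}]
```

With the same inputs, `append` yields four rows (the row with `id` 2 appears twice) and `replace` yields only the two incoming rows.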

6. Services to which the cloud environment can facilitate access

6.1. INTERNAL AUTHENTICATION

A method of accessing, replicating or linking to the Secretariat’s internal Active Directory

must be part of the Open Data Platform.

6.2. OPEN DATA INTERFACE

Access via API to an Open Data platform allowing presentation, visualisation and analysis

of relevant datasets.

It must be possible to connect directly to external data sets and retrieve data.

6.3. ENTERPRISE SEARCH FUNCTIONALITY

Integrated, configurable search facility which can access both internal and external data

sources.

6.4. TAXONOMIES

Specialised taxonomies defining and correctly applying relevant terms to structured

searches.


7. Service Quality and Delivery

7.1. HOSTING AND AVAILABILITY

The system can be hosted either on a platform provided by the Secretariat or by the

service provider. In the case of being hosted on the Secretariat’s platform the system

should be optimised for performance with a Windows Server 2012 R2 environment running on

Azure.

In the case of being hosted by the provider, it is important that the implementation

comply with the following:

1) The hosting environment must provide an acceptable, industry standard level of

availability of at least 99.95% as a monthly average

2) The hosting environment must have no single point of failure in its own underlying

systems

3) The hosting environment connectivity should allow for a connection of any reasonable

bandwidth and latency that may be required to support application and management

needs in line with current Industry standards

4) All services should be recoverable within a reasonable period in line with current

Industry standards
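To make the availability figure in point 1) concrete, the permitted downtime can be worked out from the length of the month. The sketch below is a simple calculation under the assumption that availability is measured over calendar minutes; the function names are illustrative.

```python
def downtime_budget_minutes(availability_pct: float, days_in_month: int) -> float:
    """Minutes of downtime permitted in a month at the given availability."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - availability_pct) / 100.0


def meets_sla(downtime_minutes: float, availability_pct: float, days_in_month: int) -> bool:
    """True if the observed downtime stays within the monthly budget."""
    return downtime_minutes <= downtime_budget_minutes(availability_pct, days_in_month)


budget = downtime_budget_minutes(99.95, 30)
print(round(budget, 2))  # 21.6
```

At 99.95% availability, a 30-day month therefore allows roughly 21.6 minutes of downtime in total, and a 31-day month roughly 22.3 minutes.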

7.2. RELIABILITY

The system must be stable both functionally and operationally.

The bandwidth and latency of any connectivity within the cloud and via the direct

perimeter connection to Internet services should be stable and as specified.

7.3. PERFORMANCE

The services provided by, and hosted in, the cloud environment should be of an

acceptable, Industry Standard, level of performance and fully meet any specification

provided. Any additional requirements or modifications to the current infrastructure must

be clearly specified indicating the likely costs. The system must not suffer from adverse

performance speeds or response times from concurrent access.

The system must cater for access via low bandwidth connections and DSL

communications without experiencing performance degradation. Any foreseen issues must

be highlighted in the proposed solution.

7.4. MAINTAINABILITY

Any essential maintenance work must be performed in predictably scheduled and agreed

maintenance windows wherever possible.


Any emergency maintenance required should be approved beforehand wherever possible,

except in cases of recovery from unexpected service loss.

7.5. SCALABILITY

All services must be highly scalable to support possible future expansion or contraction of

required workload and any supplier imposed constraints should be explained.

7.6. BACKUP & RECOVERY

The proposal must indicate measures taken to ensure a safe backup and recovery process.

7.7. CLIENT TECHNOLOGY

The devices used will include desktop, laptops, tablets and also smart mobile devices so

we require adaptive and responsive design and interface.

7.8. SECURE ARCHITECTURE

As a leading international affairs organisation, the Commonwealth Secretariat has a strong

reputation to protect. Security must adhere to Commonwealth Secretariat standards

which are in line with ISO 27001: Information security management systems, and ISO

27002: Information security – code of practice for information security controls.

7.9. MAINTENANCE & SUPPORT

The Supplier shall provide maintenance and support throughout the contract period.

8. Standards & Compliance

The cloud environment should be regularly assessed for standards compliance and should have all usual

industry standards accreditation including ISO 27001.

9. Contractor/Vendor Profile

1. Prior experience in designing, developing and supporting implementation of web-based data

collection, data storage and reporting systems.

2. Experience of implementing such systems for one or more of the following:

a. Developing countries

b. Multinational Governmental Organisations

c. Non-Governmental Organisations

3. Extensive development and implementation knowledge and experience in the following fields:

a. UX Design

b. Web-based Application Development

c. Data Analytics


4. Proven track record in the following:

a. International Relations

b. Development Cooperation/Economics

c. Aid Effectiveness

d. Aid Transparency

5. Proven experience in using CMMI or equivalent framework for all product and service

development.

6. The solution provider will provide a team of experts suitable and sufficient for the timely

implementation of the agreed solutions.

7. Provide evidence of multiple Open Data Platform implementations.