M150: Data, Computing and Information 1 Representation


1- Introduction

Aims of this unit:

Explain why communication lies at the heart of people’s use of computers.

Show how communication relies on agreed representations, which associate a symbol with a meaning.

Study the properties of representations that are useful in a computing context.

Explain the importance of picking the right representation (one that is fit for purpose).

Show how agreed formats for representing data stored in files promote the sharing of this data.

2- Communication, convention and representation

Using computers to communicate with other people

Communication is the act of imparting information.

Communication governs our relationship with computers.

We use computers in two major ways, which is why communication lies at the heart of people’s use of computers:

1. To communicate with each other. People increasingly rely on computers for communicating with other people through applications such as email and the internet.

2. To solve problems. This implies that:

People need to communicate with computers (to give input and instruct them on how to solve a problem).

Computers need to communicate with people (to provide the output, that is, the solution).

Computers need to communicate with each other (to share parts of a problem).

2- Communication, convention and representation

Communication is only possible where there is a convention (agreement) about some representation.

People use computers to exchange messages. Email, word processing and the web are all computer-based technologies used in this way.

Internet connectivity is one way of estimating the success of electronic communication between people.

Chatting

Chat lets people exchange messages with each other in real time (online, without a delay).

Chat can be supported in different ways; one of them is a chat room.

A chat room is a web application that allows people to exchange messages with others in the same chat room.

Email

Email is not as immediate as chat, but it allows people to leave messages and reply in their own time.

2- Communication, convention and representation

Email benefits:

Comparatively cheap.

Global: recipients can be reached anywhere in the world.

Immediate.

Allows a variety of messages to be sent: text, photographs, video clips and music.

Email drawbacks:

Less time for reflection and checking, because email access is immediate.

Prone to error, because it is so easy to send an email message to a large unintended audience (the wrong people).

Unlike letters, email appears to invite brief, informal communication. This style is not suitable for all messages.

Email messages are an impoverished form of communication, because they are invariably written in haste, without much thought and with little indication of the writer’s mood.

2- Communication, convention and representation

Appropriate use of any means of communication requires people to adhere to a set of conventions or guidelines.

Netiquette is the name for the collection of guidelines setting out appropriate email behavior. It aims to minimize misunderstanding in email use.

Face-to-face conversations tolerate much more definite and forceful behavior than email exchanges do, because people pick up facial expressions and intonation to determine whether someone is getting upset.

Emoticons (smileys):

Are small drawings composed of simple keystrokes, which can be inserted in the text.

Inject information about mood into informal communication.

Examples: :-) , :-( , :-/ , ;-)

2- Communication, convention and representation

Using computers to solve problems

Communication plays an important role in our relationship with computers because they help us to solve problems.

Four-way communication is needed to be able to delegate problems to computers:

1. Programmers need to instruct computers how to solve the problem given a variety of inputs.

2. Users need to give computers the inputs to a particular problem.

3. Computers need to communicate the solution back to users (output).

4. For tasks or problems which require more than one computer, the computers need to communicate with each other to share the problem.

2- Communication, convention and representation

The basic operation in solving a problem is to transform some input into some output (which may be of a different type).

Examples:

Word processing

You communicate your input to the computer using keystrokes, and the computer communicates the output to you using a monitor or a printer.

Speech synthesis

Produces spoken output (a stream of sound) from typed text.

Interferometry

Different radio telescopes are pointed at the same area in space, and their readings are combined.

All the readings gathered at the same moment are combined into a single image using imaging software.

This task would be completely impossible without computers, and several computers need to work together to solve the problem.

2- Communication, convention and representation

Supercomputers and superclusters

Supercomputers

Are very large, very fast computers.

Have more than one processor and a huge amount of memory.

Are designed specifically to work on very large problems.

Are very expensive.

Supercluster computers

Are collections of computers, each very much like your own PC.

Are linked by very fast connections that allow the computers to share huge amounts of data and work on a problem in parallel.

Divide very large problems into smaller parts.

Are available at a fraction of the cost of a supercomputer.

Are less efficient for some problems because of the overheads of managing the different machines in the cluster.
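
The idea of dividing a very large problem into smaller parts and working on them in parallel can be sketched on a single machine. This is only a toy illustration: the function names are hypothetical, and worker threads stand in for the separate computers of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(values, parts):
    """Divide a list into at most `parts` slices of roughly equal size."""
    size = max(1, -(-len(values) // parts))  # ceiling division, at least 1
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum(values, workers=4):
    """Sum each part on its own worker, then combine the partial results."""
    parts = split_into_parts(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, parts))
    # Combining the partial results is part of the management overhead.
    return sum(partial_sums)


total = parallel_sum(list(range(1, 101)), workers=4)
```

Splitting the work and combining the partial sums are extra steps that a single machine would not need, which mirrors the overhead of managing a cluster.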

2- Communication, convention and representation

Communication, conventions and protocols

Communication is the act of imparting information.

People excel at communicating in two ways: deliberately (with intention) or explicitly (in precise and exact ways).

Communication relies on convention.

A convention is an agreement between a collection of participants about what a message means.

2- Communication, convention and representation

Conventions may be agreed in different ways:

Public conventions have a very large number of participants (the public at large will know them). Examples: music notation, the alphabetical classification system.

Local conventions have a smaller group of participants. Local conventions are not private, because anyone can find out what they are, but they are useful only in specialist settings. Examples: Morse code, radio codes used by police.

Private conventions are agreed between a handful of participants with a view to keeping communication private. Examples: secret codes, ciphers.

2- Communication, convention and representation

Conventions (protocols) in communicating with computers

Communication with and between computers also relies on conventions (public or local).

Web addresses are subject to public conventions (they instruct a browser to visit a specific web page).

A protocol is a system of conventions designed to enable (govern) communication with and between computers.

The http at the start of a web address names the Hypertext Transfer Protocol (HTTP), the protocol the browser uses to request the page.

A handshake protocol is a collection of conventions that enables computers and other electronic devices to identify each other. For example, when dialing an internet connection over the telephone, you can hear a series of squeaks and tones before the connection is established.
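
Because web addresses follow a public convention, a program can take one apart into its agreed components. The sketch below uses Python's standard `urllib.parse` module. The address and the `describe_address` helper are made up for illustration.

```python
from urllib.parse import urlsplit


def describe_address(address):
    """Split a web address into the parts fixed by the public convention."""
    parts = urlsplit(address)
    return {
        "protocol": parts.scheme,  # which protocol the browser should use
        "host": parts.hostname,    # which computer to contact
        "path": parts.path,        # which page to ask for
    }


# Hypothetical address, used only as an example.
info = describe_address("http://www.example.org/courses/m150.html")
```

The same function works for any address written to the convention, which is the point of agreeing one.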

2- Communication, convention and representation

Important internet protocols:

TCP/IP (Transmission Control Protocol/Internet Protocol) is possibly the most important internet protocol. TCP/IP is in fact not a single protocol but a combination of two, each looking after a vital task that makes it possible to send information over the internet.

TCP takes data and bundles it up in appropriate parcels for transmission.

IP moves the parcels (packets) along networks and routes them to the right address.

FTP (File Transfer Protocol) lets authorized users download files from (or send files to) another computer connected to the network.

SMTP (Simple Mail Transfer Protocol) uses the same principles to transfer email across the world via the network.
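
The idea of bundling data into parcels can be sketched with a toy example. This is not real TCP/IP: `bundle` and `reassemble` are hypothetical helpers that only show why each parcel carries a sequence number.

```python
import random


def bundle(message, size):
    """Split a message into numbered parcels of at most `size` characters."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(parcels):
    """Put parcels back into sequence order and join their contents."""
    return "".join(chunk for _, chunk in sorted(parcels))


parcels = bundle("Communication relies on convention", 8)
random.shuffle(parcels)  # parcels may arrive in a different order
received = reassemble(parcels)
```

Because sender and receiver agree on what the sequence number means, the message can be restored whatever order the parcels arrive in.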


2- Communication, convention and representation

Symbols are part of conventions shared by a large number of people.

Conventions play a crucial role in communication because they set out an association between a form (or symbol) and a meaning (or content).

Communication becomes possible because the content of a message can be given a form which can be perceived and understood by everyone who is party to the convention.

A representation is an association between a form and a content, subject to a convention. Representations are important in communication with computers.
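
A representation can be modelled as a table that pairs each form with its content. The sketch below uses four letters of Morse code. The table is deliberately partial, and `encode` and `decode` are illustrative names.

```python
# The convention: each letter (content) is paired with a signal (form).
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

# The receiver needs the same table, read in the other direction.
FROM_MORSE = {form: letter for letter, form in MORSE.items()}


def encode(text):
    """Give each letter its agreed form, separated by spaces."""
    return " ".join(MORSE[letter] for letter in text)


def decode(signal):
    """Recover the letters from their forms using the shared convention."""
    return "".join(FROM_MORSE[form] for form in signal.split())


signal = encode("SOS")
```

Decoding only works because both sides use the same table. A receiver with a different table would recover a different message, or none at all.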

3- Properties of representations

Effective representations

A representation establishes a relationship between a form (or symbol) and an associated meaning (or content).

To be effective in communication, a representation must meet at least the following conditions:

The form of the representation must be perceivable in some way.

The relationship between form and content must be shared by all parties involved in the communication process. (All parties must agree on the convention that links a form with a particular meaning.)

3- Properties of representations

Representations have to be perceivable

Different ways of perceiving representations:

1. Auditory representations are perceived as sound.

Human examples: spoken language, music and Morse code

Computer examples: the handshake protocol, beeps to attract human attention

2. Visual representations are perceived by sight.

Human examples: written language, traffic signs, painting

Computer examples: scanner, screen display, print output

3. Tactile representations are perceived by touch.

They play a central part in our interaction with computers but are less common in human communication.

Human examples: Braille

Computer examples: keyboard, mouse, joysticks

3- Properties of representations

Representations: spoilt for choice

Devising a representation involves two assumptions:

The relationship between a particular form and its content is not predetermined.

For effective communication, all parties must agree on which form is paired up with which content.

People can choose which form to associate with which content.

People can choose how they get a message across.

Different symbols can represent the same content. For example, Word offers mouse and keyboard alternatives for most word-processor operations.


3- Properties of representations

Some useful properties of representations

If there is a choice of representation, there are likely to be reasons for picking one rather than another.

Representations have different properties that may play a role in deciding when they are best deployed.

1. Context sensitivity

Representations are context sensitive, because the same form may be associated with different meanings, depending on the context.

Symbols are assigned alternate meanings depending on the circumstances.

Examples:

Pressing the arrow keys in Word and in Excel

Holding down <Ctrl><D> in Word and in Excel

The color black in some cultures
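
Context sensitivity can be sketched as a lookup that needs both the form and the context before it can return a meaning. The action descriptions below are informal labels rather than official wording, and `interpret` is a hypothetical helper.

```python
# The same form (a key combination) paired with a different meaning
# in each context (application).
MEANINGS = {
    ("Word", "Ctrl+D"): "open the Font dialog",
    ("Excel", "Ctrl+D"): "fill down from the cell above",
    ("Word", "Right arrow"): "move one character to the right",
    ("Excel", "Right arrow"): "move one cell to the right",
}


def interpret(context, form):
    """Look up what a form means; the answer depends on the context."""
    return MEANINGS[(context, form)]


in_word = interpret("Word", "Ctrl+D")
in_excel = interpret("Excel", "Ctrl+D")
```

The form alone is not enough to recover the meaning: the lookup key has to include the context.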

3- Properties of representations

2. Ambiguity

An ambiguous representation associates the same form with more than one meaning in the same context.

Human languages are extremely ambiguous.

Ambiguity is not a common feature of the representations designed for communication with, or between, computers. Because computers cannot guess what a user might mean, representations have to be explicit and precise.

3. Precision

Sometimes representations are ambiguous because they lack precision.

Precision is a measure of the grain (or granularity) of the message a representation is able to convey.

Information loss: replacing a more precise representation with a less precise one loses information.
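
Information loss can be seen by replacing a fine-grained representation with a coarser one. The readings below are made-up values, and rounding to whole numbers is just one possible way of reducing precision.

```python
# Two distinct readings recorded to one decimal place.
readings = [36.6, 37.4]

# The coarser representation keeps whole numbers only.
coarse = [round(value) for value in readings]

# Both readings now share the same form, so the original values
# cannot be told apart or recovered from the coarse representation.
distinct_before = len(set(readings))
distinct_after = len(set(coarse))
```

After rounding, one form stands for a whole range of original values, which is how a lack of precision leads to ambiguity.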

3- Properties of representations

4. Redundancy

A representation contains redundancy if the meaning or content can be recovered from only a part of its form.

Sometimes part of a message is lost, but the meaning can still be reconstructed if there is sufficient redundancy. Examples: noisy telephone connections, or a weak mobile telephone signal.

Removal of redundancy is important in computing, because the resulting representations are shorter, take up less space and take less time to transmit.

Example: a telephone number (no redundancy), a library record (redundant)

Compression is a technique which increases efficiency by removing redundancy from representations.

Lossless compression does not lose any significant information.

Lossy compression loses parts of the original.

Decompression is the reverse of compression: the redundant parts are put back into the representation to restore it to its initial form.
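
Lossless compression and decompression can be tried with Python's standard `zlib` module. The sample text is made up and deliberately repetitive, so it contains a lot of redundancy to remove.

```python
import zlib

# A highly redundant message: the same sentence repeated 20 times.
original = ("the same form, the same meaning. " * 20).encode("utf-8")

# Compression removes redundancy, giving a shorter representation.
compressed = zlib.compress(original)

# Decompression restores the representation to its initial form.
restored = zlib.decompress(compressed)

saved_bytes = len(original) - len(compressed)
```

Because the compression is lossless, `restored` is identical to `original`. How much space is saved depends on how much redundancy the input contains.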

4- Picking representations

Choosing representations

Representations should be picked with care so that they are fit for purpose.

Example: a traffic sign is better suited to roadside display than a textual definition of the sign.

Fit-for-purpose representations

A representation must contain at least sufficient information for the purpose of the task, without irrelevant detail.

Example: maps

Abstraction

Abstraction is the stripping away of irrelevant detail, and is one of the most fundamental notions in computing.

Abstraction is the process of fitting a problem to a representation which has exactly the information relevant to solving the problem.

Abstraction amounts to finding a minimal fit-for-purpose representation for a problem.
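
Abstraction can be sketched as keeping only the fields a task needs. The station records and the `abstract_route` helper below are invented for illustration. The task assumed here is working out which station follows which, so distances and colours are irrelevant detail.

```python
# Detailed records (hypothetical data).
stations = [
    {"name": "A", "next": "B", "distance_km": 2.4, "platform_colour": "green"},
    {"name": "B", "next": "C", "distance_km": 1.1, "platform_colour": "blue"},
]


def abstract_route(records):
    """Keep only what is needed to know which station follows which."""
    return [(record["name"], record["next"]) for record in records]


route = abstract_route(stations)
```

A different task, such as estimating journey length, would need a different abstraction that keeps `distance_km` instead.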


4- Picking representations

Complex representation systems

A complex representation is assembled out of simpler representations according to predetermined rules.

Complex representation systems have two characteristics:

1. The form of the complex representation is made up of several more basic parts.

2. The meaning of the complex representation is constructed from the meanings of the more basic parts in some systematic way.

Complex representations are central to computing. All computer languages, including protocols, formats and programming languages, are instances of complex representation systems.

Example: traffic signs are complex representations (they combine shapes, colors and symbols).
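
The two characteristics can be sketched with a toy sign system. The shape and symbol tables are simplified assumptions, not an official traffic code, and `sign_meaning` is a hypothetical helper.

```python
# Basic parts, each with its own meaning (illustrative values only).
SHAPES = {"triangle": "warning", "circle": "order"}
SYMBOLS = {"bicycle": "cyclists", "deer": "wild animals"}


def sign_meaning(shape, symbol):
    """Build the meaning of a whole sign from the meanings of its parts."""
    return f"{SHAPES[shape]}: {SYMBOLS[symbol]}"


meaning = sign_meaning("triangle", "deer")
```

New combinations such as `sign_meaning("circle", "bicycle")` can be interpreted without any new rules. A part outside the system, such as an unknown shape, raises a `KeyError`, so the sign cannot be interpreted at all.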


4- Picking representations

The consequences of using a complex representation system rather than a series of simple representations:

1. The system allows new representations to be created that build on the same underlying meanings (or rules) because they use the same building blocks.

2. The system ensures that new representations can be interpreted by everyone who understands the rules.

3. When the rules of the system are breached, representations become incomprehensible.

Representation systems and languages

A language sets up a representation system in which complex representations can be built up systematically from more basic forms, with a predictable relationship between form and content.

Example: the system for constructing traffic signs is a language.

Computer languages are formal languages: artificial languages that are a lot simpler than human languages and, given the right context, never ambiguous.


5- Sharing and formats

The problems of sharing

The problems associated with sharing information only became pressing when computers were networked.

Users wanted to share information without having to worry about which computer or which application the recipient was using (compatibility).

Formats are formal languages used to represent the detail of the input and output associated with particular applications (the file representations used with computers).

Formats are necessary for two reasons:

1. They ensure consistency. Applications such as browsers and word processors need to be able to display a document in the same way every time it is opened by a user.

2. They enable sharing. If a format is shared, or ‘understood’, by two applications, then they can exchange inputs and outputs. This is what allows a word processor, say, to send a document to a printer.

5- Sharing and formats

The power of formats

Formats not only enable sharing, they can also prevent it.

If the format associated with a popular application is made unavailable for sharing (for example, through copyright), other applications will be unable to exchange or share its files.

Formats fall into two categories:

1. Proprietary formats

Belong to a particular software provider.

Using a proprietary format requires a purchase or contractual agreement.

Example: Microsoft Word's .doc format.

Protecting a proprietary format is expensive.

2. Public formats

Anyone is free to use them.

Encourage the development of new applications.

Increase the collective market share.

Example: HTML

5- Sharing and formats

Format evolution or confusion

Compatibility between applications is achieved by managing their formats (reaching agreement about what the format actually is).

Reasons for confusion (achieving compatibility is not a simple task):

1. No existing format. Suppliers bring out new applications with new functionality that require new formats. Example: there were no applications for handling digital images before digital cameras and scanners.

2. New versions. Suppliers need to update their products by bringing out newer, better versions of existing software. These include enhanced features, which may require formats to be altered or extended. Example: word processors.

3. Dependency between versions. New versions of an application usually need to recognize that users still have a stake in the old version. A downward-compatible application is able to work with the earlier formats.
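
Downward compatibility can be sketched as a reader that still accepts the earlier version of its own format. The two-version note format below is entirely hypothetical: version 1 stores only text, and version 2 adds a font.

```python
def read_note(record):
    """Read a note saved in either version of a hypothetical format."""
    version = record.get("version", 1)  # old files carry no version field
    if version == 1:
        # Earlier format: fill in a default for the newer feature.
        return {"text": record["text"], "font": "default"}
    if version == 2:
        return {"text": record["text"], "font": record["font"]}
    raise ValueError(f"unsupported format version: {version}")


old_note = read_note({"text": "hello"})
new_note = read_note({"version": 2, "text": "hello", "font": "serif"})
```

Every earlier version the application keeps supporting adds a branch like this one, which is part of the cost of downward compatibility.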

5- Sharing and formats

4. Obsolete versions. Formats may become obsolete because old technologies disappear or are superseded. Example: word processors that predate mouse-driven Windows technology.

Legacy problems arise when a large amount of data is stored in formats which are no longer supported by applications (a heavy overhead for users).

Formats can only be effective in ensuring compatibility between applications if their evolution is orchestrated in some way.

Formats are managed to ensure compatibility in two ways:

1. A standard: a wide group of users agreeing to use a particular format.

2. Conversion: translating between the formats.

5- Sharing and formats

Standard formats

A standard format is one that a large collection of users has agreed to use.

Standardization is the process of agreeing a standard.

Users can reach agreement in two different ways:

1. A standard is set through deliberate, explicit standardization. This happens when a small group of people write down what the standard is, publish the description and keep track of it over time. Standards based on public formats are typically managed in this way.

2. A standard may simply emerge because a particular application has a very large group of users. In this case, a format becomes a de facto standard.


5- Sharing and formats

Advantages of standards:

1. Compatibility between any number of applications.

2. New formats and standards can evolve together.

3. Users need not worry about whether their particular application uses a format that will stop them from communicating or sharing.

4. The responsibility for ensuring compatibility need not lie with users (it is delegated to the companies who develop applications).

Disadvantages of standards:

1. Standardization reduces the number of formats.

2. A new format evolves slowly into a standard.

3. Incorporating new features into an existing standard and distributing it across a user population is time consuming and expensive. Standards slow down the rate at which new applications or new versions can be introduced.

5- Sharing and formats

Conversion

Conversion means making sure that one format can always be translated into another (a way of dealing with the drawbacks of standards).

Conversion can be useful for two purposes:

1. Conversion between standards is very beneficial. For example, word processors allow a document to be saved as an HTML document.

2. Conversion between obsolete and current formats is also useful to prevent legacy problems.

Conversion disadvantages:

There is an overhead in maintaining conversion as standards evolve over time.

So many standards exist that it is unrealistic to convert between all of them.

Standardization and conversion are not alternative strategies: they are used together to ensure that sharing remains possible while still allowing formats to emerge and evolve.
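
Conversion between two public formats can be sketched as a small translator from plain text to HTML. `text_to_html` is a hypothetical helper that handles paragraphs only, and it assumes paragraphs are separated by blank lines.

```python
from html import escape


def text_to_html(text):
    """Translate blank-line-separated paragraphs into HTML paragraphs."""
    paragraphs = [part.strip() for part in text.split("\n\n") if part.strip()]
    # escape() replaces characters that have a special meaning in HTML.
    return "\n".join(f"<p>{escape(part)}</p>" for part in paragraphs)


page = text_to_html("Forms & meanings\n\nA < B")
```

Even this small converter has to know the rules of both formats, for example that `&` and `<` are special in HTML. This is why maintaining converters becomes an overhead as formats evolve.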

5- Sharing and formats

What’s in a name:

A file is a collection of information that is to be kept together in a specific representation, or format.

A filename is needed to give the document a unique name, so that you can find it again.

Different computer operating systems have different rules for filenames. Windows filenames can include spaces and numbers, whereas spaces are usually avoided in UNIX filenames.

Filenames are separated by a dot into a name and a file extension.

A file extension is a relatively short sequence of letters that denotes the format the file is held in. It specifies which application should open a document, and it is standardized.

Popular file extensions: .doc, .exe, .gif, .html, .jpeg, .mdb, .rtf, .mp3, .txt, .wav (see what they mean in Table 5.1)
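
Splitting a filename into its name and extension can be tried with Python's standard `pathlib` module. The filenames and the small `FORMATS` table are illustrative only and are not a copy of Table 5.1.

```python
from pathlib import PurePath

# A few extensions paired with informal descriptions (illustrative).
FORMATS = {
    ".txt": "plain text",
    ".html": "web page",
    ".mp3": "audio",
    ".gif": "image",
}


def split_filename(filename):
    """Return the name and the file extension (including the dot)."""
    path = PurePath(filename)
    return path.stem, path.suffix


def format_of(filename):
    """Use the extension to decide which format a file is held in."""
    return FORMATS.get(PurePath(filename).suffix.lower(), "unknown format")


name, extension = split_filename("unit 1 notes.txt")
```

An application chooser works in much the same way: it looks up the extension, not the contents, so renaming a file with the wrong extension can mislead it.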