Errors in Data Collection

  • 7/29/2019 Errors in Data Collection

    TYPES OF ERRORS IN DATA COLLECTION

    June 27- Jul 2011

    Problems

    There are many sources of error in data collection. Here are a few:

    • Transient personal factors, such as health, fatigue, and motivation
    • Situational factors, such as relations with colleagues and distractions
    • Instrumentation problems, such as lack of clarity in an interview schedule, troublesome arrangement/organization of a questionnaire, or response factors such as respondents selecting "no" rather than "yes" because "yes" leads to more questions
    • Analysis factors, such as errors in scoring or tabulation, or the use of an inappropriate statistical test

    Example

    Here is an example of a data collection problem that was overcome. A questionnaire with a low response rate on a pilot test was redesigned to include an endorsement cover letter, reduced length, and a better benefit/rationale statement.

    If problems cannot be overcome, it may be necessary to change the research design or delete a troublesome variable.

    You are responsible for convincing the reader that your data collection method was appropriate and free from error.

    Reliability and Validity

    By: Eng. Luteganya Lucius R
    University of Dar es Salaam Computing Centre

    Introduction

    As researchers, we need to be able to understand the usefulness of the data we collect:

    • How accurate a picture of social life are we getting?
    • Are the conclusions we draw applicable to everyone, or only to the group of people we have studied ("representativeness")?
    • Can our research be repeated by others (a process known as "replication"), and would they get similar results if they did repeat it?

    Two concepts that we use to test the usefulness of the data we collect are:

    1. Reliability

    2. Validity

    1. Reliability

    The reliability of the data we collect must, of course, be an important consideration, since if the data we use is not reliable, then the conclusions we draw on the basis of such data are going to be fairly useless.

    Data reliability, therefore, is concerned with ideas such as:

    • The consistency of the data collected.
    • The precision (or lack of it) with which the data is collected. For example, how systematic is a form of data collection that relies upon asking people questions about something of which they may have little direct knowledge?
    • The repeatability of the data collection method. For example, if another researcher attempted to repeat my research "down the pub", would similar results be achieved?

    In simple terms, data can be considered broadly reliable if the same (or broadly similar) results can be obtained by different researchers asking the same questions of the same (or broadly similar) people.
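
One rough way to put a number on this idea is to count how often two researchers recorded the same answer to the same questions. The sketch below is only an illustration: the function name `percent_agreement` and the sample answers are hypothetical, and simple percent agreement is a convenience measure chosen here, not a method named in the slides.

```python
def percent_agreement(answers_a, answers_b):
    """Share of questions on which two researchers recorded the same answer."""
    if not answers_a or len(answers_a) != len(answers_b):
        raise ValueError("both lists must be non-empty and of equal length")
    # Count positions where the two recorded answers are identical.
    matches = sum(a == b for a, b in zip(answers_a, answers_b))
    return matches / len(answers_a)


# Hypothetical answers to the same four questions, recorded by two researchers.
researcher_1 = ["yes", "no", "yes", "yes"]
researcher_2 = ["yes", "no", "no", "yes"]

print(percent_agreement(researcher_1, researcher_2))  # 0.75
```

A value close to 1 suggests broadly similar results; how close is "close enough" is a judgment the researcher still has to justify.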

    For example, a researcher may attempt to check the reliability of a response within a questionnaire by asking basically the same question in a slightly different way:

    • How old are you?
    • When were you born?

    In this (very simple) example, the researcher attempts to cross-check the reliability of an answer. If the two answers do not agree, it is likely that the data being collected is not going to be very reliable. (This, incidentally, is a form of data triangulation.)
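
The age and birth-date cross-check can be sketched in a few lines of code. This is a minimal illustration, assuming the date of the survey is known; the function names `age_on` and `answers_agree` and the sample values are hypothetical.

```python
from datetime import date


def age_on(birth_date, survey_date):
    """Age in completed years on the day of the survey."""
    years = survey_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the survey year.
    if (survey_date.month, survey_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def answers_agree(stated_age, birth_date, survey_date):
    """True if "How old are you?" matches "When were you born?"."""
    return stated_age == age_on(birth_date, survey_date)


# Hypothetical respondent: says they are 30, born 15 July 1980, surveyed 27 June 2011.
print(answers_agree(30, date(1980, 7, 15), date(2011, 6, 27)))  # True
print(answers_agree(35, date(1980, 7, 15), date(2011, 6, 27)))  # False
```

A mismatch does not say which of the two answers is wrong, only that the response should be treated with caution.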

    2. Validity

    Data is only useful if it actually measures what it claims to be measuring and, in this respect, the concept of validity refers to the extent to which the data we collect gives a true measurement / description of "social reality" (what is "really happening" in society).

    What causes data validity errors?

    Data validity errors are usually caused by incorrect data entry, particularly when a large volume of data is entered in a short period of time. For example, a data entry operator mistakenly enters 12/25/2010 as 13/25/2010; since there is no thirteenth month, this entry is invalid.

    How can you reduce data validity errors?

    You can use one of the following two simple field validation techniques.

    Technique 1:

    If the date field in a database uses the MM/DD/YYYY format, then you can use a program with the following two data validation rules: "MM" should not exceed "12", and "DD" should not exceed "31".
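
The two rules can be sketched as a small check on a text field. This is an illustration under stated assumptions, not a definitive implementation: the function names `passes_basic_rules` and `is_real_date` are hypothetical, and the lower bound of 1 on month and day is an added assumption. The second function is a stricter alternative that uses the standard library's `datetime.strptime`, which also rejects dates such as 31 February.

```python
from datetime import datetime


def passes_basic_rules(text):
    """Apply the two rules: MM must not exceed 12 and DD must not exceed 31.

    Assumption added here: month and day must also be at least 1.
    """
    parts = text.split("/")
    if len(parts) != 3:
        return False
    try:
        month, day = int(parts[0]), int(parts[1])
    except ValueError:
        return False
    return 1 <= month <= 12 and 1 <= day <= 31


def is_real_date(text):
    """Stricter check: the text must parse as an actual MM/DD/YYYY calendar date."""
    try:
        datetime.strptime(text, "%m/%d/%Y")
        return True
    except ValueError:
        return False


print(passes_basic_rules("12/25/2010"))  # True
print(passes_basic_rules("13/25/2010"))  # False: month exceeds 12
print(passes_basic_rules("02/31/2010"))  # True: the two rules alone miss this
print(is_real_date("02/31/2010"))        # False: February has no 31st day
```

The two simple rules catch the mistyped month in the example above, but they do not catch every impossible date, which is why the stricter check is shown alongside them.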

    Technique 2:

    If the original figures do not seem to match the ones in the database, then you can use a program to validate data fields. You can compare the sum of the numbers in the database data field to the original sum of numbers from the source. If there is a difference between the two figures, it is an indication of an error in at least one data element.
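
The sum comparison can be sketched as follows. The function name `control_total_difference` and the sample figures are hypothetical, and whole numbers are assumed so that the totals can be compared exactly.

```python
def control_total_difference(source_values, database_values):
    """Database total minus source total; 0 means the two totals agree."""
    return sum(database_values) - sum(source_values)


# Hypothetical figures: the second value, 45, was mistyped as 54 during entry.
source_figures = [120, 45, 300]
database_figures = [120, 54, 300]

difference = control_total_difference(source_figures, database_figures)
if difference != 0:
    print(f"Totals differ by {difference}: at least one entry is wrong")
```

Note that the check works in one direction only. A difference shows that at least one value is wrong, but matching totals do not prove the data is correct, because two errors can cancel each other out.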

    As should be clear, the concepts of reliability and validity go hand-in-hand in sociological research:

    If data is reliable but not valid, then it may have limited use. We can make general statements about the world, but such statements may not actually apply to any one social group (such as the "unemployed").

    If data is valid but not reliable, we may not be able to use it to make general statements about the world (for example, we may be able to understand something about one group of unemployed people that doesn't necessarily apply to all unemployed people).

    Finally, therefore, a general rule to follow whenever you are presented with data to analyse / interpret (whether it be data collected from primary sources such as interviews, experiments, observation and the like, or secondary sources such as novels, Official Statistics and so forth) is that you should always seek to apply the concepts of reliability and validity to the data.

    Triangulation

    Various methods of data collection have different advantages and disadvantages and, given this fact, it would seem to make sense for researchers to make use of a number of different methods in their research, since a weakness in one method could be avoided by using a second method that is strong in the area where the first is weak.

    For example, when we interview people, a general weakness is that we have to take it on trust that the respondent is telling us the truth. In this instance, we might use an observational method to try to check that we are getting the truth about someone's behaviour. By observing them in their everyday life, for example, we might be able to check that they actually do what they tell us they do.
