NPSAS DAS Training December 2006 Training Shefali V. Mehta Minnesota Office of Higher Education


NPSAS DAS Training

December 2006

Training

Shefali V. Mehta

Minnesota Office of Higher Education


NPSAS Background

NPSAS 2004 Data

Based on training by:

Lutz Berkner of MPR Associates, Inc.

Tracy Hunt-White and James Griffith of the National Center for Education Statistics


November 2006 Minnesota Office of Higher Education 3

What is the National Postsecondary Student Aid Study?

The National Postsecondary Student Aid Study, or NPSAS, is a nationally representative stratified random sample of undergraduate, graduate, and first-professional students attending postsecondary institutions.

Today’s presentation will focus on the undergraduate sample data: how it was collected, what it contains, and how to access and use it.


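NCES’ Recent Surveys: Higher Education Longitudinal and Cross-Sectional Studies

NPSAS (National Postsecondary Student Aid Study) years and their longitudinal follow-ups:

- 1986-87
- 1989-90: BPS, follow-ups in 1992 and 1994
- 1992-93: B&B, follow-ups in 1994, 1997, and 2003
- 1995-96: BPS, follow-ups in 1998 and 2001
- 1999-2000: B&B, follow-up in 2001
- 2003-2004: BPS, follow-ups in 2006 and 2009

NSOPF:88, NSOPF:93, NSOPF:99, NSOPF:04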


Data Sources for the NPSAS 2004

- Central Processing System (CPS) Match
- Institutional Records (CADE)
- Student Interviews
- NSLDS Loan Match
- NSLDS Pell Grant File Match
- ETS File Match
- ACT File Match


NPSAS 2004 Data Collection Timeline

Steps, in order:

- Sample Institutions
- Obtain Cooperation
- Obtain Lists/Select Student Sample
- CPS Matching
- Preload
- CADE
- Student Interviews
- Data File Preparation

Dates marked on the timeline: Aug 2002; Jan – Oct 2003; Jan – July 2004; Mar – Sep 2004; Mar – Dec 2004


Products related to NPSAS

- Public Use Data Systems (DASs)
- Methodology Reports describing study design, procedures, and outcomes
- Restricted use research files
- ED Tabs and Descriptive Reports based on analyses of merged data.


Using the DAS online

Accessing the NPSAS 2004 Data


What is the DAS?

The Data Analysis System, or DAS, is a software application that produces tables and correlation matrices for NCES datasets.

The DAS, which is available for each NCES dataset, includes

- Over 1,000 variables with full descriptions, and
- Statistical information, such as standard errors and the distribution of the data.

It is available online through the NCES website:

http://nces.ed.gov/dasol/


DAS Home Page: http://nces.ed.gov/das/


DAS Online: http://nces.ed.gov/dasol/


DAS Online: Select a dataset


DAS Online


NCES Data Usage Agreement

Select “I agree..” to continue to the DAS for NPSAS 2004.

Note: To use DAS online, you need to enable pop-up windows from this website. The application relies heavily on pop-up windows, such as this usage agreement.


DAS Online Window

Toolbar


DAS Online Window

Subject Category

Topic

Subtopic


DAS Online Window

Variable list

- Blue = continuous variable
- Green = categorical variable
- Red = weight


Available variables

Click on the “view/download list of variables” link to see all available variables.


Locating variables in the NPSAS

There are two ways to select variables. The first is through the drop-down menus available on the main page. The menus are organized in the following categories:

- Frequently Used: Variables
- Aid: Application, Federal, Grants, Institutional, Net Price, Outside, Package, Ratio, State, Total
- Background: Demographics, Family, Residence
- Education: Attendance, Program
- Employment: Description, Employer, Future, Licensure, Status, While Enrolled
- Finances: Income
- Institution: Other, Price, Type
- Parent: Education, Family
- Public Service: Participation
- Survey: Sample, Weights


Locating variables in the NPSAS

The second way to locate a variable is by clicking on the “Search for variable” link on the toolbar. A pop-up search window will appear.


Using the DAS online

Using the Variable Tags


What kind of estimates can the DAS produce?

- Means (including observations = 0)
- Averages (of observations > 0)
- Percent distributions
- Percent positive (or greater than a selected value)
- Percentiles (10th, 25th, 50th, 75th, and 90th) (with or without observations = 0)
- Medians (the 50th centile)
- or Correlation matrices


Variable Description Window

Each variable window contains the following:

•a description of the variable

•the sources for the variable


Variable Description Window

And the distribution of the variable.

In this case, 63.2 percent of the data has a value for the total amount received.

The range for this variable is $50-$56,740.

Remember: this information is at the national level; each state has its own distribution.


Select a Tag for the Variable

Click on the “Select a tag” tab to show the tag options available for the variable.

These “tags” tell you the various ways this variable can be represented in your table.


Using the DAS online

Practice exercises to illustrate the tags


NPSAS - Exercise 1

What is the percent distribution of full-time, full-year undergraduates according to degree program and gender, by dependency status, institution sector, aid status, and age?

Find the percentage of full-time, full-year independent male students who attended a public 4-year institution.


Exercise 1 – Breakdown

Run 1 – What is the percent distribution of undergraduates according to degree program, by dependency status and institution sector?

Run 2 – What is the percent distribution of undergraduates according to degree program, by dependency status, institution sector, aid status, and age?

Run 3 – What is the percent distribution of full-time, full-year undergraduates according to degree program and gender, by dependency status, institution sector, aid status, and age?


Tags: Column_Cat

Creates percentages for each category of a variable

Missing values and legitimate skips are not included in any of the categories

Responses coded as “0” are not included

Pertains to categorical variables only

Also applies to: Row_Cat, Span_Cat, By_Cat


Tags: Row_Cat

Similar to Column_Cat

Creates a row of estimates for each category

Responses coded as “0” are not included

Pertains to categorical variables only

Also applies to: Column_Cat, Span_Cat, By_Cat


Tags: Row_Lump

Creates customized categories by grouping existing variable categories

Responses coded as “0” can be included

Legitimate skips can be excluded or included in the new categorization

Allows reordering of existing categories

Pertains to categorical variables only

Also applies to: Column_Lump, Span_Lump, By_Lump


Tags: Row_Cut

Divides a continuous variable into categories by specifying ranges

Creates a row of estimates for each category

Specify beginning cut-point value in each range

Cut-point must be a number with a decimal (e.g., 10.5)

Also applies to: Column_Cut, Span_Cut, By_Cut


Tags: Row_Cut

Range

1: (>= 0.5 and < 18.5)

2: (>= 18.5 and < 23.5)

3: (>= 23.5 and < 29.5)

4: (>= 29.5 up to infinity/max value)

Range

1: (>= -0.5 and < 0.5) includes 0

2: (>= 0.5 up to infinity) at least $1 in aid
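The half-open ranges above can be sketched in a few lines of Python. This is only an illustration of the cut-point logic, not the DAS implementation; the function name `cut_category` and the variable names are hypothetical, and the two lists simply copy the cut points shown on the slide.

```python
from bisect import bisect_right


def cut_category(value, cut_points):
    """Return the 1-based range number for value.

    Range i covers [cut_points[i-1], cut_points[i]); the last range is
    open-ended. Values below the first cut point fall in no range (None).
    """
    if value < cut_points[0]:
        return None
    return bisect_right(cut_points, value)


# Cut points copied from the two examples on the slide.
first_cuts = [0.5, 18.5, 23.5, 29.5]
aid_cuts = [-0.5, 0.5]

print(cut_category(18, first_cuts))  # 1
print(cut_category(19, first_cuts))  # 2
print(cut_category(0, aid_cuts))     # 1 (the range that includes 0)
print(cut_category(1, aid_cuts))     # 2 (at least $1 in aid)
```

Using cut points with a decimal (18.5 rather than 18 or 19) means no whole-number value ever sits exactly on a boundary.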


Tags: Filter

And_Filter

- Subsets (focuses on) the population of interest
- All conditions have to be met (filters selected) in order for a case to be included

Or_Filter

- Subsets (focuses on) the population of interest
- If any condition is met (filter is selected) the case will be included


Tags: Filter

Integer filter: Limit population to the categories selected.

Cut-point filter: Limit population to those with values greater than or less than a specific point or between two points.
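As a rough illustration of how And_Filter and Or_Filter differ, the sketch below applies one category-style condition and one cut-point condition to a few made-up cases. The records, field names, and helper functions are all hypothetical; the DAS applies its filters internally.

```python
def and_filter(case, conditions):
    """Keep the case only if every condition is met."""
    return all(condition(case) for condition in conditions)


def or_filter(case, conditions):
    """Keep the case if at least one condition is met."""
    return any(condition(case) for condition in conditions)


def is_dependent(case):
    # Integer-style filter: limit to a selected category.
    return case["dependent"]


def has_aid(case):
    # Cut-point-style filter: limit to values above a cut point.
    return case["total_aid"] >= 0.5


# Hypothetical cases, for illustration only.
cases = [
    {"id": 1, "dependent": True, "total_aid": 0},
    {"id": 2, "dependent": True, "total_aid": 3500},
    {"id": 3, "dependent": False, "total_aid": 1200},
    {"id": 4, "dependent": False, "total_aid": 0},
]
conditions = [is_dependent, has_aid]

and_ids = [c["id"] for c in cases if and_filter(c, conditions)]
or_ids = [c["id"] for c in cases if or_filter(c, conditions)]
print(and_ids)  # [2]
print(or_ids)   # [1, 2, 3]
```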


Tags: Span_Cat

Uses all of a variable’s categories to group sets of rows in the table

Creates a subtable of estimates for each variable category

Does not provide an overall summary table

Warning: Drastically increases the number of estimates in the table

See also: Span_Cut, Span_Lump


NPSAS - Exercise 2

What percentage of full-time, full-year undergraduates received financial aid by dependency status, institution sector, and age? What was the average amount they received?

Steps:

- Import exercise 1
- Delete Column_Cat and Span_Cat tags
- Delete Row_Cut tag for Total Aid
- Add Percent and Average tags


Tags: Percent>

Defines a column of percentages based on values greater than a specified cut point

Can be used with the Mean and Average>0 tags


Tags: Mean versus Average

Mean will include zeros in the denominator

Average will not include zeros in the denominator


Mean vs. Average

Mean: All respondents, including those with no aid

Average: Only respondents who have aid
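The difference between the two tags comes down to the denominator. The sketch below uses made-up, unweighted aid amounts (the DAS itself uses weighted data), and the function names are hypothetical.

```python
def das_mean(values):
    """Mean: zeros are counted in the denominator."""
    return sum(values) / len(values)


def das_average(values):
    """Average>0: only positive values are counted in the denominator."""
    positive = [v for v in values if v > 0]
    if not positive:
        return None  # nobody has a positive amount
    return sum(positive) / len(positive)


def percent_positive(values):
    """Percent of cases with a positive amount."""
    return 100 * sum(1 for v in values if v > 0) / len(values)


# Hypothetical aid amounts for four students, two of them with no aid.
aid = [0, 0, 1000, 3000]
print(das_mean(aid))          # 1000.0
print(das_average(aid))       # 2000.0
print(percent_positive(aid))  # 50.0
```

In this example the mean equals the average multiplied by the share of students with aid (2000 x 0.50 = 1000).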


Tags: By_Cat

Creates a column of Average, Mean, or Percent> estimates for each category of a variable

Can be used with only ONE Mean, Average>0, or Percent> variable

Provides an overall summary column

Will increase the size of your table

See also: By_Cut, By_Lump


Example of By_Cat with Percent>

Percent> yields the percent of FT, full-year UG with aid. By_Cat generates the percent of FT, full-year UG with aid by degree program.

Ex: 77.4% of FT, full-year UG in a certificate degree program received aid.


Representative Sample States

NPSAS:04 is not designed to be representative at the state level, except for undergraduates attending public 2-year, public 4-year, and private not-for-profit 4-year institutions in 12 specific states.

Use these variables to analyze the representative sample states:

- INSTSAST (NPSAS institution representative sample states)

- INSTSTSE (NPSAS institution representative state sample by sector)

Do not use: INSTSTAT (NPSAS institution state)


Tags: Centile vs. Centile>0

Generates percentile columns from continuous variables

Produces the cut points for the following percentiles: 10th, 25th, 50th, 75th, 90th

Median = the 50th centile -- the value above and below which half of the observations lie

Centile includes zero values

Centile>0 excludes zero values
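To see why the two tags can give very different medians, consider a small made-up example. It uses Python's `statistics.median` on unweighted values; the DAS works with weighted data and its exact percentile method is not described here, so treat this only as an illustration of the effect of dropping zeros.

```python
from statistics import median

# Hypothetical aid amounts; two students received no aid.
aid = [0, 0, 1000, 2000, 3000]

median_with_zeros = median(aid)                            # like Centile
median_without_zeros = median([v for v in aid if v > 0])   # like Centile>0

print(median_with_zeros)     # 1000
print(median_without_zeros)  # 2000
```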


Example of Centile>0

Note: Last column shows the percentage of FT, FY undergraduates who received no aid.


Using the DAS online

Saving, modifying and loading files: .tpf files


Saving tables you created

You can save the parameter file for re-use and modification

Files containing the specifications for tables are called .tpf files, or table parameter files

After creating a file in the DAS window, click on Save in the toolbar.

The .tpf file will be saved to the location you specify


Uploading tables to the DAS application

- Click on Import in the toolbar.
- Locate the .tpf file to be uploaded and upload it.
- Note: for the DAS online application to read the files, they must be saved with the extension .tpf
- Once the file is uploaded, it can be altered and run as usual.


Reproducing or modifying tables created by others

You can download and use any parameter file used to create a report or ED Tab from our web site: http://nces.ed.gov/das

.tpf files can be edited in a text editor (such as Notepad or Wordpad) but they must be saved with the .tpf extension (not the .txt default extension)


Using the batch processor

The batch processor allows you to run several .tpf files at once

You must create an account and log in by clicking on “Batch processor” on the left-hand side of http://nces.ed.gov/dasol/

The files must be added to a .zip file and then uploaded. After uploading the file, copy down your batch number so that you can retrieve your files


Using the batch processor: rules for naming files

There is one catch with the batch processor: it will not run files unless they have specific names (while the DAS has no such rules)

All file names (.ZIP/.TPF/.CPF) must fulfill the following requirements:

- Begin with a letter (for example, A, B, C,...X,Y,Z)
- Contain at least 2 but no more than 8 characters
- Not contain spaces between characters
- Not include symbols or special characters (underscore is allowed)

These guidelines are available on the DAS website:

http://nces.ed.gov/das/das_windows/run_1.asp
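Before zipping a batch, the naming rules can be checked with a small script. The sketch below is a hypothetical helper, not an NCES tool. It assumes that the 2-to-8 character limit applies to the part of the name before the extension, and that "letters" means unaccented A-Z (upper or lower case) with digits also allowed after the first character.

```python
import re

# A letter first, then 1 to 7 more letters, digits, or underscores (2-8 total).
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{1,7}")
_ALLOWED_EXTENSIONS = {"zip", "tpf", "cpf"}


def is_valid_batch_name(filename):
    """Check a file name against the batch processor naming rules (as assumed above)."""
    stem, dot, extension = filename.rpartition(".")
    if not dot or extension.lower() not in _ALLOWED_EXTENSIONS:
        return False
    return _NAME_RE.fullmatch(stem) is not None


for name in ["table1.tpf", "run_1.zip", "1table.tpf", "my table.tpf", "verylongname.tpf"]:
    print(name, is_valid_batch_name(name))
```

The first two names pass; the others fail because they start with a digit, contain a space, or exceed eight characters.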


Using the DAS online

Sampling and Data Issues


Data sources by percentage

Which sources did NCES use to collect the student data?

Primary sources

- Institution records (CADE): 95%
- Student interviews (CATI): 70%
- Federal aid applications (CPS): 60%

Combinations of primary sources

- All three sources: 40%
- Two sources: 50%
- One source: 10%

Additional sources

- Federal loans and Pell Grants (NSLDS): 50%


Data issues: data collection problems

Data collection problems arose, such as:

Missing data

- No source or incomplete sources
- Data did not exist (EFC, student budgets)

Discrepancies among sources

- Timing issues
- Reporting or data entry errors
- Students make guesses during interview

Mismatches

- Student social security numbers
- Institution identification numbers


Data issues: addressing the collection problems

Imputation: used to complete missing data or to check inconsistencies. NCES used two types of statistical imputation methods:

- Logical
- Stochastic (hot deck)

Perturbation: used to protect the privacy of individuals. Social security numbers switched around for individuals.

Reconciliation: used to confirm that the data remained consistent after imputation and perturbation.


Sample size and weights

15 million undergraduates enrolled in Fall 2003

19 million undergraduates enrolled anytime during the 2003-04 academic year

80,000 undergraduate cases in NPSAS sample: Represent about 1 out of 240 undergraduates

Therefore, average weight for each respondent = about 240
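The arithmetic behind the "about 240" figure, using the rounded counts from this slide (the variable names are only for illustration):

```python
# Rounded figures from the slide.
undergraduates_any_time = 19_000_000  # enrolled anytime during 2003-04
sample_cases = 80_000                 # undergraduate cases in the NPSAS sample

average_weight = undergraduates_any_time / sample_cases
print(average_weight)             # 237.5
print(round(average_weight, -1))  # 240.0
```

Each sample case therefore stands in for roughly 240 undergraduates on average.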


Sample size and weights (cont.)

Each NPSAS sample case has one record containing about 600 derived variables

Each case has been assigned a weight

The average weight for each case is 240, but there is a wide range of weight values

There is only one weight for each case

In general, the weights are lower for the 12 state cases


Why Do The Weights Vary?

Initial sampling rates differ (for the type of institution, type of student, 12 states, etc)

Non-response weight adjustments: weights are adjusted for those who did not respond to certain questions

Poststratification to known totals: the sample weights are adjusted so that weighted totals match known population totals

Smaller sample sizes result in larger weights

Lower institutional/student response rates result in larger weights

Larger weights mean less precision in estimates


An example to illustrate weights

Example   Case Weight   Case Grant   Weighted Total Grant
Case 1    100           $2,000       $200,000
Case 2    200           500          100,000
Case 3    300           500          150,000
Case 4    400           0            0
Case 5    500           300          150,000
Case 6    500           2,000        1,000,000
Total     2,000                      1,600,000

Total weight of cases with grants: 1,600

% with grants: 80% (1600/2000)

Average grant: $1,000 ($1.6 million/1600)

Mean grant: $800 ($1.6 million/2000)
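The three results on this slide can be reproduced from the six case weights and grant amounts. The sketch below is plain Python written for this example, not DAS code; the variable names are illustrative.

```python
# (case weight, case grant in dollars) for Cases 1-6 on the slide.
cases = [(100, 2000), (200, 500), (300, 500), (400, 0), (500, 300), (500, 2000)]

total_weight = sum(w for w, _ in cases)                  # all students
weight_with_grants = sum(w for w, g in cases if g > 0)   # students with grants
weighted_total_grant = sum(w * g for w, g in cases)      # total grant dollars

percent_with_grants = 100 * weight_with_grants / total_weight
average_grant = weighted_total_grant / weight_with_grants  # excludes zeros
mean_grant = weighted_total_grant / total_weight           # includes zeros

print(total_weight, weight_with_grants, weighted_total_grant)  # 2000 1600 1600000
print(percent_with_grants)  # 80.0
print(average_grant)        # 1000.0
print(mean_grant)           # 800.0
```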


DAS Output Example: Total Grant Amount (TOTGRT)

The weighted N shown in cells is the denominator

Function        Variable   Weighted N in cells (denominator)
Percentage>0    TOTGRT     Total students
Average>0       TOTGRT     Students with grants
Mean            TOTGRT     Total students


Small Sample Sizes: Low N in DAS Output

DAS will produce “low N” instead of an estimate

When does this occur? If the denominator has fewer than 30 cases (meaning the sample size is less than 30), the estimate is suppressed and “low N” is shown instead

The rule of thumb in statistics: if the sample size is less than 30, you cannot produce reliable estimates of the population

Percentages: The row (denominator) must have 30+ cases

Average>0: The number in the cell (denominator) must have 30+ cases
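The suppression rule amounts to a simple threshold check on the unweighted number of cases in the denominator. The function below is a hypothetical sketch of that rule, not the DAS code, and the example numbers are made up.

```python
MIN_CASES = 30  # threshold stated on the slide


def das_cell(estimate, unweighted_cases, min_cases=MIN_CASES):
    """Return the estimate, or "Low N" when the denominator has too few cases."""
    if unweighted_cases < min_cases:
        return "Low N"
    return estimate


# Hypothetical cells: an average based on 20 cases and one based on 40 cases.
print(das_cell(400, 20))  # Low N
print(das_cell(400, 40))  # 400
```

Note that the check uses the number of sample cases, not the weighted N that appears in the output.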


Small Sample Size Example

                                Dependents   Independents
% Grants                        20%          80%
  Weighted N shown              5,000        5,000
  Cases in denom. [not shown]   [100]        [50]
Average grant                   Low N        $400
  Weighted N shown              Low N        4,000
  # of cases [not shown]        [20]         [40]
Mean grant                      $80          $320
  Weighted N shown              5,000        5,000
  Cases in denom. [not shown]   [100]        [50]

Note: The weighted N’s do not give an indication of the size of the samples. The number of cases in each category is not shown in the DAS output. Only those with access to the raw data know this information.


Poststratification to known totals

Primary weights were adjusted in computer models using 75 control totals to reflect:

National enrollment totals for sectors (9 totals)

National total Pell Grant dollars by sector (9 totals)

National total Stafford loan dollars by sector (9 totals)

12 state Pell dollars by sector (36 totals)

12 state Stafford loan dollar totals (12 totals)
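The idea of adjusting weights to a known total can be illustrated with a single ratio adjustment. This is a deliberately simplified sketch: the actual NPSAS adjustment was done in computer models against all 75 control totals at once, and the weights, Pell amounts, and control total below are made up for illustration.

```python
def poststratify(weights, amounts, control_total):
    """Scale weights so the weighted sum of amounts matches control_total."""
    current_total = sum(w * a for w, a in zip(weights, amounts))
    factor = control_total / current_total
    return [w * factor for w in weights]


# Hypothetical sector with three sample cases.
weights = [100, 200, 300]
pell_dollars = [2000, 0, 1000]
known_pell_total = 625_000  # hypothetical control total

adjusted = poststratify(weights, pell_dollars, known_pell_total)
print(adjusted)  # [125.0, 250.0, 375.0]
print(sum(w * a for w, a in zip(adjusted, pell_dollars)))  # 625000.0

# The control totals listed on the slide add up to 75.
print(9 + 9 + 9 + 36 + 12)  # 75
```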


Statistical Analysis

Standard errors and analyzing estimates


Reliability of NPSAS data

Representative data

At the national level

For the three major sectors at the state level

Unlike the Census, this does not provide data for the whole population, only for a sample of institutions and students.

When analyzing data, the uncertainty and errors related to sample data must be kept in mind.


Standard errors

Standard errors accompany certain statistical estimates, such as percents, averages, and means.

They specify the expected uncertainty in study results.

They reflect the extent to which a study result represents the “true” value in the population.

They are calculated from two general sources of error.


Errors in data

Sampling error occurs due to . . .

- Random-chance selection of too many of a particular type of student or institution.

Measurement error occurs due to . . .

- Refusal of some students or institutions to participate
- Not all students and institutions provide data for each item
- Respond differently to items.
- Mistakes in recording and coding responses.


Analyzing estimates by assessing their errors

All estimates have some measure of error accompanying them.

There are 2 ways of analyzing the errors in NPSAS data:

One-Sample Case

- For any given statistic, how representative is the statistic of the population (parameter)?

Two-Sample Case: Comparing 2 statistics

- Do the sample statistics differ enough to conclude that the populations actually differ on the measured characteristic (or parameter)?


One-sample case: confidence intervals

Confidence intervals provide a range around the estimate. At a given confidence level (for example, 95 percent), intervals constructed this way contain the population’s true value in that share of repeated samples

The larger the confidence interval, the less precise the estimate and the wider the range of possible population values

This will be easier to illustrate with an example.


One-sample case: confidence intervals (CI) (cont)

Constructing a CI for the percent of all dependent students in Minnesota who applied for federal aid:

NPSAS institution representative sample states = Minnesota
Dependency status = Dependent

                                   Applied for any aid   Applied for federal aid
                                   (%>0.5)               (%>0.5)
Estimates
  Total                            88.1                  77.6
  Race-ethnicity (with multiple)
    White                          88.4                  77.6
    Minority/non-white             85.5                  77.5
Standard Errors
  Total                            1.20                  1.27
  Race-ethnicity (with multiple)
    White                          1.33                  1.53
    Minority/non-white             3.84                  2.97

To construct a confidence interval with a 95 percent confidence level (which means that intervals built this way contain the true population value about 95 percent of the time), find the estimate and its standard error

Multiply the standard error by 1.96

1.96*1.27=2.489

Subtract this number from, and add it to, the estimate

77.6 -/+ 2.489 = (75.111, 80.089)

This is the 95 percent CI for this estimate: if the sample were repeated many times, about 95 percent of the intervals built this way would contain the actual percentage of dependent students in MN who applied for federal aid. Here the interval runs from about 75% to 80%
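The same calculation in a few lines of Python, using the estimate (77.6) and standard error (1.27) for dependent students who applied for federal aid. The function name is only illustrative.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Return (lower, upper) bounds of the CI; z = 1.96 gives a 95 percent CI."""
    margin = z * std_error
    return estimate - margin, estimate + margin


lower, upper = confidence_interval(77.6, 1.27)
print(round(1.96 * 1.27, 3))             # 2.489
print(round(lower, 3), round(upper, 3))  # 75.111 80.089
```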


One-sample case: confidence intervals (CI) (cont)

The CI for the percent of all dependent students in Minnesota who applied for federal aid:

This interval gives the lower and upper values between which we expect, with 95% confidence, to find the true population characteristic (or parameter); i.e., the actual percent of dependent students in MN who applied for federal aid is estimated to be between about 75% and 80%

[Number line: 75.1% (lower bound), 77.6% (estimate), 80.1% (upper bound); axis label: % receiving aid]


Two-sample case: comparing two estimates

Construct CIs to compare the difference between the percent of white and minority/non-white dependent students in MN who applied for any aid:

NPSAS institution representative sample states = Minnesota
Dependency status = Dependent

                                   Applied for any aid   Applied for federal aid
                                   (%>0.5)               (%>0.5)
Estimates
  Total                            88.1                  77.6
  Race-ethnicity (with multiple)
    White                          88.4                  77.6
    Minority/non-white             85.5                  77.5
Standard Errors
  Total                            1.20                  1.27
  Race-ethnicity (with multiple)
    White                          1.33                  1.53
    Minority/non-white             3.84                  2.97

Construct a CI with 95 percent confidence level for each estimate:

The CI for the % of white students who applied for any aid:

88.4 -/+ (1.33*1.96) = (85.8, 91.0)

The CI for the % of minority/non-white students who applied for any aid:

85.5 -/+ (3.84*1.96) = (78, 93)

Now compare these two CIs: do they overlap? In this case they overlap, so under this check the difference is NOT treated as statistically significant. For two estimates to be declared statistically different under this check, the CIs must not overlap. The check is conservative.
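A minimal sketch of the overlap check, using the White (88.4, SE 1.33) and minority/non-white (85.5, SE 3.84) estimates from the table. The helper names are hypothetical.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Return (lower, upper) bounds of the CI; z = 1.96 gives a 95 percent CI."""
    margin = z * std_error
    return estimate - margin, estimate + margin


def intervals_overlap(a, b):
    """True if the two (lower, upper) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


white_ci = confidence_interval(88.4, 1.33)
minority_ci = confidence_interval(85.5, 3.84)

print([round(x, 1) for x in white_ci])     # [85.8, 91.0]
print([round(x, 1) for x in minority_ci])  # [78.0, 93.0]
print(intervals_overlap(white_ci, minority_ci))  # True
```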


Two-sample case: comparing two estimates

Not only are these estimates not statistically significantly different, but we can learn something else from this sample. The large standard error for the minority/non-white estimate indicates that this estimate is less precise. In this case, the sample is small, which reflects the fact that the minority/non-white population in Minnesota is small (thus a larger standard error is to be expected).

White students: 85.8% (lower bound), 88.4% (estimate), 91% (upper bound)

Minority/non-white students: 78% (lower bound), 85.5% (estimate), 93% (upper bound)


Two-sample case: another approach for comparing two estimates

Besides constructing CIs, you can use the two-sample t-test. You can do this either by hand using the equation below or by going to the DAS help center and selecting T-tests.

The two-sample t-test uses the estimates and the standard errors:

$$ t = \frac{\text{Estimate}_1 - \text{Estimate}_2}{\sqrt{(\text{Std Error}_1)^2 + (\text{Std Error}_2)^2}} $$

The absolute value of the result is compared to 1.96; if it is larger than 1.96, then the difference between the estimates is statistically significant. In this case,

$$ t = \frac{45.82 - 13.47}{\sqrt{(3.2)^2 + (1.42)^2}} $$

This equals 9.24. Since this is larger than 1.96, the difference between these two estimates is statistically significant.

NPSAS institution representative sample states = Minnesota

                                          State grants total (>0.5%)
Estimates
  Total                                   18.71
  Income of dependent student's parents
    < $40,000                             45.82
    $40,000 +                             13.47
Standard Errors
  Total                                   1.08
  Income of dependent student's parents
    < $40,000                             3.2
    $40,000 +                             1.42
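The hand calculation can be checked with a short script. This is a sketch of the two-sample t statistic as described on this slide (difference of estimates divided by the square root of the sum of squared standard errors); the function name is illustrative.

```python
import math


def two_sample_t(estimate_1, std_error_1, estimate_2, std_error_2):
    """t statistic for comparing two estimates from distinct populations."""
    return (estimate_1 - estimate_2) / math.sqrt(std_error_1**2 + std_error_2**2)


# Percent with state grants: parents' income < $40,000 versus $40,000 +.
t = two_sample_t(45.82, 3.2, 13.47, 1.42)
print(round(t, 2))    # 9.24
print(abs(t) > 1.96)  # True, so the difference is statistically significant
```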


Two-sample case: another approach for comparing two estimates

The two-sample tests (both the CI comparisons and the two-sample t-test) are meant for comparing two distinct populations (i.e. no overlap).

If the populations overlap, such as if one is a subset of the other (like Minnesota and the U.S.), then the two-sample t-test has a correction factor and the following test statistic is used:

$$ t = \frac{\text{Estimate}_a - \text{Estimate}_b}{\sqrt{SE_a^2 + SE_b^2 - 2\, r_{ab}\, SE_a\, SE_b}} $$

Since the correlation in the last term, r_ab, is not available, we can set this up without that term. Then it looks like the regular two-sample t-test. Note that when the two estimates are positively correlated, as they are when one population is a subset of the other, dropping the term makes the denominator larger, so this test statistic is more conservative than it would be if we had used the correct formulation.
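To see why dropping the correlation term is conservative, the sketch below computes the statistic both ways. All numbers, including the correlation of 0.4, are made up for illustration; in practice the correlation is not available from the DAS.

```python
import math


def overlap_t(est_a, se_a, est_b, se_b, r_ab=0.0):
    """t statistic with the correlation correction; r_ab=0 gives the regular test."""
    variance = se_a**2 + se_b**2 - 2 * r_ab * se_a * se_b
    if variance <= 0:
        raise ValueError("variance of the difference must be positive")
    return (est_a - est_b) / math.sqrt(variance)


# Hypothetical estimates for a state (a) and the nation that contains it (b).
t_with_r = overlap_t(20.0, 1.0, 18.0, 0.5, r_ab=0.4)  # hypothetical correlation
t_without_r = overlap_t(20.0, 1.0, 18.0, 0.5)         # correlation term dropped

print(round(t_with_r, 2), round(t_without_r, 2))
print(abs(t_without_r) < abs(t_with_r))  # True: the uncorrected test is more conservative
```

With a positive correlation, the uncorrected statistic is always the smaller of the two in absolute value, so it is less likely to cross 1.96.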


The End – Thank you!

For more information, contact:

- Tricia Grimes [email protected]
- Shefali Mehta [email protected]
- Technical support: Aurora D'Amico (NCES) [email protected]
- Questions about the NPSAS 2004: Tracy Hunt-White (NCES) [email protected]