OFFICE OF INSTITUTIONAL RESEARCH AND ASSESSMENT
APRIL 27, 2011
1
Introduction to the Institutional Research
Database (IRDB) and Discoverer
Today’s Agenda2
IntroductionsHousekeepingData SourcesIRDB Structures
Tables/Fields Facts/Dimensions
Discoverer Folders/Items Rows, Columns, Page Items Calculations
Documentation
How are data moved from operational systems into a data warehouse?
3
Step 1. Snapshots are extracted from operational systems. Step 2. Extracted files are reformatted and cleaned. Step 3. Pre-processed files are loaded into staging tables and
metadata are loaded into lookup tables in an Oracle relational database.
Step 4. Data in the staging tables are migrated to normalized tables.
Step 5. Summary tables and other high performance query structures are created from the normalized tables and lookup tables.
Step 6. Semester-based fact and dimension tables are created from the normalized tables and lookup tables.
Step 7. Longitudinal fact tables are created from the semester-based fact and dimension tables.
Steps 1–3Why are snapshots used to populate the
IRDB?4
SIMS
Show-Registration
File
SPSS
CleanShow-Reg
FileSQL*Loader SHOW_FILE
Extraction
COBOL
Transformation Load
AcademicProgramInventory
NYSED_PPROGRAM_LOOKUPDatabase Link Database View
SKAT
PerformanceFile
GraduationFile
PostGraduateSurveys
SkillsTests
Results
NCSPearsons
COBOL
PERF_FILE
GRAD_FILE
SKAT_FILE_02
VTEA_SURVEY_FILE_02
CleanPerformance
File
CleanGraduation
File
SPSS
SQL*Loader
SQL*Loader
SQL*Loader
SPSSSurvey
data withSSN’s
SQL*Loader
5
CUNY IRDBData Flow Diagram
CAS(freshman
admissions)
Special Reports
StandardizedFiles
Joins fromMultiple Tables
acrossMultiple Terms
Group bySelectedColumns
(SQL)
OracleDiscovererCrosstabs
Ad-HocQueries
Migrate Datainto Oracle9iEnvironment(SQL*Loader)
NormalizeData
(PL/SQL)
OracleForms
CUNY Data Bookon Institutional
ResearchWeb Site
Extract Files
OracleDiscoverer
Tables
ASTA(transfer
admissions)
SHOW(enrollment)
SKAT(skills tests)
PERF(grades)
GRAD(degrees)
NCES (job survey)
SFA(financial aid)
Clearinghouse(transfers to non-CUNYcolleges)
Reformatand CleanInput Files
(SPSS)
Create Factand Dimension
Tables(SQL)
Migrate Datainto Oracle 9iEnvironment(SQL*Loader)
Type orCut and Paste
SPSSfor
Windows
Crystal Reportsand
Oracle Portal
InstitutionalResearchers
UniversityAdministrators
Public Users
OracleDiscovererCrosstabs
StagingTables
Code Descriptionsfrom File Layouts
PC Files
OperationalData Store
(normalizedstudent-level
data)
LookupTables
(metadata)
Flash Enrollment
SummaryTables
(denormalizedaggregate-level
data)
DataWarehhouse
(denormalizedstudent-level
data)
LongitudinalCohorts
(denormalizedstudent-level
data)Ad-HocQueries
Spreadsheets
6
What are fact and dimension tables and how are they related?
7
A fact table is composed of numerical measures of business performance. Examples of facts would be headcount, FTE’s, and cumulative credits earned.
Dimension tables contain items that describe or categorize the items in the fact table. Examples of dimensions would be gender, full-time/part-time status, and college of attendance.
The fact table also contains foreign keys that can be used to join it with the primary keys of the dimension tables. For example, “Student ID”, “Term Enrolled Date”, and “College ID” are used to join the table “History Facts” with the table “History Major 1 Dim”.
A central fact table with multiple dimension tables radiating out from it is called a star schema.
What are the advantages of using a star schema?
8
Creates a database design that improves performance. Parallels, in the database design, how the end users
usually think and use the data. Provides versatile and robust ad-hoc query capabilities.
Provides an extensible design which supports changing
business requirements. Can be used with point-and-click tools such as Oracle
Discoverer 9iAs.
9
History Facts and Dimensions
The Joins between the Fact Table “History Facts” and its Dimension Tables Are Defined by an OIRA Administrator in the
Discoverer End-User Layer
10
How is a campus limited to viewing only the data of its own students?
11
IR.USERID_LOOKUP# userid# college_id# table_name# table_grant
IR.HISTORY_FACTS# student_id# term_enrolled_date# college_id
IR.SEC_COLLEGE_07_MV# sec_student_id
IRASIInstitutional
ResearchStaten Island
IRASI.HISTORY_FACTS# student_id# term_enrolled_date# college_id
Users Select “Items” from a “Folder”with a Mouse Rather than Writing and Executing SQL
Code
12
Discoverer13
IRDB End-User Query ToolCurrently accessed via Citrix
Requires user id/password – domain log in (managed by CIS) Discoverer (account required – managed by OIRA)
Set of Business Areas (linked fact and dimension tables) History Facts – Historical Enrollment Records Degree Facts - Historical Degree Records (through most recent
complete academic year) Cohort Facts – Integration of Enrollment and Degree data in a
longitudinal structure for tracking cohorts over time Special Business Area - mostly stand-alone tables for specific
analyses (e.g., PMP)
Users Arrange the Items as the Page-Breaks, Columns, and Rows for a Desired Report
14
Accessing the IRDB Through Discoverer
15
Navigate your web browser to https://ez.cuny.edu Log in with your LAN user id and passwordClick on the Discoverer icon in the list of available
applications via CitrixInstall Java code as prompted upon first use of a
given computer (you may need an IT technician to install programs on your computer)
After Java installation, you will be prompted to log in to Discoverer (user id and initial password established by OIRA)
Documentation available
Creating a New Workbook as a Crosstabs Reportwith Discoverer 9iAS
16
The Derived Fact “Headcount” Reflects the Business Rules for Excluding Some Students from Official Enrollment
Statistics
17
Creating a Layout for the Crosstabs Report
18
Discoverer Estimates the Time Needed to Run a Query
19
A Crosstab Built from “History Facts” andTwo Related Dimension Tables
20
Creating Totals and Subtotals21
Creating a New Workbook as a Table Report (or Extract) using Discoverer
22
Creating a Layout for the Table Report
23
An Example of an Implicit Condition
24
Sorting the Rows Retrieved
25
A Table Report of Fall 2002 Graduates with the Original Dimension “Birth Date”, the Computed Fact “Age”, and the Computed Dimension
“Age Group 1”
26
With Discoverer, Table Reports Can Be Exported in a Variety of Formats
27
28
Tracking Student Progress:How Should Many-to-Many Relationships
between Fact Tables be Resolved?
History Facts Degree Facts
29
Answer: Create an Intersection Entity that Has Many-to-One Relationships with both
Tables
History FactsIntersection
EntityDegree Facts
30
Selecting Three Different “Headcount” Facts from the Table “Cohort Facts”
31
“Headcount” of Undergraduates Who Entered in Fall 1990“Headcount” of Fall 1990 Entrants Who Returned in Fall 1991
“Headcount” of Fall 1990 Entrants Who Graduated by Summer 1996
32
Fall-to-Fall Retention of Fall 1990 Undergraduate Entrants
33
The “Headcount” Facts in the Table Cohort Facts” and the Foreign Keys that Join it with the Table “Degree Facts” Can Be Used to Create
a Graduation Rate Item
34
Six- Year Graduation Rates of Fall 1990 Undergraduate Entrants
35