Usability of PSPP

Lim Zhen Ling Samantha

Nanyang Technological University

PSPP is a free, open-source alternative to SPSS, the commonly used proprietary statistical package. SPSS’s cost of ownership is high, so access is limited to the school’s computer laboratories. It can also be accessed from home via the NTU VPN, but not by users of browsers other than Internet Explorer. I would therefore like to investigate whether I can easily conduct similar statistical tasks using PSPP, and thereby evaluate my user experience of the program.

I decided to follow up on a tutorial on multiple regression. The three tasks I set out to carry out were: generating a box plot, histogram and summary statistics for the four variables in the tutorial dataset; producing a correlations table for the four variables; and running a linear regression for a reduced model, M2, which contains only one of the three independent variables (IVs).

I conducted the tasks using my Acer laptop in my room at home. I found the GNU PSPP website using the Google search engine. Then I downloaded the software via the website and installed it on my laptop. With a simple click of the icon, the program started up.

My first impression of the program was that it looked more accommodating to the novice user than SPSS. There are large, colourful icons for functions such as OPEN, SAVE, GOTO CASE and FIND. Despite the visibility of these functions, I was at a loss as to how to plan my action sequence, as the program did not draw on my prior knowledge. Apart from OPEN and SAVE, the icon functions were unfamiliar. A closer inspection revealed a horizontal menu bar at the top, similar to that of SPSS.

I clicked on the OPEN icon and a dialog box asking for the FILE NAME appeared. I selected the dataset with a click and selected OPEN; the system gave immediate feedback by loading the dataset.

To generate the histogram, box plot and summary statistics, I selected ANALYZE from the menu bar and then DESCRIPTIVE STATISTICS; a drop-down menu appeared at the side. I selected EXPLORE and a dialog box appeared. I had to classify the four variables into the FACTOR LIST and the DEPENDENT LIST. As I moved the cursor over the variables, I was pleased to notice that the short form of each variable popped up on top of its long name. This is a convenient feature. However, I was unable to highlight and move multiple variables at one time. After classification, I selected STATISTICS, but no options for the choice of summary statistics were available. I selected CONTINUE and then clicked on OPTIONS in the dialog box; there were still no options for the histogram and box plot.

I restarted by exploring the alternative, FREQUENCIES. I closed the current dialog box, selected ANALYZE, clicked on DESCRIPTIVE STATISTICS and then selected FREQUENCIES. A dialog box popped up. I selected the DV and clicked on the arrow button in the middle to move the DV to the VARIABLES list. I scrolled down and ticked the boxes for mean, standard deviation, kurtosis and skewness. I explored the dialog box opened by the FORMAT button: still no options for a histogram or box plot. I clicked OK to generate the output. Another dialog box appeared. I had successfully obtained the summary statistics but not the histogram or box plot, which PSPP lacks the functions to produce. I was also surprised to see that, instead of graphically rendered tables, the output tables were drawn with # and = characters, which made the data difficult to read (see Fig. 1.1).

My second task was to obtain correlations for the four variables in the dataset. First, I selected ANALYZE, but there was no CORRELATIONS option in the drop-down menu. I had reached a gulf of execution. I decided to search for the option by trial and error. Under ANALYZE, there were only three options, and only the first two had drop-down menus. I chose to explore DESCRIPTIVE STATISTICS, as I felt it was unlikely that the others contained the CORRELATIONS function. A CORRELATIONS option was found under the STATISTICS button in the CROSSTABS dialog box. However, after generating the output, I realized that the option in CROSSTABS was a false clue, as the data obtained did not coincide with that from SPSS. Before declaring this task a failure, I recalled that in SPSS I am able to type in syntax manually to generate the output. I attempted this, as it was likely that the CORRELATIONS function was available but that poor mapping had compromised its visibility, leaving critical functions ‘hidden’ from the user. In the CROSSTABS dialog box, I clicked on PASTE, and a dialog box for the syntax appeared. I deleted its contents, typed in the appropriate syntax used in SPSS and selected the PLAY icon to generate the output. The icon is a good metaphor borrowed from the familiar video player. An ERROR MESSAGE dialog box popped up, indicating that the PRINT function of the syntax could not be recognized (see Fig. 1.2). The feedback was not meaningful, but fortunately the response was constrained to the sole option available: OK. I selected it and the output was generated successfully.

Lastly, I was to run a linear regression analysis for a reduced model. To begin, I selected ANALYZE and then LINEAR REGRESSION. In the dialog box, I classified progress as the DV and logo as the IV. I noticed there were no options for the entry method of the variables, e.g. FORWARD or BACKWARD. I assumed that the default entry method, if unspecified, was ENTER. I clicked on the STATISTICS button in the dialog box, ticked the boxes for ANOVA, R and COEFF, and clicked CONTINUE. I selected OK to generate the output. The output was generated successfully; however, I noticed that the results were given to two decimal places, which made them less precise. I tried right-clicking and clicking on the numerical data for options to change the number of decimal places or to show the unrounded values; no options were available. Again, poor mapping interfered with a function’s accessibility.

Instructions

Summary statistics (PSPP cannot generate histogram and box plot)

Go to ANALYZE (menu bar at the top); select DESCRIPTIVE STATISTICS; select FREQUENCIES. Select the variable you would like to generate the statistics for, then click on the ARROW BUTTON in the middle to put the variable into the VARIABLES list. Scroll down and tick the boxes of the summary statistics you would like to generate. Click OK to generate the output.
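
The menu steps above have a syntax counterpart, which can be run from the syntax window in the same way as the correlations procedure described under the next heading. The lines below are a minimal sketch, assuming the short variable name Progress from the tutorial dataset and a PSPP version whose FREQUENCIES command accepts these STATISTICS keywords. They were not part of the original evaluation, and the exact syntax that the PASTE button produces may differ.

```spss
* Hypothetical example: summary statistics for one variable.
* Progress is assumed to be the short name of the DV.
FREQUENCIES
  /VARIABLES=Progress
  /STATISTICS=MEAN STDDEV SKEWNESS KURTOSIS.
```

The four keywords correspond to the boxes ticked in the dialog: mean, standard deviation, skewness and kurtosis.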

Correlations (by syntax input)

Go to ANALYZE; select DESCRIPTIVE STATISTICS; select CROSSTABS. You may place the variables in any way you wish (it doesn’t matter). After putting variables under ROWS and COLUMNS, you will notice that the PASTE button is active; select it. A dialog box will pop up. Clear the dialog box, i.e. select all and press DELETE. Then type in the following syntax, using the short forms of the variable names:

CORRELATIONS
/VARIABLES=Progress Logo Vocal Syntax
/PRINT=TWOTAIL NOSIG
/MISSING=PAIRWISE.

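Click on the PLAY icon to generate the output. If an error message about the PRINT function pops up (see Fig. 1.2), select OK; the output will still be generated.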

Linear regression for a reduced model

Go to ANALYZE; select LINEAR REGRESSION; select your DV and place it under DEPENDENT VARIABLE; select the IVs in your linear regression model and place them under INDEPENDENT VARIABLE. Note that selecting all the available IVs would run the analysis on the full model instead; do not do so. Click on the STATISTICS button; ensure that COEFF, R and ANOVA are ticked; select CONTINUE. Select OK to generate the output.
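
For reference, the dialog choices above map onto a short REGRESSION command. This is a sketch under assumptions: Progress and Logo are taken to be the short names of the DV and of the single IV in the reduced model (as in the correlations syntax), and the subcommands follow the PSPP documentation for REGRESSION, not output captured during this evaluation.

```spss
* Hypothetical example: reduced model M2, one IV only.
REGRESSION
  /VARIABLES=Logo
  /DEPENDENT=Progress
  /STATISTICS=COEFF R ANOVA.
```

Listing all three IVs after /VARIABLES= would fit the full model instead.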

Conclusion

PSPP is a useful statistical tool. As an open source project in its developmental stages, however, its usage remains somewhat bumpy and frustrating. As my mental model for this kind of program is derived from my experience with SPSS, I used that as a guide to complete the tasks in PSPP. The software therefore appeared easy to use, and tasks could be completed, wherever PSPP was consistent with the design of SPSS. However, I do not have a mental model for PSPP’s unique organization and features. Given some time to learn these discrepant functions, the action sequences would become automatic, and the software would seem significantly easier to use. Still, not all errors can be accounted for by unfamiliarity. The lack of visibility of critical functions, for instance, is a misrepresentation of the designer’s model by the system model, resulting in a flawed user mental model. The inappropriate conceptual model causes action sequences to be planned incorrectly, inevitably leaving tasks uncompleted.

Overall, five to six errors were made, not including the trial and error in task two. The alternative software, SPSS, would be easier for me to use because of an existing mental model and well-learned, automatic action sequences. Since SPSS is the most widely used statistical tool in the market, designers should assume that most users’ mental models are derived from it and thus design PSPP to be more consistent with it. This would save users effort in learning the interface and thereby enhance the user experience. The existing drop-down menus are short, which eases serial search, but visibility and accessibility are compromised. I suggest displaying more options in the initial drop-down menus; this would provide clues for generating the correct action sequences and minimize gulfs of execution. Other critical functions, such as the generation of graphs and the choice of entry method in linear regression, should be visibly included. The intuitiveness of the program can also be enhanced. For instance, the user should be allowed to manipulate the data directly.

Annexe

Fig. 1.1

Fig. 1.2
