StatPac for Windows User's Guide
A procedure in StatPac refers to a set of commands that perform one or more tasks. A procedure may specify a single analysis or several analyses of the same type. Procedures can also contain commands to perform transformations and write subfiles.
The commands to perform an analysis (or series of analyses) can be stored in a file called a "procedure file". This means that you can easily recall a previously executed procedure and make changes to it without having to retype the commands. The procedure file is automatically stored on disk with the study name and a .PRO extension. You can also store procedure files using different names. Procedure files can be saved, loaded, and merged with other procedure files.
Click on the Analysis button to start the procedure file editor. The commands to run analyses are typed into the text window on the left of the screen. The Variable List window (on the right of the screen) will show the names of the variables in the current study. The first time you click the Analysis button for a given study, you will be offered the opportunity to create an automatic Topline procedure file.
The procedure file editor is similar to any text editor, although it has numerous built-in features to simplify editing procedure files.
StatPac uses an easy programming language for designing procedures. It also has a large selection of automatic features to simplify the process.
For example, a simple procedure might be:
STUDY SURVEY
FREQUENCIES RACE
..
This procedure consists of a single task using a study called SURVEY (i.e., a codebook called SURVEY.COD and a data file called SURVEY.DAT). The procedure says to use the study called SURVEY, and perform a frequency analysis of the RACE variable. Notice that a procedure always ends with two dots (periods).
A more complex procedure would be:
FREQUENCIES RACE, INCOME, SEX
This procedure contains three tasks, each using the same study (and data file). If you execute this procedure, the program will first do a frequency analysis of RACE, then of INCOME, and finally of SEX. There is no limit to the number of tasks that can be specified in a single procedure.
A procedure file may also contain many different procedures. The only requirement is that the procedures be separated from each other by two dots. For example, the following commands specify three procedures:
CROSSTABS RACE BY AGE
FREQUENCIES AGE, INCOME, SEX, PREFERENCE
The first procedure contains one task, the second one task, and the third four tasks. These commands would actually run six analyses. Notice that the study name is only specified once (in the first procedure). Subsequent procedures will automatically use the same study and data file. Usually, the STUDY command is used only once.
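The two-dot separator rule can be illustrated outside StatPac with a short Python sketch. It is not part of StatPac: the function name `split_procedures` and the sample text are hypothetical, and the first procedure in the sample (STUDY SURVEY with one frequency analysis) is an assumed stand-in.

```python
def split_procedures(text):
    """Split procedure-file text into procedures, each a list of command lines.

    A line consisting only of two dots ends the current procedure.
    """
    procedures, current = [], []
    for line in text.splitlines():
        if line.strip() == "..":
            procedures.append(current)
            current = []
        elif line.strip():
            current.append(line)
    if current:
        # Assumption of this sketch: tolerate a missing final "..".
        procedures.append(current)
    return procedures


# Hypothetical procedure file with three procedures.
sample = """STUDY SURVEY
FREQUENCIES RACE
..
CROSSTABS RACE BY AGE
..
FREQUENCIES AGE, INCOME, SEX, PREFERENCE
..
"""

procs = split_procedures(sample)
print(len(procs))  # number of procedures found
for number, commands in enumerate(procs, start=1):
    print(number, commands)
```

Only the first procedure in the sample contains a STUDY line; the later procedures rely on it, as described above.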
The use of the STUDY keyword in the first procedure is mandatory since it defines the codebook and data file names for all the following procedures. However, the STUDY keyword may also be used in subsequent procedures. If the keyword STUDY is specified in another procedure, that procedure, and the procedures following it, will use the new codebook and data file.
The following commands contain two procedures, each having two tasks. The STUDY command is used in both procedures. This means that the first procedure will analyze data from one study (SURVEY1) and the second procedure will analyze data from another study (SURVEY2).
STUDY SURVEY1
FREQUENCIES AGE, PREFERENCE
..
STUDY SURVEY2
FREQUENCIES AGE, PREFERENCE
..
The STUDY keyword not only specifies the name of the codebook to be analyzed, but it also implicitly specifies the name of the data file. In most cases, the codebook and the data file name are the same (except for the extensions). A codebook called SURVEY would usually use a data file called SURVEY.DAT.
Sometimes, the codebook name and data file names will not be the same. For example, if the same study had been performed each year, you might have several data files with the same codebook name, but with different data file names. The DATA keyword may be used to analyze different data files (all using a common codebook name).
FREQUENCIES AGE, ATTITUDE
FREQUENCIES AGE, ATTITUDE
In the above example, each procedure uses the DATA keyword to specify what data file should be analyzed by that procedure. Both procedures will use the codebook called SURVEY.COD. Even though the STUDY command is only specified in the first procedure, subsequent procedures will use the same codebook name unless another STUDY keyword is used to change it. The first procedure will read data from a file called DATA-97.DAT and the second procedure will read from a data file called DATA-98.DAT.
Whenever you use the STUDY or DATA commands, the last specification will remain in effect until changed by another STUDY or DATA command. For example, all three of the following procedures will use a codebook called OPINION.COD and a data file called MARKET.DAT.
CROSSTABS INCOME WITH PREFERENCE
BANNERS AGE RACE INCOME WITH PREFERENCE
When you use the STUDY command to specify a new codebook, the data file will be changed to the new codebook name automatically. In other words, using the STUDY command overrides all previous STUDY and DATA command specifications. In the following example, the second procedure will use a data file called STUDY-2.DAT, even though the DATA command was used in the first procedure to specify a data file called SENSORY.DAT.
TTEST PRETEST WITH POSTTEST
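The way STUDY and DATA specifications carry forward can be modeled with a small Python sketch. It is an illustration of the rules described above, not StatPac's implementation. It assumes each keyword is followed by a bare file name without an extension; the function name `resolve_files`, the study name STUDY-1, and the FREQUENCIES AGE task are hypothetical.

```python
def resolve_files(procedures):
    """Return the (codebook, data file) pair in effect for each procedure.

    `procedures` is a list of procedures, each a list of command lines.
    """
    codebook = data_file = None
    resolved = []
    for commands in procedures:
        for line in commands:
            parts = line.split(None, 1)
            if len(parts) < 2:
                continue
            keyword, name = parts[0].upper(), parts[1].strip()
            if keyword == "STUDY":
                # A new study also resets the data file to the study name.
                codebook = name + ".COD"
                data_file = name + ".DAT"
            elif keyword == "DATA":
                # DATA changes only the data file; the codebook is kept.
                data_file = name + ".DAT"
        resolved.append((codebook, data_file))
    return resolved


# The last STUDY and DATA specifications stay in effect for later procedures.
carried = resolve_files([
    ["STUDY OPINION", "DATA MARKET", "FREQUENCIES AGE"],
    ["CROSSTABS INCOME WITH PREFERENCE"],
    ["BANNERS AGE RACE INCOME WITH PREFERENCE"],
])
print(carried)

# A later STUDY overrides an earlier DATA specification.
overridden = resolve_files([
    ["STUDY STUDY-1", "DATA SENSORY", "TTEST PRETEST WITH POSTTEST"],
    ["STUDY STUDY-2", "TTEST PRETEST WITH POSTTEST"],
])
print(overridden)
```

In the first call all three procedures resolve to OPINION.COD and MARKET.DAT. In the second call the second procedure resolves to STUDY-2.COD and STUDY-2.DAT.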
If a command is too long to fit on one line (e.g., a long variable list), StatPac will automatically indent subsequent lines. You can simply continue typing and the word-wrap feature will take care of the indentation for you. You can also use an explicit (hard) return to begin the next line. A break between lines should always occur between words or between sets of parentheses. A continuation line is denoted by indenting at least one character (typing at least one space at the start of the line). The following procedure will perform a frequency analysis on eight variables. Since the second line is indented, StatPac will interpret it as a continuation of the previous line.
FREQUENCIES AGE RACE SEX INCOME STATUS
   CATEGORY HOUSING TRANSPORTATION
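The continuation rule (an indented line belongs to the line before it) can be sketched in Python as follows. This is an illustration only, not StatPac's parser, and the function name `join_continuations` is hypothetical.

```python
def join_continuations(lines):
    """Merge indented continuation lines into the command they continue."""
    commands = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and commands:
            # Indented: append to the previous command.
            commands[-1] += " " + line.strip()
        else:
            commands.append(line.strip())
    return commands


lines = [
    "FREQUENCIES AGE RACE SEX INCOME STATUS",
    "   CATEGORY HOUSING TRANSPORTATION",
]
merged = join_continuations(lines)
print(merged[0])
print(len(merged[0].split()) - 1)  # variables after the command word: 8
```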
Sometimes, continuation lines can be used to make a procedure easier to read. The following procedure will perform descriptive statistics on three variables. Because the variable names are indented, they will be interpreted as a continuation of the DESCRIPTIVE line.
Comment lines may be included in a procedure file. Their purpose is to allow you to embed notes within a procedure file. They are especially helpful when reviewing a procedure file that you have not used for a long time. Comment lines will be ignored when performing an analysis. A comment line begins with an apostrophe or the word REM. There are no restrictions on the text that may be included in a comment line. Comment lines may also use continuation lines. For example, the following procedure contains two comment lines. The second comment also has a continuation line:
REM This procedure has two comment lines
' This procedure will only use the first 50 records
   for the analysis because the SELECT command is used
Comment lines can be useful when debugging a procedure that contains an unknown error. By selectively making each line a comment (adding an apostrophe to the beginning of the line), you can essentially eliminate that line as a possible cause of the error.
Most of the examples in this manual use variable names. However, it is important to note that either variable names or V numbers may be used interchangeably. For example, if AGE is variable twelve in a study, the following two commands would produce identical results:
Designing a procedure with StatPac consists of typing a series of commands. With the exception of comment lines and continuation lines, each line in the procedure will begin with a keyword or analysis command.
Keywords are used to modify an analysis. They may be used in a procedure to change labeling and perform transformations. In fact, they are used for everything except the actual selection of an analysis type. The STUDY and DATA commands are keywords. A listing of the keywords can be displayed by selecting View, Syntax Help. The keyword menu will appear in a window.
A single procedure can contain many keyword commands, but only one analysis command. In the following example, the analysis output will have a page heading and title because of the inclusion of two keywords in the procedure.
HEADING Acme Research, Inc. - System Analysis Study
TITLE Crosstabs between Shift and Efficiency Rating
CROSSTABS SHIFT BY RATING
While many keywords can be used in a procedure, only one analysis command can be specified. A listing of the analysis commands can be displayed by selecting View, Syntax Help. The help window will appear.
The Variable List window enables you to view and select variables for an analysis. It can be displayed by selecting View, Variable List. The width of the Variable List window can be adjusted by dragging the bar that separates the procedure file text from the Variable List window.
One convenient feature of the Variable List window is the ability to transfer variable names to the procedure text. To select a variable, highlight it in the Variable List window. To select multiple variables, hold down the shift or control key while clicking on the desired variables in the Variable List window.
To transfer selected variables from the Variable List window to the text of the procedure file, first select the desired variables in the Variable List window. Then double click in the procedure file where you want the variable names to appear. The highlighted variables in the Variable List window will be copied to the procedure file text, and the variable(s) will be deselected in the Variable List window.
The Variable Detail window lets you see detailed information about any variable, or change the information for a variable. To display the Variable Detail window, select View, Variable Detail. You can also double-click on a variable in the Variable List window to invoke the Variable Detail window.
All information for a variable can be changed except its format. Changes made in the Variable Detail window (e.g., revised labeling) will be saved in the codebook, and therefore will appear in the analyses.
The current variable displayed in the Variable Detail window can be changed by using the drop-down variable selection or by clicking on the desired variable in the Variable List window.
The Variable Detail window can be dragged to any location on the screen. Press and hold the left mouse button anywhere on the gray borders of the window. Drag the Variable Detail window to the desired location and release the mouse button to drop the window at that location.
You can hide the Variable Detail window by selecting View, Variable Detail. Alternatively, click on the X in the top right corner of the Variable Detail window.
Use the Find Dialog window to search for specific text in the procedure file or the results. Select Edit, Find (or use the Ctrl F shortcut) to display the Find Dialog window.
To begin a search, type the search text and click on the Find Next Button. After a search has been started and a match has been found, you can continue the search by clicking on the Find Next Button (or by pressing the [F3] shortcut). Upper and lower case differences will be ignored in the search.
Use the Replace Dialog window to replace specified text in the procedure file or results. Select Edit, Replace (or use the Ctrl H shortcut) to display the Replace Dialog window. Alternatively, you can click the Replace Button from the Find Dialog window.
Upper and lower case differences will be ignored when finding text. However, replaced text will use the exact text typed into the Replace With window.
Options are used to control the analysis. Options allow you to modify the defaults for an analysis; that is, they allow you to customize the analysis parameters themselves. Analysis options may be changed temporarily or permanently. When changed permanently, the current procedure and all future procedures will use the new defaults. When changed temporarily, only the current procedure will use the new options.
Some options are global and apply to all analyses. Other options are specific to the type of analysis being performed. If you select Options when there is no procedure file or when the cursor is in a procedure that does not specify an analysis, only the global options will be displayed. They allow you to set the pitch (point size) for the report, the page margins and paper orientation, the next page number to be printed, zoom factor, and weighting.
The margins are expressed in inches. The paper orientation may be set to OR=P (portrait) or OR=L (landscape). The zoom factor is an easy way to reduce the size of a table so it will fit on one page. Normally, ZF=100 and the printouts will be displayed at 100% of their normal size. Setting ZF=80 would display the tables at 80% of their normal size, so more columns would be able to fit on a page. The ecology option may be used to save paper. When EC=Y and you are saving the output to a batch file, all page breaks will be excluded. At the conclusion of the batch run, select System, Current Batch File, to print the file. When running interactively or batch to printer, EC=Y will only suppress page breaks within each task or procedure.
The WT (weighting) option lets you weight the data based on the value of another variable, and the FC (fractional counts) option controls whether the reports will show integer or fractional counts. They currently apply to all analyses in the Basic Statistics Module.
To view the options for an analysis, move the cursor to the procedure where the analysis command is specified and select Options. If no analysis is specified in the current procedure, only the global options will be shown.
The options for each analysis are different. If the current procedure contains an options line that changes the default values, the modified values will be displayed in yellow. Any errors in the option line will be displayed in red. To change the option temporarily, simply type the new value for the option. To make a permanent option change, type the new value and add an exclamation point as a suffix. For example, typing Y changes an option to yes for the current procedure only. Typing Y! changes the option permanently so that all future analyses will use the default of Y.
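The difference between a temporary and a permanent option change can be modeled with a short Python sketch. It only illustrates the exclamation point rule described above and is not how StatPac stores its options; the function name `apply_options` and the use of plain dictionaries are assumptions.

```python
def apply_options(defaults, typed_values):
    """Return the options in effect for the current procedure.

    Values typed with a "!" suffix are also written back to `defaults`,
    so later procedures start from the new value.
    """
    effective = dict(defaults)
    for name, typed in typed_values.items():
        permanent = typed.endswith("!")
        value = typed[:-1] if permanent else typed
        effective[name] = value
        if permanent:
            defaults[name] = value
    return effective


defaults = {"FC": "N"}
print(apply_options(defaults, {"FC": "Y"}), defaults)   # temporary change
print(apply_options(defaults, {"FC": "Y!"}), defaults)  # permanent change
```

After the first call the stored default is still N. After the second call the stored default is Y.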
Weight and Fractional Counts Options
The WT option lets you apply non-integer (fractional) weighting to procedures. It is used when the sample differs from known population parameters. To apply case weighting, you must first create a variable that contains a weight.
The following example computes weights for each of three groups and saves the weight for subsequent analyses. The CaseWeight variable will become the last variable in the study.
NEW (N7) "CaseWeight"
IF GROUP = 1 THEN COMPUTE CaseWeight = 0.4172
IF GROUP = 2 THEN COMPUTE CaseWeight = 0.8735
IF GROUP = 3 THEN COMPUTE CaseWeight = 1.0963
Subsequent procedures could then apply weights to the analyses using the WT option. Parentheses are required around the variable name. Since an exclamation point is used as a suffix, weighting will become the default for all subsequent analyses. In this example, both the frequencies and descriptive statistics procedures would weight the data. If the exclamation point had been excluded, weighting would only be used in the frequencies procedure.
Unlike other options, the WT option (with a ! suffix) only applies to the current StatPac session. If you quit StatPac and restart it, the WT option will be set to N (None). This is done to prevent a potentially serious mistake. For example, suppose you run a procedure file with weighting and then end StatPac. The next day you run StatPac and begin processing a different procedure file. If the WT option were persistent, weighting might inadvertently be applied to the new procedure file when you didn't intend it to be and, worse, you might not realize it.
You can turn weighting on and off by using WT=(VariableName) and WT=N. In the following example, weighting is applied to the first and second procedures, but not to the third and fourth procedures.
TITLE Weighted Frequencies for: (#)
TITLE Weighted Descriptive Statistics for: (#)
TITLE Unweighted Frequencies for: (#)
TITLE Unweighted Descriptive Statistics for: (#)
The FC (fractional counts) option may be set to Y or N. It sets whether the N's (counts) in the reports will be shown as integers or decimal values. The FC option only applies when weighting is used. In unweighted data, the counts will always be integer values (whole numbers).
Weights are easily calculated as the desired percentage divided by the observed percentage (or the desired count divided by the observed count). For example, suppose you know that the population has 55% males and 45% females. This is called a known population parameter. Your survey sample, however, has 40% males and 60% females. If the responses to other variables were different for males and females, your reports might present a distorted estimate of the population. Weighting would be used to eliminate the gender sampling error. The weight for males would be 55/40 and the weight for females would be 45/60. In the following example, the first procedure calculates a GENDERWEIGHT variable and saves it. The second procedure uses the WT option to weight the data based on the GENDERWEIGHT variable.
NEW (N5) "GENDERWEIGHT"
IF SEX="M" THEN COMPUTE GENDERWEIGHT = 55/40
IF SEX="F" THEN COMPUTE GENDERWEIGHT = 45/60
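The arithmetic behind these two weights can be checked with a short Python sketch. It is a worked example outside StatPac; the function name `case_weights` is hypothetical, and the percentages are the ones from the example above.

```python
from fractions import Fraction


def case_weights(desired, observed):
    """Weight for each group = desired share / observed share."""
    return {group: Fraction(desired[group], observed[group]) for group in desired}


population = {"M": 55, "F": 45}  # known population percentages
sample = {"M": 40, "F": 60}      # observed sample percentages

weights = case_weights(population, sample)
for group, weight in weights.items():
    print(group, float(weight))  # M 1.375, then F 0.75

# Applying the weights to the sample shares reproduces the population shares.
weighted = {group: sample[group] * weights[group] for group in sample}
total = sum(weighted.values())
print({group: float(100 * weighted[group] / total) for group in weighted})
```

The weight for males is 55/40 = 1.375 and the weight for females is 45/60 = 0.75. After weighting, the sample shares are 55% and 45%, matching the known population parameter.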
Important User Tip
The first few times you run StatPac for Windows, experiment with the options to find the values that produce the report formatting you want. Rather than setting these options in each procedure, use the exclamation point suffix to make them permanent. After running a few procedures, you'll have configured the default formats for StatPac to produce the reports you most often use.
The File selection of the menu allows you to load, merge, and save procedure files. To open a new procedure file, select File, Open, or click the Open Button. To save the current text in a procedure file, select File, Save, or click the Save Button.
To begin a new procedure file, select File, Open, and change the Files of Type to codebooks. Select the codebook and click OK. The STUDY command will be inserted as the first line of the procedure.
Procedure files are always saved with a .pro extension.
If loading a new file, and the current procedure file text has changed, StatPac will check to see if you want to save the current text before abandoning it and loading the new file. Note that anytime you run a procedure, the entire procedure file will be saved before the procedure is run. Thus, if you load a new file immediately after running a procedure, it is not necessary to save the current procedure file before loading the new procedure file because it will already have been saved.
To merge the text from a procedure file previously stored on disk into the current text window, position the cursor where you want the text to be loaded and then select File, Merge. The text will be inserted ahead of the cursor.
The current procedure file can be printed by selecting File, Print. Select the procedures you want to print and click OK.
If you choose to specify procedures, you must type the procedure numbers that you want to be printed. Procedure numbers can be separated from each other by commas or spaces. A dash can be used to indicate a range of procedures. For example, the following would print procedures 1, 2, 8, 9, 10, and 15:
1, 2, 8-10, 15
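How such a selection expands into individual procedure numbers can be sketched in Python. This illustrates the notation only and is not StatPac's own parser; the function name `parse_procedure_numbers` is hypothetical, and the sketch assumes a range is typed without spaces around the dash.

```python
import re


def parse_procedure_numbers(spec):
    """Expand a selection such as "1, 2, 8-10, 15" into a list of numbers."""
    numbers = []
    # Numbers may be separated by commas, spaces, or both.
    for token in re.split(r"[,\s]+", spec.strip()):
        if not token:
            continue
        if "-" in token:
            first, last = token.split("-", 1)
            numbers.extend(range(int(first), int(last) + 1))
        else:
            numbers.append(int(token))
    return numbers


print(parse_procedure_numbers("1, 2, 8-10, 15"))  # [1, 2, 8, 9, 10, 15]
```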
Click on the Run button to execute the commands in the text window (i.e., to run the analysis). StatPac will give you the option to specify which procedures should be run, the operating mode, disposition of the output, name of a file to store the output, and the starting page number for the output.
After setting these parameters, click OK to run the analyses.
Procedure(s) To Run
The "Procedure(s) To Run" field may contain an individual procedure or a range of procedures. The default will be the procedure where the cursor was located when the Run button was clicked. If you highlight text before clicking the Run button, the default procedure(s) will be all the procedures that contained highlighted text. A range of procedures may be specified with a dash. To run procedures one through ten, you would type 1-10 in the Procedure(s) To Run field. To run from procedure 5 to the last procedure, you would type 5- in the Procedure(s) To Run field. To run a single procedure, simply type the procedure number.
The Mode selection allows you to set the analysis to operate interactively, in batch, or in the test mode. When interactive is selected, all output will first be displayed on the screen before being sent to its final disposition (printer or file). You will be able to view, edit, print, and save the output, and you must manually tell StatPac when to go on to the next task or procedure.
The Batch mode is similar to the Interactive mode. The difference is that the program will automatically go on to the next task or procedure after showing the results for 3 seconds. During the 3-second display time, you can freeze the screen and view the output of the current task in more detail. You can then continue or cancel the batch run.
The Test mode will simply check the syntax of the selected procedures without actually running them.
When you begin to run an analysis, StatPac will first check the syntax of your procedure(s). The syntax checker will catch all major errors. It is, however, only a syntax checker. It can tell if the syntax is correct, not if the commands will do what you want. It also cannot check for data dependent errors, since these can only be discovered through actual data processing. If a syntax error is discovered, correct the error and re-run the procedure.
The Output selection will be displayed when processing in the Batch mode. It refers to the final disposition of the output (i.e., where you want the results of the analysis to be sent). You may send the output to the printer or a file. When you choose to send the output to a file, you will also be able to enter a file name for the output. The output will be saved in Rich Text Format.
In the Batch mode, the results of all analyses will first be displayed on the screen for 3 seconds. If you do nothing, the results will then be sent to the Output (printer or file). However, if you temporarily freeze the output by pressing the Pause button, you will have the choice whether to save or print the results.
Starting Page Number
The Starting Page Number is especially useful when processing in the Batch mode. StatPac will automatically increment the page numbers with each task. If you stop a batch run, or decide to rerun a particular task, you will need to manually set the starting page for the next run.
If the Starting Page Number is left blank, no page numbers will be printed on the output.
The page number will be placed on the page in the location specified in the header or footer template.
When running in the batch mode, the results will normally be shown on the screen for a few seconds before being printed or written to file. This gives you time to review the results before printing or saving them. When the No Delay box is checked, the results will be immediately sent to the output device.
After StatPac has finished processing an analysis, the results will be displayed. (In a batch run, the results will only be displayed for 3 seconds unless you freeze the program with the Pause button in the results editor.) The results editor will allow you to examine and edit the results before printing or saving them.
All files saved or loaded with the Results Editor will be in Rich Text Format with a .rtf extension. These files can also be loaded with your word processor.
Graphics are available when performing frequencies, crosstabs, descriptive statistics, breakdowns, and correlation analyses. A colorful graphics button will be shown on the Results Editor tool bar for these analyses. Clicking the button will let you select and edit the graphs.
To modify the appearance of a graph, select Edit, Legend & Labels. This will give you the opportunity to change any of the text on the graph, including legend information (if there is one). The legend information can be saved by selecting the Save As Default check box. Future graphs with legends will then use the new settings.
The actual creation of the graphs happens while the analysis is being performed. You can control the kinds of graphs that will be created and the labeling methods by selecting Edit, Creation Settings. This option is also available in the Analysis Editor by selecting System, Graph Creation Settings.
After you are satisfied with the appearance of the graph, you can do several things with it:
1. Print the graph immediately by selecting File, Print.
2. Add the graph to the next page of the results by selecting Edit, Copy Graph To Report. When you exit the Graphics Editor and return to the Results Editor, the graph will have been added to the end of the current results.
3. Save the graph as a file (.jpg, .bmp, or .wmf) by selecting File, Save Graph Image.
4. Save the graph in the clipboard by selecting Edit, Copy Graph To Clipboard.
5. Create a tab delimited file of the labels and data used to create the graph (not the graph itself) by selecting File, Save Delimited File.
When processing in the batch mode, a table of contents will be created if page numbering is used. After the batch processing is finished, you can display and print the table of contents by selecting System, Current Table of Contents. This can be copied and pasted to the beginning of the document created by the batch processing so that it begins with a table of contents.
The TITLE keyword in each procedure of the batch run will be used to create the entry in the table of contents. If a procedure has no title, the table of contents will contain the analysis command line in the procedure.
The current table of contents will be erased and a new table of contents will be started when the starting page is set to 1 in the Run dialog. Setting the starting page number to a value greater than 1 will add to the existing table of contents.
A Topline contains basic analyses for all the variables in a study. It consists of frequencies, descriptive statistics, and listings of open-ended comments.
When you click on the Analysis button, StatPac will ask if you want to create an automatic Topline procedure file. After you've run any analysis and the StudyName.pro file exists, you will not be asked the question again.
The report generated by an automatic Topline will provide a good summary of the data. If you're just after "answers" and not particularly concerned about labeling, the automatic Topline procedures can be run "as is". If you want a "camera-ready" report, you'll want to edit the procedures (especially the Title and Options commands). Most users will view the automatically generated Topline as a solid foundation rather than an end product.