StatPac for Windows User's Guide
Seven utility programs are provided to give greater control and more versatility over studies and data files. They can be run from the Analysis, Utilities menu.
The Import and Export program will allow you to read files created by other software or write files that can be read by other software. Several formats are supported: Access, Excel, all prior versions of StatPac, all prior versions of StatPac Gold, comma delimited, tab delimited, multiple record files, Internet response files, and plain-text e-mail.
The Merge program is used to merge variables from different studies and data files or to rearrange the sequence of variables in a file. It can merge data from up to five individual data files. It can also be used to concatenate (join) data files using the same codebook.
The Aggregate program is used to create a true or compositional aggregate study and data file. Aggregate files are useful for summarizing subgroups of data.
The Codebook program is used to quickly create a codebook, or to check a codebook and data file for errors. The Check program is used when you suspect that there is a problem in the codebook or data file. If a specific procedure won't run, this program can sometimes provide a solution. A common use of this program is when you are planning to use a data file created from a source other than StatPac, and you want to make sure that your study design matches the data file.
The Sampling program is used to generate a random number table, create a random digit dialing table for telephone studies, and to select a random sample from a data file.
The Compare Data Files program is used to compare two data files for differences. It is used to check the accuracy of data entry when a double entry system has been used.
The Statistics Calculator is used to calculate distributions, probabilities and other statistics from proportions and summary data.
StatPac can import and export information to other software. Select Utilities, and then select import or export. Import means you want to convert a non-StatPac file into StatPac for Windows format. Export means you want to convert a StatPac for Windows file to a different format. When importing or exporting data, the original file(s) will remain intact and a new file(s) will be created.
When importing, select the type of file and name of the file to be imported. If you are importing from previous versions of StatPac, it will be assumed that the codebook/study name and the data file name are the same. The names for the StatPac for Windows codebook and data may be different.
Each of the import and export file formats is explained below.
Prior versions of StatPac and StatPac Gold used "codebook" or "study design" files to store variable format and label information. StatPac for Windows stores all this information in a codebook file. Because older versions of StatPac have limited labeling space, some labels may be truncated when exporting to an older version of StatPac.
The import program assumes that the data file name is the same as the study file name for the previous version. Thus, the "Name of the File to Import" will indirectly also specify the data file name. For example, if the file to import is SURVEY.EZ0, the program will try to import a data file called SURVEY.DAT from the same folder. If the data file does not exist, only the codebook will be imported. StatPac for Windows stores data in the same format as prior versions of StatPac (fixed format sequential ASCII). Therefore, if there is not a matching data file name, you can simply copy the old data file to the folder where you imported the codebook and use it without modification.
StatPac can import or export to a variety of common data base/worksheet formats. The appropriate extension will be used when you select the type of data base to be imported or exported. When importing to StatPac, the import procedure will create a new StatPac codebook and data file. Do not create a StatPac codebook before doing the import or it will be overwritten by the import procedure.
When importing from Lotus to StatPac, the default variable names are the worksheet column letters (e.g. Column A, Column B, etc.). If your worksheet contains locked column headings, they will be used as variable names for those columns. The column headings may be locked in Lotus by using the /Worksheet Titles Horizontal command. If the column titles are not locked, they will be written as the first record in the StatPac data file (an undesirable situation). Also, be sure your worksheet does not contain any empty rows or dividing lines between the titles and the first row of data.
StatPac can import and export comma and tab delimited files. There are many software packages that can interchange data in this format. A comma delimited file is a sequential ASCII file where the variables are separated from each other by commas (rather than each variable using a fixed number of characters). In a tab delimited file, the separator is a tab character. Tab delimited imports and exports are generally more reliable than comma delimited files.
When importing, StatPac creates a new codebook based on the field widths required to hold the data. If a codebook already exists, you may use it instead of creating a new codebook. If a data file already exists, you'll be offered the option of appending to the existing data file or deleting it. Selecting append will add the newly imported data to the end of the existing data file.
Many software packages write quotation marks around alpha fields, while others do not. When importing a comma delimited file, all quotation marks will automatically be eliminated, since StatPac does not use quotes. When exporting to a comma delimited file, any field containing a comma will always be enclosed in quotes. The StatPac.ini file contains a setting QuoteAlphaFields. When QuoteAlphaFields=1, all alpha fields will be quoted when exporting to comma delimited. When set to zero, only fields containing a comma will be enclosed in quotes.
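As a rough illustration of the quoting rule described above, the following Python sketch writes one comma delimited record. The function name export_record and its quote_alpha_fields argument are hypothetical stand-ins for the behavior controlled by the QuoteAlphaFields setting; this is not StatPac's own code, and embedded quotation marks are not handled.

```python
def export_record(fields, quote_alpha_fields=False):
    """Join (value, is_alpha) pairs into one comma delimited record.

    Assumption: this mirrors the QuoteAlphaFields rule described in the text.
    """
    out = []
    for value, is_alpha in fields:
        text = str(value)
        # A field containing a comma is always quoted; other alpha fields
        # are quoted only when the (hypothetical) flag is switched on.
        if "," in text or (quote_alpha_fields and is_alpha):
            text = '"' + text + '"'
        out.append(text)
    return ",".join(out)


record = [("Smith, John", True), ("Male", True), (42, False)]
print(export_record(record, quote_alpha_fields=False))
print(export_record(record, quote_alpha_fields=True))
```

With the flag off, only the name field (which contains a comma) is quoted. With the flag on, both alpha fields are quoted and the numeric field is left bare.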
Many software packages can read or write a header record in comma and tab delimited files. A header record is usually the first record in the data file. It contains the names of the variables instead of actual data. If you are importing a comma or tab delimited file and don't know if there is a header record, load the file into your word processor and look at it. If the first line in the file is the names of the variables, there is a header record. If the first record looks just like the other records, it's data, and not a header record.
When exporting to a tab or comma delimited file, StatPac will give the option to convert the raw data to the value labels. For example, if the first variable were gender and coded 1=Male and 2=Female, a normal export would write a 1 or 2 for the variable. If the Expand To Text option is selected, it would write Male or Female to each data record instead of the raw data.
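The effect of the Expand To Text option can be pictured with a short Python sketch. The function expand_to_text and the dictionary layout for value labels are assumptions made for illustration only; they do not describe how StatPac stores its codebook.

```python
def expand_to_text(record, value_labels):
    """Replace coded values with their value labels where a label exists."""
    expanded = []
    for position, raw in enumerate(record):
        labels = value_labels.get(position, {})
        # Values without a label (e.g. an age) are written unchanged.
        expanded.append(labels.get(raw, raw))
    return expanded


# Value labels for the first variable (position 0), as in the gender example.
value_labels = {0: {"1": "Male", "2": "Female"}}
print(expand_to_text(["1", "34"], value_labels))
print(expand_to_text(["2", "51"], value_labels))
```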
When exporting to a tab or comma delimited file (or Excel), with the intention of being able to import the file into SPSS, StatPac will give the option to also create an SPSS syntax file. This file is a text file with the same name as the exported file except it will have a .sps extension. In SPSS, first import the data. Then load the .sps file into Notepad or another text editor. Copy the contents of the file into the SPSS syntax editor and click play. This will create the variable and value labels in SPSS. When using this feature, StatPac may modify (abbreviate) the variable names, labels and value labels in order to fit the limited space offered by SPSS.
The tab delimited import utility can be used to import a text file for Verbatim Blaster open-ended response coding. For example, you might have used Microsoft Word to enter verbatim comments into a .txt file. Each person's comments were entered as a paragraph (i.e., a continuous string of text ending with a carriage return). This file can be imported as a tab delimited file. Since there are actually no tabs in the file, StatPac will correctly import the text into a codebook and data file containing a single variable. The variable will be an alpha type and will be as long as necessary to hold the open-ended comments.
Exported tab delimited files may use a .txt or .tsv (tab separated variables) extension. Exported comma delimited files may use a .txt or .csv (comma separated variables) extension.
When exporting to a tab or comma delimited file, keep in mind that many programs (Excel, Access, etc.) limit the number of columns to 255 (while StatPac can have as many as 2,000). If your codebook has more than 255 variables, an export to an Access file is preferred because it will split the data into multiple tables as necessary. Otherwise, you’ll have to use the Write command to create a series of codebooks and data files (each containing 255 or fewer variables) and then export each one individually to a delimited file.
Many researchers want to use data that is in card-image format on a mainframe computer. Also, many data entry services are capable of only punching data in card-image format. While it is relatively easy to download data from a mainframe, it often comes in 80-column format. If there is only one record per case, this data can be read by StatPac without performing an import. However, when there is more than one "card image" (i.e., record) per case, it becomes necessary to concatenate (join) the "card-image" records together to produce a StatPac readable file.
Importing a multiple record file that looks like this...
Card 1 Case 1
Card 2 Case 1
Card 3 Case 1
Card 1 Case 2
Card 2 Case 2
Card 3 Case 2
will become a StatPac file that looks like this...
Card 1 Case 1 Card 2 Case 1 Card 3 Case 1
Card 1 Case 2 Card 2 Case 2 Card 3 Case 2
StatPac requires that a data record be a continuous stream of characters terminated with a carriage return and linefeed. This program will read a file in multiple record format and create a new data file with one record per case. The filename should have a .txt extension.
StatPac assumes that there are 80 characters in each record of the multiple record file. If the "card-image" record length is less than 80, StatPac will pad the records with spaces before combining them.
You will need to specify how many records there are for each case. If the downloaded data file has 3 records per case, you will answer 3 (even if the third "card-image" record is only partially used).
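The joining of card images described above can be sketched in Python as follows. This is a minimal illustration, not StatPac's import routine: the function name join_card_images is hypothetical, and raising an error when the record count is not a multiple of the records per case is an assumption about how such input might be handled.

```python
def join_card_images(lines, records_per_case, width=80):
    """Combine consecutive card-image records into one record per case."""
    if len(lines) % records_per_case != 0:
        # Assumption: treat an incomplete final case as an error.
        raise ValueError("record count is not a multiple of records_per_case")
    cases = []
    for start in range(0, len(lines), records_per_case):
        cards = lines[start:start + records_per_case]
        # Pad short records with spaces so every card fills `width` columns.
        cases.append("".join(card.rstrip("\r\n").ljust(width) for card in cards))
    return cases


cards = ["A1", "B1", "C1", "A2", "B2", "C2"]
joined = join_card_images(cards, records_per_case=3)
print(len(joined), len(joined[0]))
```

Because every card is padded to 80 columns, a variable on the second card always starts at column 81 of the joined record, regardless of how much of the first card was used.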
The preferred method of performing Internet surveys is to store the responses in a file on the server. When using this method, responses are stored in ASCII (.asc) text format. When you're ready to perform an analysis, download the file to your local computer using an FTP program or Auto Transfer. If you use a different FTP program, be sure to set it to download the file as an ASCII (not binary) file.
If you use Auto Transfer, the downloaded file will automatically be imported into StatPac. If you manually download the file, you will need to use this import utility to convert the .asc file to StatPac data.
Downloaded Internet response files are not automatically deleted from your server. Therefore, each time you download the responses, it will be the entire set of responses since the beginning of the survey. StatPac will offer you the choice of deleting the existing data file or appending to the end of it. Since the downloaded file is usually the entire data set, you would normally want to replace the existing data file with the newly downloaded data.
Because of the variety of Email programs, it is not possible to describe the exact steps you must take to import a returned Email survey. Each Email program operates a little differently, and you will need to experiment with your program.
StatPac provides import capabilities for CGI and plain text Email surveys. CGI Email would be produced by a survey placed on a web site that used StatPac's email method of capturing responses. A plain-text survey would be produced by a survey that was simply part of the text in the body of an Email.
Select e-mail as the import type and use the browse button to select the file to be imported. Usually, this would be a .mbx file (i.e., the mailbox in Outlook Express or Eudora to which the e-mails were filtered). Use the browse button to select the existing StatPac codebook and specify the name of the data file. If the data file does not exist, it will be created by the import procedure. If it does exist, the new data will be appended to the end of the existing data file. Finally, select Text as the Email type and click OK. StatPac will advise you if any errors were encountered during the import. If so, Notepad will appear on the menu bar. Click Notepad on your menu bar to see a description of the errors.
Setting Defaults for the Email Import
An e-mail consists of two parts. The first part is the e-mail header and the second part is the contents (or body) of the e-mail. The header contains many lines that are often hidden by e-mail readers, but can be seen by loading an e-mail into the notepad. StatPac must be able to properly identify where the header starts and stops in order to know where the e-mail body begins. The settings in the StatPac.ini file may be adjusted to be compatible with your e-mail reader or language.
The StartEmailHeader parameter should be set to the text that begins the header section, and the EndEmailHeader parameter should be set to the last Email header line. If you are manually copying and pasting incoming e-mails to a mailbox file, it may be important to change these settings. The default values for these parameters are:
StartEmailHeader = Return-Path:
EndEmailHeader = X-UIDL:
Other e-mail parameters may also be set in the StatPac.ini file. StartEmailField and EndEmailField can be used to change the brackets from [ and ] to other characters. The EmailPrefix parameter tells StatPac what line contains the name/e-mail address of the respondent. The EmailVarName is the name of the StatPac codebook variable that will automatically capture the respondent's e-mail address in a plain-text e-mail, and the EmailDateField parameter is used to get the date of the e-mail in order to more precisely report which e-mails contained errors. By modifying these parameters, StatPac can be made to work with any e-mail reader or language.
There are two basic ways that data files can be merged. The first is called concatenation, and it is used to merge two or more data files that contain the same variables in the same order. The second type of merge lets you join data containing different variables. Select Utilities, Merge, and then the type of merge you want to perform.
Many times, several data entry operators will simultaneously enter data into a data file on their own machines. When all the data files have been entered, they can be merged into one large file by concatenating (joining) the data files.
For example, let's say you have three months of data in three separate files (JAN.DAT, FEB.DAT and MAR.DAT). The following DOS command would create a new file called QUARTER1.DAT which contained all three months of data. You could then run your analysis on all the data for the first quarter.
The concatenation-style merge assumes that the codebook(s) for all the data files are exactly the same. The Merge program will let you concatenate any number of data files into a new (larger) data file. You can type the data file names or use the browse button to select data files. Only one data file name should appear per line.
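For readers who want to see what concatenation amounts to, here is a minimal Python sketch, assuming every input file uses the same codebook. The function name concatenate_data_files is hypothetical, and the demonstration files are created in a temporary folder purely for illustration; the Merge program itself remains the supported way to do this.

```python
import tempfile
from pathlib import Path


def concatenate_data_files(input_paths, output_path):
    """Write the records of each input file, in order, to one output file."""
    with open(output_path, "w", newline="") as out:
        for path in input_paths:
            with open(path, newline="") as src:
                for record in src:
                    # Keep records as-is; terminate a final unterminated record.
                    if not record.endswith("\n"):
                        record += "\r\n"
                    out.write(record)


# Demonstration with three small monthly files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    months = {"JAN.DAT": "101\r\n102\r\n", "FEB.DAT": "201\r\n", "MAR.DAT": "301"}
    for name, content in months.items():
        Path(folder, name).write_bytes(content.encode("ascii"))
    inputs = [Path(folder, name) for name in months]
    target = Path(folder, "QUARTER1.DAT")
    concatenate_data_files(inputs, target)
    quarter = target.read_bytes().decode("ascii")

print(quarter.split())
```

The sketch adds a carriage return and linefeed to a last record that lacks one, so that the first record of the next file does not run into it.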
Do not confuse concatenating files with the MERGE utility program. If all your data files reference identical study information (contain the same variables in the same order), use concatenation to merge your data into one file. If your data files, however, contain different variables, use the MERGE utility program.
The merge program allows you to extract selected variables from up to five studies and create an entirely new study that will be saved on disk. If data files have already been entered for any of the studies, they can also be restructured to match the new study format.
Do not confuse the function of this program with data file concatenation. If two data files have identical formats (i.e., they contain the same variables in the same order), the data files should be merged with the concatenation program.
The restructure and merge program can be used to reorganize a single study (and data file) or to combine several studies (and their associated data files). It allows complete versatility with regard to which variables are selected from each of the studies and the order of the variables.
The program will ask for the name(s) of the codebooks, data files, and common variables that will be utilized. For each specified codebook, also enter the name of the associated data file (if one exists). If no data file is specified for a particular study, the program will use blanks for all variables requested from that study.
Also select the common variable in each of the studies. This refers to a variable that can be used to match up the records from each data file (e.g., "CASE ID"). If there is not a common variable, it is imperative that the data files contain the same number of records and in the same order. That is, record one from data file one should represent the same respondent (case) as record one from data file two.
If a data record is missing in any of the data files, it could cause data from one file to be matched with the wrong data from another file. Therefore, it is always a good idea to have a common variable in each of the data files (and associated study information) that represents a unique case identification number. All data files must be sorted by this variable before running this program. If one of the data files is missing a particular record, blanks will be merged into the output file.
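The role of the common variable can be illustrated with a short Python sketch. Note that it swaps in a dictionary lookup for the sorted-file matching described above, so it is an illustration of the idea rather than of StatPac's method. The function merge_by_case_id and the layout of the records (case ID in the first position) are assumptions, and only cases missing from the second file are handled.

```python
def merge_by_case_id(file_one, file_two, fields_from_two, blank=""):
    """Join two record lists on the case ID stored in position 0."""
    lookup = {record[0]: record[1:] for record in file_two}
    merged = []
    for record in file_one:
        # A case missing from file two contributes blanks, not shifted data.
        extra = lookup.get(record[0], [blank] * fields_from_two)
        merged.append(list(record) + list(extra))
    return merged


study_one = [["001", "34"], ["002", "27"], ["003", "45"]]
study_two = [["001", "M"], ["003", "F"]]  # case 002 is missing
merged = merge_by_case_id(study_one, study_two, fields_from_two=1)
print(merged)
```

Without a case ID, record two of the second file (case 003) would have been paired with record two of the first file (case 002), which is exactly the mismatch the text warns about.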
Click OK to continue. The study numbers and names will be displayed, and the program will request the format for the new study. The format statement defines the structure of the new codebook.
The general format for creating a new file structure is:
(<Study number>) <Variables> or <Variable range>
An example of a format statement is:
(1) 1-3,8,4 (3) 2-7 (2) 9 14 (1) 12
This statement indicates that the new study format should contain variables in the following order:
From study 1 - variables 1, 2, 3, 8 and 4
From study 3 - variables 2, 3, 4, 5, 6 and 7
From study 2 - variables 9 and 14
From study 1 - variable 12
Notice that the study number is enclosed in parentheses with no spaces. Individual variables may be separated by either commas or spaces. A range of variables is specified by a dash (minus sign) with no spaces on either side of the dash. If the format statement requires more than one line, just continue typing and word-wrap will correctly break the line.
Variables may be specified in any order. The study numbers will be displayed at the top of the screen and are assigned by the computer simply for convenience when specifying the new study format. The individual variable numbers for each codebook can be determined by examining the Variable Names window.
The new study format will be checked for validity before processing begins. If errors are found, you will be asked to re-enter the format. The new study and data file (if specified) will be written.
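The way a merge format statement expands into an ordered list of variables can be sketched in Python. This is an illustrative parser written from the rules given above, not the one StatPac uses; the function name parse_format_statement is hypothetical, and the sketch performs no validity checking of study or variable numbers.

```python
import re


def parse_format_statement(statement):
    """Return a list of (study_number, variable_number) pairs in output order."""
    selections = []
    # Each group is "(n)" followed by everything up to the next "(".
    for study, body in re.findall(r"\((\d+)\)([^()]*)", statement):
        for token in re.split(r"[,\s]+", body.strip()):
            if not token:
                continue
            if "-" in token:
                # A range such as 2-7 expands to every variable in between.
                first, last = (int(part) for part in token.split("-"))
                selections.extend((int(study), v) for v in range(first, last + 1))
            else:
                selections.append((int(study), int(token)))
    return selections


layout = parse_format_statement("(1) 1-3,8,4 (3) 2-7 (2) 9 14 (1) 12")
print(layout)
```

For the example statement, the result lists variables 1, 2, 3, 8 and 4 from study 1, then variables 2 through 7 from study 3, then 9 and 14 from study 2, and finally variable 12 from study 1, for 14 variables in all.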
The aggregate utility program creates a new study and data file that consist of aggregate statistics for subgroups of the original data. Any descriptive statistic may be included in the aggregate files. The program allows the creation of both compositional and true aggregate files.
For example, let's say we've distributed a questionnaire to 200 people in each of 50 communities. After performing some preliminary analyses, we want to compare the communities on a number of the interval or ratio-type questions. We could, of course, use the IF-THEN SELECT and WRITE commands to create subfiles for each of the communities and then perform descriptive statistics analyses on each of the subfiles. Obviously, this would be a very time consuming procedure. The aggregate utility program provides a much more efficient way to derive this information.
By using the aggregate program, we could create a new codebook and data file that just contain the descriptive statistics we desire. Each record in the new aggregate file would represent one community. The record would contain the descriptive statistics for the community as a whole (and not the raw data from the original file). Since there are 50 communities, the aggregate file would contain 50 records. This type of aggregate file is called a true aggregate file. It is made up of just the aggregate statistics and does not contain the original data collected. After creating a true aggregate file, the LIST command could be used to print a summary of the descriptive statistics for the communities.
The other type of aggregate file is referred to as a compositional file. Using the same example as above, let's say we want to compare each case in our original file to the descriptive statistic for the community. For example, we might want to compare the individual's age with the mean age in that person's community. In other words, we want each record in the aggregate data file to contain both the original raw data and the descriptive statistic for the community as a whole. The compositional aggregate file will contain the same number of records as the original raw data file. However, the aggregate file will contain more variables (the original variables plus the aggregate statistics).
When creating either a true or compositional aggregate file, a new study information file will also be automatically created to match the new aggregate data file.
Before running the aggregate program, the data file must be sorted by the variable that contains the group code. For example, if you plan to create an aggregate file by community, the data file must be sorted by community before running the aggregate utility program. The sort order is not important; however, it is important that all cases from the same community fall together in the file. The aggregate program will accommodate a minimum of 1000 individual groups.
To sort the file, you might use the following procedure:
SORT (A) COMMUNITY
Then run the Aggregate program. It will ask for the codebook name, data file and the variable containing the group code. This refers to the codebook and data file that already exist (not the new aggregate files). The variable containing the group code is the same variable that was used to sort the data file before running this program. In this example, it is the "community" variable. You must also select the type of aggregate file to be created, either compositional or true.
Click OK to continue. Now you can select the variable(s) for which you want to calculate aggregate statistics. Select the desired variable. Then click on the statistics you want for that variable. Each time you click on a statistic, an aggregate statement will be created in the Aggregate Statement window. Each aggregate statement will create one new aggregate variable.
When performing a compositional aggregate procedure, the new aggregate variables will be added to the end of each data record. If the study and data file contain 10 variables, and you type two aggregate statements, the new aggregate variables would be added as variables 11 and 12.
When performing a true aggregate procedure, the first variable in the aggregate file will always be the group code (that is, the variable used to determine the groups). Each aggregate statement will produce a statistic that is added as the next variable in the file. The first aggregate statement would create variable two, the next variable three, and so forth.
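The difference between the two aggregate file layouts can be seen in a small Python sketch. It computes only the mean of one variable, and the function name aggregate_mean and the list-based records are assumptions for illustration. Like the Aggregate program, it relies on the input being sorted so that each group's cases are adjacent.

```python
from itertools import groupby
from statistics import mean


def aggregate_mean(records, group_index, value_index, compositional=False):
    """Compute the mean of one variable per group.

    `records` must already be sorted so each group's cases fall together.
    """
    output = []
    for code, cases in groupby(records, key=lambda r: r[group_index]):
        cases = list(cases)
        group_mean = mean(case[value_index] for case in cases)
        if compositional:
            # Original data plus the group statistic appended as a new variable.
            output.extend(list(case) + [group_mean] for case in cases)
        else:
            # True aggregate: group code first, then the statistic.
            output.append([code, group_mean])
    return output


data = [["A", 30], ["A", 40], ["B", 50], ["B", 60], ["B", 70]]
true_file = aggregate_mean(data, 0, 1)
compositional_file = aggregate_mean(data, 0, 1, compositional=True)
print(true_file)
print(compositional_file)
```

The true aggregate result has one record per group, while the compositional result keeps all five original records and adds the group mean as an extra variable at the end of each.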
Aggregate statistics can only be calculated for numeric-type variables. There is one exception to this rule: If the variable used to split the data file into groups is alpha, you may still calculate the number of valid cases. In our example, if community were coded alpha, it would be acceptable to ask for the number of valid cases (statistic 17) for this variable.
Each aggregate statement you enter will create a new variable in the aggregate file. After entering all the aggregate statements, click OK. A new codebook will be created. The new variable labels in this study will include both the original labels and the types of statistics. After the new study has been created, the program will perform all the aggregate calculations and write the new data file.
Because many calculations are involved in creating an aggregate file, the program may take some time to finish. It will display a message informing you of successful completion.
If any statistic cannot be calculated, or if there are an insufficient number of columns to hold the aggregate statistic, the output file will contain spaces for that variable. For example, if you requested the mode, and the group was multi-modal, the aggregate statistic would be stored as blanks.
There are two utility programs for codebooks. The Quick Codebook Creation utility creates an entire codebook using a single FORTRAN-like statement. The Check Codebook & Data utility is used to verify the integrity of the codebook and to fix errors in the file.
The fastest way to create a codebook is to use the Quick Codebook Creation program. However, this will create a "barebones" codebook consisting of only the format for each variable. In most cases, you’ll want to use the Grid or Variable Detail window to create a new codebook.
Select Analysis, Utilities, Codebook, Quick Codebook Creation. You will need to enter a file name for the new codebook and a format statement. This is essentially a data definition statement and is similar to a FORTRAN style format statement.
The Format Statement defines the number and type of variables that will be in the new study. It is the combination of all the individual variable formats. Using the format statement can save considerable time if variable and value labels are not required, or if you plan to use a fixed format data file from another source.
The syntax for each component of a format statement is:
<No. of Vars.> <Var. Type> <No. of Cols> . <Decimals>
<No. of Vars.> is the number of consecutive variables that use the format defined by the next three parameters. If this component of the format statement is omitted, the default is one.
<Var. Type> is always A or N and refers to whether the variable(s) are alpha or numeric. StatPac automatically left justifies alpha variables and right justifies numeric variables.
<No. of Cols> is the field width allocated for the variable(s). This is the total field width for the variable(s) and it must be large enough to hold a plus or minus sign and a decimal point if necessary.
. <Decimals> is the number of significant decimal places that the variable(s) will contain. This component of the format statement is optional and may be omitted. If <decimals> is not specified, the data will be stored exactly as entered (with or without a decimal point).
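To make the four components concrete, the following Python sketch splits a single component into its parts. It is written only from the syntax line above. The sample strings (5N3, A20, N6.2) are constructed for illustration and are not taken from StatPac's documentation, and how StatPac treats spacing or separates several components in one statement is not covered here.

```python
import re

# Optional count, then A or N, then the width, then an optional ".decimals".
COMPONENT = re.compile(r"^(\d*)\s*([AN])\s*(\d+)(?:\s*\.\s*(\d+))?$", re.IGNORECASE)


def parse_component(text):
    """Split one format component into its four parts."""
    match = COMPONENT.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid format component: {text!r}")
    count, var_type, width, decimals = match.groups()
    return {
        "count": int(count) if count else 1,  # default is one variable
        "type": var_type.upper(),             # A = alpha, N = numeric
        "width": int(width),                  # total field width
        "decimals": int(decimals) if decimals is not None else None,
    }


for sample in ("5N3", "A20", "N6.2"):
    print(sample, parse_component(sample))
```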
Examples of Format Statements
This utility program will verify the integrity of a codebook and data file. If errors are found the program will attempt to fix them. If you have created a codebook to match a foreign data file (one created by a program other than StatPac), use this program to make sure that the data record lengths match the codebook you created.
Select the codebook and data file to be checked and click OK. If the program corrects any errors, they will be listed in the notepad.
The Sampling program is used to generate a random number table, create a random digit dialing table for telephone studies, and to select a random sample from a data file.
When planning to conduct a survey, choosing the sample is just as important as the survey itself. If the sample is incorrectly chosen, any results are likely to be distorted. That is, the characteristics of the sample will not represent the characteristics of the population.
One of the best ways to choose a sample is to use a random sampling technique. A sample that is randomly chosen from the population tends to represent that population. That is, characteristics of the sample are likely to be found in similar proportions in the population.
The classical method of selecting the sample is to give each case in the population a number and then randomly select numbers until the sample size is achieved. The second function of this program is to print a random number table.
You should first select whether the numbers should be selected with or without replacement. When replacement is used, a number may be selected more than once (selection does not eliminate it from being available for future selection). When random numbers are selected without replacement, the selection of a number eliminates it from the pool of available numbers. The algorithm used for selection without replacement will display the random numbers in sequential order.
Enter the number of random numbers you want to be printed. This relates to the sample size determined with the Statistics Calculator. Be sure to add a sufficient number to the ideal sample size to accommodate a pilot test and replacement of nonresponders (if part of your study design).
Enter the smallest allowable random number and the largest allowable random number. Typically, the lowest value would be one and the highest value would be equal to the number of cases in the population.
Enter the name of the StatPac codebook and data file to store the random numbers and click OK. A StatPac codebook and data file will be created that contains one variable called "RANDOM". Finally, the random numbers will be displayed in a compressed format in the Notepad. You do not need to save them with Notepad since they are already stored in a StatPac data file.
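The two selection modes can be sketched in Python using the standard random module. This is not StatPac's algorithm; the function name random_number_table and the seed argument (included so a run can be repeated) are assumptions for illustration.

```python
import random


def random_number_table(count, low, high, replacement, seed=None):
    """Draw `count` random integers between low and high (inclusive)."""
    rng = random.Random(seed)
    if replacement:
        # With replacement: the same number may be drawn more than once.
        return [rng.randint(low, high) for _ in range(count)]
    if count > high - low + 1:
        raise ValueError("cannot draw more unique numbers than the range holds")
    # Without replacement: unique numbers, listed in sequential order.
    return sorted(rng.sample(range(low, high + 1), count))


unique_table = random_number_table(10, 1, 50, replacement=False, seed=1)
repeat_table = random_number_table(10, 1, 50, replacement=True, seed=1)
print(unique_table)
print(repeat_table)
```

Without replacement, the request fails if more numbers are asked for than the range contains, which corresponds to asking for a sample larger than the population.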
Telephone surveys sometimes use random digit dialing to secure the sample. While this method will result in many non-working or non-voice numbers, it will produce a random sample of people who have telephones. Since local prefix codes are set (i.e., predefined by the phone company), only the last four digits of a phone number can be randomly selected. The random number method of creating a telephone file allows you to specify a series of local prefix codes and the number of random telephone numbers you want created for each prefix code.
There is an important consideration to keep in mind when creating a random digit file. Many of the random numbers will not be useful. For example, a number may be non-existent, a business office, or a fax or computer line. There are several algorithms for maximizing the number of home phone numbers; however, these techniques have generally produced poor results and are not included in StatPac. Therefore, it is usually a good idea to select more phone numbers than you actually need.
The random number utility program allows you to specify any number of prefixes and to specify how many numbers you want from each prefix. For local surveys, the prefix will be three digits (the local exchange); for long distance surveys, the prefix will be seven digits (i.e., 1 + three digits for the area code + three digits for the local exchange).
In the Local Exchange examples on the screen display, 50 numbers would be created with a 929 prefix and 35 numbers would be created with a 987 prefix. For the Long Distance examples, 25 numbers would be created that begin with 1-612-925 and 50 numbers would be created that begin with 1-807-927.
After you have finished typing the prefixes and quantities, click OK to create the phone number file. A StatPac codebook and data file will be created that contains one variable "TELEPHONE_NUMBER". Finally, the random numbers will be displayed in a compressed format in the Notepad. You do not need to save them with Notepad since they are already stored in a StatPac data file.
The actual technique used to create the file is called random number selection without replacement. This means that as a phone number is selected, it will be eliminated from the pool of available numbers for the next selection. This eliminates the possibility of selecting the same number (with the same prefix) twice.
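The idea of selection without replacement can be sketched in a few lines of Python. This is only an illustration, not StatPac's own code: the function name, the dash separator, and the seed argument are assumptions made for the example. The prefixes and quantities are taken from the examples above.

```python
import random


def random_phone_numbers(prefix_counts, seed=None):
    """Return random phone numbers per prefix, with no repeats within a prefix.

    prefix_counts maps a prefix (e.g. "929" or "1-612-925") to the quantity
    wanted. The function name and output layout are hypothetical.
    """
    rng = random.Random(seed)
    numbers = []
    for prefix, count in prefix_counts.items():
        # Only 10,000 four-digit suffixes (0000-9999) exist for each prefix.
        if count > 10_000:
            raise ValueError(f"prefix {prefix}: at most 10000 numbers exist")
        # random.sample draws without replacement, so a suffix cannot repeat.
        for suffix in rng.sample(range(10_000), count):
            numbers.append(f"{prefix}-{suffix:04d}")
    return numbers


sample = random_phone_numbers({"929": 50, "987": 35, "1-612-925": 25}, seed=1)
print(len(sample), "numbers created")
```

Because each prefix has its own pool of 10,000 suffixes, a request for more than 10,000 numbers from a single prefix cannot be satisfied, which is why the sketch rejects it.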
Depending on the number of prefixes and the quantities from each prefix, the actual creation of the file may take a little while. Please be patient; the program will inform you when the sample selection has been completed.
With this utility, you can select a specified number of random records from a data file and write them to a new data file. If you have a very large database and a long procedure file, you might use this utility to create a shorter data file, and perform a test run of the procedure file on it.
Enter the name of the existing data file, the name of the new data file, and the number of records to be selected and written to the new data file.
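As a rough illustration of what selecting random records involves, the following Python sketch draws a fixed number of records from a list. It is not StatPac's implementation. The function name is hypothetical, the records are held in memory rather than read from a file, and keeping the selected records in their original order is an assumption.

```python
import random


def select_random_records(records, how_many, seed=None):
    """Return how_many randomly chosen records, kept in their original order."""
    if how_many > len(records):
        raise ValueError("cannot select more records than the file contains")
    rng = random.Random(seed)
    # Choose record positions without replacement, then restore file order.
    positions = sorted(rng.sample(range(len(records)), how_many))
    return [records[i] for i in positions]


# Hypothetical data file of 1,000 records, reduced to a 50-record test file.
all_records = [f"record {n}" for n in range(1, 1001)]
test_file = select_random_records(all_records, 50, seed=7)
print(len(test_file), "records selected")
```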
Many data entry operators use a double entry method of data verification. Data is entered into one data file and the same data is re-entered into another data file. The two data files are then compared for differences.
The purpose of this utility program is to identify possible errors in the data; it does not have any editing features.
Enter the name of the StatPac codebook and the names of the two data files to be compared. The data files should contain the same number of records in the same order.
Upon completion, the total number of errors will be reported. If differences are found, the record numbers and the variables that differ will be shown in the Notepad. Use the Notepad to print the error listing.
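The comparison itself is a record-by-record, variable-by-variable check. The Python sketch below shows the idea under simple assumptions: each record is already split into a list of values, both files have the same number of records in the same order, and the function name, variable names, and sample data are hypothetical.

```python
def compare_records(first_file, second_file, variable_names):
    """Return (record_number, [names of differing variables]) per mismatch."""
    differences = []
    pairs = zip(first_file, second_file)
    for record_number, (rec_a, rec_b) in enumerate(pairs, start=1):
        changed = [
            name
            for name, a, b in zip(variable_names, rec_a, rec_b)
            if a != b
        ]
        if changed:
            differences.append((record_number, changed))
    return differences


# Hypothetical double-entry data: the second operator mistyped two values.
names = ["ID", "AGE", "SEX"]
entry_1 = [["1", "25", "M"], ["2", "31", "F"], ["3", "47", "M"]]
entry_2 = [["1", "25", "M"], ["2", "13", "F"], ["3", "47", "F"]]

report = compare_records(entry_1, entry_2, names)
for record_number, changed in report:
    print(f"Record {record_number}: {', '.join(changed)}")
```

Like the utility, the sketch only reports differences. Deciding which of the two entries is correct still requires checking the original source document.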
StatPac supports only two data types, alpha and numeric. This can make it difficult to work with dates and currency variables. These utilities simplify the task of working with date and currency variables.
The conversion utilities read an existing codebook and data file, and create a new codebook and data file with a new converted variable(s). The original date or currency variable is not modified and will remain “as is” in the codebook and data. Instead, a new variable (the converted field) is created and added to the end of the codebook and data.
The conversion utilities also offer a way to change dichotomous multiple response variables to the multiple response format required by StatPac.
The most common functions with dates are sorting and selecting. Typically, a user would create an alpha variable for a date variable because it contains non-numeric characters such as slashes or dashes. Regardless of the format, sorting by date or selecting the records between two dates can be difficult unless the date can be readily converted to a numeric eight-column (N8) variable in the format YYYYMMDD.
The first function will take one or more date variables in any format and create new N8 variable(s) in YYYYMMDD format. The new N8 variable(s) can be used with the Sort command to sort a file by date. It can also be used with the Select command to select a range of dates.
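To show why the YYYYMMDD form is convenient, here is a minimal Python sketch of the conversion. It is not the StatPac utility: the function name is hypothetical, and the three input formats it tries are assumptions chosen for the example.

```python
from datetime import datetime

# Input formats tried in order; this list is an assumption for the example.
ASSUMED_FORMATS = ("%m/%d/%Y", "%m-%d-%Y", "%Y-%m-%d")


def to_yyyymmdd(text):
    """Convert a date string to an eight-digit integer in YYYYMMDD form."""
    for fmt in ASSUMED_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # Not this format; try the next one.
        return int(parsed.strftime("%Y%m%d"))
    raise ValueError(f"unrecognized date: {text!r}")


dates = ["10/05/2005", "12/31/2004", "01/15/2005"]

# Sorting on the numeric form puts the dates in calendar order.
in_order = sorted(dates, key=to_yyyymmdd)

# Selecting a range is a plain numeric comparison.
in_2005 = [d for d in dates if 20050101 <= to_yyyymmdd(d) <= 20051231]
print(in_order, in_2005)
```

Sorting the original text would place 01/15/2005 ahead of 12/31/2004, because text is compared character by character. The numeric form avoids this because the year is the most significant part of the number.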
The second function will calculate the number of days between two dates. The two dates can be any date format and the new variable (number of days) will be an N5 format. The absolute value of the difference between the two dates will be calculated and added to the end of the new codebook and data file.
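The calculation can be sketched in Python as follows. The function name is hypothetical, and the dates are assumed to have been parsed already. As in the description above, the absolute value is taken, so the order of the two dates does not matter.

```python
from datetime import date


def days_between(first, second):
    """Return the absolute number of days between two dates."""
    return abs((second - first).days)


elapsed = days_between(date(2005, 10, 5), date(2005, 9, 25))
print(elapsed)
```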
The third function will create an English text version of a date in “D Mon, YYYY” format (e.g., 5 Oct 2005). The purpose is to make it possible for the user to subsequently use the List command to create an easily readable listing of the data.
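A sketch of the text conversion in Python is shown below. The function name is hypothetical, and the output follows the example given above (5 Oct 2005). The month abbreviations are written out explicitly so that the result does not depend on the computer's regional settings.

```python
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def english_date(value):
    """Return a date as day, abbreviated month, and four-digit year."""
    return f"{value.day} {MONTHS[value.month - 1]} {value.year}"


readable = english_date(date(2005, 10, 5))
print(readable)
```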
The currency conversion utility is useful for adding or removing the $ or £ symbols, interpreting a K or M suffix, and removing commas from currency fields.
When conducting internet surveys (where the respondent is entering their own response) currency fields can create problems. You can require numeric input but that is often frustrating for respondents who want to enter something like 50K or 10M or $25,000. If you believe respondents will want to enter anything other than a number, you can specify the field as alpha in the codebook (which will accept any input from the respondent). After the survey is closed, use this utility to convert the data to a numeric field.
The CurrencySymbol setting in the defaults (StatPac.ini) file can be set to your country’s currency symbol. When converting the alpha field to a number, commas will be removed, the letter K will multiply the value by a thousand, and the letter M will multiply the value by a million.
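The conversion rules can be illustrated with a short Python sketch. This is not the StatPac utility itself: the function name and its symbol parameter are hypothetical stand-ins for the CurrencySymbol setting, and accepting a lowercase k or m is an assumption.

```python
def currency_to_number(text, symbol="$"):
    """Convert a currency entry such as '$25,000' or '50K' to a number."""
    # Strip the currency symbol and commas; accept k/m in either case.
    cleaned = text.strip().replace(symbol, "").replace(",", "").upper()
    multiplier = 1
    if cleaned.endswith("K"):
        multiplier = 1_000
        cleaned = cleaned[:-1]
    elif cleaned.endswith("M"):
        multiplier = 1_000_000
        cleaned = cleaned[:-1]
    return float(cleaned) * multiplier


# The respondent entries mentioned above.
converted = [currency_to_number(entry) for entry in ("50K", "10M", "$25,000")]
print(converted)
```

An entry that still contains non-numeric text after these rules are applied raises an error in this sketch, which makes such responses easy to spot and review by hand.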
The dichotomous multiple response conversion utility is useful when you have imported data from an external source that coded multiple response variables in a dichotomous format.
For example, data in the external file might be coded as ones and blanks, where a one means the respondent selected the attribute and blank means they didn't.
Assume the question was "What are your favorite colors?" Imported data might look like this:
After importing, you could write a procedure to convert the data to the multiple response format used by StatPac. It would look like this:
Labels V1=What are your favorite colors?
Labels V2=What are your favorite colors?
Labels V3=What are your favorite colors?
Labels V4=What are your favorite colors?
Labels V1-V4 (1=Orange)(2=Blue)(3=Yellow)(4=Red)
Recode V2 (1=2)
Recode V3 (1=3)
Recode V4 (1=4)
This will work fine although it is cumbersome. When there are ten or more variables in the multiple response group, it becomes more difficult because the imported variables are likely coded as N1, while the StatPac variables need to be coded as N2.
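The effect of the Recode commands above can be pictured with a short Python sketch. It is only an illustration with hypothetical data and a hypothetical function name: each selected item (a one) is replaced by the position of its variable in the group, and blanks are left alone.

```python
def recode_by_position(record, code="1"):
    """Replace each selected item with the position of its variable."""
    return [
        str(position) if value == code else value
        for position, value in enumerate(record, start=1)
    ]


# Hypothetical respondent who selected Orange (V1) and Yellow (V3).
before = ["1", "", "1", ""]
after = recode_by_position(before)
print(after)
```

With ten or more variables the positions become two-digit codes, which is why the N1 width mentioned above is no longer sufficient.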
Two methods are incorporated into StatPac to deal with imported data that use dichotomous multiple response.
The first method is in the frequencies program itself. The MX=Code option can be added to the frequencies procedure to tell StatPac that the variables are dichotomous. Here, "Code" is the single-character value that indicates the item is selected. In the above example, the data was coded as ones and blanks, so MX would be set to 1. If the data had been coded as Y and N, then MX would be set to Y.
Options MR=Y MX=1
Using this method does not actually change the data file. StatPac just reads the data differently for the frequencies procedure. An exclamation mark cannot be used to permanently set the MX option. It must be explicitly specified in each procedure where you want to use it.
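Conceptually, reading the data as dichotomous means counting, for each variable in the group, how many records contain the selection code. The Python sketch below illustrates that idea only. It is not how StatPac is implemented, and the function name and sample data are hypothetical. The color labels come from the example above.

```python
def dichotomous_frequencies(records, labels, code="1"):
    """Count how many records contain the selection code for each item."""
    counts = {label: 0 for label in labels}
    for record in records:
        for label, value in zip(labels, record):
            if value == code:
                counts[label] += 1
    return counts


colors = ["Orange", "Blue", "Yellow", "Red"]
# Hypothetical imported records coded as ones and blanks.
data = [["1", "", "1", ""], ["", "1", "", ""], ["1", "1", "", "1"]]

counts = dichotomous_frequencies(data, colors, code="1")
print(counts)
```

The records themselves are never modified, which mirrors the point above that this method changes how the data is read rather than the data file.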
The other method is to actually convert the data file to the format used by StatPac for multiple response. If you plan to do banners or other procedures that utilize the dichotomous multiple response variables, then it is best to permanently alter the data. After conversion, the above data set would look like this:
The conversion utility lets you select several sets of variables that are dichotomous multiple response; however, they are done one at a time. The first screen lets you select the codebook and data file, and specify a name for the new (converted) codebook and data file. After selecting the codebook, the variable names will appear so they can be selected.
After selecting the variables that make up the first multiple response group, click the plus button to add them to the conversion list.
The second screen lets you set the code and labeling for the selected group of variables.
The code is the dichotomous value that indicates the item is selected.
Since the imported data doesn't have a single variable name for the group of variables, StatPac names them MR_Group_A, MR_Group_B, etc. After the conversion, the converted variables in the group will be named using the _x convention (e.g., MR_Group_A_1, MR_Group_A_2, MR_Group_A_3, and MR_Group_A_4). Thus, you might want to change the variable name to something more meaningful. For example, if you changed the name to Color, the converted variables would be named Color_1, Color_2, Color_3, and Color_4.
Similarly, the variable label might be changed to the actual question. All the converted variables will use that variable label. You could change "Multiple Response Group A: V1-V4" to "What are your favorite colors?"
After you are satisfied with the conversion labeling, click the Convert button. This will return you to the first screen where you can select an additional set of multiple response variables and click the plus button to add them to the conversion.
After you have finished selecting all the groups of multiple response variables, click OK to perform the conversion. The new codebook and new data file will then contain the multiple response variables in StatPac format.