2.1.1. Variable Selection Dialogue
When a procedure is selected from the pull-down menu, a Variable Selection Dialogue will be opened first (with the exception of a few procedures that do not require data, such as Plot of Functions, Critical Value, Cumulative Probability, Sample Size and Power Estimation). In some dialogues, there will be a section on top, offering various options for the type of data to be analysed. Under this, there will be a list box on the left displaying all available variables. There will also be one or more buttons (task buttons) displayed at the centre and a list box on the right corresponding to each button. These are used to assign specific tasks to selected variables.
For instance, in the X-Y Plots procedure, the X-Axis and Y-Axis variables will each have a separate task button. The type and number of task buttons and their corresponding list boxes are specific to each procedure. They will also differ within a procedure according to the type of data used. It is possible to highlight multiple items in any list and send them to any other list, either by clicking on task buttons or by dragging and dropping them with the right mouse button. In some procedures, Variable Selection Dialogues may contain other controls such as check boxes or text boxes.
Not all selections are compulsory in a Variable Selection Dialogue. It is possible, for instance, to plot an X-Y line diagram without selecting an X-Axis variable, or to display a summary statistics table without selecting a categorical (factor) variable. Compulsory variables are marked in bold, and the [Next] and [Finish] buttons are not enabled unless all compulsory variables are selected.
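The rule for enabling [Next] and [Finish] can be illustrated with a minimal Python sketch. This is only a model of the behaviour described above, not UNISTAT code; the function name `finish_enabled` and the task names used in the example are hypothetical.

```python
def finish_enabled(compulsory_tasks, selected):
    """Return True when every compulsory task has at least one variable assigned.

    `selected` maps a task name to the list of variables assigned to it.
    Optional tasks are simply left out of `compulsory_tasks`.
    """
    return all(selected.get(task) for task in compulsory_tasks)


# Hypothetical dialogue state: a dependent variable is compulsory but not yet chosen.
selected = {"Dependent": [], "Variable": ["C2", "C3"]}
before = finish_enabled(["Dependent", "Variable"], selected)

# After assigning a dependent variable, all compulsory tasks are satisfied.
selected["Dependent"].append("C1")
after = finish_enabled(["Dependent", "Variable"], selected)
```

In this model an optional task, such as a factor, never blocks the buttons because it is not listed as compulsory.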
2.1.1.1. Data Type Selection
Variable Selection Dialogues of some procedures will have a section on top, displaying options for the type of data to be analysed. For instance, in the t- and F-Tests procedure, it is possible to run a test between two columns in the spreadsheet by selecting them as [Variable]s. It is also possible to select one or more optional categorical variables (factors) to run the test between the subgroups defined by their categories. There is also an option to run the tests between the selected variables, but only for the rows defined by some categories. All this is possible while the first data option is selected.
When the second data option is selected, the dialogue will be updated to display a different set of task buttons. In this case, you can select two data columns by clicking on [Column 1] and [Column 2]. Next, the program will ask for a cut-point for the second column, which divides the rows into two groups: those where Column 2 is smaller than the cut-point and those where it is greater than or equal to the cut-point. The test will then be performed between these two groups using the data values in Column 1.
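The cut-point grouping can be sketched in Python as follows. This is an illustration only: the function names and sample data are hypothetical, and the pooled (equal-variance) t statistic is an assumption chosen for the example, since this section does not state which test variants the program computes.

```python
from math import sqrt
from statistics import mean, variance


def split_by_cut_point(values, grouping, cut_point):
    """Split `values` (Column 1) into two groups using `grouping` (Column 2)."""
    below = [v for v, g in zip(values, grouping) if g < cut_point]
    at_or_above = [v for v, g in zip(values, grouping) if g >= cut_point]
    return below, at_or_above


def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances (an assumption of this sketch)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical data: Column 1 holds the measurements, Column 2 the grouping values.
column_1 = [5.1, 4.8, 6.2, 7.0, 6.6, 5.5]
column_2 = [10, 12, 25, 30, 28, 14]

low, high = split_by_cut_point(column_1, column_2, cut_point=20)
t = pooled_t_statistic(low, high)
```

Note that a row whose Column 2 value equals the cut-point falls into the second group, matching the "greater than or equal to" rule.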
When the third data option is selected, you can perform t- and F-Tests without selecting any data columns. If you know all the parameter values required, you will be able to run a t-test or F-test without having the data set itself.
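As a rough illustration of testing from parameter values alone, the Python sketch below computes a two-sample t statistic and a variance-ratio F statistic from means, standard deviations and sample sizes. The function names are hypothetical, the pooled-variance form is an assumption, and the exact set of parameters the program asks for is not specified in this section.

```python
from math import sqrt


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled two-sample t statistic and its degrees of freedom from summary values."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


def f_from_summary(sd1, sd2):
    """Variance-ratio F statistic from two standard deviations."""
    return sd1 ** 2 / sd2 ** 2


# Hypothetical summary values for two groups of ten observations each.
t, df = t_from_summary(10.0, 2.0, 10, 8.0, 2.0, 10)
f = f_from_summary(3.0, 2.0)
```

No raw data column is involved at any point: the statistics depend only on the supplied parameter values.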
This t-test example demonstrates only one of the many data type options available in UNISTAT. For other types of options see sections 5.0. Overview for descriptive statistics and 6.0. Overview for statistical tests.
2.1.1.2. Variable Selection by Task Buttons
On entry, all available data columns will be listed on the left (the Variables Available list) and the lists on the right (the Variables Selected lists) will be empty. In these lists, the numeric variables are referred to as C1, C2, etc., followed by their Column Labels, if any. String variables are distinguished from numeric variables in that the letter C in their column reference is replaced by S or L. Similarly, date variables will be represented by the letter D and time variables by T. For the details of these different types of data see 3.0.2. Data Types.
When one or more items are highlighted on the Variables Available list, a right-pointing arrow will appear on the right-hand side of each task button. When you click on a task button, the highlighted variables on the Variables Available list will be moved to the list immediately to the right of that button. Variables selected for one task are removed from the Variables Available list and therefore cannot be selected for another task at the same time. Exceptions to this are the regression and GLM Variable Selection Dialogues, where variables can be selected for more than one purpose.
To deselect a variable from the analysis, click on it in the Variables Selected list on the right. A left-pointing arrow will appear on the left of the task button corresponding to this list. Clicking on the button will deselect the variable and return it to the Variables Available list on the left. Multiple highlighting and drag-and-drop work for all list boxes.
There are two types of selection buttons:
1) Buttons that allow any number of items (like [Variable], [Factor]).
2) Buttons that allow a limited number of items (like [X-axis], [Column 1]). If a procedure allows only one variable to be assigned a certain task, then the button for this task will be disabled for further selections once a variable has been assigned. To assign the task to a different variable, the selected variable must first be deselected.
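The selection behaviour described above (moving variables between lists, and the two button types) can be modelled with a small Python sketch. The class `VariableSelection`, its method names and the column labels are hypothetical and are not part of UNISTAT. The sketch also assumes that a deselected variable is appended to the end of the Variables Available list, and it does not model the regression and GLM exception.

```python
class VariableSelection:
    """Toy model of a Variables Available list and per-task Variables Selected lists."""

    def __init__(self, available, task_limits):
        self.available = list(available)
        # Maps task name -> maximum number of items (None means unlimited).
        self.task_limits = dict(task_limits)
        self.selected = {task: [] for task in task_limits}

    def can_accept(self, task, count=1):
        """Mimic a task button being enabled: is there room for `count` more items?"""
        limit = self.task_limits[task]
        return limit is None or len(self.selected[task]) + count <= limit

    def select(self, task, names):
        """Move highlighted variables from the available list to a task list."""
        names = [n for n in names if n in self.available]
        if not self.can_accept(task, len(names)):
            raise ValueError(f"task {task!r} cannot take {len(names)} more item(s)")
        for name in names:
            self.available.remove(name)
            self.selected[task].append(name)

    def deselect(self, task, names):
        """Move variables from a task list back to the available list."""
        for name in list(names):
            if name in self.selected[task]:
                self.selected[task].remove(name)
                self.available.append(name)


# Hypothetical dialogue: [Variable] is unlimited, [X-axis] takes a single column.
dialogue = VariableSelection(
    ["C1 Height", "C2 Weight", "C3 Age"],
    {"Variable": None, "X-axis": 1},
)
dialogue.select("X-axis", ["C3 Age"])
dialogue.select("Variable", ["C1 Height", "C2 Weight"])
```

Once [X-axis] holds one column, `can_accept("X-axis")` returns False until that column is deselected, which mirrors the disabled button in the second case.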
Buttons used for selecting columns are specific to procedures. However, it is possible here to give a brief description of the most commonly used buttons.
[Variable]: Selects any number of variables for analysis. Examples: Y-axis variables in X-Y Plots, independent variables in Regression Analysis, test variables in statistical tests, etc. The order of selection is significant, that is, the analysis will be carried out on the selected variables in the order they appear in the Variables Selected list.
[Factor]: Selects categorical variables that define subgroups of one or more continuous variables. The order of selection is significant. The number of factors selected is usually unlimited but in some procedures it may be limited to one (as in Survival Analysis).
[Dependent]: Selects columns containing continuous data, usually for use in Regression Analysis and Analysis of Variance procedures.
[X-axis], [Y-axis], [Z-axis]: Select columns for X, Y or Z axis of a graph.
[Weight]: Selects a column as weights in the analysis of other columns.
2.1.1.3. Variable Selection by Drag-Drop
It is also possible to assign tasks to selected (highlighted) variables by pressing down the right mouse button, dragging them on a Variables Selected list and dropping. Conversely, the highlighted variables can be deselected by drag-dropping them from a Variables Selected list to the Variables Available list. Drag-drop also works between different Variables Selected lists.
2.1.1.4. Variable Selection from Data Processor
In Stand-Alone Mode, another method of selecting variables for analysis is to highlight them in the Data Processor. Non-contiguous blocks of columns can be selected by holding down the <Ctrl> key whilst clicking on the column labels. When a procedure is selected, the program will automatically issue a [Finish] command and proceed to perform the procedure with default selections. Normally, the user will obtain the output without seeing any dialogues.
Columns of data highlighted in the Data Processor are assigned the non-specific [Variable] task. This will be sufficient to generate an output with default settings in many procedures. If the selected procedure requires further compulsory variable assignments (e.g. a dependent variable for Regression Analysis), then the program will issue a warning and display the relevant Variable Selection Dialogue.
It is also possible to select a block of cells (instead of entire columns) to run a procedure on the selected range only. If a block of cells is highlighted and, say, Summary Statistics is selected, the program will automatically generate a Select Row variable and run the procedure on the selected block only. All procedures will run on the selected cases only for as long as the Select Row column remains in effect.
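The effect of a Select Row variable can be illustrated with the Python sketch below. It assumes a 0/1 indicator column marking the rows inside the highlighted block; the encoding the program actually uses is not described in this section, and the function names and data are hypothetical.

```python
from statistics import mean, stdev


def make_select_row(n_rows, first_row, last_row):
    """Build a 0/1 indicator that is 1 for rows inside the block (1-based, inclusive)."""
    return [1 if first_row <= r <= last_row else 0 for r in range(1, n_rows + 1)]


def summarise_selected(column, select_row):
    """Summary statistics computed only on the rows flagged by `select_row`."""
    chosen = [v for v, keep in zip(column, select_row) if keep]
    return {"n": len(chosen), "mean": mean(chosen), "sd": stdev(chosen)}


# Hypothetical column of six values with rows 2 to 4 highlighted.
data = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
select_row = make_select_row(len(data), first_row=2, last_row=4)
summary = summarise_selected(data, select_row)
```

Any other procedure applied with the same indicator would see the same subset of rows, which corresponds to the Select Row column remaining in effect.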