BEST Viewpoints: Relational Viewpoints

Introduction

This module is designed to identify relations between variables. The methods it uses to analyze data are sometimes referred to as association analysis or Market Basket Analysis. More information about the algorithms used in this module is provided in the documentation for Market Basket Analysis.
To illustrate how to use this module, a new dataset is loaded from the examples as shown below. Many of the options in this module are the same as in Hierarchical Viewpoints and are not described again in this section; reading the documentation for Hierarchical Viewpoints will help in understanding them.
The loaded dataset contains sales data for employees in different countries. The OrderID field is important because it is used to define a "basket", or grouping category, in the examples that follow.
The basic appearance of the Relational Viewpoints module is shown below. The Analysis window lists the field names of the loaded data that can be used for analysis. The user must select at least two fields before any results are generated.
Selecting Product followed by OrderID identifies Product associations when the data is grouped by OrderID. The results are displayed below. The Associations web shows that the pair InkJet - Laptop is present in 14 orders. This is the strongest association found. To simplify the output and make the results more meaningful, only the strongest associations are displayed rather than all associations found.
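The counting idea behind the web can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the module's actual implementation: the records, the function name count_associations, and the order numbers are hypothetical, and the strength of an association is assumed to be the number of baskets that contain both items.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (OrderID, Product) records; not taken from the example dataset.
records = [
    (1, "InkJet"), (1, "Laptop"), (1, "Ink Cartridge"),
    (2, "InkJet"), (2, "Laptop"),
    (3, "Laptop"), (3, "Mouse"),
]

def count_associations(rows):
    """Count how many baskets contain each unordered pair of distinct items."""
    baskets = defaultdict(set)
    for basket_id, item in rows:
        baskets[basket_id].add(item)      # one basket per OrderID
    counts = Counter()
    for items in baskets.values():
        # Every unordered pair of distinct items in the basket is one association.
        counts.update(combinations(sorted(items), 2))
    return counts

associations = count_associations(records)
strongest = associations.most_common(1)[0]
print(strongest)
```

In this toy data the pair InkJet - Laptop appears in two orders and is therefore the strongest association.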
To display more relations, move the slider named Grow to the right. Similarly, the slider named Trim removes weak associations from the currently displayed web.
Additionally, the Output is set to Web by default, but selecting Web Data presents the relations in tabular form as shown below. Web Data lists only the associations present in the Web, whereas selecting All Associations lists all associations found in the data.
Pressing the Group By button opens the Data Structure window, which helps explain how the data has been grouped based on the field selection in the Analysis menu. As expected, selecting Product followed by OrderID creates a set of baskets where each basket contains all the products sold in one order.
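The difference between Web Data and All Associations can be pictured as a filter over the full table of associations. The sketch below assumes that the web keeps associations at or above a minimum strength, which is only a guess at how Grow and Trim behave; the function web_data, the threshold parameter, and every count other than InkJet - Laptop are hypothetical.

```python
# Association counts (pair -> number of baskets). Only the InkJet - Laptop
# value comes from the example above; the rest are made up for illustration.
all_associations = {
    ("InkJet", "Laptop"): 14,
    ("Ink Cartridge", "InkJet"): 9,
    ("Laptop", "Mouse"): 4,
    ("Keyboard", "Mouse"): 2,
}

def web_data(associations, min_strength):
    """Return associations at or above min_strength, strongest first."""
    kept = [(pair, n) for pair, n in associations.items() if n >= min_strength]
    return sorted(kept, key=lambda row: (-row[1], row[0]))

# A high threshold mimics a trimmed web; a threshold of 1 lists everything.
for pair, n in web_data(all_associations, min_strength=5):
    print(pair, n)
```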
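A rough picture of the grouping reported by the Data Structure window is a mapping from each OrderID to the list of products in that order. The rows and the helper group_into_baskets below are hypothetical and serve only to illustrate the structure.

```python
from collections import defaultdict

# Hypothetical rows containing the two selected fields.
rows = [
    {"OrderID": 101, "Product": "InkJet"},
    {"OrderID": 101, "Product": "Laptop"},
    {"OrderID": 102, "Product": "Ink Cartridge"},
    {"OrderID": 102, "Product": "Ink Cartridge"},
    {"OrderID": 102, "Product": "InkJet"},
]

def group_into_baskets(rows, item_field, group_field):
    """Collect the item_field values of all rows sharing a group_field value."""
    baskets = defaultdict(list)
    for row in rows:
        baskets[row[group_field]].append(row[item_field])
    return dict(baskets)

baskets = group_into_baskets(rows, "Product", "OrderID")
for order_id, products in baskets.items():
    print(order_id, products)
```

Repeated items are kept in the lists here so that the same structure can serve both distinct and non-distinct counting.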

Build Menu

The Build Menu provides options to modify how the web is displayed, as well as options that affect how calculations are made and which ones are made. The Setup tab is mostly used for display options. As an example, note that setting the Plot Method to Radial changes how the web is drawn.

Build Menu: Calculate

The first set of options in the Calculate tab is named Data because these options change the way input data is used for analysis. The default data Type is List of Lists, and this will always be the case if the input dataset is a rectangular array such as a database table.
The Distinct checkbox is selected by default so that only distinct items in baskets are used for identifying associations. In some applications the analyst may want to consider all possible couples (or permutations instead of combinations) in the data. The image below shows that, although the associations found are the same, the number of associations is larger when Distinct is not selected. Additionally, self-loops indicate that some baskets contained more than one item of the same class; for example, Ink Cartridge has 11 accumulated self-associations.
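The effect of Distinct can be illustrated on a single basket. The sketch below assumes that, with Distinct off, every pair of positions in the basket is counted, so a repeated item pairs with itself and produces a self-loop; this is one plausible reading of the behavior described above, and the function pair_counts and the basket contents are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_counts(basket, distinct=True):
    """Count item pairs in one basket, with or without collapsing duplicates."""
    items = sorted(set(basket)) if distinct else sorted(basket)
    return Counter(combinations(items, 2))

# Hypothetical basket in which Ink Cartridge was ordered twice.
basket = ["Ink Cartridge", "InkJet", "Ink Cartridge"]

print(pair_counts(basket, distinct=True))
print(pair_counts(basket, distinct=False))
```

With Distinct on, the basket yields one Ink Cartridge - InkJet association. With Distinct off, the same pair is counted twice and an Ink Cartridge - Ink Cartridge self-loop appears.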
The Ordered option generates even more information as it is used to take into account the order of events. Later in this document an example will be presented where this capability becomes very useful. For now, the image below shows the effect of enabling the Ordered option in the previous example.
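One way to think about the Ordered option is that the pair (A, B) is counted separately from (B, A), depending on which item comes first in the basket. The sketch below assumes that order means position within the basket; the function ordered_pair_counts and the sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations

def ordered_pair_counts(sequences):
    """Count (earlier, later) item pairs, keeping the order within each basket."""
    counts = Counter()
    for seq in sequences:
        # combinations() preserves input order, so each pair is (earlier, later).
        counts.update(combinations(seq, 2))
    return counts

# Hypothetical baskets listed in the order the items were recorded.
sequences = [
    ["Laptop", "InkJet"],
    ["InkJet", "Laptop"],
    ["Laptop", "InkJet", "Ink Cartridge"],
]

counts = ordered_pair_counts(sequences)
print(counts[("Laptop", "InkJet")], counts[("InkJet", "Laptop")])
```

Because the two directions are kept apart, an ordered analysis produces more associations than an unordered one over the same baskets.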

Build Menu: Evaluation

The Evaluation sub menu provides options that help users learn Relational Viewpoints and avoid attempting extremely complicated problems. The User Skill is set to Basic by default, which forces the user to select a Group-By field for analysis. This is done because building baskets this way is what the user needs most of the time. The Moderate skill level forces grouping by at least one field when more than one analysis field is selected. Setting User Skill to Advanced allows the user to define freely how baskets are built. As explained earlier, the Data Structure window helps in understanding how baskets are being built from the user selections in the Analysis window.
The example below shows that when the User Skill is set to Advanced, selecting a single field treats all the records in that field as one basket. In this case Distinct is selected, so the web simply shows that there are five different products in the field Product.
In the example above, if the Distinct option is deselected, all possible combinations of two elements in the basket are identified and counted (see image below). The strongest association is between InkJet and Ink Cartridge.
The example below shows that baskets can be defined in many ways. In particular grouping by Country and Employee has the effect of creating baskets for each combination of Country-Employee found in data.
As another way to define baskets, consider the example below, where Country-Product combinations are the elemental items in each OrderID. The practical implication of this setup is that the Market Basket Analysis is now performed independently for each Country, so the differences between countries can be easily evaluated. Note that the order in which the analysis fields are selected is important: the top menu reads Analyzing: Country-Product and not vice versa.
For information on the meaning of Evaluation Method and Complexity Limit, refer to the documentation for Market Basket Analysis. In general, these options warn the user if the problem becomes too complicated. Simple problems usually have a complexity below 1, so a quick response is obtained. When Evaluation Method is set to Automatic, the problem complexity is estimated before a solution is attempted, and the user is warned if the user-defined value of Complexity Limit is exceeded. If this happens, the user can choose to 1) not solve the problem, 2) increase the Complexity Limit value, or 3) set the Evaluation Method to Solution, which forces the program to find a solution regardless of the time it takes.
The image below shows the Problem Complexity Index window, which is displayed when the complexity limit is exceeded or when Evaluation Method is set to Complexity.
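The Country-Product setup can be sketched by joining the two field values into a single item label before counting pairs within each OrderID. The rows, country names, and the function composite_associations below are hypothetical; the label order follows the order of the selected fields.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical rows; country names and order numbers are made up.
rows = [
    {"OrderID": 1, "Country": "USA", "Product": "InkJet"},
    {"OrderID": 1, "Country": "USA", "Product": "Laptop"},
    {"OrderID": 2, "Country": "UK", "Product": "InkJet"},
    {"OrderID": 2, "Country": "UK", "Product": "Laptop"},
    {"OrderID": 3, "Country": "USA", "Product": "InkJet"},
    {"OrderID": 3, "Country": "USA", "Product": "Laptop"},
]

def composite_associations(rows, item_fields, group_field):
    """Count pairs of composite items (e.g. Country-Product) per group."""
    baskets = defaultdict(set)
    for row in rows:
        # The item label follows the order in which the fields were selected.
        item = "-".join(str(row[f]) for f in item_fields)
        baskets[row[group_field]].add(item)
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(sorted(items), 2))
    return counts

counts = composite_associations(rows, ["Country", "Product"], "OrderID")
for pair, n in counts.most_common():
    print(pair, n)
```

Since every item carries its country, the InkJet - Laptop association is counted separately for each country and the counts can be compared directly.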

Build Menu: Basket Size

For problems like the Product by OrderID example, it is useful to have statistics for the Basket Size (the total number of items per basket or group) so that the analyst can limit the analysis to baskets that meet certain criteria. For example, for retail data the analyst may be interested in differentiating small baskets (e.g. 10 items or fewer) from large ones. The sliders allow the user to limit the analysis to baskets whose size is greater than or equal to a lower limit (the From value in the first slider) and less than or equal to an upper limit (the To value in the second slider). By default these From-To values are Auto-Reset so that all baskets are included, but the analyst may set other values by means of the sliders.
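The From-To restriction amounts to a simple filter on basket size. In the sketch below the function filter_by_size and the baskets are hypothetical, basket size is taken as the total number of items including repeats, and leaving a limit unset stands in for the Auto-Reset default that includes all baskets.

```python
def filter_by_size(baskets, size_from=None, size_to=None):
    """Keep baskets whose size lies in [size_from, size_to], both inclusive."""
    if not baskets:
        return {}
    sizes = [len(items) for items in baskets.values()]
    # Unset limits default to the observed extremes, so nothing is excluded.
    low = min(sizes) if size_from is None else size_from
    high = max(sizes) if size_to is None else size_to
    return {key: items for key, items in baskets.items()
            if low <= len(items) <= high}

# Hypothetical baskets keyed by OrderID.
baskets = {
    101: ["InkJet", "Laptop"],
    102: ["Ink Cartridge"],
    103: ["InkJet", "Laptop", "Ink Cartridge", "Ink Cartridge"],
}

print(sorted(filter_by_size(baskets)))                         # all baskets
print(sorted(filter_by_size(baskets, size_from=2, size_to=3)))  # mid-sized only
```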
To confirm that the analysis was made as specified the Basket Size statistics can be calculated using the Basket button. A window with some basket statistics will be generated (see example below).
Additionally, it is important to estimate from the baskets analyzed how likely it is that a given product is present in a customer basket. This is often called Penetration analysis and is also provided as an option. For a particular product or item, PenetrationDistinct is calculated as FrequencyDistinct / Total Baskets, where FrequencyDistinct is the number of baskets in which the product is present and Total Baskets is the total number of baskets being analyzed. Thus, products with high penetration are those that are likely to be present in a customer basket.
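The PenetrationDistinct formula can be worked through on a small example. The baskets and the function name penetration_distinct below are hypothetical; the calculation follows the definition given above, counting each product at most once per basket.

```python
from collections import Counter

def penetration_distinct(baskets):
    """Return FrequencyDistinct / Total Baskets for every item."""
    total_baskets = len(baskets)
    frequency_distinct = Counter()
    for items in baskets.values():
        # set() ensures a product is counted once per basket, even if repeated.
        frequency_distinct.update(set(items))
    return {item: n / total_baskets for item, n in frequency_distinct.items()}

# Hypothetical baskets keyed by OrderID.
baskets = {
    1: ["InkJet", "Laptop"],
    2: ["InkJet", "InkJet"],
    3: ["Laptop"],
    4: ["InkJet"],
}

penetration = penetration_distinct(baskets)
print(penetration)
```

Here InkJet is present in 3 of 4 baskets, giving a penetration of 0.75, and Laptop is present in 2 of 4, giving 0.5.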

Scope

The Scope tab is used to simplify the associations web by marking a user-defined set of nodes as relevant or not relevant. The idea is that relevant nodes are always displayed in the web, while not-relevant nodes are displayed only when they connect to relevant nodes. Scoping options are also available on each node, as shown in the image below.
For example, by selecting Make Relevant for InkJet we get a web where only strong associations to this product are shown (see image below). It is possible to select more than one relevant node to emphasize the associations to more than one product.
When Make Not-Relevant is applied to the selected nodes, these nodes will only be displayed if they have connections to relevant nodes.
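The scoping rule can be approximated as a filter on the edges of the web. The sketch below is one interpretation only: an edge touching a not-relevant node is kept just when its other end is a relevant node. The function apply_scope, the product names other than InkJet and Laptop, and all counts other than InkJet - Laptop are hypothetical.

```python
def apply_scope(edges, relevant, not_relevant):
    """Drop edges whose not-relevant endpoint is not tied to a relevant node."""
    kept = {}
    for (a, b), strength in edges.items():
        a_hidden = a in not_relevant and b not in relevant
        b_hidden = b in not_relevant and a not in relevant
        if not (a_hidden or b_hidden):
            kept[(a, b)] = strength
    return kept

# Association web as pair -> strength; values are made up except InkJet - Laptop.
edges = {
    ("InkJet", "Laptop"): 14,
    ("Laptop", "Mouse"): 5,
    ("InkJet", "Mouse"): 3,
    ("Keyboard", "Monitor"): 4,
}

scoped = apply_scope(edges, relevant={"InkJet"}, not_relevant={"Mouse"})
for pair, strength in scoped.items():
    print(pair, strength)
```

In this toy web Mouse stays visible only through its link to InkJet, while its link to Laptop is removed.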

Categories and Data

The options available in the Categories and Data tabs are analogous to those discussed for the Hierarchical Viewpoints module; the reader is referred to the pertinent sections of that documentation.