LikertMakeR

synthesise and correlate rating-scale data

Hume Winzar

February 2024

LikertMakeR

LikertMakeR synthesises and correlates Likert-scale and related rating-scale data. You decide the mean and standard deviation, and (optionally) the correlations among vectors, and the package will generate data with those same predefined properties.

The package generates a column of values that simulates the properties of a rating scale. If multiple columns are generated, you can use LikertMakeR to rearrange the values so that the new variables are correlated in accord with a user-predefined correlation matrix.

Purpose

The package should be useful for teaching in the Social Sciences, and for scholars who wish to “replicate” rating-scale data for further analysis and visualisation when only summary statistics have been reported.

Motivation

I was prompted to write the functions in LikertMakeR after reviewing too many journal-article submissions in which authors presented questionnaire results with only means and standard deviations (often only the means), with no apparent understanding of scale distributions and their impact on scale properties. Hopefully, this tool will help researchers, teachers, and reviewers to think more carefully about rating-scale distributions and the effects of variance, scale boundaries, and the number of items in a scale.

Rating scale properties

A Likert scale is the mean, or sum, of several ordinal rating scales. Typically, the items are bipolar (usually “agree-disagree”) responses to propositions that are determined to be moderately-to-highly correlated and that capture some facet of a theoretical construct.

Rating scales, such as Likert scales, are not continuous or unbounded.

For example, a 5-point Likert scale constructed from five items (questions) will have a summed range from 5 (all rated ‘1’) to 25 (all rated ‘5’), with all integers in between, and a mean range from 1 to 5 in intervals of 1/5 = 0.20. A 7-point Likert scale constructed from eight items will have a summed range from 8 (all rated ‘1’) to 56 (all rated ‘7’), with all integers in between, and a mean range from 1 to 7 in intervals of 1/8 = 0.125.
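
A quick check in base R makes this granularity explicit (a small illustrative sketch; the variable names are made up here):

## attainable scores for a five-item, 5-point Likert scale
sums  <- 5:25        # all possible summed scores
means <- sums / 5    # all possible mean scores, in steps of 0.2
head(means)
#> [1] 1.0 1.2 1.4 1.6 1.8 2.0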

Rating-scale boundaries define minima and maxima for any scale values. If the mean is close to one boundary then data points will gather more closely to that boundary. If the mean is not in the middle of the scale, and if the standard deviation is more than about 1/4 of the scale range, then the data will always be skewed.


LikertMakeR functions

  • lfast(): generate a vector of rating-scale values with a predefined mean and standard deviation

  • lcor(): rearrange the values in the columns of a data-frame to match a target correlation matrix

  • makeCorrAlpha(): construct a correlation matrix with a predefined Cronbach’s Alpha

  • makeItems(): generate a data-frame of correlated rating-scale items from predefined moments and a correlation matrix

  • alpha(): calculate Cronbach’s Alpha from a correlation matrix or a data-frame (helper function)

  • eigenvalues(): calculate eigenvalues of a correlation matrix and report whether it is positive-definite (helper function)


Using LikertMakeR

Download and Install LikertMakeR

From CRAN:

install.packages("LikertMakeR")
library(LikertMakeR)

Development version from GitHub:

library(devtools)
install_github("WinzarH/LikertMakeR")
library(LikertMakeR)

Generate synthetic rating-scale data

To synthesise a rating scale with LikertMakeR, the user must input the following parameters:

  • n: sample size

  • mean: desired mean

  • sd: desired standard deviation

  • lowerbound: lower bound of the scale (for example, ‘1’ for a 1-5 rating scale)

  • upperbound: upper bound of the scale (for example, ‘5’ for a 1-5 rating scale)

  • items: number of items making up the scale (default = ‘1’)

The previous version of LikertMakeR had a function, lexact(), which was very slow and no more accurate than lfast(). So, lexact() is now deprecated.


lfast()

  • lfast() draws repeated random samples from a scaled Beta distribution. It produces a vector of values with mean and standard deviation correct to two decimal places.
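
The idea behind the scaled Beta approach can be sketched with a single moment-matched draw (illustrative only; this is not the package’s actual algorithm, and the function name beta_scale_sample is made up for this sketch):

## moment-matched Beta sampling on a bounded, discrete scale
beta_scale_sample <- function(n, mean, sd, lowerbound, upperbound, items) {
  width <- upperbound - lowerbound
  m <- (mean - lowerbound) / width   # mean rescaled to [0, 1]
  s <- sd / width                    # sd rescaled to [0, 1]
  ab <- m * (1 - m) / s^2 - 1        # a + b from the method of moments
  x <- rbeta(n, m * ab, (1 - m) * ab)
  ## rescale to the scale bounds and round to the 1/items grid
  lowerbound + round(x * width * items) / items
}

set.seed(42)
y <- beta_scale_sample(512, 2.5, 0.75, 1, 5, 4)
c(mean = mean(y), sd = sd(y))   # close to, but not exactly, 2.5 and 0.75

lfast() refines this idea by drawing repeatedly and keeping the best-fitting vector.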

lfast() example

a four-item, five-point Likert scale

x1 <- lfast(
 n = 512,
 mean = 2.5,
 sd = 0.75,
 lowerbound = 1,
 upperbound = 5,
 items = 4
)
#> [1] "best solution in 645 iterations"

an 11-point likelihood-of-purchase scale

x3 <- lfast(256, 3, 2, 0, 10)
#> [1] "best solution in 9293 iterations"


lexact()

lexact() is deprecated. It is now simply a wrapper for lfast().


Correlating rating scales

The lcor() function rearranges the values in the columns of a data-set so that they are correlated at a specified level. It does not change the values; it swaps their positions within each column, so univariate statistics do not change but correlations with the other vectors do.

lcor()

lcor() systematically selects pairs of values within a column and swaps their places, then checks whether the swap improves the correlation matrix. If the revised data-frame produces a correlation matrix closer to the target correlation matrix, the swap is retained; otherwise, the values are returned to their original places. This process iterates across each column.
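
A naive sketch of this swap-and-check idea follows (illustrative only; lcor() itself is more systematic and far more efficient, and the function name swap_toward_target is made up for this sketch):

swap_toward_target <- function(data, target, sweeps = 2000) {
  for (i in seq_len(sweeps)) {
    col <- sample(2:ncol(data), 1)     # leave the first column untouched
    rows <- sample(nrow(data), 2)      # choose two rows at random
    candidate <- data
    candidate[rows, col] <- candidate[rev(rows), col]   # swap the two values
    ## keep the swap only if it moves the correlations closer to the target
    if (sum((cor(candidate) - target)^2) < sum((cor(data) - target)^2)) {
      data <- candidate
    }
  }
  data
}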

To create the desired correlated data, the user must define the following parameters:

  • data: a starter data set of rating-scales. Number of columns must match the dimensions of the target correlation matrix.

  • target: the target correlation matrix.

lcor() example

Let’s generate some data: three 5-point Likert scales, each with five items.


## generate uncorrelated synthetic data
n <- 128
lowerbound <- 1
upperbound <- 5
items <- 5

mydat3 <- data.frame(
 x1 = lfast(n, 2.5, 0.75, lowerbound, upperbound, items),
 x2 = lfast(n, 3.0, 1.50, lowerbound, upperbound, items),
 x3 = lfast(n, 3.5, 1.00, lowerbound, upperbound, items)
)
#> [1] "best solution in 994 iterations"
#> [1] "best solution in 763 iterations"
#> [1] "best solution in 1724 iterations"

The first six observations from this data-frame are:

#>    x1  x2  x3
#> 1 2.6 3.4 4.4
#> 2 2.8 1.0 3.2
#> 3 2.2 5.0 3.2
#> 4 4.0 1.4 3.8
#> 5 1.6 5.0 3.8
#> 6 2.2 5.0 2.0

And the first and second moments (to 3 decimal places) are:

#>         x1  x2    x3
#> mean 2.502 3.0 3.500
#> sd   0.750 1.5 0.999

We can see that the data have first and second moments very close to what is expected.

The synthetic data have low correlations:

#>        x1     x2    x3
#> x1  1.000  0.038 -0.01
#> x2  0.038  1.000 -0.13
#> x3 -0.010 -0.130  1.00
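
The summaries above can be reproduced with the same idioms used later in this vignette:

apply(mydat3, 2, mean) |> round(3)
apply(mydat3, 2, sd) |> round(3)
cor(mydat3) |> round(3)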

Now, let’s define a target correlation matrix:


## describe a target correlation matrix

tgt3 <- matrix(
 c(
 1.00, 0.85, 0.75,
 0.85, 1.00, 0.65,
 0.75, 0.65, 1.00
 ),
 nrow = 3
)

So now we have a data-frame with desired first and second moments, and a target correlation matrix.


## apply lcor() function

new3 <- lcor(mydat3, tgt3)

The first column of the new data-frame will not change, but values of the other columns are rearranged.

The first six observations from this data-frame are:

#>    V1 V2  V3
#> 1 4.2  5 5.0
#> 2 1.0  1 1.4
#> 3 1.4  1 1.6
#> 4 3.6  5 4.8
#> 5 4.2  5 5.0
#> 6 1.2  1 1.8

And the new data frame is correlated close to our desired correlation matrix; here presented to 3 decimal places:

#>      V1   V2   V3
#> V1 1.00 0.85 0.75
#> V2 0.85 1.00 0.65
#> V3 0.75 0.65 1.00

Generate a correlation matrix from Cronbach’s Alpha

makeCorrAlpha()

makeCorrAlpha() constructs a random correlation matrix of given dimensions and a predefined Cronbach’s Alpha.

To create the desired correlation matrix, the user must define the following parameters:

  • items (or “k”): the number of rows and columns of the desired correlation matrix.

  • alpha: the target value for Cronbach’s Alpha

  • variance: a notional variance coefficient to affect the spread of values in the correlation matrix. Default = ‘0.5’. A value of ‘0’ produces a matrix where all off-diagonal correlations are equal. Setting ‘variance = 1.0’ gives a wider range of values.

makeCorrAlpha() is volatile

Random values generated by makeCorrAlpha() are highly volatile. makeCorrAlpha() may not generate a feasible (positive-definite) correlation matrix, especially when variance is high relative to

  • the desired Alpha, and

  • the desired correlation-matrix dimensions (number of items)

makeCorrAlpha() will inform the user if the resulting correlation matrix is positive definite, or not.

If the returned correlation matrix is not positive-definite, a feasible solution may still be possible, and often is. The user is encouraged to try again, possibly several times, to find one.
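
If you simply want to keep drawing until a feasible matrix appears, a minimal sketch (assuming items, alpha, and variance have already been defined) is:

## keep trying until the returned matrix is positive-definite
repeat {
  candidate <- makeCorrAlpha(items, alpha, variance)
  if (min(eigen(candidate)$values) > 0) break   # positive-definite: accept it
}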

makeCorrAlpha() examples

Four variables, alpha = 0.85, variance = default

## define parameters 
 items <- 4
 alpha <- 0.85
 # variance <- 0.5 ## by default

## apply makeCorrAlpha() function
 set.seed(42)
 
 cor_matrix_4 <- makeCorrAlpha(items, alpha)
#> correlation values consistent with desired alpha in 2261 iterations
#> The correlation matrix is positive definite

makeCorrAlpha() produced the following correlation matrix (to three decimal places):

#>       [,1]  [,2]  [,3]  [,4]
#> [1,] 1.000 0.445 0.481 0.577
#> [2,] 0.445 1.000 0.632 0.647
#> [3,] 0.481 0.632 1.000 0.735
#> [4,] 0.577 0.647 0.735 1.000

Test output with helper functions

## using helper function alpha()

 alpha(cor_matrix_4)
#> [1] 0.8500036
 
## using helper function eigenvalues() 

eigenvalues(cor_matrix_4, 1)

#> cor_matrix_4  is positive-definite
#> 
#> Eigenvalues:
#>  2.771181 0.5922674 0.3851331 0.2514183
#> [1] 2.7711812 0.5922674 0.3851331 0.2514183

Twelve variables, alpha = 0.90, variance = 1


## define parameters 
 items <- 12
 alpha <- 0.90
 variance <- 1.0

## apply makeCorrAlpha() function
 set.seed(42) 

 cor_matrix_12 <- makeCorrAlpha(items, alpha, variance)
#> correlation values consistent with desired alpha in 26735 iterations
#> The correlation matrix is positive definite

makeCorrAlpha() produced the following correlation matrix (to two decimal places):

#>        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8] [,9] [,10] [,11] [,12]
#>  [1,]  1.00 -0.31 -0.26 -0.25 -0.17 -0.16 -0.08 -0.05 0.04  0.07  0.08  0.10
#>  [2,] -0.31  1.00  0.10  0.17  0.19  0.19  0.20  0.20 0.21  0.25  0.34  0.34
#>  [3,] -0.26  0.10  1.00  0.35  0.36  0.36  0.37  0.41 0.42  0.42  0.43  0.45
#>  [4,] -0.25  0.17  0.35  1.00  0.45  0.46  0.46  0.46 0.48  0.49  0.49  0.52
#>  [5,] -0.17  0.19  0.36  0.45  1.00  0.52  0.53  0.53 0.54  0.58  0.59  0.60
#>  [6,] -0.16  0.19  0.36  0.46  0.52  1.00  0.61  0.62 0.64  0.65  0.66  0.67
#>  [7,] -0.08  0.20  0.37  0.46  0.53  0.61  1.00  0.67 0.67  0.70  0.71  0.72
#>  [8,] -0.05  0.20  0.41  0.46  0.53  0.62  0.67  1.00 0.74  0.77  0.79  0.81
#>  [9,]  0.04  0.21  0.42  0.48  0.54  0.64  0.67  0.74 1.00  0.83  0.88  0.91
#> [10,]  0.07  0.25  0.42  0.49  0.58  0.65  0.70  0.77 0.83  1.00  0.92  0.92
#> [11,]  0.08  0.34  0.43  0.49  0.59  0.66  0.71  0.79 0.88  0.92  1.00  0.95
#> [12,]  0.10  0.34  0.45  0.52  0.60  0.67  0.72  0.81 0.91  0.92  0.95  1.00

Test output

 alpha(cor_matrix_12)
#> [1] 0.9000022

 eigenvalues(cor_matrix_12, 1) |> round(3)

#> cor_matrix_12  is positive-definite
#> 
#> Eigenvalues:
#>  6.6755 1.414525 0.935596 0.6735671 0.5536723 0.5100483 0.3867301 0.3380953 0.2605829 0.1603447 0.05985932 0.03147918
#>  [1] 6.675 1.415 0.936 0.674 0.554 0.510 0.387 0.338 0.261 0.160 0.060 0.031

Generate a dataframe of rating scales from a correlation matrix and predefined moments

makeItems()

makeItems() generates a dataframe of random discrete values from a scaled Beta distribution so the data replicate a rating scale, and are correlated close to a predefined correlation matrix.

Generally, means, standard deviations, and correlations are correct to two decimal places.

makeItems() is a wrapper function for

  • lfast(), which takes repeated samples selecting a vector that best fits the desired moments, and

  • lcor(), which rearranges values in each column of the dataframe so they closely match the desired correlation matrix.

To create the desired dataframe, the user must define the following parameters:

  • n: number of observations

  • means: a vector of length ‘k’ of desired means for each variable

  • sds: a vector of length ‘k’ of desired standard deviations for each variable

  • lowerbound: a vector of length ‘k’ of values for the lower bound of each variable (For example, ‘1’ for a 1-5 rating scale)

  • upperbound: a vector of length ‘k’ of values for the upper bound of each variable (For example, ‘5’ for a 1-5 rating scale)

  • cormatrix: a target correlation matrix with ‘k’ rows and ‘k’ columns.

makeItems() examples


## define parameters

 n <- 128
 dfMeans <- c(2.5, 3.0, 3.0, 3.5)
 dfSds <- c(1.0, 1.0, 1.5, 0.75)
 lowerbound <- rep(1, 4)
 upperbound <- rep(5, 4)
 
 corMat <- matrix(
 c(
 1.00, 0.25, 0.35, 0.45,
 0.25, 1.00, 0.70, 0.75,
 0.35, 0.70, 1.00, 0.85,
 0.45, 0.75, 0.85, 1.00
 ),
 nrow = 4, ncol = 4
 )

## apply makeItems() function
 df <- makeItems(
 n = n, 
 means = dfMeans, 
 sds = dfSds, 
 lowerbound = lowerbound, 
 upperbound = upperbound, 
 cormatrix = corMat
 )
#> [1] "best solution in 16384 iterations"
#> [1] "best solution in 16384 iterations"
#> [1] "best solution in 727 iterations"
#> [1] "best solution in 16384 iterations"

## test the function
 head(df); tail(df)
#>   V1 V2 V3 V4
#> 1  2  1  1  2
#> 2  2  1  1  2
#> 3  4  5  5  5
#> 4  4  5  5  5
#> 5  2  1  1  2
#> 6  3  3  4  4
#>     V1 V2 V3 V4
#> 123  2  2  1  3
#> 124  2  3  1  3
#> 125  3  4  4  4
#> 126  4  3  2  4
#> 127  1  4  2  4
#> 128  1  3  3  4
 apply(df, 2, mean) |> round(3)
#>  V1  V2  V3  V4 
#> 2.5 3.0 3.0 3.5
 apply(df, 2, sd) |> round(3)
#>    V1    V2    V3    V4 
#> 1.004 1.004 1.501 0.753
 cor(df) |> round(3)
#>       V1   V2   V3    V4
#> V1 1.000 0.25 0.35 0.448
#> V2 0.250 1.00 0.70 0.750
#> V3 0.350 0.70 1.00 0.850
#> V4 0.448 0.75 0.85 1.000
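
For comparison, roughly the same result could be produced by calling lfast() and lcor() manually with the parameters defined above (a sketch of the wrapper’s two steps, not the package’s internal code):

## step 1: generate each column to the desired moments
draft <- data.frame(
  V1 = lfast(n, dfMeans[1], dfSds[1], lowerbound[1], upperbound[1]),
  V2 = lfast(n, dfMeans[2], dfSds[2], lowerbound[2], upperbound[2]),
  V3 = lfast(n, dfMeans[3], dfSds[3], lowerbound[3], upperbound[3]),
  V4 = lfast(n, dfMeans[4], dfSds[4], lowerbound[4], upperbound[4])
)

## step 2: rearrange values within columns to approach the target correlations
df_manual <- lcor(draft, corMat)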

Generate a dataframe from Cronbach’s Alpha and predefined moments

This is a two-step process:

  1. apply makeCorrAlpha() to generate a correlation matrix from desired alpha,

  2. apply makeItems() to generate rating-scale items from the correlation matrix and desired moments

The required parameters are therefore those described above: items (k), alpha, and optionally variance for makeCorrAlpha(); and n, means, sds, lowerbound, and upperbound for makeItems(), with the correlation matrix from Step 1 supplying the cormatrix argument in Step 2.

Step 1: Generate a correlation matrix


## define parameters
k <- 6
alpha <- 0.85

## generate correlation matrix
set.seed(42)
myCorr <- makeCorrAlpha(k, alpha)
#> correlation values consistent with desired alpha in 2425 iterations
#> The correlation matrix is positive definite

## display correlation matrix
myCorr |> round(3)
#>       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]
#> [1,] 1.000 0.123 0.314 0.394 0.422 0.433
#> [2,] 0.123 1.000 0.453 0.455 0.480 0.486
#> [3,] 0.314 0.453 1.000 0.531 0.537 0.639
#> [4,] 0.394 0.455 0.531 1.000 0.657 0.671
#> [5,] 0.422 0.480 0.537 0.657 1.000 0.690
#> [6,] 0.433 0.486 0.639 0.671 0.690 1.000

### checking Cronbach's Alpha
alpha(myCorr)
#> [1] 0.8499957

Step 2: Generate dataframe


## define parameters
n <- 256
myMeans    <- c(2.75, 3.00, 3.00, 3.25, 3.50, 3.5)
mySds      <- c(1.00, 0.75, 1.00, 1.00, 1.00, 1.5)
lowerbound <- rep(1, k)
upperbound <- rep(5, k)

## Generate Items
myItems <- makeItems(n, myMeans, mySds, lowerbound, upperbound, myCorr)
#> [1] "best solution in 2617 iterations"
#> [1] "best solution in 30 iterations"
#> [1] "best solution in 2383 iterations"
#> [1] "best solution in 271 iterations"
#> [1] "best solution in 10740 iterations"
#> [1] "best solution in 4776 iterations"

## resulting data frame
head(myItems)
#>   V1 V2 V3 V4 V5 V6
#> 1  3  1  2  1  1  1
#> 2  4  4  4  5  5  5
#> 3  5  5  5  5  5  5
#> 4  1  1  2  1  1  1
#> 5  5  5  5  5  5  5
#> 6  1  1  1  1  1  1
tail(myItems)
#>     V1 V2 V3 V4 V5 V6
#> 251  3  4  2  4  2  5
#> 252  2  4  4  3  3  5
#> 253  3  3  4  4  4  5
#> 254  2  3  4  4  3  1
#> 255  3  4  3  4  5  5
#> 256  3  3  4  3  4  4

## means and standard deviations
myMoments <- data.frame(
  means = apply(myItems, 2, mean) |> round(3),
  sds = apply(myItems, 2, sd) |> round(3)
) |> t()
myMoments
#>          V1    V2    V3    V4    V5  V6
#> means 2.750 3.000 3.000 3.250 3.500 3.5
#> sds   0.998 0.751 1.002 0.998 1.002 1.5

## Cronbach's Alpha of data frame
alpha(NULL, myItems)
#> [1] 0.8498222

Summary plots of new data frame
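
A minimal sketch using base R graphics (the vignette’s original plotting code is not reproduced here):

## scatterplot matrix of the six synthetic items
pairs(myItems, main = "Synthetic rating-scale items")

## response distributions, one bar chart per item
par(mfrow = c(2, 3))
for (v in names(myItems)) {
  barplot(table(myItems[[v]]), main = v, xlab = "response", ylab = "count")
}
par(mfrow = c(1, 1))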


Helper functions

LikertMakeR includes two additional functions that may be of help when examining parameters and output.

alpha()

alpha() accepts, as input, either a correlation matrix or a dataframe. If both are submitted, then the correlation matrix is used by default, with a message to that effect.

alpha() examples


## define parameters
df <- data.frame(
  V1 = c(4, 2, 4, 3, 2, 2, 2, 1),
  V2 = c(3, 1, 3, 4, 4, 3, 2, 3),
  V3 = c(4, 1, 3, 5, 4, 1, 4, 2),
  V4 = c(4, 3, 4, 5, 3, 3, 3, 3)
)

corMat <- matrix(
  c(
    1.00, 0.35, 0.45, 0.75,
    0.35, 1.00, 0.65, 0.55,
    0.45, 0.65, 1.00, 0.65,
    0.75, 0.55, 0.65, 1.00
  ),
  nrow = 4, ncol = 4
)

## apply function examples
alpha(cormatrix = corMat)
#> [1] 0.8395062
alpha(data = df)
#> [1] 0.8026658
alpha(NULL, df)
#> [1] 0.8026658
alpha(corMat, df)
#> Warning: Both cormatrix and data present.
#>                 
#> Using cormatrix by default.
#> [1] 0.8395062
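
For reference, the value returned for corMat matches the standardized-alpha formula, alpha = (k / (k - 1)) * (1 - k / sum(R)), applied to a k x k correlation matrix R. A minimal sketch (the function name alpha_from_cor is made up here; the package’s alpha() may differ in its details):

alpha_from_cor <- function(R) {
  k <- ncol(R)
  (k / (k - 1)) * (1 - k / sum(R))
}

alpha_from_cor(corMat)
#> [1] 0.8395062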

eigenvalues()

eigenvalues() calculates eigenvalues of a correlation matrix, reports on whether the matrix is positive-definite, and optionally produces a scree plot.

eigenvalues() examples


## define parameters
correlationMatrix <- matrix(
  c(
    1.00, 0.25, 0.35, 0.45,
    0.25, 1.00, 0.70, 0.75,
    0.35, 0.70, 1.00, 0.85,
    0.45, 0.75, 0.85, 1.00
  ),
  nrow = 4, ncol = 4
)

## apply function
evals <- eigenvalues(cormatrix = correlationMatrix)
#> correlationMatrix  is positive-definite
#> 
#> Eigenvalues:
#>  2.748499 0.8122627 0.3048151 0.1344231

print(evals)
#> [1] 2.7484991 0.8122627 0.3048151 0.1344231

eigenvalues() function with optional scree plot

 evals <- eigenvalues(correlationMatrix, 1)

#> correlationMatrix  is positive-definite
#> 
#> Eigenvalues:
#>  2.748499 0.8122627 0.3048151 0.1344231

Alternative methods & packages

LikertMakeR is intended for synthesising & correlating rating-scale data with means, standard deviations, and correlations as close as possible to predefined parameters. If you don’t need your data to be close to exact, then other options may be faster or more flexible.

Different approaches include:

sampling from a truncated normal distribution

Data are sampled from a normal distribution, and then truncated to suit the rating-scale boundaries, and rounded to set discrete values as we see in rating scales.

See Heinz (2021) for an excellent, short worked example.
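
A minimal sketch of the truncated-normal approach, assuming the truncnorm package (the package choice here is an assumption, not necessarily the one used by Heinz):

library(truncnorm)

n <- 128
## draw from a normal truncated to the 1-5 range, then round to integer responses
x <- rtruncnorm(n, a = 1, b = 5, mean = 3.5, sd = 1.0) |> round()
table(x)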

sampling with a predetermined probability distribution


n <- 128
sample(1:5, n,
  replace = TRUE,
  prob = c(0.1, 0.2, 0.4, 0.2, 0.1)
)

marginal model specification

Marginal model specification extends the idea of a predefined probability distribution to multivariate, correlated data-frames; see, for example, the SimCorMultRes package (Touloumis, 2016).


References

Grønneberg, S., Foldnes, N., & Marcoulides, K. M. (2022). covsim: An R Package for Simulating Non-Normal Data for Structural Equation Models Using Copulas. Journal of Statistical Software, 102(1), 1–45. doi:10.18637/jss.v102.i03

Heinz, A. (2021), Simulating Correlated Likert-Scale Data In R: 3 Simple Steps (blog post) https://glaswasser.github.io/simulating-correlated-likert-scale-data/

Lalovic, M. (2021), responsesR: Simulate Likert scale item responses (on GitHub) https://github.com/markolalovic/responsesR

Matta, T.H., Rutkowski, L., Rutkowski, D. & Liaw, Y.L. (2018), lsasim: an R package for simulating large-scale assessment data. Large-scale Assessments in Education 6, 15. doi:10.1186/s40536-018-0068-8

Touloumis, A. (2016), Simulating Correlated Binary and Multinomial Responses under Marginal Model Specification: The SimCorMultRes Package, The R Journal 8:2, 79-91. doi:10.32614/RJ-2016-034