This vignette is a walkthrough the R code used in the paper introducing ViSiElse package (EM Garnier, N Fouret & M Descoins. ViSiElse: An innovative R-package to visualize raw behavioral data over time).
This vignette is organized as the R code in 4 parts:
ViSiElse is a graphical tool designed to visualize behavioral observations over time. Based on raw time data extracted from video recorded sessions of experimental observations, ViSiElse grants a global overview of a process by combining the visualization of multiple actions timestamps for all participants in a single graph. Individuals and/or group behavior can easily be assessed. For example, ViSiElse was first developed to visualize the actions performed by caregivers during simulated medical procedures in order to get insights of their behavior. Supplementary features allow users to further inspect their data by adding summary statistics (mean, standard deviation, quantile or statistical test) and/or time constraints to check the accuracy of the realized actions.
A first vignette showing ViSiElse’s step by step process is available here.
The first thing to do is simply to install the package and load it.
The installation is done once. All future use of ViSiElse will only require to load the package.
To explain ViSiElse process, we will use 3 different datasets that are all attached to the package:
typDay dataset: This is the main data that we will use. This dataset shows the actions usually performed during a typical day. The simulated dataset of 100 subjects correspond to the timestamps (in min) of each action of the day, from midnight to midnight. Each value is the time elapse between the beginning of the day (midnight) and the execution of the action.
intubation dataset: Time data from a high-fidelity simulation experiment of a neonatal resuscitation with 37 midwife students. The simulation was video recorded and actions required in the intubation process were tagged. This dataset is the execution time (in seconds) of each action performed by the students.
shoppingBehavior dataset: This dataset shows the buying process of consumers over internet based on a 5-steps model: need recognition, information search, evaluation, purchase decision, and post-purchase behavior. This simulated dataset of 100 subjects correspond to the timestamps (in s) of each action of the model (except for the post-purchase behavior) executed by the subjects.
data("typDay") # load typDay dataset
head(typDay) # print first rows
?typDay # Information about the data
data("intubation") # load intubation dataset
head(intubation) # print first rows
?intubation # Information about the data
data("shoppingBehavior") # load shoppingBehavior dataset
head(shoppingBehavior) # print first rows
?shoppingBehavior # Information about the data
In this section, we will focus on the typical day dataset. All ViSiElse features will be explained through this example.
ViSiElse explores human or animal behavior during a process or a procedure or any observations that can be divided into actions. For example, the typical day tasks can be divided into the following actions: sleep, wake up, take a shower, eat breakfast, drink the first coffee of the day, start and stop working, lunch break, pick up the kids, cook and eat dinner and then finally go to sleep.
If we take a look at the data, we can see that the first column is the subject id and each following column is one action of the typical day. Each value is the time elapse (in min) between the beginning of the day (midnight) and the execution of the action.
## id start_sleep stop_sleep wake_up shower breakfast start_work start_lunch
## 1 1 0 366 366 375 389 486 738
## 2 2 0 391 391 406 426 511 751
## 3 3 0 329 329 329 334 449 720
## 4 4 0 335 335 336 342 455 723
## 5 5 0 437 437 464 496 557 774
## 6 6 0 353 353 359 370 473 731
## stop_lunch stop_work pickup_kids start_cook stop_cook go_sleep first_coffee
## 1 789 985 997 1011 1059 1326 479
## 2 811 1022 1037 1057 1118 1351 451
## 3 757 929 960 965 995 1289 535
## 4 763 938 960 966 999 1295 489
## 5 852 1091 1112 1144 1228 1397 481
## 6 777 964 974 985 1026 1313 1068
We can see that subject 1 sleeps until 366min of the day (6h06), takes his shower at 6h15 (375min) and eat breakfast at 6h29 (389min). All the data are the timestamps of the actions performed in a day.
The dataset is the only required argument to run
visielse()
function. Let’s plot our first ViSiElse graph
!
v1 <- visielse(typDay, informer = NULL) # informer = NULL removes the summary statitics that are displayed by default (we will talk about it later).
On the graph, actions are organized one under the other (y-axis) and their executions are distributed along the time axis (x-axis). A drawn rectangle means that at least one subject has done the action in the interval of time. The size of the time interval is set by the breaks of the time axis (here every 30 min from midnight to midnight). The color’s intensity of the rectangles is proportional to the number of individuals who realized the action during the time interval.
In the legend, we can notice that there are two sections: punctual action and long action. What are punctual and long actions ?
ViSiElse differentiate two type of actions (punctual and long). A punctual action is an action with no duration, or not lasting long enough to be measured regarding the time scale of the studied behavior. A long action is an action having duration defined by two punctual actions, one for its beginning, and one for its ending. For example, the action “sleep” is long —it lasts at least a few hours which is substantial in the scale of a day - while the action “wake up” is punctual —it usually only takes seconds or a few minutes and its duration is not relevant in the scale of a day.
For the typical day example, the list of actions can then be transformed to:
Now, to visualize both punctual and long actions with ViSiElse, we
need a new argument of the visielse()
function: the
ViSibook object.
While the dataset contains the raw time data of the studied behavior, the ViSibook contains its structure. Mainly, it is a table consisting of the characteristics of every action.
When running visielse()
function with only the dataset
as argument, the ViSibook is automatically generated and can then be
extracted from the visielse object. Then, we can use the
ConvertFromViSibook()
function to transform the ViSibook
into a data.frame
and modify it.
Let’s see what the ViSibook from our first plot looks like !
## vars label typeA showorder deb fin
## 1 start_sleep start_sleep p 1 <NA> <NA>
## 2 pickup_kids pickup_kids p 10 <NA> <NA>
## 3 start_cook start_cook p 11 <NA> <NA>
## 4 stop_cook stop_cook p 12 <NA> <NA>
## 5 go_sleep go_sleep p 13 <NA> <NA>
## 6 first_coffee first_coffee p 14 <NA> <NA>
## 7 stop_sleep stop_sleep p 2 <NA> <NA>
## 8 wake_up wake_up p 3 <NA> <NA>
## 9 shower shower p 4 <NA> <NA>
## 10 breakfast breakfast p 5 <NA> <NA>
## 11 start_work start_work p 6 <NA> <NA>
## 12 start_lunch start_lunch p 7 <NA> <NA>
## 13 stop_lunch stop_lunch p 8 <NA> <NA>
## 14 stop_work stop_work p 9 <NA> <NA>
The minimum structure of a ViSibook is :
We can now change the labels and add long actions to our graph !
b1 <- b1[order(as.numeric(b1$showorder)), ] # order the data.frame
# Change the labels of the punctual actions #
b1$label <- c("Sleep", "Stop sleeping", "Wake up", "Take a shower", "Eat breakfast", "Start working", "Start eating lunch", "End of lunch", "Stop working", "Pick up the kids", "Start cooking", "End of dinner", "Go to sleep", "First coffee")
# Define the long actions
b1[15,] <- c("sleep", "Sleeping", "l", 1, "start_sleep", "stop_sleep")
b1[16,] <- c("work", "Working", "l", 5, "start_work", "stop_work")
b1[17,] <- c("lunch", "Lunch break", "l", 6, "start_lunch", "stop_lunch")
b1[18,] <- c("cook", "Cook and eat dinner", "l", 8, "start_cook", "stop_cook")
# Define which actions should be plotted and in which order
b1$showorder <- c(NA, NA, 2, 3, 4, 5, NA, NA, 7, 9, NA, NA, 11, 12, 1, 6, 8, 10)
b1 <- b1[order(as.numeric(b1$showorder)), ] # re-order the ViSibook according to the action order
# The new ViSibook
b1
## vars label typeA showorder deb fin
## 15 sleep Sleeping l 1 start_sleep stop_sleep
## 8 wake_up Wake up p 2 <NA> <NA>
## 9 shower Take a shower p 3 <NA> <NA>
## 10 breakfast Eat breakfast p 4 <NA> <NA>
## 11 start_work Start working p 5 <NA> <NA>
## 16 work Working l 6 start_work stop_work
## 14 stop_work Stop working p 7 <NA> <NA>
## 17 lunch Lunch break l 8 start_lunch stop_lunch
## 2 pickup_kids Pick up the kids p 9 <NA> <NA>
## 18 cook Cook and eat dinner l 10 start_cook stop_cook
## 5 go_sleep Go to sleep p 11 <NA> <NA>
## 6 first_coffee First coffee p 12 <NA> <NA>
## 1 start_sleep Sleep p NA <NA> <NA>
## 7 stop_sleep Stop sleeping p NA <NA> <NA>
## 12 start_lunch Start eating lunch p NA <NA> <NA>
## 13 stop_lunch End of lunch p NA <NA> <NA>
## 3 start_cook Start cooking p NA <NA> <NA>
## 4 stop_cook End of dinner p NA <NA> <NA>
The long action sleep is defined by the two punctual actions: start_sleep (beginning) and stop_sleep (ending). The punctual actions used to define long actions do not need to be displayed so their order is set to “NA”.
Let’s see the new ViSiElse graph !
v2 <- visielse(typDay, book = b1, informer = NULL, doplot = F, pixel = 30)
plot(v2, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day")
Long actions are displayed by one line per subject with a size proportional to the duration of the action and the lines are sorted by their starting time.
NB1: As long actions are delimited by punctual actions, no additional
data are required in the dataset. visielse()
automatically
computes the duration of long actions.
NB2: In the visielse()
function, doplot is set
to FALSE as the ViSiElse graph is displayed by the plot()
function. Using plot()
allows the access to more formatting
options. For example, changing the units of the x-axis (in s by default)
or the name of the graph.
You may have notice the use of the pixel argument in the
visielse()
function. For punctual actions, the
pixel parameter represents the time precision
i.e. the time limit to which one subject is moved from a time interval
to another. The default pixel size is set to 20 seconds. It also sets
the breaks of the x-axis in the graph. In the plot()
function, the pixel argument is called scal.unit.tps.
Those two parameters should always be identical.
Data are aggregated into the time intervals. If pixel is too small, the plotted information will not be aggregated enough to allow interpretation.
# Small pixel parameter : data are not aggregated enough #
p1 <- 10
v3 <- visielse(typDay, book = b1, informer = NULL, doplot = F, pixel = p1)
plot(v3, vp0w = 0.7, unit.tps = "min", main = "Typical day, pixel = 10min", scal.unit.tps = p1)
On the contrary, if pixel is too large, the plotted information will be too much aggregated to allow interpretation.
# High pixel parameter : data are too aggregated #
p2 <- 120
v4 <- visielse(typDay, book = b1, informer = NULL, doplot = F, pixel = p2)
plot(v4, vp0w = 0.7, unit.tps = "min", main = "Typical day, pixel = 120min", scal.unit.tps = p2)
The pixel parameter must be chosen carefully.
ViSiElse enables the differentiation of two subsets of participants
through color distinction. Distinguishing experimental groups of
participants is useful to identify different patterns of behavior. In
the typical day dataset, we created two groups: people who employ a
babysitter (in blue) and people who do not (in pink). To display groups
with ViSiElse, users simply specify the group and the
method arguments in the visielse()
function. The
first argument is a vector containing the group distribution for each
individual. The second argument is the name of the chosen visualization
method. ViSiElse provides three methods to plot groups:
# Group definition
group <- rep(1, 100)
group[typDay$pickup_kids > 1019] <- 2
# Groups plotted with "cut" method : each group is one under the other #
v5 <- visielse(typDay, book = b1, informer = NULL, group = group, method = "cut", tests = F, pixel = 30, doplot = F)
plot(v5, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day, 'cut' method")
This representation is useful to compare groups as data are completely dissociated.
# Groups plotted with "join" method : group spacially mixed #
v6 <- visielse(typDay, book = b1, informer = NULL, group = group, method = "join", tests = F, pixel = 30, doplot = F)
plot(v6, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day, 'join' method")
With this method, users can analyze the group distribution among the data.
# Groups plotted with "within" method : all data are plotted together in blue
# and the group specified in "grwithin" is plotted again in pink #
v7 <- visielse(typDay, book = b1, informer = NULL, group = group, method = "within", grwithin = "2", tests = F, pixel = 30, doplot = F)
plot(v7, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day, 'within' method")
This visualization allows users to examine a specific group behavior against the global population. As the ViSiElse package only allows two colors of distinction, the “within” method is the most suitable option for data containing more than two groups. Users can confront each group, one after the other, to the global population and identify their differences.
Green and black zone are here to help visualize time boundaries and assess if the actions are done in time. Indeed, some actions can be restricted by time guidelines or external rules. Green zones represent time obligation i.e. when actions should be achieved while black zones set time interdiction i.e. when actions should not occur.
For punctual actions, ViSiElse can display one green zone and two black zones. To create those zones, we only have to add the following new columns in the ViSibook:
In our typical day example, we can imagine multiple time constraints. For example, the working hours can be delimited. Usually, people should be at work and start working before 10 a.m. (black zone) and they cannot leave work before 4 p.m (black zone). In addition, schools often end at 4 p.m. and close at 5 p.m. so people must pick up their kids during this one-hour interval (black zones before 4p.m. and after 5p.m. and green zone in between).
Let’s add the green and black zones in the ViSibook !
b2 <- b1
b2 <- b2[order(as.numeric(b2$showorder)), ]
# Add definition of the green zones #
b2$GZDeb <- c(rep(NA, 8), 960, rep(NA, 9))
b2$GZFin <- c(rep(NA, 8), 1020, rep(NA, 9))
# Definition of the black zones before the green one #
b2$BZBeforeDeb <- c(rep(NA, 4), 600, NA, 0, NA, 0, rep(NA, 9))
b2$BZBeforeFin <- c(rep(NA, 4), Inf, NA, 960, NA, 960, rep(NA, 9))
# Add definition of the black zones after the green one #
b2$BZAfterDeb <- c(rep(NA, 8), 1020, rep(NA, 9))
b2$BZAfterFin <- c(rep(NA, 8), Inf, rep(NA, 9))
Before analyzing the changes made, let’s add the time constraints on long actions.
For long actions, ViSiElSe only allows one black zone (actually displayed by a darker blue color) but this one black zone can be of two types:
No green zone can be plotted for long actions.
In our example, lunch time is often limited to 1 hour. We now have to create two new columns in the ViSibook. The first one, BZLong, defines the time limit and the second one, BZLtype defines the type of the black zone: here it’s a duration not to exceed (“span”).
# Add definition of the time limit for long action #
b2$BZLong <- c(rep(NA, 7), 30, rep(NA, 10))
b2$BZLtype <- c(rep(NA, 7), "span", rep(NA, 10)) # type should either be "span" (for a duration not to exceed) or "time" (for a deadline not to cross)
b2
## vars label typeA showorder deb fin
## 15 sleep Sleeping l 1 start_sleep stop_sleep
## 8 wake_up Wake up p 2 <NA> <NA>
## 9 shower Take a shower p 3 <NA> <NA>
## 10 breakfast Eat breakfast p 4 <NA> <NA>
## 11 start_work Start working p 5 <NA> <NA>
## 16 work Working l 6 start_work stop_work
## 14 stop_work Stop working p 7 <NA> <NA>
## 17 lunch Lunch break l 8 start_lunch stop_lunch
## 2 pickup_kids Pick up the kids p 9 <NA> <NA>
## 18 cook Cook and eat dinner l 10 start_cook stop_cook
## 5 go_sleep Go to sleep p 11 <NA> <NA>
## 6 first_coffee First coffee p 12 <NA> <NA>
## 1 start_sleep Sleep p NA <NA> <NA>
## 7 stop_sleep Stop sleeping p NA <NA> <NA>
## 12 start_lunch Start eating lunch p NA <NA> <NA>
## 13 stop_lunch End of lunch p NA <NA> <NA>
## 3 start_cook Start cooking p NA <NA> <NA>
## 4 stop_cook End of dinner p NA <NA> <NA>
## GZDeb GZFin BZBeforeDeb BZBeforeFin BZAfterDeb BZAfterFin BZLong BZLtype
## 15 NA NA NA NA NA NA NA <NA>
## 8 NA NA NA NA NA NA NA <NA>
## 9 NA NA NA NA NA NA NA <NA>
## 10 NA NA NA NA NA NA NA <NA>
## 11 NA NA 600 Inf NA NA NA <NA>
## 16 NA NA NA NA NA NA NA <NA>
## 14 NA NA 0 960 NA NA NA <NA>
## 17 NA NA NA NA NA NA 30 span
## 2 960 1020 0 960 1020 Inf NA <NA>
## 18 NA NA NA NA NA NA NA <NA>
## 5 NA NA NA NA NA NA NA <NA>
## 6 NA NA NA NA NA NA NA <NA>
## 1 NA NA NA NA NA NA NA <NA>
## 7 NA NA NA NA NA NA NA <NA>
## 12 NA NA NA NA NA NA NA <NA>
## 13 NA NA NA NA NA NA NA <NA>
## 3 NA NA NA NA NA NA NA <NA>
## 4 NA NA NA NA NA NA NA <NA>
In this new ViSibook, the punctual action “start_work” is associated with a black zone starting at 600min (10 a.m.) and ending at “Inf”. The “Inf” notation means that the zone (green or black) should be extended to the end of the studied process. In our case, the end of the day (midnight or 1440min). We can define 3 zones (2 black and 1 green) per punctual action and 1 black zone per long action.
Now, we can see how the time zones are displayed !
v8 <- visielse(typDay, book = b2, informer = NULL, pixel = 30, doplot = F)
plot(v8, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day")
Black zones for long actions are drawn by a dark blue at the end of the lines. For each subject that eat lunch for more than 60min, the dark blue starts at the 61st minute of the action and lasts until the end.
To complete the ViSiElse’s graph and analyze the behavioral data
tendency, summary statistics can be added with the
informer argument of the visielse()
function.
Users can choose between plotting the mean and standard deviation or the
median with the first and third quartiles.
# Mean + standard deviation #
v9 <- visielse(typDay, book = b1, informer = "mean", tests = F, pixel = 30, doplot = F)
plot(v9, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day, mean + SD")
# Median + IQR #
v10 <- visielse(typDay, book = b1, informer = "median", tests = F, pixel = 30, doplot = F)
plot(v10, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day, median + IQR")
Moreover, if the data contains groups and summary statistics are defined, ViSiElse computes a statistical test to compare the time data between the two groups. If the informer parameter is set to mean and standard deviation, ViSiElse runs a Wilcoxon test; if the informer parameter is set to median and quartiles, ViSiElse runs a Mood’s two-sample test. An asterisk appears on the right side of the graph if the statistical test is significant with a 0.01 alpha risk (default value). The alpha risk could be manually set.
# Statistical test between groups #
v11 <- visielse(typDay, book = b1, informer = "mean", group = group, method = "cut", pixel = 30, doplot = F, tests = TRUE, threshold.test = 0.05)
plot(v11, vp0w = 0.7, unit.tps = "min", scal.unit.tps = 30, main = "Typical day")
Here, informer is set to mean, so Wilcoxon tests are performed between the groups for each action. The alpha risk is set to 0.05 by the threshold.test parameter.
ViSiElse performs statistical tests as an indication of the statistical difference between groups. However, ViSiElse is not a reporting tool and only provides the statistical significance of the group comparison. To get complete results and the test details, users should run the tests separately.
ViSiElse was originally developed to visualize behavioral data extracted from video recorded sessions of simulated healthcare procedures. For example, midwife students are trained, through high-fidelity simulation, to the neonatal resuscitation procedure which includes endotracheal intubation (EI). EI is the process of inserting a tube through the mouth and then into the airway in order to ventilate the newborn. By restoring airway patency, EI is a lifesaving procedure and it has to be readily available to all patients whose ventilation is compromised in emergency or anesthesia context. EI consists of six punctual actions completed by two long actions:
EI, as most of the medical procedures, follows guidelines set by local or international committees. For the neonatal resuscitation procedure, the International Liaison Committee on Resuscitation (ILCOR) sets the guidelines that caregivers should follow, including those for EI. ILCOR recommendations state that the intubation should not occur during the first minute of life. An appropriate time for the insertion of the laryngoscope blade in the newborn’s mouth is between 120 and 210 seconds.
Let’s create the ViSiElse graph !
## id deci_intub stop_ventil blade_in insert_tube blade_out restart_ventil
## 1 1 65 76 80 82 93 98
## 2 2 119 150 153 159 163 171
## 3 3 115 144 145 154 175 180
## 4 4 192 233 235 242 255 268
## 5 5 169 187 189 199 206 213
## 6 6 110 157 159 169 191 200
The dataset shows the intubation process performed by 37 midwife students during a high-fidelity simulation session.
#### Figure 5 in ViSiElse paper ####
v16 <- visielse(intubation, doplot = F)
b3 <- ConvertFromViSibook(v16@book)
b3$label <- c("Decision to intubate", "Stop ventilation", "Laryngoscope\nblade in", "Insert endotracheal\ntube", "Laryngoscope\nblade out", "Restart ventilation")
b3[7,] <- c("dur_laryngoscope", "Laryngoscope\nduration use", "l", "8", "blade_in", "blade_out")
b3[8,] <- c("dur_intub", "Intubation duration", "l", "9", "stop_ventil", "restart_ventil")
b3$GZDeb <- c(NA, NA, 120, NA, NA, NA, NA, NA)
b3$GZDeb <- c(NA, NA, 120, NA, NA, NA, NA, NA)
b3$GZFin <- c(NA, NA, 210, NA, NA, NA, NA, NA)
b3$BZBeforeDeb <- c(NA, NA, 0, NA, NA, NA, NA, NA)
b3$BZBeforeFin <- c(NA, NA, 120, NA, NA, NA, NA, NA)
b3$BZAfterDeb <- c(NA, NA, 210, NA, NA, NA, NA, NA)
b3$BZAfterFin <- c(NA, NA, Inf, NA, NA, NA, NA, NA)
b3$BZLong <- c(rep(NA, 7), 30)
b3$BZLtype <- c(rep(NA, 7), "span")
v17 <- visielse(intubation, book = b3, informer = "median", doplot = F)
plot(v17, scal.unit.tps = 20, rcircle = 8, vp0h = 0.65, vp0w = 0.7, Fontsize.label.Action = 9, Fontsize.label.Time = 9, Fontsize.label.color = 9, main = "Intubation process in neonatal resuscitation algorithm")
In the graph, thanks to the long action “intubation duration”, we see that midwives performed the EI heterogeneously during the neonatal resuscitation. Some midwives intubate early in the resuscitation while others only start after 4 min. ViSiElse allows a graphical inspection of the adequacy to the recommendations. For training processes, ViSiElse provides a visual assessment of the performance of caregivers.
NB: We can note the additional formatting options used in the
plot()
function.
Online shopping behavior is a buying process where consumers purchase items over the internet based on a 5-steps model: need recognition, information search, evaluation, purchase decision, and post-purchase behavior. Many factors influencing the buying process have been studied such as nationality, gender, age, education or income.
## id need start_search stop_search start_eval stop_eval deci
## 1 1 8 16 38 31 53 61
## 2 2 10 20 50 40 70 80
## 3 3 4 8 18 16 26 30
## 4 4 5 10 22 19 31 36
## 5 5 15 30 75 59 104 119
## 6 6 6 12 30 25 43 49
The dataset is a simulated dataset for one hundred consumers divided into two groups: 50 women and 50 men.
#### Figure 6 in ViSiElse paper ####
v18 <- visielse(shoppingBehavior, doplot = F)
b4 <- ConvertFromViSibook(v18@book)
b4$label <- c("Need recognition", "Start information search", "Stop information search", "Start evaluation", "Stop evaluation", "Purchase decision")
b4$showorder <- c(1, NA, NA, NA, NA, 4)
b4[7,] <- c("search", "Information search", "l", "2", "start_search", "stop_search")
b4[8,] <- c("eval", "Evaluation", "l", "3", "start_eval", "stop_eval")
v19 <- visielse(shoppingBehavior, book = b4, informer = "mean", pixel = 5, group = group_shop, method = "cut", doplot = F)
plot(v19, scal.unit.tps = 5, rcircle = 8, vp0h = 0.6, vp0w = 0.75, Fontsize.label.Action = 9, Fontsize.label.Time = 9, Fontsize.label.color = 9, lwd.grid = 1, lwdline = 2, main = "Online shopping behaviour", unit.tps = "min")
With ViSiElse, researchers can visually assess the differences in online shopping behavior between different categories of consumers. ViSiElse graph also displays the summary statistics for each group. This graph is an example of ViSiElse application for online shopping data but ViSiElse representation can be used to visualize any web navigation behavior.
ViSiElse’s options are accessible through the ViSibook for all the
actions characteristics (names, labels, types, order, long action
delimitations and green and black zones), through the arguments of the
visielse()
function for all the analysis features (time
scale, groups, statistics) and through the arguments of the
plot()
function for all the formatting options.
Formatting options include:
The complete list is available in the documentation of the
plot-ViSigrid-method
function, here.
In ggplot2 package, the closest graph type to ViSiElse is the
geom_tile()
function (heatmap). Thanks to ggplot2 flexible
formatting options, the graph can be design as ViSiElse. However it
requires to transform the dataset.
First, the time data should be turned into a frequency table calculating how many subjects performed the action in the time interval. Then, the data should be reshaped to match ggplot2 required data structure. Finally, to match ViSiElse style, 0 values in the dataset should be set to “NA”. With this trick, we can plot “NA” values in white in the formatting options of ggplot2.
Let’s see the modified dataset !
# load the packages
library(ggplot2) # for the heatmap
library(reshape2) # reshape the dataset to adjust its structure
# Create a frequency table with 30min time intervals
typDay2 <- sapply(typDay[,4:15], function(x){
table(cut(x, breaks=seq(0, 1440, 30)))
})
# Reshape the dataset to fit ggplot2 data structure
typDay2 <- data.frame(time = factor(seq(0, 1410, 30)), typDay2)
rownames(typDay2) <- 1:nrow(typDay2)
colnames(typDay2) <- c("time", b1$label[c(12, 11, 18, 17, 9, 7, 16, 15, 5, 4, 3, 2)])
typDay2 <- melt(typDay2, id = "time")[, c(2, 1, 3)]
# Set 0 values to "NA"
typDay2$value[typDay2$value == 0] <- NA
head(typDay2)
## variable time value
## 1 First coffee 0 NA
## 2 First coffee 30 NA
## 3 First coffee 60 NA
## 4 First coffee 90 NA
## 5 First coffee 120 NA
## 6 First coffee 150 NA
The adjusted dataset only contains 3 columns:
How does the heatmaps look like ?
heatmap <- ggplot(data = typDay2, aes(x = time, y = variable, fill = value)) +
geom_tile() +
scale_fill_gradient(low = "#E2E8FF", high = "#2D39A5", name = "Participants", na.value = 'white', limit = c(0, 53)) +
xlab("Time (min)") + ylab(element_blank()) +
scale_x_discrete(expand = c(0, 0), breaks = seq(0, 60, 30)) +
theme(axis.line = element_line(colour = "black"),
axis.title = element_text(size = 12, face = "bold"),
axis.text = element_text(colour = "black", size = 8)) +
theme(legend.text = element_text(size = 8),
legend.title = element_text(size = 10),
legend.position ="bottom",
legend.margin = margin(0, 0, 0, 0, unit = "mm"),
legend.key.width = unit(1, "cm"),
legend.key.height = unit(3, "mm"))
print(heatmap)
The visual output is very close to ViSiElse as the gradient was modified to match ViSiElse colors.
However, we can see that only punctual actions are displayed. Indeed,
the main drawback of using geom_tile()
is that heatmaps do
not provide long actions visualization which can be critical. Plotting
long actions instead of just the punctual actions (i.e. start and end
time) can useful when the duration of actions matter.
In our intubation example, the act of intubation should be as fast as possible considering that the newborn does not receive any oxygen during the procedure. If we only see the start and end time of the intubation, we would not be able to visualize the duration of the intubation. Indeed, for punctual actions, individuals are pulled together so we cannot link a start time to its end time: some might start early and finish early, others might start early but finish late or start late and finish late…
In addition, heatmaps also do not allow groups or time constraints visualization.
If you have any questions, you can reach us by email at [email protected]
For more documentations, you can see the help in R:
Or you can read the first vignette of the package: ViSiElse Step by Step.