5 Behavioral Analysis
We are now ready to analyze and plot the behavioral data. Let’s start with a quick recap of the data and tasks we have:
Sorting Task
The day sorting task (Figure 1D) was performed in front of a computer screen. The 20 event images from the day learning task were presented on the screen in a miniature version. They were arranged in a circle around a central area displaying 4 rectangles. Participants were instructed to drag and drop all events of the same sequence into the same rectangle with a computer mouse. Participants freely chose which rectangle corresponded to which sequence as the sequences were not identifiable by any label and were presented in differing orders across mini-blocks during learning.
Thus, in analysis, we took the grouping as provided by the rectangles and assigned the four groups of events to the four days in a way that there was maximal overlap between actual days and sorted days. We found the best solution for this by trying all combinations in a preparation step.
Timeline task
In this task, participants saw a timeline ranging from 6 a.m. to midnight together with miniature versions of the five event images belonging to one sequence (Figure 1E). Participants were instructed to drag and drop the event images next to the timeline so that scene positions reflected the event times they had inferred in the day learning task. To facilitate precise alignment to the timeline, event images were shown with an outward pointing triangle on their left side, on which participants were instructed to base their responses.
Participants responses were read out from the logfiles of this task and converted to virtual hours. The data are saved in the text file including all behavioral data (virtem_behavioral_data.txt).
Let’s begin by loading the data into a tibble (dataframe) with the following columns:
- sub_id = subject identifier for all rows of this subject
- day = virtual day. There are 4 virtual days
- event = number of event in a day, i.e. order. Each day has 5 events.
- pic = picture identifier. Pictures were randomly assigned to events and days for each subject
- virtual_time = true virtual time of an event
- real_time = real time of an event
- memory_time = data from the timeline task, where participants indicated remembered (virtual) time of each event
- memory_order = remembered ordinal position of an event
- sorted_day = data from the day sorting task, where participants sorted all event pictures into the 4 days.
# load data from CSV
<- file.path(dirs$data4analysis, "behavioral_data.txt")
fname <- cols_only(
col_types_list sub_id = col_factor(),
day = col_factor(),
event = col_factor(),
pic = col_integer(),
virtual_time = col_double(),
real_time = col_double(),
memory_time = col_double(),
memory_order = col_double(),
sorted_day = col_integer()
)<- as_tibble(read_csv(fname, col_types = col_types_list))
beh_data
head(beh_data)
sub_id | day | event | pic | virtual_time | real_time | memory_time | memory_order | sorted_day |
---|---|---|---|---|---|---|---|---|
31 | 1 | 1 | 12 | 9.25 | 7.81 | 9.45 | 1 | 1 |
31 | 1 | 2 | 20 | 11.8 | 23.4 | 11.5 | 2 | 1 |
31 | 1 | 3 | 2 | 14.8 | 42.2 | 14.5 | 3 | 1 |
31 | 1 | 4 | 8 | 16.2 | 51.6 | 16.5 | 4 | 1 |
31 | 1 | 5 | 19 | 18.2 | 64.1 | 19 | 5 | 1 |
31 | 2 | 1 | 1 | 9.5 | 9.38 | 9.53 | 1 | 2 |
5.1 Sorting Task
For analysis of the sorting task, we took the grouping of event images as provided by the participants and assigned them to the four sequences to ensure maximal overlap between actual and sorted sequence memberships. While the assignment of groupings to sequences is unambiguous when performance is, as in our sample, high, this procedure is potentially liberal at lower performance levels. We then calculated the percentage of correctly sorted event images for each participant, see the raincloud plot(100) in Figure 2A.
Calculate percentage of correct responses
So to calculate participants’ accuracy in this task we need to figure out how often the true day label matches the label of the quadrant into which an event was sorted.
# calculate the number and percentages of correct sorts
<- beh_data %>% group_by(sub_id) %>%
day_sorting summarise(
n_correct = sum(day == sorted_day),
prcnt_correct = sum(day == sorted_day)/(n_days*n_events_day)*100,
group = as.factor(1),
.groups = "drop"
)
# print a simple summary of descriptive statistics
summary(day_sorting$prcnt_correct)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 40.00 75.00 95.00 86.43 100.00 100.00
Results of the sorting task: 86.43 ± 16.82 mean±standard deviation of correct sorts
Plot sorting performance
# raincloud plot
<- ggplot(day_sorting,aes(x=1,y=prcnt_correct, fill = group, colour = group)) +
f2a # plot the violin
::geom_half_violin(position=position_nudge(0.1),
gghalvesside = "r", fill=ultimate_gray, color= NA) +
# single subject data points with horizontal jitter
geom_point(aes(x = 1-.2), alpha = 0.7,
shape=16, colour=ultimate_gray, size = 1,
position = position_jitter(width = .1, height = 0)) +
# box plot of distribution
geom_boxplot(width = .1, colour = "black") +
# add plot of mean and SEM
stat_summary(fun = mean, geom = "point", size=1, shape = 16,
position = position_nudge(.1), colour = "black") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(.1), colour = "black", width = 0, size = 0.5) +
# edit axis labels and limit
ylab('% correct') + ylim(0, 100) +
xlab('sorting task') +
# make it pretty
theme_cowplot() + theme(legend.position = "none",
axis.ticks.x = element_blank(), axis.text.x = element_blank()) +
# change color
scale_color_manual(values = ultimate_gray) +
scale_fill_manual(values = ultimate_gray) +
guides(color = "none", fill = "none")
f2a
# save source data
<-ggplot_build(f2a)$data[[2]]
source_dat ::write_tsv(source_dat %>% select(x,y),
readrfile = file.path(dirs$source_dat_dir, "f2a.txt"))
Sorting errors as a function of sequence position
# DISTRIBUTION OF SORTING ERRORS
# test whether frequencies of sorting errors for the sequence positions differs from uniformity
<- beh_data %>% filter(sorted_day != day) %>%
stats group_by(event) %>%
summarise(n_sort_errs = sum(sorted_day != day), .groups="drop") %>%
pull(n_sort_errs) %>%
chisq.test() %>%
tidy()
# histogram of sorting errors as a function of sequence position
<- ggplot(beh_data %>% filter(sorted_day != day), aes(x=event, fill = event)) +
fig_sort_err_hist geom_bar(stat="count") +
scale_fill_manual(values = event_colors) +
theme_cowplot() +
theme(legend.position = "none",
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8))
<- here("figures", "letter_sort_errors_histogram")
fn ggsave(paste0(fn, ".pdf"), plot=fig_sort_err_hist, units = "cm",
width = 4, height = 5.5, dpi = "retina", device = cairo_pdf)
ggsave(paste0(fn, ".png"), plot=fig_sort_err_hist, units = "cm",
width = 4, height = 5.5, dpi = "retina", device = "png")
print(fig_sort_err_hist)
\(\chi^2\) test against uniformity for distribution of sorting errors: \(\chi^2\)=2.55, p=0.635
5.2 Timeline Task
Visualize the raw data
First, let’s get an overview of participants’ behavior in this task. For this, we plot the responses from the timeline task for all participants. In these plots, each row is one virtual day. The circles along the gray lines represent the true virtual times that participants were supposed to learn. For each event on each day, we then add a full plot of the behavior. This includes the single-subject data points (colored circles), their mean and standard error across subjects (black circle and line) as well as the boxplot and kernel density plot of the distribution.
# reduce data frame to the times specified by the design
<- beh_data %>% group_by(day, event) %>%
design_temp_struct summarise(virtual_time = unique(virtual_time), .groups = "drop")
# raincloud plot
<- ggplot(beh_data, aes(x=day,y=memory_time,
f2b fill = virtual_time, colour = virtual_time,
group=paste(day, event, sep = "_"))) +
::geom_half_violin(position = "identity", side = "r", color = NA) +
gghalves# plot the violin
#geom_flat_violin(position = position_nudge(x = 0.1, y = 0),adjust = 1, trim = TRUE) +
# single subject data points with horizontal jitter
geom_point(aes(x = as.numeric(day)-0.25, y = memory_time, fill = virtual_time), alpha = 0.7,
position = position_jitter(width = .05, height = 0),
shape=16, size = 1) +
# box plot of distribution
geom_boxplot(position = position_nudge(x = -0.1, y = 0), aes(x = day, y = memory_time),
width = .05, colour = "black", outlier.shape = NA) +
# add plot of mean and SEM
stat_summary(fun = mean, geom = "point", size = 1, colour = "black", shape = 16) +
stat_summary(fun.data = mean_se, geom = "errorbar",
colour = "black", width = 0, size = 0.5)+
# plot point and line for true virtual times
geom_line(data = design_temp_struct, aes(x = day, y = virtual_time, group = day),
position = position_nudge(-0.45), size = 1, colour = ultimate_gray)+
geom_point(data = design_temp_struct, aes(x = day, y = virtual_time, fill = virtual_time),
position = position_nudge(-0.45), size = 2, color = ultimate_gray, stroke = 0.5, shape = 21) +
ylab('virtual time') + xlab('sequence') +
scale_y_continuous(limits = c(6, 24), breaks=seq(6,24,4)) +
::scale_fill_scico(begin = 0.1, end = 0.7, palette = "devon") +
scico::scale_color_scico(begin = 0.1, end = 0.7, palette = "devon") +
scicocoord_flip() +
guides(color=guide_colorbar(title.position = "top", direction = "horizontal",
title = element_blank(),
barwidth = unit(20, "mm"), barheight = unit(3, "mm")),
fill = "none")+
theme_cowplot() + theme(text = element_text(size=10), axis.text = element_text(size=8),
legend.position = c(1,1), legend.justification = c(1,1))
f2b
# save source data
<-ggplot_build(f2b)$data[[2]] %>%
source_dat rename(x=y,y=x) # rename because of flipped coordinates
::write_tsv(source_dat %>% select(x,y),
readrfile = file.path(dirs$source_dat_dir, "f2b.txt"))
Accuracy of remembered times
We analyzed how well participants constructed the event times based on the day learning task. We quantified absolute errors across all events (Figure 2C) as well as separately for the five sequence positions (Figure 2D), the four sequences (Supplemental Figure 3A) and as a function of virtual clock speed (Supplemental Figure 3B).
Average absolute error per participant
Now, let’s look at the average error per participant collapsed across all trials.
# calculate signed timeline error as difference between virtual time and response
<- beh_data %>%
beh_data mutate(timeline_error = virtual_time - memory_time)
# average absolute timeline error across subjects
<- beh_data %>%
timeline_group group_by(sub_id) %>%
summarise(
avg_error = mean(abs(timeline_error)),
.groups = "drop")
# print summary of the average error
summary(timeline_group$avg_error)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.2377 0.5625 0.9093 0.9103 1.1221 2.0963
<- ggplot(timeline_group, aes(x=1,y=avg_error), colour= time_colors[1], fill=time_colors[1]) +
f2c # plot the violin
::geom_half_violin(position=position_nudge(0.1),
gghalvesside = "r", fill=time_colors[1], color = NA) +
# single subject data points with horizontal jitter
geom_point(aes(x = 1-.2), alpha = 0.7,
position = position_jitter(width = .1, height = 0),
shape=16, colour=time_colors[1], size = 1)+
# box plot of distribution
geom_boxplot(width = .1, fill=time_colors[1], colour = "black", outlier.shape = NA) +
# add plot of mean and SEM
stat_summary(fun = mean, geom = "point", size = 1, shape = 16,
position = position_nudge(.1), colour = "black") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(.1), colour = "black", width = 0, size = 0.5)+
# edit axis labels
ylab('absolute error') + xlab('timeline task') +
guides(color = "none", fill = "none")+
theme_cowplot() + theme(legend.position = "none", axis.text.x = element_blank(),
axis.ticks.x = element_blank())
f2c
# save source data
<-ggplot_build(f2c)$data[[2]]
source_dat ::write_tsv(source_dat %>% select(x,y),
readrfile = file.path(dirs$source_dat_dir, "f2c.txt"))
Timeline task: 0.91 ± 0.47 mean±standard deviation of average absolute errors.
Event-wise average absolute errors
We can also aggregate the data across days by group the events as a function of event position (i.e. the order). Let’s look at the absolute errors in the timeline task for the events at positions 1-5, averaged across days.
# calculate mean absolute error per event for each subject
<- beh_data %>%
timeline_per_event group_by(sub_id, event) %>%
summarise(
mean_abs_error =mean(abs(timeline_error)),
.groups = "drop")
# raincloud plot
<- ggplot(timeline_per_event,aes(x=event,y=mean_abs_error, fill = event)) +
f2d ::geom_half_violin(position=position_nudge(0.1), side = "r", colour=NA) +
gghalvesgeom_point(aes(x = as.numeric(event)-.2, y = mean_abs_error, colour=event), alpha = 0.7,
position = position_jitter(width = .1, height = 0),
shape=16, size = 1) +
geom_boxplot(position = position_nudge(x = 0, y = 0), aes(x = event, y = mean_abs_error),
width = .1, colour = "black", outlier.shape = NA, outlier.size = 1) +
stat_summary(fun = mean, geom = "point", size=1, shape = 16,
position = position_nudge(.1), colour = "BLACK") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(.1), colour = "BLACK", width = 0, size = 0.5)+
ylab('absolute error')+ xlab('event') +
scale_colour_manual(values = event_colors, name="event position") +
scale_fill_manual(values = event_colors) +
guides(fill = "none", color=guide_legend(override.aes=list(fill=NA, alpha = 1, size=2))) +
theme_cowplot() + theme(legend.position = "none")
f2d
# save source data
<-ggplot_build(f2d)$data[[2]]
source_dat ::write_tsv(source_dat %>% select(x,y, group),
readrfile = file.path(dirs$source_dat_dir, "f2d.txt"))
Timeline error differences between sequences
Let’s test if absolute errors differ between sequences. We use a permutation-based repeated measures ANOVA with permutation-based t-tests for post-hoc comparisons.
# mean absolute timeline error per day
<- beh_data %>% group_by(sub_id, day) %>%
timeline_group_by_day summarise(avg_timeline_error = mean(abs(virtual_time-memory_time)), .groups = "drop")
# permutation repeated measures from permuco package with day as the only factor
set.seed(511) # set seed for reproducibility
<- permuco::aovperm(formula = avg_timeline_error ~ day + Error(sub_id/day),
aov_fit_perm data = timeline_group_by_day, method = "Rd_kheradPajouh_renaud", np =n_perm)
summary(aov_fit_perm)
SSn | dfn | SSd | dfd | MSEn | MSEd | F | parametric P(>F) | permutation P(>F) |
---|---|---|---|---|---|---|---|---|
5.54 | 3 | 25.5 | 81 | 1.85 | 0.315 | 5.86 | 0.00113 | 0.0008 |
# posthoc pairwise t-tests
set.seed(511) # set seed for reproducibility
for (i in 1:(n_days-1)){
for (j in (i+1):n_days){
<- paired_t_perm_jb(timeline_group_by_day$avg_timeline_error[timeline_group_by_day$day==i]-
stats $avg_timeline_error[timeline_group_by_day$day==j],
timeline_group_by_dayn_perm = n_perm)
print(sprintf("sequence %d vs. %d: t=%.2f, p=%.3f",
$statistic, round(stats$p_perm, 3)))
i, j, stats
} }
## [1] "sequence 1 vs. 2: t=3.38, p=0.001"
## [1] "sequence 1 vs. 3: t=-0.12, p=0.912"
## [1] "sequence 1 vs. 4: t=2.59, p=0.013"
## [1] "sequence 2 vs. 3: t=-2.92, p=0.001"
## [1] "sequence 2 vs. 4: t=-1.15, p=0.271"
## [1] "sequence 3 vs. 4: t=2.15, p=0.023"
# raincloud plot
<- ggplot(timeline_group_by_day,aes(x=day,y=avg_timeline_error)) +
sfig_beh_a ::geom_half_violin(position=position_nudge(0.1), side = "r", colour=NA, fill=ultimate_gray) +
gghalvesgeom_point(aes(x = as.numeric(day)-.2, y = avg_timeline_error), alpha = 0.7, fill=ultimate_gray,
position = position_jitter(width = .1, height = 0),
shape=16, size = 1) +
geom_boxplot(position = position_nudge(x = 0, y = 0), aes(x = day, y = avg_timeline_error),
width = .1, colour = "black", fill=ultimate_gray, outlier.shape = NA, outlier.size = 1) +
stat_summary(fun = mean, geom = "point", size=1, shape = 16,
position = position_nudge(.1), colour = "BLACK") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(.1), colour = "BLACK", width = 0, size = 0.5)+
ylab('timeline error')+ xlab('sequence') +
annotate(geom = "text", x = c(1.5, 2.5), y = Inf, label = 'underline(" * ")',
hjust = 0.5, vjust=1, size = 8/.pt, family=font2use, parse=TRUE) +
guides(fill = "none", color=guide_legend(override.aes=list(fill=NA, alpha = 1, size=2))) +
theme_cowplot() + theme(legend.position = "none")
sfig_beh_a
Effect of Clock Speed on Timeline Errors
Next, we test whether behavior differed between sequences as a function of clock speed (fast or slow).
# add clock speed variable
<- beh_data %>%
beh_data mutate(clock_speed = factor(ifelse((day == 1 | day ==2), "slow", "fast"), levels=c("slow", "fast")))
# average absolute errors as a function of speed (and jitter for plotting)
<- beh_data %>%
timeline_group_by_speed group_by(sub_id, clock_speed) %>%
summarise(avg_timeline_error = mean(abs(virtual_time-memory_time)), .groups = "drop") %>%
mutate(x_jit = as.numeric(clock_speed) + rep(jitter(rep(0,n_subs), amount=0.05), each=2) * rep(c(-1,1),n_subs))
# t-test between errors in slow vs. fast
<- paired_t_perm_jb(timeline_group_by_speed$avg_timeline_error[timeline_group_by_speed$clock_speed=="slow"] -
stats $avg_timeline_error[timeline_group_by_speed$clock_speed=="fast"],
timeline_group_by_speedn_perm=n_perm)
# raincloud plot
<- ggplot(data=timeline_group_by_speed, aes(x=clock_speed, y=avg_timeline_error, fill = clock_speed, color = clock_speed)) +
sfig_beh_b geom_boxplot(aes(group=clock_speed), position = position_nudge(x = 0, y = 0),
width = .1, colour = "black", outlier.shape = NA) +
scale_fill_manual(values = c("#ece7f2", "#a6bddb")) +
scale_color_manual(values = c("#ece7f2", "#a6bddb"),
name = "clock speed", labels=c("slow", "fast")) +
::geom_half_violin(data = timeline_group_by_speed %>% filter(clock_speed == "slow"),
gghalvesaes(x=clock_speed, y=avg_timeline_error),
position=position_nudge(-0.1),
side = "l", color = NA) +
::geom_half_violin(data = timeline_group_by_speed %>% filter(clock_speed == "fast"),
gghalvesaes(x=clock_speed, y=avg_timeline_error),
position=position_nudge(0.1),
side = "r", color = NA) +
stat_summary(fun = mean, geom = "point", size = 1, shape = 16,
position = position_nudge(c(-0.1, 0.1)), colour = "black") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(c(-0.1, 0.1)), colour = "black", width = 0, size = 0.5) +
geom_line(aes(x = x_jit, group=sub_id,), color = ultimate_gray,
position = position_nudge(c(0.15, -0.15))) +
geom_point(aes(x=x_jit, fill = clock_speed), position = position_nudge(c(0.15, -0.15)),
shape=16, size = 1) +
ylab('timeline error') + xlab('clock speed') +
annotate(geom = "text", x = 1.5, y = Inf, label = 'underline(" n.s. ")',
hjust = 0.5, vjust=1, size = 8/.pt, family=font2use, parse=TRUE) +
theme_cowplot() +
theme(text = element_text(size=10), axis.text = element_text(size=8),
legend.position = "none")
sfig_beh_b
Summary Statistics: paired t-test comparing timeline errors of slow vs. fast sequences t27=-0.82, p=0.423
Relationship of sorting errors and timeline errors
Correlation
To test whether performance in the two tasks is related, we run an across-subject correlation of the number of sorting errors with mean absolute timeline errors.
# make usable data frame
<- inner_join(timeline_group, day_sorting) beh_group
## Joining, by = "sub_id"
<- beh_group %>% rename(avg_timeline_error = avg_error,
beh_group n_correct_sorts = n_correct,
prcnt_correct_sorts = prcnt_correct) %>%
mutate(n_sorting_errors = n_events_day*n_days - n_correct_sorts,
perfect_sort = n_sorting_errors==0,
perfect_sort = factor(if_else(perfect_sort, "yes", "no"), levels = c("yes", "no")))
# SORTING AND TIMELINE ERRORS
# Spearman correlation for all subjects and non-perfect sorters
<- beh_group %$%
stats cor.test(n_sorting_errors, avg_timeline_error,method = "spearman") %>%
tidy()
## Warning in cor.test.default(n_sorting_errors, avg_timeline_error, method = "spearman"): Cannot compute exact p-value with ties
# scatter plot of errors in the two tasks
<- ggplot(beh_group, aes(x=n_sorting_errors, y=avg_timeline_error)) +
sfig_beh_c geom_smooth(method='lm', formula= y~x, size=1, color=ultimate_gray, fill="lightgrey") +
geom_point(shape = 16, size = 1, color=ultimate_gray) +
scale_x_continuous(breaks=seq(0,12,3)) +
xlab("sorting errors") +
ylab("timeline error") +
annotate("text", x = Inf, y = Inf, hjust="inward", vjust="inward", size=6/.pt,
label=sprintf("r=%.2f\np=%.3f", round(stats$estimate, digits = 2), round(stats$p.value, digits=3)),
family=font2use) +
theme_cowplot() +
theme(legend.position = "none",
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8))
sfig_beh_c
Spearman correlation of number of sorting errors and mean absolute timeline errors: r=0.23, p=0.246
Comparison of perfect sorters and participants with sorting errors
The correlational approach is limited because there is limited variance for sorting errors as a large group of participants performs the sorting task at ceiling and does not make sorting errors. We thus compare in a next step the timeline errors between participants who do make at least one sorting erorr and those who don’t make sorting errors using an independent t-test.
# t-test based on perfect sorter or not
<- tidy(t.test(beh_group$avg_timeline_error[beh_group$perfect_sort=="yes"],
stats $avg_timeline_error[beh_group$perfect_sort=="no"],
beh_groupvar.equal = TRUE))
<- ggplot(beh_group, aes(x=perfect_sort, y=avg_timeline_error, fill=perfect_sort, color=perfect_sort)) +
sfig_beh_d geom_boxplot(width = .1, colour = "black", outlier.shape = NA, outlier.size = 2) +
::geom_half_violin(data=beh_group %>% filter(perfect_sort=="yes"), aes(x=1, y=avg_timeline_error),
gghalvesposition=position_nudge(-0.1), side = "l", colour=NA) +
::geom_half_violin(data=beh_group %>% filter(perfect_sort=="no"), aes(x=2, y=avg_timeline_error),
gghalvesposition=position_nudge(0.1), side = "r", colour=NA) +
geom_point(aes(x=ifelse(n_sorting_errors==0, 1.2, 1.8)), alpha = 1, position = position_jitter(width = .1, height = 0), shape = 16, size = 1) +
stat_summary(fun = mean, geom = "point", size=1, shape = 16,
position = position_nudge(c(-0.1, 0.1)), colour = "BLACK") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(c(-0.1, 0.1)), colour = "BLACK", width = 0, size = 0.5) +
ylab('timeline error')+xlab('sorting errors') +
annotate(geom = "text", x = 1.5, y = Inf, label = 'underline(" n.s. ")',
hjust = 0.5, vjust=1, size = 8/.pt, family=font2use, parse=TRUE) +
scale_x_discrete(labels = c("0", "≥1")) +
scale_color_manual(values=c("#81A88D", "#972D15"), labels=c("no sorting error", "≥1 sorting error")) +
scale_fill_manual(values=c("#81A88D", "#972D15")) +
guides(color=guide_legend("sorting task", override.aes = list(shape=16, size=1, linetype=0, fill=NA, alpha=1)), fill = "none") +
theme_cowplot() +
theme(legend.position = "none", text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8))
sfig_beh_d
Summary Statistics: independent t-test comparing timeline errors for participants without and with sorting errors t26=-1.79, p=0.085, 95% CI [-0.66, 0.05]
Time metrics underlying remembered times
Using two approaches we tested whether virtual time drove participants’ responses rather than the sequence order or objectively elapsing time.
While participants were asked to reproduce the virtual time, any of the other two factors could have an impact on their behavioral responses. If participants had, for example, only memorized the order of scenes in a given day, they would probably do reasonably well on the timeline task by distributing the scenes evenly along the timeline in the correct order.
Summary Statistics
For the summary statistics approach, we ran a multiple regression analysis for each participant with virtual time, sequence position (order), and real time since the first event of a day as predictors of responses in the timeline task. To test whether virtual time indeed explained participants’ responses even when competing for variance with order and real time, included in the model as control predictors of no interest, we compared the participant-specific t-values of the resulting regression coefficients against null distributions obtained from shuffling the remembered times against the predictors 10,000 times. We converted the resulting p-values to Z-values and tested these against zero using a permutation-based t-test (two-sided; α=0.05; 10,000 random sign-flips, Figure 2E). As a measure of effect size, we report Cohen’s d with Hedges’ correction and its 95% confidence interval as computed using the effsize-package(101).
set.seed(115) # set seed for reproducibility
# do RSA using linear model and calculate z-score for model fit from permutations
<- beh_data %>% group_by(sub_id) %>%
fit # run the linear model
do(z = lm_perm_jb(in_dat = .,
lm_formula = "memory_time ~ virtual_time + event + real_time",
nsim = n_perm)) %>%
mutate(z = list(setNames(z, c("z_intercept", "z_virtual_time", "z_order", "z_real_time")))) %>%
unnest_longer(z) %>%
filter(z_id != "z_intercept") %>%
mutate(z_id = factor(z_id,
levels = c("z_virtual_time", "z_order", "z_real_time")))
# run group-level t-test for virtual time
<- fit %>%
stats filter(z_id =="z_virtual_time") %>%
select(z) %>%
paired_t_perm_jb (., n_perm = n_perm)
# Cohen's d with Hedges' correction for one sample using non-central t-distribution for CI
<-cohen.d(d=(fit %>% filter(z_id =="z_virtual_time") %>% select(z))$z, f=NA, paired=TRUE, hedges.correction=TRUE, noncentral=TRUE)
d$d <- d$estimate
stats$dCI_low <- d$conf.int[[1]]
stats$dCI_high <- d$conf.int[[2]]
stats
# print results
huxtable(stats) %>% theme_article()
estimate | statistic | p.value | p_perm | parameter | conf.low | conf.high | method | alternative | d | dCI_low | dCI_high |
---|---|---|---|---|---|---|---|---|---|---|---|
2.44 | 10.6 | 3.82e-11 | 0.0001 | 27 | 1.97 | 2.91 | One Sample t-test | two.sided | 1.95 | 1.38 | 2.7 |
# raincloud plot of the results
<- ggplot(fit, aes(x=z_id, y=z, fill = z_id, colour = z_id)) +
f2e # plot the violin
::geom_half_violin(position=position_nudge(0.1), side = "r", colour=NA) +
gghalves# single subject data points with horizontal jitter
geom_point(aes(x = as.numeric(z_id)-.2, y = z, colour = z_id), alpha = 0.7,
position = position_jitter(width = .1, height = 0), shape = 16, size = 1) +
geom_boxplot(position = position_nudge(x = 0, y = 0), aes(x = z_id, y = z),
width = .1, colour = "black", outlier.shape = NA, outlier.size = 2) +
stat_summary(fun = mean, geom = "point", size=1, shape = 16,
position = position_nudge(.1), colour = "BLACK") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(.1), colour = "BLACK", width = 0, size = 0.5) +
ylab('Z multiple regression')+xlab('time metric') +
scale_x_discrete(labels = c("virt. time", "order", "real time")) +
coord_cartesian(ylim = c(-2,4.5))+
scale_color_manual(labels = c("virtual time", "order", "real time"),values=time_colors) +
scale_fill_manual(labels = c("virtual time", "order", "real time"),values=time_colors) +
guides(fill= "none", color=guide_legend(override.aes=list(fill=NA, alpha = 1, size=2),
title = element_blank(), direction ="vertical",
title.position = "bottom")) +
annotate(geom = "text", x = 1, y = Inf, label = "***", hjust = 0.5, vjust = 1, size = 8/.pt, family=font2use) +
theme_cowplot() + theme(text = element_text(size=8), axis.text = element_text(size=8),
legend.position = c(1,1), legend.justification = c(1,1),
legend.spacing.x = unit(0, 'mm'), #legend.spacing.y = unit(2, 'mm'),
legend.key.size = unit(3,"mm"),
legend.box.margin = margin(0,0,0,0,"mm"), legend.background = element_rect(size=0))
f2e
# save source data
<-ggplot_build(f2e)$data[[2]]
source_dat ::write_tsv(source_dat %>% select(x,y, group),
readrfile = file.path(dirs$source_dat_dir, "f2e.txt"))
Summary Statistics: t-test against 0 for virtual time when order and real time are in the model
t27=10.62, p=0.000, d=1.95, 95% CI [1.38, 2.70]
Linear mixed effects
A potentially more elegant way of testing the above is to use linear mixed effects models. However, drawing statistical inferences from these data is less straight forward and care needs to be taken with respect to the precise hypotheses that are tested.
Second, we addressed this question using linear mixed effects modeling. Here, we included the three z-scored time metrics as fixed effects. Starting from a maximal random effect structure(102), we simplified the random effects structure to avoid convergence failures and singular fits. The final model included random intercepts and random slopes for virtual time for participants. The model results are visualized by dot plots showing the fixed effect parameters with their 95% confidence intervals (Supplemental Figure 4A) and marginal effects (Supplemental Figure 4B) estimated using the ggeffects package103. To assess the statistical significance (α=0.05) of virtual time above and beyond the effects of order and real time, we compared this full model to a nested model without the fixed effect of virtual time, but including order and real time, using a likelihood ratio test. Supplemental Table 1 provides an overview of the final model and the model comparison.
The main point we want to make is that virtual time explains variance above and beyond order. To test this, we run the full LMM and a reduced version of the model without virtual time. These two models are then compared using a likelihood ratio test to obtain a p-value.
Following Barr et al. (2013), we want to implement a maximal random effect structure. However, for the models to converge and to avoid singular fits, we have to simplify the random effects structure to include only random intercepts for each subject and a random slope for the effect of virtual time for each subject. Barr et al. (2013, p. 275) “propose the working assumption that it is not essential for one to specify random effects for control predictors to avoid anti-conservative inference, as long as interactions between the control predictors and the factors of interest are not present in the model”. However, “in the common case where one is interested in a minimally anti-conservative evaluation of the strength of evidence for the presence of an effect, our results indicate that keeping the random slope for the predictor of theoretical interest is important” (Barr et al., 2013, p. 276). Thus, we keep the random slopes for virtual time in the model.
More specifically, we follow these steps:
- we start with a model that includes random intercepts and slopes for all 3 fixed effects for each subject. This results in a singular fit
- Thus, we drop the random effects of real time and order.
We do not incorporate random intercepts for the individual events as this sampling unit cannot be dissociated from the predictors. See the description of the exception by Brauer & Curtin (Psych Methods, 2018, p. 13).
# center the predictor variables (set scale to true for z-score)
<- beh_data %>%
beh_data group_by(sub_id) %>%
mutate(
order_z = scale(as.numeric(event), scale = TRUE),
virtual_time_z = scale(virtual_time, scale = TRUE),
real_time_z = scale(real_time, scale = TRUE)
%>%
) ungroup()
# set RNG
set.seed(27)
# define the full model with all 3 time metrics as fixed effects and
# by-subject random intercepts and random slopes for each time metric --> singular fit
<- "memory_time ~ virtual_time_z + order_z + real_time_z + (1 + virtual_time_z + order_z + real_time_z | sub_id)"
formula <- lme4::lmer(formula, data = beh_data,
lmm_full REML = FALSE, control=lmerControl(optCtrl=list(maxfun=20000)))
## boundary (singular) fit: see ?isSingular
# remove random slopes for the fixed effects of non-interest (order and real time)
# by-subject random intercepts and random slopes for virtual time
set.seed(212) # set seed for reproducibility
<- "memory_time ~ virtual_time_z + order_z + real_time_z + (1 + virtual_time_z | sub_id)"
formula <- lme4::lmer(formula, data = beh_data,
lmm_full REML = FALSE, control=lmerControl(optCtrl=list(maxfun=20000)))
summary(lmm_full)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: memory_time ~ virtual_time_z + order_z + real_time_z + (1 + virtual_time_z | sub_id)
## Data: beh_data
## Control: lmerControl(optCtrl = list(maxfun = 20000))
##
## AIC BIC logLik deviance df.resid
## 1940.0 1974.6 -962.0 1924.0 552
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -7.3874 -0.4034 0.0429 0.4259 7.8191
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## sub_id (Intercept) 0.04928 0.2220
## virtual_time_z 0.02742 0.1656 0.23
## Residual 1.75541 1.3249
## Number of obs: 560, groups: sub_id, 28
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 14.01002 0.06996 200.252
## virtual_time_z 3.06932 0.25997 11.807
## order_z 1.66763 0.43023 3.876
## real_time_z -0.33226 0.47331 -0.702
##
## Correlation of Fixed Effects:
## (Intr) vrtl__ ordr_z
## virtul_tm_z 0.017
## order_z 0.000 -0.112
## real_time_z 0.000 -0.426 -0.841
# tidy summary of the fixed effects that calculates 95% CIs
<- broom.mixed::tidy(lmm_full, effects = "fixed", conf.int=TRUE, conf.method="profile") lmm_full_bm
## Computing profile confidence intervals ...
# tidy summary of the random effects
<- broom.mixed::tidy(lmm_full, effects = "ran_pars") lmm_full_bm_re
To test whether virtual time is relevant to explaining the data even when order and real time are in the model, we compare the full model defined above against a reduced model. In this reduced model, we do not include the fixed effect of virtual time. We compare the full against the nested (reduced) model using a likelihood ratio test.
# to test for significance let's by comparing the likelihood against a simpler model.
# Here, we drop the effect of virtual time and run an ANOVA. See e.g. Bodo Winter tutorial
# this is the current best way of testing the effect of virtual time on behavior
<- "memory_time ~ order_z + real_time_z + (1 + virtual_time_z | sub_id)" # random intercepts for each subject and random slopes for virtual time
formula set.seed(213) # set seed for reproducibility
<- lme4::lmer(formula, data = beh_data, REML = FALSE, control=lmerControl(optCtrl=list(maxfun=20000)))
lmm_no_vir_time
# because the model fails to converge with a warning that the scaled gradient at the fitted (RE)ML estimates
# is large, we restart the model as described here (https://rdrr.io/cran/lme4/man/troubleshooting.html)
# obtaining consistent results (with no warning) suggests a false positive
<- update(lmm_no_vir_time, start=getME(lmm_no_vir_time, "theta"))
lmm_no_vir_time
# run the ANOVA
<- anova(lmm_no_vir_time, lmm_full)
lmm_aov lmm_aov
npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
7 | 2.05e+03 | 2.08e+03 | -1.02e+03 | 2.04e+03 | |||
8 | 1.94e+03 | 1.97e+03 | -962 | 1.92e+03 | 116 | 1 | 4.88e-27 |
Mixed Model: Fixed effect of virtual time time with order and real time in the model of remembered times \(\chi^2\)(1)=115.95, p=0.000
Make mixed model summary table that includes overview of fixed and random effects as well as the model comparison to the nested (reduced) model.
<- c("intercept", "virtual time", "order", "real time")
fe_names <- c(rep("participant",3), "residual")
re_groups <- c("intercept", "virtual time (SD)", "correlation random intercepts and random slopes", "SD")
re_names
<- make_lme_huxtable(fix_df=lmm_full_bm,
lmm_hux ran_df = lmm_full_bm_re,
aov_mdl = lmm_aov,
fe_terms =fe_names,
re_terms = re_names,
re_groups = re_groups,
lme_form = gsub(" ", "", paste0(deparse(formula(lmm_full)),
collapse = "", sep="")),
caption = "Mixed Model: Virtual time explains constructed times with order and real time in the model")
## Warning: `select_vars()` was deprecated in dplyr 0.8.4.
## Please use `tidyselect::vars_select()` instead.
# convert the huxtable to a flextable for word export
<- convert_huxtable_to_flextable(ht = lmm_hux)
stable_lme_memtime_time_metrics
# print to screen
theme_article(lmm_hux)
fixed effects | ||||||
---|---|---|---|---|---|---|
term | estimate | SE | t-value | 95% CI | ||
intercept | 14.010019 | 0.069962 | 200.25 | 13.868056 | 14.151981 | |
virtual time | 3.069324 | 0.259967 | 11.81 | 2.558874 | 3.579774 | |
order | 1.667630 | 0.430230 | 3.88 | 0.822785 | 2.512476 | |
real time | -0.332261 | 0.473306 | -0.70 | -1.261696 | 0.597173 | |
random effects | ||||||
group | term | estimate | ||||
participant | intercept | 0.221991 | ||||
participant | virtual time (SD) | 0.232089 | ||||
participant | correlation random intercepts and random slopes | 0.165592 | ||||
residual | SD | 1.324919 | ||||
model comparison | ||||||
model | npar | AIC | LL | X2 | df | p |
reduced model | 7 | 2053.90 | -1019.95 | |||
full model | 8 | 1939.95 | -961.98 | 115.95 | 1 | 4.88e-27 |
model: memory_time~virtual_time_z+order_z+real_time_z+(1+virtual_time_z|sub_id); SE: standard error, CI: confidence interval, SD: standard deviation, npar: number of parameters, LL: log likelihood, df: degrees of freedom, corr.: correlation |
To visualize the model, we use the confidence intervals to create dot plots for the model coefficients. Further, we estimate marginal means for each fixed effect while holding the other parameters constant. We do this for the quartiles (including minimum and maximum values).
# make the time metrics a factor
<- lmm_full_bm %>%
lmm_full_bm mutate(term=as.factor(term) %>%
factor(levels = c("virtual_time_z", "order_z", "real_time_z")))
# dot plot of Fixed Effect Coefficients with CIs
<- ggplot(data = lmm_full_bm[2:4,], aes(x = term, color = term)) +
sfigmm_a geom_hline(yintercept = 0, colour="black", linetype="dotted") +
geom_errorbar(aes(ymin = conf.low, ymax = conf.high, width = NA), size = 0.5) +
geom_point(aes(y = estimate), size = 1, shape = 16) +
scale_fill_manual(values = time_colors) +
scale_color_manual(values = time_colors, labels = c("virtual time (same seq.)", "order", "real time"), name = element_blank()) +
labs(#title = "Fixed Effect Coefficients",
x = element_blank(), y="fixed\neffect estimate", color = element_blank()) +
theme_cowplot() +
#coord_fixed(ratio = 3) +
theme(plot.title = element_text(hjust = 0.5), axis.text.x=element_blank()) +
guides(fill = "none", color=guide_legend(override.aes=list(fill=NA, alpha = 1, size=2, linetype=0)))+
annotate(geom = "text", x = 1, y = Inf, label = "***", hjust = 0.5, vjust = 1, size = 8/.pt, family=font2use)
sfigmm_a
# estimate marginal means for each model term by omitting the terms argument
<- ggeffects::ggpredict(lmm_full, ci.lvl = 0.95) %>% get_complete_df
lmm_full_emm # convert the group variable to a factor to control the order of facets below
$group <- factor(lmm_full_emm$group, levels = c("virtual_time_z", "order_z", "real_time_z"))
lmm_full_emm
# plot marginal means
<- ggplot(data = lmm_full_emm, aes(color = group)) +
sfigmm_b geom_line(aes(x, predicted)) +
geom_ribbon(aes(x, ymin = conf.low, ymax = conf.high, fill = group), alpha = .3, linetype=0) +
scale_color_manual(values = time_colors, name=element_blank(),
labels = c("virtual time", "order", "real time")) +
scale_fill_manual(values = time_colors, labels = c("Virtual Time", "Order", "Real Time")) +
ylab('estimated\nmarginal means') +
xlab('z time metric') +
scale_y_continuous(breaks = c(9, 14, 19), labels = c(9, 14, 19)) +
guides(fill = "none", color = "none") +
theme_cowplot() +
theme(plot.title = element_text(hjust = 0.5), strip.background = element_blank(),
strip.text.x = element_blank())
sfigmm_b
LME model assumptions
lmm_diagplots_jb(lmm_full)
- Absence of collinearity: The different time metrics are correlated. This correlation is slightly reduced by z-scoring the predictors. In any case, these correlations are inherent to the design and not really a problem as long as estimated coefficients are interpreted correctly. For more info, see e.g. this opinion piece.
Compose Figure for Behavioral Data
This figure consists of the plots showing the results of the sorting task and the timeline task. It will probably be figure 2 of the manuscript.
= "
layout AABBBBBBDDDDDD
AABBBBBBDDDDDD
AABBBBBBDDDDDD
AABBBBBBDDDDDD
AABBBBBBDDDDDD
AABBBBBBDDDDDD
CCBBBBBBEEEEEE
CCBBBBBBEEEEEE
CCBBBBBBEEEEEE
CCBBBBBBEEEEEE
CCBBBBBBEEEEEE
CCBBBBBBEEEEEE"
<- f2a + f2b + f2c + f2d + f2e +
f2 plot_layout(design = layout, guides = "keep") &
theme(text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8)
&
) plot_annotation(theme = theme(plot.margin = margin(t=0, r=0, b=0, l=-5, unit="pt")),
tag_levels = 'A')
# save as png and pdf and print to screen
<- here("figures", "f2")
fn ggsave(paste0(fn, ".pdf"), plot=f2, units = "cm",
width = 18, height = 16, dpi = "retina", device = cairo_pdf)
ggsave(paste0(fn, ".png"), plot=f2, units = "cm",
width = 18, height = 16, dpi = "retina", device = "png")
5.3 Generalization bias
If participants use structural knowledge about the sequences when constructing times of events, then we might expect biases in their behavior: Errors in constructed event times could be non-random. Specifically, when constructing the time of one specific event, participants could be biased in their response by the times of the events from other sequences at that sequence position. This would indicate that knowledge about the other sequences in generalized to influence specific mnemonic constructions, resulting in a bias.
Quantify relative time of other events
To explore whether structural knowledge about general time patterns biases the construction of event times, we assessed errors in remembered event times. Specifically, when constructing the time of one specific event, participants could be biased in their response by the times of the events from other sequences at that sequence position. For each event, we quantified the average time of events in the other sequences at the same sequence position (Figure 8A). For example, for the fourth event of the first sequence, we calculated the average time of the fourth events of sequences two, three and four.
To test for such a bias, we quantify, for each event, the relative time of the other events at that sequence position. We then calculate by how much virtual time each individual event time deviates from the average virtual time of the other events at that sequence position (e.g. the difference in virtual time for event 1 from sequence 1 compared to the average virtual time of events 1 from sequences 2-4). We do this so that positive values of this relative time measure indicate that the other events happened later than the event of interest.
# quantify the deviation in virtual time for each event relative to other events
# at the same sequence position
$rel_time_other_events<-NA
beh_data
for (i_day in 1:n_days){
for (i_event in 1:n_events_day){
# find the events at this sequence position from all four sequences
<- beh_data[beh_data$event == i_event,]
curr_events
# find the average virtual time of the other events at this sequence position
<- curr_events %>%
avg_vir_time_other_events filter(day != i_day) %>%
summarise(avg_vir_time = mean(virtual_time)) %>%
select(avg_vir_time) %>%
as.numeric(.)
# for the given event, store the relative time of the other events,
# as the difference in virtual time (positive values mean other events happen later)
<- beh_data$day==i_day & beh_data$event==i_event
event_idx $rel_time_other_events[event_idx] <- avg_vir_time_other_events - beh_data$virtual_time[event_idx]
beh_data
} }
Test Generalization Bias
We then asked whether the deviation between the average time of other events and an event’s true virtual time was systematically related to signed errors in constructed event times. A positive relationship between the relative time of other events and time construction errors indicates that, when other events at the same sequence position are relatively late, participants are biased to construct a later time for a given event than when the other events took place relatively early.
The crucial test is then whether over- and underestimates of remembered time can be explained by this deviation measure.
# calculate timeline error so that positive numbers mean overestimates (later than true virtual time)
<- beh_data %>%
beh_data mutate(timeline_error=memory_time-virtual_time)
Summary statistics
In the summary statistics approach, we ran a linear regression for each participant (Figure 8B, Supplemental Figure 10A) and tested the resulting coefficients for statistical significance using the permutation-based procedures described above (Figure 8C).
In the summary statistics approach, we run a linear regression model for each participant and test the resulting coefficients against a permutation-based null distribution. The resulting z-scores are then tested against 0 on the group level.
# SUMMARY STATISTICS
set.seed(117) # set seed for reproducibility
# test for generalization bias using linear model and calculate z-score for model fit from permutations
<- beh_data %>% group_by(sub_id) %>%
fit_beh_bias # run the linear model (also without permutation tests to get betas)
do(model = lm(timeline_error ~ rel_time_other_events, data=.),
z = lm_perm_jb(in_dat = .,
lm_formula = "timeline_error ~ rel_time_other_events",
nsim = n_perm)) %>%
# store beta estimates and their z-values
mutate(beta_rel_time_other_events = coef(summary(model))[2,"Estimate"],
t_rel_time_other_events = coef(summary(model))[2,"t value"],
#p_rel_time_other_events = coef(summary(model))[2,"Pr(>|t|)"],
z = list(setNames(z, c("z_intercept", "rel_time_other_events")))) %>%
# get rid of intercept z-values and model column
unnest_longer(z) %>%
filter(z_id != "z_intercept") %>%
select (.,-c(model))
# add Pearson correlation (for plot annotation only)
$r <- beh_data %>%
fit_beh_biasgroup_by(sub_id) %>%
summarise(r = cor(rel_time_other_events, timeline_error)) %>%
pull(r)
## `summarise()` ungrouping output (override with `.groups` argument)
# run group-level t-tests on the RSA fits from the first level in aHPC for within-day
<- fit_beh_bias %>%
stats filter(z_id =="rel_time_other_events") %>%
select(z) %>%
paired_t_perm_jb (., n_perm = n_perm)
# Cohen's d with Hedges' correction for one sample using non-central t-distribution for CI
<-cohen.d(d=fit_beh_bias$z, f=NA, paired=TRUE, hedges.correction=TRUE, noncentral=TRUE)
d$d <- d$estimate
stats$dCI_low <- d$conf.int[[1]]
stats$dCI_high <- d$conf.int[[2]]
stats
# print results
huxtable(stats) %>% theme_article()
estimate | statistic | p.value | p_perm | parameter | conf.low | conf.high | method | alternative | d | dCI_low | dCI_high |
---|---|---|---|---|---|---|---|---|---|---|---|
1.38 | 5.32 | 1.3e-05 | 0.0001 | 27 | 0.849 | 1.92 | One Sample t-test | two.sided | 0.976 | 0.552 | 1.48 |
t27=5.32, p=0.000, d=0.98, 95% CI [0.55, 1.48]
We observe a significant positive effect of the virtual time of other events at the same sequence position on remembered virtual time. That means that when other events at the same sequence position are later than a given event, participants are likely to overestimate the event time. Conversely, when the other events at this sequence position relatively early, participants underestimate. This demonstrates an across-sequence effect of virtual time. Virtual time is generalized across sequences to bias remembered times at similar sequence positions.
To illustrate this effect, lets plot the data for one example subject (data from all subjects will be plotted below). For this, we pick a subject with an average fit.
# pick average subject based on t-value of regression
<- fit_beh_bias$sub_id[sort(fit_beh_bias$t_rel_time_other_events, decreasing = TRUE)[n_subs/2]==fit_beh_bias$t_rel_time_other_events]
example_sub <- beh_data %>% filter(sub_id==example_sub)
example_dat
<- ggplot(example_dat, aes(x=rel_time_other_events, y=timeline_error)) +
f8b geom_smooth(method='lm', formula= y~x,
color=aHPC_colors["across_main"],
fill=aHPC_colors["across_main"])+
geom_point(size = 1, shape = 16) +
scale_x_continuous(breaks = c(-2.5, 0, 2.5), labels= c("-2.5", "0", "2.5")) +
xlab("relative time\nof other events") +
ylab("timeline error") +
annotate("text", x=Inf, y=Inf, hjust="inward", vjust="inward", size=6/.pt,
label=sprintf("r=%.2f", fit_beh_bias$r[fit_beh_bias$sub_id==example_sub]),
family=font2use) +
theme_cowplot() +
theme(strip.background = element_blank(),
strip.text = element_blank(),
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8))
f8b
# save source data
<-ggplot_build(f8b)$data[[2]]
source_dat ::write_tsv(source_dat %>% select(x,y),
readrfile = file.path(dirs$source_dat_dir, "f8b.txt"))
To visualize the effect from the summary statistics approach on the group-level, we create a raincloud plot of the z-values from the linear model permutations for each subject.
# plot regression z-values
<-ggplot(fit_beh_bias, aes(x=as.factor(1),y=z), colour= aHPC_colors["across_main"], fill=aHPC_colors["across_main"]) +
f8c# plot the violin
::geom_half_violin(position=position_nudge(0.1),
gghalvesside = "r", fill=aHPC_colors["across_main"], color = NA) +
geom_point(aes(x = 1-.2), alpha = 0.7,
position = position_jitter(width = .1, height = 0),
shape=16, colour=aHPC_colors["across_main"], size = 1)+
geom_boxplot(width = .1, fill=aHPC_colors["across_main"], colour = "black", outlier.shape = NA) +
stat_summary(fun = mean, geom = "point", size = 1, shape = 16,
position = position_nudge(.1), colour = "black") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(.1), colour = "black", width = 0, size = 0.5)+
ylab('Z regression') + xlab('generalization\n bias') +
annotate(geom = "text", x = 1, y = Inf, label = "***", hjust = 0.5, vjust = 1, size = 8/.pt, family=font2use) +
guides(color = "none", fill = "none")+
theme_cowplot() +
theme(axis.text.x = element_blank())
f8c
# save source data
<-ggplot_build(f8c)$data[[2]]
source_dat ::write_tsv(source_dat %>% select(x,y),
readrfile = file.path(dirs$source_dat_dir, "f8c.txt"))
Mixed Model
Further, we analyzed these data using the linear mixed model approach (Supplemental Figure 4OP, Supplemental Table 16).
We also want to this effect using a linear mixed model. We use the maximal random effects structure with random intercepts and random slopes for participants.
Let’s fit the model and get tidy summaries.
# z-score the relative time of other events for each participant
<- beh_data %>%
beh_data group_by(sub_id) %>%
mutate(rel_time_other_events_z = scale(rel_time_other_events)) %>%
ungroup()
# Fit the model
set.seed(245) # set seed for reproducibility
<- lme4::lmer("timeline_error ~ rel_time_other_events_z + (1+rel_time_other_events_z|sub_id)",
lmm_full data=beh_data, REML=FALSE)
# tidy summary of the fixed effects that calculates 95% CIs
<- broom.mixed::tidy(lmm_full, effects = "fixed", conf.int=TRUE, conf.method="profile") lmm_full_bm
## Computing profile confidence intervals ...
# tidy summary of the random effects
<- broom.mixed::tidy(lmm_full, effects = "ran_pars") lmm_full_bm_re
Compare against a reduced model without the fixed effect of interest.
# fit reduced model
set.seed(248) # set seed for reproducibility
<- lme4::lmer("timeline_error ~ 1 + (1+rel_time_other_events_z|sub_id)",
lmm_reduced data=beh_data, REML=FALSE)
<-anova(lmm_full, lmm_reduced)
lmm_aov lmm_aov
npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
5 | 1.96e+03 | 1.98e+03 | -974 | 1.95e+03 | |||
6 | 1.94e+03 | 1.97e+03 | -965 | 1.93e+03 | 17.9 | 1 | 2.32e-05 |
Mixed Model: Fixed effect of relative time of other events on timeline errors
\(\chi^2\)(1)=17.90, p=0.000
Make summary table.
<- c("intercept", "relative time other events")
fe_names <- c(rep("participant",3), "residual")
re_groups <- c("intercept", "relative time other events (SD)", "correlation random intercepts and random slopes", "SD")
re_names
<- make_lme_huxtable(fix_df=lmm_full_bm,
lmm_hux ran_df = lmm_full_bm_re,
aov_mdl = lmm_aov,
fe_terms =fe_names,
re_terms = re_names,
re_groups = re_groups,
lme_form = gsub(" ", "", paste0(deparse(formula(lmm_full)),
collapse = "", sep="")),
caption = "Mixed Model: Behavioral generalization bias")
# convert the huxtable to a flextable for word export
<- convert_huxtable_to_flextable(ht = lmm_hux)
stable_lme_beh_gen_bias
# print to screen
theme_article(lmm_hux)
fixed effects | ||||||
---|---|---|---|---|---|---|
term | estimate | SE | t-value | 95% CI | ||
intercept | -0.352481 | 0.069962 | -5.04 | -0.494444 | -0.210518 | |
relative time other events | 0.337262 | 0.067360 | 5.01 | 0.200579 | 0.473945 | |
random effects | ||||||
group | term | estimate | ||||
participant | intercept | 0.220016 | ||||
participant | relative time other events (SD) | -0.114173 | ||||
participant | correlation random intercepts and random slopes | 0.183681 | ||||
residual | SD | 1.331485 | ||||
model comparison | ||||||
model | npar | AIC | LL | X2 | df | p |
reduced model | 5 | 1958.57 | -974.29 | |||
full model | 6 | 1942.67 | -965.34 | 17.90 | 1 | 2.32e-05 |
model: timeline_error~rel_time_other_events_z+(1+rel_time_other_events_z|sub_id); SE: standard error, CI: confidence interval, SD: standard deviation, npar: number of parameters, LL: log likelihood, df: degrees of freedom, corr.: correlation |
To visualize the mixed model we create a dot plot of the fixed effect coefficient and the estimated marginal means.
# dot plot of Fixed Effect Coefficients with CIs
<- ggplot(data = lmm_full_bm[2,], aes(x = term, color = term)) +
sfigmm_o geom_hline(yintercept = 0, colour="black", linetype="dotted") +
geom_errorbar(aes(ymin = conf.low, ymax = conf.high, width = NA), size = 0.5) +
geom_point(aes(y = estimate), size = 1, shape = 16) +
scale_fill_manual(values = unname(aHPC_colors["across_main"])) +
scale_color_manual(values = unname(aHPC_colors["across_main"]), labels = c("across sequence bias")) +
labs(x = element_blank(), y="fixed\neffect estimate") +
theme_cowplot() +
theme(plot.title = element_text(hjust = 0.5), axis.text.x=element_blank()) +
guides(color = "none", fill = "none")+
annotate(geom = "text", x = 1, y = Inf, label = "***", hjust = 0.5, vjust = 1, size = 8/.pt, family=font2use)
# estimate marginal means for each model term by omitting the terms argument
<- ggeffects::ggpredict(lmm_full, ci.lvl = 0.95) %>% get_complete_df
lmm_full_emm
# plot marginal means
<- ggplot(data = lmm_full_emm, aes(color = group)) +
sfigmm_p geom_line(aes(x, predicted)) +
geom_ribbon(aes(x, ymin = conf.low, ymax = conf.high, fill = group), alpha = .3, linetype=0) +
scale_color_manual(values = unname(aHPC_colors["across_main"]), name=element_blank()) +
scale_fill_manual(values = unname(aHPC_colors["across_main"])) +
scale_x_continuous(breaks = c(-1.5, 0, 1.5), labels= c("-1.5", "0", "1.5")) +
ylab('estimated\nmarginal means') +
xlab('relative time\nof other events') +
guides(fill = "none", color = "none") +
theme_cowplot()
+sfigmm_p sfigmm_o
Lastly, we show that the effect is quite visible from the single-subject plots: The slopes of the least-squares lines are, on average, positive.
# plot the relationship for each subject
<- ggplot(beh_data, aes(x=rel_time_other_events, y=timeline_error)) +
sfig_bias_single_sub geom_smooth(method='lm', formula= y~x,
color=aHPC_colors["across_main"],
fill=aHPC_colors["across_main"])+
geom_point(size = 1, shape = 16) +
geom_text(data = fit_beh_bias, aes(label = sprintf(" r=%.2f", round(r,digits = 2))),
x=-Inf, y=Inf, hjust="inward", vjust="inward", size=6/.pt,
family=font2use) +
scale_x_continuous(breaks = c(-2.5, 0, 2.5), labels= c("-2.5", "0", "2.5")) +
facet_wrap(~sub_id, scales="free_y", nrow=4) +
xlab("relative time of other events (virtual hours)") +
ylab("timeline error") +
theme_cowplot() +
theme(strip.background = element_blank(),
strip.text = element_blank(),
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8))
sfig_bias_single_sub
5.4 Replication of Generalization Bias
To replicate the results from this exploratory analysis, we conducted the same analysis in an independent group of participants. These participants (n=46) constituted the control groups of a behavioral experiment testing the effect of stress induction on temporal memory(66). They underwent the same learning task as described above with the only difference being the duration of this learning phase (4 rather than 7 mini-blocks of training). The timeline task was administered on the day after learning. The procedures are described in detail in Montijn et al.(66). The data from this independent sample are shown in Figure 8D and Supplemental Figure 10B.
# load data from CSV
<- file.path(dirs$data4analysis, "beh_dataNDM.txt")
fname <- cols_only(
col_types_list sub_id = col_factor(),
day = col_factor(),
event = col_factor(),
virtual_time = col_double(),
memory_time = col_double()
)<- as_tibble(read_csv(fname, col_types = col_types_list))
beh_data_replication
head(beh_data_replication)
sub_id | day | event | virtual_time | memory_time |
---|---|---|---|---|
P001 | 1 | 1 | 9.25 | 9.21 |
P001 | 1 | 2 | 11.8 | 11.1 |
P001 | 1 | 3 | 14.8 | 16.1 |
P001 | 1 | 4 | 16.2 | 15 |
P001 | 1 | 5 | 18.2 | 18.4 |
P001 | 2 | 1 | 9.5 | 8.5 |
Replication data: Quantify relative time of other events
To test for such a bias, we quantify, for each event, the relative time of the other events at that sequence position. We then calculate by how much virtual time each individual event time deviates from the average virtual time of the other events at that sequence position (e.g. the difference in virtual time for event 1 from sequence 1 compared to the average virtual time of events 1 from sequences 2-4). We do this so that positive values of this relative time measure indicate that the other events happened later than the event of interest.
# quantify the deviation in virtual time for each event relative to other events
# at the same sequence position
<- beh_data_replication %>%
beh_data_replication add_column(rel_time_other_events=NA)
for (i_day in 1:n_days){
for (i_event in 1:n_events_day){
# find the events at this sequence position from all four sequences
<- beh_data_replication[beh_data_replication$event == i_event,]
curr_events
# find the average virtual time of the other events at this sequence position
<- curr_events %>%
avg_vir_time_other_events filter(day != i_day) %>%
summarise(avg_vir_time = mean(virtual_time)) %>%
select(avg_vir_time) %>%
as.numeric(.)
# for the given event, store the relative time of the other events,
# as the difference in virtual time (positive values mean other events happen later)
<- beh_data_replication$day==i_day & beh_data_replication$event==i_event
event_idx $rel_time_other_events[event_idx] <- avg_vir_time_other_events - beh_data_replication$virtual_time[event_idx]
beh_data_replication
} }
Replication data: Test Generalization Bias
The crucial test is then whether over- and underestimates of remembered time can be explained by this deviation measure.
# calculate timeline error so that positive numbers mean overestimates (later than true virtual time)
<- beh_data_replication %>%
beh_data_replication mutate(timeline_error=memory_time-virtual_time)
Replication data: Summary statistics
In the summary statistics approach, we run a linear regression model for each participant and test the resulting coefficients against a permutation-based null distribution. The resulting z-scores are then tested against 0 on the group level. Both of the tests rely on the custom stats function defined above.
# SUMMARY STATISTICS
set.seed(120) # set seed for reproducibility
# test generalization bias using linear model and calculate z-score for model fit from permutations
<- beh_data_replication %>% group_by(sub_id) %>%
fit_beh_bias_replication # run the linear model (also without permutation tests to get betas)
do(model = lm(timeline_error ~ rel_time_other_events, data=.),
z = lm_perm_jb(in_dat = .,
lm_formula = "timeline_error ~ rel_time_other_events",
nsim = n_perm)) %>%
# store beta estimates and their z-values
mutate(beta_rel_time_other_events = coef(summary(model))[2,"Estimate"],
t_rel_time_other_events = coef(summary(model))[2,"t value"],
z = list(setNames(z, c("z_intercept", "rel_time_other_events")))) %>%
# get rid of intercept z-values and model column
unnest_longer(z) %>%
filter(z_id != "z_intercept") %>%
select (.,-c(model))
# add Pearson correlation (for plot annotation only)
$r <- beh_data_replication %>%
fit_beh_bias_replicationgroup_by(sub_id) %>%
summarise(r = cor(rel_time_other_events, timeline_error)) %>%
pull(r)
## `summarise()` ungrouping output (override with `.groups` argument)
# run group-level t-tests on the RSA fits from the first level in aHPC for within-day
<- fit_beh_bias_replication %>%
stats filter(z_id =="rel_time_other_events") %>%
select(z) %>%
paired_t_perm_jb (., n_perm = n_perm)
# Cohen's d with Hedges' correction for one sample using non-central t-distribution for CI
<-cohen.d(d=fit_beh_bias_replication$z, f=NA, paired=TRUE, hedges.correction=TRUE, noncentral=TRUE) d
## Warning in pt(q = t, df = df, ncp = x): full precision may not have been achieved in 'pnt{final}'
## Warning in pt(q = t, df = df, ncp = x): full precision may not have been achieved in 'pnt{final}'
$d <- d$estimate
stats$dCI_low <- d$conf.int[[1]]
stats$dCI_high <- d$conf.int[[2]]
stats
# print results
huxtable(stats) %>% theme_article()
estimate | statistic | p.value | p_perm | parameter | conf.low | conf.high | method | alternative | d | dCI_low | dCI_high |
---|---|---|---|---|---|---|---|---|---|---|---|
1.66 | 11.3 | 9.76e-15 | 0.0001 | 45 | 1.37 | 1.96 | One Sample t-test | two.sided | 1.64 | 1.23 | 2.13 |
We observe a significant positive effect of the virtual time of other events at the same sequence position on remembered virtual time. That means that when other events at the same sequence position are later than a given event, participants are likely to overestimate the event time. Conversely, when the other events at this sequence position relatively early, participants underestimate. This demonstrates an across-sequence effect of virtual time. Virtual time is generalized across sequences to bias remembered times at similar sequence positions.
To visualize this effect from the summary statistics approach, we create a raincloud plot of the z-values from the linear model permutations for each subject.
# plot regression z-values
<- ggplot(fit_beh_bias_replication, aes(x=as.factor(1),y=z), colour= aHPC_colors["across_main"], fill=aHPC_colors["across_main"]) +
f8d # plot the violin
::geom_half_violin(position=position_nudge(0.1),
gghalvesside = "r", fill=aHPC_colors["across_main"], color = NA) +
geom_point(aes(x = 1-.2), alpha = 0.7,
position = position_jitter(width = .1, height = 0),
shape=16, colour=aHPC_colors["across_main"], size = 1)+
geom_boxplot(width = .1, fill=aHPC_colors["across_main"], colour = "black", outlier.shape = NA) +
stat_summary(fun = mean, geom = "point", size = 1, shape = 16,
position = position_nudge(.1), colour = "black") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(.1), colour = "black", width = 0, size = 0.5)+
ylab('Z regression') + xlab('generalization\nbias') +
annotate(geom = "text", x = 1, y = Inf, label = "***", hjust = 0.5, vjust = 1, size = 8/.pt, family=font2use) +
guides(color = "none", fill = "none")+
theme_cowplot() +
theme(axis.text.x = element_blank())
f8d
# save source data
<-ggplot_build(f8d)$data[[2]]
source_dat ::write_tsv(source_dat %>% select(x,y),
readrfile = file.path(dirs$source_dat_dir, "f8d.txt"))
Replication data: Mixed Model
We also want to this effect using a linear mixed model. We use the maximal random effects structure with random intercepts and random slopes for participants.
Let’s fit the model and get tidy summaries.
# z-score the relative time of other events for each participant
<- beh_data_replication %>%
beh_data_replication group_by(sub_id) %>%
mutate(rel_time_other_events_z = scale(rel_time_other_events)) %>%
ungroup()
# Fit the model with full random effect structure -> singular
<- lme4::lmer("timeline_error ~ rel_time_other_events_z + (1+rel_time_other_events_z|sub_id)",
lmm_full data=beh_data_replication, REML=FALSE)
## boundary (singular) fit: see ?isSingular
# fit again after removing correlation of slope and intercept
<- lme4::lmer("timeline_error ~ rel_time_other_events_z + (1+rel_time_other_events_z||sub_id)",
lmm_full data=beh_data_replication, REML=FALSE)
## boundary (singular) fit: see ?isSingular
# fit again after removing correlation of also random intercept (only random slopes left)
set.seed(278) # set seed for reproducibility
<- lme4::lmer("timeline_error ~ rel_time_other_events_z + (0+rel_time_other_events_z|sub_id)",
lmm_full data=beh_data_replication, REML=FALSE)
## boundary (singular) fit: see ?isSingular
# tidy summary of the fixed effects that calculates 95% CIs
<- broom.mixed::tidy(lmm_full, effects = "fixed", conf.int=TRUE, conf.method="profile") lmm_full_bm
## Computing profile confidence intervals ...
# tidy summary of the random effects
<- broom.mixed::tidy(lmm_full, effects = "ran_pars") lmm_full_bm_re
Compare against a reduced model without the fixed effect of interest.
# fit reduced model
set.seed(221) # set seed for reproducibility
<- lme4::lmer("timeline_error ~ 1 + (0+rel_time_other_events_z|sub_id)",
lmm_reduced data=beh_data_replication, REML=FALSE)
<-anova(lmm_full, lmm_reduced)
lmm_aov lmm_aov
npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
3 | 4.5e+03 | 4.52e+03 | -2.25e+03 | 4.5e+03 | |||
4 | 4.45e+03 | 4.47e+03 | -2.22e+03 | 4.44e+03 | 53.7 | 1 | 2.29e-13 |
Mixed Model: Fixed effect of relative time of other events on timeline errors
\(\chi^2\)(1)=53.74, p=0.000
Make summary table.
<- c("intercept", "relative time other events")
fe_names <- c(rep("participant",1), "residual")
re_groups <- c("relative time other events (SD)", "SD")
re_names
<- make_lme_huxtable(fix_df=lmm_full_bm,
lmm_hux ran_df = lmm_full_bm_re,
aov_mdl = lmm_aov,
fe_terms =fe_names,
re_terms = re_names,
re_groups = re_groups,
lme_form = gsub(" ", "", paste0(deparse(formula(lmm_full)),
collapse = "", sep="")),
caption = "Mixed Model: Behavioral generalization bias (replication)")
# convert the huxtable to a flextable for word export
<- convert_huxtable_to_flextable(ht = lmm_hux)
stable_lme_beh_gen_bias_replication
# print to screen
theme_article(lmm_hux)
fixed effects | ||||||
---|---|---|---|---|---|---|
term | estimate | SE | t-value | 95% CI | ||
intercept | -0.320564 | 0.089155 | -3.60 | -0.495488 | -0.145640 | |
relative time other events | 0.863631 | 0.091472 | 9.44 | 0.684152 | 1.043110 | |
random effects | ||||||
group | term | estimate | ||||
participant | relative time other events (SD) | 0.000000 | ||||
residual | SD | 2.704218 | ||||
model comparison | ||||||
model | npar | AIC | LL | X2 | df | p |
reduced model | 3 | 4501.04 | -2247.52 | |||
full model | 4 | 4449.30 | -2220.65 | 53.74 | 1 | 2.29e-13 |
model: timeline_error~rel_time_other_events_z+(0+rel_time_other_events_z|sub_id); SE: standard error, CI: confidence interval, SD: standard deviation, npar: number of parameters, LL: log likelihood, df: degrees of freedom, corr.: correlation |
To visualize the mixed model we create a dot plot of the fixed effect coefficient and the estimated marginal means.
# dot plot of Fixed Effect Coefficients with CIs
<- ggplot(data = lmm_full_bm[2,], aes(x = term, color = term)) +
sfigmm_q geom_hline(yintercept = 0, colour="black", linetype="dotted") +
geom_errorbar(aes(ymin = conf.low, ymax = conf.high, width = NA), size = 0.5) +
geom_point(aes(y = estimate), size = 1, shape = 16) +
scale_fill_manual(values = unname(aHPC_colors["across_main"])) +
scale_color_manual(values = unname(aHPC_colors["across_main"]), labels = c("across sequence bias")) +
labs(x = element_blank(), y="fixed\neffect estimate") +
theme_cowplot() +
theme(plot.title = element_text(hjust = 0.5), axis.text.x=element_blank()) +
guides(color = "none", fill = "none")+
annotate(geom = "text", x = 1, y = Inf, label = "***", hjust = 0.5, vjust = 1, size = 8/.pt, family=font2use)
# estimate marginal means for each model term by omitting the terms argument
<- ggeffects::ggpredict(lmm_full, ci.lvl = 0.95) %>% get_complete_df
lmm_full_emm
# plot marginal means
<- ggplot(data = lmm_full_emm, aes(color = group)) +
sfigmm_r geom_line(aes(x, predicted)) +
geom_ribbon(aes(x, ymin = conf.low, ymax = conf.high, fill = group), alpha = .3, linetype=0) +
scale_color_manual(values = unname(aHPC_colors["across_main"]), name=element_blank()) +
scale_fill_manual(values = unname(aHPC_colors["across_main"])) +
scale_x_continuous(breaks = c(-1.5, 0, 1.5), labels= c("-1.5", "0", "1.5")) +
ylab('estimated\nmarginal means') +
xlab('relative time\nof other events') +
guides(fill = "none", color = "none") +
theme_cowplot()
+sfigmm_r sfigmm_q
Lastly, we show that the effect is quite visible from the single-subject plots: The slopes of the least-squares lines are, on average, positive.
# plot the relationship for each subject
<- ggplot(beh_data_replication, aes(x=rel_time_other_events, y=timeline_error)) +
sfig_bias_replication_single_sub geom_smooth(method='lm', formula= y~x,
color=aHPC_colors["across_main"],
fill=aHPC_colors["across_main"])+
geom_point(size = 1, shape = 16) +
geom_text(data = fit_beh_bias_replication, aes(label = sprintf(" r=%.2f", round(r,digits = 2))),
x=-Inf, y=Inf, hjust="inward", vjust="inward", size=6/.pt,
family=font2use) +
facet_wrap(~sub_id, scales="free_y", ncol=7) +
scale_x_continuous(breaks = c(-2.5, 0, 2.5), labels= c("-2.5", "0", "2.5")) +
xlab("relative time of other events (virtual hours)") +
ylab("timeline error") +
theme_cowplot() +
theme(strip.background = element_blank(),
strip.text = element_blank(),
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8))
= "
layout A
A
A
A
B
B
B
B
B
B"
<- sfig_bias_single_sub + sfig_bias_replication_single_sub +
sfig_bias_single_sub_both_samples plot_layout(design = layout, guides = "keep") &
theme(text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8),
legend.position = "none"
&
) plot_annotation(tag_levels="A",
theme = theme(plot.margin = margin(t=0, r=0, b=0, l=-5, unit="pt")))
# save as png and pdf and print to screen
<- here("figures", "sf10")
fn ggsave(paste0(fn, ".pdf"), plot=sfig_bias_single_sub_both_samples, units = "cm",
width = 17.4, height = 22.5, dpi = "retina", device = cairo_pdf)
ggsave(paste0(fn, ".png"), plot=sfig_bias_single_sub_both_samples, units = "cm",
width = 17.4, height = 22.5, dpi = "retina", device = "png")
5.5 Swap Errors
Given that participants’ behavior indicates that they generalize across sequences, we next also explore whether there might be systematic errors in the sorting task as well. Specifically, it could be that events belonging to the same sequence position are swapped.
In an exploratory analysis, we searched for systematic errors in the sorting task. Specifically, we looked for swap errors where participants interchanged events occurring at the same position between two or more sequences.
We begin by defining a function that identifies swap errors based on the sorting task data and logic. The function takes a dataframe as an input that corresponds to the sorting task data of one participant with one row per event image. The expected columns are:
- day: true day (1:4)
- event: sequence position of the image (1:5)
- sorted day: day the event was sorted to
- correct_sort: logical indicating whether event was sorted to correct day or not.
<- function(in_df = tibble(day = rep(1:4, each=5), event = rep(1:5, times=4), sorted_day = rep(1:4, each=5), correct_sort = day==sorted_day)){
find_swap_errors
<- in_df
out_df $swap_err <- FALSE
out_df
for (i_event in 1:5){
<- sum(in_df$day[in_df$event==i_event] == in_df$sorted_day[in_df$event==i_event])
n_correct <- n_distinct(in_df$sorted_day[in_df$event==i_event])
n_unique_days
# if the four events are sorted to only two days and no event is sorted correctly,
# there is a swap between the events that objectively belong to the two days that
# all events were sorted to --> 2 swap errors
if (n_correct==0 & n_unique_days==2){
<- unique(in_df$sorted_day[in_df$event==i_event & in_df$correct_sort==FALSE])
sorted_days <- in_df$event == i_event & in_df$day %in% sorted_days
swap_errs_idx $swap_err[swap_errs_idx] <- TRUE
out_dfstopifnot(2==sum(swap_errs_idx))
}
# if the four events are sorted to three days and no event is sorted correctly,
# there is a three-way swap error. The event belonging to the day that no event was sorted too is the error that is not a swap.
if (n_correct==0 & n_unique_days==3){
<- unique(in_df$sorted_day[in_df$event==i_event & in_df$correct_sort==FALSE])
sorted_days <- in_df$event == i_event & in_df$day %in% sorted_days # selects the 3 events that objectively belong to the days that events were incorrectly sorted to
swap_errs_idx $swap_err[swap_errs_idx] <- TRUE
out_dfstopifnot(3==sum(swap_errs_idx))
}
# if the four events are sorted to four days and no event is sorted correctly,
# all four events have been swapped
if (n_correct==0 & n_unique_days==4){
<- in_df$event == i_event
swap_errs_idx $swap_err[swap_errs_idx] <- TRUE
out_df
}
# if the four events are sorted to three days and one event is sorted correctly,
# there are multiple possibilities that need to be tested.
if (n_correct==1 & n_unique_days==3){
# There can't be a swap error for the event(s) sorted to the day where the correct sort was.
# So let's only look at the other days. This means we are looking at two or three events,
# depending on whether an incorrect sort occurred to the day whose event was sorted correctly or not
<- in_df$day[in_df$event==i_event & in_df$correct_sort == TRUE]
correctly_sorted_day <- in_df[event == i_event & in_df$sorted_day != correctly_sorted_day,]
curr_dat
# if there are two events left, we check for both events whether they were sorted to
# the correct day of the respective other event. If yes, the two were swapped!
if (nrow(curr_dat)==2){
if(curr_dat$day[1]==curr_dat$sorted_day[2] & curr_dat$day[2]==curr_dat$sorted_day[1]){
<- (in_df$event == i_event & in_df$day %in% curr_dat$day)
swap_errs_idx $swap_err[swap_errs_idx] <- TRUE
out_df
}
}
# if there are three events left, two of the three are swapped.
# we compare all pairwise comparisons of sorted and true days to find out for which pair
# the true days match the sorted days
if (nrow(curr_dat)==3){
<- combinat::combn(3,2)
comps <- which(curr_dat$day[comps[1,]]==curr_dat$sorted_day[comps[2,]])
swap_comp1 <- which(curr_dat$day[comps[2,]]==curr_dat$sorted_day[comps[1,]])
swap_comp2 <- comps[, intersect(swap_comp1, swap_comp2)]
swap_rows <- (in_df$event == i_event & in_df$day %in% curr_dat$day[swap_rows])
swap_errs_idx $swap_err[swap_errs_idx] <- TRUE
out_df
}
}
# if the four events are sorted to four days and only one event is sorted correctly,
# there is a three-way swap between the incorrectly sorted events
if (n_correct==1 & n_unique_days==4){
<- (in_df$event == i_event & in_df$correct_sort == FALSE)
swap_errs_idx $swap_err[swap_errs_idx] <- TRUE
out_df
}
# if the four events are sorted to four days and two events are sorted correctly,
# there is a swap between the two incorrectly sorted events
if (n_correct==2 & n_unique_days==4){
<- which(in_df$event==i_event & in_df$correct_sort == FALSE)
swap_errs_idx $swap_err[swap_errs_idx] <- TRUE
out_df
}
}return(out_df)
}
Let’s apply this function to the data of each participant.
# add variable denoting if event was sorted correctly
<- beh_data %>%
beh_data mutate(correct_sort = as.numeric(day==sorted_day),
swap_err = FALSE)
for (i_sub in as.numeric(subjects)){
# if a subject has sorting errors (i.e. not all 20 events were correctly sorted) run function to identify swap errors
if(sum(beh_data$correct_sort[beh_data$sub_id==i_sub])<20){
<- find_swap_errors(beh_data %>% filter(sub_id == i_sub))
sub_dat $swap_err[beh_data$sub_id==i_sub]<-sub_dat$swap_err
beh_data
} }
Descriptives of swap errors
Let’s see how many swap errors were made and plot a histogram of swap errors as a function of sorting errors.
# sum of swap errors for each participant
$n_swap_errs <- beh_data %>%
beh_groupgroup_by(sub_id) %>%
summarise(n_swap_errs=sum(swap_err), .groups="drop") %>%
pull(n_swap_errs)
# print descriptive stats
sprintf("%d of %d participants made at least one sorting error", sum(beh_group$n_correct_sorts<20), nrow(beh_group))
## [1] "14 of 28 participants made at least one sorting error"
sprintf("Of the %d participants with sorting errors, %d made swap errors (mean±S.D: %.1f±%.1f, %.1f%%±%.1f%% of sorting errors were swap errors)",
sum(beh_group$n_correct_sorts<20),
sum(beh_group$n_swap_errs[beh_group$n_correct_sorts<20]>0),
mean(beh_group$n_swap_errs[beh_group$n_correct_sorts<20]),
sd(beh_group$n_swap_errs[beh_group$n_correct_sorts<20]),
mean(beh_group$n_swap_errs[beh_group$n_correct_sorts<20]/beh_group$n_sorting_errors[beh_group$n_correct_sorts<20])*100,
sd(beh_group$n_swap_errs[beh_group$n_correct_sorts<20]/beh_group$n_sorting_errors[beh_group$n_correct_sorts<20])*100)
## [1] "Of the 14 participants with sorting errors, 12 made swap errors (mean±S.D: 3.1±2.1, 57.5%±34.3% of sorting errors were swap errors)"
# histogram of number swap errors
<- ggplot(beh_group, aes(x=n_swap_errs, group=perfect_sort, fill=perfect_sort)) +
sfig_beh_e geom_histogram(position=position_dodge(width = 0.2),binwidth = 1, alpha=1) +
scale_y_continuous(name = "count", breaks = seq(0,14,2)) +
xlab("swap errors") +
scale_fill_manual(values=c("#81A88D", "#972D15"),
name=element_blank(),
labels = c("no sorting\nerror", "≥1 sorting\nerror")) +
theme_cowplot() +
theme(legend.position = c(1,1), legend.key.size = unit(0.15, "cm"),
legend.justification = c(1,1),
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8),
legend.spacing.x = unit(0.2, 'cm'))
sfig_beh_e
Swap errors as a function of sequence position
Next, we test whether the distribution of swap errors across sequence positions differs from uniformity.
We used a χ2-test to assess whether the number of swap errors deviated from uniformity across sequence positions.
# DISTRIBUTION OF SWAP ERRORS
# test whether frequencies of swap errors for the sequence positions differs from uniformity
<- beh_data %>% filter(swap_err==TRUE) %>%
stats group_by(event) %>%
summarise(n_swap_errs = sum(swap_err), .groups="drop") %>%
pull(n_swap_errs) %>%
chisq.test() %>%
tidy()
# histogram of swap errors as a function of sequence position
<- ggplot(beh_data %>% filter(swap_err==TRUE), aes(x=event, fill = event)) +
sfig_beh_f geom_bar(stat="count") +
scale_fill_manual(values = event_colors) +
scale_y_continuous(name = "swap error count", breaks = seq(0,12,2), labels = seq(0,12,2)) +
theme_cowplot() +
theme(legend.position = "none",
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8))
sfig_beh_f
\(\chi^2\) test against uniformity for distribution of swap errors: \(\chi^2\)=1.07, p=0.899
Swap error: More frequent than expected from chance?
So we do observe swap errors, but are they more frequent than what would be expected from chance given the number of sorting errors in our data? To test that, we run a permutation test where we generate a fake sample of participants that has the same number of swap errors as the real participants, but swap errors are introduced at random positions.
To test whether participants made more swap errors than expected from chance we ran a permutation test where we introduced sorting errors for randomly selected events. For each of 10 000 iterations, we generated a surrogate sample of sorting results with the number of randomly introduced sorting errors matching the number of errors made by the different participants in our sample. We then quantified the proportion of swap errors across this surrogate sample. This resulted in a distribution of the proportion of swap errors that would be expected from random sorting errors. We assessed how many permutations yielded proportions of swap errors larger or equal to the proportion of swap errors observed in the fMRI sample to compute a p-value and further quantified a z-value as the difference between the observed swap error proportion and the mean of the chance distribution divided by the standard deviation of the chance distribution.
# tibble with a perfect sorting result
<- tibble(day = rep(1:n_days,each=5),
perfect_sort event = rep(1:n_events_day, 4),
sorted_day = rep(1:n_days, each=5),
sort_err = FALSE)
# from the data, find the participants with sorting errors and their number of swap errors
<- beh_group$n_sorting_errors[beh_group$n_sorting_errors>0]
n_sort_errs <- beh_group$n_swap_errs[beh_group$n_sorting_errors>0]
n_swap_errs
# matrix in which to store output
<- matrix(NA, nrow=length(n_sort_errs), ncol=n_perm)
n_swap_errs_perm <- matrix(NA, nrow=length(n_sort_errs), ncol=n_perm)
prop_swap_errs_perm
for (i_perm in 1:n_perm){ # takes ~30min with 10000 shuffles
for (i_sub in 1:length(n_sort_errs)){
<- FALSE
found_rand_sort_errs while(found_rand_sort_errs == FALSE){ # shuffle given number of sorts until we end up with the correct number of errors
# determine a random subset of trials as sorting errors (number of errors based on actual number of errors of current subject)
<- perfect_sort
error_sort $sort_err[sample(x=n_days*n_events_day,size=n_sort_errs[i_sub])] <- TRUE
error_sort
# shuffle the sorted days of the inserted sorting errors
$sorted_day[error_sort$sort_err] <- sample(error_sort$sorted_day[error_sort$sort_err])
error_sort$correct_sort <- error_sort$sorted_day == error_sort$day
error_sort
# break loop if we end up with the correct number of sorting errors
# (number of errors can be smaller if the above sample assigns to the correct day by chance)
if(sum(!error_sort$correct_sort)==n_sort_errs[i_sub]){
<- TRUE
found_rand_sort_errs
}
}
# find swap errors
<- find_swap_errors(in_df = error_sort)
error_sort
# count swap for this permutation subject and store
<- sum(error_sort$swap_err)
n_swap_errs_perm[i_sub, i_perm] <- sum(error_sort$swap_err)/n_sort_errs[i_sub]
prop_swap_errs_perm[i_sub, i_perm] # subject
} # permutation
}
# make tibble of average proportion of swap errors
<- tibble(prop_swap_errs = colMeans(prop_swap_errs_perm))
perm_dist
# get p and z-value
<- sum(c(perm_dist$prop_swap_errs, mean(n_swap_errs/n_sort_errs)) >= mean(n_swap_errs/n_sort_errs))/
p_sort_perm length(c(perm_dist$prop_swap_errs, mean(n_swap_errs/n_sort_errs)))
<- (mean(n_swap_errs/n_sort_errs)-mean(perm_dist$prop_swap_errs))/sd(perm_dist$prop_swap_errs) z_sort_perm
Permutation test for proportion of swap errors given the number of sorting errors: z=5.07, p=0.000
Plot histogram of the chance distribution.
# histogram of average proportions
<- as.character(expression(95^{th}*" percentile"))
prctile <- ggplot(perm_dist, aes(x=prop_swap_errs)) +
sfig_beh_g geom_histogram(aes(y=..density..), binwidth = 0.01, fill = ultimate_gray) +
annotate(geom = "line", x = quantile(perm_dist$prop_swap_errs, probs=0.95), y=c(0,1.5), color="BLACK") +
geom_label(aes(x=quantile(prop_swap_errs, probs=0.95), y=1.5, label = prctile),
hjust=0.5, vjust=0, size=6/.pt, fill = "white", alpha=0.5, color="BLACK", parse = TRUE,
label.padding = unit(0.1, "lines"), label.size = 1/.pt,) +
annotate(geom = "line", x = mean(n_swap_errs/n_sort_errs), y=c(0,3), color="#972D15") +
geom_label(aes(x=Inf, y=3, label="% swap errors\nin fMRI sample"),
label.padding = unit(0.1, "lines"), label.size = 1/.pt,
hjust=1, vjust=0.5, size=6/.pt, fill = "white", alpha=0.5, color="#972D15") +
scale_x_continuous(labels = scales::percent) +
xlab("% swaps of sorting errors") +
ylab("% of shuffles") +
theme_cowplot() +
theme(legend.position = "none", text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8))
sfig_beh_g
Swap errors and generalization bias
Both swap errors and the generalization bias point towards subjects generalizing across sequences in memory. Let’s test if the two are related across subjects.
We tested whether the number of swap errors was related to absolute errors in the timeline task (see below) using Spearman’s correlation and a t-test for independent samples.
<- inner_join(beh_group, fit_beh_bias) beh_group
## Joining, by = "sub_id"
# CORRELATION OF SWAP ERRORS AND GENERALIZATION BIAS
# Spearman correlation
<- beh_group %$%
stats cor.test(n_swap_errs, beta_rel_time_other_events,method = "spearman") %>%
tidy()
## Warning in cor.test.default(n_swap_errs, beta_rel_time_other_events, method = "spearman"): Cannot compute exact p-value with ties
# scatter plot of swap errors and generalization bias
<- ggplot(beh_group, aes(x=n_swap_errs, y=beta_rel_time_other_events)) +
sfig_beh_h geom_smooth(method='lm', formula= y~x, size=1, color=ultimate_gray, fill="lightgrey") +
geom_point(shape = 16, size = 1, color=ultimate_gray) +
scale_x_continuous(breaks=seq(0,7,2)) +
xlab("swap errors") +
ylab("generalization bias") +
annotate("text", x=Inf, y=Inf, hjust="inward", vjust="inward", size=6/.pt,
label=sprintf("r=%.2f\np=%.3f", round(stats$estimate, digits=2), round(stats$p.value, digits=3)),
family=font2use) +
theme_cowplot() +
theme(legend.position = "none",
text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8))
sfig_beh_h
Spearman correlation of number of swap errors and generalization bias: r=-0.12, p=0.528
Again, this correlation is limited by the fact that many subject don’t make sorting errors and thus cannot make any swap errors. So let’s contrast participants with and without swap errors.
# t-test based on swap error or not
<- beh_group %>% mutate(min_1swap = factor(if_else(n_swap_errs>0, "yes", "no"), levels = c("no", "yes")))
beh_group <- tidy(t.test(beh_group$beta_rel_time_other_events[beh_group$min_1swap=="yes"],
stats $beta_rel_time_other_events[beh_group$min_1swap=="no"],
beh_groupvar.equal = TRUE))
<- ggplot(beh_group, aes(x=min_1swap, y=beta_rel_time_other_events, fill=min_1swap, color=min_1swap)) +
sfig_beh_i geom_boxplot(width = .1, colour = "black", outlier.shape = NA, outlier.size = 2) +
::geom_half_violin(data=beh_group %>% filter(min_1swap=="yes"), aes(x=2, y=beta_rel_time_other_events),
gghalvesposition=position_nudge(0.1), side = "r", colour=NA) +
::geom_half_violin(data=beh_group %>% filter(min_1swap=="no"), aes(x=1, y=beta_rel_time_other_events),
gghalvesposition=position_nudge(-0.1), side = "l", colour=NA) +
geom_point(aes(x=ifelse(min_1swap=="yes", 1.8, 1.2)), alpha = 1, position = position_jitter(width = .1, height = 0), shape = 16, size = 1) +
stat_summary(fun = mean, geom = "point", size=1, shape = 16,
position = position_nudge(c(-0.1, 0.1)), colour = "BLACK") +
stat_summary(fun.data = mean_se, geom = "errorbar",
position = position_nudge(c(-0.1, 0.1)), colour = "BLACK", width = 0, size = 0.5) +
ylab('generalization bias')+xlab('swap errors') +
annotate(geom = "text", x = 1.5, y = Inf,
label = 'underline(" n.s. ")', hjust = 0.5, vjust=1, size = 8/.pt, family=font2use, parse=TRUE) +
scale_x_discrete(labels = c("0", "≥1")) +
scale_color_manual(values=c("#81A88D", "#972D15"), labels=c("no swap error", "≥1 swap error")) +
scale_fill_manual(values=c("#81A88D", "#972D15")) +
guides(color = guide_legend("swap errors", override.aes = list(shape=16, size=1, linetype=0, fill=NA, alpha=1)), fill = "none") +
theme_cowplot() +
theme(legend.position = "none", text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=8),
legend.title=element_text(size=8))
sfig_beh_i
Summary Statistics: independent t-test comparing generalization bias for participants without and with swap errors t26=0.18, p=0.861, 95% CI [-0.18, 0.22]
We can now make the Supplemental Figure with all the behavioral supplemental plots.
# make new supplemental figure
<- "
layout AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDD
EEEEEEFFFFGGGGGGGGGGHHHHHHHHHHIIIIIIIIII"
<- sfig_beh_a + sfig_beh_b + sfig_beh_c + sfig_beh_d + sfig_beh_e+ sfig_beh_f + sfig_beh_g + sfig_beh_h + sfig_beh_i +
sfig plot_layout(design = layout) +
plot_annotation(tag_levels = "A") &
theme(text = element_text(size=10, family=font2use),
axis.text = element_text(size=8),
legend.text=element_text(size=6),
legend.title=element_text(size=8))
5]] <- sfig[[5]] + theme(legend.spacing.x = unit(0.05, "cm"))
sfig[[
# save
<- here("figures", "sf03")
fn ggsave(paste0(fn, ".pdf"), plot=sfig, units = "cm",
width = 17.4, height = 11, dpi = "retina", device = cairo_pdf)
ggsave(paste0(fn, ".png"), plot=sfig, units = "cm",
width = 17.4, height = 11, dpi = "retina", device = "png")