This page explores potential associations between information available at diagnosis, or shortly thereafter, and cluster membership. A descriptive analysis of baseline variables is described in a previous page.
For univariate analyses, continuous data have been analysed via ANOVA, and categorical data have been analysed using chi-squared or Fisher’s exact test as appropriate. Time-to-event data have been analysed using log-rank tests of Kaplan-Meier curves.
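As a rough illustration of the three kinds of univariate test mentioned above, the following sketch runs each on simulated data. The data frame `toy` and all of its columns are hypothetical, and the rule used to choose between the chi-squared and Fisher's exact test (any expected cell count below 5) is an assumption rather than the criterion used on this page.

```r
library(survival) # recommended package shipped with standard R installations

set.seed(1)
n <- 120
# Hypothetical data: cluster label, two covariates and a time-to-event outcome
toy <- data.frame(
  cluster = factor(sample(paste0("C", 1:3), n, replace = TRUE)),
  age     = rnorm(n, mean = 40, sd = 12),
  sex     = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  time    = rexp(n, rate = 0.2),
  event   = rbinom(n, size = 1, prob = 0.6)
)

# Continuous covariate: one-way ANOVA across clusters
p_age <- summary(aov(age ~ cluster, data = toy))[[1]][["Pr(>F)"]][1]

# Categorical covariate: chi-squared test, or Fisher's exact test when any
# expected cell count is below 5 (assumed rule)
tab <- table(toy$cluster, toy$sex)
expected <- suppressWarnings(chisq.test(tab))$expected
p_sex <- if (any(expected < 5)) {
  fisher.test(tab)$p.value
} else {
  chisq.test(tab)$p.value
}

# Time-to-event outcome: log-rank test comparing clusters
lr <- survdiff(Surv(time, event) ~ cluster, data = toy)
p_time <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

c(age = p_age, sex = p_sex, time = p_time)
```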
Multivariate analyses were also performed to potentially adjust for confounding factors.
As faecal calprotectin (FC) and C-reactive protein (CRP) were analysed independently, this page is split into FC and CRP sections.
Faecal calprotectin analysis
Merge subject-level metadata with model-derived quantities
Here, we create a data frame that combines individual-level information (e.g. age at diagnosis, sex) with model-derived quantities, such as the posterior probabilities of class assignment. To facilitate visualisation, we also create discretised versions for some variables.
Uncertainty in cluster assignment probabilities
First, we calculate the proportion of individuals assigned to each cluster with probability above 0.5.
p1 <- myDF.fc %>%
  group_by(class_order_original) %>%
  summarise(
    prop50 = 100 * mean(probmax_original > 0.5),
    prop75 = 100 * mean(probmax_original > 0.75)
  ) %>%
  ggplot(aes(x = class_order_original, y = prop50)) +
  ylim(c(0, 100)) +
  xlab("Assigned cluster") +
  ylab("% assigned with prob > 0.5") +
  geom_bar(stat = "identity", fill = "#F9DC5C", color = "#C6AB00") +
  theme_minimal()
p1
Next, we calculate average posterior probabilities of cluster assignment.
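The two summaries described here can be sketched on a simulated posterior probability matrix. The objects `post`, `assigned` and `probmax` are hypothetical stand-ins and not the objects used on this page; the sketch only assumes that each subject has one probability per cluster and that these sum to one.

```r
set.seed(2)
n <- 200
G <- 3

# Hypothetical posterior probabilities: one row per subject, rows sum to 1
raw <- matrix(rexp(n * G), nrow = n)
post <- raw / rowSums(raw)
colnames(post) <- paste0("C", 1:G)

# Assigned cluster is the one with the highest posterior probability
assigned <- factor(
  colnames(post)[max.col(post, ties.method = "first")],
  levels = colnames(post)
)
probmax <- apply(post, 1, max)

# Percentage of subjects per assigned cluster with probability above 0.5
prop50 <- 100 * tapply(probmax > 0.5, assigned, mean)

# Mean posterior probability of each cluster (columns), by assigned cluster (rows)
mean_post <- apply(post, 2, function(p) tapply(p, assigned, mean))

prop50
round(mean_post, 2)
```

Each row of `mean_post` sums to one, so it can be drawn directly as a stacked bar for the corresponding assigned cluster.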
Figure 1 shows how cluster assignment probabilities change as follow-up for subjects increases. As one would expect, probabilities typically increase as follow-up increases. This relationship appears to depend upon when the mean trajectory for the assigned cluster substantially differs from the other clusters. FC8 shows high posterior probabilities with even short follow-up as this is the only cluster with low FC at diagnosis. However, longer follow-ups are required to distinguish other clusters. For example, individuals assigned to FC6 that have a short follow-up (< 4 years from diagnosis) have, on average, a high probability of being assigned to FC3 instead ( versus ). This is not unexpected, as FC3 and FC6 share similar patterns within the first 2 years.
# Assign level order otherwise alphanumerical order used
# and add sample sizes to labels
myDF.fc_means <- myDF.fc_means %>%
  mutate(
    prob_order_original = factor(prob_order_original,
      levels = paste0("prob_order_original_", 1:8),
      labels = paste0("FC", seq(1, 8))
    ),
    class_order_original = factor(class_order_original,
      levels = paste0("FC", 1:8),
      labels = paste0(
        "Assigned to FC", 1:8, "\n n = ",
        as.vector(table(myDF.fc$class_order_original))
      )
    )
  )
fc_fup <- myDF.fc_means %>%
  ggplot(aes(fill = prob_order_original, y = value, x = followup_cut)) +
  geom_bar(position = "fill", stat = "identity") +
  facet_wrap(. ~ class_order_original, ncol = 4) +
  theme_minimal() +
  theme(
    legend.title = element_text(hjust = 0.5),
    strip.background = element_rect(
      color = "lightgray", linewidth = 1.5, linetype = "solid"
    )
  ) +
  labs(
    x = "Follow-up cutoff (years)",
    y = "",
    fill = "Mean posterior\nprobability of\ncluster assignment"
  ) +
  scale_fill_viridis_d(option = "inferno")
fc_fup
ggsave("plots/fc-prob-fup.png", fc_fup, width = 11, height = 8, units = "in")
ggsave("plots/fc-prob-fup.pdf", fc_fup, width = 11, height = 8, units = "in")
Figure 1: Demonstration of how mean posterior probabilities of cluster assignment change with follow-up and assigned cluster.
Associations with respect to cluster assignments
This section displays descriptive plots to summarize marginal associations between cluster assignments and individual-level covariates. We also explore univariate and multivariate associations with respect to cluster assignment using multinomial logistic regression. As a sensitivity analysis, we also consider restricting the analysis to only consider individuals whose class assignment was less uncertain (with posterior probability > 0.5).
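The modelling step and the sensitivity analysis described above can be sketched as follows. This assumes `nnet::multinom` as the fitting function; the helper `mlrPlot` used on this page may rely on a different implementation. The data frame `toy`, its columns and the Wald-type intervals are illustrative assumptions only.

```r
library(nnet) # recommended package shipped with standard R installations

set.seed(3)
n <- 300
# Hypothetical data: covariates, assigned cluster and maximum posterior probability
toy <- data.frame(
  age     = rnorm(n, mean = 40, sd = 12),
  sex     = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  cluster = factor(sample(paste0("C", 1:3), n, replace = TRUE)),
  probmax = runif(n, min = 0.34, max = 1)
)
toy$cluster <- relevel(toy$cluster, ref = "C1")

# Main analysis: all individuals
fit_all <- multinom(cluster ~ age + sex, data = toy, trace = FALSE)

# Sensitivity analysis: only individuals assigned with probability above 0.5
fit_sens <- multinom(
  cluster ~ age + sex,
  data = subset(toy, probmax > 0.5),
  trace = FALSE
)

# Odds ratios relative to the reference cluster, with Wald 95% intervals
or_table <- function(fit) {
  est <- coef(fit)
  se <- summary(fit)$standard.errors
  list(
    or    = exp(est),
    lower = exp(est - 1.96 * se),
    upper = exp(est + 1.96 * se)
  )
}
or_all <- or_table(fit_all)
or_sens <- or_table(fit_sens)
round(or_all$or, 2)
```

Comparing `or_all` with `or_sens` indicates whether conclusions are sensitive to uncertain cluster assignments.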
For all individuals
p_diagnosis_all <- myDF.fc %>%
  plotCat("diagnosis", class = "class_order")
p_sex_all <- myDF.fc %>%
  plotCat("sex", class = "class_order")
p_age_all <- myDF.fc %>%
  ggplot(aes(x = class_order, y = age)) +
  geom_violin(fill = "#5DB7DE", color = "#434371") +
  geom_boxplot(width = 0.1, color = "black", alpha = 0.2) +
  theme_minimal() +
  xlab("Cluster") +
  ylab("Age at diagnosis")
p_mlr_all <- myDF.fc %>%
  mutate(class_order = relevel(class_order, ref = "FC1")) %>%
  mlrPlot(var = c("diagnosis", "age", "sex"), class = "class_order")
temp.1 <- myDF.fc %>%
  filter(class_order == "FC1")
temp.2 <- myDF.fc %>%
  filter(class_order != "FC1")
perc.fc1 <- round(sum(temp.1$diagnosis == "Crohn's Disease") / nrow(temp.1) * 100, 1)
<- round(sum(temp.2$diagnosis == "Crohn's Disease") / nrow(temp.2) * 100, 1)
Here, we consider associations with respect to information available at diagnosis: age, sex and IBD type.
62.9% of subjects in FC1 have Crohn’s disease whilst 50.9% of subjects in the other clusters have Crohn’s disease.
p_mlr_all <- myDF.fc %>%
  mutate(class_order = relevel(class_order, ref = "FC1")) %>%
  mlrPlot(
    var = c("diagnosis", "age", "sex"),
    class = "class_order",
    extern = dk
  )
p <- (wrap_elements(p_age_all) + p_mlr_all$plot_everything$age &
  theme(legend.position = "none")) /
  (wrap_elements(p_sex_all) + p_mlr_all$plot_everything$sexMale +
    plot_layout(guides = "collect") &
    theme(legend.position = "bottom")) +
  plot_annotation(tag_levels = "A") +
  plot_layout(widths = c(1, 1)) &
  theme(plot.tag = element_text(face = "bold", size = 22))
ggsave("plots/associations/fc-sex-age.pdf", p, width = 12, height = 12, units = "in")
print(p)
Crohn’s disease only
For CD patients, we also consider additional phenotyping information. This includes the following information:
Smoking status is recorded as a binary (Yes/No) variable and is primarily based on self-reporting. As such, it may not necessarily reflect true smoking status. Smoking was missing for approximately 5% of CD patients in the FC cohort.
Montreal location
Montreal location refers to where gastrointestinal inflammation is present and is categorised as:
L1: Ileal, limited to the ileum which is the final segment of the small intestine.
L2: Colonic, limited to the colon/large intestine.
L3: Ileocolonic, inflammation is present in both the ileum and colon.
Montreal location was missing for approximately 1% of CD patients in the FC cohort.
Montreal behaviour
Montreal behaviour describes another clinical phenotype and is defined as follows:
B1: Inflammatory, in other words non-stricturing and non-penetrating
B2: Stricturing, where the formation of fibrosis leads to the narrowing of the intestine.
B3: Penetrating, where the inflammation causes the formation of fistulas or abscesses.
Due to small numbers, B2 and B3 are merged into a single group (complicated CD) when analysing Montreal behaviour.
myDF.fc <- myDF.fc %>%
  mutate(
    Behaviour_merged = plyr::mapvalues(Behaviour,
      from = c("B1", "B2", "B3", NA),
      to = c("B1", "B2 or B3", "B2 or B3", NA)
    )
  )
Montreal behaviour was missing for approximately 2% of CD patients in the FC cohort.
Upper GI inflammation
Upper GI inflammation refers to any gastrointestinal inflammation further up than the ileum. Usually, upper inflammation is considered a modifier for Montreal location and is denoted L4. Upper GI inflammation (L4) was missing for a high proportion of CD individuals in the FC cohort (approximately 33%). This is because the required investigations are only carried out where upper GI inflammation is suspected. As such, we have manually mapped missing L4 values as “No” (i.e. no upper GI inflammation for the associated patients).
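A minimal sketch of this recoding is shown below, assuming that L4 is stored as a character column with `NA` for missing values. The data frame `toy` and the column `L4_filled` are hypothetical.

```r
# Hypothetical L4 values, with NA where no upper GI investigation was recorded
toy <- data.frame(
  L4 = c("Yes", NA, "No", NA, "Yes"),
  stringsAsFactors = FALSE
)

# Treat missing L4 as absence of upper GI inflammation
toy$L4_filled <- ifelse(is.na(toy$L4), "No", toy$L4)

table(toy$L4_filled)
```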
Perianal disease is considered a modifier for Montreal behaviour and is a severe complication of Crohn’s disease involving inflammation around the anus.
Perianal disease status was missing for approximately 1% of CD patients in the FC cohort.
NOTE: For the purposes of the multinomial logistic regression model, individuals with missing values in any of these variables will be excluded. For consistency, such individuals will also be excluded from the univariate summary plots.
For this purpose, we create a missingness indicator (missingN_cd), which makes it straightforward to apply this filter.
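One way such an indicator could be built is sketched below. It assumes that `missingN_cd` counts the number of missing phenotype variables per individual, which is suggested by its name and by the `missingN_cd == 0` filter used on this page, but is not confirmed here. The data frame `toy` and the choice of variables are hypothetical.

```r
# Hypothetical CD phenotype data with some missing values
toy <- data.frame(
  Smoke     = c("Yes", NA, "No", "No"),
  Location  = c("L1", "L2", NA, "L3"),
  Behaviour = c("B1", "B2", NA, "B1"),
  Perianal  = c("No", "No", "Yes", "No")
)

# Count missing values per individual across the variables entering the model
vars <- c("Smoke", "Location", "Behaviour", "Perianal")
toy$missingN_cd <- rowSums(is.na(toy[, vars]))

# Keep only individuals with complete information
complete <- toy[toy$missingN_cd == 0, ]
complete
```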
Ulcerative colitis only
The additional phenotyping information available for UC patients consists of:
Smoking status is defined in the same way as for CD patients.
Smoking was missing for approximately 6% of UC patients in the FC cohort.
NOTE: As for CD cases, individuals with missing values in any of these variables will be excluded from the association analysis. For consistency, such individuals will also be excluded from the univariate summary plots.
For this purpose, we create a missingness indicator (missingN_uc), which makes it straightforward to apply this filter.
p_sex_uc <- myDF.fc %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  plotCat("sex", class = "class_order")
p_age_uc <- myDF.fc %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  ggplot(aes(x = class_order, y = age)) +
  geom_violin(fill = "#5DB7DE", color = "#434371") +
  geom_boxplot(width = 0.1, color = "black", alpha = 0.2) +
  theme_minimal() +
  xlab("Cluster") +
  ylab("Age at diagnosis")
p_smoke_uc <- myDF.fc %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  mutate(Smoke = ifelse(!, Smoke, "Missing")) %>%
  plotCat("Smoke", class = "class_order")
p_extent_uc <- myDF.fc %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  mutate(Extent = ifelse(!, Extent, "Missing")) %>%
  plotCat("Extent", class = "class_order")
p_mlr_uc <- myDF.fc %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  mutate(class_order = relevel(class_order, ref = "FC1")) %>%
  mlrPlot(
    var = c("age", "sex", "Smoke", "Extent"),
    class = "class_order",
    extern = dk.fc.uc
  )
Note that, due to small counts of extent E1 in cluster FC3, the model fitted after excluding those with low probability of cluster assignment was subject to numerical issues (complete separation). As such, the associated estimates are excluded from the plot.
Advanced therapy use
Summary statistics of AT use
Here, we focus on advanced therapy (AT) use within the observation period (i.e. seven years since diagnosis). Overall, we observe significant differences in AT use across clusters. In particular, after adjusting for age and sex, AT use was significantly lower in FC2.
myDF.fc <- myDF.fc %>%
  mutate(AT_7Y = ifelse(AT == 1 & AT_line_1 <= 7, 1, 0))
p_AT <- myDF.fc %>%
  mutate(AT_7Y = factor(AT_7Y)) %>%
  plotCat("AT_7Y", class = "class_order")
p_AT_mlr <- myDF.fc %>%
  mutate(AT_7Y = factor(AT_7Y)) %>%
  mutate(class_order_original = relevel(class_order_original, ref = "FC1")) %>%
  mlrPlot(
    var = c("age", "sex", "AT_7Y"),
    class = "class_order_original"
  )
wrap_elements(p_AT) + p_AT_mlr$plot_both$AT
p_AT_1Y <- myDF.fc %>%
  mutate(AT_1Y = factor(AT_1Y)) %>%
  plotCat("AT_1Y", class = "class_order_original")
p_AT_1Y_cd <- myDF.fc %>%
  subset(diagnosis == "Crohn's Disease") %>%
  plotCat("AT_1Y", class = "class_order_original")
# Stored under a separate name so the UC plot does not overwrite the CD plot
p_AT_1Y_uc <- myDF.fc %>%
  subset(diagnosis == "Ulcerative Colitis") %>%
  plotCat("AT_1Y", class = "class_order_original")
We also generate a censored version for AT_1Y where lack of AT is treated as a right censored observation at seven years.
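The censoring step can be sketched as follows. The variable names `AT`, `AT_line_1`, `AT_7Y` and `AT_line_1_cens` follow the code on this page, but the toy data and the exact construction of `AT_line_1_cens` are assumptions: subjects without AT within seven years are taken to be right-censored at seven years.

```r
library(survival) # recommended package shipped with standard R installations

# Hypothetical data: AT indicator and time (years) from diagnosis to first AT
toy <- data.frame(
  AT        = c(1, 1, 0, 1),
  AT_line_1 = c(0.5, 8.2, NA, 3.0)
)
horizon <- 7

# Event indicator: AT received within the seven-year observation period
toy$AT_7Y <- ifelse(toy$AT == 1 & toy$AT_line_1 <= horizon, 1, 0)

# Censored time: time to first AT if observed, otherwise censored at seven years
toy$AT_line_1_cens <- ifelse(toy$AT_7Y == 1, toy$AT_line_1, horizon)

# Kaplan-Meier estimate; one minus survival gives the cumulative proportion on AT
km <- survfit(Surv(AT_line_1_cens, AT_7Y) ~ 1, data = toy)
cum_at <- 1 - km$surv
data.frame(time = km$time, cum_at = cum_at)
```

In this toy example, the subject starting AT after 8.2 years and the subject who never received AT are both censored at seven years.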
At present, we cannot show cumulative advanced therapy usage in this document as there are fewer than five subjects within at least one cluster-IBD type stratum. In the meantime, it is possible to view these plots in our manuscript which has digitally removed any strata with fewer than five subjects.
km.df <- data.frame(
  time = numeric(),
  cumhaz = numeric(),
  class = character(),
  diag = character()
)
for (g in 1:8) {
  # Calculate cumulative
  <- myDF.fc %>%
    filter(class_order_original == paste0("FC", g)) %>%
    filter(diagnosis == "Crohn's Disease")
  temp.uc <- myDF.fc %>%
    filter(class_order_original == paste0("FC", g)) %>%
    filter(diagnosis == "Ulcerative Colitis")
  km <- survfit(Surv(AT_line_1_cens, AT_7Y) ~ 1, data<-rbind(
    km.df,
    data.frame(
      time = km$time,
      cumhaz = 1 - km$surv,
      class = paste0("FC", g, ", CD=", nrow(, "; UC=", nrow(temp.uc)),
      diag = "Crohn's disease"
    )
  )
  km <- survfit(Surv(AT_line_1_cens, AT_7Y) ~ 1, data = temp.uc)
  km.df <- rbind(
    km.df,
    data.frame(
      time = km$time,
      cumhaz = 1 - km$surv,
      class = paste0("FC", g, ", CD=", nrow(, "; UC=", nrow(temp.uc)),
      diag = "Ulcerative colitis"
    )
  )
  temp.all <- myDF.fc %>%
    filter(class_order_original == paste0("FC", g))
  km <- survfit(Surv(AT_line_1_cens, AT_7Y) ~ 1, data = temp.all)
  km.df <- rbind(
    km.df,
    data.frame(
      time = km$time,
      cumhaz = 1 - km$surv,
      class = paste0("FC", g, ", CD=", nrow(, "; UC=", nrow(temp.uc)),
      diag = "All"
    )
  )
}
p1 <- km.df %>%
  subset(diag != "All") %>%
  ggplot(aes(x = time, y = cumhaz)) +
  geom_line(aes(color = diag), lty = 1, lwd = 1.2) +
  facet_wrap(~class, ncol = 2) +
  theme_minimal() +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
  labs(
    x = "Time (years)",
    y = "% of subjects receiving an advanced therapy",
    color = "IBD type"
  ) +
  theme(legend.position = "bottom")
# p1
ggsave("paper/Figure-3.pdf", p1, width = 8 * 3 / 4, height = 12 * 3 / 4, units = "in")
ggsave("paper/Figure-3.png", p1, width = 8 * 3 / 4, height = 12 * 3 / 4, units = "in")
[Figure panels omitted: “Total percentage”, followed by eight repeated pairs of “Crohn’s disease” and “Ulcerative colitis” panels.]
C-reactive protein analysis
Merge subject-level metadata with model-derived quantities
Here, we create a data.frame that combines individual-level information (e.g. age at diagnosis, sex) with model-derived quantities, such as the posterior probabilities of class assignment. To facilitate visualisation, we also create discretised versions for some variables.
Figure 1 shows how cluster assignment probabilities change as follow-up for subjects increases. As one would expect, probabilities typically increase as follow-up increases. This relationship appears to depend upon when the mean trajectory for the assigned cluster substantially differs from the other clusters. FC8 shows high posterior probabilities with even short follow-up as this is the only cluster with low FC at diagnosis. However, longer follow-ups are required to distinguish other clusters. For example, individuals assigned to FC6 that have a short follow-up (< 4 years from diagnosis) have, on average, a high probability of being assigned to FC3 instead ( versus ). This is not unexpected, as FC3 and FC6 share similar patterns within the first 2 years.
# Assign level order otherwise alphanumerical order used
# and add sample sizes to labels
myDF.crp_means <- myDF.crp_means %>%
  mutate(
    prob_order = factor(prob_order,
      levels = c(paste0("prob_order", 1:8)),
      labels = c(paste0("CRP", 1:8))
    ),
    class_order = factor(class_order,
      levels = paste0("CRP", 1:8),
      labels = paste0(
        "Assigned to CRP", 1:8, "\n n = ",
        as.vector(table(myDF.crp$class_order))
      )
    )
  )
crp_fup <- myDF.crp_means %>%
  ggplot(aes(fill = prob_order, y = value, x = followup_cut)) +
  geom_bar(position = "fill", stat = "identity") +
  facet_wrap(. ~ class_order, ncol = 4) +
  theme_minimal() +
  theme(
    legend.title = element_text(hjust = 0.5),
    strip.background = element_rect(
      color = "lightgray", linewidth = 1.5, linetype = "solid"
    )
  ) +
  labs(
    x = "Follow-up cutoff (years)",
    y = "",
    fill = "Mean posterior\nprobability of\ncluster assignment"
  ) +
  scale_fill_viridis_d(option = "D")
crp_fup
ggsave("plots/crp-prob-fup.png", crp_fup, width = 11, height = 8, units = "in")
ggsave("plots/crp-prob-fup.pdf", crp_fup, width = 11, height = 8, units = "in")
p <- fc_fup / crp_fup +
  plot_annotation(tag_levels = "A") &
  theme(
    plot.tag = element_text(size = 20, face = "bold"),
    legend.title = element_text(size = 14),
    legend.text = element_text(size = 12)
  )
ggsave("plots/prob-fup.pdf", p, width = 11 * 3 / 4, height = 16 * 3 / 4, units = "in")
ggsave("plots/prob-fup.png", p, width = 11 * 3 / 4, height = 16 * 3 / 4, units = "in")
Figure 2: Demonstration of how mean posterior probabilities of cluster assignment change with follow-up and assigned cluster.
Associations with respect to cluster assignments
This section displays descriptive plots to summarize marginal associations between cluster assignments and individual-level covariates. We also explore univariate and multivariate associations with respect to cluster assignment using multinomial logistic regression. As a sensitivity analysis, we also consider restricting the analysis to only consider individuals whose class assignment was less uncertain (with posterior probability > 0.5).
For all individuals
Here, we consider associations with respect to information available at diagnosis: age, sex and IBD type.
p_diagnosis_all <- myDF.crp %>%
  plotCat("diagnosis", class = "class_order")
p_sex_all <- myDF.crp %>%
  plotCat("sex", class = "class_order")
p_age_all <- myDF.crp %>%
  ggplot(aes(x = class_order, y = age)) +
  geom_violin(fill = "#5DB7DE", color = "#434371") +
  geom_boxplot(width = 0.1, color = "black", alpha = 0.2) +
  theme_minimal() +
  xlab("Cluster") +
  ylab("Age at diagnosis")
p_mlr_all <- myDF.crp %>%
  mutate(class_order = relevel(class_order, ref = "CRP1")) %>%
  mlrPlot(var = c("diagnosis", "age", "sex"), class = "class_order")
Crohn’s disease only
For CD patients, we also consider additional phenotyping information. This includes the following information:
Smoking status is recorded as a binary (Yes/No) variable and is primarily based on self-reporting. As such, it may not necessarily reflect true smoking status. Smoking was missing for approximately 6% of CD patients in the CRP cohort.
Montreal location
Montreal location refers to where gastrointestinal inflammation is present and is categorised as:
L1: Ileal, limited to the ileum which is the final segment of the small intestine.
L2: Colonic, limited to the colon/large intestine.
L3: Ileocolonic, inflammation is present in both the ileum and colon.
Montreal location was missing for approximately 3% of CD patients in the CRP cohort.
Montreal behaviour
Montreal behaviour describes another clinical phenotype and is defined as follows:
B1: Inflammatory, in other words non-stricturing and non-penetrating
B2: Stricturing, where the formation of fibrosis leads to the narrowing of the intestine.
B3: Penetrating, where the inflammation causes the formation of fistulas or abscesses.
Due to small numbers, B2 and B3 are merged into a single group (complicated CD) when analysing Montreal behaviour.
myDF.crp <- myDF.crp %>%
  mutate(
    Behaviour_merged = plyr::mapvalues(Behaviour,
      from = c("B1", "B2", "B3", NA),
      to = c("B1", "B2 or B3", "B2 or B3", NA)
    )
  )
Montreal behaviour was missing for approximately 3% of CD patients in the CRP cohort.
Upper GI inflammation
Upper GI inflammation refers to any gastrointestinal inflammation further up than the ileum. Usually, upper inflammation is considered a modifier for Montreal location and is denoted L4. Upper GI inflammation (L4) was missing for a high proportion of CD individuals in the CRP cohort (approximately 3%). This is because the required investigations are only carried out where upper GI inflammation is suspected. As such, we have manually mapped missing L4 values as “No” (i.e. no upper GI inflammation for the associated patients).
Perianal disease is considered a modifier for Montreal behaviour and is a severe complication of Crohn’s disease involving inflammation around the anus.
Perianal disease status was missing for approximately 2% of CD patients in the CRP cohort.
NOTE: For the purposes of the multinomial logistic regression model, individuals with missing values in any of these variables will be excluded. For consistency, such individuals will also be excluded from the univariate summary plots.
For this purpose, we create a missingness indicator (missingN_cd), which makes it straightforward to apply this filter.
p_sex_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  plotCat("sex", class = "class_order")
p_age_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  ggplot(aes(x = class_order, y = age)) +
  geom_violin(fill = "#5DB7DE", color = "#434371") +
  geom_boxplot(width = 0.1, color = "black", alpha = 0.2) +
  theme_minimal() +
  xlab("Cluster") +
  ylab("Age at diagnosis")
p_smoke_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  mutate(Smoke = ifelse(!, Smoke, "Missing")) %>%
  plotCat("Smoke", class = "class_order")
p_location_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  mutate(Location = ifelse(!, Location, "Missing")) %>%
  plotCat("Location", class = "class_order")
p_behaviour_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  mutate(Behaviour = ifelse(!, Behaviour_merged, "Missing")) %>%
  plotCat("Behaviour", class = "class_order")
p_L4_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  mutate(L4 = ifelse(!, L4, "Missing")) %>%
  plotCat("L4", class = "class_order")
p_perianal_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  mutate(Perianal = ifelse(!, Perianal, "Missing")) %>%
  plotCat("Perianal", class = "class_order")
p_mlr_cd <- myDF.crp %>%
  filter(diagnosis == "Crohn's Disease") %>%
  filter(missingN_cd == 0) %>%
  mutate(class_order = relevel(class_order, ref = "CRP1")) %>%
  mlrPlot(
    var = c("age", "sex", "Smoke", "Location", "L4", "Behaviour_merged"),
    class = "class_order"
  )
Ulcerative Colitis only
The additional phenotyping information available for UC patients consists of:
Smoking status is defined in the same way as for CD patients.
Smoking was missing for approximately 8% of UC patients in the CRP cohort.
NOTE: As for CD cases, individuals with missing values in any of these variables will be excluded from the association analysis. For consistency, such individuals will also be excluded from the univariate summary plots.
For this purpose, we create a missingness indicator (missingN_uc), which makes it straightforward to apply this filter.
p_sex_uc <- myDF.crp %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  plotCat("sex", class = "class_order")
p_age_uc <- myDF.crp %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  ggplot(aes(x = class_order, y = age)) +
  geom_violin(fill = "#5DB7DE", color = "#434371") +
  geom_boxplot(width = 0.1, color = "black", alpha = 0.2) +
  theme_minimal() +
  xlab("Cluster") +
  ylab("Age at diagnosis")
p_smoke_uc <- myDF.crp %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  mutate(Smoke = ifelse(!, Smoke, "Missing")) %>%
  plotCat("Smoke", class = "class_order")
p_extent_uc <- myDF.crp %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  mutate(Extent = ifelse(!, Extent, "Missing")) %>%
  plotCat("Extent", class = "class_order")
p_mlr_uc <- myDF.crp %>%
  filter(diagnosis == "Ulcerative Colitis") %>%
  filter(missingN_uc == 0) %>%
  mutate(class_order = relevel(class_order, ref = "CRP1")) %>%
  mlrPlot(
    var = c("age", "sex", "Smoke", "Extent"),
    class = "class_order"
  )
Advanced therapy use
Overall cluster-specific trajectories
Here, we extract overall cluster-specific trajectories as these will be used for visualisation purposes in order to better understand patterns of AT use. Note that model outputs do not match the reordered clusters (based on cumulative inflammation) used throughout this report. As such, we use title.mapping to re-order the trajectories when these are plotted.
time.pred <- seq(0, 7, by = 0.01)
pred.crp.df <- data.frame(
  crp_time = c(time.pred, time.pred),
  diagnosis = c(
    rep("Crohn's Disease", length(time.pred)),
    rep("Ulcerative Colitis", length(time.pred))
  )
)
pred.crp.df.update <- lcmm::predictY(model.crp, pred.crp.df, var.time = "crp_time", draws = TRUE)$pred
pred <- predictY(model.crp, pred.crp.df, var.time = "crp_time", draws = TRUE)$pred
pred<[seq_along(time.pred), ])
pred$time <- time.pred
ylimit <- log(2500)
title.mapping <- c(2, 3, 1, 4, 5, 7, 6, 8)
Summary statistics of AT use
Overall, we observe significant differences in AT across clusters. In particular, after adjusting for age and sex, AT was significantly lower in FC2.
p_AT_1Y <- myDF.crp %>%
  mutate(AT_1Y = factor(AT_1Y)) %>%
  plotCat("AT_1Y", class = "class_order")
p_AT_1Y_cd <- myDF.crp %>%
  subset(diagnosis == "Crohn's Disease") %>%
  plotCat("AT_1Y", class = "class_order")
# Stored under a separate name so the UC plot does not overwrite the CD plot
p_AT_1Y_uc <- myDF.crp %>%
  subset(diagnosis == "Ulcerative Colitis") %>%
  plotCat("AT_1Y", class = "class_order")
We also generate a censored version for AT_1Y where lack of AT is treated as a right censored observation at seven years.
At present, we cannot show cumulative advanced therapy usage in this document as there are fewer than five subjects within at least one cluster-IBD type stratum. In the meantime, it is possible to view these plots in our manuscript which has digitally removed any strata with fewer than five subjects.
km.df <- data.frame(
  time = numeric(),
  cumhaz = numeric(),
  class = character(),
  diag = character()
)
for (g in 1:8) {
  # Calculate cumulative
  <- myDF.crp %>%
    filter(class_order == paste0("CRP", g)) %>%
    filter(diagnosis == "Crohn's Disease")
  temp.uc <- myDF.crp %>%
    filter(class_order == paste0("CRP", g)) %>%
    filter(diagnosis == "Ulcerative Colitis")
  km <- survfit(Surv(AT_line_1_cens, AT_7Y) ~ 1, data<-rbind(
    km.df,
    data.frame(
      time = km$time,
      cumhaz = 1 - km$surv,
      class = paste0("CRP", g, ", CD=", nrow(, "; UC=", nrow(temp.uc)),
      diag = "Crohn's disease"
    )
  )
  km <- survfit(Surv(AT_line_1_cens, AT_7Y) ~ 1, data = temp.uc)
  km.df <- rbind(
    km.df,
    data.frame(
      time = km$time,
      cumhaz = 1 - km$surv,
      class = paste0("CRP", g, ", CD=", nrow(, "; UC=", nrow(temp.uc)),
      diag = "Ulcerative colitis"
    )
  )
  temp.all <- myDF.crp %>%
    filter(class_order == paste0("CRP", g))
  km <- survfit(Surv(AT_line_1_cens, AT_7Y) ~ 1, data = temp.all)
  km.df <- rbind(
    km.df,
    data.frame(
      time = km$time,
      cumhaz = 1 - km$surv,
      class = paste0("CRP", g, ", CD=", nrow(, "; UC=", nrow(temp.uc)),
      diag = "All"
    )
  )
}
p1 <- km.df %>%
  subset(diag != "All") %>%
  ggplot(aes(x = time, y = cumhaz)) +
  geom_line(aes(color = diag), lty = 1, lwd = 1.2) +
  facet_wrap(~class, ncol = 2) +
  theme_minimal() +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
  labs(
    x = "Time (years)",
    y = "% of subjects receiving an advanced therapy",
    color = "IBD type"
  ) +
  theme(legend.position = "bottom")
# p1
ggsave("paper/CRP-AT.pdf", p1, width = 8 * 3 / 4, height = 12 * 3 / 4, units = "in")
ggsave("paper/CRP-AT.png", p1, width = 8 * 3 / 4, height = 12 * 3 / 4, units = "in")
