Duplicated/Repeated Cases in SISTRAT C1 (part 4)

For the purpose of this page, we use the terms “rows”, “cases” and “entries” interchangeably to refer to the records of the dataset. In many of the earlier steps of the deduplication of the C1 dataset, we relied on unstandardized columns, or on data that was duplicated within HASHs for reasons unrelated to treatment events. To find and delete duplicated data that adds no information relevant to the study, we can now use the standardized variables as criteria, with the goal of having a unique event per HASH once irrelevant differences are removed.

As stated in the third part of the deduplication process, we defined a number of days between treatments that would be suitable for linking entries, together with several additional criteria to distinguish a different treatment from the continuation of the same treatment. In this stage, we define rules to keep the most relevant information of each variable, collapsing the intermediate events into a single entry that summarizes the whole treatment and lets us distinguish subsequent treatments.
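As a minimal sketch of the linking variable, assuming that diff_bet_treat is the number of days between the discharge date of an entry and the admission date of the same user's next entry. The toy data and the helper days_to_next_entry are hypothetical and written in base R; the actual derivation belongs to part 3 and may differ.

```r
# Toy data: hypothetical users and dates, for illustration only
toy <- data.frame(
  hash_key       = c("a", "a", "a", "b"),
  fech_ing       = as.Date(c("2015-01-10", "2015-03-01", "2016-06-01", "2015-05-05")),
  fech_egres_imp = as.Date(c("2015-02-20", "2015-08-15", "2016-09-01", "2015-07-01"))
)

# Hypothetical helper: days from each discharge to the same user's next admission
days_to_next_entry <- function(df) {
  df <- df[order(df$hash_key, df$fech_ing), ]
  n <- nrow(df)
  # TRUE when the following row belongs to the same user
  same_user_next <- c(df$hash_key[-1] == df$hash_key[-n], FALSE)
  # Admission date of the following row (NA for the last row)
  next_ing <- c(df$fech_ing[-1], as.Date(NA))
  df$diff_bet_treat <- ifelse(same_user_next,
                              as.numeric(next_ing - df$fech_egres_imp),
                              NA_real_)
  df
}

toy <- days_to_next_entry(toy)
toy[, c("hash_key", "fech_egres_imp", "diff_bet_treat")]
```

The last entry of each user has no following entry, so its difference is NA, which is the condition the code on this page uses to tell apart entries with and without a posterior one.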

Structure of treatments and rules to collapse continuous entries

We first formed a general impression of the database to understand which steps to follow to collapse entries into differentiated treatments. To do so, we looked at the relationship between each entry and the one that followed it.

#https://stackoverflow.com/questions/46750364/diagrammer-and-graphviz
#https://mikeyharper.uk/flowcharts-in-r-using-diagrammer/
#http://blog.nguyenvq.com/blog/2012/05/29/better-decision-tree-graphics-for-rpart-via-party-and-partykit/
#http://blog.nguyenvq.com/blog/2014/01/17/skeleton-to-create-fast-automatic-tree-diagrams-using-r-and-graphviz/
#https://cran.r-project.org/web/packages/DiagrammeR/vignettes/graphviz-mermaid.html
#https://stackoverflow.com/questions/39133058/how-to-use-graphviz-graphs-in-diagrammer-for-r
#https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781789802566/1/ch01lvl1sec21/creating-diagrams-via-the-diagrammer-package
#https://justlegal.be/2019/05/using-flowcharts-to-display-legal-procedures/
#   #   [3]:  paste0('Only applications w/ only one\\n application in the same date \\n(n = ', formatC(nrow(CONS_C1_df_dup_JUN_2020), format='f', big.mark=',', digits=0), ';\\n users:',formatC(CONS_C1_df_dup_JUN_2020%>% dplyr::distinct(hash_key)%>% nrow(), format='f', big.mark=',', digits=0),')')
   #   [4]:  paste0('Dataset \\n(n = ',formatC(comb_datasets_a_n,format='f', big.mark=',', digits=0),'\\nusers =',formatC(comb_datasets_a_users,format='f', big.mark=',', digits=0),')')
   #   [5]:  paste0('Dataset \\n(n = ',formatC(comb_datasets_b_n,format='f', big.mark=',', digits=0),'\\nusers =',formatC(comb_datasets_b_users,format='f', big.mark=',', digits=0),')')
   #   [6]:  paste0('Dataset \\n(n = ',formatC(comb_datasets_c_n,format='f', big.mark=',', digits=0),'\\nusers =',formatC(comb_datasets_c_users,format='f', big.mark=',', digits=0),')')
   #   [7]:  paste0('Final Sample \\n(n = ', formatC(nrow(CONS_C1_df_dup_JUN_2020_match_top_sel), format='f', big.mark=',', digits=0), ';\\n users: ',formatC(CONS_C1_df_dup_JUN_2020_match_top_sel%>% dplyr::distinct(hash_key)%>% nrow(), format='f', big.mark=',', digits=0),')')
#
#    #  tab3 [label = '@@3']
    #  tab7 [label = '@@7']
   #  blank [label = '', width = 0.001, height = 0.001]
#
#    # blank -> tab3[ dir = none,  color = 'white',fontcolor = white,shape=none, width=0, height=0];
    # tab3 -> tab4 [label=paste0('Some users had events fullfilling both conditions (n=',tab6_lab_users+tab5_lab_users-tab4_lab_users')',fontsize = 9];
    #  tab6 -> tab7 [label='  Only rows with available data on TOP scores and Diagnostic of CIE-10',fontsize = 9];
tab1_lab<- paste0('C1 Dataset \n(n = ', formatC(nrow(CONS_C1_df_dup_JUN_2020), format='f', big.mark=',', digits=0), ';\nusers: ',formatC(CONS_C1_df_dup_JUN_2020%>% dplyr::distinct(hash_key)%>% nrow(), format='f', big.mark=',', digits=0),')')

tab2_lab<-paste0('Cases of users that had at least two entries \n(n = ', CONS_C1_df_dup_JUN_2020%>% dplyr::group_by(hash_key)%>%   dplyr::mutate(sum_validos=sum(!is.na(diff_bet_treat)))%>%
  ungroup()%>%  dplyr::filter(sum_validos>0)%>% nrow()%>% formatC(big.mark=","),';\nusers =',  CONS_C1_df_dup_JUN_2020%>% dplyr::group_by(hash_key)%>%   dplyr::mutate(sum_validos=sum(!is.na(diff_bet_treat)))%>%  ungroup()%>%  dplyr::filter(sum_validos>0)%>% distinct(hash_key)%>% nrow()%>% formatC(big.mark=","),')')

tab3_lab<-paste0('Only entries w/ an entry\n that followed another one \n(n = ', CONS_C1_df_dup_JUN_2020%>% dplyr::filter(!is.na(diff_bet_treat))%>%nrow()%>% formatC(big.mark=","),';\nusers =',CONS_C1_df_dup_JUN_2020%>% dplyr::filter(!is.na(diff_bet_treat))%>%distinct(hash_key)%>% nrow()%>% formatC(big.mark=","),')')

            tab4_lab_n<-CONS_C1_df_dup_JUN_2020%>% 
              #dplyr::filter(!is.na(diff_bet_treat))%>%
              dplyr::mutate(filter_complex= dplyr::case_when(diff_bet_treat<45& as.numeric(motivoegreso_derivacion)==2~1,TRUE~0))%>%
              dplyr::mutate(filter_complex2= dplyr::case_when(diff_bet_treat<60& as.numeric(motivoegreso_derivacion)==1~1,TRUE~0))%>%
              dplyr::filter(filter_complex==1|filter_complex2==1)%>%
              #dplyr::select(hash_key,motivoegreso_derivacion,diff_bet_treat)
              nrow()
            
            tab4_lab_users<-CONS_C1_df_dup_JUN_2020%>% 
              #dplyr::filter(!is.na(diff_bet_treat))%>%
              dplyr::mutate(filter_complex= dplyr::case_when(diff_bet_treat<45& as.numeric(motivoegreso_derivacion)==2~1,TRUE~0))%>%
              dplyr::mutate(filter_complex2= dplyr::case_when(diff_bet_treat<60& as.numeric(motivoegreso_derivacion)==1~1,TRUE~0))%>%
              dplyr::filter(filter_complex==1|filter_complex2==1)%>%
              #dplyr::select(hash_key,motivoegreso_derivacion,diff_bet_treat)
              distinct(hash_key)%>% nrow()
            
            tab5_lab_n<-CONS_C1_df_dup_JUN_2020%>% 
              #dplyr::filter(!is.na(diff_bet_treat))%>%
              dplyr::mutate(filter_complex= dplyr::case_when(diff_bet_treat<45& as.numeric(motivoegreso_derivacion)==2~1,TRUE~0))%>%
              dplyr::mutate(filter_complex2= dplyr::case_when(diff_bet_treat<60& as.numeric(motivoegreso_derivacion)==1~1,TRUE~0))%>%
              dplyr::filter(filter_complex==1)%>%
              #dplyr::select(hash_key,motivoegreso_derivacion,diff_bet_treat)
              nrow()
            
            tab5_lab_users<-CONS_C1_df_dup_JUN_2020%>% 
              #dplyr::filter(!is.na(diff_bet_treat))%>%
              dplyr::mutate(filter_complex= dplyr::case_when(diff_bet_treat<45& as.numeric(motivoegreso_derivacion)==2~1,TRUE~0))%>%
              dplyr::mutate(filter_complex2= dplyr::case_when(diff_bet_treat<60& as.numeric(motivoegreso_derivacion)==1~1,TRUE~0))%>%
              dplyr::filter(filter_complex==1)%>%
              #dplyr::select(hash_key,motivoegreso_derivacion,diff_bet_treat)
              distinct(hash_key)%>% nrow()
            
            tab6_lab_n<-CONS_C1_df_dup_JUN_2020%>% 
              #dplyr::filter(!is.na(diff_bet_treat))%>%
              dplyr::mutate(filter_complex= dplyr::case_when(diff_bet_treat<45& as.numeric(motivoegreso_derivacion)==2~1,TRUE~0))%>%
              dplyr::mutate(filter_complex2= dplyr::case_when(diff_bet_treat<60& as.numeric(motivoegreso_derivacion)==1~1,TRUE~0))%>%
              dplyr::filter(filter_complex2==1)%>%
              #dplyr::select(hash_key,motivoegreso_derivacion,diff_bet_treat)
              nrow()
            
            tab6_lab_users<-CONS_C1_df_dup_JUN_2020%>% 
              #dplyr::filter(!is.na(diff_bet_treat))%>%
              dplyr::mutate(filter_complex= dplyr::case_when(diff_bet_treat<45& as.numeric(motivoegreso_derivacion)==2~1,TRUE~0))%>%
              dplyr::mutate(filter_complex2= dplyr::case_when(diff_bet_treat<60& as.numeric(motivoegreso_derivacion)==1~1,TRUE~0))%>%
              dplyr::filter(filter_complex2==1)%>%
              #dplyr::select(hash_key,motivoegreso_derivacion,diff_bet_treat)
              distinct(hash_key)%>% nrow()
            tab7_lab<- paste0('* Some users had events fulfilling both conditions (n=',tab6_lab_users+tab5_lab_users-tab4_lab_users,')')

tab4_lab<-paste0('Only entries w/ an entry\n that followed another one\n(both conditions)\n(n = ', tab4_lab_n%>% formatC(big.mark=","),';\nusers =',tab4_lab_users%>% formatC(big.mark=","),')*')

tab5_lab<-paste0('Only entries w/ an entry\n that followed another one \n(< 45 days of difference w/ a posterior entry &\nReferral as a cause of discharge)\n(n = ', tab5_lab_n%>% formatC(big.mark=","),';\nusers =',tab5_lab_users%>% formatC(big.mark=","),')')

tab6_lab<-paste0('Only entries w/ an entry\n that followed another one \n(< 60 days of difference w/ a posterior entry &\nNot a Referral as a cause of discharge)\n(n = ', tab6_lab_n%>% formatC(big.mark=","),';\nusers =',tab6_lab_users%>% formatC(big.mark=","),')')
          
DiagrammeR::grViz("
digraph graph2 {

graph [layout = dot]

# node definitions with substituted label text
node [shape = rectangle, width = 5, fillcolor = Beige]
a [label = '@@1']
b [label = '@@2']
c [label = '@@3']
d [label = '@@4']
e [label = '@@5', fontcolor = MidnightBlue, color = MidnightBlue]
f [label = '@@6']
g [label = '@@7', width = 0.001, height = 0.001, color=White]

a -> b 
b -> c 
c -> d #[label= paste0('** Some users had events fullfilling both conditions (n=',tab6_lab_users+tab5_lab_users-tab4_lab_users,')'),fontsize = 9];
d -> {e f} 
{e f} -> g [ dir = none,  color = 'white',fontcolor = white,shape=none, width=0, height=0];

}

[1]:  tab1_lab
[2]:  tab2_lab
[3]:  tab3_lab
[4]:  tab4_lab
[5]:  tab5_lab
[6]:  tab6_lab
[7]:  tab7_lab
")

Figure 1. Decision Tree for the Users with more than one entry

#[label=paste0('Some users had events fullfilling both conditions (n=',tab6_lab_users+tab5_lab_users-tab4_lab_users,')',fontsize = 9];

As seen in Figure 1, some pairs of consecutive entries of the same user could correspond to a continuous treatment rather than to different treatments. We focused on these patterns to collapse them into treatments, particularly entries that ended in a referral and were followed by another entry less than 45 days later.
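The two conditions in Figure 1 can be sketched as a linking rule followed by a collapse step. This is only an illustration in base R under stated assumptions: the toy data, the logical column referral (standing in for motivoegreso_derivacion) and the column episode are hypothetical, and the rules finally adopted may include further criteria.

```r
# Hypothetical entries, already sorted by user and admission date
toy <- data.frame(
  hash_key       = c("a", "a", "a", "b", "b"),
  fech_ing       = as.Date(c("2015-01-10", "2015-03-01", "2016-06-01",
                             "2015-05-05", "2015-08-20")),
  fech_egres_imp = as.Date(c("2015-02-20", "2015-08-15", "2016-09-01",
                             "2015-07-01", "2015-12-01")),
  referral       = c(TRUE, FALSE, FALSE, FALSE, FALSE),
  diff_bet_treat = c(9, 291, NA, 50, NA)
)

# Allowed gap to the next entry: 45 days after a referral, 60 days otherwise
threshold <- ifelse(toy$referral, 45, 60)
links_to_next <- !is.na(toy$diff_bet_treat) & toy$diff_bet_treat < threshold

# An entry continues the previous one when the previous entry links forward.
# The last entry of a user never links forward (its difference is NA),
# so shifting the flag by one row does not leak across users.
continues_prev <- c(FALSE, head(links_to_next, -1))
toy$episode <- ave(as.integer(!continues_prev), toy$hash_key, FUN = cumsum)

# Collapse each user/episode into one row: first admission, last discharge
collapsed <- do.call(rbind, lapply(
  split(toy, list(toy$hash_key, toy$episode), drop = TRUE),
  function(g) data.frame(
    hash_key       = g$hash_key[1],
    episode        = g$episode[1],
    fech_ing       = min(g$fech_ing),
    fech_egres_imp = max(g$fech_egres_imp),
    n_entries      = nrow(g)
  )
))
collapsed
```

In this toy example the first two entries of user "a" are joined (referral followed 9 days later), the third stays separate (291 days), and both entries of user "b" are joined (50 days, not a referral).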

invisible(c("1. That the referrals actually ended up being referred"))
invisible(c("2. False referrals, when the first treatment is a perfect transfer and is recorded as SENDA No"))
invisible(c("3. Check one by one the cases that have 1 day of treatment and in which one is SENDA No and the other SENDA Yes"))
invisible(c("4. Collapse into a single record those records of the same user that show a difference of less than 45 days for referrals and 60 days for the remaining causes of discharge, between the discharge date and the admission date of the next treatment (depending on what we agree), and in which the only change recorded between one treatment and the next is the center ID."))
invisible(c("5. Generate a variable with concatenated treatments"))
invisible(c("6. What to do with treatments with NAs in the discharge date. I should delete them"))

invisible(c("referrals that have a posterior treatment: group the entries with a difference of 45 days or less"))

 #CONS_C1_df_dup_JUN_2020%>% dplyr::mutate(motivodeegreso_mod_imp_tidy= case_when(!is.na(diff_bet_treat) & as.character(motivodeegreso_mod_imp)=="Derivación" & grepl("Clínica",tipo_centro_derivacion==<90~"Abandono Temprano" , TRUE~as.character(motivodeegreso_mod_imp))%>% nrow()

#- if less
#a late withdrawal may in fact cover at least 1 month of treatment. Following the criteria stated in the annex and the terminological glossary, 
#referrals should span up to 45 days.
#
#if there are more than 1095 days 

#menor_60_dias_diff
#motivoegreso_derivacion
#obs_cambios_ninguno

#has invalid cases
CONS_C1_df_dup_JUN_2020%>% 
  dplyr::filter(!is.na(diff_bet_treat))%>%
#  dplyr::group_by(hash_key)%>%
#  dplyr::mutate(sum_validos=sum(!is.na(diff_bet_treat)))%>%
#  ungroup()%>%
#  dplyr::filter(sum_validos>0)%>%
  dplyr::mutate(menor_45_dias_diff=ifelse(diff_bet_treat<45,1,0))%>%
  janitor::tabyl(menor_45_dias_diff,motivoegreso_derivacion)%>%
  adorn_totals("col") %>%
  adorn_percentages("col") %>%
  adorn_pct_formatting(digits = 1) %>%
  adorn_ns()%>%
  knitr::kable(format= "html", format.args= list(decimal.mark= ".", big.mark= ","),
               caption="Table 1. Diff. in Treatments <45 days, by Referral (only cases that had an entry after another one)",
               align= c("l",rep('c', 5)), col.names = c("Diff. in Treatments <45 days","Not a Referral","Referral", "Total"))%>%
  
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size= 8)%>%
        kableExtra::add_footnote(paste0("Note= Percentages by Column; Cases with an entry that follows them (n= ",CONS_C1_df_dup_JUN_2020%>% dplyr::filter(!is.na(diff_bet_treat))%>%nrow()%>% formatC(big.mark=","),"; users=",CONS_C1_df_dup_JUN_2020%>% dplyr::filter(!is.na(diff_bet_treat))%>%distinct(hash_key)%>% nrow()%>% formatC(big.mark=","),")"), notation = "none")%>%
  kableExtra::scroll_box(width= "100%", height = "250px")


Table 1. Diff. in Treatments <45 days, by Referral (only cases that had an entry after another one)
Diff. in Treatments <45 days	Not a Referral	Referral	Total
0	86.4% (18772)	31.2% (3084)	69.1% (21856)
1	13.6% (2944)	68.8% (6809)	30.9% (9753)
Note= Percentages by Column; Cases with an entry that follows them (n= 31,609; users=20,524)

As seen in Table 1, most of the referrals that were followed by another entry (68.8%) had a difference of less than 45 days, compared to 13.6% for the other causes of discharge. Considering this, we examined how long it took for another entry to be reported among users with different causes of discharge in the previous treatment.
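The column percentages in Table 1 can be checked directly from the reported counts. The sketch below uses base R's prop.table instead of the janitor pipeline; the object names tab1 and col_pct are only illustrative.

```r
# Counts as reported in Table 1 (rows: difference < 45 days, 0 = no, 1 = yes)
tab1 <- matrix(
  c(18772, 2944,   # Not a Referral
    3084,  6809),  # Referral
  nrow = 2,
  dimnames = list(menor_45_dias_diff = c("0", "1"),
                  discharge = c("Not a Referral", "Referral"))
)

# Percentages by column, as in the table note
col_pct <- round(100 * prop.table(tab1, margin = 2), 1)
col_pct

# Total number of entries that were followed by another entry
sum(tab1)
```

The counts add up to the 31,609 entries mentioned in the table note.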

#http://rstudio-pubs-static.s3.amazonaws.com/316989_83cbe556125645b698c9ff6cf88c4c1a.html
#https://thriv.github.io/biodatasci2018/r-survival.html
#http://si.biostat.washington.edu/sites/default/files/modules/SISCR_2018_11_all-2pp_0.pdf
#https://www.researchgate.net/profile/Claudia_Castro-Kuriss/publication/325390160_Analisis_de_Supervivencia_mediante_el_empleo_de_R/links/5b0aba27a6fdcc8c25333860/Analisis-de-Supervivencia-mediante-el-empleo-de-R.pdf?origin=publication_detail
#http://www.sthda.com/english/wiki/survival-analysis-basics

#SURVIVAL= Explores factors that are thought to influence the chance that the event occurs
#Censored data= may arise for different reasons:
    #- The patient did not report an event (the readmission) during the study, and we do not know whether the event occurred afterwards. THESE ARE THE ONES I HAVE TO GIVE A DIFF TREAT UP TO TODAY, AS LONG AS THEY HAVE LESS THAN 1095 DAYS FOR MISSING DISCHARGE DATES AND ARE NOT RIGHT-TRUNCATED BECAUSE THEIR FIRST TREATMENT DID NOT END.
    #This censoring can occur when a user drops out of a study, is lost to follow-up, or has not experienced the event by the end of the study
    #- Right-truncated: those who were lost for some reason.
    #Samples with random censoring are generally considered right-censored because the failure times of the different units are incorporated progressively
    #Entries that did not experience the event during the study period will be censored at the last recorded time
    # A less restrictive assumption than independence between Ci and Ti, but one that is sufficient for the methods to be valid, is “independent censoring” or “non-informative censoring”: the probability that an individual is censored at time t0 does not depend on that individual having an unusually high (or low) risk of the event.
    #Censoring may arise in the following ways:
    ###a patient has not (yet) experienced the event of interest, such as relapse or death, within the study time period;
    ###a patient is lost to follow-up during the study period;
    ###a patient experiences a different event that makes further follow-up impossible.
    #This type of censoring, named right censoring, is handled in survival analysis.

#– Recurrence rate
survfit_days_new_treat<-survfit(Surv(diff_bet_treat, status) ~ motivodeegreso_mod_imp, 
                                data=CONS_C1_df_dup_JUN_2020%>% 
                                  dplyr::mutate(diff_nas_fech_egres= as.numeric(difftime(lubridate::ymd("2019-11-13"),fech_ing, units = "days")))%>%
                                  #dplyr::filter(is.na(fech_egres_imp))%>% dplyr::select(fech_ing,fech_egres_imp,diff_nas_fech_egres)
                                  dplyr::mutate(perdi_seguimiento=dplyr::case_when(is.na(fech_egres_imp)&diff_nas_fech_egres>=1095~1,TRUE~0))%>% 
                                  dplyr::filter(perdi_seguimiento==0)%>% #THEY DID NOT EVEN FINISH THE FIRST EVENT, AND MOST LIKELY NEVER RECORDED AN END DATE. THESE I DEFINITELY HAVE TO REMOVE.
                                  dplyr::mutate(no_tienen_ni_el_primer_evento=dplyr::case_when(is.na(fech_egres_imp)&diff_nas_fech_egres<1095~1,
                                                                                     TRUE~0))%>% 
                                 # dplyr::filter(no_tienen_ni_el_primer_evento==0)%>%#THEY HAVE NOT FINISHED THE TREATMENT. SIMPLE TYPE 1 CENSORING, BUT DIFFERENT FROM THOSE THAT NEVER EVEN HAD THE FIRST EVENT. THAT IS WHY I MUST REMOVE THOSE CASES. ALTHOUGH I AM NOT SURE, BECAUSE THESE CASES MAY ALSO BE PART OF THE AUTOMATIC CENSORING THAT R PERFORMS.
                                  #THOSE THAT HAVE 
                                  dplyr::mutate(status=dplyr::case_when(!is.na(diff_bet_treat)~1,TRUE~0)), #censor if there are no days between treatments because there is no following entry
                                    #mutate(status=dplyr::case_when(!is.na(fech_egres_imp)~1,TRUE~0)), #censor discharge dates ##supposedly this one is purer, not sure
                                type = "kaplan-meier", #The Kaplan-Meier curve is a nonparametric estimator of the survival distribution (i.e. the “estimation” component of the “test/estimation” approach to analysis of time-to-event data)
                                error = "tsiatis", conf.type = "log-log", conf.int = 0.95)
#So we only know that the patient survived AT LEAST 13 months, but we have no other information available about the patient's status.  This type of censoring (also known as "right censoring") makes linear regression an inappropriate way to analyze the data due to censoring bias.

#simple
#survfit_days_new_treat_simple<-survfit(Surv(diff_bet_treat, status) ~ motivodeegreso_mod_imp, 
#                                data=CONS_C1_df_dup_JUN_2020%>% mutate(status=dplyr::case_when(!is.na(diff_bet_treat)~1,TRUE~0))%>% data.frame())

#Using this information, we compare whether there is any difference between the survival curves of the groups 
#In order to determine if there is a statistically significant difference between the survival curves, we perform what is known as a log-rank test, which tests the following hypothesis:
##H0: There is no difference in the survival function between those who were on maintenance chemotherapy and those who weren't on maintenance chemotherapy.
##Ha: There is a difference in the survival function between those who were on maintenance chemotherapy and those who weren't on maintenance chemotherapy.
#With the “survdiff” command we can also run a nonparametric hypothesis test that tells us whether the difference in survival probability between subgroups is significant or not. In this case it would be, since we obtain a p-value < 0.05, with those differences found in the Center and South zones, which is where we should carry out a more in-depth study.
  invisible(  
  survdiff(Surv(diff_bet_treat, status) ~ motivodeegreso_mod_imp, data=CONS_C1_df_dup_JUN_2020%>% mutate(status=dplyr::case_when(!is.na(diff_bet_treat)~1,TRUE~0)), rho = 0)  
  )


  #Peto & Peto modification of the Gehan-Wilcoxon test (rho = 1); the call above with rho = 0 is the log-rank test
  invisible(  
  survdiff(Surv(diff_bet_treat, status) ~ motivodeegreso_mod_imp, data=CONS_C1_df_dup_JUN_2020%>% mutate(status=dplyr::case_when(!is.na(diff_bet_treat)~1,TRUE~0)), rho = 1) 
  )


#survfit_days_new_treat_simple
survfit_days_new_treat_dataframe<-summary(survfit_days_new_treat, times=seq(0, 3500, 100), print.rmean=T,digits=2)
#
data.table(survfit_days_new_treat_dataframe$table,keep.rownames = T)%>%
  knitr::kable(format= "html", format.args= list(decimal.mark= ".", big.mark= ","),
               caption="Table 2. Estimates related to the probability that an entry kept free of a posterior one",
               align= c("l",rep('c', 5)), col.names = c("Cause of Discharge","Records","n.max", "n.start","events","rmean","se(rmean)","median", "95%CI Lower","95%CI Upper"))%>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size= 8)%>%
        kableExtra::add_footnote(paste0("Note= Entries of users who did not finish their first treatment were discarded (n=",CONS_C1_df_dup_JUN_2020%>% 
    dplyr::mutate(diff_nas_fech_egres= as.numeric(difftime(lubridate::ymd("2019-11-13"),fech_ing, units = "days")))%>%dplyr::mutate(perdi_seguimiento=dplyr::case_when(is.na(fech_egres_imp)&diff_nas_fech_egres>=1095~1,TRUE~0))%>%dplyr::filter(perdi_seguimiento==1)%>% nrow()%>% formatC(big.mark=","),"); Excluded cases with no cause of discharge (n=",CONS_C1_df_dup_JUN_2020%>% dplyr::mutate(diff_nas_fech_egres= as.numeric(difftime(lubridate::ymd("2019-11-13"),fech_ing, units = "days")))%>% dplyr::mutate(perdi_seguimiento=dplyr::case_when(is.na(fech_egres_imp)&diff_nas_fech_egres>=1095~1,TRUE~0))%>%  dplyr::filter(perdi_seguimiento==0)%>% dplyr::mutate(status=dplyr::case_when(!is.na(diff_bet_treat)~1,TRUE~0))%>% dplyr::filter(!is.na(diff_bet_treat),is.na(motivodeegreso_mod_imp))%>% nrow() %>% formatC(big.mark=","),")"), notation = "none")%>%
  kableExtra::scroll_box(width= "100%", height = "250px")

Table 2. Estimates related to the probability that an entry kept free of a posterior one
Cause of Discharge	Records	n.max	n.start	events	rmean	se(rmean)	median	95%CI Lower	95%CI Upper
motivodeegreso_mod_imp=Abandono Tardio	9,080	9,080	9,080	9,080	578.4448	6.264544	374	357	386
motivodeegreso_mod_imp=Abandono Temprano	4,803	4,803	4,803	4,803	507.1576	8.337565	295	283	309
motivodeegreso_mod_imp=Alta Administrativa	2,998	2,998	2,998	2,998	496.9633	11.010819	262	241	286
motivodeegreso_mod_imp=Alta Terapéutica	4,826	4,826	4,826	4,826	601.5468	9.021110	383	363	401
motivodeegreso_mod_imp=Derivación	9,893	9,893	9,893	9,893	165.2633	3.933942	5	4	5
Note= Entries of users who did not finish their first treatment were discarded (n=264); Excluded cases with no cause of discharge (n=9)
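To make the median column of Table 2 easier to read, the following base R sketch computes a Kaplan-Meier (product-limit) estimate by hand on toy data. The helper km_estimate and the toy vectors are hypothetical; the results reported on this page come from survival::survfit, which also handles ties at exactly 0.5 and confidence intervals, neither of which is covered here.

```r
# Hypothetical helper: product-limit estimate at each distinct event time
km_estimate <- function(time, status) {
  event_times <- sort(unique(time[status == 1]))
  # Entries still under observation just before each event time
  at_risk <- sapply(event_times, function(t) sum(time >= t))
  # Entries that had a posterior entry exactly at that time
  events <- sapply(event_times, function(t) sum(time == t & status == 1))
  data.frame(time = event_times, n_risk = at_risk, n_event = events,
             surv = cumprod(1 - events / at_risk))
}

# Toy data: days until a posterior entry (status = 1) or censoring (status = 0)
toy_time   <- c(0, 3, 5, 5, 40, 200, 400)
toy_status <- c(1, 1, 1, 0, 1, 0, 1)

km <- km_estimate(toy_time, toy_status)
km

# Simplified median: first event time at which the estimate drops to 0.5 or below
km_median <- km$time[which(km$surv <= 0.5)[1]]
km_median
```

Read this way, a median of 5 days for referrals means that half of the entries ending in a referral were followed by another entry within 5 days.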

#median time to event (the time when half the records have an event).
#Even if median survival has been reached in a group, it might not be possible to calculate complete confidence intervals for those median values,
# just knowing the difference in median survival values doesn't necessarily tell you which is better for prognosis--then you have to specify which prognosis time you care about.
#The restricted mean (rmean) and its standard error se(rmean) are based on a truncated estimator. When the last censoring time is not random this quantity is occasionally of interest.

#The estimator of S is what is called the survival curve. 
event="no"
if(event=="si"){
plot(mfit2, col=c(1,2,1,2), lty=c(2,2,1,1),
     mark.time=FALSE, lwd=2, xscale=12,
     xlab="Years post diagnosis", ylab="Probability in State")
legend(3000, .6, c("death:female", "death:male", "pcm:female", "pcm:male"),
         col=c(1,2,1,2), lty=c(1,1,2,2), lwd=2, bty='n')
}
plot(survfit_days_new_treat,
         xlab = "Days of difference with a posterior treatment",  conf.int = T,mark.time = F,
     ylab = "Survival probability",
     col=c("yellow4","thistle","cornflowerblue","violetred3","gray20"), lwd=2) # 
legend("topright", c("Late Withdrawal", "Early Withdrawal", "Administrative Discharge", "Therapeutic Discharge","Referral"),
         col=c("yellow4","thistle","cornflowerblue","violetred3","gray20"), lty=c(1,1,1,1), lwd=2, bty='n')
mtext("Note. Users who did not finish their first treatment or did not show recurrence have been censored", side=1, line=4, cex=.7, outer=F, at=1500)

Figure 2. Recurrence-free interval of a treatment according to cause of discharge of the first treatment

From Figure 2, we can see that most entries ending in a referral were followed by another entry after 0 days or after a very short time. But how many entries could a user have? We generated a histogram that distinguishes users whose entries summed 0 days of difference between them (possibly part of the same treatment with only minor changes) from users with more days of difference between entries.

c26 <- c(
  "dodgerblue2", "#E31A1C", # red
  "green4",
  "#6A3D9A", # purple
  "#FF7F00", # orange
  "gray16", "gold1",
  "skyblue2", "#FB9A99", # lt pink
  "palegreen2",
  "#CAB2D6", # lt purple
  "#FDBF6F", # lt orange
  "gray70", "khaki2",
  "maroon", "orchid1", "deeppink1", "blue1", "steelblue4",
  "darkturquoise", "green1", "yellow4", "yellow3",
  "darkorange4", "brown", "gray40")
c28 <- c(
  "dodgerblue2", "#E31A1C",  "green4",  "#6A3D9A",  "#FF7F00", "gray16", "gold1", "skyblue2", "#FB9A99",  "palegreen2","orchid1", "#CAB2D6", # lt purple
  "#FDBF6F",  "gray70","deeppink1", "khaki2","steelblue4",  "maroon",  "blue1", "brown",  "darkturquoise", "green1", "yellow4", "yellow3","pink",
  "darkorange4",  "gray40", "blue","black","red","green", "orange", "white", "blue4", "violet")

get_distinct_hues <- function(ncolor,s=0.5,v=0.95,seed=350) {
  golden_ratio_conjugate <- 0.618033988749895
  set.seed(seed)
  h <- runif(1)
  H <- vector("numeric",ncolor)
  for(i in seq_len(ncolor)) {
    h <- (h + golden_ratio_conjugate) %% 1
    H[i] <- h
  }
  hsv(H,s=s,v=v)
}
p3_1<-CONS_C1_df_dup_JUN_2020%>%
  dplyr::filter(!is.na(diff_bet_treat))%>%
ggplot(aes()) + 
  geom_segment(aes(x = as.POSIXct(as.Date(fech_ing)), xend = as.POSIXct(as.Date(fech_egres_imp)),
                   y = hash_key, yend = hash_key,colour=as.factor(row),size=1/100)) + 
    scale_x_datetime(breaks=scales::date_breaks("1 year"), 
                  limits = as.POSIXct(c('2010-01-01 09:00:00','2020-01-01 09:00:00')),
                  labels = scales::date_format("%m/%y")) +
 # scale_color_manual(values=get_distinct_hues(31609)) +
  theme(axis.line=element_blank(),
          axis.ticks=element_blank(),axis.title.y=element_text(),axis.text.y=element_blank(),
          axis.title.x=element_text(),legend.position="none",
          panel.background=element_blank(),panel.border=element_blank(),panel.grid.major=element_blank(),
          panel.grid.minor=element_blank(),plot.background=element_blank(), plot.title = element_text(hjust = 0))+
  scale_size_identity()+ ##to change the width of each segment
  #scale_x_date(breaks = scales::date_breaks("1 year"), date_labels = "%b %d") +
    theme(plot.caption = element_text(face= "italic",hjust = 0)) +
    labs(x = "Dates of admission and discharge", y="HASHs", 
         caption="Example of 4 clean trajectories. Colored lines represent different rows in the dataset, but same HASH")
ggplotly(p3_1)

p3<-CONS_C1_df_dup_JUN_2020%>%
  #dplyr::filter(!is.na(diff_bet_treat))%>%
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(sum_validador=sum(diff_bet_treat,na.rm=T), 
                n=n(),
                con_diff_dias=if_else(sum_validador>0,1,0,NA_real_))%>%
  distinct(hash_key,.keep_all=T)%>%
  dplyr::select(n,con_diff_dias)
  
  groupA <- p3 %>% filter(con_diff_dias == 1)
  groupB <- p3 %>% filter(con_diff_dias == 0)
  
#p3<-ggplot(p3,aes(x=n))+
#  geom_histogram_interactive()+
#  facet_wrap(~sin_diff_dias, labeller = as_labeller(c(`0` = "No differences between Treatments in Days", `1` = "Differences Between Treatments in Days")))+
#  sjPlot::theme_sjplot2()+
#    labs(x="",y="Frequencies ", x="No. of cases by user")+
# # xlim(c(2,10))+
#  ylim(c(0,15000))+
#scale_x_continuous(breaks = seq(from = 0, to = 13, by = 1))+
# theme(panel.grid.minor=element_blank(),
#       plot.background=element_blank(),
#        panel.background=element_blank(),
#       panel.border=element_blank(),
#       panel.grid.major=element_blank())+
#  labs(caption="Note. Only selected users with more than 1 case")

tooltip_css <- "background-color:gray;color:white;font-style:italic;padding:10px;border-radius:10px 20px 10px 20px;"

#ggiraph(code = {print(p3)}, tooltip_extra_css = tooltip_css, tooltip_opacity = .75 )

p3 <- plot_ly(alpha = 0.5) %>% 
  add_histogram(x = ~groupA$n,
                name = "Differences Between Treatments in Days") %>% 
    add_histogram(x = ~groupB$n,
                name = "No differences between Treatments in Days",
                marker = list(color = "rgba(150, 150, 150, 0.7)")) %>% 
  layout(barmode = "overlay",
         xaxis = list(title = "No. of cases by user",
                      
                      zeroline = T),
         yaxis = list(title = "Frequency of different users",
                      zeroline = T))%>%
layout(legend = list(orientation = "h",   # show entries horizontally
                     xanchor = "center",  # use center of legend as anchor
                     x = 0.5, y= -.04))  %>%
  config(displayModeBar = FALSE) %>%
layout(hovermode = 'compare')%>%
  layout(
    xaxis = list(
      dtick = 1, 
      tick0 = 1, 
      tickmode = "linear"
    )
  )
p3

Figure 3. Histogram of No. of Treatments depending on Sum of Diff Between Entries (if any)

As seen in Figure 3, users with only one entry make up most of the users with no cumulative days of difference between entries, simply because they had no other treatment. This is why we focused on users with more than one entry. Among them, a small number of users had 2 entries with no days of difference between them, and a much smaller number had 3 entries with no days of difference between them. It is possible that these users had only one treatment, with minor changes from one entry to the next. Conversely, most of the users with days of difference between their entries had 2 cases, and some exceptional users had 3 cases (note that many of these users could have one entry with 0 days of difference from the next, but a third entry separated by more days, leading to a total difference greater than 0). It must be noted that at least one user had 13 entries.
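The grouping behind Figure 3 can be sketched in base R as follows. The toy data are hypothetical; the column names n and con_diff_dias mirror the ones used in the code above, and user "b" illustrates the parenthetical remark: one gap of 0 days plus a later gap of 120 days gives a total difference greater than 0.

```r
# Hypothetical entries: NA marks the last entry of each user
toy <- data.frame(
  hash_key       = c("a", "a", "b", "b", "b", "c"),
  diff_bet_treat = c(0, NA, 0, 120, NA, NA)
)

# Number of entries and cumulative days of difference per user
n_entries <- tapply(toy$diff_bet_treat, toy$hash_key, length)
sum_diff  <- tapply(toy$diff_bet_treat, toy$hash_key, sum, na.rm = TRUE)

per_user <- data.frame(
  hash_key      = names(n_entries),
  n             = as.integer(n_entries),
  con_diff_dias = as.integer(sum_diff > 0)  # 1 = some days of difference
)
per_user

# Cross-tabulation that the histogram displays: entries per user by group
table(n = per_user$n, con_diff_dias = per_user$con_diff_dias)
```

User "c" has a single entry, so the sum of differences is 0 and it falls in the "no differences" group, which is why that group is dominated by users with one entry.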

We also considered it necessary to get an overview of the distribution of changes by days of difference (divided into 20 groups of equal size) and by cause of discharge of the first entry, to get an impression of the main changes occurring between the different entries of each user.
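Splitting the days of difference into 20 groups of equal size can be approximated in base R with quantile and cut. This is a sketch under assumptions: the code on this page uses cut2(g = 20), whose cut points and labels may differ, and the vectors x and y are hypothetical.

```r
# Hypothetical days of difference with no ties
x <- 1:100

# 21 cut points define 20 quantile groups; unique() guards against repeated cut points
breaks_x <- unique(quantile(x, probs = seq(0, 1, length.out = 21)))
groups_x <- cut(x, breaks = breaks_x, include.lowest = TRUE)
table(groups_x)

# With many tied values (for example, many differences of 0 days),
# several quantiles coincide and fewer than 20 groups can be formed
y <- c(rep(0, 50), 1:50)
breaks_y <- unique(quantile(y, probs = seq(0, 1, length.out = 21)))
groups_y <- cut(y, breaks = breaks_y, include.lowest = TRUE)
nlevels(groups_y)
```

The second case is relevant here because entries ending in a referral concentrate at very short differences, so the resulting groups need not contain the same number of entries.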

invisible(c("How many cases with no difference between treatments are there by cause of discharge?"))

p11<-
  CONS_C1_df_dup_JUN_2020%>% 
    dplyr::filter(!is.na(diff_bet_treat), !is.na(motivodeegreso_mod_imp))%>%
    #dplyr::mutate(motivoegreso_derivacion=factor(motivoegreso_derivacion, levels = c("Motivo: Otro", "Motivo: Derivación","NA")))%>%
    
    dplyr::mutate(diff_bet_treat_bar=round(diff_bet_treat,0)) %>%
    dplyr::mutate(diff_bet_treat_bar=cut2(diff_bet_treat_bar, g =20))%>%
    
    dplyr::mutate(grupo_var=factor(obs_cambios_num))%>%
    dplyr::group_by(diff_bet_treat_bar, grupo_var,motivodeegreso_mod_imp)%>%
    summarise(n_2_grupos=n())%>%
    dplyr::ungroup()%>%
    dplyr::group_by(motivodeegreso_mod_imp,diff_bet_treat_bar)%>%
    dplyr::mutate(total_n=sum(n_2_grupos))%>%
    dplyr::mutate(freq = (n_2_grupos / total_n))%>%
    dplyr::ungroup()%>%
    dplyr::group_by(motivodeegreso_mod_imp)%>%
    dplyr::mutate(total_n_solo_mot_egreso=sum(n_2_grupos))%>%
    dplyr::mutate(text=paste('% Cause Discharge by No. Days: ', scales::percent(freq,accuracy =0.01), '<br>', #formatC(positivos_acumulados, format="f", big.mark=",", digits=0)
                            'Cause of Discharge: ', motivodeegreso_mod_imp , '<br>',
                            'Total Frequency by Days: ', total_n , '<br>',
                            'Frequency of Cause of Discharge by No. Days: ',n_2_grupos, '<br>',
                            'No. Days:',diff_bet_treat_bar))%>%

    ggplot2::ggplot(aes(x = diff_bet_treat_bar, y = n_2_grupos,fill=grupo_var,text=text))+
    geom_bar(stat='identity', alpha=.8) + 
    scale_x_discrete()+
    scale_fill_manual(name= "No. of changes",values=c("cornsilk3", "lightskyblue2", "#56B4E9", "steelblue","slategray4")) +
    theme_bw()+
    labs(x="",y="", fill="No. of changes")+
    #ylim(0,101)+
    #scale_y_continuous(limits=c(0,1),labels = scales::percent) +
    theme(legend.position="bottom")+
    guides(fill=guide_legend(title="No. of changes",ncol=5))+
    theme(legend.text = element_text(size=9))+
    theme(panel.grid.minor = element_blank(), 
          panel.grid.major = element_blank(), 
          panel.grid.major.x = element_blank(),
          panel.background = element_blank(),
          axis.title.x = element_blank())+
    theme(axis.text.x = element_text(vjust = 0.5,hjust = 0.5,angle = 90, size= 6.5), plot.caption= element_text(hjust=0))+
    facet_wrap(~motivodeegreso_mod_imp, ncol=3, labeller = as_labeller(c(`Abandono Tardio` = "Late Withdrawal", `Abandono Temprano` = "Early Withdrawal",`Alta Administrativa` = "Administrative Discharge",`Alta Terapéutica` = "Therapeutic Discharge",`Derivación` = "Referral")), strip.position = "right")+ 
    geom_vline(aes(xintercept = 45), 
               linetype = "dashed", colour = "red",size = 1)+
    labs(fill="No. of changes",caption=paste0("Note. ",CONS_C1_df_dup_JUN_2020%>% dplyr::filter(is.na(diff_bet_treat))%>% nrow() %>% formatC(big.mark = ",")," obs. had missing data that corresponded to unique treatments by users;\nDays of treatment were divided into 20 equal parts"))+
    theme(strip.background =element_rect(colour=NA,fill=NA, size=3.5))+
    theme(strip.text = element_text(colour = 'gray60', size=8),
          plot.caption= element_text(size=7))+
    theme(legend.title = element_text(colour = 'gray30', size=8))+
  theme(
    strip.text.x = element_text(margin = margin(10, 0, 10, 0))
  ) 
ggplotly(p11, tooltip = "text")%>%
    layout(legend = list(title= list(text = "Changes in SENDA,Center,Program or Plan"),
                         orientation = "h",   # show entries horizontally
                         xanchor = "center",  # use center of legend as anchor
                         x = 0.5, y=-0.09)) %>%
  config(displayModeBar = FALSE) %>%
layout(hovermode = 'compare')

Figure 4. Distribution of No. of Changes by Sum of Diff Between Entries Depending on Cause of Discharge of the first entry

The Figure above adds support to the fact that referrals had a large number of posterior entries compared to the other causes of discharge. Among the entries with a referral as the cause of discharge, the first five bars contained most of the entries that had a posterior one. Notably, these entries underwent several changes in each following treatment (around 2 or 3).
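The split of the day differences into 20 groups can be illustrated with a small sketch. It uses base R's `quantile()` and `cut()` on simulated values instead of `Hmisc::cut2()`, so the interval labels will not match those of the figure; `days_diff` and `n_groups` are hypothetical names introduced only for this example.

```r
# Simulated day differences between consecutive entries (not real data)
set.seed(1)
days_diff <- round(rexp(500, rate = 1 / 60))

n_groups <- 20
# Quantile cut points; unique() drops repeated breaks caused by ties (e.g., many zeros)
breaks <- unique(quantile(days_diff, probs = seq(0, 1, length.out = n_groups + 1)))
days_bin <- cut(days_diff, breaks = breaks, include.lowest = TRUE)

# Bins hold roughly the same number of entries, ties permitting
table(days_bin)
```

Because tied values can collapse neighboring cut points, the number of resulting bins may be lower than the number requested.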

Collapse continuous or almost continuous entries into treatments

We decided to collapse the different entries into a single treatment. This required us to adopt different strategies to collapse variable values of different types and characteristics.

##f     #Longest treatment //#g- Replaced/favored dgs.-a // #h     #Sum values x_se_trata_mujer_emb_n  usuario_tribunal_trat_droga_n discapacidad_n ha_estado_embarazada_egreso_n tiene_menores_de_edad_a_cargo_n
disc_lab<- paste0('* Some variables were transformed in different formats)')
#via_adm_sus_prin sus_ini origen_ingreso
DiagrammeR::grViz(
  "digraph structs {
    node [shape=record];
    struct [label='<f1> Wide format(a)|<f2> Maximum/Last value(b)|<f3> Minimum/First value(c)|<f4> Kept more vulnerable category(d)|<f5> Same value(e)|<f6>Largest treatment(f)|<f8> Favored dgs.-a(g)|<f12> Sum values(h)'];
    struct_f1 [label='{row|nombre_centro|tipo_centro|servicio_de_salud|senda|id_centro|obs*}'];
    struct_f2 [label='{numero_de_hijos_mod**|num_hijos_trat_res_mod**|tipo_centro_derivacion|fech_egres_imp|motivodeegreso_mod_imp|macrozona|nombre_region|comuna_residencia_cod|identidad_de_genero**|tipo_de_plan_2*|id_centro*|ano_bd*}'];
    struct_f3 [label='{fech_ing|fecha_ingreso_a_convenio_senda|edad_al_ing|origen_ingreso_mod|embarazo|edad_al_ing_grupos|ano_bd*}'];
    struct_f4 [label='{escolaridad|compromiso_biopsicosocial|dg_global_nec_int_soc_or|dg_nec_int_soc_cap_hum_or|dg_nec_int_soc_cap_fis_or|dg_nec_int_soc_cap_soc_or|evaluacindelprocesoteraputico|eva_consumo|eva_fam|eva_relinterp|eva_ocupacion|eva_sm|eva_fisica|eva_transgnorma|dg_trs_psiq_cie_10_egres_or|dg_global_nec_int_soc_or_1|dg_nec_int_soc_cap_hum_or_1|dg_nec_int_soc_cap_fis_or_1|dg_nec_int_soc_cap_soc_or_1|tiene_menores_de_edad_a_cargo|x_se_trata_mujer_emb|usuario_tribunal_trat_droga|ha_estado_embarazada_egreso|dg_trs_cons_sus_or|opcion_discapacidad}']; 
    struct_f5 [label='{hash_key|id|hash_rut_completo|nacionalidad|sexo_2|id_mod|obs*|fech_nac|edad_ini_cons|edad_ini_sus_prin|estado_conyugal_2|edad_grupos|etnia_cor|nacionalidad_2|etnia_cor_2|sus_ini_mod_2|sus_ini_mod_3|sus_ini_mod|at_least_one_cont_entry}'];
    struct_f6 [label='{con_quien_vive|tipo_de_plan_2*|estatus_ocupacional|cat_ocupacional|tipo_de_vivienda_mod|tenencia_de_la_vivienda_mod|rubro_trabaja_mod|sus_principal_mod|freq_cons_sus_prin|via_adm_sus_prin_act|otras_sus1_mod|otras_sus2_mod|otras_sus3_mod|tipo_de_programa_2}'];
    struct_f8 [label='{dg_trs_psiq_dsm_iv_or|dg_trs_psiq_sub_dsm_iv_or|x2_dg_trs_psiq_dsm_iv_or|x2_dg_trs_psiq_sub_dsm_iv_or|x3_dg_trs_psiq_dsm_iv_or|x3_dg_trs_psiq_sub_dsm_iv_or|dg_trs_psiq_cie_10_or|dg_trs_psiq_sub_cie_10_or|x2_dg_trs_psiq_cie_10_or|x2_dg_trs_psiq_sub_cie_10_or|x3_dg_trs_psiq_cie_10_or|x3_dg_trs_psiq_sub_cie_10_or|diagnostico_trs_fisico|otros_probl_at_sm_or}'];
    g [label = '* Some variables were transformed in different formats);** If not available, replaced with the last available', width = 0.001, height = 0.001, color=White];
    struct_f12 [label='{dias_trat_imp|dias_trat_inv}']; 
    struct:f1-> struct_f1;
    struct:f2-> struct_f2;
    struct:f3-> struct_f3;
    struct:f4-> struct_f4;
    struct:f5-> struct_f5;
    struct:f6-> struct_f6;
    struct:f8-> struct_f8;
    struct:f12-> struct_f12;
    struct_f12 -> g [ dir = none,  color = 'white',fontcolor = white,shape=none, width=0, height=0];
  }")

Figure 5. Criteria to Transform Variables

  #width=14, height=7)

We generated a subset of entries that had a referral as the cause of discharge and less than 45 days of difference with a posterior entry. We also included that posterior entry, whose values would replace most of those of the initial and intermediate entries.

Once we subsetted the entries, we found that many users had more than one entry that can be considered as part of a treatment.
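A minimal sketch of this selection rule on a toy data frame may help. The column names `user`, `days_to_next` and `discharge` are hypothetical stand-ins for `hash_key`, `diff_bet_treat` and `motivoegreso_derivacion`, and the rows are assumed to be sorted from the oldest to the newest entry within each user, which is not necessarily the ordering used in this report.

```r
# Toy entries, sorted by user and admission date (hypothetical data)
entries <- data.frame(
  row          = 1:6,
  user         = c("a", "a", "a", "b", "b", "c"),
  days_to_next = c(3, 200, NA, 10, NA, NA),  # gap until the user's next admission
  discharge    = c("Referral", "Referral", "Other", "Other", "Referral", "Referral"),
  stringsAsFactors = FALSE
)

# Candidate to be absorbed: short gap AND discharged by referral
entries$absorb <- with(entries, !is.na(days_to_next) & days_to_next < 45 &
                                discharge == "Referral")

# Row number of the following entry of the same user (NA for the last one)
entries$next_row <- ave(entries$row, entries$user,
                        FUN = function(r) c(r[-1], NA))

# Keep the candidates plus the entries that follow them
keep <- with(entries, row %in% c(row[absorb], next_row[absorb]))
entries[keep, ]
```

In this toy example only the first entry of user "a" meets both conditions, so the subset contains that entry and the one that follows it.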

invisible(c("2. The problem is that the filters only select the intermediate cases as candidates for the transformation, but there are cases at the end that will not meet the conditions. This is problematic for cases that have more than one intermediate entry"))
invisible(c("3. I think I can identify them using a subsequent lag"))

invisible(c("1. If it is less than 45 days and is a referral, select the rows that could be absorbed, plus the following treatment"))
CONS_C1_JUN_2020_row_sig_row<-
  CONS_C1_df_dup_JUN_2020%>%
      #dplyr::filter(!is.na(diff_bet_treat))%>% #31,609; there are no NAs in referral, those that are NA are 0.
      dplyr::mutate(filter_complex= dplyr::case_when(!is.na(diff_bet_treat) & diff_bet_treat<45 & as.character(motivoegreso_derivacion)=="Referral"~1,TRUE~0))%>%
      dplyr::arrange(hash_key)%>%
      dplyr::group_by(hash_key)%>%
      dplyr::mutate(sig_row=lag(row), sig_fech_egres_imp=lag(fech_egres_imp),sig_motivoegres_ref=lag(motivoegreso_derivacion),n_por_hash=n())%>%
      ungroup()%>%
      dplyr::filter(filter_complex==1)%>%
      # dplyr::select(row,hash_key,fech_ing,fech_egres_imp,motivoegreso_derivacion,obs_cambios,diff_bet_treat,sig_row,n_por_hash) %>% View() #0007678b8b35fa0961d1e8110fbf9620 
      dplyr::select(row,sig_row)

ungroup: no grouping variables

#6,809*2 =   13,618

  CONS_C1_df_dup_JUN_2020%>%
    dplyr::filter(row %in% unlist(c(CONS_C1_JUN_2020_row_sig_row$row,CONS_C1_JUN_2020_row_sig_row$sig_row)))%>%
    dplyr::arrange(hash_key,desc(fech_ing))%>%
    dplyr::mutate(filter_complex_anterior= dplyr::case_when(!is.na(lag(diff_bet_treat)) & lag(diff_bet_treat)<45 & as.character(lag(motivoegreso_derivacion))=="Referral"~1,TRUE~0))%>%
    dplyr::mutate(filter_complex= dplyr::case_when(!is.na(diff_bet_treat) & diff_bet_treat<45 & as.character(motivoegreso_derivacion)=="Referral"~1,TRUE~0))%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(sum_validadores=sum(filter_complex))%>%
    dplyr::mutate(n_complex = str_count(filter_complex, '0'))%>%
    ungroup()%>%
    dplyr::mutate(cumsum_n_complex = cumsum(n_complex))%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(n_dist=n_distinct(cumsum_n_complex))%>%
    dplyr::filter(n_dist>3)%>%
    dplyr::select(hash_key,fech_ing,fech_egres_imp,motivodeegreso_mod_imp,diff_bet_treat,filter_complex,filter_complex_anterior,n_dist,cumsum_n_complex)%>%
    
    knitr::kable(format= "html", format.args= list(decimal.mark= ".", big.mark= ","),
                 caption="Table 3. Groups w/ more than 3 distinct groups of cases that fulfilled the conditions to categorize as continuous treatments by each user",
                 align= c("l",rep('c', 5)), col.names = c("User","Date of Admission","Date of Discharge", "Cause of Discharge", "Diff Between Treatments","<45 & Referral","<45 & Referral of Previous Treatment","No. groups within users","ID of groups"))%>%
    kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size= 8)%>%
    kableExtra::kable_styling(full_width = F)%>%
          kableExtra::add_footnote(paste0("Note= Cases with an entry that had a referral as a cause of discharge and <45 days of difference with the following entry (n= ", CONS_C1_JUN_2020_row_sig_row%>% nrow()%>% formatC(big.mark=","),")"), notation = "none")

ungroup: no grouping variables

Table 3. Groups w/ more than 3 distinct groups of cases that fulfilled the conditions to categorize as continuous treatments by each user
User	Date of Admission	Date of Discharge	Cause of Discharge	Diff Between Treatments	<45 & Referral	<45 & Referral of Previous Treatment	No. groups within users	ID of groups
0f4aa2f78fa5da961404e6e5389ad76c	2017-04-03	2017-04-08	Abandono Temprano	2	0	1	4	352
0f4aa2f78fa5da961404e6e5389ad76c	2017-01-31	2017-03-29	Derivación	5	1	0	4	352
0f4aa2f78fa5da961404e6e5389ad76c	2016-05-27	2016-08-12	Abandono Temprano	172	0	1	4	353
0f4aa2f78fa5da961404e6e5389ad76c	2015-10-19	2016-05-23	Derivación	4	1	0	4	353
0f4aa2f78fa5da961404e6e5389ad76c	2015-07-14	2015-10-16	Derivación	3	1	1	4	353
0f4aa2f78fa5da961404e6e5389ad76c	2014-07-23	2014-08-11	Abandono Temprano	337	0	1	4	354
0f4aa2f78fa5da961404e6e5389ad76c	2014-07-08	2014-07-21	Derivación	2	1	0	4	354
0f4aa2f78fa5da961404e6e5389ad76c	2014-06-05	2014-07-01	Alta Terapéutica	7	0	1	4	355
0f4aa2f78fa5da961404e6e5389ad76c	2013-11-04	2014-06-05	Derivación	0	1	0	4	355
1173f19959cadd5542a584ab94ca87b7	2017-11-22	2018-05-08	Alta Administrativa	155	0	1	4	418
1173f19959cadd5542a584ab94ca87b7	2017-07-10	2017-11-21	Derivación	1	1	0	4	418
1173f19959cadd5542a584ab94ca87b7	2014-07-07	2016-04-13	Alta Terapéutica	453	0	1	4	419
1173f19959cadd5542a584ab94ca87b7	2014-04-04	2014-07-07	Derivación	0	1	0	4	419
1173f19959cadd5542a584ab94ca87b7	2013-04-26	2013-08-01	Derivación	246	0	1	4	420
1173f19959cadd5542a584ab94ca87b7	2013-01-23	2013-04-01	Derivación	25	1	0	4	420
1173f19959cadd5542a584ab94ca87b7	2012-04-02	2013-01-21	Derivación	2	1	1	4	420
1173f19959cadd5542a584ab94ca87b7	2012-02-15	2012-04-02	Abandono Temprano	0	0	1	4	421
1173f19959cadd5542a584ab94ca87b7	2011-09-22	2012-02-13	Derivación	2	1	0	4	421
25c36b6820ac514094c458ba22918452	2017-08-02	2018-02-19	Alta Terapéutica	194	0	1	4	940
25c36b6820ac514094c458ba22918452	2017-05-03	2017-08-01	Derivación	1	1	0	4	940
25c36b6820ac514094c458ba22918452	2016-08-18	2017-04-28	Alta Terapéutica	5	0	1	4	941
25c36b6820ac514094c458ba22918452	2016-05-27	2016-07-19	Derivación	30	1	0	4	941
25c36b6820ac514094c458ba22918452	2016-03-04	2016-05-22	Abandono Temprano	5	0	1	4	942
25c36b6820ac514094c458ba22918452	2016-01-04	2016-03-03	Derivación	1	1	0	4	942
25c36b6820ac514094c458ba22918452	2015-09-24	2015-12-31	Derivación	4	1	1	4	942
25c36b6820ac514094c458ba22918452	2011-01-31	2011-12-30	Alta Administrativa	279	0	1	4	943
25c36b6820ac514094c458ba22918452	2010-08-20	2010-12-22	Derivación	40	1	0	4	943
c81df65dbf73521d91ff7c65a3c7ceba	2015-04-13	2015-06-01	Abandono Temprano	36	0	1	4	4,783
c81df65dbf73521d91ff7c65a3c7ceba	2014-06-30	2015-04-07	Derivación	6	1	0	4	4,783
c81df65dbf73521d91ff7c65a3c7ceba	2013-12-17	2014-03-01	Abandono Temprano	80	0	1	4	4,784
c81df65dbf73521d91ff7c65a3c7ceba	2013-08-26	2013-12-17	Derivación	0	1	0	4	4,784
c81df65dbf73521d91ff7c65a3c7ceba	2013-03-22	2013-08-01	Abandono Tardio	25	0	1	4	4,785
c81df65dbf73521d91ff7c65a3c7ceba	2013-01-02	2013-03-11	Derivación	11	1	0	4	4,785
c81df65dbf73521d91ff7c65a3c7ceba	2012-08-20	2012-09-10	Abandono Temprano	114	0	1	4	4,786
c81df65dbf73521d91ff7c65a3c7ceba	2012-07-02	2012-08-18	Derivación	2	1	0	4	4,786
d375edb930e2d3f4517f2307200b1cf2	2019-09-03	NA	NA	NA	0	1	4	5,044
d375edb930e2d3f4517f2307200b1cf2	2019-07-15	2019-08-09	Derivación	25	1	0	4	5,044
d375edb930e2d3f4517f2307200b1cf2	2019-03-05	2019-03-29	Derivación	108	0	1	4	5,045
d375edb930e2d3f4517f2307200b1cf2	2018-12-27	2019-03-01	Derivación	4	1	0	4	5,045
d375edb930e2d3f4517f2307200b1cf2	2017-11-28	2018-01-08	Derivación	353	0	1	4	5,046
d375edb930e2d3f4517f2307200b1cf2	2017-04-17	2017-11-27	Derivación	1	1	0	4	5,046
d375edb930e2d3f4517f2307200b1cf2	2013-09-24	2013-10-18	Abandono Temprano	648	0	1	4	5,047
d375edb930e2d3f4517f2307200b1cf2	2013-06-13	2013-08-30	Derivación	25	1	0	4	5,047
e81d886539caa4e9b2527984cacd7ec0	2018-02-08	2018-04-09	Derivación	NA	0	1	4	5,549
e81d886539caa4e9b2527984cacd7ec0	2017-10-10	2018-02-01	Derivación	7	1	0	4	5,549
e81d886539caa4e9b2527984cacd7ec0	2016-08-16	2017-01-01	Abandono Tardio	282	0	1	4	5,550
e81d886539caa4e9b2527984cacd7ec0	2016-03-14	2016-08-10	Derivación	6	1	0	4	5,550
e81d886539caa4e9b2527984cacd7ec0	2016-01-19	2016-03-10	Derivación	4	1	1	4	5,550
e81d886539caa4e9b2527984cacd7ec0	2015-02-24	2015-06-02	Alta Terapéutica	231	0	1	4	5,551
e81d886539caa4e9b2527984cacd7ec0	2015-01-12	2015-02-23	Derivación	1	1	0	4	5,551
e81d886539caa4e9b2527984cacd7ec0	2013-11-28	2014-03-13	Derivación	305	0	1	4	5,552
e81d886539caa4e9b2527984cacd7ec0	2013-10-02	2013-11-22	Derivación	6	1	0	4	5,552
fba24f5affb5795f58a61bed2019722a	2015-11-02	2016-05-02	Alta Administrativa	NA	0	1	4	6,025
fba24f5affb5795f58a61bed2019722a	2015-07-01	2015-11-01	Derivación	1	1	0	4	6,025
fba24f5affb5795f58a61bed2019722a	2014-07-23	2015-07-01	Derivación	0	1	1	4	6,025
fba24f5affb5795f58a61bed2019722a	2013-10-21	2014-01-30	Derivación	67	0	1	4	6,026
fba24f5affb5795f58a61bed2019722a	2013-07-30	2013-10-18	Derivación	3	1	0	4	6,026
fba24f5affb5795f58a61bed2019722a	2013-04-09	2013-07-29	Alta Administrativa	1	0	1	4	6,027
fba24f5affb5795f58a61bed2019722a	2012-07-04	2013-04-09	Derivación	0	1	0	4	6,027
fba24f5affb5795f58a61bed2019722a	2012-05-01	2012-07-02	Derivación	2	1	1	4	6,027
fba24f5affb5795f58a61bed2019722a	2012-02-08	2012-02-24	Abandono Temprano	67	0	1	4	6,028
fba24f5affb5795f58a61bed2019722a	2011-07-14	2012-01-31	Derivación	8	1	0	4	6,028
Note= Cases with an entry that had a referral as a cause of discharge and <45 days of difference with the following entry (n= 6,809)

  #%>%
  #  kableExtra::scroll_box(width= "100%", height = "30%")

We applied these criteria to all of the entries that shared common records that could be considered as a part of a continuous treatment.
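The way consecutive entries are grouped into a single treatment can be sketched with a cumulative sum. The example below borrows the admission dates and flags of the first user in Table 3, with rows ordered from the newest to the oldest entry; `absorb`, `treatment_id` and `treatment_key` are hypothetical names, and the sketch is a simplified illustration rather than the exact pipeline of this report.

```r
# Entries of one user, newest first (dates and flags taken from Table 3)
entries <- data.frame(
  user     = "a",
  admitted = as.Date(c("2017-04-03", "2017-01-31", "2016-05-27",
                       "2015-10-19", "2015-07-14")),
  absorb   = c(FALSE, TRUE, FALSE, TRUE, TRUE),  # TRUE: <45 days and referral
  stringsAsFactors = FALSE
)

# An entry that is NOT absorbed closes a treatment. Because rows run from newest
# to oldest, it also opens a new group, and the absorbed rows after it join it.
entries$treatment_id  <- cumsum(!entries$absorb)
entries$treatment_key <- paste0(entries$user, "_", entries$treatment_id)

entries[, c("admitted", "absorb", "treatment_key")]
```

The first two rows fall into one group and the remaining three into another, which mirrors the pattern of the "ID of groups" column for that user.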

For variables such as the primary substance, other substances and educational attainment, we ordered the categories in terms of vulnerability and kept the most vulnerable category among the collapsed entries. For variables without a clear hierarchy of vulnerability, we kept the values of the entry with the longest treatment. For entries still in treatment, days of treatment were counted up to the date of retrieval. When more than one entry shared the maximum number of days of treatment, the ranking chose one of the tied rows at random.
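Both rules can be illustrated on a toy data frame. This is a base R sketch under stated assumptions: `treatment`, `commitment`, `days` and `occupation` are hypothetical columns, the data are made up, and the guard that returns `NA` for groups with no valid value is an addition of this example.

```r
# Toy entries belonging to two collapsed treatments (hypothetical data)
entries <- data.frame(
  treatment  = c("a_1", "a_1", "a_1", "b_2", "b_2"),
  commitment = c("Leve", "Severo", NA, "Moderado", "Leve"),
  days       = c(30, 120, 120, 15, 60),
  occupation = c("Employed", "Unemployed", "Inactive", "Employed", "Unemployed"),
  stringsAsFactors = FALSE
)

# Most vulnerable category: code levels so that larger means more vulnerable,
# then take the maximum within each treatment
severity <- c(Leve = 1, Moderado = 2, Severo = 3)
entries$commitment_num <- unname(severity[entries$commitment])
entries$most_vulnerable <- ave(
  entries$commitment_num, entries$treatment,
  FUN = function(x) if (all(is.na(x))) NA else max(x, na.rm = TRUE)
)

# Longest treatment: rank entries by days, longest first; ties broken at random
set.seed(1234)
entries$rank_days <- ave(-entries$days, entries$treatment,
                         FUN = function(x) rank(x, ties.method = "random"))
longest <- entries[entries$rank_days == 1, c("treatment", "occupation", "days")]
longest
```

Treatment "a_1" has two entries tied at 120 days, so which of the two occupations is kept depends on the random seed; fixing the seed makes the choice reproducible.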

For the type of plan, we kept two variables: one with the last plan and another with the plan of the longest entry.

For other substances at admission, we generated five variables (otras_sus1_mod, otras_sus2_mod, otras_sus3_mod, sus_ini_2_mod and sus_ini_3_mod) that keep only the main substances (Alcohol, Cocaína, Marihuana and Pasta Base) and recode any other substance as "Otros" (others).
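As a rough illustration of this recoding, the sketch below applies the same rule to a made-up vector with base R's `ifelse()`. The values "Tranquilizantes" and "Inhalables" are hypothetical examples of less frequent substances, and missing values are left as missing.

```r
# Hypothetical values of a secondary-substance column
otras_sus1 <- c("Alcohol", "Cocaína", "Tranquilizantes", "Marihuana",
                "Pasta Base", "Inhalables", NA)

main_substances <- c("Alcohol", "Cocaína", "Marihuana", "Pasta Base")

# Keep main substances (and NAs) as they are; recode everything else as "Otros"
otras_sus1_mod <- ifelse(is.na(otras_sus1) | otras_sus1 %in% main_substances,
                         otras_sus1, "Otros")
otras_sus1_mod
```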

invisible(c("NOTE: I SET EVAL=F"))
invisible(c("4. Examples of cases with distinct continuous treatments"))
   # dplyr::filter(sum_validadores>1) #ffd3f4ed5841cfac947ce546757b8e3f is a case with a couple of days that could be collapsed: discharge date 2014-11-27 and then an admission on 2015-03-27 // the same with 015ea90c1b1655155f30a3e276436ed5, from 2009-09-25 to 2012-02-29 and then from 2018-07-03 to 2019-08-30
#12,945; possibly there are other pairs that overlap
invisible(c("5. Some cases have up to 7 continuous treatments (?); odd, but I checked and it is correct"))
#dd0d42261d00273d4e19ff2a46bda4b9_5276 dd0d42261d00273d4e19ff2a46bda4b9; there may be up to 7 continuous treatments (?)
toString2<-
function (x, width = NULL, ...) 
        {
            string <- paste(x, collapse = "; ")
            if (missing(width) || is.null(width) || width == 0) 
                return(string)
            if (width < 0) 
                stop("'width' must be positive")
            if (nchar(string, type = "w") > width) {
                width <- max(6, width)
                string <- paste0(strtrim(string, width - 4), "....")
            }
            string
}

CONS_C1_df_dup_JUN_2020%>%
#FILTER VARIABLES
#_#_#_#_#_#_#_
    dplyr::filter(row %in% unlist(c(CONS_C1_JUN_2020_row_sig_row$row,CONS_C1_JUN_2020_row_sig_row$sig_row)))%>%
    dplyr::mutate(ano_bd2=ano_bd)%>%
    dplyr::arrange(hash_key,desc(fech_ing))%>%
    dplyr::mutate(filter_complex_anterior= dplyr::case_when(!is.na(lag(diff_bet_treat)) & lag(diff_bet_treat)<45 & as.character(lag(motivoegreso_derivacion))=="Referral"~1,TRUE~0))%>%
    dplyr::mutate(filter_complex= dplyr::case_when(!is.na(diff_bet_treat) & diff_bet_treat<45 & as.character(motivoegreso_derivacion)=="Referral"~1,TRUE~0))%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(sum_validadores=sum(filter_complex))%>%
    dplyr::mutate(n_complex = str_count(filter_complex, '0'))%>%
    ungroup()%>%
    dplyr::mutate(cumsum_n_complex = cumsum(n_complex))%>%
    dplyr::mutate(concat_hash_id_treatments=paste0(hash_key,"_",cumsum_n_complex))%>%
    dplyr::group_by(concat_hash_id_treatments)%>%
    dplyr::mutate(rn_common_treats=row_number())%>% 
    dplyr::mutate(rn_common_treats2=row_number())%>% 
    ungroup()%>%
    dplyr::mutate(tipo_de_plan_2_for_f=tipo_de_plan_2)%>%
    dplyr::mutate(mod_0_row=row)%>%
    dplyr::mutate(obs_for_e=obs)%>%
#_#_#_#_#_#_#_#_#_#_
#a
#_#_#_#_#_#_#_#_#_#_
    #dplyr::filter(hash_key=="dd0d42261d00273d4e19ff2a46bda4b9")%>% dplyr::select(row,hash_key, concat_hash_id_treatments, fech_ing, fech_egres_imp, diff_bet_treat)%>% View()
##IMPORTANT: TO SEE DISTINCT TYPES
    #dplyr::mutate(n_dist=n_distinct(cumsum_n_complex))%>%
    #dplyr::filter(n_dist>3)%>%
    #dplyr::select(hash_key,fech_ing,fech_egres_imp,motivodeegreso_mod_imp,diff_bet_treat,filter_complex,filter_complex_anterior,n_dist,cumsum_n_complex)%>%
                tidyr::pivot_wider(names_from =  rn_common_treats, 
                                   names_sep="_",
                                   values_from = c(row, tipo_centro, servicio_de_salud, senda,id_centro, tipo_de_plan_2,obs))%>%
  dplyr::group_by(concat_hash_id_treatments)%>%
  dplyr::mutate_at(vars(row_1:obs_7),~max(as.character(.),na.rm=T))%>%
  dplyr::ungroup()%>%
  unite(., col = "mod_a_row",  row_1:row_7, na.rm=TRUE, sep = "; ")%>%
  unite(., col = "mod_a_tipo_centro",  tipo_centro_1:tipo_centro_7, na.rm=TRUE, sep = "; ")%>%
  unite(., col = "mod_a_servicio_de_salud",  servicio_de_salud_1:servicio_de_salud_7, na.rm=TRUE, sep = "; ")%>%
  unite(., col = "mod_a_senda",  senda_1:senda_7, na.rm=TRUE, sep = "; ")%>%
  unite(., col = "mod_a_id_centro",  id_centro_1:id_centro_7, na.rm=TRUE, sep = "; ")%>%
  unite(., col = "mod_a_tipo_de_plan_2",  tipo_de_plan_2_1:tipo_de_plan_2_7, na.rm=TRUE, sep = "; ")%>%
  unite(., col = "mod_a_obs",  obs_1:obs_7, na.rm=TRUE, sep = "; ")%>%
  dplyr::mutate(mod_a_obs=sub("^;", "", mod_a_obs))%>%
  tidyr::separate(mod_a_obs,into=paste0("obs",1:30), sep=";")%>%
  dplyr::mutate(across(c(obs1:obs30),~stringr::str_trim(.)))%>%
  dplyr::mutate(mod_a_obs = pmap_chr(select(.,obs1:obs30), ~toString2(unique(na.omit(c(...))))))%>%
    
  # assign with `=` so that each cleaned string overwrites mod_a_obs instead of creating an unnamed extra column
  dplyr::mutate(mod_a_obs=sub("^; ; ", "", mod_a_obs))%>%
  dplyr::mutate(mod_a_obs=sub("^; ", "", mod_a_obs))%>%
  dplyr::mutate(mod_a_obs=sub("^;", "", mod_a_obs))%>% 
  dplyr::mutate(mod_a_obs=sub("; ; $", "", mod_a_obs))%>%
  dplyr::mutate(mod_a_obs=sub("; $", "", mod_a_obs))%>%
  dplyr::mutate(mod_a_obs=sub(";$", "", mod_a_obs))%>%
  dplyr::mutate(mod_a_obs=sub("; ; ", "; ", mod_a_obs))%>%
  dplyr::mutate(mod_a_obs=sub("; ;", "; ", mod_a_obs))%>%
  
  dplyr::mutate(mod_b_tipo_de_plan_2=sub(".*\\;","",mod_a_tipo_de_plan_2))%>% #last treatment
  dplyr::mutate(mod_b_id_centro=sub(".*\\;","",mod_a_id_centro))%>% #last treatment
    #dplyr::mutate(across(c(mod_a_senda),~str_count(., pattern = ";"),.names="{col}_cnt"))%>% dplyr::select(mod_a_senda_cnt)%>% summary()
  #what did senda do with the concatenated values? they are not there
#_#_#_#_#_#_#_#_#_#_
#b Last value
#_#_#_#_#_#_#_#_#_#_
  dplyr::group_by(concat_hash_id_treatments)%>%
  dplyr::mutate(n_concat_hash_id_treatments=n())%>%
  dplyr::mutate(fech_egres_imp=as.character(fech_egres_imp))%>%
  dplyr::mutate(fech_egres_imp=ifelse(n_concat_hash_id_treatments==1 & is.na(fech_egres_imp),"2019-11-13",fech_egres_imp))%>%
  dplyr::mutate(motivodeegreso_mod_imp=ifelse(n_concat_hash_id_treatments==1 & is.na(motivodeegreso_mod_imp),"En curso",as.character(motivodeegreso_mod_imp)))%>%
  dplyr::mutate(across(c(numero_de_hijos_mod, num_hijos_trat_res_mod,identidad_de_genero),~dplyr::first(na.omit(.)),.names = "mod_b_{col}"))%>%
  dplyr::mutate(across(c(tipo_centro_derivacion, motivodeegreso_mod_imp, fech_egres_imp, macrozona, nombre_region, nombre_centro, comuna_residencia_cod,ano_bd),~dplyr::first(.),.names = "mod_b_{col}"))%>%
  
  #dplyr::select(hash_key,concat_hash_id_treatments,  n_concat_hash_id_treatments,numero_de_hijos, num_hijos_ing_trat_res, fech_egres_imp, motivodeegreso_mod_imp, macrozona, nombre_region, comuna_residencia_cod,identidad_de_genero,starts_with("mod_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>%
   #dplyr::filter(hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>%View() #CASES WITH 2 TREATMENTS ARE INDEED FILLED IN.
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#c First value
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::ungroup()%>%
  dplyr::mutate(fech_ing=as.character(fech_ing))%>%
  dplyr::group_by(concat_hash_id_treatments)%>%
  dplyr::mutate(across(c(fech_ing, fecha_ingreso_a_convenio_senda, embarazo, edad_al_ing, origen_ingreso_mod, edad_al_ing_grupos,ano_bd2),~dplyr::last(na.omit(.)),.names = "mod_c_{col}"))%>%
  dplyr::ungroup()%>%
  assign("CONS_C1_df_dup_JUN_2020_a_c",., envir = .GlobalEnv)
    #dplyr::select(hash_key,concat_hash_id_treatments,  n_concat_hash_id_treatments,fech_ing, fecha_ingreso_a_convenio_senda, edad_al_ing, origen_ingreso_mod, edad_al_ing_grupos, starts_with("mod_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>%View()

#CONS_C1_df_dup_JUN_2020_a_c%>% janitor::tabyl(mod_a_obs)%>% dplyr::select(mod_a_obs)%>% data.frame()%>% View()

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#d     #First order by vulnerability, then make the selection
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  
CONS_C1_df_dup_JUN_2020_a_c%>%
  dplyr::mutate(compromiso_biopsicosocial=dplyr::case_when(compromiso_biopsicosocial=="Leve"~1,compromiso_biopsicosocial=="Moderado"~2,compromiso_biopsicosocial=="Severo"~3,TRUE~NA_real_))%>%
  dplyr::mutate(escolaridad=dplyr::case_when(escolaridad=="Mayor a Ed Secundaria"~1,escolaridad=="Ed Secundaria Completa o Menor"~2,escolaridad=="Ed Primaria Completa o Menor"~3,TRUE~NA_real_))%>%
  dplyr::mutate(across(c(dg_global_nec_int_soc_or, dg_nec_int_soc_cap_hum_or, dg_nec_int_soc_cap_fis_or, dg_nec_int_soc_cap_soc_or,dg_global_nec_int_soc_or_1, dg_nec_int_soc_cap_hum_or_1, dg_nec_int_soc_cap_fis_or_1,dg_nec_int_soc_cap_soc_or_1),~dplyr::case_when(as.character(.)=="Bajas"~3,as.character(.)=="Medias"~2,as.character(.)=="Altas"~1,TRUE~NA_real_)))%>%
  dplyr::mutate(across(c(evaluacindelprocesoteraputico, eva_consumo, eva_fam, eva_relinterp, eva_ocupacion, eva_sm, eva_fisica, eva_transgnorma),~dplyr::case_when(as.character(.)=="Logro Mínimo"~3,as.character(.)=="Logro M?mo"~3,as.character(.)=="Logro Intermedio"~2,as.character(.)=="Logro Alto"~1,TRUE~NA_real_)))%>% 
 #dplyr::select(hash_key,concat_hash_id_treatments, n_concat_hash_id_treatments,compromiso_biopsicosocial,escolaridad,dg_global_nec_int_soc_or, dg_nec_int_soc_cap_hum_or, dg_nec_int_soc_cap_fis_or, dg_nec_int_soc_cap_soc_or,dg_global_nec_int_soc_or_1, dg_nec_int_soc_cap_hum_or_1, dg_nec_int_soc_cap_fis_or_1, dg_nec_int_soc_cap_soc_or_1,evaluacindelprocesoteraputico, eva_consumo, eva_fam, eva_relinterp, eva_ocupacion, eva_sm, eva_fisica, eva_transgnorma,starts_with("mod_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>% 

  dplyr::group_by(concat_hash_id_treatments)%>%
  
      dplyr::mutate(across(c(compromiso_biopsicosocial,escolaridad,dg_global_nec_int_soc_or, dg_nec_int_soc_cap_hum_or, dg_nec_int_soc_cap_fis_or, dg_nec_int_soc_cap_soc_or,dg_global_nec_int_soc_or_1, dg_nec_int_soc_cap_hum_or_1, dg_nec_int_soc_cap_fis_or_1, dg_nec_int_soc_cap_soc_or_1,evaluacindelprocesoteraputico, eva_consumo, eva_fam, eva_relinterp, eva_ocupacion, eva_sm, eva_fisica, eva_transgnorma),~max(.,na.rm=T),.names = "mod_d_{col}"))%>%
    
      dplyr::mutate(tiene_menores_de_edad_a_cargo_n=ifelse(as.character(tiene_menores_de_edad_a_cargo)=="si",1,0),tiene_menores_de_edad_a_cargo_n=sum(tiene_menores_de_edad_a_cargo_n,na.rm=T),mod_d_tiene_menores_de_edad_a_cargo=ifelse(mod_b_numero_de_hijos_mod>0 & tiene_menores_de_edad_a_cargo_n>0,"si","no"))%>%
      
      dplyr::mutate(x_se_trata_mujer_emb_n=ifelse(as.character(x_se_trata_mujer_emb)=="Si",1,0),x_se_trata_mujer_emb_n=sum(x_se_trata_mujer_emb_n,na.rm=T),mod_d_x_se_trata_mujer_emb=ifelse(x_se_trata_mujer_emb_n>0,"Si","No"))%>%
      
      dplyr::mutate(usuario_tribunal_trat_droga_n=ifelse(as.character(usuario_tribunal_trat_droga)=="Si",1,0),usuario_tribunal_trat_droga_n=sum(usuario_tribunal_trat_droga_n,na.rm=T),mod_d_usuario_tribunal_trat_droga=ifelse(usuario_tribunal_trat_droga_n>0,"Si","No"))%>%
      
      dplyr::mutate(discapacidad_n=ifelse(as.character(discapacidad)=="si",1,0),discapacidad_n=sum(discapacidad_n,na.rm=T),mod_d_discapacidad=ifelse(discapacidad_n>0,"si","no"))%>%
      dplyr::mutate(ha_estado_embarazada_egreso_n=ifelse(as.character(ha_estado_embarazada_egreso)=="si",1,0),ha_estado_embarazada_egreso_n=sum(ha_estado_embarazada_egreso_n,na.rm=T),mod_d_ha_estado_embarazada_egreso=ifelse(ha_estado_embarazada_egreso_n>0,"si","no"))%>%
      dplyr::mutate(dg_trs_cons_sus_or_n=ifelse(as.character(dg_trs_cons_sus_or)=="Dependencia",1,0),dg_trs_cons_sus_or_n=sum(dg_trs_cons_sus_or_n,na.rm=T),mod_d_dg_trs_cons_sus_or=ifelse(dg_trs_cons_sus_or_n>0,"Dependencia","Consumo Perjudicial"))%>%
      dplyr::mutate(mod_d_opcion_discapacidad=max(as.character(opcion_discapacidad),na.rm=T))%>%
  dplyr::ungroup()%>%
 # dplyr::select(hash_key,concat_hash_id_treatments, n_concat_hash_id_treatments,compromiso_biopsicosocial,escolaridad,dg_global_nec_int_soc_or, dg_nec_int_soc_cap_hum_or, dg_nec_int_soc_cap_fis_or, dg_nec_int_soc_cap_soc_or,dg_global_nec_int_soc_or_1, dg_nec_int_soc_cap_hum_or_1, dg_nec_int_soc_cap_fis_or_1, dg_nec_int_soc_cap_soc_or_1,evaluacindelprocesoteraputico, eva_consumo, eva_fam, eva_relinterp, eva_ocupacion, eva_sm, eva_fisica, eva_transgnorma,tiene_menores_de_edad_a_cargo,x_se_trata_mujer_emb,usuario_tribunal_trat_droga,discapacidad,ha_estado_embarazada_egreso,starts_with("mod_d_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>% 

assign("CONS_C1_df_dup_JUN_2020_a_d",., envir = .GlobalEnv)

#what happens with opcion_discapacidad?
#CONS_C1_df_dup_JUN_2020_a_d%>% janitor::tabyl(mod_d_discapacidad, opcion_discapacidad)
#CONS_C1_df_dup_JUN_2020_a_d%>% dplyr::select(mod_0_row,concat_hash_id_treatments,mod_d_discapacidad, opcion_discapacidad)%>% dplyr::filter(mod_d_discapacidad=="si")%>% View()

#dg_trs_cons_sus_or

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#e     #keep as is; add hash_key
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      

invisible(c("hash_key","id","hash_rut_completo","nacionalidad","sexo_2","id_mod","obs","fech_nac","edad_ini_cons","edad_ini_sus_prin","sus_ini","estado_conyugal_2","edad_grupos","etnia_cor","nacionalidad_2","etnia_cor_2","sus_ini_2","sus_ini_3","sus_ini_mod","obs_cambios","obs_cambios_ninguno","obs_cambios_num","obs_cambios_fac","at_least_one_cont_entry"))

CONS_C1_df_dup_JUN_2020_a_d%>%
      dplyr::mutate(across(c(sus_ini_2, sus_ini_3),~dplyr::case_when(as.character(.)!="Alcohol"&as.character(.)!="Cocaína"&as.character(.)!="Marihuana"&as.character(.)!="Pasta Base"~"Otros",TRUE~as.character(.)),.names = "{col}_mod"))%>%
  dplyr::mutate(across(c(hash_key,id,hash_rut_completo,nacionalidad,sexo_2,id_mod,obs_for_e,fech_nac,edad_ini_cons,edad_ini_sus_prin,sus_ini,estado_conyugal_2,edad_grupos,etnia_cor,nacionalidad_2,etnia_cor_2,sus_ini_2_mod,sus_ini_3_mod,sus_ini_mod,obs_cambios,obs_cambios_ninguno,obs_cambios_num,obs_cambios_fac,at_least_one_cont_entry), ~ .,.names = "mod_e_{col}"))%>%
  dplyr::rename("mod_e_obs"="mod_e_obs_for_e")%>%
assign("CONS_C1_df_dup_JUN_2020_a_e",., envir = .GlobalEnv)
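
The step above recodes infrequent substances and then copies user-level columns under a common prefix. A minimal sketch of that pattern on toy data follows; the data frame `toy_e` and the vector `main_substances` are hypothetical names introduced only for illustration, and the `%in%` test is a shorthand for the chained `!=` comparisons used above.

```r
library(dplyr)

# Hypothetical toy data; only the column names mirror the dataset.
toy_e <- tibble::tibble(
  hash_key  = c("u1", "u1", "u2"),
  sexo_2    = c("Hombre", "Hombre", "Mujer"),
  sus_ini_2 = factor(c("Alcohol", "Tabaco", NA))
)

# Categories kept as they are; everything else becomes "Otros".
main_substances <- c("Alcohol", "Cocaína", "Marihuana", "Pasta Base")

toy_e_out <- toy_e %>%
  # Recode into a new "<col>_mod" column; missing values stay missing.
  mutate(across(c(sus_ini_2),
                ~ case_when(is.na(.) ~ NA_character_,
                            !as.character(.) %in% main_substances ~ "Otros",
                            TRUE ~ as.character(.)),
                .names = "{col}_mod")) %>%
  # Copy the selected columns unchanged under the "mod_e_" prefix.
  mutate(across(c(hash_key, sexo_2, sus_ini_2_mod), ~ ., .names = "mod_e_{col}"))
```

The prefix makes it possible to select all columns produced by a given rule later on with a single regular expression.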

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#f     #Longest treatment
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      

invisible(c("here the hierarchy in terms of vulnerability is unclear. An inactive person may or may not be better off than an unemployed one; a salaried worker"))

set.seed(1234) #for tie-breaking (ties in the ranking are resolved at random)
CONS_C1_df_dup_JUN_2020_a_e%>%
  dplyr::mutate(dias_trat_imp_op= ifelse(is.na(fech_egres_imp),as.integer(lubridate::time_length(difftime(as.Date("2019-11-13"),as.Date(as.character(fech_ing))),"days")),dias_trat_imp))%>%
  dplyr::mutate(across(c(otras_sus1, otras_sus2,otras_sus3),~dplyr::case_when(as.character(.)!="Alcohol"&as.character(.)!="Cocaína"&as.character(.)!="Marihuana"&as.character(.)!="Pasta Base"~"Otros",TRUE~as.character(.)),.names = "{col}_mod"))%>%
  dplyr::mutate(rn_total=row_number())%>%
  
  dplyr::group_by(concat_hash_id_treatments)%>%
#dplyr::mutate(n_days_op_max= max(dias_trat_imp_op,na.rm=T))%>%
  mutate(rank_dias_trat_op  = rank(-dias_trat_imp_op, ties.method = "random"))%>% #negated so that rank 1 is the longest treatment (descending order)
  dplyr::arrange(concat_hash_id_treatments,rank_dias_trat_op)%>%
  dplyr::mutate(across(c(con_quien_vive,tipo_de_plan_2_for_f,tipo_de_programa_2,estatus_ocupacional,cat_ocupacional,origen_ingreso,tipo_de_vivienda_mod,tenencia_de_la_vivienda_mod,rubro_trabaja_mod,sus_principal_mod,freq_cons_sus_prin,via_adm_sus_prin_act,otras_sus1_mod,otras_sus2_mod,otras_sus3_mod,via_adm_sus_prin), ~dplyr::first(na.omit(.)),.names = "mod_f_{col}"))%>%
  dplyr::ungroup()%>%
  #dplyr::rename("mod_f_tipo_de_plan_2_for_f"="mod_f_tipo_de_plan_2")%>%
  #dplyr::select(hash_key,concat_hash_id_treatments, n_concat_hash_id_treatments,con_quien_vive,tipo_de_plan_2_for_f,estatus_ocupacional,cat_ocupacional,origen_ingreso,tipo_de_vivienda_mod,tenencia_de_la_vivienda_mod,rubro_trabaja_mod,otras_sus1_mod,otras_sus2_mod,otras_sus3_mod,rank_dias_trat_op,dias_trat_imp_op,starts_with("mod_f_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>% 
  dplyr::arrange(rn_total)%>%
assign("CONS_C1_df_dup_JUN_2020_a_f",., envir = .GlobalEnv)
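
Rule f keeps, for each group of continuous entries, the first non-missing value found when entries are ordered from the longest to the shortest treatment. The sketch below reproduces that logic on toy data; `toy_f`, `treatment_id`, `days` and `housing` are hypothetical names, and the toy data contain no ties, so the random tie-breaking does not affect the result here.

```r
library(dplyr)

# Hypothetical toy data: two groups of continuous entries.
toy_f <- tibble::tibble(
  treatment_id = c("a", "a", "a", "b", "b"),
  days         = c(30, 120, 60, 10, 45),
  housing      = c("rented", NA, "owned", "shared", NA)
)

set.seed(1234)  # only matters when two entries have the same length
toy_f_out <- toy_f %>%
  group_by(treatment_id) %>%
  # Rank 1 = longest entry within the group.
  mutate(rank_days = rank(-days, ties.method = "random")) %>%
  arrange(treatment_id, rank_days) %>%
  # First non-missing value, scanning from the longest entry downwards.
  mutate(mod_f_housing = first(na.omit(housing))) %>%
  ungroup()
```

In group "a" the longest entry has a missing value, so the value of the second longest entry is used instead.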

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#g     #Favored diagnoses: replace if present
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_

invisible(c("here I do not want to capture progression, only distinct values"))
library(tidyverse)
# Variant of base::toString() that joins the elements with "; " instead of ", "
toString2 <- function(x, width = NULL, ...) {
  string <- paste(x, collapse = "; ")
  if (missing(width) || is.null(width) || width == 0)
    return(string)
  if (width < 0)
    stop("'width' must be positive")
  if (nchar(string, type = "w") > width) {
    width <- max(6, width)
    string <- paste0(strtrim(string, width - 4), "....")
  }
  string
}

#to define which categories will be favored
  dg_trs_psiq_dsm_iv_or_cat<-CONS_C1_df_dup_JUN_2020 %>% dplyr::mutate(dg_trs_psiq_dsm_iv_or=stringr::str_trim(as.character(dg_trs_psiq_dsm_iv_or)))%>% janitor::tabyl(dg_trs_psiq_dsm_iv_or)%>% data.frame()%>% select(dg_trs_psiq_dsm_iv_or)%>% dplyr::filter(!dg_trs_psiq_dsm_iv_or %in% c("En estudio","Sin trastorno", NA))%>% unlist()%>% as.character()            
  dg_trs_psiq_cie_10_or_cat<-CONS_C1_df_dup_JUN_2020 %>% dplyr::mutate(dg_trs_psiq_cie_10_or=stringr::str_trim(as.character(dg_trs_psiq_cie_10_or)))%>% janitor::tabyl(dg_trs_psiq_cie_10_or)%>% data.frame()%>% select(dg_trs_psiq_cie_10_or)%>% dplyr::filter(!dg_trs_psiq_cie_10_or %in% c("En estudio(NA)","Sin trastorno(NA)", NA))%>% unlist()%>% as.character() 
  dg_trs_psiq_sub_dsm_iv_or_cat<-CONS_C1_df_dup_JUN_2020 %>% dplyr::mutate(dg_trs_psiq_sub_dsm_iv_or=stringr::str_trim(as.character(dg_trs_psiq_sub_dsm_iv_or)))%>% janitor::tabyl(dg_trs_psiq_sub_dsm_iv_or)%>% data.frame()%>% select(dg_trs_psiq_sub_dsm_iv_or)%>% unlist()%>% as.character() 
  dg_trs_psiq_sub_cie_10_or_cat<-CONS_C1_df_dup_JUN_2020 %>% dplyr::mutate(dg_trs_psiq_sub_cie_10_or=stringr::str_trim(as.character(dg_trs_psiq_sub_cie_10_or)))%>% janitor::tabyl(dg_trs_psiq_sub_cie_10_or)%>% data.frame()%>% select(dg_trs_psiq_sub_cie_10_or)%>% dplyr::filter(!dg_trs_psiq_sub_cie_10_or %in% c(NA))%>% unlist()%>% as.character() 
  diagnostico_trs_fisico_cat<-CONS_C1_df_dup_JUN_2020 %>% dplyr::mutate(diagnostico_trs_fisico=stringr::str_trim(as.character(diagnostico_trs_fisico)))%>% janitor::tabyl(diagnostico_trs_fisico)%>% data.frame()%>% select(diagnostico_trs_fisico)%>% dplyr::filter(!diagnostico_trs_fisico %in% c("En estudio","Sin trastorno", NA))%>% unlist()%>% as.character() 
  otros_probl_at_sm_or_cat<-CONS_C1_df_dup_JUN_2020 %>% dplyr::mutate(otros_probl_at_sm_or=stringr::str_trim(as.character(otros_probl_at_sm_or)))%>% janitor::tabyl(otros_probl_at_sm_or)%>% data.frame()%>% select(otros_probl_at_sm_or)%>% dplyr::filter(!otros_probl_at_sm_or %in% c("Sin otros problemas de salud mental", NA))%>% unlist()%>% as.character() 

CONS_C1_df_dup_JUN_2020_a_f%>%
  dplyr::mutate(rn_common_treats=rn_common_treats2)%>%
  #dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,diagnostico_trs_fisico), ~replace_na(as.character(.), 0),.names = "{col}_mod"))%>%
    #dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or_mod,x2_dg_trs_psiq_dsm_iv_or_mod,x3_dg_trs_psiq_dsm_iv_or_mod,dg_trs_psiq_cie_10_or_mod,x2_dg_trs_psiq_cie_10_or_mod,x3_dg_trs_psiq_cie_10_or_mod,diagnostico_trs_fisico_mod), ~dplyr::case_when(grepl("En estudio",as.character(.))~1,grepl("Sin trastorno",as.character(.))~0,TRUE~2)))%>%
  dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or,dg_trs_psiq_sub_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_sub_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_sub_dsm_iv_or,dg_trs_psiq_cie_10_or,dg_trs_psiq_sub_cie_10_or,x2_dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_sub_cie_10_or,x3_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_sub_cie_10_or,diagnostico_trs_fisico,otros_probl_at_sm_or),~stringr::str_trim(.)))%>%
                tidyr::pivot_wider(names_from =  rn_common_treats, 
                                   names_sep="_",
                                   values_from =
c(dg_trs_psiq_dsm_iv_or,dg_trs_psiq_sub_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_sub_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_sub_dsm_iv_or,dg_trs_psiq_cie_10_or,dg_trs_psiq_sub_cie_10_or, x2_dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_sub_cie_10_or,x3_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_sub_cie_10_or,diagnostico_trs_fisico,otros_probl_at_sm_or))%>%
  #c(dg_trs_psiq_dsm_iv_or_mod,x2_dg_trs_psiq_dsm_iv_or_mod,x3_dg_trs_psiq_dsm_iv_or_mod,dg_trs_psiq_cie_10_or_mod,x2_dg_trs_psiq_cie_10_or_mod,x3_dg_trs_psiq_cie_10_or_mod,diagnostico_trs_fisico_mod))%>%
    dplyr::group_by(concat_hash_id_treatments)%>%
    dplyr::mutate_at(vars(dg_trs_psiq_dsm_iv_or_1:otros_probl_at_sm_or_7),~max(as.character(.),na.rm=T))%>%
    #dplyr::mutate_at(vars(dg_trs_psiq_dsm_iv_or_mod_1:diagnostico_trs_fisico_mod_7),~max(as.numeric(.),na.rm=T))%>%
  dplyr::ungroup()%>%
    dplyr::mutate(mod_g_dg_trs_psiq_dsm_iv_or = pmap_chr(select(.,dg_trs_psiq_dsm_iv_or_1:dg_trs_psiq_dsm_iv_or_7,x2_dg_trs_psiq_dsm_iv_or_1:x2_dg_trs_psiq_dsm_iv_or_7,x3_dg_trs_psiq_dsm_iv_or_1:x3_dg_trs_psiq_dsm_iv_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    dplyr::mutate(mod_g_dg_trs_psiq_sub_dsm_iv_or = pmap_chr(select(.,dg_trs_psiq_sub_dsm_iv_or_1:dg_trs_psiq_sub_dsm_iv_or_7,x2_dg_trs_psiq_sub_dsm_iv_or_1:x2_dg_trs_psiq_sub_dsm_iv_or_7,x3_dg_trs_psiq_sub_dsm_iv_or_1:x3_dg_trs_psiq_sub_dsm_iv_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    #dplyr::mutate(mod_g_x2_dg_trs_psiq_dsm_iv_or = pmap_chr(select(.,x2_dg_trs_psiq_dsm_iv_or_1:x2_dg_trs_psiq_dsm_iv_or_7), ~toString2(unique(na.omit(c(...))))))%>%
   # dplyr::mutate(mod_g_x2_dg_trs_psiq_sub_dsm_iv_or = pmap_chr(select(.,x2_dg_trs_psiq_sub_dsm_iv_or_1:x2_dg_trs_psiq_sub_dsm_iv_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    #dplyr::mutate(mod_g_x3_dg_trs_psiq_dsm_iv_or = pmap_chr(select(.,x3_dg_trs_psiq_dsm_iv_or_1:x3_dg_trs_psiq_dsm_iv_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    #dplyr::mutate(mod_g_x3_dg_trs_psiq_sub_dsm_iv_or = pmap_chr(select(.,x3_dg_trs_psiq_sub_dsm_iv_or_1:x3_dg_trs_psiq_sub_dsm_iv_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    dplyr::mutate(mod_g_dg_trs_psiq_cie_10_or = pmap_chr(select(.,dg_trs_psiq_cie_10_or_1:dg_trs_psiq_cie_10_or_7,x2_dg_trs_psiq_cie_10_or_1:x2_dg_trs_psiq_cie_10_or_7,x3_dg_trs_psiq_cie_10_or_1:x3_dg_trs_psiq_cie_10_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    dplyr::mutate(mod_g_dg_trs_psiq_sub_cie_10_or = pmap_chr(select(.,dg_trs_psiq_sub_cie_10_or_1:dg_trs_psiq_sub_cie_10_or_7,x2_dg_trs_psiq_sub_cie_10_or_1:x2_dg_trs_psiq_sub_cie_10_or_7,x3_dg_trs_psiq_sub_cie_10_or_1:x3_dg_trs_psiq_sub_cie_10_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    #dplyr::mutate(mod_g_x2_dg_trs_psiq_cie_10_or = pmap_chr(select(.,x2_dg_trs_psiq_cie_10_or_1:x2_dg_trs_psiq_cie_10_or_7), ~toString2(unique(na.omit(c(...))))))%>%
  #  dplyr::mutate(mod_g_x2_dg_trs_psiq_sub_cie_10_or = pmap_chr(select(.,x2_dg_trs_psiq_sub_cie_10_or_1:x2_dg_trs_psiq_sub_cie_10_or_7), ~toString2(unique(na.omit(c(...))))))%>%
   # dplyr::mutate(mod_g_x3_dg_trs_psiq_cie_10_or = pmap_chr(select(.,x3_dg_trs_psiq_cie_10_or_1:x3_dg_trs_psiq_cie_10_or_7), ~toString2(unique(na.omit(c(...))))))%>%
   # dplyr::mutate(mod_g_x3_dg_trs_psiq_sub_cie_10_or = pmap_chr(select(.,x3_dg_trs_psiq_sub_cie_10_or_1:x3_dg_trs_psiq_sub_cie_10_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    dplyr::mutate(mod_g_diagnostico_trs_fisico = pmap_chr(select(.,diagnostico_trs_fisico_1:diagnostico_trs_fisico_7), ~toString2(unique(na.omit(c(...))))))%>%
    dplyr::mutate(mod_g_otros_probl_at_sm_or = pmap_chr(select(.,otros_probl_at_sm_or_1:otros_probl_at_sm_or_7), ~toString2(unique(na.omit(c(...))))))%>%
    #plyr::ungroup()%>%
    #dplyr::mutate_all(vars(dg_trs_psiq_dsm_iv_or_mod_1:diagnostico_trs_fisico_mod_7)) %>    unite(., col = "mod_g_x2_dg_trs_psiq_dsm_iv_or_mod", x2_dg_trs_psiq_dsm_iv_or_mod_1:x2_dg_trs_psiq_dsm_iv_or_mod_7, na.rm=TRUE, sep = "; ")%>%
    #unite(., col = "mod_g_dg_trs_psiq_dsm_iv_or_mod", dg_trs_psiq_dsm_iv_or_mod_1:dg_trs_psiq_dsm_iv_or_mod_7, na.rm=TRUE, sep = "; ")%>%
    #unite(., col = "mod_g_x2_dg_trs_psiq_dsm_iv_or_mod", x2_dg_trs_psiq_dsm_iv_or_mod_1:x2_dg_trs_psiq_dsm_iv_or_mod_7, na.rm=TRUE, sep = "; ")%>%
    #unite(., col = "mod_g_x3_dg_trs_psiq_dsm_iv_or_mod", x3_dg_trs_psiq_dsm_iv_or_mod_1:x3_dg_trs_psiq_dsm_iv_or_mod_7, na.rm=TRUE, sep = "; ")%>%
    #unite(., col = "mod_g_dg_trs_psiq_cie_10_or_mod", dg_trs_psiq_cie_10_or_mod_1:dg_trs_psiq_cie_10_or_mod_7, na.rm=TRUE, sep = "; ")%>%
    #unite(., col = "mod_g_x2_dg_trs_psiq_cie_10_or_mod", x2_dg_trs_psiq_cie_10_or_mod_1:x2_dg_trs_psiq_cie_10_or_mod_7, na.rm=TRUE, sep = "; ")%>%
    #unite(., col = "mod_g_x3_dg_trs_psiq_cie_10_or_mod", x3_dg_trs_psiq_cie_10_or_mod_1:x3_dg_trs_psiq_cie_10_or_mod_7, na.rm=TRUE, sep = "; ")%>%
    #unite(., col = "mod_g_diagnostico_trs_fisico_mod", diagnostico_trs_fisico_mod_1:diagnostico_trs_fisico_mod_7, na.rm=TRUE, sep = "; ")%>%

  # dplyr::select(hash_key,concat_hash_id_treatments,n_concat_hash_id_treatments,dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,diagnostico_trs_fisico,starts_with("mod_g_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>% 
dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or),~dplyr::case_when(grepl(paste(dg_trs_psiq_dsm_iv_or_cat, collapse = "|"),.)~sub("En estudio|Sin trastorno",replacement= "",.),grepl("En estudio",.,ignore.case = T)~str_replace_all(., "Sin trastorno", ""),TRUE~.)))%>%
  
  #nchar>18 nchar("Trastorno Adaptativo")==20  nchar("En estudio")
dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~dplyr::case_when(grepl(paste(dg_trs_psiq_cie_10_or_cat, collapse = "|"),.)~sub("En estudio\\(NA\\)|Sin trastorno\\(NA\\)",replacement= "",.),grepl("En estudio\\(NA\\)",.,ignore.case = T)~sub("Sin trastorno\\(NA\\)",replacement= "",.),TRUE~.)))%>% #parentheses escaped so that the literal "(NA)" is matched

dplyr::mutate(across(c(mod_g_diagnostico_trs_fisico),~dplyr::case_when(grepl(paste(diagnostico_trs_fisico_cat, collapse = "|"),.)~str_replace_all(., "En estudio|Sin trastorno", ""),grepl("En estudio",.,ignore.case = T)~str_replace_all(., "Sin trastorno", ""),TRUE~.)))%>%  
  dplyr::mutate(across(c(mod_g_otros_probl_at_sm_or),~dplyr::case_when(grepl(paste(otros_probl_at_sm_or_cat, collapse = "|"),.)~str_replace_all(., "Sin otros problemas de salud mental", ""),TRUE~.)))%>%  

    dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~sub("Sin trastorno\\(NA\\); En estudio\\(NA\\)",replacement= "En estudio(NA)", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~sub("En estudio\\(NA\\); Sin trastorno\\(NA\\)",replacement= "En estudio(NA)", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~sub("^Sin trastorno\\(NA\\); ",replacement= "", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~sub("; Sin trastorno\\(NA\\)$",replacement="", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~sub("; Sin trastorno\\(NA\\);",replacement=";", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~sub("Trastornos de los hábitos y del control de los impulsos;",replacement="Trastornos de los hábitos y del control de los impulsos(F63);", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_cie_10_or),~sub("Trastornos de los hábitos y del control de los impulsos$",replacement="Trastornos de los hábitos y del control de los impulsos(F63)", .)))%>%

  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or),~sub("En estudio; Sin trastorno",replacement="En estudio", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or),~sub("Sin trastorno; En estudio",replacement="En estudio", ., perl=TRUE)))%>%
  
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or),~sub("En estudio; Sin trastorno",replacement="En estudio", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or),~sub("Sin trastorno; En estudio",replacement="En estudio", ., perl=TRUE)))%>%
  
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or),~sub("; Sin trastorno;",replacement=";", ., perl=TRUE)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or,mod_g_dg_trs_psiq_sub_dsm_iv_or,mod_g_dg_trs_psiq_cie_10_or,mod_g_dg_trs_psiq_sub_cie_10_or,mod_g_diagnostico_trs_fisico,mod_g_otros_probl_at_sm_or),~sub("^; ; ", "", .)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or,mod_g_dg_trs_psiq_sub_dsm_iv_or,mod_g_dg_trs_psiq_cie_10_or,mod_g_dg_trs_psiq_sub_cie_10_or,mod_g_diagnostico_trs_fisico,mod_g_otros_probl_at_sm_or),~sub("^; ", "", .)))%>%
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or,mod_g_dg_trs_psiq_sub_dsm_iv_or,mod_g_dg_trs_psiq_cie_10_or,mod_g_dg_trs_psiq_sub_cie_10_or,mod_g_diagnostico_trs_fisico,mod_g_otros_probl_at_sm_or),~sub("; ; $", "", .)))%>% 
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or,mod_g_dg_trs_psiq_sub_dsm_iv_or,mod_g_dg_trs_psiq_cie_10_or,mod_g_dg_trs_psiq_sub_cie_10_or,mod_g_diagnostico_trs_fisico,mod_g_otros_probl_at_sm_or),~sub("; $", "", .)))%>% 
  dplyr::mutate(across(c(mod_g_dg_trs_psiq_dsm_iv_or,mod_g_dg_trs_psiq_sub_dsm_iv_or,mod_g_dg_trs_psiq_cie_10_or,mod_g_dg_trs_psiq_sub_cie_10_or,mod_g_diagnostico_trs_fisico,mod_g_otros_probl_at_sm_or),~sub("; ; ", "; ", .)))%>% 
   
   #dplyr::select(hash_key,concat_hash_id_treatments,n_concat_hash_id_treatments,starts_with("mod_g_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>%
  dplyr::ungroup()%>%
  #dplyr::mutate(across(c(rn_common_treats2,mod_g_dg_trs_psiq_dsm_iv_or,mod_g_dg_trs_psiq_sub_dsm_iv_or,mod_g_dg_trs_psiq_cie_10_or,mod_g_dg_trs_psiq_sub_cie_10_or,mod_g_diagnostico_trs_fisico,mod_g_otros_probl_at_sm_or),~str_count(., pattern = ";"),.names="{col}_cnt"))%>%     dplyr::select(rn_common_treats2,mod_g_dg_trs_psiq_dsm_iv_or_cnt,mod_g_dg_trs_psiq_sub_dsm_iv_or_cnt,mod_g_dg_trs_psiq_cie_10_or_cnt,mod_g_dg_trs_psiq_sub_cie_10_or_cnt,mod_g_diagnostico_trs_fisico_cnt,mod_g_otros_probl_at_sm_or_cnt)%>%summary()
  #dplyr::select(rn_common_treats2,mod_g_dg_trs_psiq_dsm_iv_or,mod_g_dg_trs_psiq_sub_dsm_iv_or,mod_g_dg_trs_psiq_cie_10_or,mod_g_dg_trs_psiq_sub_cie_10_or,mod_g_diagnostico_trs_fisico,mod_g_otros_probl_at_sm_or)%>%
assign("CONS_C1_df_dup_JUN_2020_a_g",., envir = .GlobalEnv)
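
Rule g gathers the distinct diagnoses recorded across the entries of a treatment and drops placeholder categories when something more informative is present. The sketch below illustrates the same priority order on toy data, but it swaps the regular-expression cleanup used above for set operations applied before pasting; `toy_g`, `collapse_dx` and the `dx_*` columns are hypothetical names.

```r
library(dplyr)
library(purrr)

# Hypothetical wide layout: one diagnosis column per collapsed entry.
toy_g <- tibble::tibble(
  treatment_id = c("a", "b", "c"),
  dx_1 = c("Sin trastorno", "En estudio", NA),
  dx_2 = c("Trastorno adaptativo", "Sin trastorno", "Sin trastorno")
)

# Hypothetical helper: keep specific diagnoses if there are any;
# otherwise prefer "En estudio" over "Sin trastorno".
collapse_dx <- function(...) {
  x <- unique(na.omit(c(...)))
  specific <- setdiff(x, c("En estudio", "Sin trastorno"))
  if (length(specific) > 0) {
    x <- specific
  } else if ("En estudio" %in% x) {
    x <- "En estudio"
  }
  paste(x, collapse = "; ")
}

toy_g_out <- toy_g %>%
  # pmap_chr() calls collapse_dx() once per row with that row's values.
  mutate(mod_g_dx = pmap_chr(select(., dx_1:dx_2), collapse_dx))
```

Working on the vector of values rather than on the pasted string avoids having to clean up leftover separators afterwards, at the cost of departing from the code shown above.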

#CONS_C1_df_dup_JUN_2020_a_g%>% janitor::tabyl(mod_g_dg_trs_psiq_cie_10_or)%>% copiar_nombres()

invisible(c("summary of how many distinct diagnoses a user can have",
            "rn           dsm4         dsm4_sub            cie10                cie10_sub           tr_fis                  otros_sm",
" Min.   :1.000     Min.   :0.0000     Min.   :0.00000     Min.   :0.0000      Min.   :0.0000      Min.   :0.00000       Min.   :0.0000
  1st Qu.:1.000     1st Qu.:0.0000     1st Qu.:0.00000     1st Qu.:0.0000      1st Qu.:0.0000      1st Qu.:0.00000       1st Qu.:0.0000                 
  Median :2.000     Median :0.0000     Median :0.00000     Median :1.0000      Median :0.0000      Median :0.00000       Median :0.0000                 
  Mean   :1.585     Mean   :0.0917     Mean   :0.04751     Mean   :0.8168      Mean   :0.0703      Mean   :0.02704       Mean   :0.1393                 
  3rd Qu.:2.000     3rd Qu.:0.0000     3rd Qu.:0.00000     3rd Qu.:1.0000      3rd Qu.:0.0000      3rd Qu.:0.00000       3rd Qu.:0.0000                 
  Max.   :7.000     Max.   :3.0000     Max.   :3.00000     Max.   :5.0000      Max.   :3.0000      Max.   :2.00000       Max.   :2.0000"))

CONS_C1_df_dup_JUN_2020_a_g%>%
  tidyr::separate(mod_g_dg_trs_psiq_dsm_iv_or,c("mod_g_dg_trs_psiq_dsm_iv_or","mod_g_x2_dg_trs_psiq_dsm_iv_or","mod_g_x3_dg_trs_psiq_dsm_iv_or","mod_g_x4_dg_trs_psiq_dsm_iv_or"), sep="; ")%>%
  tidyr::separate(mod_g_dg_trs_psiq_sub_dsm_iv_or,c("mod_g_dg_trs_psiq_sub_dsm_iv_or","mod_g_x2_dg_trs_psiq_sub_dsm_iv_or","mod_g_x3_dg_trs_psiq_sub_dsm_iv_or","mod_g_x4_dg_trs_psiq_sub_dsm_iv_or"), sep="; ")%>%
  tidyr::separate(mod_g_dg_trs_psiq_cie_10_or,c("mod_g_dg_trs_psiq_cie_10_or","mod_g_x2_dg_trs_psiq_cie_10_or","mod_g_x3_dg_trs_psiq_cie_10_or","mod_g_x4_dg_trs_psiq_cie_10_or","mod_g_x5_dg_trs_psiq_cie_10_or","mod_g_x6_dg_trs_psiq_cie_10_or"), sep="; ")%>%
  tidyr::separate(mod_g_dg_trs_psiq_sub_cie_10_or,c("mod_g_dg_trs_psiq_sub_cie_10_or","mod_g_x2_dg_trs_psiq_sub_cie_10_or","mod_g_x3_dg_trs_psiq_sub_cie_10_or","mod_g_x4_dg_trs_psiq_sub_cie_10_or"), sep="; ")%>%
#dplyr::select(c("mod_g_dg_trs_psiq_cie_10_or","mod_g_x2_dg_trs_psiq_cie_10_or","mod_g_x3_trs_psiq_cie_10_or","mod_g_x4_trs_psiq_cie_10_or","mod_g_x5_trs_psiq_cie_10_or","mod_g_x6_trs_psiq_cie_10_or"))%>% dplyr::filter(!is.na(mod_g_x6_trs_psiq_cie_10_or))%>% 
  assign("CONS_C1_df_dup_JUN_2020_a_g",., envir = .GlobalEnv)
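
The concatenated diagnoses are then split back into one column per diagnosis. A minimal sketch of that step, assuming hypothetical column names and placeholder values; `fill = "right"` is an addition not present in the code above and only silences the warning that `tidyr::separate()` emits when a row has fewer pieces than target columns.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with "; "-separated placeholder values.
toy_sep <- tibble::tibble(
  treatment_id = c("a", "b"),
  mod_g_dx     = c("dx A; dx B; dx C", "dx D")
)

toy_sep_out <- toy_sep %>%
  separate(mod_g_dx,
           into = c("mod_g_dx", "mod_g_x2_dx", "mod_g_x3_dx"),
           sep  = "; ",
           fill = "right")  # pad missing pieces with NA on the right
```

The number of target columns has to be at least the maximum number of distinct values observed in any row; otherwise the extra pieces are dropped with a warning.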

#CONS_C1_df_dup_JUN_2020_a_g%>% janitor::tabyl(mod_g_diagnostico_trs_fisico)
#CONS_C1_df_dup_JUN_2020_a_g%>% janitor::tabyl(otros_probl_at_sm_or)  

  #dg_trs_psiq_sub_dsm_iv_or_cat dg_trs_psiq_sub_cie_10_or_cat

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#h     #Sum values
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_

CONS_C1_df_dup_JUN_2020_a_g%>%
  dplyr::group_by(concat_hash_id_treatments)%>%
  dplyr::mutate(mod_h_dias_trat_imp_op=sum(dias_trat_imp_op,na.rm=T))%>%
  dplyr::mutate(mod_h_dias_trat_imp=sum(dias_trat_imp,na.rm=T))%>%
  dplyr::mutate(mod_h_dias_trat_inv=sum(dias_trat_inv,na.rm=T))%>%
  #dplyr::select(dias_trat_imp_op,dias_trat_imp,dias_trat_inv,mod_h_dias_trat_imp_op,mod_h_dias_trat_imp,mod_h_dias_trat_inv)%>% summary()
  slice(1)%>%
  dplyr::ungroup()%>%
  #dplyr::select(hash_key,concat_hash_id_treatments,n_concat_hash_id_treatments,dias_trat_imp_op,dias_trat_imp,dias_trat_inv,starts_with("mod_h_"))%>%dplyr::filter(concat_hash_id_treatments=="dd0d42261d00273d4e19ff2a46bda4b9_5276"|hash_key=="015ea90c1b1655155f30a3e276436ed5"|hash_key=="ffd3f4ed5841cfac947ce546757b8e3f")%>%
  dplyr::select(concat_hash_id_treatments,rn_common_treats2,matches("^mod_[0abcdefgh]_"))%>%
  dplyr::rename("tipo_de_plan_2_concat_a"="mod_a_tipo_de_plan_2")%>%
  dplyr::rename("id_centro_concat_a"="mod_a_id_centro")%>%
  dplyr::rename("obs_concat_a"="mod_a_obs")%>%

  assign("CONS_C1_df_dup_JUN_2020_a_h",., envir = .GlobalEnv)
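
Rule h adds up the treatment days of all entries in a group and then keeps a single row per treatment. A minimal sketch on toy data, with hypothetical names (`toy_h`, `treatment_id`, `row_id`, `days`):

```r
library(dplyr)

# Hypothetical toy data: group "a" has two entries, one with missing days.
toy_h <- tibble::tibble(
  treatment_id = c("a", "a", "b"),
  row_id       = 1:3,
  days         = c(30, NA, 45)
)

toy_h_out <- toy_h %>%
  group_by(treatment_id) %>%
  # Total days per treatment, ignoring missing values.
  mutate(mod_h_days = sum(days, na.rm = TRUE)) %>%
  # Keep the first row of each group in the current row order.
  slice(1) %>%
  ungroup()
```

Because `slice(1)` depends on the current row order, the row that survives is whichever entry comes first in each group at that point of the pipeline.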

invisible(c("FIRST OF ALL: check which variables can be combined (e.g., those in a or b that leave only one value)"))

CONS_C1_df_dup_JUN_2020_a_h%>%
  dplyr::rename_with(~sub("^mod_[abcdefgh]_", "", .),
            .cols = matches("^mod_[abcdefgh]_"))%>% #rename_with() replaces rename_at() + the deprecated funs()
  dplyr::mutate(tipo_de_plan_2=stringr::str_trim(tipo_de_plan_2))%>%
  dplyr::mutate(id_centro=stringr::str_trim(id_centro))%>%
  dplyr::select(mod_0_row,concat_hash_id_treatments,id_centro_concat_a, obs_concat_a,tipo_de_plan_2_concat_a,everything())%>%
  assign("CONS_C1_df_dup_JUN_2020_cont_treats",., envir = .GlobalEnv)
 #id_centro tipo_de_plan_2
#  dplyr::mutate(motivodeegreso_mod_imp_tidy= case_when(!is.na(diff_bet_treat) & as.character(motivodeegreso_mod_imp)=="Derivación" & grepl("Clínica",tipo_centro_derivacion==<90~"Abandono Temprano" , TRUE~as.character(motivodeegreso_mod_imp))%>% nrow()
invisible(c("watch out for tipo_de_plan_2_for_f and mod_e_obs_for_e"))
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#DISCARD REDUNDANT CASES
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
row_id_redundantes_trat_completos<-
      CONS_C1_df_dup_JUN_2020_a_g%>%
          dplyr::filter(!mod_0_row %in% unlist(CONS_C1_df_dup_JUN_2020_a_h$mod_0_row))%>%
          dplyr::select(mod_0_row)%>%unlist()%>%as.numeric()
#CONS_C1_df_dup_JUN_2020_a_h%>%select(matches("^mod_[b]_"))%>% names()%>% data.frame()%>% copiar_nombres()
#CONS_C1_df_dup_JUN_2020_a_h%>%select(matches("^mod_[0abcdefgh]_"))%>% names()%>% data.frame()%>% copiar_nombres()
#sus_principal_mod

We started with 12,945 entries from 5,767 users, which we grouped into 6,136 sets of continuous entries, each representing a single treatment.

Join with the main dataset

We then joined the collapsed treatments back into the original database (n= 117,212; users= 85,603), producing a new database in which continuous entries are collapsed into single treatments (n= 110,403; users= 85,603).
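
The figures reported in this section can be checked against each other: the entries removed are those that belong to continuous treatments minus the one row kept per collapsed treatment. The variable names below are hypothetical and only hold the counts stated in the text.

```r
n_original   <- 117212  # entries in the original database
n_continuous <- 12945   # entries that are part of continuous treatments
n_collapsed  <- 6136    # single treatments those entries collapse into

# Rows discarded as redundant, and size of the resulting database.
n_redundant <- n_continuous - n_collapsed
n_final     <- n_original - n_redundant
```

Under these counts, 6,809 rows are discarded and 110,403 remain, which matches the size reported for the new database.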

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#
#CONS_C1_df_dup_JUN_2020_a_h
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#
invisible(c("https://stackoverflow.com/questions/28298688/how-do-i-sweep-specific-columns-with-dplyr"))
invisible(c("https://stackoverflow.com/questions/54818931/difference-between-and-eval-tidy-in-mutate-at"))
invisible(c("https://stackoverflow.com/questions/63290366/mutate-across-multiple-variables-usinga-list-of-third-variables-in-r/63292975#63292975"))
invisible(c("https://stackoverflow.com/questions/51051810/how-do-i-use-mutate-at-with-multiple-functions-where-each-function-has-parameter"))

#CONS_C1_df_dup_MAY_2020_prev_6c%>% dplyr::filter(grepl('3.03.', obs)|grepl('3.03.', obs)) %>% nrow()

#this yields 110,403 entries

CONS_C1_df_dup_JUN_2020%>%
    dplyr::mutate(across(c(otras_sus1, otras_sus2,otras_sus3),~dplyr::case_when(as.character(.)!="Alcohol"&as.character(.)!="Cocaína"&as.character(.)!="Marihuana"&as.character(.)!="Pasta Base"~"Otros",TRUE~as.character(.)),.names = "{col}_mod"))%>%
   dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or,dg_trs_psiq_sub_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_sub_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_sub_dsm_iv_or,dg_trs_psiq_cie_10_or,dg_trs_psiq_sub_cie_10_or,x2_dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_sub_cie_10_or,x3_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_sub_cie_10_or,diagnostico_trs_fisico,otros_probl_at_sm_or),~stringr::str_trim(.)))%>%
    dplyr::mutate(fech_ing=as.character(fech_ing))%>%
  dplyr::mutate(fech_egres_imp=as.character(fech_egres_imp))%>%
  dplyr::mutate(ano_bd2=ano_bd)%>%
  dplyr::mutate(across(c(sus_ini_2, sus_ini_3),~dplyr::case_when(as.character(.)!="Alcohol"&as.character(.)!="Cocaína"&as.character(.)!="Marihuana"&as.character(.)!="Pasta Base"~"Otros",TRUE~as.character(.)),.names = "{col}_mod"))%>%

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_          
  #REMOVE UNNECESSARY COLUMNS
#_#_#_#_#_#_#_#_#_#_#_#_#_  
  dplyr::select(-table, -region_del_centro, -tipo_de_programa, -tipo_de_plan, -dias_trat, -nmesesentratamiento, -dias_en_senda, -n_meses_en_senda, -sexo, -edad, -nombre_usuario, -comuna_residencia, -origen_de_ingreso, -pais_nacimiento, -etnia, -estado_conyugal, -parentesco_con_el_jefe_de_hogar, -num_trat_ant, -fecha_ultimo_tratamiento, -sustancia_de_inicio, -edad_inicio_consumo, -escolaridad_ultimo_ano_cursado, -condicion_ocupacional, -categoria_ocupacional, -rubro_trabaja, -tipo_de_vivienda, -tenencia_de_la_vivienda, -sustancia_principal, -`otras_sustancias_nº1`, -`otras_sustancias_nº2`, -`otras_sustancias_nº3`, -freq_cons_sus_prin_original, -edad_inicio_sustancia_principal, -via_adm_sus_prin_original, -sus_principal, -consentimiento_informado, -fech_egres, -motivodeegreso, -mot_egres_alt_adm_or, -consorcio, -fech_egres_sin_fmt, -ano_nac, -fech_ing_ano, -fech_ing_mes, -fech_ing_dia, -concat, -dias_trat_alta_temprana, -motivodeegreso_mod, -dias_trat_knn_imp, -fech_egres_knn_imp, -dias_trat_alta_temprana_knn_imp, -motivodeegreso_imp, -dias_trat_alta_temprana_imp, -concat_hash_sus_prin, -dg_trs_psiq_cie_10_egres_or, -menor_45_dias_diff, -menor_60_dias_diff)%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#DO THE JOIN
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
    dplyr::filter(!row %in%row_id_redundantes_trat_completos)%>%
    dplyr::left_join(CONS_C1_df_dup_JUN_2020_cont_treats, by=c("row"="mod_0_row"), suffix=c("","_cont_entries"))%>%
    assign("CONS_C1_df_dup_JUL_2020_prev0",., envir = .GlobalEnv)
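
The join above first drops the redundant rows and then attaches the collapsed values to the row kept for each treatment; columns present in both tables receive the `_cont_entries` suffix on the collapsed side. A minimal sketch with hypothetical toy tables (`toy_main`, `toy_collapsed`, `redundant_rows`):

```r
library(dplyr)

# Hypothetical main table: rows 1 to 3 form one continuous treatment.
toy_main <- tibble::tibble(
  row       = 1:5,
  id_centro = c("c1", "c1", "c2", "c3", "c4")
)

# Hypothetical collapsed table: one row per collapsed treatment,
# keyed by the row that was kept (mod_0_row).
toy_collapsed <- tibble::tibble(
  mod_0_row                 = 1L,
  concat_hash_id_treatments = "u1_1",
  id_centro                 = "c2"
)

redundant_rows <- c(2, 3)  # rows absorbed into the collapsed treatment

toy_joined <- toy_main %>%
  filter(!row %in% redundant_rows) %>%
  left_join(toy_collapsed, by = c("row" = "mod_0_row"),
            suffix = c("", "_cont_entries"))
```

Rows that were never part of a continuous treatment end up with a missing `concat_hash_id_treatments`, which is what the later steps use to decide which version of each variable to keep.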

  #names() %>% data.frame()%>% copiar_nombres()
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#CHECK VARIABLES CODED IN MORE THAN ONE WAY
#_#_#_#_#_#_#_ 
#####id_centro_concat_a joined  1. concatenated
#####obs_concat_a   joined  1. concatenated
#####tipo_de_plan_2_concat_a    joined  1. concatenated
#####tipo_de_plan_2_cont_entries    joined  2. last treatment (b)
#####id_centro_cont_entries joined  2. last treatment (b)
#####obs_cont_entries   joined  2. same value (e)
#####tipo_de_plan_2_for_f   joined  3. apparently the one from the longest treatment (f)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
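# NOTE: these names carry no "_cont_entries" suffix, so in the loop below `old` and `col` refer to the same column
out_cols <- c("id_centro","tipo_de_plan_2","obs")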
out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev0%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev0",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev0%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
      dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>%
      View()
}
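
Each loop in this part builds a `_final` column that takes the original value for untouched rows and the collapsed value for rows that represent a collapsed treatment. A minimal sketch of that pattern follows; `toy_prev` is a hypothetical table, and the sketch reassigns the data frame directly instead of using `assign()` into the global environment.

```r
library(dplyr)
library(rlang)

# Hypothetical joined table: only the first row is a collapsed treatment.
toy_prev <- tibble::tibble(
  concat_hash_id_treatments = c("u1_1", NA, NA),
  id_centro                 = c("c1", "c3", "c4"),
  id_centro_cont_entries    = c("c2", NA, NA)
)

out_cols <- c("id_centro_cont_entries")

for (col in out_cols) {
  old <- sub("_cont_entries$", "", col)  # original column name
  new <- paste0(old, "_final")           # name of the combined column
  toy_prev <- toy_prev %>%
    # Untouched rows keep the original value; collapsed rows take the
    # value computed for the whole treatment.
    mutate(!!sym(new) := ifelse(is.na(concat_hash_id_treatments),
                                as.character(!!sym(old)),
                                as.character(!!sym(col))))
}
```

Both branches are converted to character so that `ifelse()` returns a consistent type regardless of the class of the source columns.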
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#a. Wide format
#_#_#_#_#_#_#_
#dplyr::mutate(mod_b_id_centro=sub(".*\\;","",mod_a_id_centro))%>% #last treatment

out_cols <- c("tipo_centro_cont_entries", "servicio_de_salud_cont_entries", "senda_cont_entries")
#  c(paste0(gsub("_cont_entries", "", out_cols),"_final"),gsub("_cont_entries", "", out_cols))
out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")

CONS_C1_df_dup_JUL_2020_prev0%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_a",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_a%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_a",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_a%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
      dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>% 
      View()
}
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#b. Maximum/ Last Value
#_#_#_#_#_#_#_
out_cols <- c("nombre_centro_cont_entries", "numero_de_hijos_mod_cont_entries","ano_bd_cont_entries", "num_hijos_trat_res_mod","tipo_centro_derivacion_cont_entries","fech_egres_imp_cont_entries","motivodeegreso_mod_imp_cont_entries","macrozona_cont_entries","nombre_region_cont_entries","comuna_residencia_cod_cont_entries")
#  c(paste0(gsub("_cont_entries", "", out_cols),"_final"),gsub("_cont_entries", "", out_cols))
out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")
  
CONS_C1_df_dup_JUL_2020_prev_a%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_b",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_b%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_b",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_b%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
      dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>% 
      View()
}
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#c. Minimum/ First value
#_#_#_#_#_#_#_
out_cols <- c("fech_ing_cont_entries","fecha_ingreso_a_convenio_senda_cont_entries","identidad_de_genero_cont_entries","edad_al_ing_cont_entries","origen_ingreso_mod_cont_entries","embarazo_cont_entries", "ano_bd2_cont_entries")
#  c(paste0(gsub("_cont_entries", "", out_cols),"_final"),gsub("_cont_entries", "", out_cols))
out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")
  
CONS_C1_df_dup_JUL_2020_prev_b%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_c",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_c%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_c",., envir = .GlobalEnv)
}

no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_c%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
     dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>% 
      View()
}
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#d. Kept more vulnerable category
#_#_#_#_#_#_#_
out_cols <- c("x_se_trata_mujer_emb_cont_entries","compromiso_biopsicosocial_cont_entries","dg_global_nec_int_soc_or_cont_entries","dg_nec_int_soc_cap_hum_or_cont_entries","dg_nec_int_soc_cap_fis_or_cont_entries","dg_nec_int_soc_cap_soc_or_cont_entries","usuario_tribunal_trat_droga_cont_entries","evaluacindelprocesoteraputico_cont_entries","eva_consumo_cont_entries","eva_fam_cont_entries","eva_relinterp_cont_entries","eva_ocupacion_cont_entries","eva_sm_cont_entries","eva_fisica_cont_entries","eva_transgnorma_cont_entries","dg_global_nec_int_soc_or_1_cont_entries","dg_nec_int_soc_cap_hum_or_1_cont_entries","dg_nec_int_soc_cap_fis_or_1_cont_entries","dg_nec_int_soc_cap_soc_or_1_cont_entries","tiene_menores_de_edad_a_cargo_cont_entries","ha_estado_embarazada_egreso_cont_entries","discapacidad_cont_entries","opcion_discapacidad_cont_entries","escolaridad_cont_entries","edad_al_ing_grupos_cont_entries")
#  c(paste0(gsub("_cont_entries", "", out_cols),"_final"),gsub("_cont_entries", "", out_cols))
out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")
  
CONS_C1_df_dup_JUL_2020_prev_c%>%
        dplyr::mutate(compromiso_biopsicosocial=dplyr::case_when(compromiso_biopsicosocial=="Leve"~1,compromiso_biopsicosocial=="Moderado"~2,compromiso_biopsicosocial=="Severo"~3,TRUE~NA_real_))%>%
    dplyr::mutate(escolaridad=dplyr::case_when(escolaridad=="Mayor a Ed Secundaria"~1,escolaridad=="Ed Secundaria Completa o Menor"~2,escolaridad=="Ed Primaria Completa o Menor"~3,TRUE~NA_real_))%>%
    dplyr::mutate(across(c(dg_global_nec_int_soc_or, dg_nec_int_soc_cap_hum_or, dg_nec_int_soc_cap_fis_or, dg_nec_int_soc_cap_soc_or,dg_global_nec_int_soc_or_1, dg_nec_int_soc_cap_hum_or_1, dg_nec_int_soc_cap_fis_or_1,dg_nec_int_soc_cap_soc_or_1),~dplyr::case_when(as.character(.)=="Bajas"~3,as.character(.)=="Medias"~2,as.character(.)=="Altas"~1,TRUE~NA_real_)))%>%
    dplyr::mutate(across(c(evaluacindelprocesoteraputico, eva_consumo, eva_fam, eva_relinterp, eva_ocupacion, eva_sm, eva_fisica, eva_transgnorma),~dplyr::case_when(as.character(.)=="Logro Mínimo"~3,as.character(.)=="Logro M?mo"~3,as.character(.)=="Logro Intermedio"~2,as.character(.)=="Logro Alto"~1,TRUE~NA_real_)))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_d",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_d%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_d",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_d%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
     dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>%
      View()
}
# Recode "-Inf" as NA in the _final columns built under rule d
CONS_C1_df_dup_JUL_2020_prev_d%>%
    dplyr::mutate(across(dplyr::all_of(out_cols_final),~na_if(., "-Inf")))%>%
    dplyr::rename("ano_bd_last_final"="ano_bd_final")%>%
    dplyr::rename("ano_bd_first_final"="ano_bd2_final")%>%
   #                 dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols), ends_with("_final"))%>% 
  assign("CONS_C1_df_dup_JUL_2020_prev_d",., envir = .GlobalEnv)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#e. Same value
#_#_#_#_#_#_#_
out_cols <- c("hash_key_cont_entries","id_cont_entries","nacionalidad_cont_entries","hash_rut_completo_cont_entries","sexo_2_cont_entries","embarazo_cont_entries","id_mod_cont_entries","fech_nac_cont_entries","edad_ini_cons_cont_entries","edad_ini_sus_prin_cont_entries","sus_ini_cont_entries","estado_conyugal_2_cont_entries","edad_grupos_cont_entries","etnia_cor_cont_entries","nacionalidad_2_cont_entries","etnia_cor_2_cont_entries","sus_ini_2_mod_cont_entries","sus_ini_3_mod_cont_entries","sus_ini_mod_cont_entries", "at_least_one_cont_entry_cont_entries")
#  c(paste0(gsub("_cont_entries", "", out_cols),"_final"),gsub("_cont_entries", "", out_cols))
out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")
  
CONS_C1_df_dup_JUL_2020_prev_d%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_e",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_e%>% 
    dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_e",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_e%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
      dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>% 
      View()
}
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#f.Largest treatment
#_#_#_#_#_#_#_
out_cols <-c('con_quien_vive_cont_entries','estatus_ocupacional_cont_entries','cat_ocupacional_cont_entries','origen_ingreso_cont_entries','via_adm_sus_prin_act_cont_entries','sus_principal_mod_cont_entries',"freq_cons_sus_prin_cont_entries","tipo_de_vivienda_mod_cont_entries",'tenencia_de_la_vivienda_mod_cont_entries','rubro_trabaja_mod_cont_entries','otras_sus1_mod_cont_entries','otras_sus2_mod_cont_entries','otras_sus3_mod_cont_entries',
             'dg_trs_cons_sus_or_cont_entries','tipo_de_programa_2_cont_entries','tipo_de_plan_2_for_f')

invisible(c("dg_trs_cons_sus_or","tipo_de_programa_2","dg_trs_cons_sus_or_cont_entries","tipo_de_programa_2_cont_entries"))

out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")
  
CONS_C1_df_dup_JUL_2020_prev_e%>%
  dplyr::mutate(tipo_de_plan_2_for_f_cont_entries=tipo_de_plan_2_for_f)%>%
  dplyr::mutate(tipo_de_plan_2_for_f=tipo_de_plan_2)%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_f",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_f%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_f",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_f%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
      dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>% 
      View()
}
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#g. Favor dgs. - wide format
#_#_#_#_#_#_#_
out_cols <-c('dg_trs_psiq_dsm_iv_or_cont_entries','dg_trs_psiq_sub_dsm_iv_or_cont_entries','x2_dg_trs_psiq_dsm_iv_or_cont_entries','x2_dg_trs_psiq_sub_dsm_iv_or_cont_entries','x3_dg_trs_psiq_dsm_iv_or_cont_entries','x3_dg_trs_psiq_sub_dsm_iv_or_cont_entries','dg_trs_psiq_cie_10_or_cont_entries','dg_trs_psiq_sub_cie_10_or_cont_entries','x2_dg_trs_psiq_cie_10_or_cont_entries','x2_dg_trs_psiq_sub_cie_10_or_cont_entries','x3_dg_trs_psiq_cie_10_or_cont_entries','x3_dg_trs_psiq_sub_cie_10_or_cont_entries','diagnostico_trs_fisico_cont_entries','otros_probl_at_sm_or_cont_entries')

out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")
  
CONS_C1_df_dup_JUL_2020_prev_f%>%
  dplyr::mutate(tipo_de_plan_2_for_f_cont_entries=tipo_de_plan_2_for_f)%>%
  dplyr::mutate(tipo_de_plan_2_for_f=tipo_de_plan_2)%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_g",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_g%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_g",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_g%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
      dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>% 
      View()
}
CONS_C1_df_dup_JUL_2020_prev_g%>%
  dplyr::mutate(across(c(x4_dg_trs_psiq_dsm_iv_or, x4_dg_trs_psiq_sub_dsm_iv_or, x4_dg_trs_psiq_cie_10_or, x5_dg_trs_psiq_cie_10_or, x6_dg_trs_psiq_cie_10_or, x4_dg_trs_psiq_sub_cie_10_or),~.,.names = "{col}_final"))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_g",., envir = .GlobalEnv)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#h. Sum values
#_#_#_#_#_#_#_
out_cols <-c("dias_trat_imp_cont_entries","dias_trat_inv_cont_entries")

out_cols_final <-paste0(gsub("_cont_entries", "", out_cols),"_final")
  
CONS_C1_df_dup_JUL_2020_prev_g%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_h",., envir = .GlobalEnv)

for(col in out_cols){
  old = gsub("_cont_entries", "", col)
  new = paste0(gsub("_cont_entries", "", col),"_final")
  
  CONS_C1_df_dup_JUL_2020_prev_h%>% dplyr::mutate(!!sym(new):= ifelse(is.na(concat_hash_id_treatments), as.character(!!sym(old)), as.character(!!sym(col))))%>%
  assign("CONS_C1_df_dup_JUL_2020_prev_h",., envir = .GlobalEnv)
}
no_mostrar="si"
if(no_mostrar=="no"){
    CONS_C1_df_dup_JUL_2020_prev_h%>%
      dplyr::filter(!is.na(concat_hash_id_treatments))%>%
      dplyr::select(concat_hash_id_treatments,!!(old),!!(out_cols),!!(out_cols_final))%>% 
      View()
}
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_      
#_#_#_#_#_#_#_
#Consolidation

CONS_C1_df_dup_JUL_2020_prev_h%>%
dplyr::select(c('row', 'row_cont_entries','concat_hash_id_treatments','id_centro_concat_a','obs_concat_a','tipo_de_plan_2_concat_a','rn_common_treats2','id_centro_final','tipo_de_plan_2_final','obs_final','tipo_centro_final','servicio_de_salud_final','senda_final','nombre_centro_final','numero_de_hijos_mod_final','ano_bd_last_final','num_hijos_trat_res_mod_final','tipo_centro_derivacion_final','fech_egres_imp_final','motivodeegreso_mod_imp_final','macrozona_final','nombre_region_final','comuna_residencia_cod_final','fech_ing_final','fecha_ingreso_a_convenio_senda_final','identidad_de_genero_final','edad_al_ing_final','origen_ingreso_mod_final','ano_bd_first_final','x_se_trata_mujer_emb_final','compromiso_biopsicosocial_final','dg_global_nec_int_soc_or_final','dg_nec_int_soc_cap_hum_or_final','dg_nec_int_soc_cap_fis_or_final','dg_nec_int_soc_cap_soc_or_final','usuario_tribunal_trat_droga_final','evaluacindelprocesoteraputico_final','eva_consumo_final','eva_fam_final','eva_relinterp_final','eva_ocupacion_final','eva_sm_final','eva_fisica_final','eva_transgnorma_final','dg_global_nec_int_soc_or_1_final','dg_nec_int_soc_cap_hum_or_1_final','dg_nec_int_soc_cap_fis_or_1_final','dg_nec_int_soc_cap_soc_or_1_final','tiene_menores_de_edad_a_cargo_final','ha_estado_embarazada_egreso_final','discapacidad_final','opcion_discapacidad_final','escolaridad_final','edad_al_ing_grupos_final','hash_key_final','id_final','nacionalidad_final','hash_rut_completo_final','sexo_2_final','embarazo_final','id_mod_final','fech_nac_final','edad_ini_cons_final','edad_ini_sus_prin_final','estado_conyugal_2_final','edad_grupos_final','freq_cons_sus_prin_final','via_adm_sus_prin_act_final','etnia_cor_final','nacionalidad_2_final','etnia_cor_2_final','sus_ini_2_mod_final','sus_ini_3_mod_final','sus_ini_mod_final','at_least_one_cont_entry_final','con_quien_vive_final','estatus_ocupacional_final','cat_ocupacional_final','sus_principal_mod_final','tipo_de_vivienda_mod_final','tenencia_de_la_vivienda_mod_final','rubro_trabaja_mod_final','otras_sus1_mod_final','otras_sus2_mod_final','otras_sus3_mod_final','dg_trs_cons_sus_or_final','tipo_de_programa_2_final','tipo_de_plan_2_for_f_final','dg_trs_psiq_dsm_iv_or_final','dg_trs_psiq_sub_dsm_iv_or_final','x2_dg_trs_psiq_dsm_iv_or_final','x2_dg_trs_psiq_sub_dsm_iv_or_final','x3_dg_trs_psiq_dsm_iv_or_final','x3_dg_trs_psiq_sub_dsm_iv_or_final','dg_trs_psiq_cie_10_or_final','dg_trs_psiq_sub_cie_10_or_final','x2_dg_trs_psiq_cie_10_or_final','x2_dg_trs_psiq_sub_cie_10_or_final','x3_dg_trs_psiq_cie_10_or_final','x3_dg_trs_psiq_sub_cie_10_or_final','diagnostico_trs_fisico_final','otros_probl_at_sm_or_final','x4_dg_trs_psiq_dsm_iv_or_final','x4_dg_trs_psiq_sub_dsm_iv_or_final','x4_dg_trs_psiq_cie_10_or_final','x5_dg_trs_psiq_cie_10_or_final','x6_dg_trs_psiq_cie_10_or_final','x4_dg_trs_psiq_sub_cie_10_or_final','dias_trat_imp_final','dias_trat_inv_final'))%>%
  assign("CONS_C1_df_dup_JUL_2020_cons",., envir = .GlobalEnv)
  
#row_cont_entries - It is concatenated
#tipo_de_plan_2_cont_entries -  table(CONS_C1_df_dup_JUL_2020_prev_h$tipo_de_plan_2_cont_entries) - 2. last treatment (b)
#obs_cont_entries - obs with the same value - 2. same value (e)

invisible(c("6. Standardize the admission and discharge dates at the end"))
invisible(c("7. Keep these and remove the mod prefix"))
invisible(c("9. This bind must take into account that some variables were converted to numeric, while others were summarized, such as otras_sus_1, etc."))
invisible(c("11. Extend the validation and ordering of variables such as dg_nec_, eva_ and evaluacindelprocesoteraputico"))
invisible(c("12. Build the variable type of program 2, based on the plans"))
invisible(c("13. Validate the other substances so that they do not repeat the information of the preceding otras sus variable"))
invisible(c("15. Check factors"))
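The loop repeated under each rule (a to h) follows a single pattern: for every "_cont_entries" column, a "_final" column takes the collapsed value when the row belongs to a collapsed treatment (non-missing concat_hash_id_treatments) and the original value otherwise. The following is a minimal sketch of that pattern, assuming a hypothetical helper add_final_cols() and toy data; neither is part of the original code.

```r
library(dplyr)
library(rlang)

# Hypothetical helper (an illustration, not the original implementation):
# builds "<var>_final" from "<var>" and "<var>_cont_entries".
add_final_cols <- function(df, cont_cols, flag = "concat_hash_id_treatments") {
  for (col in cont_cols) {
    old <- sub("_cont_entries$", "", col)
    new <- paste0(old, "_final")
    df <- df %>%
      mutate(!!sym(new) := ifelse(is.na(.data[[flag]]),
                                  as.character(.data[[old]]),
                                  as.character(.data[[col]])))
  }
  df
}

# Toy data: row 1 is a stand-alone entry, row 2 comes from collapsed entries
toy <- tibble(
  concat_hash_id_treatments = c(NA, "a_1; a_2"),
  nombre_centro = c("Centro A", "Centro B"),
  nombre_centro_cont_entries = c(NA, "Centro C")
)

toy_final <- add_final_cols(toy, "nombre_centro_cont_entries")
toy_final$nombre_centro_final
```

Returning the data frame from a function, instead of calling assign() on the global environment inside the loop, keeps each rule's input and output explicit; this is a design preference, not a change in the result.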

Standardization of variables

We generated variables that compare each treatment with the user's following treatment, when there was one (obs_cambios). We also calculated the length of each treatment in days (dias_treat_imp_sin_na); when a treatment had no date of discharge, we used the difference between the date of admission and the date on which the dataset was retrieved (2019-11-13; dates in this document follow the "year-month-day" format). For analytic purposes, we added two categories to the cause of discharge for treatments without a date of discharge: ongoing treatments lasting fewer than 1,095 days ("En curso") and ongoing treatments lasting 1,095 days or more ("En curso (>=1095 d)"). We also included two variables indicating whether fewer than 45 (menor_45_dias_diff) or 60 (menor_60_dias_diff) days elapsed between a treatment and the following one (if any). Finally, we generated the variable abandono_temprano to distinguish treatments that lasted at least 90 days (about three months) from those that did not.

#       main= "Joint effect of changes in treatment characteristics and cause of \n discharge (referral) on the probability that the treatment lasts 60 days or less",

CONS_C1_df_dup_JUL_2020_cons%>%
    dplyr::rename_with(~sub("_final$", "", .x), .cols = dplyr::matches("_final$"))%>%
    dplyr::select(row, row_cont_entries,hash_key,hash_rut_completo,id,id_mod,fech_ing,fech_egres_imp,tipo_de_plan_2,tipo_de_plan_2_for_f,tipo_de_plan_2_concat_a,tipo_de_programa_2,id_centro,nombre_centro,id_centro_concat_a,everything())%>%
  dplyr::relocate(ano_bd_first,ano_bd_last,obs,obs_concat_a,rn_common_treats2,concat_hash_id_treatments,at_least_one_cont_entry, .after = last_col())%>%
  dplyr::select(-contains("dias_trat"))%>%
  dplyr::mutate(senda_concat_a=senda)%>%
  dplyr::mutate(senda=sub(".*\\; ","",senda))%>%
  dplyr::mutate(tipo_centro_concat_a=tipo_centro)%>%
  dplyr::mutate(tipo_centro=sub(".*\\; ","",tipo_centro))%>%
  dplyr::rename("tipo_de_plan_2_largest_treat"="tipo_de_plan_2_for_f")%>%
  #CONS_C1_df_dup_JUL_2020_cons2%>%dplyr::mutate(esto=ifelse(tipo_de_plan_2_for_f!=tipo_de_plan_2,1,0))%>% filter(esto==1)%>% View()
  
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(fech_ing_num=as.numeric(as.Date(fech_ing)))%>%
  dplyr::mutate(fech_egres_num=as.numeric(as.Date(fech_egres_imp)))%>%
  dplyr::mutate(fech_egres_num=ifelse(is.na(fech_egres_imp),18213,fech_egres_num))%>% #equivalent to 2019-11-13
  dplyr::mutate(fech_ing_next_treat=dplyr::lag(fech_ing_num))%>%
  dplyr::mutate(diff_bet_treat=fech_ing_next_treat-fech_egres_num)%>%
  dplyr::mutate(id_centro_sig_trat=dplyr::lag(id_centro)) %>%
  dplyr::mutate(tipo_plan_sig_trat=dplyr::lag(tipo_de_plan_2)) %>%
  dplyr::mutate(tipo_programa_sig_trat=dplyr::lag(tipo_de_programa_2)) %>%
  dplyr::mutate(senda_sig_trat=dplyr::lag(senda)) %>%
  dplyr::ungroup()%>%
  #to keep only the relevant cases, i.e., those that can be compared with a following treatment; the others are not of interest
  dplyr::mutate(menor_60_dias_diff=case_when(diff_bet_treat<60~1,TRUE~0))%>%
  dplyr::mutate(menor_45_dias_diff= ifelse(diff_bet_treat<45,1,0))%>%
  dplyr::mutate(motivoegreso_derivacion=case_when(motivodeegreso_mod_imp=="Derivación"~1,TRUE~0))%>%
  dplyr::mutate(dias_treat_imp_sin_na=fech_egres_num-fech_ing_num)%>%
  dplyr::mutate(motivodeegreso_mod_imp=ifelse(dias_treat_imp_sin_na<1095 & is.na(fech_egres_imp),"En curso",as.character(motivodeegreso_mod_imp)))%>%
  dplyr::mutate(motivodeegreso_mod_imp=ifelse(dias_treat_imp_sin_na>=1095 & is.na(fech_egres_imp),"En curso (>=1095 d)",as.character(motivodeegreso_mod_imp)))%>%
  dplyr::mutate(motivodeegreso_mod_imp=ifelse(grepl("Abandono Temprano",motivodeegreso_mod_imp) & dias_treat_imp_sin_na>=90,"Abandono Tardio",as.character(motivodeegreso_mod_imp)))%>%
  dplyr::mutate(abandono_temprano=ifelse(dias_treat_imp_sin_na>=90,0,1)) %>%
  dplyr::mutate(abandono_temprano=as.factor(abandono_temprano)) %>%
  dplyr::mutate(abandono_temprano= dplyr::recode(abandono_temprano, "1"="Menos de 90 días", "0"="Mayor o igual a 90 días"))%>%
  
  dplyr::mutate(obs_cambios=case_when(id_centro_sig_trat!=id_centro~"1.1.cambio centro",TRUE~""))%>%
  dplyr::mutate(obs_cambios=case_when(tipo_plan_sig_trat!=tipo_de_plan_2~glue::glue("{obs_cambios};1.2.cambio tipo plan"),TRUE~obs_cambios))%>%
  dplyr::mutate(obs_cambios=case_when(tipo_programa_sig_trat!=tipo_de_programa_2~glue::glue("{obs_cambios};1.3.cambio tipo programa"),TRUE~obs_cambios))%>%
  dplyr::mutate(obs_cambios=case_when(senda_sig_trat!=senda~glue::glue("{obs_cambios};1.4.cambio senda"),TRUE~obs_cambios))%>%
  dplyr::mutate(obs_cambios_ninguno=case_when(obs_cambios==""~1,TRUE~0))%>%
  dplyr::mutate(obs_cambios_num=case_when(id_centro_sig_trat!=id_centro~1,TRUE~0))%>%
  dplyr::mutate(obs_cambios_num=case_when(tipo_plan_sig_trat!=tipo_de_plan_2~obs_cambios_num+1,TRUE~obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_num=case_when(tipo_programa_sig_trat!=tipo_de_programa_2~obs_cambios_num+1,TRUE~obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_num=case_when(senda_sig_trat!=senda~obs_cambios_num+1,TRUE~obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_num=as.numeric(obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_fac=obs_cambios_num)%>%
  dplyr::mutate(menor_45_dias_diff= recode(as.character(menor_45_dias_diff),"0"=">= 45 Days of Difference Between Entries","1"="<45 Days of Difference Between Entries"))%>%
  dplyr::mutate(menor_60_dias_diff= recode(as.character(menor_60_dias_diff),"0"=">= 60 Days of Difference Between Entries","1"="<60 Days of Difference Between Entries"))%>%
  dplyr::mutate(obs_cambios_ninguno= recode(as.character(obs_cambios_ninguno),"0"="At least 1 Change w/ the Next Entry","1"="No Changes w/ the Next Entry"))%>%
  dplyr::mutate(motivoegreso_derivacion= recode(as.character(motivoegreso_derivacion),"0"="Other causes of discharge","1"="Referral"))%>%
  dplyr::mutate_at(c('menor_45_dias_diff','menor_60_dias_diff','motivoegreso_derivacion','obs_cambios_ninguno','obs_cambios_fac'),~as.factor(.))%>%
  dplyr::mutate(via_adm_sus_prin_act=ifelse(grepl("Fumada o Pulmonar",via_adm_sus_prin_act),"Fumada o Pulmonar (aspiración de gases o vapores)",as.character(via_adm_sus_prin_act)))%>%
  assign("CONS_C1_df_dup_JUL_2020_cons2",., envir = .GlobalEnv)
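The gap between consecutive treatments and the length of each treatment can be illustrated on toy data. This is a sketch under stated assumptions: the three rows are made up, entries are sorted in ascending order of admission within each user, and lead() is therefore used to reach the following treatment (the chunk above relies on dplyr::lag() and on the row order inherited from earlier steps).

```r
library(dplyr)

# Retrieval date of the dataset, as a number of days since 1970-01-01
cutoff <- as.numeric(as.Date("2019-11-13"))

# Made-up entries: user u1 has two treatments, user u2 has one ongoing treatment
toy <- tibble(
  hash_key = c("u1", "u1", "u2"),
  fech_ing = as.Date(c("2015-01-10", "2015-06-01", "2019-01-01")),
  fech_egres_imp = as.Date(c("2015-05-01", "2015-07-15", NA))
)

toy_gaps <- toy %>%
  arrange(hash_key, fech_ing) %>%
  group_by(hash_key) %>%
  mutate(
    fech_ing_num = as.numeric(fech_ing),
    # Missing discharge dates are replaced by the retrieval date
    fech_egres_num = coalesce(as.numeric(fech_egres_imp), cutoff),
    # Days from this discharge to the admission of the user's next treatment
    diff_bet_treat = lead(fech_ing_num) - fech_egres_num,
    dias_treat_imp_sin_na = fech_egres_num - fech_ing_num
  ) %>%
  ungroup() %>%
  mutate(
    menor_45_dias_diff = !is.na(diff_bet_treat) & diff_bet_treat < 45,
    abandono_temprano = dias_treat_imp_sin_na < 90
  )

toy_gaps %>% select(hash_key, diff_bet_treat, dias_treat_imp_sin_na,
                    menor_45_dias_diff, abandono_temprano)
```

In this toy case only the first treatment of u1 has a following treatment, so it is the only row with a non-missing gap.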

We set the number of children/dependents (numero_de_hijos_mod) to missing when it exceeded 10 and either differed from the number of children admitted with the user to residential treatment (num_hijos_trat_res_mod) or that number was missing. We replaced the number of children/dependents with num_hijos_trat_res_mod when the user reported living with children, declared 0 children/dependents, and had more than 0 children admitted to residential treatment. We then set the number of children in residential treatment to missing if the user declared having no children/dependents but reported being admitted with at least one, and we declared the number of children invalid if the user reported none but declared living with them. Finally, we collapsed the age of onset of use of the primary substance into groups (<=15, 16-18, 19-24 and >=25) in the variable edad_ini_sus_prin_grupos.

invisible(c("1. Validate the number of children against the number of children admitted to residential treatment"))
invisible(c("2. tipo_centro_derivacion: MAKE SURE THE CAUSE OF DISCHARGE IS REFERRAL"))
invisible(c("num_hijos_trat_res numero_de_hijos_mod, num_hijos_trat_res_mod rubro_trabaja_mod tenencia_de_la_vivienda_mod tipo_de_vivienda_mod"))
#numero_de_hijos
#num_hijos_ing_trat_res
#edad_al_ing
#edad_ini_cons
#edad_ini_sus_prin

#Cohabitation in the 30 days prior to admission to treatment

CONS_C1_df_dup_JUL_2020_cons2%>%
  dplyr::mutate(across(c(numero_de_hijos_mod, num_hijos_trat_res_mod, edad_ini_cons,edad_ini_sus_prin),~as.integer(.)))%>%
dplyr::mutate(across(c(edad_al_ing),~as.numeric(.)))%>%
  #more than 10 children, and (the number of children in residential treatment differs from the number of children | the number of children in residential treatment is missing)
    dplyr::mutate(numero_de_hijos_mod= dplyr::case_when(numero_de_hijos_mod>10 & (num_hijos_trat_res_mod!=numero_de_hijos_mod|is.na(num_hijos_trat_res_mod))~NA_integer_,TRUE~numero_de_hijos_mod))%>%
  #dplyr::filter(numero_de_hijos>10,num_hijos_ing_trat_res!=numero_de_hijos)%>% dplyr::select(numero_de_hijos_mod,num_hijos_ing_trat_res,numero_de_hijos)
  dplyr::mutate(numero_de_hijos_mod= dplyr::case_when(grepl("hij",con_quien_vive,ignore.case=T) & numero_de_hijos_mod==0 & num_hijos_trat_res_mod>0~num_hijos_trat_res_mod, TRUE~numero_de_hijos_mod))%>%
  #dplyr::filter(numero_de_hijos_mod==num_hijos_ing_trat_res,num_hijos_ing_trat_res>0,grepl("hij",con_quien_vive))%>%dplyr::select(numero_de_hijos_mod,num_hijos_ing_trat_res,numero_de_hijos,con_quien_vive) 
  dplyr::mutate(num_hijos_trat_res_mod= dplyr::case_when(numero_de_hijos_mod==0 & num_hijos_trat_res_mod>0 ~NA_integer_,TRUE~num_hijos_trat_res_mod))%>%
  dplyr::mutate(numero_de_hijos_mod= dplyr::case_when(numero_de_hijos_mod==0 & grepl("hij",con_quien_vive,ignore.case=T)~NA_integer_,TRUE~numero_de_hijos_mod))%>%
  dplyr::mutate(edad_ini_sus_prin_grupos=ifelse(edad_ini_sus_prin>=25,">=25",
                                                ifelse(edad_ini_sus_prin>18,"19-24",
                                                ifelse(edad_ini_sus_prin>15,"16-18",
                                                ifelse(edad_ini_sus_prin>0,"<=15",
                                                NA_character_)))))%>% 
assign("CONS_C1_df_dup_JUL_2020_cons3",., envir = .GlobalEnv)

invisible(c("2. tipo_centro_derivacion: MAKE SURE THE CAUSE OF DISCHARGE IS REFERRAL"))
invisible(
  tabyl(CONS_C1_df_dup_JUN_2020,tipo_centro_derivacion,motivodeegreso_mod_imp)
)
invisible(
tabyl(CONS_C1_df_dup_JUL_2020_cons3,tipo_centro_derivacion,motivodeegreso_mod_imp)
)
invisible(c("I had it wrong because it was replacing values when the condition was not met"))
invisible(23+3+11+91+379+76+39+7+28+23945)
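The age groups built earlier with nested ifelse() calls could also be expressed with cut(). The sketch below uses made-up ages and assumes integer ages, as in the original after as.integer(); with right-closed intervals, an age of 0 falls outside every group and becomes missing, matching the nested version.

```r
# Made-up ages of onset, including boundary values, 0 and a missing value
edad_ini_sus_prin <- c(0L, 12L, 15L, 16L, 18L, 19L, 24L, 25L, 40L, NA)

# Intervals are (0,15], (15,18], (18,24], (24,Inf]
grupos <- cut(edad_ini_sus_prin,
              breaks = c(0, 15, 18, 24, Inf),
              labels = c("<=15", "16-18", "19-24", ">=25"),
              right = TRUE)

data.frame(edad_ini_sus_prin, edad_ini_sus_prin_grupos = as.character(grupos))
```

A side benefit of cut() is that the result is an ordered set of factor levels, which keeps the groups in a sensible order in tables and plots.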

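We kept a positive answer to the question on having minors under the user's care (tiene_menores_de_edad_a_cargo) only when the user also reported at least one child/dependent (numero_de_hijos_mod greater than 0); otherwise, the answer was recoded to "no".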

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
# Has minors under their care
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
invisible(c("10. Extend the validation of tiene_menores_de_edad_a_cargo to whether the user has children"))
invisible(
  CONS_C1_df_dup_JUL_2020_cons3%>% janitor::tabyl(tiene_menores_de_edad_a_cargo,numero_de_hijos_mod)
)
CONS_C1_df_dup_JUL_2020_cons3%>%
    dplyr::mutate(tiene_menores_de_edad_a_cargo=ifelse(numero_de_hijos_mod>0 & tiene_menores_de_edad_a_cargo=="si","si","no"))%>%
  assign("CONS_C1_df_dup_JUL_2020_cons4",., envir = .GlobalEnv)
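Because the rule above combines two conditions with &, missing values propagate in a specific way: FALSE & NA is FALSE, while TRUE & NA is NA. A small sketch with made-up values shows which combinations end up as "si", "no" or missing.

```r
# Made-up combinations of number of children and the reported answer
numero_de_hijos_mod <- c(2L, 0L, NA, NA, 3L, 0L)
tiene_menores_de_edad_a_cargo <- c("si", "si", "si", "no", NA, NA)

# Same expression as in the chunk above, applied to the toy vectors
resultado <- ifelse(numero_de_hijos_mod > 0 & tiene_menores_de_edad_a_cargo == "si",
                    "si", "no")

data.frame(numero_de_hijos_mod, tiene_menores_de_edad_a_cargo, resultado)
```

Rows where one condition is known to be false are recoded to "no" even if the other value is missing; rows where the only known condition is true remain missing.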

We focused on cases whose treatments differed in gender identity, type of plan, or type of program. As noted by SENDA professionals, the type of program is subordinate to the type of plan of each user. We kept the gender identity reported in intermediate entries; however, when the entries were collapsed into treatments, some information may have overlapped and generated inconsistencies. This is why we checked for them, particularly those involving gender-specific programs.

We assumed that women, and users who reported a feminine gender identity, could be in either a women-specific or a general-population program, whereas men with a masculine or unreported gender identity could only be in a general-population program. We are still investigating whether the inconsistent cases stem from an incorrect assignment of sex or from a problem in the classification of the program.
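The rule stated above can be sketched as a filter that flags users who are not recorded as women, do not report a feminine gender identity, and are nevertheless in a women-specific program. The toy rows and the program labels ("Programa Mujeres", "Programa General") are assumptions made for illustration; only the column names and the values "Mujer" and "Femenino" come from the code in this page.

```r
library(dplyr)

# Made-up cases covering the four relevant combinations
toy <- tibble(
  row = 1:4,
  sexo_2 = c("Mujer", "Hombre", "Hombre", "Hombre"),
  identidad_de_genero = c(NA, "Femenino", "Masculino", NA),
  tipo_de_programa_2 = c("Programa Mujeres", "Programa Mujeres",
                         "Programa Mujeres", "Programa General")
)

# Flag: not a woman, in a women-specific program, and without a feminine identity
flagged <- toy %>%
  filter(sexo_2 != "Mujer", grepl("Muje", tipo_de_programa_2)) %>%
  filter(identidad_de_genero != "Femenino" | is.na(identidad_de_genero))

flagged$row
```

Only the third toy row is flagged: the first is a woman, the second reports a feminine identity, and the fourth is in a general-population program.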

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
# Type of plan and type of program
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
invisible(
CONS_C1_df_dup_JUL_2020_cons4%>%
    dplyr::select(hash_key, fech_ing, fech_egres_imp)
)
#CONS_C1_df_dup_JUL_2020_cons4%>%janitor::tabyl(tipo_de_programa_2,tipo_de_plan_2,sexo_2)

invisible(c("a- they are not women, have a gender identity other than feminine, and are in a women-specific program."))
  #CONS_C1_df_dup_JUL_2020_cons4%>% dplyr::filter(sexo_2!="Mujer",grepl("Muje",tipo_de_programa_2))%>% dplyr::filter(identidad_de_genero!="Femenino"|is.na(identidad_de_genero))
#tipo_de_plan_2_largest_treat
#CONS_C1_df_dup_JUL_2020_cons4%>% dplyr::filter(sexo_2!="Mujer",grepl("Muje",tipo_de_programa_2))%>% dplyr::filter(identidad_de_genero!="Femenino"|is.na(identidad_de_genero))%>% dplyr::select(obs)%>% dplyr::filter(grepl("2.6.01",obs)) #40 of 96 have another gender identity. Investigate.
invisible(c("check the centers that have 'mujeres' in their name; an inconsistency may be hidden there. Check whether the problem lies in the assignment of sex or of the program"))

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_

muestra=1
if(muestra==0) {
###### 1.2. More than one value of sex per user (Duplicates 4) ##############
      casos_interes_sexo_tipo_programa_dup4<-
      CONS_C1_df_dup_JUL_2020_cons4%>% 
        dplyr::filter(sexo_2!="Mujer",grepl("Muje",tipo_de_programa_2))%>% 
        dplyr::filter(identidad_de_genero!="Femenino"|is.na(identidad_de_genero))  
        
      casos_interes_sexo_tipo_plan_dup4<-
        CONS_C1_df_dup_JUL_2020_cons4%>% 
        dplyr::filter(sexo_2!="Mujer",grepl("M-",tipo_de_plan_2))%>% 
        dplyr::filter(identidad_de_genero!="Femenino"|is.na(identidad_de_genero))
      
      casos_interes_sexo_tipo_programa_dup4%>%
        dplyr::select(row, row_cont_entries, hash_key, sexo_2, fech_ing, fech_egres_imp, tipo_de_plan_2,tipo_de_plan_2_largest_treat,tipo_de_programa_2,identidad_de_genero,obs)%>%
        left_join(CONS_C1_df_dup_ENE_2020_prev%>%janitor::clean_names()%>%dplyr::select(row,hash_key, sexo_2,tipo_de_plan_2, tipo_de_programa_2, identidad_de_genero),
                   by="hash_key", suffix=c("","_original"))%>% guardar_tablas("problema_programa_sexo")
      
      casos_interes_sexo_tipo_plan_dup4%>%
        dplyr::select(row, row_cont_entries, hash_key, sexo_2, fech_ing, fech_egres_imp, tipo_de_plan_2,tipo_de_plan_2_largest_treat,tipo_de_programa_2,identidad_de_genero,obs)%>%
        left_join(CONS_C1_df_dup_ENE_2020_prev%>%janitor::clean_names()%>%dplyr::select(row,hash_key, sexo_2,tipo_de_plan_2, tipo_de_programa_2, identidad_de_genero),
                  by="hash_key", suffix=c("","_original"))%>% guardar_tablas("problema_plan_sexo3")
}

invisible(c("a- they are not women, have a gender identity other than feminine, and are in a women-specific program."))

library(readxl)
problema_plan_sexo_analisis <- read_excel(paste0(path,"/problema_plan_sexo analisis.xlsx"),skip = 1)
#sexo_CAMBIOS SEXO (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)
#tipo_de_plan_CAMBIOS PLAN (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)
#tipo_de_programa_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)
#identidad_de_genero_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)
problema_plan_sexo_analisis_filter<-
problema_plan_sexo_analisis%>%
  dplyr::group_by(row)%>%
  dplyr::select(hash_key,`sexo_CAMBIOS SEXO (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,`tipo_de_plan_CAMBIOS PLAN (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,`tipo_de_programa_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,`identidad_de_genero_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`)%>%
  slice(1)

Adding missing grouping variables: `row`
slice (grouped): removed 132 rows (58%), 95 rows remaining

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#INCORPORATE THE CHANGES INTO THE MAIN DATABASE

#CONS_C1_df_dup_JUL_2020_cons4%>% dplyr::filter(sexo_2!="Mujer", identidad_de_genero!="Femenino",grepl("M-",tipo_de_plan_2))
#CONS_C1_df_dup_JUL_2020_cons4%>% dplyr::filter(sexo_2!="Mujer", is.na(identidad_de_genero),grepl("Muje",tipo_de_programa_2), grepl("M-",tipo_de_plan_2))

CONS_C1_df_dup_JUL_2020_cons4%>%
  #dplyr::mutate(tipo_de_programa_2= tipo_de_programa_2, tipo_de_plan_2_for_f)%>%
  #*6 tipo_de_programa tipo_de_plan: THE PLAN TAKES PRECEDENCE - M-PAI must be in the women program. Change to women if it is in general population // no cases
  dplyr::left_join(problema_plan_sexo_analisis_filter, by="row", suffix=c("","_sex_program"))%>%
  dplyr::mutate(changes= "")%>%
  dplyr::mutate(changes= dplyr::case_when(as.character(sexo_2)!=`sexo_CAMBIOS SEXO (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`~glue::glue("sex"),TRUE~""))%>%
  dplyr::mutate(changes= dplyr::case_when(as.character(tipo_de_plan_2)!=`tipo_de_plan_CAMBIOS PLAN (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`~glue::glue("{changes};plan"),TRUE~changes))%>%
  dplyr::mutate(changes= dplyr::case_when(as.character(tipo_de_programa_2)!=`tipo_de_programa_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`~glue::glue("{changes};prog"),TRUE~changes))%>%
  dplyr::mutate(changes= dplyr::case_when(as.character(identidad_de_genero)!=`identidad_de_genero_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`~glue::glue("{changes};gen"),TRUE~changes))%>%
  dplyr::mutate(changes=sub("^;","",changes))%>%
#Replace values where a correction was selected
  dplyr::mutate(sexo_2= ifelse(!is.na(hash_key_sex_program),`sexo_CAMBIOS SEXO (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,sexo_2))%>%
  dplyr::mutate(tipo_de_plan_2= ifelse(!is.na(hash_key_sex_program),`tipo_de_plan_CAMBIOS PLAN (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,tipo_de_plan_2))%>%
  dplyr::mutate(tipo_de_programa_2= ifelse(!is.na(hash_key_sex_program),`tipo_de_programa_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,tipo_de_programa_2))%>%
  dplyr::mutate(identidad_de_genero= ifelse(!is.na(hash_key_sex_program),`identidad_de_genero_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,identidad_de_genero))%>%
#glimpse(CONS_C1_df_dup_JUL_2020_cons4%>% select(sexo_2,tipo_de_plan_2,tipo_de_programa_2,identidad_de_genero))
  dplyr::mutate(obs=case_when(!is.na(hash_key_sex_program)~glue::glue("{obs};4.05.Inconsistent Sex, Gender or Treatment ({changes})"),TRUE~obs))%>%
  dplyr::select(-`tipo_de_plan_CAMBIOS PLAN (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,-`tipo_de_programa_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,-`identidad_de_genero_CAMBIOS PROGRAMA (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`,-changes)%>%
  dplyr::ungroup()%>%
  #dplyr::filter(grepl("4.05.",obs))%>%  janitor::tabyl(obs) #para ver si el obs.
      assign("CONS_C1_df_dup_JUL_2020_cons5a",., envir = .GlobalEnv)

CONS_C1_df_dup_JUL_2020_cons5a%>%
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(dis_sex=n_distinct(sexo_2))%>%
  #dplyr::filter(dis_sex>1)%>%
  dplyr::rename("sexo_rec"=`sexo_CAMBIOS SEXO (ID EMBARAZO NOMBE CENTRO Y OTRAS VARIABLES USER-INVARIANT DEBIESE SER POR HASH)`)%>%
  dplyr::mutate(sexo_rec= max(sexo_rec,na.rm=T))%>%
  dplyr::mutate(sexo_rec=ifelse(is.na(sexo_rec),sexo_2,sexo_rec))%>%
  dplyr::mutate(sexo_2=ifelse(sexo_2!=sexo_rec,sexo_rec,sexo_2))%>%
  #dplyr::select(row,hash_key,fech_ing,fech_egres_imp,ano_bd_first,ano_bd_last,sexo_2,embarazo, identidad_de_genero,tipo_de_plan_2,tipo_de_programa_2,senda, obs, sexo_rec)#x_se_trata_de_una_mujer_embarazada
  dplyr::mutate(id_mod=ifelse(sexo_2=="Mujer",`substr<-`(id_mod,5,5,"2"),`substr<-`(id_mod,5,5,"1")))%>%
  dplyr::mutate(id=ifelse(sexo_2=="Mujer",`substr<-`(id,5,5,"2"),`substr<-`(id,5,5,"1")))%>%
  dplyr::ungroup()%>%
  dplyr::mutate(centro_muj=ifelse(grepl("(Mujeres)",nombre_centro,ignore.case=T),1,0))%>%
  dplyr::select(-sexo_rec,-dis_sex)%>%
  assign("CONS_C1_df_dup_JUL_2020_cons6",., envir = .GlobalEnv)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#INCORPORATE THE CHANGES INTO THE MAIN DATABASE
invisible(c("a- they are not women, have a gender identity other than female, and are in a women's program type."))

CONS_C1_df_dup_JUL_2020_cons6%>% 
  dplyr::filter(sexo_2!="Mujer",grepl("Muje",tipo_de_programa_2))%>%
  dplyr::filter(identidad_de_genero!="Femenino"|is.na(identidad_de_genero))%>%
  nrow()

[1] 0

  #janitor::tabyl(progr_incoherente)%>%
  #dplyr::mutate(progr_incoherente= dplyr::case_when(sexo_2!="Mujer" & grepl("Muje",tipo_de_programa_2) & (identidad_de_genero!="Femenino"|is.na(identidad_de_genero))~1,TRUE~0))%>%janitor::tabyl(progr_incoherente)
  #dplyr::filter(progr_incoherente==1)

muestra=1
if(muestra==0) {
    CONS_C1_df_dup_ENE_2020_prev6%>%
      janitor::clean_names()%>%
      dplyr::filter(hash_key %in% unlist(problema_plan_sexo_analisis_filter[,"hash_key"]))%>%
      dplyr::select(row,hash_key,fech_ing,fech_egres,ano_bd,sexo_2,embarazo, identidad_de_genero,tipo_de_plan_2,tipo_de_programa_2,senda, x_se_trata_de_una_mujer_embarazada, obs)
}
  #dplyr::mutate(sexo_2.1=as.factor(sexo_2.1))%>%
invisible(c("dg_trs_cons_sus_or","tipo_de_programa_2","dg_trs_cons_sus_or_cont_entries","tipo_de_programa_2_cont_entries"))

Some users had their recorded sex corrected, and the correction had to be propagated to the remaining entries of the same users (users= 13; n=40).
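
The propagation of a user-invariant correction can be sketched on toy data. This is a minimal illustration in base R, not the pipeline's own code: the data frame `toy` and its values are hypothetical, and `sexo_rec` stands in for the manually reviewed column.

```r
# Toy data: user "a" has one manually corrected entry (sexo_rec),
# which should apply to every entry of that user.
toy <- data.frame(
  hash_key = c("a", "a", "a", "b", "b"),
  sexo_2   = c("Hombre", "Mujer", "Hombre", "Hombre", "Hombre"),
  sexo_rec = c(NA, "Mujer", NA, NA, NA),
  stringsAsFactors = FALSE
)

# First non-missing correction within each user, or NA if there is none.
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_character_ else x[1]
}
toy$sexo_rec <- ave(toy$sexo_rec, toy$hash_key, FUN = first_non_na)

# Apply the correction where available; keep the recorded value otherwise.
toy$sexo_2 <- ifelse(is.na(toy$sexo_rec), toy$sexo_2, toy$sexo_rec)
toy$sexo_2
```

Users without any reviewed entry keep their recorded value, which is the behavior assumed here for the full dataset.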

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
# Opcion_discapacidad
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
invisible(c("14. Validate opcion_discapacidad against discapacidad ---> it is fine"))
invisible(c("16. opcion_discapacidad could be retrieved from a previous database, e.g., when the person had reported it"))
CONS_C1_df_dup_JUL_2020_cons6%>% janitor::tabyl(discapacidad,opcion_discapacidad)

We ordered specific variables by severity or level of vulnerability, prefixing each category with an arbitrary number that increases with vulnerability or severity.

CONS_C1_df_dup_JUL_2020_cons6%>% 
    dplyr::mutate(compromiso_biopsicosocial=dplyr::case_when(compromiso_biopsicosocial=="1"~"1-Leve",
                                                         compromiso_biopsicosocial=="2"~"2-Moderado",
                                                         compromiso_biopsicosocial=="3"~"3-Severo",TRUE~NA_character_))%>%
    dplyr::mutate(escolaridad=dplyr::case_when(escolaridad=="1"~"1-Mayor a Ed Secundaria",
                                               escolaridad=="2"~"2-Ed Secundaria Completa o Menor",
                                               escolaridad=="3"~"3-Ed Primaria Completa o Menor",TRUE~NA_character_))%>%
    dplyr::mutate(across(c(dg_global_nec_int_soc_or, dg_nec_int_soc_cap_hum_or, dg_nec_int_soc_cap_fis_or, dg_nec_int_soc_cap_soc_or,dg_global_nec_int_soc_or_1, dg_nec_int_soc_cap_hum_or_1, dg_nec_int_soc_cap_fis_or_1,dg_nec_int_soc_cap_soc_or_1),~dplyr::case_when(as.character(.)=="3"~"3-Bajas",as.character(.)=="2"~"2-Medias",as.character(.)=="1"~"1-Altas",TRUE~NA_character_)))%>%
    dplyr::mutate(across(c(evaluacindelprocesoteraputico, eva_consumo, eva_fam, eva_relinterp, eva_ocupacion, eva_sm, eva_fisica, eva_transgnorma),~dplyr::case_when(as.character(.)=="3"~"3-Logro Minimo",as.character(.)=="2"~"2-Logro Intermedio",as.character(.)=="1"~"1-Logro Alto",TRUE~NA_character_)))%>%
  dplyr::mutate(across(c(compromiso_biopsicosocial, escolaridad,dg_global_nec_int_soc_or, dg_nec_int_soc_cap_hum_or, dg_nec_int_soc_cap_fis_or, dg_nec_int_soc_cap_soc_or,dg_global_nec_int_soc_or_1, dg_nec_int_soc_cap_hum_or_1, dg_nec_int_soc_cap_fis_or_1,dg_nec_int_soc_cap_soc_or_1,evaluacindelprocesoteraputico, eva_consumo, eva_fam, eva_relinterp, eva_ocupacion, eva_sm, eva_fisica, eva_transgnorma),~as.factor(.)))%>%
  assign("CONS_C1_df_dup_JUL_2020_cons7",., envir = .GlobalEnv)
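
The effect of the numeric prefixes can be sketched as follows. This is a minimal base R illustration with hypothetical codes; taking the most severe value is only one possible use and is an assumption here, not a rule stated by the pipeline.

```r
# Hypothetical raw codes for biopsychosocial compromise (1 = mild ... 3 = severe).
codes <- c("2", "3", "1", NA, "3")

# Prefix each label with its rank so alphabetical order matches severity order.
labels <- c("1" = "1-Leve", "2" = "2-Moderado", "3" = "3-Severo")
severity <- factor(unname(labels[codes]), levels = unname(labels), ordered = TRUE)

# The most severe value observed, e.g. when several entries are summarized.
most_severe <- max(severity, na.rm = TRUE)
as.character(most_severe)
```

Because the prefix carries the rank, sorting the labels as plain text and sorting them as an ordered factor give the same order.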

invisible(c("17. Review occupational category and status now to look for inconsistencies"))
CONS_C1_df_dup_JUL_2020_cons7%>% janitor::tabyl(estatus_ocupacional,cat_ocupacional,rubro_trabaja_mod)

invisible(c("18. Pregnancies should be recovered for cases reclassified as women"))
CONS_C1_df_dup_JUL_2020_cons7%>% janitor::tabyl(sexo_2,embarazo)

invisible(c("19. Men in women-only centers"))
CONS_C1_df_dup_JUL_2020_cons7%>% 
  janitor::tabyl(centro_muj,sexo_2)
#there is a group of men attending women's centers

Additionally, we corrected the categories of other mental health-related problems (otros_probl_at_sm_or): we repaired mis-encoded labels, removed the category "Sin otros problemas de salud mental" when it was appended to another reported problem, and set empty values to missing.

invisible(janitor::tabyl(CONS_C1_df_dup_JUL_2020_cons7,otros_probl_at_sm_or))

CONS_C1_df_dup_JUL_2020_cons7%>%
  dplyr::mutate(otros_probl_at_sm_or=sub("; Sin otros problemas de salud mental","",otros_probl_at_sm_or))%>%
  dplyr::mutate(otros_probl_at_sm_or=sub("Explotaci\\?omercial Sexual","Explotación Comercial Sexual",otros_probl_at_sm_or))%>%
  dplyr::mutate(otros_probl_at_sm_or=sub("Prisionizaci\\?","Prisionalización",otros_probl_at_sm_or))%>%
  dplyr::mutate(tenencia_de_la_vivienda_mod=sub("Ocupaci\\?","Ocupación Irregular",tenencia_de_la_vivienda_mod))%>%

  dplyr::mutate_at(vars(c("otros_probl_at_sm_or", "dg_trs_psiq_dsm_iv_or", "dg_trs_psiq_sub_dsm_iv_or", "dg_trs_psiq_sub_cie_10_or")),~dplyr::case_when(.==""~NA_character_,TRUE~.))%>%
  
  assign("CONS_C1_df_dup_JUL_2020_cons11",., envir = .GlobalEnv)

CONS_C1_df_dup_JUL_2020_cons11%>%
dplyr::mutate(across(c(dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,x4_dg_trs_psiq_cie_10_or,x5_dg_trs_psiq_cie_10_or,x6_dg_trs_psiq_cie_10_or),~dplyr::case_when(grepl("En estudio",as.character(.),ignore.case = T)~1,grepl("Sin trastorno",as.character(.),ignore.case = T)~0,is.na(.)~0,TRUE~0),.names = "{col}_mod1a"))%>%
  dplyr::mutate(total_cie_10_en_est = base::rowSums(dplyr::select(.,ends_with("_mod1a"))))%>%  
  dplyr::mutate(across(c(dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,x4_dg_trs_psiq_cie_10_or,x5_dg_trs_psiq_cie_10_or,x6_dg_trs_psiq_cie_10_or),~dplyr::case_when(grepl("En estudio",as.character(.),ignore.case = T)~0,grepl("Sin trastorno",as.character(.),ignore.case = T)~0,is.na(.)~0,TRUE~1),.names = "{col}_mod2a"))%>%
  dplyr::mutate(total_cie_10_dg = base::rowSums(dplyr::select(.,ends_with("_mod2a"))))%>%  
    
  dplyr::mutate(cie_10=dplyr::case_when(total_cie_10_dg>0 & total_cie_10_en_est>0~"Diagnosticado/a (1 o más)",
                       total_cie_10_dg>0 & total_cie_10_en_est==0~"Diagnosticado/a (1 o más)",
                       total_cie_10_dg==0 & total_cie_10_en_est>0~"En estudio",
                       TRUE~"Sin información diagnóstica"))%>%
  
  #DSM-IV columns (the first one is assumed to be dg_trs_psiq_dsm_iv_or, as listed in the cleaning step above)
  dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,x4_dg_trs_psiq_dsm_iv_or),~dplyr::case_when(grepl("En estudio",as.character(.),ignore.case = T)~1,grepl("Sin trastorno",as.character(.),ignore.case = T)~0,is.na(.)~0,TRUE~0),.names = "{col}_mod1b"))%>%
  dplyr::mutate(total_dsm_iv_en_est = base::rowSums(dplyr::select(.,ends_with("_mod1b"))))%>%
  dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,x4_dg_trs_psiq_dsm_iv_or),~dplyr::case_when(grepl("En estudio",as.character(.),ignore.case = T)~0,grepl("Sin trastorno",as.character(.),ignore.case = T)~0,is.na(.)~0,TRUE~1),.names = "{col}_mod2b"))%>%
  dplyr::mutate(total_dsm_iv_dg = base::rowSums(dplyr::select(.,ends_with("_mod2b"))))%>%  
  
  dplyr::mutate(dsm_iv=dplyr::case_when(total_dsm_iv_dg>0 & total_dsm_iv_en_est>0~"Diagnosticado/a (1 o más)",
                       total_dsm_iv_dg>0 & total_dsm_iv_en_est==0~"Diagnosticado/a (1 o más)",
                       total_dsm_iv_dg==0 & total_dsm_iv_en_est>0~"En estudio",
                       TRUE~"Sin información diagnóstica"))%>%
  dplyr::select(-ends_with("_mod2b"),-ends_with("_mod2a"),-ends_with("_mod1a"),-ends_with("_mod1b"),
                -total_cie_10_dg,-total_cie_10_en_est,-total_dsm_iv_dg,-total_dsm_iv_en_est)%>%
  
assign("CONS_C1_df_dup_JUL_2020_cons12",., envir = .GlobalEnv)
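
The classification built by the chain above can be sketched on a small example. This is a simplified base R illustration with hypothetical columns `dg_1` and `dg_2`; it reproduces the three resulting labels but not the exact intermediate variables.

```r
# Hypothetical diagnosis columns for three entries.
dx <- data.frame(
  dg_1 = c("Trastorno del ánimo", "En estudio", "Sin trastorno"),
  dg_2 = c("En estudio", NA, NA),
  stringsAsFactors = FALSE
)

# Flag cells under study, and cells holding an actual diagnosis
# (anything that is not missing, "En estudio", or "Sin trastorno").
under_study <- sapply(dx, function(x) grepl("En estudio", x, ignore.case = TRUE))
no_disorder <- sapply(dx, function(x) grepl("Sin trastorno", x, ignore.case = TRUE))
diagnosed   <- !is.na(as.matrix(dx)) & !under_study & !no_disorder

n_study <- rowSums(under_study)
n_dx    <- rowSums(diagnosed)

# At least one diagnosis wins over "En estudio"; otherwise no diagnostic information.
status <- ifelse(n_dx > 0, "Diagnosticado/a (1 o más)",
          ifelse(n_study > 0, "En estudio", "Sin información diagnóstica"))
status
```

An entry with one confirmed diagnosis and another still under study is labeled as diagnosed, matching the first two branches of the original `case_when()`.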

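We also restricted the number of dependants admitted with the user to a residential treatment (num_hijos_trat_res_mod) to residential plans only: values greater than zero recorded under any other type of plan were set to missing.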

#table(CONS_C1_df_dup_JUL_2020_cons12$tipo_de_plan_2,CONS_C1_df_dup_JUL_2020_cons12$num_hijos_trat_res_mod)%>% View()
#
CONS_C1_df_dup_JUL_2020_cons12%>%
  dplyr::mutate(num_hijos_trat_res_mod=dplyr::case_when(
    !grepl("pr",tipo_de_plan_2,ignore.case=T) & num_hijos_trat_res_mod>0~NA_integer_,
    TRUE~num_hijos_trat_res_mod))%>%
assign("CONS_C1_df_dup_JUL_2020_cons13",., envir = .GlobalEnv)
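
A toy version of this restriction, in base R. The data are hypothetical, and identifying residential plans by a code ending in "PR" is an assumption made for the sketch.

```r
toy <- data.frame(
  tipo_de_plan_2 = c("PG-PR", "M-PAI", "M-PR", "PG-PAB"),
  num_hijos_trat_res_mod = c(1L, 2L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Residential plans are assumed to be those whose code ends in "PR".
is_residential <- grepl("PR$", toy$tipo_de_plan_2)

# A positive count outside a residential plan is treated as invalid.
invalid <- !is_residential & toy$num_hijos_trat_res_mod > 0
toy$num_hijos_trat_res_mod[invalid] <- NA_integer_
toy$num_hijos_trat_res_mod
```

Zero counts in non-residential plans are left untouched, since they do not contradict the type of plan.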

In the first steps of the normalization of the database, we collapsed the different plan types into the following: PG-PAB, PG-PAI, PG-PR, M-PAB, M-PAI, and M-PR. Other plans and programs were grouped into the General Population Programs due to their low prevalence (1.2%). Also, notes on the data normalization procedures made by SENDA professionals indicate that the type of plan takes precedence over the type of program. This is why we recoded the type of program according to the type of plan.

#table(CONS_C1_df_dup_JUL_2020_cons14$tipo_de_plan_2,CONS_C1_df_dup_JUL_2020_cons14$tipo_de_programa_2)
#table(CONS_C1$Tipo.de.Plan,CONS_C1$Tipo.de.Programa)
CONS_C1_df_dup_JUL_2020_cons13%>%
  dplyr::mutate(tipo_de_programa_2=dplyr::case_when(
    grepl("M-",tipo_de_plan_2,ignore.case=T)~"Programa Específico Mujeres",
    grepl("PG-",tipo_de_plan_2,ignore.case=T) & grepl("Muj",tipo_de_programa_2,ignore.case=T)~"Programa Población General",
    grepl("M-",tipo_de_plan_2,ignore.case=T) & !grepl("Muj",tipo_de_programa_2,ignore.case=T)~"Programa Específico Mujeres",
    grepl("PG-",tipo_de_plan_2,ignore.case=T) & grepl("Otro|Alcohol|Calles|Vigilada",tipo_de_programa_2,ignore.case=T)~"Programa Población General",
    TRUE~tipo_de_programa_2))%>%
  assign("CONS_C1_df_dup_JUL_2020_cons14",., envir = .GlobalEnv)
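
The precedence of the plan over the program can be sketched as below. This base R example uses hypothetical rows and simplifies the rule: every "PG-" plan is mapped to the general population program, whereas the code above only recodes specific program labels.

```r
toy <- data.frame(
  tipo_de_plan_2     = c("M-PAI", "PG-PR", "PG-PAB", "M-PR"),
  tipo_de_programa_2 = c("Programa Población General",
                         "Programa Específico Mujeres",
                         "Programa Población General",
                         "Programa Específico Mujeres"),
  stringsAsFactors = FALSE
)

# The plan prefix decides the program: "M-" -> women-specific, "PG-" -> general population.
toy$programa_rec <- ifelse(startsWith(toy$tipo_de_plan_2, "M-"),
                           "Programa Específico Mujeres",
                    ifelse(startsWith(toy$tipo_de_plan_2, "PG-"),
                           "Programa Población General",
                           toy$tipo_de_programa_2))

# Entries whose recorded program disagreed with the plan.
changed <- which(toy$programa_rec != toy$tipo_de_programa_2)
changed
```

Keeping the recoded value in a separate column, as done here with the hypothetical `programa_rec`, makes it easy to list which entries were changed.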

We recoded the variable con_quien_vive to group living arrangements into: alone, with relatives, only with children, only with a partner, with a partner and children, and other situations (with friends, with a non-relative, and other arrangements) (8.7%).

#janitor::tabyl(CONS_C1_df_dup_JUL_2020_cons14$con_quien_vive)

CONS_C1_df_dup_JUL_2020_cons14b<-
CONS_C1_df_dup_JUL_2020_cons14 %>% 
  dplyr::mutate(con_quien_vive_rec=dplyr::case_when(
    grepl("Solo$",con_quien_vive, ignore.case=T)~"Solo",
    
    grepl("Con abuelos",con_quien_vive, ignore.case=T)~"Con familiares",
    grepl("Con hermanos",con_quien_vive, ignore.case=T)~"Con familiares",
    grepl("Con la madre \\(sola\\)",con_quien_vive, ignore.case=T)~"Con familiares",
    grepl("Con otro pariente",con_quien_vive, ignore.case=T)~"Con familiares",
    grepl("con hijos y padres o familia",con_quien_vive, ignore.case=T)~"Con familiares",
    grepl("con la pareja y padres o familia de origen",con_quien_vive, ignore.case=T)~"Con familiares",
    grepl("con padres o familia de origen",con_quien_vive, ignore.case=T)~"Con familiares",
    
    grepl("Únicamente con hijos",con_quien_vive, ignore.case=T)~"Únicamente con hijos",
    
    grepl("Únicamente con pareja",con_quien_vive, ignore.case=T)~"Únicamente con pareja",
    
    grepl("Hijos y Padres o Familia de Origen",con_quien_vive, ignore.case=T)~"Con pareja e hijos",
    grepl("Únicamente con la pareja e hijos",con_quien_vive, ignore.case=T)~"Con pareja e hijos",
    grepl("Únicamente con hijos",con_quien_vive, ignore.case=T)~"Únicamente con hijos",
    
    grepl("Con amigos",con_quien_vive, ignore.case=T)~"Otros",
    grepl("Con otro NO pariente",con_quien_vive, ignore.case=T)~"Otros",
    grepl("*Otros$",con_quien_vive, ignore.case=T)~"Otros")) #%>% 
    #janitor::tabyl(con_quien_vive, con_quien_vive_rec)
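
Since `dplyr::case_when()` evaluates its branches from top to bottom and the first match decides the result, the order of the patterns matters, and values matching no branch become missing when there is no `TRUE` fallback. The sketch below mirrors that behavior in base R with a rule table; the patterns are simplified and the helper `recode_first_match()` is hypothetical.

```r
# Ordered rules: the first pattern that matches a value decides its group.
rules <- data.frame(
  pattern = c("Solo$", "padres o familia", "nicamente con hijos",
              "nicamente con pareja", "pareja e hijos", "amigos|NO pariente|Otros$"),
  group   = c("Solo", "Con familiares", "Únicamente con hijos",
              "Únicamente con pareja", "Con pareja e hijos", "Otros"),
  stringsAsFactors = FALSE
)

recode_first_match <- function(x, rules) {
  out <- rep(NA_character_, length(x))
  for (i in seq_len(nrow(rules))) {
    # Only values not yet assigned can be claimed by a later rule.
    hit <- is.na(out) & grepl(rules$pattern[i], x, ignore.case = TRUE)
    out[hit] <- rules$group[i]
  }
  out
}

x <- c("Solo", "Con amigos", "Únicamente con la pareja e hijos",
       "Con hijos y padres o familia de origen", "Vive en la calle")
rec <- recode_first_match(x, rules)
rec
```

A rule table of this kind makes it easier to spot a pattern that can never fire because an earlier, broader pattern already captures the same values.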

Cases with more than 1095 days of treatment

Finally, we examined the number of treatment days. Collapsing continuous entries adds up the days of the linked entries, which can produce very long treatments (>1095 days) that SENDA professionals would not consider valid. We distinguished between cases of users with more than one treatment and cases without an available date of discharge.

#table(CONS_C1_df_dup_JUL_2020_cons14$tipo_de_plan_2,CONS_C1_df_dup_JUL_2020_cons14$tipo_de_programa_2)
#table(CONS_C1$Tipo.de.Plan,CONS_C1$Tipo.de.Programa)

mas_1095_usuarios<-
      CONS_C1_df_dup_JUL_2020_cons14b%>%
        dplyr::filter(dias_treat_imp_sin_na>1095)%>%
        dplyr::distinct(hash_key)%>%
        unlist()%>%
        as.character()

df_tab5<-
CONS_C1_df_dup_JUL_2020_cons14b%>%
    dplyr::filter(hash_key %in% mas_1095_usuarios)%>%
    dplyr::select(hash_key, ano_bd_first, ano_bd_last, fech_ing, dias_treat_imp_sin_na, fech_egres_imp,tipo_de_plan_2,senda)%>%
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(count=n(),rn=row_number())%>%
  dplyr::filter(ano_bd_first<=2015, rn==row_number(),dias_treat_imp_sin_na>1095)

#Users with a single case, without date of discharge, ordered by days of treatment.
in_un_caso_dias_trat_mas_1095<-
CONS_C1_df_dup_JUL_2020_cons14b%>%
    dplyr::filter(hash_key %in% mas_1095_usuarios)%>%
    dplyr::select(hash_key, ano_bd_first, ano_bd_last, fech_ing, dias_treat_imp_sin_na, fech_egres_imp,tipo_de_plan_2,senda)%>%
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(count=n(),rn=row_number())%>%
  dplyr::filter(count==1)%>%
  dplyr::arrange(desc(dias_treat_imp_sin_na))

  #tidyr::pivot_wider(names_from=rn,values_from=fech_ing)%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#TABLE OF CAUSES OF DISCHARGE BY QUARTER
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
invisible=5
if(invisible==4){
CONS_C1_df_dup_JUL_2020_cons14b%>%
  dplyr::mutate(date_by_quarter = lubridate::round_date(as.Date(fech_ing), unit="quarter"))%>%
  janitor::tabyl(date_by_quarter,motivodeegreso_mod_imp)
}
to_string <- as_labeller(c(`0` = "More than one entry by user (n=163)", `1` = "Only one entry by user (n=555)"))

desc_dias_mas_1095<-
CONS_C1_df_dup_JUL_2020_cons14b%>%
    dplyr::mutate(in_un_caso_dias_trat_mas_1095=ifelse(hash_key %in% unlist(in_un_caso_dias_trat_mas_1095["hash_key"]),1,0))%>%
    dplyr::mutate(in_un_caso_dias_trat_mas_1095=factor(in_un_caso_dias_trat_mas_1095))%>%
    dplyr::filter(dias_treat_imp_sin_na>1095)%>%
    ungroup()%>%
    dplyr::mutate(vacio_fech_egres=ifelse(!is.na(fech_egres_imp),1,0))%>%
    dplyr::select(dias_treat_imp_sin_na,vacio_fech_egres,in_un_caso_dias_trat_mas_1095)%>%
    dplyr::mutate(group=dplyr::case_when(vacio_fech_egres==1 & in_un_caso_dias_trat_mas_1095==1~"With missing date of discharge & only one case",vacio_fech_egres==0 & in_un_caso_dias_trat_mas_1095==1~"With date of discharge & only one case",vacio_fech_egres==1 & in_un_caso_dias_trat_mas_1095==0~"With missing date of discharge & more than one case",vacio_fech_egres==0 & in_un_caso_dias_trat_mas_1095==0~"With date of discharge & more than one case",TRUE~NA_character_))%>%
    dplyr::group_by(group)%>%
    summarise(
        `n` = n(),
        `Mdn` = median(dias_treat_imp_sin_na, na.rm = TRUE),
        `IQR` = IQR(dias_treat_imp_sin_na, na.rm = TRUE),
        `Perc. 25`= quantile(dias_treat_imp_sin_na,.25,na.rm=T),
        `Perc. 75`= quantile(dias_treat_imp_sin_na,.75,na.rm=T))

  library(gridExtra)
fig<-
CONS_C1_df_dup_JUL_2020_cons14b%>%
    dplyr::mutate(in_un_caso_dias_trat_mas_1095=ifelse(hash_key %in% unlist(in_un_caso_dias_trat_mas_1095["hash_key"]),1,0))%>%
  dplyr::mutate(in_un_caso_dias_trat_mas_1095=factor(in_un_caso_dias_trat_mas_1095))%>%
  dplyr::filter(dias_treat_imp_sin_na>1095)%>%
      ungroup()%>%
    dplyr::mutate(vacio_fech_egres=ifelse(!is.na(fech_egres_imp),1,0))%>%
    dplyr::select(dias_treat_imp_sin_na,vacio_fech_egres,in_un_caso_dias_trat_mas_1095)%>%
ggplot(aes(x = factor(vacio_fech_egres), y = dias_treat_imp_sin_na, group=factor(vacio_fech_egres))) +
    geom_boxplot() +
      geom_jitter(shape = 15,
        color = "steelblue",
        position = position_jitter(width = 0.21)) +
    theme_classic()+
  labs(x="Date of Discharge Not Available (=1)",
       y="Days in treatment until the date of retrieval of the dataset")+
  facet_wrap(~in_un_caso_dias_trat_mas_1095,ncol = 2, labeller = to_string)

tt <- ttheme_default(colhead=list(fg_params = list(parse=TRUE)),
                     base_size = 7.5, padding = unit(c(3, 4), "mm"))
tbl <- tableGrob(desc_dias_mas_1095, rows=NULL,theme=tt)

grid.arrange(fig, tbl, 
             nrow = 2, heights = c(4, 1),
             as.table = TRUE)

Figure 6. Criteria to Transform Variables

#:#:#:#:#:#:#:#:#:#:#
#REGRESSION
if(invisible==4){
model<-
    CONS_C1_df_dup_JUL_2020_cons14b%>%
        dplyr::mutate(in_un_caso_dias_trat_mas_1095=ifelse(hash_key %in% unlist(in_un_caso_dias_trat_mas_1095["hash_key"]),1,0))%>%
      dplyr::mutate(in_un_caso_dias_trat_mas_1095=factor(in_un_caso_dias_trat_mas_1095))%>%
      dplyr::filter(dias_treat_imp_sin_na>1095)%>%
          ungroup()%>%
        dplyr::mutate(vacio_fech_egres=factor(ifelse(!is.na(fech_egres_imp),1,0)))%>%
        dplyr::select(dias_treat_imp_sin_na,vacio_fech_egres,in_un_caso_dias_trat_mas_1095)
  library(lsmeans)
  refR <- lsmeans(lm(dias_treat_imp_sin_na~ vacio_fech_egres*in_un_caso_dias_trat_mas_1095, data=model),
                  specs = c("vacio_fech_egres","in_un_caso_dias_trat_mas_1095"))
  g4R <- ggplot(data.frame(refR), aes(x= vacio_fech_egres, y=lsmean,group=in_un_caso_dias_trat_mas_1095, colour=in_un_caso_dias_trat_mas_1095))+
  geom_errorbar(aes(ymin=lower.CL, ymax=upper.CL), width=.1,position=position_dodge(0.1), size=1) +
  geom_point(position=position_dodge(0.1), size=2)+
    xlab("Date of discharge is empty (=1)")+
    ylab("Days of treatment")+
    theme_bw() +
    geom_rect_interactive(alpha = 0.1, xmin=.1, xmax=.1, ymin=.1,ymax=.1) +
      # Remove plot elements added by geom_rect_interactive
    theme(legend.position="bottom")+
    guides(color=guide_legend(ncol=4,name = "Cause of Discharge"))+
    labs(color="More than one entry by user")+
    scale_colour_brewer(palette = "Set1",labels=c("More than one entry by user", "Only one entry by user"))+
    theme(legend.title = element_blank())
  g4R
}

There were 718 entries with more than 1095 days of treatment, affecting 950 users. As seen in the figure above, cases with more than 2000 days of treatment could be considered anomalous, except among cases with only one entry by user and a date of discharge, where more than 50% of the treatments exceed 1800 days.

cases_w_intermediate_entries_w_more_1095d<-
CONS_C1_df_dup_JUL_2020_cons14b%>%
    dplyr::filter(hash_key %in% mas_1095_usuarios)%>%
    dplyr::arrange(hash_key,desc(fech_ing))%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(count=n(),rn=row_number())%>%
    dplyr::ungroup()%>%
    dplyr::mutate(mas_1095d_trat_intermedio=dplyr::case_when(rn>1 & dias_treat_imp_sin_na>1095~1,TRUE~0))%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(users_w_prob_cases=sum(mas_1095d_trat_intermedio))%>%
    dplyr::ungroup()%>%
  #To view the anomalous and intermediate cases
  #dplyr::filter(users_w_prob_cases==1)%>%
  dplyr::select(row,hash_key, fech_ing, dias_treat_imp_sin_na, fech_egres_imp,tipo_de_plan_2,senda,mas_1095d_trat_intermedio,rn,count,row_cont_entries,users_w_prob_cases)%>%
  #View()
    dplyr::filter(mas_1095d_trat_intermedio>0)

no_mostrar=1
if(no_mostrar==0){
  invisible(c("important: the linked cases can reach up to 3000 days"))
CONS_C1_df_dup_JUL_2020_cons14b%>%
    dplyr::filter(dias_treat_imp_sin_na>1095)%>%
    dplyr::filter(!is.na(row_cont_entries))%>%
    summarise(
        `n` = n(),
        `Mdn` = median(dias_treat_imp_sin_na, na.rm = TRUE),
        `IQR` = IQR(dias_treat_imp_sin_na, na.rm = TRUE),
        `Perc. 25`= quantile(dias_treat_imp_sin_na,.25,na.rm=T),
        `Perc. 75`= quantile(dias_treat_imp_sin_na,.75,na.rm=T),
        `max`=max(dias_treat_imp_sin_na,na.rm=T))
}

We decided to discard the entries with more than 1095 days of treatment (n=718), except for intermediate entries of users with more than one treatment (that is, users who had a treatment before and after the one lasting more than 1095 days) (n=71; users=71).

rows_more_1095d_no_int_treat_discarded<-
CONS_C1_df_dup_JUL_2020_cons14b%>%
  dplyr::filter(dias_treat_imp_sin_na>1095)%>%
  dplyr::filter(!row %in% unlist(cases_w_intermediate_entries_w_more_1095d[,"row"]))%>%
  dplyr::select(row)

CONS_C1_df_dup_JUL_2020_cons14b%>%
  dplyr::mutate(discarded_more_1095d_treat=ifelse(row %in% unlist(rows_more_1095d_no_int_treat_discarded[,"row"]),1,0))%>%
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(discarded_more_1095d_treat_by_hash=sum(discarded_more_1095d_treat,na.rm = T))%>%
  dplyr::ungroup()%>%
  dplyr::mutate(obs=case_when(discarded_more_1095d_treat_by_hash>0~glue::glue("{obs};4.98.HASH w/ cases w/ >1095 d of treat"),TRUE~obs))%>%  
  dplyr::filter(!row %in% unlist(rows_more_1095d_no_int_treat_discarded[,"row"]))%>%
  dplyr::select(-discarded_more_1095d_treat,-discarded_more_1095d_treat_by_hash)%>%
  assign("CONS_C1_df_dup_JUL_2020_cons15",., envir = .GlobalEnv)
  ##CONS_C1_df_dup_JUL_2020_cons15 %>%  dplyr::filter(grepl('4.99.', obs)) 
#%>% View() %>% dplyr::group_by(obs) %>% summarise(n=n()) %>% dplyr::filter(n,grepl('4.99.', obs)) %>% View()
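
The discard rule can be sketched on toy data. This base R example is an illustration under assumptions: the rows and the column `dias` are hypothetical, and an entry counts as intermediate when it is not the most recent entry of its user, following the `rn>1` condition used above.

```r
toy <- data.frame(
  row      = 1:5,
  hash_key = c("a", "a", "a", "b", "c"),
  fech_ing = as.Date(c("2010-01-01", "2012-06-01", "2016-03-01",
                       "2011-05-01", "2014-02-01")),
  dias     = c(200, 1200, 90, 1500, 300),
  stringsAsFactors = FALSE
)

# Rank entries within each user from most recent (1) to oldest.
toy$rn <- ave(-as.numeric(toy$fech_ing), toy$hash_key,
              FUN = function(d) rank(d, ties.method = "first"))

long_treat   <- toy$dias > 1095
intermediate <- long_treat & toy$rn > 1   # long, but followed by a later entry
discard      <- long_treat & !intermediate

kept <- toy[!discard, ]
kept$row
```

User "a" keeps the 1200-day entry because a later admission follows it, while the only entry of user "b" is dropped.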

Consistency among age of onset of drug use, age of onset of use of the primary substance at admission, and age at admission

As a result of collapsing treatments and imputing ages of onset (edad_ini_cons or edad_ini_sus_prin), some entries showed inconsistencies, with an Age of Onset of Drug Use greater than the age at the first admission of the user (n=104).

Also, many cases had an Age of Onset of Drug Use in the Primary Substance lower than the Age of Onset of Drug Use, or greater than the user's age at the first admission for that specific primary substance (n= 2,532).

We decided that both values should be declared as missing. Note that there may still be inconsistencies within each user in the age of onset of drug use in the primary substance. For this variable, we faced a large amount of missing values, in part due to inconsistencies with the other ages reported. This is why we treated them as missing values, despite the efforts to generate a more reliable value.
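
These consistency rules can be sketched on toy data. The base R example below is a simplification under assumptions: the rows are hypothetical, the minimum age at admission is computed by user only (not by user and primary substance), and the final check for several distinct values per user is omitted.

```r
toy <- data.frame(
  hash_key          = c("a", "a", "b", "c"),
  edad_al_ing       = c(30, 34, 25, NA),
  edad_ini_cons     = c(15, 15, 28, 16),
  edad_ini_sus_prin = c(14, 18, 20, 17),
  stringsAsFactors = FALSE
)

# Minimum age at admission per user; NA (instead of Inf plus a warning)
# when a user has no recorded age at admission.
safe_min <- function(x) if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
toy$min_edad_al_ing <- ave(toy$edad_al_ing, toy$hash_key, FUN = safe_min)

# Rule 1: onset of drug use cannot come after the first admission.
bad_cons <- !is.na(toy$min_edad_al_ing) & toy$edad_ini_cons > toy$min_edad_al_ing
toy$edad_ini_cons[bad_cons] <- NA

# Rule 2: onset in the primary substance cannot precede onset of any drug use,
# cannot come after the first admission, and is dropped when rule 1 failed.
bad_prin <- is.na(toy$edad_ini_cons) |
  (!is.na(toy$min_edad_al_ing) & toy$edad_ini_sus_prin > toy$min_edad_al_ing) |
  toy$edad_ini_sus_prin < toy$edad_ini_cons
toy$edad_ini_sus_prin[bad_prin] <- NA
toy[, c("edad_ini_cons", "edad_ini_sus_prin")]
```

Checking for an all-missing group before calling `min()` avoids the `Inf` value that would otherwise have to be converted back to missing afterwards.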

paste0("Number of entries from users with more than one age of onset of drug use: ",CONS_C1_df_dup_JUL_2020_cons15 %>% dplyr::group_by(hash_key) %>% dplyr::mutate(n_edad_ini_cons=n_distinct(edad_ini_cons)) %>% dplyr::ungroup() %>%  dplyr::filter(n_edad_ini_cons>1) %>% nrow())

[1] "Number of entries from users with more than one age of onset of drug use: 0"

paste0("Number of entries from users with more than one age of onset of drug use of the primary substance at admission: ",CONS_C1_df_dup_JUL_2020_cons15 %>% dplyr::group_by(hash_key,sus_principal_mod) %>% dplyr::mutate(n_edad_ini_cons=n_distinct(edad_ini_sus_prin)) %>% dplyr::ungroup() %>%  dplyr::filter(n_edad_ini_cons>1) %>% nrow()) #704

[1] "Number of entries from users with more than one age of onset of drug use of the primary substance at admission: 704"

#View the cases with more than one value in the age of onset of the primary substance.

#CONS_C1_df_dup_JUL_2020_cons15 %>% dplyr::group_by(hash_key,sus_principal_mod) %>% dplyr::mutate(n_edad_ini_cons=n_distinct(edad_ini_sus_prin)) %>% dplyr::ungroup() %>%  dplyr::filter(n_edad_ini_cons>1) %>% dplyr::select(hash_key, sus_principal_mod, edad_ini_sus_prin, edad_ini_cons) %>% View()

#View the cases that, after validation, should be kept.

#CONS_C1_df_dup_JUL_2020_cons15 %>% 
#    dplyr::group_by(hash_key,sus_principal_mod) %>% 
#    dplyr::mutate(n_edad_ini_cons=n_distinct(edad_ini_sus_prin)) %>% 
#    dplyr::mutate(min_edad_al_ing=min(edad_al_ing,na.rm=T)) %>% 
#    dplyr::ungroup() %>%  
#    dplyr::filter(n_edad_ini_cons>1) %>% 
#    dplyr::select(hash_key, sus_principal_mod, edad_ini_sus_prin, edad_ini_cons,min_edad_al_ing) %>% #
#    dplyr::filter(edad_ini_sus_prin>=edad_ini_cons,edad_ini_sus_prin<min_edad_al_ing) %>% 
#    nrow()

#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:
#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:
CONS_C1_df_dup_JUL_2020_cons15 %>% 
  dplyr::group_by(hash_key) %>% 
  dplyr::mutate(min_edad_al_ing=min(edad_al_ing,na.rm=T),min_edad_al_ing=ifelse(is.infinite(min_edad_al_ing), NA, min_edad_al_ing)) %>% 
  dplyr::ungroup() %>%  
  dplyr::mutate(edad_ini_cons=dplyr::case_when(edad_ini_cons>min_edad_al_ing~NA_integer_,TRUE~edad_ini_cons)) %>%
  dplyr::group_by(hash_key,sus_principal_mod) %>%
  dplyr::mutate(min_edad_al_ing=min(edad_al_ing,na.rm=T),min_edad_al_ing=ifelse(is.infinite(min_edad_al_ing), NA, min_edad_al_ing)) %>% 
  dplyr::ungroup() %>%
  dplyr::mutate(edad_ini_sus_prin=dplyr::case_when(is.na(edad_ini_cons)~NA_integer_,TRUE~edad_ini_sus_prin)) %>% 
  dplyr::mutate(edad_ini_sus_prin=dplyr::case_when((edad_ini_sus_prin>min_edad_al_ing)|(edad_ini_cons>edad_ini_sus_prin)~NA_integer_,TRUE~edad_ini_sus_prin)) %>% 
  dplyr::group_by(hash_key,sus_principal_mod) %>% dplyr::mutate(n_edad_ini_sus_prin=n_distinct(edad_ini_sus_prin)) %>% 
  dplyr::ungroup() %>%  
  dplyr::mutate(edad_ini_sus_prin=ifelse(n_edad_ini_sus_prin>1,NA_integer_,edad_ini_sus_prin)) %>% 
  #edad_ini_sus_prin>min_edad_al_ing|edad_ini_cons>edad_ini_sus_prin
  #dplyr::filter(edad_ini_sus_prin>min_edad_al_ing) %>% 
  #dplyr::filter(edad_ini_cons>edad_ini_sus_prin) %>% 
  dplyr::select(-n_edad_ini_sus_prin,-min_edad_al_ing) %>% 
  assign("CONS_C1_df_dup_JUL_2020_cons15a",., envir = .GlobalEnv)

Warning in min(edad_al_ing, na.rm = T): no non-missing arguments to min;
returning Inf

Information on diagnoses

We ordered the ICD-10 and DSM-IV diagnoses of each case according to how many different diagnoses the case had. We first recoded the sub-categories. Once we had every distinct sub-category, we looked for cases with no available sub-category. If their only diagnosis was “In study” (“En estudio”), we kept this general diagnosis (category). It was not possible to work with sub-categories alone, because a large share of the general diagnoses had no sub-category recorded. For example, in the first DSM-IV column, 43% of cases had a general diagnosis but no related sub-category; the same held for ICD-10 (63%). This led us to treat diagnostic categories as unrelated to sub-categories once these processes were completed, in order to avoid losing information. DSM-IV and ICD-10 diagnoses and sub-categories are presented as separate lists of unique values per treatment, independently of one another.

Additionally, we generated a category to detect whether a user had at least one ICD-10 (CIE-10) diagnosis (cie_10). The same was done for DSM-IV diagnoses (dsm_iv). We also organized some secondary variables, such as physical diagnoses and other diagnoses. Lastly, we generated several variables that count how many different diagnoses each entry had.
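
The collapsing and counting described above can be illustrated with a minimal base R sketch. It is not the code used in this report: the data frame `dx`, its column names and its values are hypothetical, and `paste()` stands in for the `toString2()` helper used in the chunks below, whose definition is not shown in this section.

```r
# Toy data: hypothetical column names and values, not taken from SISTRAT
dx <- data.frame(
  dg_1 = c("F10", "En estudio", NA),
  dg_2 = c("F10", NA, NA),
  dg_3 = c("F32", NA, NA),
  stringsAsFactors = FALSE
)
cols <- c("dg_1", "dg_2", "dg_3")

# Collapse each row into a "; "-separated list of unique, non-missing values
dx$dg_all <- vapply(seq_len(nrow(dx)), function(i) {
  vals <- unique(stats::na.omit(unlist(dx[i, cols])))
  if (length(vals) == 0) NA_character_ else paste(vals, collapse = "; ")
}, character(1))

# Count the distinct diagnoses per row (0 when nothing was recorded)
dx$n_dg <- ifelse(is.na(dx$dg_all), 0L,
                  lengths(strsplit(dx$dg_all, "; ", fixed = TRUE)))
```

One design difference worth noting: the chunks below count diagnoses as the number of ";" separators plus one, so a row with an empty string would be counted as 1 under that formula, whereas this sketch returns 0 for rows without any recorded diagnosis.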

dg_trs_psiq_sub_dsm_iv_or_cat<-CONS_C1_df_dup_JUL_2020_cons15a%>% 
  dplyr::mutate(dg_trs_psiq_sub_dsm_iv_or=stringr::str_trim(as.character(dg_trs_psiq_sub_dsm_iv_or)))%>% 
  janitor::tabyl(dg_trs_psiq_sub_dsm_iv_or)%>% data.frame()%>% 
  dplyr::select(dg_trs_psiq_sub_dsm_iv_or)%>% 
  dplyr::filter(!dg_trs_psiq_sub_dsm_iv_or %in% c(NA))%>% unlist()%>% as.character() 
library(readxl)
sub_dsm_iv_to_cie_10_comp_table <- read_excel(paste0(path,"/sub_dsm_iv_to_cie_10_comp_table.xlsx"), col_names = c("original","mod"))
#TO KEEP THE GENERAL CATEGORIES UPDATED:  
cat_dsm_iv_desde_sub_dsm_iv<-
  CONS_C1_df_dup_JUN_2020%>%
  janitor::tabyl(dg_trs_psiq_sub_dsm_iv_or,dg_trs_psiq_dsm_iv_or)%>%
  melt()%>%#glimpse()
  data.frame()%>%
  dplyr::arrange(dg_trs_psiq_sub_dsm_iv_or,desc(value))%>%
  dplyr::group_by(dg_trs_psiq_sub_dsm_iv_or)%>%
  slice(1)%>%
  dplyr::mutate(variable=str_trim(variable))%>%
  dplyr::mutate(dg_trs_psiq_sub_dsm_iv_or=str_trim(dg_trs_psiq_sub_dsm_iv_or))

Warning in melt(.): The melt generic in data.table has been passed a tabyl
and will attempt to redirect to the relevant reshape2 method; please note that
reshape2 is deprecated, and this redirection is now deprecated as well. To
continue using melt methods from reshape2 while both libraries are attached,
e.g. melt.list, you can prepend the namespace like reshape2::melt(.). In the
next version, this warning will become an error.

Using dg_trs_psiq_sub_dsm_iv_or as id variables

slice (grouped): removed 1,173 rows (94%), 69 rows remaining

sub_cie_10_to_cie_10_comp_table <-
  CONS_C1_df_dup_JUN_2020%>%
  janitor::tabyl(dg_trs_psiq_sub_cie_10_or,dg_trs_psiq_cie_10_or)%>%
  melt()%>%#glimpse()
  data.frame()%>%
  dplyr::arrange(dg_trs_psiq_sub_cie_10_or,desc(value))%>%
  group_by(dg_trs_psiq_sub_cie_10_or)%>%
  slice(1)%>%
  dplyr::mutate(variable=str_trim(variable))

Warning in melt(.): The melt generic in data.table has been passed a tabyl
and will attempt to redirect to the relevant reshape2 method; please note that
reshape2 is deprecated, and this redirection is now deprecated as well. To
continue using melt methods from reshape2 while both libraries are attached,
e.g. melt.list, you can prepend the namespace like reshape2::melt(.). In the
next version, this warning will become an error.

Using dg_trs_psiq_sub_cie_10_or as id variables

group_by: one grouping variable (dg_trs_psiq_sub_cie_10_or)

slice (grouped): removed 702 rows (93%), 54 rows remaining
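
The two lookup tables built above keep, for each sub-category, the general category it most often appears with (sorting by descending frequency and keeping the first row per group). A minimal base R sketch of that idea follows; the object `pairs` and its values are hypothetical and only illustrate the logic.

```r
# Toy pairs of (sub-category, category); values are illustrative only
pairs <- data.frame(
  sub = c("a", "a", "a", "b", "b"),
  cat = c("X", "X", "Y", "Z", "Z"),
  stringsAsFactors = FALSE
)

# Cross-tabulate sub-categories (rows) against categories (columns)
tab <- table(pairs$sub, pairs$cat)

# For each sub-category keep the category with the highest count;
# which.max() returns the first maximum, so ties go to the first column
lookup <- data.frame(
  sub = rownames(tab),
  cat = colnames(tab)[apply(tab, 1, which.max)],
  stringsAsFactors = FALSE
)
```

Because ties are resolved by position, a sub-category that appears equally often under two categories is assigned somewhat arbitrarily; such cases may deserve a manual review.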

sub_dsm_iv_to_cie_10_comp_table <-
  cat_dsm_iv_desde_sub_dsm_iv%>%
  dplyr::left_join(sub_dsm_iv_to_cie_10_comp_table,by=c("dg_trs_psiq_sub_dsm_iv_or"="original"))
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_

dg_trs_psiq_sub_cie_10_or_cat<-CONS_C1_df_dup_JUL_2020_cons15a%>% 
  dplyr::mutate(dg_trs_psiq_sub_cie_10_or=stringr::str_trim(as.character(dg_trs_psiq_sub_cie_10_or)))%>% 
  janitor::tabyl(dg_trs_psiq_sub_cie_10_or)%>% data.frame()%>% 
  select(dg_trs_psiq_sub_cie_10_or)%>% 
  dplyr::filter(!dg_trs_psiq_sub_cie_10_or %in% c(NA))%>% unlist()%>% as.character()

select: dropped 3 variables (n, percent, valid_percent)

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
CONS_C1_df_dup_JUL_2020_cons15a%>%
  dplyr::mutate(across(c(contains("psiq_sub_dsm_iv"),contains("psiq_sub_cie_10"),
                         diagnostico_trs_fisico,otros_probl_at_sm_or),~stringr::str_trim(as.character(.))))%>%
  #TO CONVERT THE DSM-IV SUB-CATEGORIES INTO THEIR ICD-10 EQUIVALENTS IN BRACKETS
  dplyr::left_join(dplyr::select(sub_dsm_iv_to_cie_10_comp_table,dg_trs_psiq_sub_dsm_iv_or,mod),by=c("dg_trs_psiq_sub_dsm_iv_or"="dg_trs_psiq_sub_dsm_iv_or"))%>%
  dplyr::left_join(dplyr::select(sub_dsm_iv_to_cie_10_comp_table,dg_trs_psiq_sub_dsm_iv_or,mod),by=c("x2_dg_trs_psiq_sub_dsm_iv_or"="dg_trs_psiq_sub_dsm_iv_or"))%>%
  dplyr::left_join(dplyr::select(sub_dsm_iv_to_cie_10_comp_table,dg_trs_psiq_sub_dsm_iv_or,mod),by=c("x3_dg_trs_psiq_sub_dsm_iv_or"="dg_trs_psiq_sub_dsm_iv_or"))%>%
  dplyr::left_join(dplyr::select(sub_dsm_iv_to_cie_10_comp_table,dg_trs_psiq_sub_dsm_iv_or,mod),by=c("x4_dg_trs_psiq_sub_dsm_iv_or"="dg_trs_psiq_sub_dsm_iv_or"))%>%
  dplyr::select(-contains("psiq_sub_dsm_iv"))%>%
  dplyr::rename("dg_trs_psiq_sub_dsm_iv_or"="mod.x", "x2_dg_trs_psiq_sub_dsm_iv_or"="mod.y", 
                "x3_dg_trs_psiq_sub_dsm_iv_or"="mod.x.x", "x4_dg_trs_psiq_sub_dsm_iv_or"="mod.y.y")%>%
  #TO EXPLORE HOW THEY ARE BEING RECODED
    #dplyr::filter(row=="117796")%>%
    #dplyr::select(contains("dsm_iv"))
  #glimpse()
  dplyr::mutate(mod_cie_10_or =pmap_chr(select(.,contains("psiq_cie_10")), ~toString2(unique(na.omit(c(...))))))%>%
  dplyr::mutate(mod_dsm_iv_or =pmap_chr(select(.,contains("psiq_dsm_iv")), ~toString2(unique(na.omit(c(...))))))%>%
  dplyr::mutate(mod_sub_dsm_iv_or =pmap_chr(select(.,contains("psiq_sub_dsm_iv")), ~toString2(unique(na.omit(c(...))))))%>%
  dplyr::mutate(mod_sub_cie_10_or =pmap_chr(select(.,contains("psiq_sub_cie_10")), ~toString2(unique(na.omit(c(...))))))%>%
  
  dplyr::mutate(mod_cie_10_or= sub("Trastornos de los hábitos y del control de los impulsos;", "Trastornos de los hábitos y del control de los impulsos(F63);",mod_cie_10_or))%>%
  dplyr::mutate(mod_cie_10_or= sub("Trastornos de los hábitos y del control de los impulsos$", "Trastornos de los hábitos y del control de los impulsos(F63)",mod_cie_10_or))%>%
  dplyr::mutate(mod_cie_10_or= sub("; Sin trastorno\\(NA\\)$",replacement= "",mod_cie_10_or,ignore.case=T,perl=T))%>%
  dplyr::mutate(mod_cie_10_or= sub("^Sin trastorno\\(NA\\); ",replacement= "",mod_cie_10_or))%>%
  dplyr::mutate(mod_cie_10_or= sub(";Sin trastorno\\(NA\\);",replacement= ";",mod_cie_10_or))%>%
  dplyr::mutate(mod_cie_10_or= stringr::str_replace_all(mod_cie_10_or, "; Sin trastorno\\(NA\\);", ";"))%>%
  
  dplyr::mutate(mod_dsm_iv_or= sub("; Sin trastorno$",replacement= "",mod_dsm_iv_or,ignore.case=T,perl=T))%>%
  dplyr::mutate(mod_dsm_iv_or= sub("^Sin trastorno; ",replacement= "",mod_dsm_iv_or,ignore.case=T,perl=T))%>%
  dplyr::mutate(mod_dsm_iv_or= sub(";Sin trastorno;",replacement= ";",mod_dsm_iv_or,ignore.case=T,perl=T))%>%
  dplyr::mutate(mod_dsm_iv_or= stringr::str_replace_all(mod_dsm_iv_or, "; Sin trastorno;", ";"))%>%
  
  dplyr::mutate(mod_cie_10_or=dplyr::case_when(grepl("Sin trastorno\\(NA\\); En estudio\\(NA\\)",mod_cie_10_or)~"En estudio(NA)",grepl("En estudio\\(NA\\); Sin trastorno\\(NA\\)",mod_cie_10_or)~"En estudio(NA)",TRUE~mod_cie_10_or))%>%
  dplyr::mutate(mod_dsm_iv_or=dplyr::case_when(grepl("Sin trastorno; En estudio",mod_dsm_iv_or)~"En estudio",
                                               grepl("En estudio; Sin trastorno",mod_dsm_iv_or)~"En estudio",
                                               TRUE~mod_dsm_iv_or))%>%
  
       dplyr::mutate(across(c("mod_dsm_iv_or","mod_cie_10_or","diagnostico_trs_fisico","otros_probl_at_sm_or"),~str_count(., pattern = ";")+1,.names="cnt_{col}"))%>% 

   assign("CONS_C1_df_dup_JUL_2020_cons15b",., envir = .GlobalEnv)

select: dropped 127 variables (row, row_cont_entries, hash_key, hash_rut_completo, id, …)

select: dropped 130 variables (row, row_cont_entries, hash_key, hash_rut_completo, id, …)

select: dropped 131 variables (row, row_cont_entries, hash_key, hash_rut_completo, id, …)

select: dropped 132 variables (row, row_cont_entries, hash_key, hash_rut_completo, id, …)

#TO REVIEW THE DISTRIBUTIONS 
#CONS_C1_df_dup_JUL_2020_cons15b%>% dplyr::select(row,contains("cnt"))%>% names()
#CONS_C1_df_dup_JUL_2020_cons15b%>% janitor::tabyl(mod_cie_10_or)%>% guardar_tablas("revision_cie_10_sub")
#CONS_C1_df_dup_JUL_2020_cons15b%>% janitor::tabyl(mod_dsm_iv_or)%>% guardar_tablas("revision_dsm_iv_sub")

#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:
#SPLIT INTO COLUMNS
CONS_C1_df_dup_JUL_2020_cons15b%>%
  #"EN ESTUDIO" CASES DO NOT CARRY SUB-CATEGORIES
  #janitor::tabyl(mod_sub_dsm_iv_or)
  #----> "En estudio" should always come first
  #if there are no diagnostic sub-categories and there is an "En estudio"
  tidyr::separate(mod_sub_dsm_iv_or, c("dg_trs_psiq_sub_dsm_iv_or", "x2_dg_trs_psiq_sub_dsm_iv_or","x3_dg_trs_psiq_sub_dsm_iv_or","x4_dg_trs_psiq_sub_dsm_iv_or"), 
                  extra = "merge", fill = "warn", sep="; ")%>%
  
  dplyr::mutate(dg_trs_psiq_dsm_iv_or=ifelse(dg_trs_psiq_dsm_iv_or==""|dg_trs_psiq_dsm_iv_or=="Sin trastorno(NA)",NA_character_,dg_trs_psiq_dsm_iv_or))%>%
   #_#_#_#_#
  #TO SEE WHICH KINDS OF CASES ARE AVAILABLE AND WHETHER THEY HAVE REPEATED CHARACTERS.
  #_#_#_#_#
  #dplyr::select(contains("sub_dsm_iv"))%>%View()
  #dplyr::filter(!is.na(dg_trs_psiq_sub_dsm_iv_or),dg_trs_psiq_sub_dsm_iv_or!="",!is.na(x4_dg_trs_psiq_sub_dsm_iv_or))%>% View()
  tidyr::separate(mod_sub_cie_10_or, c("dg_trs_psiq_sub_cie_10_or", "x2_dg_trs_psiq_sub_cie_10_or","x3_dg_trs_psiq_sub_cie_10_or","x4_dg_trs_psiq_sub_cie_10_or"), 
                  extra = "merge", fill = "warn", sep="; ")%>%
  
  tidyr::separate(mod_dsm_iv_or, c("dg_trs_psiq_dsm_iv_or", "x2_dg_trs_psiq_dsm_iv_or","x3_dg_trs_psiq_dsm_iv_or","x4_dg_trs_psiq_dsm_iv_or"), 
                  extra = "merge", fill = "warn", sep="; ")%>%
  tidyr::separate(mod_cie_10_or, c("dg_trs_psiq_cie_10_or", "x2_dg_trs_psiq_cie_10_or","x3_dg_trs_psiq_cie_10_or","x4_dg_trs_psiq_cie_10_or","x5_dg_trs_psiq_cie_10_or"), extra = "merge", fill = "warn", sep="; ")%>%
  
  dplyr::mutate(dg_trs_psiq_cie_10_or=ifelse(dg_trs_psiq_cie_10_or==""|dg_trs_psiq_cie_10_or=="Sin trastorno(NA)",NA_character_,dg_trs_psiq_cie_10_or))%>%
  #_#_#_#_#
  #TO SEE WHICH KINDS OF CASES ARE AVAILABLE AND WHETHER THEY HAVE REPEATED CHARACTERS.
  #_#_#_#_#
  #dplyr::filter(!is.na(dg_trs_psiq_sub_cie_10_or),dg_trs_psiq_sub_cie_10_or!="",!is.na(x4_dg_trs_psiq_sub_cie_10_or))%>% 
  #dplyr::select(contains("sub_cie_10"))%>%View()
  assign("CONS_C1_df_dup_JUL_2020_cons15c",., envir = .GlobalEnv)

Warning: Expected 4 pieces. Missing pieces filled with `NA` in 109752 rows [1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

Warning: Expected 4 pieces. Missing pieces filled with `NA` in 109748 rows [1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

Warning: Expected 4 pieces. Missing pieces filled with `NA` in 109753 rows [1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

Warning: Expected 5 pieces. Missing pieces filled with `NA` in 109754 rows [1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#::#:#:#:#:#:#:#
#STANDARDIZE OTHER CATEGORIES
CONS_C1_df_dup_JUL_2020_cons15c%>% 
  #janitor::tabyl(diagnostico_trs_fisico)
  dplyr::mutate(diagnostico_trs_fisico= sub("; Sin trastorno$",replacement= "",diagnostico_trs_fisico,ignore.case =T,perl=T))%>%
  dplyr::mutate(diagnostico_trs_fisico= sub("^Sin trastorno; ",replacement= "",diagnostico_trs_fisico,ignore.case =T,perl=T))%>%
  dplyr::mutate(diagnostico_trs_fisico= sub(";Sin trastorno;",replacement= ";",diagnostico_trs_fisico,ignore.case =T,perl=T))%>%
  dplyr::mutate(diagnostico_trs_fisico= stringr::str_replace_all(diagnostico_trs_fisico, "; Sin trastorno;", ";"))%>%
  #janitor::tabyl(diagnostico_trs_fisico)
  dplyr::mutate(otros_probl_at_sm_or= sub("; Sin otros problemas de salud mental$",replacement= "",otros_probl_at_sm_or,ignore.case = T, perl = T))%>%
  dplyr::mutate(otros_probl_at_sm_or= sub("^Sin otros problemas de salud mental; ",replacement= "",otros_probl_at_sm_or,ignore.case = T, perl = T))%>%
  dplyr::mutate(otros_probl_at_sm_or= sub(";Sin otros problemas de salud mental;",replacement= ";",otros_probl_at_sm_or,ignore.case = T, perl = T))%>%
  dplyr::mutate(otros_probl_at_sm_or= stringr::str_replace_all(otros_probl_at_sm_or, "; Sin otros problemas de salud mental;", ";"))%>%
  #janitor::tabyl(otros_probl_at_sm_or)
  dplyr::select(-contains("x6_dg"))%>%
  dplyr::mutate(dg_trs_psiq_sub_dsm_iv_or=dplyr::case_when(dg_trs_psiq_sub_dsm_iv_or==""~NA_character_,TRUE~dg_trs_psiq_sub_dsm_iv_or))%>%
  dplyr::mutate(dg_trs_psiq_sub_cie_10_or= ifelse(grepl("^$|^ $", dg_trs_psiq_sub_cie_10_or)==TRUE, NA,dg_trs_psiq_sub_cie_10_or))%>%
  dplyr::mutate(dg_trs_psiq_dsm_iv_or= ifelse(grepl("^$|^ $", dg_trs_psiq_dsm_iv_or)==TRUE, NA,dg_trs_psiq_dsm_iv_or))%>%
  assign("CONS_C1_df_dup_JUL_2020_cons16",., envir = .GlobalEnv)

sin_mostrar=1
if (sin_mostrar=="00"){
  invisible(c("THIS IS TO SEE HOW IT BEHAVES"))
CONS_C1_df_dup_JUL_2020_cons15d%>%
  dplyr::select(row,dg_trs_psiq_cie_10_or,dg_trs_psiq_sub_cie_10_or,x2_dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_sub_cie_10_or,x3_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_sub_cie_10_or,x4_dg_trs_psiq_cie_10_or,x4_dg_trs_psiq_sub_cie_10_or)%>% 
  #dplyr::filter(as.character(row) %in% c("29875", "25234"))%>% #29875
  #dplyr::filter(as.character(row) %in% c("89191", "78137", "74707", "67292"))%>%
  #dplyr::filter(as.character(row) %in% c("35696", "26802", "25956", "21873", "12865"))%>%
  View()
}

#dg_trs_psiq_sub_cie_10_or, dg_trs_psiq_dsm_iv_or


#STANDARDIZE OTHER CATEGORIES
#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#::#:#:#:#:#:#:##:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#::#:#:#:#:#:#:#
CONS_C1_df_dup_JUL_2020_cons16%>%
dplyr::mutate(across(c(dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,x4_dg_trs_psiq_cie_10_or,x5_dg_trs_psiq_cie_10_or),~dplyr::case_when(grepl("En estudio",as.character(.),ignore.case = T)~1,grepl("Sin trastorno",as.character(.),ignore.case = T)~0,is.na(.)~0,TRUE~0),.names = "{col}_mod1a"))%>%
  dplyr::mutate(total_cie_10_en_est = base::rowSums(dplyr::select(.,ends_with("_mod1a"))))%>%  
  dplyr::mutate(across(c(dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,x4_dg_trs_psiq_cie_10_or,x5_dg_trs_psiq_cie_10_or),~dplyr::case_when(grepl("En estudio",as.character(.),ignore.case = T)~0,grepl("Sin trastorno",as.character(.),ignore.case = T)~0,is.na(.)~0,TRUE~1),.names = "{col}_mod2a"))%>%
  dplyr::mutate(total_cie_10_dg = base::rowSums(dplyr::select(.,ends_with("_mod2a"))))%>%  
    
  dplyr::mutate(cie_10=dplyr::case_when(total_cie_10_dg>0 & total_cie_10_en_est>0~"Diagnosticado/a (uno en estudio)",
                       total_cie_10_dg>0 & total_cie_10_en_est==0~"Diagnosticado/a (sin otros registros)",
                       total_cie_10_dg==0 & total_cie_10_en_est>0~"En estudio (sin diagnosticados)",
                       TRUE~"Sin información diagnóstica"))%>%
  
dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,x4_dg_trs_psiq_dsm_iv_or),~dplyr::case_when(grepl("En estudio",as.character(.))~1,grepl("Sin trastorno",as.character(.))~0,is.na(.)~0,TRUE~0),.names = "{col}_mod1b"))%>%
  dplyr::mutate(total_dsm_iv_en_est = base::rowSums(dplyr::select(.,ends_with("_mod1b"))))%>%  
dplyr::mutate(across(c(dg_trs_psiq_dsm_iv_or,x2_dg_trs_psiq_dsm_iv_or,x3_dg_trs_psiq_dsm_iv_or,x4_dg_trs_psiq_dsm_iv_or),~dplyr::case_when(grepl("En estudio",as.character(.),ignore.case = T)~0,grepl("Sin trastorno",as.character(.),ignore.case = T)~0,is.na(.)~0,TRUE~1),.names = "{col}_mod2b"))%>%
  dplyr::mutate(total_dsm_iv_dg = base::rowSums(dplyr::select(.,ends_with("_mod2b"))))%>%  
  
  dplyr::mutate(dsm_iv=dplyr::case_when(total_dsm_iv_dg>0 & total_dsm_iv_en_est>0~"Diagnosticado/a (uno en estudio)",
                       total_dsm_iv_dg>0 & total_dsm_iv_en_est==0~"Diagnosticado/a (sin otros registros)",
                       total_dsm_iv_dg==0 & total_dsm_iv_en_est>0~"En estudio (sin diagnosticados)",
                       TRUE~"Sin información diagnóstica"))%>%
  dplyr::select(-ends_with("_mod2b"),-ends_with("_mod2a"),-ends_with("_mod1a"),-ends_with("_mod1b"),
                -total_cie_10_dg,-total_cie_10_en_est,-total_dsm_iv_dg,-total_dsm_iv_en_est)%>%
  
assign("CONS_C1_df_dup_JUL_2020_cons17",., envir = .GlobalEnv)
#CONS_C1_df_dup_JUL_2020_cons17 %>%  janitor::tabyl(dsm_iv)
#CONS_C1_df_dup_JUL_2020_cons17 %>%  janitor::tabyl(cie_10)
#CONS_C1_df_dup_JUL_2020_cons17 %>%  janitor::tabyl(cnt_mod_cie_10_or)
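
The classification stored in cie_10 and dsm_iv rests on two counts per row: how many diagnosis columns hold a proper diagnosis and how many hold “En estudio”. The base R sketch below reproduces that rule on toy data; the data frame `dx` and its two columns are hypothetical, while the four output labels are the ones used in the chunk above.

```r
# Toy diagnosis columns; values are illustrative only
dx <- data.frame(
  d1 = c("F10", "En estudio", "En estudio", NA),
  d2 = c(NA, "F32", NA, "Sin trastorno"),
  stringsAsFactors = FALSE
)

# Flag, cell by cell, "under study" entries and entries without information
is_study <- sapply(dx, function(x) grepl("En estudio", x, ignore.case = TRUE))
is_none  <- sapply(dx, function(x) is.na(x) |
                     grepl("Sin trastorno", x, ignore.case = TRUE))

n_en_est <- rowSums(is_study)               # entries under study
n_dg     <- rowSums(!is_study & !is_none)   # proper diagnoses

# Same four-way rule as in the report, written with nested ifelse()
status <- ifelse(n_dg > 0 & n_en_est > 0, "Diagnosticado/a (uno en estudio)",
          ifelse(n_dg > 0, "Diagnosticado/a (sin otros registros)",
          ifelse(n_en_est > 0, "En estudio (sin diagnosticados)",
                 "Sin información diagnóstica")))
```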

Dates of treatments and continuity of their characteristics

We corrected the imputed discharge dates that fell after the end of the study period (2019-11-13) by capping them at that date. Subsequently, we generated numeric versions of these dates for compatibility with other statistical software, and created variables that describe the following treatment, if any, together with the differences between each treatment and the one that follows it.
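
As a hedged illustration of these steps, the base R sketch below caps discharge dates, converts them to numbers and computes the gap to the following treatment of the same user. The data frame `tr` and its values are invented for the example. Unlike the chunk below, which takes the following treatment with `dplyr::lag()` (which suggests the data are sorted from the most recent entry to the oldest), this sketch sorts in ascending order and shifts values upwards.

```r
# Toy treatments; hash_key values and dates are illustrative only
tr <- data.frame(
  hash_key   = c("u1", "u2", "u1"),
  fech_ing   = as.Date(c("2019-03-31", "2019-10-01", "2019-01-01")),
  fech_egres = as.Date(c("2019-12-31", NA, "2019-03-01"))
)
end_study <- as.Date("2019-11-13")  # as.numeric(end_study) is 18213

# Cap discharge dates at the end of the study; missing ones are set to it too
egres <- tr$fech_egres
egres[is.na(egres) | egres > end_study] <- end_study
tr$fech_egres_cap <- egres

# Sort chronologically within each user and build the numeric versions
tr <- tr[order(tr$hash_key, tr$fech_ing), ]
tr$fech_ing_num   <- as.numeric(tr$fech_ing)
tr$fech_egres_num <- as.numeric(tr$fech_egres_cap)
tr$dias_treat     <- tr$fech_egres_num - tr$fech_ing_num

# Admission date of the following treatment of the same user (NA if none)
tr$next_ing_num <- ave(tr$fech_ing_num, tr$hash_key,
                       FUN = function(x) c(x[-1], NA))
tr$diff_bet_treat <- tr$next_ing_num - tr$fech_egres_num
tr$menor_45 <- !is.na(tr$diff_bet_treat) & tr$diff_bet_treat < 45
```

In this toy example the first user has a 30-day gap between discharge and readmission, so the first entry is flagged as having less than 45 days of difference with the next one.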

CONS_C1_df_dup_JUL_2020_cons17%>%
  dplyr::mutate(fech_egres_imp=dplyr::case_when(as.Date(fech_egres_imp)>as.Date("2019-11-13")~as.Date("2019-11-13"),TRUE~as.Date(fech_egres_imp))) %>%
  dplyr::mutate(fech_ing_num=as.numeric(as.Date(fech_ing)))%>%
  dplyr::mutate(fech_egres_num=as.numeric(as.Date(fech_egres_imp)))%>%
  dplyr::mutate(fech_egres_num=ifelse(is.na(fech_egres_imp),18213,fech_egres_num))%>%
  dplyr::mutate(dias_treat_imp_sin_na=fech_egres_num-fech_ing_num)%>%
  
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(fech_ing_next_treat=dplyr::lag(fech_ing_num))%>%
  dplyr::mutate(diff_bet_treat=fech_ing_next_treat-fech_egres_num)%>%
  dplyr::mutate(id_centro_sig_trat=dplyr::lag(id_centro)) %>%
  dplyr::mutate(tipo_plan_sig_trat=dplyr::lag(tipo_de_plan_2)) %>%
  dplyr::mutate(tipo_programa_sig_trat=dplyr::lag(tipo_de_programa_2)) %>%
  dplyr::mutate(senda_sig_trat=dplyr::lag(senda)) %>%
  dplyr::ungroup()%>%
  #to keep only the relevant cases, i.e. those that can be compared with a following entry. The others are of no interest
  dplyr::mutate(menor_60_dias_diff=case_when(diff_bet_treat<60~1,TRUE~0))%>%
  dplyr::mutate(menor_45_dias_diff= ifelse(diff_bet_treat<45,1,0))%>%
  dplyr::mutate(obs_cambios=case_when(id_centro_sig_trat!=id_centro~"1.1.cambio centro",TRUE~""))%>%
  dplyr::mutate(obs_cambios=case_when(tipo_plan_sig_trat!=tipo_de_plan_2~glue::glue("{obs_cambios};1.2.cambio tipo plan"),TRUE~obs_cambios))%>%
  dplyr::mutate(obs_cambios=case_when(tipo_programa_sig_trat!=tipo_de_programa_2~glue::glue("{obs_cambios};1.3.cambio tipo programa"),TRUE~obs_cambios))%>%
  dplyr::mutate(obs_cambios=case_when(senda_sig_trat!=senda~glue::glue("{obs_cambios};1.4.cambio senda"),TRUE~obs_cambios))%>%
  dplyr::mutate(obs_cambios_ninguno=case_when(obs_cambios==""~1,TRUE~0))%>%
  dplyr::mutate(obs_cambios_num=case_when(id_centro_sig_trat!=id_centro~1,TRUE~0))%>%
  dplyr::mutate(obs_cambios_num=case_when(tipo_plan_sig_trat!=tipo_de_plan_2~obs_cambios_num+1,TRUE~obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_num=case_when(tipo_programa_sig_trat!=tipo_de_programa_2~obs_cambios_num+1,TRUE~obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_num=case_when(senda_sig_trat!=senda~obs_cambios_num+1,TRUE~obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_num=as.numeric(obs_cambios_num))%>%
  dplyr::mutate(obs_cambios_fac=obs_cambios_num)%>%
  dplyr::mutate(menor_45_dias_diff= recode(as.character(menor_45_dias_diff),"0"=">= 45 Days of Difference Between Entries","1"="<45 Days of Difference Between Entries"))%>%
  dplyr::mutate(menor_60_dias_diff= recode(as.character(menor_60_dias_diff),"0"=">= 60 Days of Difference Between Entries","1"="<60 Days of Difference Between Entries"))%>%
  dplyr::mutate(obs_cambios_ninguno= recode(as.character(obs_cambios_ninguno),"0"="At least 1 Change w/ the Next Entry","1"="No Changes w/ the Next Entry"))%>%
  dplyr::mutate_at(c('menor_45_dias_diff','menor_60_dias_diff','obs_cambios_ninguno','obs_cambios_fac'),~as.factor(.))%>%  
  assign("CONS_C1_df_dup_JUL_2020_cons18",., envir = .GlobalEnv)

if(CONS_C1_df_dup_JUL_2020_cons18%>%
    dplyr::filter(fech_egres_imp>"2019-11-13") %>% nrow()>0){warning("there are still values over the date of retrieval of the data")}

Normalize progression of educational attainment by users

Once treatments were distinguished from one another, we noticed that many users reported a given educational attainment in one treatment but an inconsistent level in a following treatment. This is why we focused on the 2,337 users with inconsistencies across their different treatments.
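
A minimal base R sketch of how such users could be detected is shown below. It assumes, as the comments in the following chunks suggest, that a lower numeric code means a higher attainment (e.g. 1 for university, 2 for secondary), so an inconsistency is a later treatment with a higher code than an earlier one. The data frame `ed` and its values are hypothetical.

```r
# Toy data: one row per treatment; codes and dates are illustrative only
ed <- data.frame(
  hash_key = c("u1", "u1", "u1", "u2", "u2"),
  fech_ing = as.Date(c("2015-01-01", "2016-01-01", "2017-01-01",
                       "2015-05-01", "2018-05-01")),
  esc_num  = c(2, 1, 2, 3, 2)
)

# Sort chronologically within each user
ed <- ed[order(ed$hash_key, ed$fech_ing), ]

# Attainment code reported in the following treatment of the same user
ed$esc_next <- ave(ed$esc_num, ed$hash_key,
                   FUN = function(x) c(x[-1], NA))

# A higher code later on means a lower attainment than previously reported
ed$regresion <- !is.na(ed$esc_next) & ed$esc_next > ed$esc_num
inconsistentes <- unique(ed$hash_key[ed$regresion])
```

Here only "u1" is flagged, because the attainment goes from university (1) back to secondary (2), while "u2" shows a plausible progression.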

hash_key_escolaridad<-
CONS_C1_df_dup_JUL_2020_cons18%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(esc_num=as.numeric(substring(as.character(escolaridad), 1, 1)))%>%
    dplyr::mutate(esc_num_lag=lag(esc_num))%>%
    dplyr::mutate(fech_ing_lag=lag(fech_ing))%>%
    dplyr::mutate(escolaridad_lag=lag(escolaridad))%>%
    dplyr::filter(esc_num_lag>esc_num)%>% #The following treatment has a lower educational attainment than the current one
    dplyr::select(row,hash_key,fech_ing, esc_num_lag,esc_num,escolaridad,escolaridad_lag,fech_ing_lag)%>%
          dplyr::distinct(hash_key)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_

hash_key_escolaridad_rules<-
CONS_C1_df_dup_JUL_2020_cons18%>%
dplyr::filter(hash_key %in% unlist(hash_key_escolaridad))%>%
  dplyr::mutate(esc_num=as.numeric(substring(as.character(escolaridad), 1, 1)))%>%
  dplyr::mutate(nas_ed=ifelse(is.na(escolaridad),1,0))%>%
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(n_dis_esc=n_distinct(escolaridad),n=n(), rn_esc=row_number(),
                nas_ed=sum(nas_ed),nas_ed=ifelse(nas_ed>0,1,0), min_ed=max(esc_num, na.rm=T))%>%
  dplyr::ungroup()%>%
  dplyr::group_by(hash_key,escolaridad)%>%
  dplyr::mutate(n_hash_esc=n())%>%
  dplyr::ungroup()%>%
  dplyr::select(row,hash_key,fech_ing,esc_num,escolaridad,n_dis_esc,n,n_hash_esc,rn_esc,nas_ed,min_ed)%>%
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #Generate comparison variables
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(esc_num_lag_post=lag(esc_num),esc_num_lead_ant=lead(esc_num))%>% 
  dplyr::mutate(ed_problematico_post=dplyr::case_when(esc_num_lag_post>esc_num~1,TRUE~0))%>%
  dplyr::mutate(ed_problematico_ant=dplyr::case_when(esc_num_lead_ant>esc_num~1,TRUE~0))%>%
  dplyr::mutate(ant_ed_problematico_post=lead(ed_problematico_post))%>%
  dplyr::mutate(post_ed_problematico_ant=lag(ed_problematico_ant))%>%
  dplyr::mutate(the_rank= rank(-n_hash_esc, ties.method = "min"))%>% #"max"
  dplyr::mutate(mfv=ifelse(the_rank==1,esc_num,NA_real_))%>%
  dplyr::mutate(the_rank_post=lag(the_rank))%>% 
  dplyr::mutate(the_rank_ant=lead(the_rank))%>% 
  dplyr::mutate(mfv=max(mfv, na.rm=T))%>%
  dplyr::ungroup()%>%
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #A. If there are no missing values, 3 or more educational levels, more than 3 cases, and there is a problematic case in the middle that is not the most frequent value, replace it with the following value. Unless the error is at the end (a.2). In that case, check whether it is in the first row (most recent case) and has no following value, and replace with <
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::mutate(ed_a= dplyr::case_when(nas_ed==0 & n_dis_esc>2 & n>3 & ant_ed_problematico_post==1 & post_ed_problematico_ant==1 & the_rank>1 & the_rank_post==1~esc_num_lag_post,TRUE~NA_real_))%>%
  #Error at the end
  dplyr::mutate(ed_a2= dplyr::case_when(nas_ed==0 & n_dis_esc>2 & n>3 & ant_ed_problematico_post==1 & is.na(post_ed_problematico_ant) & the_rank>1 & rn_esc==1 & the_rank_ant==1~esc_num_lead_ant,TRUE~NA_real_))%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #B. The user has more than one case with a particular educational level. That is, there is only one most frequent value.
   ##############IT DOES MATTER IF THERE IS A GAP; I WILL NOT REPLACE IT WITH A GIVEN VALUE ANYWAY. E.G.: A CASE FROM 2011 THAT RETURNS IN 2016, e1b2708112875d77f7d3d1bd87c10164 
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::mutate(ed_b= dplyr::case_when(n_dis_esc==n-1 & n_hash_esc==1 & the_rank >1 & nas_ed==0~mfv,TRUE~NA_real_))%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #C. If all cases are different, even when there are missing values, choose the maximum value (in this case, equivalent to the minimum) for all HASHs
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::mutate(ed_c= dplyr::case_when(n_dis_esc==n~min_ed,TRUE~NA_real_))%>% # & nas_ed==0
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #If there are ties and the following case has the_rank==1, it could be replaced with the following value.
  #D. (E.g. 154608 04b09b0ad8f6d8cbf9871594cf10f7e5) Has 2 different values, although not in the middle. 3 equal cases (university), although the last one is an anomaly (secondary)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::mutate(ed_d= dplyr::case_when(ant_ed_problematico_post==1 & post_ed_problematico_ant==1 & esc_num_lag_post==esc_num_lead_ant ~esc_num_lag_post,TRUE~NA_real_))%>%
  #CAUTION: if there are ties and the following case is rank 1, it may be a problematic case
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #D. (E.g. 112484 e1b2708112875d77f7d3d1bd87c10164) The case in the middle is lower, there are NAs and there is no restriction on n (ant_ed_problematico_post==1 & post_ed_problematico_ant==1 & the_rank>1). The least frequent case (esc_num_lag_post==esc_num_lead_ant, i.e. the same value in the previous and in the following treatment).
  #:#:#:#:#CHECK THAT THERE IS NO TIE IN THE MOST FREQUENT VALUE.#_#_#_#__#_#_#_ 
    dplyr::mutate(ed_d2= dplyr::case_when(ed_problematico_post==1 & ed_problematico_ant==1 & esc_num_lag_post==esc_num_lead_ant ~esc_num_lag_post,TRUE~NA_real_))%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #E. (e.g. 157971  fe456c2da940fa8ece275e509a634242) There is a case equal to the previous one, with the same frequency, so there is a tie in mfv and rn_esc>1 (it is not the last case). There is a tie, but it matters little for these purposes.
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::mutate(ed_e= dplyr::case_when(rn_esc==n & ed_problematico_post==1 ~esc_num_lag_post,TRUE~NA_real_))%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #F. (e.g. 140181 485a5ff2e2c7aa0943c292e337ea1411) Different values, more than 3 cases, no NAs; the most recent case should be the one with the problem: it is the problematic one and has the highest rank (it is not the most frequent value).
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_  
  dplyr::mutate(ed_f= dplyr::case_when(n_dis_esc==2 & n>3 & rn_esc==1 & ant_ed_problematico_post==1 & the_rank>1~esc_num_lead_ant,TRUE~NA_real_))%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_  
   #G. (E.g. 114450 e1b2708112875d77f7d3d1bd87c10164) Has NAs in the first record, which I ignore. Has 3 different values, so if there is an educational level in the middle, it should be replaced. This can be used if one and the other coincide
  
  #F. (e.g. 39890, faa8263a28f47dbaefa77c326ea96b2a) No missing values, there are inconsistent intermediate events, there are ties
  ## See how to generate more than one mfv. Perhaps extract the following mfv to check whether it is also one.
  ## It may work without rank==1 as long as the case following the problematic one has rank ==1.
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #H. There are 3 different educational levels (e.g. 72013 bf70334c0a891bee1d016bc530eece8d)
  
  ##I DO NOT KNOW WHAT HAS TO BE CHANGED HERE, BECAUSE THE WHOLE SET OF VARIABLES MUST BE CONSIDERED BEFORE CHANGING IT. THE CONDITION WOULD BE, FOR EACH USER, THAT THEY HAVE PROBLEMATIC CASES IN THE FOLLOWING TREATMENT, HAVE NO MISSING VALUES, AND THAT THE PROBLEMATIC FOLLOWING CASE IS ALSO RANK==1
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #I. (E.g. 4e6b041e4a0a40e7c6bc8a8a65f842d6  132495)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_  
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_    
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
####  Check whether there are cases not covered by any rule###
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::group_by(hash_key)%>%
dplyr::mutate(across(c("ed_a", "ed_a2", "ed_b", "ed_c","ed_d","ed_d2","ed_e","ed_f"),~ifelse(abs(max(.,na.rm=T)) == Inf,NA,.), .names= "{col}_n"))%>% 
  dplyr::select(-ends_with("_n"))%>%
  dplyr::ungroup() %>% 

#hash_key_escolaridad_rules%>%
    dplyr::mutate(total_mean = base::rowSums(dplyr::select(., ed_a, ed_a2, ed_b, ed_c, ed_d, ed_d2, ed_e, ed_f), na.rm=T))%>%
  dplyr::group_by(hash_key) %>% 
    dplyr::mutate(total_mean=max(total_mean,na.rm=T)) %>% 
  ungroup() %>% 
  #dplyr::filter(total_mean==0 ) %>%  #DECIDE WHETHER TO DROP IT OR NOT, ONCE EVERYTHING IS READY

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#$G. There is a tie in the number of distinct values. The first case has a lower educational level (secondary), then 2 higher than secondary, and finally in the last entry (most recent) it goes from university (1) to complete or lower (2). Change the last one because there is no progression in it.
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#7776e5c9335808de3e8389ac2c685056 93804 & 133259  
  #4 cases, 2 different educational levels: started with secondary (2), then university (1) twice, and finally returned to secondary education
#7776e5c9335808de3e8389ac2c685056 78986 42125    
  dplyr::mutate(ed_g= dplyr::case_when(total_mean==0 & n_dis_esc==2 & rn_esc==1 & ant_ed_problematico_post==1~esc_num_lead_ant,TRUE~NA_real_))%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:
  #|||°°°°°°°°°°|||°°°°°°°°°°|||°°°°°°°°°°|||°°°°°°°°°°||||||°°°°°°°°°°|||°°°°°°°°°°|||°°°°°°°°°°|||°°°°°°°°°°|||
  #:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:#:
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#Some must be examined as a whole (e.g., 156063): check whether the first case equals the last one within a user. If so, replace all values --> that are not NAs
  #Check anomalous cases with 2 in a row.
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#$H. Only 2 distinct values (n= n_dis_esc >2), with treatments in between. In the example there are 5 cases: 2 entries with university (1), then 2 entries that look contradictory (e.g., 2 cases of completed or less)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#144b81fab8cd7e2796a133085c6d8d16 131587 & 44293
  dplyr::group_by(hash_key) %>% 
    dplyr::mutate(mix_max_esc=ifelse(dplyr::first(esc_num)==dplyr::last(esc_num) & n_dis_esc==2,1,0),esc_num_last=dplyr::last(esc_num))%>% 
    dplyr::mutate(ties=ifelse(n_distinct(n_hash_esc)>1,0,1), ties=max(ties,na.rm=T))%>%
    dplyr::ungroup()%>%
  dplyr::mutate(ed_h= dplyr::case_when(total_mean==0 & mix_max_esc==1 ~esc_num_last,TRUE~NA_real_))%>% #to keep the min and max value in place of the rest of the cases.
  dplyr::mutate(ed_i= dplyr::case_when(total_mean==0 & ties==1 & mix_max_esc==0 & n_dis_esc==2~min_ed,TRUE~NA_real_))%>% #p
#
  dplyr::mutate(ed_j= dplyr::case_when(total_mean==0 & is.na(ed_i) & is.na(ed_h)~min_ed,TRUE~NA_real_))%>%
#49868 0f1218127d5370806310f2ccc6784302
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(across(c("ed_h", "ed_i", "ed_j"),~ifelse(abs(max(.,na.rm=T)) == Inf,NA,.), .names= "{col}_n"))%>% 
  dplyr::mutate(across(c("ed_a", "ed_a2", "ed_b", "ed_c","ed_d","ed_d2","ed_e","ed_f","ed_h","ed_i","ed_j"),~ifelse(.>0,1,0), .names= "{col}_N-dis"))%>% 
  dplyr::select(-ends_with("_n"))%>%
  dplyr::ungroup() %>%
  dplyr::mutate(no_suggestions = base::rowSums(dplyr::select(., "ed_a_N-dis", "ed_a2_N-dis","ed_b_N-dis","ed_c_N-dis","ed_d_N-dis","ed_d2_N-dis","ed_e_N-dis","ed_f_N-dis","ed_h_N-dis","ed_i_N-dis","ed_j_N-dis"), na.rm=T))%>%
  dplyr::mutate(total_mean2 = base::rowSums(dplyr::select(., ed_h, ed_i, ed_j), na.rm=T))%>%
  dplyr::select(-ends_with("_N-dis")) %>% 
  dplyr::group_by(hash_key) %>% 
    dplyr::mutate(total_mean2=max(total_mean2,na.rm=T)) %>% 
    dplyr::mutate(no_suggestions= max(no_suggestions,na.rm = T)) %>% 
  dplyr::ungroup()
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#4 cases, 2 distinct educational levels: the first 2 are university, the next 2 are secondary, so there is a tie. The most vulnerable one should be kept.
#bde6cd6a3f35291441df7ae58f1ba4bb 118109 103929     ant_ed_problematico_post esc_num_lead_ant  

hash_key_escolaridad_rules_final<-
hash_key_escolaridad_rules%>%
  dplyr::select(row,no_suggestions, starts_with("ed_"))
  #dplyr::filter(no_suggestions>1) %>%

#2,285 #6,320
  CONS_C1_df_dup_JUL_2020_cons18%>%
  dplyr::mutate(obs=case_when(row %in% as.numeric(unlist(hash_key_escolaridad_rules_final$row))~glue::glue("{obs};4.99. Education Changed"),TRUE~obs))%>%
  dplyr::left_join(hash_key_escolaridad_rules_final, by="row")%>%
  dplyr::select(-no_suggestions, -ed_problematico_post, -ed_problematico_ant) %>% # drop all three helper columns
  dplyr::mutate(esc_num=as.numeric(substring(as.character(escolaridad), 1, 1)))%>% #janitor::tabyl(esc_num)
  dplyr::mutate(esc_num=dplyr::case_when(!is.na(ed_a)~ed_a,
                                         !is.na(ed_a2)~ed_a2,
                                         !is.na(ed_b)~ed_b,
                                         !is.na(ed_c)~ed_c,
                                         !is.na(ed_d)~ed_d,
                                         !is.na(ed_d2)~ed_d2,
                                         !is.na(ed_e)~ed_e,
                                         !is.na(ed_f)~ed_f,
                                         !is.na(ed_g)~ed_g,
                                         !is.na(ed_h)~ed_h,
                                         !is.na(ed_i)~ed_i,
                                         !is.na(ed_j)~ed_j,
                                         TRUE~esc_num)) %>% #janitor::tabyl(esc_num)
  dplyr::mutate(escolaridad_rec=dplyr::case_when(esc_num==1~"1-Mayor a Ed Secundaria",
                                                esc_num==2~"2-Ed Secundaria Completa o Menor",
                                                esc_num==3~"3-Ed Primaria Completa o Menor",
                                                TRUE~NA_character_)) %>% 
  dplyr::select(-starts_with("ed_"),-esc_num) %>% 
  assign("CONS_C1_df_dup_JUL_2020_cons19",., envir = .GlobalEnv)
# esc_num     n     percent valid_percent
#       1 19473 0.177420824     0.1781382
#       2 60782 0.553792048     0.5560312
#       3 29059 0.264760013     0.2658305
#      NA   442 0.004027115            NA
  #AFTER APPLYING THE FILTER
#   esc_num     n     percent valid_percent
#       1 18675 0.170150151     0.1708303
#       2 60843 0.554347826     0.5565638
#       3 29801 0.271520464     0.2726059
#      NA   437 0.003981559            NA

hash_key_escolaridad2<-
CONS_C1_df_dup_JUL_2020_cons19%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(esc_num=as.numeric(substring(as.character(escolaridad_rec), 1, 1)))%>%
    dplyr::mutate(esc_num_lag=lag(esc_num))%>%
    dplyr::mutate(fech_ing_lag=lag(fech_ing))%>%
    dplyr::mutate(escolaridad_lag=lag(escolaridad_rec))%>%
    dplyr::filter(esc_num_lag>esc_num)%>% #The subsequent treatment has lower educational attainment than the current one
    dplyr::select(row,hash_key,fech_ing, esc_num_lag,esc_num,escolaridad,escolaridad_lag,fech_ing_lag)%>%
          dplyr::distinct(hash_key)
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_

hash_key_escolaridad_rules2<-
CONS_C1_df_dup_JUL_2020_cons19%>%
dplyr::filter(hash_key %in% unlist(hash_key_escolaridad2))%>%
  dplyr::mutate(esc_num=as.numeric(substring(as.character(escolaridad_rec), 1, 1)))%>%
  dplyr::mutate(nas_ed=ifelse(is.na(escolaridad_rec),1,0))%>%
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(n_dis_esc=n_distinct(escolaridad_rec),n=n(), rn_esc=row_number(),
                nas_ed=sum(nas_ed),nas_ed=ifelse(nas_ed>0,1,0), min_ed=max(esc_num, na.rm=T))%>% # min_ed: the highest code is the lowest attainment (1 = more than secondary, 3 = primary or less)
  dplyr::ungroup()%>%
  dplyr::group_by(hash_key,escolaridad_rec)%>%
  dplyr::mutate(n_hash_esc=n())%>%
  dplyr::ungroup()%>%
  dplyr::select(row,hash_key,fech_ing,esc_num,escolaridad,escolaridad_rec,n_dis_esc,n,n_hash_esc,rn_esc,nas_ed,min_ed)%>%
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  #Generate comparison variables
  #_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::group_by(hash_key)%>%
  dplyr::mutate(esc_num_lag_post=lag(esc_num),esc_num_lead_ant=lead(esc_num))%>% 
  dplyr::mutate(ed_problematico_post=dplyr::case_when(esc_num_lag_post>esc_num~1,TRUE~0))%>%
  dplyr::mutate(ed_problematico_ant=dplyr::case_when(esc_num_lead_ant>esc_num~1,TRUE~0))%>%
  dplyr::mutate(ant_ed_problematico_post=lead(ed_problematico_post))%>%
  dplyr::mutate(post_ed_problematico_ant=lag(ed_problematico_ant))%>%
  dplyr::mutate(the_rank= rank(-n_hash_esc, ties.method = "min"))%>% #"max"
  dplyr::mutate(mfv=ifelse(the_rank==1,esc_num,NA_real_))%>%
  dplyr::mutate(the_rank_post=lag(the_rank))%>% 
  dplyr::mutate(the_rank_ant=lead(the_rank))%>% 
  dplyr::mutate(mfv=max(mfv, na.rm=T))%>%
  dplyr::ungroup()%>%
  
 dplyr::group_by(hash_key) %>% 
    dplyr::mutate(esc_num_first=dplyr::first(esc_num),esc_num_last=dplyr::last(esc_num))%>% 
    dplyr::ungroup()%>%
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::mutate(ed2_a= suppressWarnings(dplyr::case_when(esc_num_first==min_ed & nas_ed==0 & n_dis_esc==2~min_ed,TRUE~NA_real_)))%>% #
  dplyr::group_by(hash_key) %>% 
  dplyr::mutate(ed2_a_n=ifelse(abs(suppressWarnings(max(ed2_a,na.rm=T))) == Inf,NA,ed2_a)) %>% 
  ungroup() %>% 
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
  dplyr::mutate(ed2_b= suppressWarnings(dplyr::case_when(is.na(ed2_a_n) & nas_ed==0 & ant_ed_problematico_post==1 & post_ed_problematico_ant==1 & n_dis_esc==2~esc_num_lead_ant,TRUE~NA_real_)))

  CONS_C1_df_dup_JUL_2020_cons19%>%
  dplyr::left_join(hash_key_escolaridad_rules2[,c("row","ed2_a","ed2_b")], by="row")%>%
  dplyr::mutate(esc_num=as.numeric(substring(as.character(escolaridad_rec), 1, 1)))%>% #janitor::tabyl(esc_num)
  dplyr::mutate(esc_num=dplyr::case_when(!is.na(ed2_a)~ed2_a,
                                         !is.na(ed2_b)~ed2_b,
                                         TRUE~esc_num)) %>% #janitor::tabyl(esc_num)
  dplyr::mutate(escolaridad_rec=dplyr::case_when(esc_num==1~"1-Mayor a Ed Secundaria",
                                                esc_num==2~"2-Ed Secundaria Completa o Menor",
                                                esc_num==3~"3-Ed Primaria Completa o Menor",
                                                TRUE~NA_character_)) %>% 
  dplyr::select(-starts_with("ed2_"),-esc_num) %>% 
  assign("CONS_C1_df_dup_JUL_2020_cons19b",., envir = .GlobalEnv)
  
if(
  CONS_C1_df_dup_JUL_2020_cons19b%>%
    dplyr::group_by(hash_key)%>%
    dplyr::mutate(esc_num=as.numeric(substring(as.character(escolaridad_rec), 1, 1)))%>%
    dplyr::mutate(esc_num_lag=lag(esc_num))%>%
    dplyr::mutate(fech_ing_lag=lag(fech_ing))%>%
    dplyr::mutate(escolaridad_lag=lag(escolaridad_rec))%>%
    dplyr::filter(esc_num_lag>esc_num)%>% #The subsequent treatment has lower educational attainment than the current one
    dplyr::select(row,hash_key,fech_ing, esc_num_lag,esc_num,escolaridad,escolaridad_lag,fech_ing_lag)%>%
    dplyr::distinct(hash_key)%>% nrow()
>0){warning("there are still levels of educational attainment left to normalize")}
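
The consistency check above can be illustrated on toy data. The following is a minimal sketch, not the pipeline itself: `toy_ed` and its values are hypothetical, and it assumes that rows within each user are ordered from the most recent entry to the oldest, so that `lag()` returns the subsequent treatment (as the variable names in the rules suggest).

```r
library(dplyr)

# Hypothetical toy data. Codes follow the recoding used above:
# 1 = more than secondary, 2 = secondary or less, 3 = primary or less.
# Assumption: rows are ordered from most recent to oldest within each user.
toy_ed <- data.frame(
  hash_key = c("u1", "u1", "u2", "u2"),
  esc_num  = c(1, 2, 2, 1)
)

inconsistent <- toy_ed %>%
  group_by(hash_key) %>%
  # value of the subsequent (more recent) treatment, under the ordering assumption
  mutate(esc_num_lag = dplyr::lag(esc_num)) %>%
  # a higher code later on means lower attainment than before
  filter(esc_num_lag > esc_num) %>%
  ungroup() %>%
  distinct(hash_key)

# Users that would still trigger the warning
inconsistent$hash_key
```

In this toy example only `u2` is flagged, because its more recent entry reports a lower level (2) than the earlier one (1); `u1` moves from 2 to 1, which is a valid progression.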

Generate values of the trajectories of users

We created variables to summarize each trajectory in terms of days in treatment. The variable cum_dias_trat_sin_na gives the cumulative days treated for each patient up to a given treatment, and mean_cum_dias_trat_sin_na divides that sum by the number of treatments so far. We also added the cumulative difference in days between treatments (cum_diff_bet_treat) and its mean (mean_cum_diff_bet_treat). These variables let us identify changes in treatment length and time to readmission throughout the trajectory of each user in SENDA between 2010 and 2019.
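
As a minimal sketch of how these cumulative variables behave, consider the toy example below. The data frame `toy` and its values are hypothetical; the column names mirror those in the dataset, and missing values are treated as zero before accumulating, as in the code that follows. `dplyr::coalesce()` is used here as a stand-in for `tidyr::replace_na()`.

```r
library(dplyr)

# Hypothetical toy data: user u1 has three treatments, user u2 has one.
toy <- data.frame(
  hash_key = c("u1", "u1", "u1", "u2"),
  fech_ing = as.Date(c("2010-01-01", "2011-01-01", "2012-01-01", "2015-06-01")),
  dias_treat_imp_sin_na = c(100, NA, 50, 30),
  diff_bet_treat = c(200, 300, NA, NA)
)

toy_summary <- toy %>%
  arrange(hash_key, fech_ing) %>%
  group_by(hash_key) %>%
  mutate(
    rn_hash = row_number(),   # order of the treatment within the user
    n_hash  = n(),            # total treatments of the user
    cum_dias_trat_sin_na      = cumsum(coalesce(dias_treat_imp_sin_na, 0)),
    mean_cum_dias_trat_sin_na = cum_dias_trat_sin_na / rn_hash,
    cum_diff_bet_treat        = cumsum(coalesce(diff_bet_treat, 0)),
    # the last treatment has no following treatment, so its mean is set to NA
    mean_cum_diff_bet_treat   = if_else(rn_hash == n_hash, NA_real_,
                                        cum_diff_bet_treat / rn_hash)
  ) %>%
  ungroup()

toy_summary
```

For user u1 the cumulative days are 100, 100 and 150, and the running means are 100, 50 and 50; the missing second treatment length contributes zero days but still counts as a treatment in the denominator.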

library(magrittr)

CONS_C1_df_dup_JUL_2020_cons19b %>% 
  dplyr::arrange(hash_key,fech_ing) %>%
  dplyr::mutate(keep_tipo_de_plan_2=tipo_de_plan_2)%>%
  dplyr::mutate(keep_motivodeegreso_mod_imp=motivodeegreso_mod_imp)%>% 
  dplyr::group_by(hash_key) %>% 
  dplyr::mutate(rn_hash_discard=row_number())%>% 
  dplyr::mutate(rn_hash=row_number())%>% 
  dplyr::mutate(n_hash=n())%>% 
  dplyr::mutate(cum_dias_trat_sin_na=cumsum(tidyr::replace_na(dias_treat_imp_sin_na, 0)))%>%
  dplyr::mutate(keep_cum_dias_trat_sin_na=cumsum(tidyr::replace_na(dias_treat_imp_sin_na, 0)))%>%
  dplyr::mutate(mean_cum_dias_trat_sin_na=cum_dias_trat_sin_na/rn_hash)%>%
  dplyr::mutate(keep_mean_cum_dias_trat_sin_na=cum_dias_trat_sin_na/rn_hash)%>% 
  dplyr::mutate(cum_diff_bet_treat=cumsum(tidyr::replace_na(diff_bet_treat, 0)))%>%
  dplyr::mutate(keep_cum_diff_bet_treat=cumsum(tidyr::replace_na(diff_bet_treat, 0)))%>% 
  dplyr::mutate(mean_cum_diff_bet_treat=cum_diff_bet_treat/rn_hash)%>%
  dplyr::mutate(mean_cum_diff_bet_treat=dplyr::case_when(n_hash== rn_hash~NA_real_,TRUE~mean_cum_diff_bet_treat)) %>% 
  dplyr::mutate(keep_mean_cum_diff_bet_treat=mean_cum_diff_bet_treat)%>%
  dplyr::ungroup()%>% 

  #":":":":":":"":":":":": Test
  #dplyr::select(hash_key,rn_hash_discard,rn_hash,n_hash,cum_dias_trat_sin_na_rev,mean_cum_dias_trat_sin_na_rev,cum_diff_bet_treat_rev,n_hash,diff_bet_treat)%>% dplyr::filter(hash_key=="0093cc44fee21895b9e55f3d84e51928") %>% View()
  tidyr::pivot_wider(
    names_from =  rn_hash_discard, 
    names_sep="_",
    values_from = c(tipo_de_plan_2,
                    motivodeegreso_mod_imp,
                    dias_treat_imp_sin_na, 
                    diff_bet_treat,
                    cum_dias_trat_sin_na,
                    mean_cum_dias_trat_sin_na, 
                    cum_diff_bet_treat,
                    mean_cum_diff_bet_treat)
  )%>% #glimpse()

  #dplyr::filter(hash_key=="0093cc44fee21895b9e55f3d84e51928") %>% View()
   dplyr::group_by(hash_key)%>%
  dplyr::mutate_at(vars(tipo_de_plan_2_1:mean_cum_diff_bet_treat_10),~suppressWarnings(max(as.character(.),na.rm=T)))%>%
  dplyr::ungroup() %>%
  
  dplyr::mutate_at(vars(dias_treat_imp_sin_na_1:mean_cum_diff_bet_treat_10),~as.numeric(.))%>%
  dplyr::mutate(diff_bet_treat=fech_ing_next_treat-fech_egres_num)%>%
  dplyr::mutate(dias_treat_imp_sin_na=fech_egres_num-fech_ing_num)%>%
  #dplyr::select(hash_key,n_hash,fech_ing, starts_with("tipo_de_plan_2"),starts_with("dias_treat_imp_sin_na"),starts_with("diff_bet_treat_"),starts_with("mean_cum_dias_trat_sin_na_"),starts_with("mean_cum_dias_trat_sin_na_"))%>% dplyr::filter(hash_key=="0093cc44fee21895b9e55f3d84e51928")%>% View()
  # funs() is deprecated in dplyr; a formula lambda does the same job
  dplyr::rename_at(.vars = vars(matches("^keep_")),
            .funs = ~sub("^keep_", "", .))%>%
      assign("CONS_C1_df_dup_JUL_2020_cons20",., envir = .GlobalEnv)

no_mostrar=0
if(no_mostrar==1){
CONS_C1_df_dup_JUL_2020_cons20 %>% 
    dplyr::group_by(hash_key) %>% 
    dplyr::mutate(n_hash=n())%>% 
   dplyr::ungroup()%>% 
    dplyr::filter(n_hash>3) %>% 
    dplyr::mutate_at(vars(dias_treat_imp_sin_na:cum_diff_bet_treat_rev_10),~suppressWarnings(max(as.character(.),na.rm=T)))%>%
  dplyr::select(hash_key,n_hash,fech_ing, starts_with("tipo_de_plan_2"),starts_with("dias_treat_imp_sin_na"),starts_with("diff_bet_treat_"),starts_with("mean_cum_dias_trat_sin_na_"),starts_with("mean_cum_dias_trat_sin_na_"),starts_with("mean_cum_diff_bet_treat_"))%>% View()
}
 # CONS_C1_df_dup_JUL_2020_cons19 %>% dplyr::mutate(n_hash=n())%>% dplyr::filter(n_hash>1)%>% select(diff_bet_treat_1) %>%  summary()

CONS_C1_df_dup_JUL_2020_cons20%>%
  dplyr::mutate_at(vars(contains("dg_trs_psiq_cie_10"),contains("dg_trs_psiq_dsm_iv"),contains("dg_trs_psiq_sub_cie_10"),contains("dg_trs_psiq_sub_dsm_iv"),contains("tipo_de_plan"),c('tipo_de_programa_2','nombre_centro','tipo_centro','servicio_de_salud','senda','tipo_centro_derivacion','usuario_tribunal_trat_droga','motivodeegreso_mod_imp','macrozona','nombre_region','comuna_residencia_cod','identidad_de_genero','origen_ingreso_mod','x_se_trata_mujer_emb','usuario_tribunal_trat_droga','tiene_menores_de_edad_a_cargo','ha_estado_embarazada_egreso','discapacidad','opcion_discapacidad','escolaridad','edad_al_ing_grupos','nacionalidad','sexo_2','embarazo','estado_conyugal_2','edad_grupos','freq_cons_sus_prin','via_adm_sus_prin_act','etnia_cor','nacionalidad_2','etnia_cor_2','sus_ini_2_mod','sus_ini_3_mod','sus_ini_mod','con_quien_vive','estatus_ocupacional','cat_ocupacional','sus_principal_mod', 'tipo_de_vivienda_mod','tenencia_de_la_vivienda_mod','rubro_trabaja_mod','otras_sus1_mod','otras_sus2_mod','otras_sus3_mod','dg_trs_cons_sus_or','diagnostico_trs_fisico','otros_probl_at_sm_or','ano_bd_first','ano_bd_last','centro_muj','cie_10','dsm_iv','escolaridad_rec','con_quien_vive_rec','edad_ini_sus_prin_grupos')),~as.factor(.))%>%
  dplyr::mutate_at(c('id_centro_sig_trat','tipo_plan_sig_trat','tipo_programa_sig_trat','senda_sig_trat','at_least_one_cont_entry','id_centro'),~as.factor(.))%>%
  dplyr::mutate_at(c('tipo_de_plan_2_concat_a'),~as.character(.))%>%
  dplyr::mutate_at(c('fech_ing','fech_egres_imp'),~as.Date(.))%>%
  
    assign("CONS_C1_df_dup_JUL_2020",., envir = .GlobalEnv)

Consolidation of the dataset and its variables

  metadata(CONS_C1_df_dup_JUL_2020)$name <- "Agreement 1 SENDA"
  metadata(CONS_C1_df_dup_JUL_2020)$description <- "Information About Agreement 1 of SENDA and MINSAL. Contains information about treatments. (*) Intermediate events are collapsed and concatenated in some variables. Criteria to select values from entries that were collapsed into single treatments: Wide format(a), Maximum/Last value(b), Minimum/First value(c), Kept more vulnerable category(d), Same value(e), Largest treatment(f), Favored dgs.-a(g), Sum values(h). For 'tipo_de_plan_2', 'dias_treat_imp_sin_na', 'diff_bet_treat', 'cum_dias_trat_sin_na', 'mean_cum_dias_trat_sin_na' & 'cum_diff_bet_treat', 10 variables were generated per variable, one for each treatment of the user, from the first (1) to the last (10)"

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_  
  
codebook::var_label(CONS_C1_df_dup_JUL_2020) <- list(
row= 'Numerador de los eventos presentes en la Base de Datos (Último registro)/Events in the Dataset (Last Entry)',
row_cont_entries= 'Numerador de los eventos presentes en la Base de Datos(*)/Events in the Dataset(*)',
hash_key= 'Codificación del RUN/Masked Identifier (RUN)',
hash_rut_completo= 'HASH alternativo, en el escenario en que se asuma que el individuo al que se le codificó el RUN presente mayor edad/Alternative HASH-Key',
id= 'Codigo Identificación de SENDA/SENDA ID',
id_mod= 'ID de SENDA para Presentación en Página Web (enmascara caracteres 5 y 6)/SENDA ID (mask characters 5 & 6)',
fech_ing= 'Fecha de Ingreso a Tratamiento (Primera Entrada)/Date of Admission to Treatment (First Entry)',
fech_egres_imp= 'Fecha de Egreso (Imputados KNN & Lógico) del Último Registro(b)/Date of Discharge (Imputed KNN & Logic) of the Last Entry(b)',
tipo_de_plan_2= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry',
tipo_de_plan_2_largest_treat= 'Tipo de Plan del Registro Más Largo entre entradas intermedias(f)/Type of Plan of the Largest Entry Among Intermediate Entries(f)',
tipo_de_plan_2_concat_a= 'Tipo de Plan(*)/Type of Plan(*)', 
tipo_de_programa_2= 'Tipo de Programa del Registro Más Largo entre Entradas Intermedias/Type of Program of the Largest Entry Among Intermediate Entries',
id_centro= 'ID de Centro(b)/Treatment center ID(b)',
nombre_centro= 'Nombre del Centro de Tratamiento(*)/Treatment Center(*)',
id_centro_concat_a= 'ID de Centro(*)/Treatment center ID(*)',
tipo_centro= 'Tipo de Centro del Último Registro/Type of Center of the Last Entry',
servicio_de_salud= 'Servicio de Salud(*)/Health Service(*)',
senda= 'SENDA del Último Registro/SENDA of the Last Entry',
numero_de_hijos_mod= 'Número de Hijos (Valor Max.)/Number of Children (Max. Value)',
num_hijos_trat_res_mod= 'Número de Hijos para Ingreso a Tratamiento Residencial del Último Registro/Number of Children to Residential Treatment of the Last Entry',
tipo_centro_derivacion= 'Tipo de Centro al que el Usuario es Derivado del Último Registro(b)/Type of Center of Derivation of the Last Entry(b)',
motivodeegreso_mod_imp= 'Motivo de Egreso (con abandono temprano y tardío)(Imputados KNN & Lógico) del Último Registro(b)/Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b)',
macrozona= "Macrozona del Centro del Último Registro(b)/Macrozones of the Center of the Last Entry(b)",
nombre_region= "Región del Centro del Último Registro(b)/Chilean Region of the Center of the Last Entry(b)",
comuna_residencia_cod= "Comuna de Residencia del Último Registro(b)/Municipality or District of Residence of the Last Entry(b)",
fecha_ingreso_a_convenio_senda= 'Fecha de Ingreso a Convenio SENDA (aún no formateada como fecha) (Primera Entrada)/Date of Admission to SENDA Agreement (First Entry)',
identidad_de_genero= 'Identidad de Género (Último Registro)(b)/Gender Identity (Last Entry)(b)',
edad_al_ing= 'Edad a la Fecha de Ingreso a Tratamiento (numérico continuo) (Primera Entrada)/Age at Admission to Treatment (First Entry)',
origen_ingreso_mod= 'Origen de Ingreso (Primera Entrada)/Motive of Admission to Treatment (First Entry)',
x_se_trata_mujer_emb= 'Mujer Embarazada al Ingreso (d)/Pregnant at Admission (d)',
compromiso_biopsicosocial= 'Compromiso Biopsicosocial(d)/Biopsychosocial Involvement(d)',
dg_global_nec_int_soc_or= 'Diagnóstico Global de Necesidades de Integración Social (Al Ingreso)(d)/Global Diagnosis of Social Integration (At Admission)(d)',
dg_nec_int_soc_cap_hum_or= 'Diagnóstico de Necesidades de Integración Social en Capital Humano (Al Ingreso)(d)/Global Diagnosis of Social Integration in Human Capital (At Admission)(d)',
dg_nec_int_soc_cap_fis_or= 'Diagnóstico de Necesidades de Integración Social en Capital Físico (Al Ingreso)(d)/Global Diagnosis of Social Integration in Physical Capital (At Admission)(d)',
dg_nec_int_soc_cap_soc_or= 'Diagnóstico de Necesidades de Integración Social en Capital Social (Al Ingreso)(d)/Global Diagnosis of Social Integration in Social Capital (At Admission)(d)',
usuario_tribunal_trat_droga= 'Usuario de modalidad Tribunales de Tratamiento de Drogas(d)/User of Drug Treatment Courts Modality(d)',
evaluacindelprocesoteraputico= 'Evaluación del Proceso Terapéutico(d)/Evaluation of the Therapeutic Process(d)',
eva_consumo= 'Evaluación al Egreso Respecto al Patrón de consumo(d)/Evaluation at Discharge regarding to Consumption Pattern(d)',
eva_fam= 'Evaluación al Egreso Respecto a Situación Familiar(d)/Evaluation at Discharge regarding to Family Situation(d)',
eva_relinterp= 'Evaluación al Egreso Respecto a Relaciones Interpersonales(d)/Evaluation at Discharge regarding to Interpersonal Relations(d)',
eva_ocupacion= 'Evaluación al Egreso Respecto a Situación Ocupacional(d)/Evaluation at Discharge regarding to Occupational Status(d)',
eva_sm= 'Evaluación al Egreso Respecto a Salud Mental(d)/Evaluation at Discharge regarding to Mental Health(d)',
eva_fisica= 'Evaluación al Egreso Respecto a Salud Física(d)/Evaluation at Discharge regarding to Physical Health(d)',
eva_transgnorma= 'Evaluación al Egreso Respecto a Trasgresión a la Norma Social(d)/Evaluation at Discharge regarding to Transgression to the Norm(d)',
dg_global_nec_int_soc_or_1= 'Diagnóstico Global de Necesidades de Integración Social (Al Egreso)(d)/Global Diagnosis of Social Integration (At Discharge)(d)',
dg_nec_int_soc_cap_hum_or_1= 'Diagnóstico de Necesidades de Integración Social en Capital Humano (Al Egreso)(d)/Global Diagnosis of Social Integration in Human Capital (At Discharge)(d)',
dg_nec_int_soc_cap_fis_or_1= 'Diagnóstico de Necesidades de Integración Social en Capital Físico (Al Egreso)(d)/Global Diagnosis of Social Integration in Physical Capital (At Discharge)(d)',
dg_nec_int_soc_cap_soc_or_1= 'Diagnóstico de Necesidades de Integración Social en Capital Social (Al Egreso)(d)/Global Diagnosis of Social Integration in Social Capital (At Discharge)(d)',
tiene_menores_de_edad_a_cargo= 'Menores de Edad A Cargo(d)/Minor Dependants(d)',
ha_estado_embarazada_egreso= '¿Ha estado embarazada? (al Egreso)(d)/Have you been Pregnant (at Discharge)(d)',
discapacidad= 'Presenta Discapacidad(d)/Disability(d)',
opcion_discapacidad= 'Origen de Discapacidad(d)/Cause of Disability(d)',
escolaridad= 'Escolaridad: Nivel Educacional(d)/Educational Attainment(d)',
escolaridad_rec= 'Escolaridad: Nivel Educacional(d) Normalizado a Progresión de Tratamientos/Educational Attainment(d) & Normalized Following Progression of Treatments',
edad_al_ing_grupos= 'Edad a la Fecha de Ingreso a Tratamiento en Grupos(c)/Age at Admission to Treatment In Groups(c)',
nacionalidad= 'Nacionalidad/Nationality',
sexo_2= 'Sexo Usuario/Sex of User',
embarazo= 'Embarazo al Ingreso(c)/Pregnant at Admission(c)',
fech_nac= 'Fecha de Nacimiento/Date of Birth',
edad_ini_cons= 'Edad de Inicio de Consumo/Age of Onset of Drug Use',
edad_ini_sus_prin=  'Edad de Inicio de Consumo Sustancia Principal/Age of Onset of Drug Use of Primary Substance',
edad_ini_sus_prin_grupos=  'Edad de Inicio de Consumo Sustancia Principal (en Grupos)/Age of Onset of Drug Use of Primary Substance (in Groups)',
estado_conyugal_2= 'Estado Conyugal/Marital Status',
edad_grupos= 'Edad agrupada/Age in groups',
freq_cons_sus_prin= 'Frecuencia de Consumo de la Sustancia Principal (30 días previos a la admisión)(f)/Frequency of Consumption of the Primary or Main Substance (30 days previous to admission)(f)',
via_adm_sus_prin_act= 'Vía de Administración de la Sustancia Principal (Se aplicaron criterios de limpieza)(f)/Route of Administration of the Primary or Main Substance (Tidy)(f)',
etnia_cor= 'Etnia/Ethnic Group',
nacionalidad_2= 'Segunda Nacionalidad/Second Nationality',
etnia_cor_2= 'Etnia (2)/Second Ethnic Group',
sus_ini_2_mod= 'Segunda Sustancia de Inicio(Sólo más frecuentes)/Second Starting Substance',
sus_ini_3_mod= 'Tercera Sustancia de Inicio(Sólo más frecuentes)/Third Starting Substance',
sus_ini_mod= "Sustancia de Inicio (Sólo más frecuentes)/Starting Substance (Only more frequent)",
con_quien_vive= 'Persona con la que vive el Usuario(f)/People that Share Household with the User (Cohabitation Status)(f)',
con_quien_vive_rec= 'Persona con la que vive el Usuario (Recodificada)(f)/People that Share Household with the User (Cohabitation Status)(Recoded)(f)',
estatus_ocupacional= 'Condición Ocupacional(f)/Occupational Status(f)',
cat_ocupacional= 'Categoría Ocupacional(f)/Occupational Category(f)',
sus_principal_mod= 'Sustancia Principal de Consumo (Sólo más frecuentes)(f)/Primary or Main Substance of Consumption at Admission (Only more frequent)(f)',
tipo_de_vivienda_mod= 'Tipo de Vivienda(f)/Type of Housing(f)', 
tenencia_de_la_vivienda_mod= 'Tenencia de la Vivienda(f)/Tenure status of Households(f)',
rubro_trabaja_mod= 'Rubro de Trabajo(f)/Area of Work(f)',
otras_sus1_mod= 'Otras Sustancias (1)(Sólo más frecuentes)(f)/Other Substances (1)(Only more frequent)(f)',
otras_sus2_mod= 'Otras Sustancias (2)(Sólo más frecuentes)(f)/Other Substances (2)(Only more frequent)(f)',
otras_sus3_mod= 'Otras Sustancias (3)(Sólo más frecuentes)(f)/Other Substances (3)(Only more frequent)(f)',
dg_trs_cons_sus_or= 'Diagnóstico de Trastorno por Consumo de Sustancias(d)/Diagnosis of Substance Use Disorder(d)',
dg_trs_psiq_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria(g)',
dg_trs_psiq_sub_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification)(g)',
x2_dg_trs_psiq_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (2)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (2)(g)',
x2_dg_trs_psiq_sub_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion) (2)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification) (2)(g)',
x3_dg_trs_psiq_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (3)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (3)(g)',
x3_dg_trs_psiq_sub_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion) (3)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification) (3)(g)',
dg_trs_psiq_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria(g)',
dg_trs_psiq_sub_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification)(g)',
x2_dg_trs_psiq_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (2)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (2)(g)',
x2_dg_trs_psiq_sub_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion) (2)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification) (2)(g)',
x3_dg_trs_psiq_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (3)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (3)(g)',
x3_dg_trs_psiq_sub_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion) (3)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification) (3)(g)',
diagnostico_trs_fisico= 'Diagnóstico de Trastorno Físico(g)/Diagnosis of Physical Disorder(g)',
otros_probl_at_sm_or= 'Otros Problemas de Atención Vinculados a Salud Mental(g)/Other problems linked to Mental Health(g)',
x4_dg_trs_psiq_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (4)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (4)(g)',
x4_dg_trs_psiq_sub_dsm_iv_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion)(4)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification)(4)(g)',
x4_dg_trs_psiq_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (4)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (4)(g)',
x5_dg_trs_psiq_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (5)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (5)(g)',
x4_dg_trs_psiq_sub_cie_10_or= 'Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion)(4)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification)(4)(g)',
ano_bd_first= 'Año de la Base de Datos(c)/Year of the Dataset (Source)(c)',
ano_bd_last= 'Año de la Base de Datos(b)/Year of the Dataset (Source)(b)',
obs= 'Observaciones al Proceso de Limpieza y Estandarización de Casos(e)/Observations to the Process of Data Tidying & Standardization(e)',
obs_concat_a= 'Observaciones al Proceso de Limpieza y Estandarización de Casos(*)/Observations to the Process of Data Tidying & Standardization(*)',
rn_common_treats2= 'Cuenta de Entradas Comunes(b)/Count of Common Entries(b)',
concat_hash_id_treatments='Combination of User & Distinct Entries',
at_least_one_cont_entry= "Casos de Usuarios con más de una entrada después de otra/Cases of users with more than one entry after another one",
senda_concat_a= 'SENDA(*)/SENDA(*)',
tipo_centro_concat_a= 'Tipo de Centro(*)/Type of Center(*)',
fech_ing_num= 'Fecha de Ingreso a Tratamiento (Numérico)(c)/Date of Admission to Treatment (Numeric)(c)',
fech_egres_num= 'Fecha de Egreso (Imputados KNN & Lógico)(Numérico)(b)/Date of Discharge (Imputed KNN & Logic)(Numeric)(b)',
fech_ing_next_treat= 'Fecha de Ingreso a Tratamiento (Numérico)(c) del Tratamiento Posterior/Date of Admission to Treatment (Numeric)(c) of the Next Treatment',
diff_bet_treat= 'Días de diferencia con el Tratamiento Posterior/Days of difference between the Next Treatment',
id_centro_sig_trat= "ID del Centro del Tratamiento Posterior/Center ID of the Next Treatment",
tipo_plan_sig_trat= "Tipo de Plan del Tratamiento Posterior/Type of Plan of the Next Treatment",
tipo_programa_sig_trat= "Tipo de Programa del Tratamiento Posterior/Type of Program of the Next Treatment", 
senda_sig_trat= "SENDA del Tratamiento Posterior/SENDA of the Next Treatment",
menor_60_dias_diff= 'Menor a 60 días de diferencia con el Tratamiento Posterior/Less than 60 days of difference between the Next Treatment',
menor_45_dias_diff= 'Menor a 45 días de diferencia con el Tratamiento Posterior/Less than 45 days of difference between the Next Treatment',
motivoegreso_derivacion= "Motivo de Egreso= Derivación(b)/Cause of Discharge= Referral(b)",
dias_treat_imp_sin_na= 'Días de Tratamiento (valores perdidos en la fecha de egreso se reemplazaron por la diferencia con 2019-11-13)/Days of Treatment (missing dates of discharge were replaced with difference from 2019-11-13)',
obs_cambios= "Cambios del tratamiento en comparación al Tratamiento Posterior/Changes in treatment compared to the Next Treatment",
obs_cambios_ninguno= "Sin cambios del tratamiento en comparación al Tratamiento Posterior/No changes in treatment compared to the Next Treatment",
obs_cambios_num= "Recuento de cambios del tratamiento en comparación al Tratamiento Posterior/Count of changes in treatment compared to the Next Treatment",
obs_cambios_fac= "Recuento de cambios del tratamiento en comparación al Tratamiento Posterior(factor)/Count of changes in treatment compared to the Next Treatment(factor)",
hash_key_sex_program= 'Usuarios a los que se le ha cambiado el sexo de acuerdo al tipo de plan/Users whose sex was changed according to the type of plan',
centro_muj= 'ID de centro que alude a un centro específico para mujeres/Center ID alludes to a women-specific center',
dsm_iv= 'Diagnóstico DSM-IV (1 o más)/Psychiatric Diagnoses (DSM-IV)(one or more)',
cie_10= 'Diagnóstico CIE-10 (1 o más)/Psychiatric Diagnoses (ICD-10)(one or more)',
abandono_temprano= 'Abandono temprano(<3 meses)/ Early Drop-out(<3 months)',
cnt_mod_dsm_iv_or= 'Recuento de Diagnóstico DSM-IV/Count of Psychiatric Diagnoses (DSM-IV)',
cnt_mod_cie_10_or= 'Recuento de Diagnóstico CIE-10/Count of Psychiatric Diagnoses (ICD-10)',
cnt_diagnostico_trs_fisico= 'Recuento de Diagnóstico de Trastorno Físico/Count of Physical Disorder',
cnt_otros_probl_at_sm_or= 'Recuento de Otros Problemas de Atención Vinculados a Salud Mental/Count of Other problems linked to Mental Health',
cum_dias_trat_sin_na= 'Suma acumulada de Días de Tratamiento por Usuario/Cumulative Days of Treatment by User',
mean_cum_dias_trat_sin_na= 'Promedio acumulado de Días de Tratamiento por Usuario/Cumulative Average Days of Treatment by User',
cum_diff_bet_treat= 'Suma acumulada de Diferencia en Días con Tratamiento Siguiente por Usuario/Cumulative sum of Days of difference from the Next Treatment by User',
mean_cum_diff_bet_treat= 'Promedio acumulado de Diferencia en Días entre Tratamientos por Usuario/Cumulative Average Days of Differences Between Treatments By User',
rn_hash= 'Número de Tratamientos por Usuario (menor, tratamiento más antiguo)/Number of Treatments by User (lower, older treatment)',
n_hash= 'Número total de Tratamientos por Usuario/Total Number of Treatments by User',
#2022
tipo_de_plan_2_1= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (1st Treatment)',
tipo_de_plan_2_2= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (2nd Treatment)',
tipo_de_plan_2_3= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (3rd Treatment)',
tipo_de_plan_2_4= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (4th Treatment)',
tipo_de_plan_2_5= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (5th Treatment)',
tipo_de_plan_2_6= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (6th Treatment)',
tipo_de_plan_2_7= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (7th Treatment)',
tipo_de_plan_2_8= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (8th Treatment)',
tipo_de_plan_2_9= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (9th Treatment)',
tipo_de_plan_2_10= 'Tipo de Plan del Último Registro/Type of Plan of the Last Entry (10th Treatment)',

motivodeegreso_mod_imp_1= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (1st Treatment)',
motivodeegreso_mod_imp_2= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (2nd Treatment)', 
motivodeegreso_mod_imp_3= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (3rd Treatment)',
motivodeegreso_mod_imp_4= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (4th Treatment)',
motivodeegreso_mod_imp_5= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (5th Treatment)',
motivodeegreso_mod_imp_6= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (6th Treatment)',
motivodeegreso_mod_imp_7= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (7th Treatment)',
motivodeegreso_mod_imp_8= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (8th Treatment)',
motivodeegreso_mod_imp_9= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (9th Treatment)',
motivodeegreso_mod_imp_10= 'Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (10th Treatment)',

dias_treat_imp_sin_na_1= "Days of Treatment (1st Treatment)",
dias_treat_imp_sin_na_2= "Days of Treatment (2nd Treatment)",
dias_treat_imp_sin_na_3= "Days of Treatment (3rd Treatment)",
dias_treat_imp_sin_na_4= "Days of Treatment (4th Treatment)",
dias_treat_imp_sin_na_5= "Days of Treatment (5th Treatment)",
dias_treat_imp_sin_na_6= "Days of Treatment (6th Treatment)",
dias_treat_imp_sin_na_7= "Days of Treatment (7th Treatment)",
dias_treat_imp_sin_na_8= "Days of Treatment (8th Treatment)",
dias_treat_imp_sin_na_9= "Days of Treatment (9th Treatment)",
dias_treat_imp_sin_na_10="Days of Treatment (10th Treatment)",

diff_bet_treat_1= "Diff Between Treatments (1st Treatment)",
diff_bet_treat_2= "Diff Between Treatments (2nd Treatment)",
diff_bet_treat_3= "Diff Between Treatments (3rd Treatment)",
diff_bet_treat_4= "Diff Between Treatments (4th Treatment)",
diff_bet_treat_5= "Diff Between Treatments (5th Treatment)",
diff_bet_treat_6= "Diff Between Treatments (6th Treatment)",
diff_bet_treat_7= "Diff Between Treatments (7th Treatment)",
diff_bet_treat_8= "Diff Between Treatments (8th Treatment)",
diff_bet_treat_9= "Diff Between Treatments (9th Treatment)",
diff_bet_treat_10="Diff Between Treatments (10th Treatment)",

cum_dias_trat_sin_na_1= "Cum. Days of Treatment (1st Treatment)",
cum_dias_trat_sin_na_2= "Cum. Days of Treatment (2nd Treatment)",
cum_dias_trat_sin_na_3= "Cum. Days of Treatment (3rd Treatment)",
cum_dias_trat_sin_na_4= "Cum. Days of Treatment (4th Treatment)",
cum_dias_trat_sin_na_5= "Cum. Days of Treatment (5th Treatment)",
cum_dias_trat_sin_na_6= "Cum. Days of Treatment (6th Treatment)",
cum_dias_trat_sin_na_7= "Cum. Days of Treatment (7th Treatment)",
cum_dias_trat_sin_na_8= "Cum. Days of Treatment (8th Treatment)",
cum_dias_trat_sin_na_9= "Cum. Days of Treatment (9th Treatment)",
cum_dias_trat_sin_na_10="Cum. Days of Treatment (10th Treatment)",

mean_cum_dias_trat_sin_na_1= "Avg. Cum. Days of Treatment (1st Treatment)",
mean_cum_dias_trat_sin_na_2= "Avg. Cum. Days of Treatment (2nd Treatment)",
mean_cum_dias_trat_sin_na_3= "Avg. Cum. Days of Treatment (3rd Treatment)",
mean_cum_dias_trat_sin_na_4= "Avg. Cum. Days of Treatment (4th Treatment)",
mean_cum_dias_trat_sin_na_5= "Avg. Cum. Days of Treatment (5th Treatment)",
mean_cum_dias_trat_sin_na_6= "Avg. Cum. Days of Treatment (6th Treatment)",
mean_cum_dias_trat_sin_na_7= "Avg. Cum. Days of Treatment (7th Treatment)",
mean_cum_dias_trat_sin_na_8= "Avg. Cum. Days of Treatment (8th Treatment)",
mean_cum_dias_trat_sin_na_9= "Avg. Cum. Days of Treatment (9th Treatment)",
mean_cum_dias_trat_sin_na_10="Avg. Cum. Days of Treatment (10th Treatment)",

cum_diff_bet_treat_1= "Cum. Diff Between Treatments (1st Treatment)",
cum_diff_bet_treat_2= "Cum. Diff Between Treatments (2nd Treatment)",
cum_diff_bet_treat_3= "Cum. Diff Between Treatments (3rd Treatment)",
cum_diff_bet_treat_4= "Cum. Diff Between Treatments (4th Treatment)",
cum_diff_bet_treat_5= "Cum. Diff Between Treatments (5th Treatment)",
cum_diff_bet_treat_6= "Cum. Diff Between Treatments (6th Treatment)",
cum_diff_bet_treat_7= "Cum. Diff Between Treatments (7th Treatment)",
cum_diff_bet_treat_8= "Cum. Diff Between Treatments (8th Treatment)",
cum_diff_bet_treat_9= "Cum. Diff Between Treatments (9th Treatment)",
cum_diff_bet_treat_10="Cum. Diff Between Treatments (10th Treatment)",

mean_cum_diff_bet_treat_1= "Avg. Cum. Diff Between Treatments (1st Treatment)",
mean_cum_diff_bet_treat_2= "Avg. Cum. Diff Between Treatments (2nd Treatment)",
mean_cum_diff_bet_treat_3= "Avg. Cum. Diff Between Treatments (3rd Treatment)",
mean_cum_diff_bet_treat_4= "Avg. Cum. Diff Between Treatments (4th Treatment)",
mean_cum_diff_bet_treat_5= "Avg. Cum. Diff Between Treatments (5th Treatment)",
mean_cum_diff_bet_treat_6= "Avg. Cum. Diff Between Treatments (6th Treatment)",
mean_cum_diff_bet_treat_7= "Avg. Cum. Diff Between Treatments (7th Treatment)",
mean_cum_diff_bet_treat_8= "Avg. Cum. Diff Between Treatments (8th Treatment)",
mean_cum_diff_bet_treat_9= "Avg. Cum. Diff Between Treatments (9th Treatment)",
mean_cum_diff_bet_treat_10="Avg. Cum. Diff Between Treatments (10th Treatment)"
)
#  bind_rows(data.frame("code"=c("fech_egres_imp", 
##                                "dias_trat_imp",
#                                "dias_trat_alta_temprana_imp",
#                                "motivodeegreso_mod_imp"), "label"=c("Date of Discharge (Imputed)", 
##                                                                     "Days of Treatment (Imputed)",
#                                                                     "Days of Treatment for Early Withdrawal (Imputed)",
#                                                                     "Cause of Discharge w/ Early or Late Withdrawal (Imputed)"))) %>% 

time_after_dedup4<-Sys.time()

paste0("Time in markdown: ");time_after_dedup4-time_before_dedup4

[1] "Time in markdown: "

Time difference of 28.28899 mins
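
The per-treatment labels above (suffixes `_1` to `_10`) follow a regular pattern, so they could also be generated rather than typed by hand, which avoids slips in the ordinal suffixes. The following is only a minimal sketch in base R: `ordinal()` and `make_treatment_labels()` are hypothetical helpers that are not part of the original script, and the resulting named vector would still need to be passed to whatever labelling call the script uses.

```r
# Hypothetical helper (not in the original script): English ordinal for an integer
ordinal <- function(n) {
  suffix <- ifelse((n %% 100) %in% 11:13, "th",
            ifelse(n %% 10 == 1, "st",
            ifelse(n %% 10 == 2, "nd",
            ifelse(n %% 10 == 3, "rd", "th"))))
  paste0(n, suffix)
}

# Hypothetical helper: named vector of labels for var_1 ... var_n_treat
make_treatment_labels <- function(var, label, n_treat = 10) {
  idx <- seq_len(n_treat)
  stats::setNames(
    paste0(label, " (", ordinal(idx), " Treatment)"),
    paste0(var, "_", idx)
  )
}

# Two of the label families listed above, built from their stems
labs <- c(
  make_treatment_labels("dias_treat_imp_sin_na", "Days of Treatment"),
  make_treatment_labels("diff_bet_treat", "Diff Between Treatments")
)

labs[["dias_treat_imp_sin_na_3"]]
# "Days of Treatment (3rd Treatment)"
```

Generating the labels from a single stem keeps the ten variants of each variable consistent with one another.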

CONS_C1_df_dup_JUL_2020%>%
   dplyr::arrange(hash_key, desc(fech_ing))%>% 
   rio::export(file = paste0(path,"/CONS_C1_df_dup_JUL_2020.dta"))

#save.image("G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/8.RData")
save.image(paste0(path,"/7.RData"))

# no_mostrar_nunca=0
# if(no_mostrar_nunca==1){
# df_def<-
# data.frame(cbind(var_name= names(codebook_data),var_def=data.table(codebook::var_label(codebook_data), keep.rownames = T), type=data.table(sapply(codebook_data, class)),can_be_na=data.table(rep(FALSE,length(names(codebook_data))))))%>%
#   dplyr::rename("var_def"="var_def.V1","type"="type.V1","can_be_na"="can_be_na.V1")
# }    
# 
# #####to_export_labels
# table_labels<-
#   tibble::rownames_to_column(data.frame(Hmisc::label(CONS_C1_df_dup_JUL_2020)))%>% data.frame() %>%
#   dplyr::rename("code" = !!names(.[1]), "label" = !!names(.[2]))%>% data.frame()%>%
#   dplyr::mutate(first= "cap label variable")%>%
#   dplyr::mutate(final= paste0(first, " ",code,' "',label,'"'))%>%
#   dplyr::select(-code,-label,-first)%>%
#   rbind('cap save "G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/CONS_C1_df_dup_JUL_2020.dta", replace')%>%
#   rbind('cap drop id id_mod nombre_centro consentimiento_informado')%>%
#   rbind('cap save "G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/CONS_C1_df_dup_JUL_2020_exp.dta", replace')
# 
# table_labels<-
#   data.frame(final='use "G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/CONS_C1_df_dup_JUL_2020.dta", clear')%>%
#   rbind(table_labels)%>%
#   rename("*final"="final") 
#   #write.csv2(table_labels,"__labels_to_stata_C1_jun_2020.do",row.names =F)
#   write.table(table_labels, file = "G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/SUD_CL/_label_var_to_stata.do", sep = "",row.names = FALSE, quote = FALSE,fileEncoding="UTF-8")

 # dplyr::filter(label!="") %>%

#Add new variables to the codebook
#Make the codebook
#Add processes to STROBE
#Put the OLR; the other models would go above it.

# Build one Stata "label variable" command per labelled column, then append
# the save/drop commands that close the do-file.
export_lab_stata<-
  tibble::rownames_to_column(data.frame(Hmisc::label(CONS_C1_df_dup_JUL_2020)))%>% data.frame() %>%
  dplyr::rename("code" = !!names(.[1]), "label" = !!names(.[2]))%>% data.frame()%>%
  dplyr::mutate(first= "cap noi label variable")%>%
  dplyr::mutate(final= paste0(first, " ",code,' "',label,'"'))%>%
  dplyr::select(-code,-label,-first)%>%
  # The column name becomes the first (commented) line of the do-file
  dplyr::rename("*clear all"="final") %>% 
  rbind(paste0('cap noi save "', gsub('/', '\\', path, fixed=TRUE),'\\CONS_C1_df_dup_JUL_2020.dta", replace'))%>%
  # Stata's drop fails as a whole if any listed variable is missing,
  # so the shorter list presumably acts as a fallback
  rbind('cap noi drop id id_mod nombre_centro consentimiento_informado')%>%
  rbind('cap noi drop id id_mod nombre_centro')%>%
  rbind(paste0('cap noi save "', gsub('/', '\\', path, fixed=TRUE),'\\CONS_C1_df_dup_JUL_2020_exp.dta", replace'))

# Prepend the "use" command and show the resulting do-file as a scrollable table
rbind(paste0('cap noi use "', gsub('/', '\\', path, fixed=TRUE),'\\CONS_C1_df_dup_JUL_2020.dta", clear'),export_lab_stata) %>% knitr::kable("html") %>% 
    kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size =10) %>% 
  kableExtra::scroll_box(width = "100%", height = "375px")
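
The chunk above turns each variable label into a Stata `label variable` command. As a minimal sketch of the same idea in base R, the example below builds the commands from a toy label vector, flags labels longer than the 80 characters that Stata allows for a variable label, and writes the commands to a temporary do-file. `toy_labels`, `to_stata_label_cmds()`, `to_windows_path()` and the temporary file are assumptions introduced for illustration, not objects from the original script.

```r
# Toy labels for illustration only; the real ones come from Hmisc::label() on the dataset
toy_labels <- c(
  n_hash = "Total Number of Treatments by User",
  motivodeegreso_mod_imp_1 = paste(
    "Cause of Discharge (with late and early withdrawal)",
    "(Imputed KNN & Logic) of the Last Entry(b) (1st Treatment)"
  )
)

# Hypothetical helper: one 'label variable' command per named label
to_stata_label_cmds <- function(labels) {
  paste0("cap noi label variable ", names(labels), ' "', labels, '"')
}

# Hypothetical helper: forward slashes to backslashes, as done with gsub() above
to_windows_path <- function(p) gsub("/", "\\", p, fixed = TRUE)

cmds <- to_stata_label_cmds(toy_labels)

# Names of the labels that exceed Stata's 80-character limit for variable labels
too_long <- names(toy_labels)[nchar(toy_labels) > 80]

# Write the commands to a temporary do-file (a stand-in for the real target file)
do_file <- tempfile(fileext = ".do")
writeLines(cmds, do_file)
```

Checking `too_long` before exporting shows which bilingual labels Stata would not keep in full, so they can be shortened on the R side if needed.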

*clear all
cap noi use “C:Fondecytunidad (github)_C1_df_dup_JUL_2020.dta”, clear
cap noi label variable row “Numerador de los eventos presentes en la Base de Datos (Último registro)/Events in the Dataset (Last Entry)”
cap noi label variable row_cont_entries “Numerador de los eventos presentes en la Base de Datos(*)/Events in the Dataset(*)”
cap noi label variable hash_key “Codificación del RUN/Masked Identifier (RUN)”
cap noi label variable hash_rut_completo “HASH alternativo, en el escenario en que se asuma que el individuo al que se le codificó el RUN presente mayor edad/Alternative HASH-Key”
cap noi label variable id “Codigo Identificación de SENDA/SENDA ID”
cap noi label variable id_mod “ID de SENDA para Presentación en Página Web (enmascara caracteres 5 y 6)/SENDA ID (mask characters 5 & 6)”
cap noi label variable fech_ing “Fecha de Ingreso a Tratamiento (Primera Entrada)/Date of Admission to Treatment (First Entry)”
cap noi label variable fech_egres_imp “Fecha de Egreso (Imputados KNN & Lógico) del Último Registro(b)/Date of Discharge (Imputed KNN & Logic) of the Last Entry(b)”
cap noi label variable tipo_de_plan_2_largest_treat “Tipo de Plan del Registro Más Largo entre entradas intermedias(f)/Type of Plan of the Largest Entry Among Intermediate Entries(f)”
cap noi label variable tipo_de_plan_2_concat_a “Tipo de Plan(*)/Type of Plan(*)”
cap noi label variable tipo_de_programa_2 “Tipo de Programa del Registro Más Largo entre Entradas Intermedias/Type of Program of the Largest Entry Among Intermediate Entries”
cap noi label variable id_centro “ID de Centro(b)/Treatment center ID(b)”
cap noi label variable nombre_centro “Nombre del Centro de Tratamiento(*)/Treatment Center(*)”
cap noi label variable id_centro_concat_a “ID de Centro(*)/Treatment center ID(*)”
cap noi label variable tipo_centro “Tipo de Centro del Último Registro/Type of Center of the Last Entry”
cap noi label variable servicio_de_salud “Servicio de Salud(*)/Health Service(*)”
cap noi label variable senda “SENDA del Último Registro/SENDA of the Last Entry”
cap noi label variable numero_de_hijos_mod “Número de Hijos (Valor Max.)/Number of Children (Max. Value)”
cap noi label variable num_hijos_trat_res_mod “Número de Hijos para Ingreso a Tratamiento Residencial del Último Registro/Number of Children to Residential Treatment of the Last Entry”
cap noi label variable tipo_centro_derivacion “Tipo de Centro al que el Usuario es Derivado del Último Registro(b)/Type of Center of Derivation of the Last Entry(b)”
cap noi label variable macrozona “Macrozona del Centro del Último Registro(b)/Macrozones of the Center of the Last Entry(b)”
cap noi label variable nombre_region “Región del Centro del Último Registro(b)/Chilean Region of the Center of the Last Entry(b)”
cap noi label variable comuna_residencia_cod “Comuna de Residencia del Último Registro(b)/Municipality or District of Residence of the Last Entry(b)”
cap noi label variable fecha_ingreso_a_convenio_senda “Fecha de Ingreso a Convenio SENDA (aún no formateada como fecha) (Primera Entrada)/Date of Admission to SENDA Agreement (First Entry)”
cap noi label variable identidad_de_genero “Identidad de Género (Último Registro)(b)/Gender Identity (Last Entry)(b)”
cap noi label variable edad_al_ing “Edad a la Fecha de Ingreso a Tratamiento (numérico continuo) (Primera Entrada)/Age at Admission to Treatment (First Entry)”
cap noi label variable origen_ingreso_mod “Origen de Ingreso (Primera Entrada)/Motive of Admission to Treatment (First Entry)”
cap noi label variable x_se_trata_mujer_emb “Mujer Embarazada al Ingreso (d)/Pregnant at Admission (d)”
cap noi label variable compromiso_biopsicosocial “Compromiso Biopsicosocial(d)/Biopsychosocial Involvement(d)”
cap noi label variable dg_global_nec_int_soc_or “Diagnóstico Global de Necesidades de Integración Social (Al Ingreso)(d)/Global Diagnosis of Social Integration (At Admission)(d)”
cap noi label variable dg_nec_int_soc_cap_hum_or “Diagnóstico de Necesidades de Integración Social en Capital Humano (Al Ingreso)(d)/Global Diagnosis of Social Integration in Human Capital (At Admission)(d)”
cap noi label variable dg_nec_int_soc_cap_fis_or “Diagnóstico de Necesidades de Integración Social en Capital Físico (Al Ingreso)(d)/Global Diagnosis of Social Integration in Physical Capital (At Admission)(d)”
cap noi label variable dg_nec_int_soc_cap_soc_or “Diagnóstico de Necesidades de Integración Social en Capital Social (Al Ingreso)(d)/Global Diagnosis of Social Integration in Social Capital (At Admission)(d)”
cap noi label variable usuario_tribunal_trat_droga “Usuario de modalidad Tribunales de Tratamiento de Drogas(d)/User of Drug Treatment Courts Modality(d)”
cap noi label variable evaluacindelprocesoteraputico “Evaluación del Proceso Terapéutico(d)/Evaluation of the Therapeutic Process(d)”
cap noi label variable eva_consumo “Evaluación al Egreso Respecto al Patrón de consumo(d)/Evaluation at Discharge regarding to Consumption Pattern(d)”
cap noi label variable eva_fam “Evaluación al Egreso Respecto a Situación Familiar(d)/Evaluation at Discharge regarding to Family Situation(d)”
cap noi label variable eva_relinterp “Evaluación al Egreso Respecto a Relaciones Interpersonales(d)/Evaluation at Discharge regarding to Interpersonal Relations(d)”
cap noi label variable eva_ocupacion “Evaluación al Egreso Respecto a Situación Ocupacional(d)/Evaluation at Discharge regarding to Occupational Status(d)”
cap noi label variable eva_sm “Evaluación al Egreso Respecto a Salud Mental(d)/Evaluation at Discharge regarding to Mental Health(d)”
cap noi label variable eva_fisica “Evaluación al Egreso Respecto a Salud Física(d)/Evaluation at Discharge regarding to Physical Health(d)”
cap noi label variable eva_transgnorma “Evaluación al Egreso Respecto a Trasgresión a la Norma Social(d)/Evaluation at Discharge regarding to Transgression to the Norm(d)”
cap noi label variable dg_global_nec_int_soc_or_1 “Diagnóstico Global de Necesidades de Integración Social (Al Egreso)(d)/Global Diagnosis of Social Integration (At Discharge)(d)”
cap noi label variable dg_nec_int_soc_cap_hum_or_1 “Diagnóstico de Necesidades de Integración Social en Capital Humano (Al Egreso)(d)/Global Diagnosis of Social Integration in Human Capital (At Discharge)(d)”
cap noi label variable dg_nec_int_soc_cap_fis_or_1 “Diagnóstico de Necesidades de Integración Social en Capital Físico (Al Egreso)(d)/Global Diagnosis of Social Integration in Physical Capital (At Discharge)(d)”
cap noi label variable dg_nec_int_soc_cap_soc_or_1 “Diagnóstico de Necesidades de Integración Social en Capital Social (Al Egreso)(d)/Global Diagnosis of Social Integration in Social Capital (At Discharge)(d)”
cap noi label variable tiene_menores_de_edad_a_cargo “Menores de Edad A Cargo(d)/Minor Dependants(d)”
cap noi label variable ha_estado_embarazada_egreso “¿Ha estado embarazada? (al Egreso)(d)/Have you been Pregnant (at Discharge)(d)”
cap noi label variable discapacidad “Presenta Discapacidad(d)/Disability(d)”
cap noi label variable opcion_discapacidad “Origen de Discapacidad(d)/Cause of Disability(d)”
cap noi label variable escolaridad “Escolaridad: Nivel Educacional(d)/Educational Attainment(d)”
cap noi label variable edad_al_ing_grupos “Edad a la Fecha de Ingreso a Tratamiento en Grupos(c)/Age at Admission to Treatment In Groups(c)”
cap noi label variable nacionalidad “Nacionalidad/Nationality”
cap noi label variable sexo_2 “Sexo Usuario/Sex of User”
cap noi label variable embarazo “Embarazo al Ingreso(c)/Pregnant at Admission(c)”
cap noi label variable fech_nac “Fecha de Nacimiento/Date of Birth”
cap noi label variable edad_ini_cons “Edad de Inicio de Consumo/Age of Onset of Drug Use”
cap noi label variable edad_ini_sus_prin “Edad de Inicio de Consumo Sustancia Principal/Age of Onset of Drug Use of Primary Substance”
cap noi label variable estado_conyugal_2 “Estado Conyugal/Marital Status”
cap noi label variable edad_grupos “Edad agrupada/Age in groups”
cap noi label variable freq_cons_sus_prin “Frecuencia de Consumo de la Sustancia Principal (30 días previos a la admisión)(f)/Frequency of Consumption of the Primary or Main Substance (30 days previous to admission)(f)”
cap noi label variable via_adm_sus_prin_act “Vía de Administración de la Sustancia Principal (Se aplicaron criterios de limpieza)(f)/Route of Administration of the Primary or Main Substance (Tidy)(f)”
cap noi label variable etnia_cor “Etnia/Ethnic Group”
cap noi label variable nacionalidad_2 “Segunda Nacionalidad/Second Nationality”
cap noi label variable etnia_cor_2 “Etnia (2)/Second Ethnic Group”
cap noi label variable sus_ini_2_mod “Segunda Sustancia de Inicio(Sólo más frecuentes)/Second Starting Substance”
cap noi label variable sus_ini_3_mod “Tercera Sustancia de Inicio(Sólo más frecuentes)/Third Starting Substance”
cap noi label variable sus_ini_mod “Sustancia de Inicio (Sólo más frecuentes)/Starting Substance (Only more frequent)”
cap noi label variable con_quien_vive “Persona con la que vive el Usuario(f)/People that Share Household with the User (Cohabitation Status)(f)”
cap noi label variable estatus_ocupacional “Condición Ocupacional(f)/Occupational Status(f)”
cap noi label variable cat_ocupacional “Categoría Ocupacional(f)/Occupational Category(f)”
cap noi label variable sus_principal_mod “Sustancia Principal de Consumo (Sólo más frecuentes)(f)/Primary or Main Substance of Consumption at Admission (Only more frequent)(f)”
cap noi label variable tipo_de_vivienda_mod “Tipo de Vivienda(f)/Type of Housing(f)”
cap noi label variable tenencia_de_la_vivienda_mod “Tenencia de la Vivienda(f)/Tenure status of Households(f)”
cap noi label variable rubro_trabaja_mod “Rubro de Trabajo(f)/Area of Work(f)”
cap noi label variable otras_sus1_mod “Otras Sustancias (1)(Sólo más frecuentes)(f)/Other Substances (1)(Only more frequent)(f)”
cap noi label variable otras_sus2_mod “Otras Sustancias (2)(Sólo más frecuentes)(f)/Other Substances (2)(Only more frequent)(f)”
cap noi label variable otras_sus3_mod “Otras Sustancias (3)(Sólo más frecuentes)(f)/Other Substances (3)(Only more frequent)(f)”
cap noi label variable dg_trs_cons_sus_or “Diagnóstico de Trastorno por Consumo de Sustancias(d)/Diagnosis of Substance Use Disorder(d)”
cap noi label variable diagnostico_trs_fisico “Diagnóstico de Trastorno Físico(g)/Diagnosis of Physical Disorder(g)”
cap noi label variable otros_probl_at_sm_or “Otros Problemas de Atención Vinculados a Salud Mental(g)/Other problems linked to Mental Health(g)”
cap noi label variable ano_bd_first “Año de la Base de Datos(c)/Year of the Dataset (Source)(c)”
cap noi label variable ano_bd_last “Año de la Base de Datos(b)/Year of the Dataset (Source)(b)”
cap noi label variable obs “Observaciones al Proceso de Limpieza y Estandarización de Casos(e)/Observations to the Process of Data Tidying & Standardization(e)”
cap noi label variable obs_concat_a “Observaciones al Proceso de Limpieza y Estandarización de Casos(*)/Observations to the Process of Data Tidying & Standardization(*)”
cap noi label variable rn_common_treats2 “Cuenta de Entradas Comunes(b)/Count of Common Entries(b)”
cap noi label variable concat_hash_id_treatments “Combination of User & Distinct Entries”
cap noi label variable at_least_one_cont_entry “Casos de Usuarios con más de una entrada después de otra/Cases of users with more than one entry after another one”
cap noi label variable senda_concat_a “SENDA(*)/SENDA(*)”
cap noi label variable tipo_centro_concat_a “Tipo de Centro(*)/Type of Center(*)”
cap noi label variable fech_ing_num “Fecha de Ingreso a Tratamiento (Numérico)(c)/Date of Admission to Treatment (Numeric)(c)”
cap noi label variable fech_egres_num “Fecha de Egreso (Imputados KNN & Lógico)(Numérico)(b)/Date of Discharge (Imputed KNN & Logic)(Numeric)(b)”
cap noi label variable fech_ing_next_treat “Fecha de Ingreso a Tratamiento (Numérico)(c) del Tratamiento Posterior/Date of Admission to Treatment (Numeric)(c) of the Next Treatment”
cap noi label variable id_centro_sig_trat “ID del Centro del Tratamiento Posterior/Center ID of the Next Treatment”
cap noi label variable tipo_plan_sig_trat “Tipo de Plan del Tratamiento Posterior/Type of Plan of the Next Treatment”
cap noi label variable tipo_programa_sig_trat “Tipo de Programa del Tratamiento Posterior/Type of Program of the Next Treatment”
cap noi label variable senda_sig_trat “SENDA del Tratamiento Posterior/SENDA of the Next Treatment”
cap noi label variable menor_60_dias_diff “Menor a 60 días de diferencia con el Tratamiento Posterior/Less than 60 days of difference from the Next Treatment”
cap noi label variable menor_45_dias_diff “Menor a 45 días de diferencia con el Tratamiento Posterior/Less than 45 days of difference from the Next Treatment”
cap noi label variable motivoegreso_derivacion “Motivo de Egreso= Derivación(b)/Cause of Discharge= Referral(b)”
cap noi label variable abandono_temprano “Abandono temprano(<3 meses)/ Early Drop-out(<3 months)”
cap noi label variable obs_cambios “Cambios del tratamiento en comparación al Tratamiento Posterior/Changes in treatment compared to the Next Treatment”
cap noi label variable obs_cambios_ninguno “Sin cambios del tratamiento en comparación al Tratamiento Posterior/No changes in treatment compared to the Next Treatment”
cap noi label variable obs_cambios_num “Recuento de cambios del tratamiento en comparación al Tratamiento Posterior/Count of changes in treatment compared to the Next Treatment”
cap noi label variable obs_cambios_fac “Recuento de cambios del tratamiento en comparación al Tratamiento Posterior(factor)/Count of changes in treatment compared to the Next Treatment(factor)”
cap noi label variable edad_ini_sus_prin_grupos “Edad de Inicio de Consumo Sustancia Principal (en Grupos)/Age of Onset of Drug Use of Primary Substance (in Groups)”
cap noi label variable hash_key_sex_program “Usuarios a los que se le ha cambiado el sexo de acuerdo al tipo de plan/Users whose sex was changed according to the type of plan”
cap noi label variable centro_muj “ID de centro que alude a un centro específico para mujeres/Center ID alludes to a women-specific center”
cap noi label variable cie_10 “Diagnóstico CIE-10 (1 o más)/Psychiatric Diagnoses (ICD-10)(one or more)”
cap noi label variable dsm_iv “Diagnóstico DSM-IV (1 o más)/Psychiatric Diagnoses (DSM-IV)(one or more)”
cap noi label variable con_quien_vive_rec “Persona con la que vive el Usuario (Recodificada)(f)/People that Share Household with the User (Cohabitation Status)(Recoded)(f)”
cap noi label variable cnt_mod_dsm_iv_or “Recuento de Diagnóstico DSM-IV/Count of Psychiatric Diagnoses (DSM-IV)”
cap noi label variable cnt_mod_cie_10_or “Recuento de Diagnóstico CIE-10/Count of Psychiatric Diagnoses (ICD-10)”
cap noi label variable cnt_diagnostico_trs_fisico “Recuento de Diagnóstico de Trastorno Físico/Count of Physical Disorder”
cap noi label variable dg_trs_psiq_sub_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification)(g)”
cap noi label variable dg_trs_psiq_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria(g)”
cap noi label variable dg_trs_psiq_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria(g)”
cap noi label variable x2_dg_trs_psiq_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (2)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (2)(g)”
cap noi label variable x3_dg_trs_psiq_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (3)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (3)(g)”
cap noi label variable x4_dg_trs_psiq_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (4)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (4)(g)”
cap noi label variable x5_dg_trs_psiq_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (5)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (5)(g)”
cap noi label variable x2_dg_trs_psiq_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (2)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (2)(g)”
cap noi label variable x3_dg_trs_psiq_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (3)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (3)(g)”
cap noi label variable x4_dg_trs_psiq_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (4)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (4)(g)”
cap noi label variable dg_trs_psiq_sub_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification)(g)”
cap noi label variable x2_dg_trs_psiq_sub_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion) (2)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification) (2)(g)”
cap noi label variable x3_dg_trs_psiq_sub_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion) (3)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification) (3)(g)”
cap noi label variable x4_dg_trs_psiq_sub_cie_10_or “Diagnóstico de Trastorno Psiquiátrico, Criterios CIE-10 (Subclasificacion)(4)(g)/Diagnosis of Psychiatric Disorders, CIE-10 criteria (subclassification)(4)(g)”
cap noi label variable x2_dg_trs_psiq_sub_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion) (2)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification) (2)(g)”
cap noi label variable x3_dg_trs_psiq_sub_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion) (3)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification) (3)(g)”
cap noi label variable x4_dg_trs_psiq_sub_dsm_iv_or “Diagnóstico de Trastorno Psiquiátrico, Criterios DSM IV (Subclasificacion)(4)(g)/Diagnosis of Psychiatric Disorders, DSM-IV criteria (sub-classification)(4)(g)”
cap noi label variable cnt_otros_probl_at_sm_or “Recuento de Otros Problemas de Atención Vinculados a Salud Mental/Count of Other problems linked to Mental Health”
cap noi label variable escolaridad_rec “Escolaridad: Nivel Educacional(d) Normalizado a Progresión de Tratamientos/Educational Attainment(d) & Normalized Following Progression of Treatments”
cap noi label variable tipo_de_plan_2 “Tipo de Plan del Último Registro/Type of Plan of the Last Entry”
cap noi label variable motivodeegreso_mod_imp “Motivo de Egreso (con abandono temprano y tardío)(Imputados KNN & Lógico) del Último Registro(b)/Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b)”
cap noi label variable rn_hash “Número de Tratamientos por Usuario (menor, tratamiento más antiguo)/Number of Treatments by User (less, older treatment)”
cap noi label variable n_hash “Número total de Tratamientos por Usuario/Total Number of Treatments by User”
cap noi label variable cum_dias_trat_sin_na “Suma acumulada de Días de Tratamiento por Usuario/Cumulative Days of Treatment by User”
cap noi label variable mean_cum_dias_trat_sin_na “Promedio acumulado de Días de Tratamiento por Usuario/Cumulative Average Days of Treatment by User”
cap noi label variable cum_diff_bet_treat “Suma acumulada de Diferencia en Días con Tratamiento Siguiente por Usuario/Cumulative sum of Days of difference between the Next Treatment by User”
cap noi label variable mean_cum_diff_bet_treat “Promedio acumulado de Diferencia en Días entre Tratamientos por Usuario/Cumulative Average Days of Differences Between Treatments By User”
cap noi label variable tipo_de_plan_2_1 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (1st Treatment)"
cap noi label variable tipo_de_plan_2_2 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (2nd Treatment)"
cap noi label variable tipo_de_plan_2_3 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (3rd Treatment)"
cap noi label variable tipo_de_plan_2_4 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (4th Treatment)"
cap noi label variable tipo_de_plan_2_5 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (5th Treatment)"
cap noi label variable tipo_de_plan_2_6 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (6th Treatment)"
cap noi label variable tipo_de_plan_2_7 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (7th Treatment)"
cap noi label variable tipo_de_plan_2_8 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (8th Treatment)"
cap noi label variable tipo_de_plan_2_9 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (9th Treatment)"
cap noi label variable tipo_de_plan_2_10 "Tipo de Plan del Último Registro/Type of Plan of the Last Entry (10th Treatment)"
cap noi label variable motivodeegreso_mod_imp_1 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (1st Treatment)"
cap noi label variable motivodeegreso_mod_imp_2 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (2nd Treatment)"
cap noi label variable motivodeegreso_mod_imp_3 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (3rd Treatment)"
cap noi label variable motivodeegreso_mod_imp_4 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (4th Treatment)"
cap noi label variable motivodeegreso_mod_imp_5 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (5th Treatment)"
cap noi label variable motivodeegreso_mod_imp_6 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (6th Treatment)"
cap noi label variable motivodeegreso_mod_imp_7 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (7th Treatment)"
cap noi label variable motivodeegreso_mod_imp_8 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (8th Treatment)"
cap noi label variable motivodeegreso_mod_imp_9 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (9th Treatment)"
cap noi label variable motivodeegreso_mod_imp_10 "Cause of Discharge (with late and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (10th Treatment)"
cap noi label variable dias_treat_imp_sin_na_1 "Days of Treatment (1st Treatment)"
cap noi label variable dias_treat_imp_sin_na_2 "Days of Treatment (2nd Treatment)"
cap noi label variable dias_treat_imp_sin_na_3 "Days of Treatment (3rd Treatment)"
cap noi label variable dias_treat_imp_sin_na_4 "Days of Treatment (4th Treatment)"
cap noi label variable dias_treat_imp_sin_na_5 "Days of Treatment (5th Treatment)"
cap noi label variable dias_treat_imp_sin_na_6 "Days of Treatment (6th Treatment)"
cap noi label variable dias_treat_imp_sin_na_7 "Days of Treatment (7th Treatment)"
cap noi label variable dias_treat_imp_sin_na_8 "Days of Treatment (8th Treatment)"
cap noi label variable dias_treat_imp_sin_na_9 "Days of Treatment (9th Treatment)"
cap noi label variable dias_treat_imp_sin_na_10 "Days of Treatment (10th Treatment)"
cap noi label variable diff_bet_treat_1 "Diff Between Treatments (1st Treatment)"
cap noi label variable diff_bet_treat_2 "Diff Between Treatments (2nd Treatment)"
cap noi label variable diff_bet_treat_3 "Diff Between Treatments (3rd Treatment)"
cap noi label variable diff_bet_treat_4 "Diff Between Treatments (4th Treatment)"
cap noi label variable diff_bet_treat_5 "Diff Between Treatments (5th Treatment)"
cap noi label variable diff_bet_treat_6 "Diff Between Treatments (6th Treatment)"
cap noi label variable diff_bet_treat_7 "Diff Between Treatments (7th Treatment)"
cap noi label variable diff_bet_treat_8 "Diff Between Treatments (8th Treatment)"
cap noi label variable diff_bet_treat_9 "Diff Between Treatments (9th Treatment)"
cap noi label variable diff_bet_treat_10 "Diff Between Treatments (10th Treatment)"
cap noi label variable cum_dias_trat_sin_na_1 "Cum. Days of Treatment (1st Treatment)"
cap noi label variable cum_dias_trat_sin_na_2 "Cum. Days of Treatment (2nd Treatment)"
cap noi label variable cum_dias_trat_sin_na_3 "Cum. Days of Treatment (3rd Treatment)"
cap noi label variable cum_dias_trat_sin_na_4 "Cum. Days of Treatment (4th Treatment)"
cap noi label variable cum_dias_trat_sin_na_5 "Cum. Days of Treatment (5th Treatment)"
cap noi label variable cum_dias_trat_sin_na_6 "Cum. Days of Treatment (6th Treatment)"
cap noi label variable cum_dias_trat_sin_na_7 "Cum. Days of Treatment (7th Treatment)"
cap noi label variable cum_dias_trat_sin_na_8 "Cum. Days of Treatment (8th Treatment)"
cap noi label variable cum_dias_trat_sin_na_9 "Cum. Days of Treatment (9th Treatment)"
cap noi label variable cum_dias_trat_sin_na_10 "Cum. Days of Treatment (10th Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_1 "Avg. Cum. Days of Treatment (1st Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_2 "Avg. Cum. Days of Treatment (2nd Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_3 "Avg. Cum. Days of Treatment (3rd Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_4 "Avg. Cum. Days of Treatment (4th Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_5 "Avg. Cum. Days of Treatment (5th Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_6 "Avg. Cum. Days of Treatment (6th Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_7 "Avg. Cum. Days of Treatment (7th Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_8 "Avg. Cum. Days of Treatment (8th Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_9 "Avg. Cum. Days of Treatment (9th Treatment)"
cap noi label variable mean_cum_dias_trat_sin_na_10 "Avg. Cum. Days of Treatment (10th Treatment)"
cap noi label variable cum_diff_bet_treat_1 "Cum. Diff Between Treatments (1st Treatment)"
cap noi label variable cum_diff_bet_treat_2 "Cum. Diff Between Treatments (2nd Treatment)"
cap noi label variable cum_diff_bet_treat_3 "Cum. Diff Between Treatments (3rd Treatment)"
cap noi label variable cum_diff_bet_treat_4 "Cum. Diff Between Treatments (4th Treatment)"
cap noi label variable cum_diff_bet_treat_5 "Cum. Diff Between Treatments (5th Treatment)"
cap noi label variable cum_diff_bet_treat_6 "Cum. Diff Between Treatments (6th Treatment)"
cap noi label variable cum_diff_bet_treat_7 "Cum. Diff Between Treatments (7th Treatment)"
cap noi label variable cum_diff_bet_treat_8 "Cum. Diff Between Treatments (8th Treatment)"
cap noi label variable cum_diff_bet_treat_9 "Cum. Diff Between Treatments (9th Treatment)"
cap noi label variable cum_diff_bet_treat_10 "Cum. Diff Between Treatments (10th Treatment)"
cap noi label variable mean_cum_diff_bet_treat_1 "Avg. Cum. Diff Between Treatments (1st Treatment)"
cap noi label variable mean_cum_diff_bet_treat_2 "Avg. Cum. Diff Between Treatments (2nd Treatment)"
cap noi label variable mean_cum_diff_bet_treat_3 "Avg. Cum. Diff Between Treatments (3rd Treatment)"
cap noi label variable mean_cum_diff_bet_treat_4 "Avg. Cum. Diff Between Treatments (4th Treatment)"
cap noi label variable mean_cum_diff_bet_treat_5 "Avg. Cum. Diff Between Treatments (5th Treatment)"
cap noi label variable mean_cum_diff_bet_treat_6 "Avg. Cum. Diff Between Treatments (6th Treatment)"
cap noi label variable mean_cum_diff_bet_treat_7 "Avg. Cum. Diff Between Treatments (7th Treatment)"
cap noi label variable mean_cum_diff_bet_treat_8 "Avg. Cum. Diff Between Treatments (8th Treatment)"
cap noi label variable mean_cum_diff_bet_treat_9 "Avg. Cum. Diff Between Treatments (9th Treatment)"
cap noi label variable mean_cum_diff_bet_treat_10 "Avg. Cum. Diff Between Treatments (10th Treatment)"
cap noi label variable diff_bet_treat "Días de diferencia con el Tratamiento Posterior/Days of difference between the Next Treatment"
cap noi label variable dias_treat_imp_sin_na "Días de Tratamiento (valores perdidos en la fecha de egreso se reemplazaron por la diferencia con 2019-11-13)/Days of Treatment (missing dates of discharge were replaced with difference from 2019-11-13)"
cap noi save "C:Fondecytunidad (github)_C1_df_dup_JUL_2020.dta", replace
cap noi drop id id_mod nombre_centro consentimiento_informado
cap noi drop id id_mod nombre_centro
cap noi save "C:Fondecytunidad (github)_C1_df_dup_JUL_2020_exp.dta", replace

write.table(
  rbind(
    paste0('cap noi use "', gsub('/', '\\', path, fixed = TRUE), '\\CONS_C1_df_dup_JUL_2020.dta", clear'),
    export_lab_stata
  ),
  file = paste0(path, "/SUD_CL/_label_var_to_stata.do"),
  sep = "", row.names = FALSE, quote = FALSE, fileEncoding = "UTF-8"
)
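Stata reports "label truncated to 80 characters" for many of the labels written by this step. The following is a minimal sketch, not the code used in this project, of how the overly long labels could be listed in R before the do-file is written. The vector `labels` and the helpers `make_label_cmds()` and `too_long()` are hypothetical names introduced only for illustration. The check counts characters; some Stata versions may count bytes instead, so the result should be read as approximate.

```r
# Hypothetical example input: labels keyed by variable name
# (two labels taken from this page; the real object is export_lab_stata).
labels <- c(
  hash_key = "Codificación del RUN/Masked Identifier (RUN)",
  fech_ing = "Fecha de Ingreso a Tratamiento (Primera Entrada)/Date of Admission to Treatment (First Entry)"
)

# Limit reported by Stata in the notes of the log.
max_label_chars <- 80L

# Build one `label variable` command per variable,
# in the same shape as the lines of the do-file.
make_label_cmds <- function(labels) {
  paste0('cap noi label variable ', names(labels), ' "', labels, '"')
}

# Return the names of the variables whose label exceeds the limit.
too_long <- function(labels, limit = max_label_chars) {
  names(labels)[nchar(labels, type = "chars") > limit]
}

cmds <- make_label_cmds(labels)
print(cmds)
print(too_long(labels))
```

Labels flagged this way could be shortened by hand, or the full text could be kept elsewhere (for example in a codebook) while the variable label carries an abbreviated version.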

*should be in the same folder of the .Rmd to work
cap noi do _label_var_to_stata.do

. *should be in the same folder of the .Rmd to work
. cap noi do _label_var_to_stata.do

. *clear all
. cap noi use "C:\Users\CISS Fondecyt\Mi unidad\Alvacast\SISTRAT 2019 (github)\
> CONS_C1_df_dup_JUL_2020.dta", clear

. cap noi label variable row "Numerador de los eventos presentes en la Base de 
> Datos (Último registro)/Events in the Dataset (Last Entry)"
note: label truncated to 80 characters

. cap noi label variable row_cont_entries "Numerador de los eventos presentes e
> n la Base de Datos(*)/Events in the Dataset(*)"
note: label truncated to 80 characters

. cap noi label variable hash_key "Codificación del RUN/Masked Identifier (RUN)
> "

. cap noi label variable hash_rut_completo "HASH alternativo, en el escenario e
> n que se asuma que el individuo al que se le codificó el RUN presente mayor e
> dad/Alternative HASH-Key"
note: label truncated to 80 characters

. cap noi label variable id "Codigo Identificación de SENDA/SENDA ID"

. cap noi label variable id_mod "ID de SENDA para Presentación en Página Web (e
> nmascara caracteres 5 y 6)/SENDA ID (mask characters 5 & 6)"
note: label truncated to 80 characters

. cap noi label variable fech_ing "Fecha de Ingreso a Tratamiento (Primera Entr
> ada)/Date of Admission to Treatment (First Entry)"
note: label truncated to 80 characters

. cap noi label variable fech_egres_imp "Fecha de Egreso (Imputados KNN & Lógic
> o) del Último Registro(b)/Date of Discharge (Imputed KNN & Logic) of the Last
>  Entry(b)"
note: label truncated to 80 characters

. cap noi label variable tipo_de_plan_2_largest_treat "Tipo de Plan del Registr
> o Más Largo entre entradas intermedias(f)/Type of Plan of the Largest Entry A
> mong Intermediate Entries(f)"
note: label truncated to 80 characters

. cap noi label variable tipo_de_plan_2_concat_a "Tipo de Plan(*)/Type of Plan(
> *)"

. cap noi label variable tipo_de_programa_2 "Tipo de Programa del Registro Más 
> Largo entre Entradas Intermedias/Type of Program of the Largest Entry Among I
> ntermediate Entries"
note: label truncated to 80 characters

. cap noi label variable id_centro "ID de Centro(b)/Treatment center ID(b)"

. cap noi label variable nombre_centro "Nombre del Centro de Tratamiento(*)/Tre
> atment Center(*)"

. cap noi label variable id_centro_concat_a "ID de Centro(*)/Treatment center I
> D(*)"

. cap noi label variable tipo_centro "Tipo de Centro del Último Registro/Type o
> f Center of the Last Entry"

. cap noi label variable servicio_de_salud "Servicio de Salud(*)/Health Service
> (*)"

. cap noi label variable senda "SENDA del Último Registro/SENDA of the Last Ent
> ry"

. cap noi label variable numero_de_hijos_mod "Número de Hijos (Valor Max.)/Numb
> er of Children (Max. Value)"

. cap noi label variable num_hijos_trat_res_mod "Número de Hijos para Ingreso a
>  Tratamiento Residencial del Último Registro/Number of Children to Residentia
> l Treatment of the Last Entry"
note: label truncated to 80 characters

. cap noi label variable tipo_centro_derivacion "Tipo de Centro al que el Usuar
> io es Derivado del Último Registro(b)/Type of Center of Derivation of the Las
> t Entry(b)"
note: label truncated to 80 characters

. cap noi label variable macrozona "Macrozona del Centro del Último Registro(b)
> /Macrozones of the Center of the Last Entry(b)"
note: label truncated to 80 characters

. cap noi label variable nombre_region "Región del Centro del Último Registro(b
> )/Chilean Region of the Center of the Last Entry(b)"
note: label truncated to 80 characters

. cap noi label variable comuna_residencia_cod "Comuna de Residencia del Último
>  Registro(b)/Municipality or District of Residence of the Last Entry(b)"
note: label truncated to 80 characters

. cap noi label variable fecha_ingreso_a_convenio_senda "Fecha de Ingreso a Con
> venio SENDA (aún no formateada como fecha) (Primera Entrada)/Date of Admissio
> n to SENDA Agreement (First Entry)"
note: label truncated to 80 characters

. cap noi label variable identidad_de_genero "Identidad de Género (Último Regis
> tro)(b)/Gender Identity (Last Entry)(b)"

. cap noi label variable edad_al_ing "Edad a la Fecha de Ingreso a Tratamiento 
> (numérico continuo) (Primera Entrada)/Age at Admission to Treatment (First En
> try)"
note: label truncated to 80 characters

. cap noi label variable origen_ingreso_mod "Origen de Ingreso (Primera Entrada
> )/Motive of Admission to Treatment (First Entry)"
note: label truncated to 80 characters

. cap noi label variable x_se_trata_mujer_emb "Mujer Embarazada al Ingreso (d)/
> Pregnant at Admission (d)"

. cap noi label variable compromiso_biopsicosocial "Compromiso Biopsicosocial(d
> )/Biopsychosocial Involvement(d)"

. cap noi label variable dg_global_nec_int_soc_or "Diagnóstico Global de Necesi
> dades de Integración Social (Al Ingreso)(d)/Global Diagnosis of Social Integr
> ation (At Admission)(d)"
note: label truncated to 80 characters

. cap noi label variable dg_nec_int_soc_cap_hum_or "Diagnóstico de Necesidades 
> de Integración Social en Capital Humano (Al Ingreso)(d)/Global Diagnosis of S
> ocial Integration in Human Capital (At Admission)(d)"
note: label truncated to 80 characters

. cap noi label variable dg_nec_int_soc_cap_fis_or "Diagnóstico de Necesidades 
> de Integración Social en Capital Físico (Al Ingreso)(d)/Global Diagnosis of S
> ocial Integration in Physical Capital (At Admission)(d)"
note: label truncated to 80 characters

. cap noi label variable dg_nec_int_soc_cap_soc_or "Diagnóstico de Necesidades 
> de Integración Social en Capital Social (Al Ingreso)(d)/Global Diagnosis of S
> ocial Integration in Social Capital (At Admission)(d)"
note: label truncated to 80 characters

. cap noi label variable usuario_tribunal_trat_droga "Usuario de modalidad Trib
> unales de Tratamiento de Drogas(d)/User of Drug Treatment Courts Modality(d)"
note: label truncated to 80 characters

. cap noi label variable evaluacindelprocesoteraputico "Evaluación del Proceso 
> Terapéutico(d)/Evaluation of the Therapeutic Process(d)"

. cap noi label variable eva_consumo "Evaluación al Egreso Respecto al Patrón d
> e consumo(d)/Evaluation at Discharge regarding to Consumption Pattern(d)"
note: label truncated to 80 characters

. cap noi label variable eva_fam "Evaluación al Egreso Respecto a Situación Fam
> iliar(d)/Evaluation at Discharge regarding to Family Situation(d)"
note: label truncated to 80 characters

. cap noi label variable eva_relinterp "Evaluación al Egreso Respecto a Relacio
> nes Interpersonales(d)/Evaluation at Discharge regarding to Interpersonal Rel
> ations(d)"
note: label truncated to 80 characters

. cap noi label variable eva_ocupacion "Evaluación al Egreso Respecto a Situaci
> ón Ocupacional(d)/Evaluation at Discharge regarding to Occupational Status(d)
> "
note: label truncated to 80 characters

. cap noi label variable eva_sm "Evaluación al Egreso Respecto a Salud Mental(d
> )/Evaluation at Discharge regarding to Mental Health(d)"
note: label truncated to 80 characters

. cap noi label variable eva_fisica "Evaluación al Egreso Respecto a Salud Físi
> ca(d)/Evaluation at Discharge regarding to Physical Health(d)"
note: label truncated to 80 characters

. cap noi label variable eva_transgnorma "Evaluación al Egreso Respecto a Trasg
> resión a la Norma Social(d)/Evaluation at Discharge regarding to Transgressio
> n to the Norm(d)"
note: label truncated to 80 characters

. cap noi label variable dg_global_nec_int_soc_or_1 "Diagnóstico Global de Nece
> sidades de Integración Social (Al Egreso)(d)/Global Diagnosis of Social Integ
> ration (At Discharge)(d)"
note: label truncated to 80 characters

. cap noi label variable dg_nec_int_soc_cap_hum_or_1 "Diagnóstico de Necesidade
> s de Integración Social en Capital Humano (Al Egreso)(d)/Global Diagnosis of 
> Social Integration in Human Capital (At Discharge)(d)"
note: label truncated to 80 characters

. cap noi label variable dg_nec_int_soc_cap_fis_or_1 "Diagnóstico de Necesidade
> s de Integración Social en Capital Físico (Al Egreso)(d)/Global Diagnosis of 
> Social Integration in Physical Capital (At Discharge)(d)"
note: label truncated to 80 characters

. cap noi label variable dg_nec_int_soc_cap_soc_or_1 "Diagnóstico de Necesidade
> s de Integración Social en Capital Social (Al Egreso)(d)/Global Diagnosis of 
> Social Integration in Social Capital (At Discharge)(d)"
note: label truncated to 80 characters

. cap noi label variable tiene_menores_de_edad_a_cargo "Menores de Edad A Cargo
> (d)/Minor Dependants(d)"

. cap noi label variable ha_estado_embarazada_egreso "¿Ha estado embarazada? (a
> l Egreso)(d)/Have you been Pregnant (at Discharge)(d)"

. cap noi label variable discapacidad "Presenta Discapacidad(d)/Disability(d)"

. cap noi label variable opcion_discapacidad "Origen de Discapacidad(d)/Cause o
> f Disability(d)"

. cap noi label variable escolaridad "Escolaridad: Nivel Educacional(d)/Educatio
> nal Attainment(d)"

. cap noi label variable edad_al_ing_grupos "Edad a la Fecha de Ingreso a Trata
> miento en Grupos(c)/Age at Admission to Treatment In Groups(c)"
note: label truncated to 80 characters

. cap noi label variable nacionalidad "Nacionalidad/Nationality"

. cap noi label variable sexo_2 "Sexo Usuario/Sex of User"

. cap noi label variable embarazo "Embarazo al Ingreso(c)/Pregnant at Admission
> (c)"

. cap noi label variable fech_nac "Fecha de Nacimiento/Date of Birth"

. cap noi label variable edad_ini_cons "Edad de Inicio de Consumo/Age of Onset 
> of Drug Use"

. cap noi label variable edad_ini_sus_prin "Edad de Inicio de Consumo Sustancia
>  Principal/Age of Onset of Drug Use of Primary Substance"
note: label truncated to 80 characters

. cap noi label variable estado_conyugal_2 "Estado Conyugal/Marital Status"

. cap noi label variable edad_grupos "Edad agrupada/Age in groups"

. cap noi label variable freq_cons_sus_prin "Frecuencia de Consumo de la Sustan
> cia Principal (30 días previos a la admisión)(f)/Frequency of Consumption of 
> the Primary or Main Substance (30 days previous to admission)(f)"
note: label truncated to 80 characters

. cap noi label variable via_adm_sus_prin_act "Vía de Administración de la Sust
> ancia Principal (Se aplicaron criterios de limpieza)(f)/Route of Administrati
> on of the Primary or Main Substance (Tidy)(f)"
note: label truncated to 80 characters

. cap noi label variable etnia_cor "Etnia/Ethnic Group"

. cap noi label variable nacionalidad_2 "Segunda Nacionalidad/Second Nationalit
> y"

. cap noi label variable etnia_cor_2 "Etnia (2)/Second Ethnic Group"

. cap noi label variable sus_ini_2_mod "Segunda Sustancia de Inicio(Sólo más fr
> ecuentes)/Second Starting Substance"

. cap noi label variable sus_ini_3_mod "Tercera Sustancia de Inicio(Sólo más fr
> ecuentes)/Third Starting Substance"

. cap noi label variable sus_ini_mod "Sustancia de Inicio (Sólo más frecuentes)
> /Starting Substance (Only more frequent)"
note: label truncated to 80 characters

. cap noi label variable con_quien_vive "Persona con la que vive el Usuario(f)/
> People that Share Household with the User (Cohabitation Status)(f)"
note: label truncated to 80 characters

. cap noi label variable estatus_ocupacional "Condición Ocupacional(f)/Occupati
> onal Status(f)"

. cap noi label variable cat_ocupacional "Categoría Ocupacional(f)/Occupational
>  Category(f)"

. cap noi label variable sus_principal_mod "Sustancia Principal de Consumo (Sól
> o más frecuentes)(f)/Primary or Main Substance of Consumption at Admission (O
> nly more frequent)(f)"
note: label truncated to 80 characters

. cap noi label variable tipo_de_vivienda_mod "Tipo de Vivienda(f)/Type of Hous
> ing(f)"

. cap noi label variable tenencia_de_la_vivienda_mod "Tenencia de la Vivienda(f
> )/Tenure status of Households(f)"

. cap noi label variable rubro_trabaja_mod "Rubro de Trabajo(f)/Area of Work(f)
> "

. cap noi label variable otras_sus1_mod "Otras Sustancias (1)(Sólo más frecuent
> es)(f)/Other Substances (1)(Only more frequent)(f)"
note: label truncated to 80 characters

. cap noi label variable otras_sus2_mod "Otras Sustancias (2)(Sólo más frecuent
> es)(f)/Other Substances (2)(Only more frequent)(f)"
note: label truncated to 80 characters

. cap noi label variable otras_sus3_mod "Otras Sustancias (3)(Sólo más frecuent
> es)(f)/Other Substances (3)(Only more frequent)(f)"
note: label truncated to 80 characters

. cap noi label variable dg_trs_cons_sus_or "Diagnóstico de Trastorno por Consum
> o de Sustancias(d)/Diagnosed of Substance Use Disorder(d)"
note: label truncated to 80 characters

. cap noi label variable diagnostico_trs_fisico "Diagnóstico de Trastorno Físic
> o(g)/Diagnosis of Physical Disorder(g)"

. cap noi label variable otros_probl_at_sm_or "Otros Problemas de Atención Vinc
> ulados a Salud Mental(g)/Other problems linked to Mental Health(g)"
note: label truncated to 80 characters

. cap noi label variable ano_bd_first "Año de la Base de Datos(c)/Year of the D
> ataset (Source)(c)"

. cap noi label variable ano_bd_last "Año de la Base de Datos(b)/Year of the Da
> taset (Source)(b)"

. cap noi label variable obs "Observaciones al Proceso de Limpieza y Estandariz
> ación de Casos(e)/Observations to the Process of Data Tidying & Standardizati
> on(e)"
note: label truncated to 80 characters

. cap noi label variable obs_concat_a "Observaciones al Proceso de Limpieza y E
> standarización de Casos(*)/Observations to the Process of Data Tidying & Stan
> dardization(*)"
note: label truncated to 80 characters

. cap noi label variable rn_common_treats2 "Cuenta de Entradas Comunes(b)/Count
>  of Common Entries(b)"

. cap noi label variable concat_hash_id_treatments "Combination of User & Disti
> nt Entries"

. cap noi label variable at_least_one_cont_entry "Casos de Usuarios con más de 
> una entrada después de otra/Cases of users with more than one entry after ano
> ther one"
note: label truncated to 80 characters

. cap noi label variable senda_concat_a "SENDA(*)/SENDA(*)"

. cap noi label variable tipo_centro_concat_a "Tipo de Centro(*)/Type of Center
> (*)"

. cap noi label variable fech_ing_num "Fecha de Ingreso a Tratamiento (Numérico
> )(c)/Date of Admission to Treatment (Numeric)(c)"
note: label truncated to 80 characters

. cap noi label variable fech_egres_num "Fecha de Egreso (Imputados KNN & Lógic
> o)(Numérico)(b)/Date of Discharge (Imputed KNN & Logic)(Numeric)(b) of the Ne
> xt Treatment"
note: label truncated to 80 characters

. cap noi label variable fech_ing_next_treat "Fecha de Ingreso a Tratamiento (N
> umérico)(c) del Tratamiento Posterior/Date of Admission to Treatment (Numeric
> )(c)"
note: label truncated to 80 characters

. cap noi label variable id_centro_sig_trat "ID del Centro del Tratamiento Post
> erior/Center ID of the Next Treatment"

. cap noi label variable tipo_plan_sig_trat "Tipo de Plan del Tratamiento Poste
> rior/Type of Plan of the Next Treatment"

. cap noi label variable tipo_programa_sig_trat "Tipo de Programa del Tratamien
> to Posterior/Type of Program of the Next Treatment"

. cap noi label variable senda_sig_trat "SENDA del Tratamiento Posterior/SENDA 
> of the Next Treatment"

. cap noi label variable menor_60_dias_diff "Menor a 60 días de diferencia con 
> el Tratamiento Posterior/Less than 60 days of difference between the Next Treat
> ment"
note: label truncated to 80 characters

. cap noi label variable menor_45_dias_diff "Menor a 45 días de diferencia con 
> el Tratamiento Posterior/Less than 45 days of difference between the Next Tre
> atment"
note: label truncated to 80 characters

. cap noi label variable motivoegreso_derivacion "Motivo de Egreso= Derivación(
> b)/Cause of Discharge= Derivación(b)"

. cap noi label variable abandono_temprano "Abandono temprano(<3 meses)/ Early 
> Drop-out(<3 months)"

. cap noi label variable obs_cambios "Cambios del tratamiento en comparación al
>  Tratamiento Posterior/Changes in treatment compared to the Next Treatment"
note: label truncated to 80 characters

. cap noi label variable obs_cambios_ninguno "Sin cambios del tratamiento en co
> mparación al Tratamiento Posterior/No changes in treatment compared to the Ne
> xt Treatment"
note: label truncated to 80 characters

. cap noi label variable obs_cambios_num "Recuento de cambios del tratamiento e
> n comparación al Tratamiento Posterior/Count of changes in treatment compared
>  to the Next Treatment"
note: label truncated to 80 characters

. cap noi label variable obs_cambios_fac "Recuento de cambios del tratamiento e
> n comparación al Tratamiento Posterior(factor)/Count of changes in treatment 
> compared to the Next Treatment(factor)"
note: label truncated to 80 characters

. cap noi label variable edad_ini_sus_prin_grupos "Edad de Inicio de Consumo Su
> stancia Principal (en Grupos)/Age of Onset of Drug Use of Primary Substance (
> in Groups)"
note: label truncated to 80 characters

. cap noi label variable hash_key_sex_program "Usuarios a los que se le ha camb
> iado el sexo de acuerdo al tipo de plan/Users that changed of sex considering
>  the types of plan"
note: label truncated to 80 characters

. cap noi label variable centro_muj "ID de centro que alude a un centro específ
> ico para mujeres/Center ID aludes to a women-specific center"
note: label truncated to 80 characters

. cap noi label variable cie_10 "Diagnóstico CIE-10 (1 o más)/Psychiatric Diagn
> oses (ICD-10)(one or more)"

. cap noi label variable dsm_iv "Diagnóstico DSM-IV (1 o más)/Psychiatric Diagn
> oses (DSM-IV)(one or more)"

. cap noi label variable con_quien_vive_rec "Persona con la que vive el Usuario
>  (Recodificada)(f)/People that Share Household with the User (Cohabitation St
> atus)(Recoded)(f)"
note: label truncated to 80 characters

. cap noi label variable cnt_mod_dsm_iv_or "Recuento de Diagnóstico DSM-IV/Coun
> t of Psychiatric Diagnoses (DSM-IV)"

. cap noi label variable cnt_mod_cie_10_or "Recuento de Diagnóstico CIE-10/Coun
> t of Psychiatric Diagnoses (ICD-10)"

. cap noi label variable cnt_diagnostico_trs_fisico "Recuento de Diagnóstico de
>  Trastorno Físico/Count of Physical Disorder"

. cap noi label variable dg_trs_psiq_sub_dsm_iv_or "Diagnóstico de Trastorno Ps
> iquiátrico, Criterios DSM IV (Subclasificacion)(g)/Diagnosis of Psychiatric D
> isorders, DSM-IV criteria (sub-classification)(g)"
note: label truncated to 80 characters

. cap noi label variable dg_trs_psiq_dsm_iv_or "Diagnóstico de Trastorno Psiqui
> átrico, Criterios DSM IV(g)/Diagnosis of Psychiatric Disorders, DSM-IV criter
> ia(g)"
note: label truncated to 80 characters

. cap noi label variable dg_trs_psiq_cie_10_or "Diagnóstico de Trastorno Psiqui
> átrico, Criterios CIE-10(g)/Diagnosis of Psychiatric Disorders, CIE-10 criter
> ia(g)"
note: label truncated to 80 characters

. cap noi label variable x2_dg_trs_psiq_cie_10_or "Diagnóstico de Trastorno Psi
> quiátrico, Criterios CIE-10 (2)(g)/Diagnosis of Psychiatric Disorders, CIE-10
>  criteria (2)(g)"
note: label truncated to 80 characters

. cap noi label variable x3_dg_trs_psiq_cie_10_or "Diagnóstico de Trastorno Psi
> quiátrico, Criterios CIE-10 (3)(g)/Diagnosis of Psychiatric Disorders, CIE-10
>  criteria (3)(g)"
note: label truncated to 80 characters

. cap noi label variable x4_dg_trs_psiq_cie_10_or "Diagnóstico de Trastorno Psi
> quiátrico, Criterios CIE-10 (4)(g)/Diagnosis of Psychiatric Disorders, CIE-10
>  criteria (4)(g)"
note: label truncated to 80 characters

. cap noi label variable x5_dg_trs_psiq_cie_10_or "Diagnóstico de Trastorno Psi
> quiátrico, Criterios CIE-10 (5)(g)/Diagnosis of Psychiatric Disorders, CIE-10
>  criteria (5)(g)"
note: label truncated to 80 characters

. cap noi label variable x2_dg_trs_psiq_dsm_iv_or "Diagnóstico de Trastorno Psi
> quiátrico, Criterios DSM IV (2)(g)/Diagnosis of Psychiatric Disorders, DSM-IV
>  criteria (2)(g)"
note: label truncated to 80 characters

. cap noi label variable x3_dg_trs_psiq_dsm_iv_or "Diagnóstico de Trastorno Psi
> quiátrico, Criterios DSM IV (3)(g)/Diagnosis of Psychiatric Disorders, DSM-IV
>  criteria (3)(g)"
note: label truncated to 80 characters

. cap noi label variable x4_dg_trs_psiq_dsm_iv_or "Diagnóstico de Trastorno Psi
> quiátrico, Criterios DSM IV (4)(g)/Diagnosis of Psychiatric Disorders, DSM-IV
>  criteria (4)(g)"
note: label truncated to 80 characters

. cap noi label variable dg_trs_psiq_sub_cie_10_or "Diagnóstico de Trastorno Ps
> iquiátrico, Criterios CIE-10 (Subclasificacion)(g)/Diagnosis of Psychiatric D
> isorders, CIE-10 criteria (subclassification)(g)"
note: label truncated to 80 characters

. cap noi label variable x2_dg_trs_psiq_sub_cie_10_or "Diagnóstico de Trastorno
>  Psiquiátrico, Criterios CIE-10 (Subclasificacion) (2)(g)/Diagnosis of Psychi
> atric Disorders, CIE-10 criteria (subclassification) (2)(g)"
note: label truncated to 80 characters

. cap noi label variable x3_dg_trs_psiq_sub_cie_10_or "Diagnóstico de Trastorno
>  Psiquiátrico, Criterios CIE-10 (Subclasificacion) (3)(g)/Diagnosis of Psychi
> atric Disorders, CIE-10 criteria (subclassification) (3)(g)"
note: label truncated to 80 characters

. cap noi label variable x4_dg_trs_psiq_sub_cie_10_or "Diagnóstico de Trastorno
>  Psiquiátrico, Criterios CIE-10 (Subclasificacion)(4)(g)/Diagnosis of Psychia
> tric Disorders, CIE-10 criteria (subclassification)(4)(g)"
note: label truncated to 80 characters

. cap noi label variable x2_dg_trs_psiq_sub_dsm_iv_or "Diagnóstico de Trastorno
>  Psiquiátrico, Criterios DSM IV (Subclasificacion) (2)(g)/Diagnosis of Psychi
> atric Disorders, DSM-IV criteria (sub-classification) (2)(g)"
note: label truncated to 80 characters

. cap noi label variable x3_dg_trs_psiq_sub_dsm_iv_or "Diagnóstico de Trastorno
>  Psiquiátrico, Criterios DSM IV (Subclasificacion) (3)(g)/Diagnosis of Psychi
> atric Disorders, DSM-IV criteria (sub-classification) (3)(g)"
note: label truncated to 80 characters

. cap noi label variable x4_dg_trs_psiq_sub_dsm_iv_or "Diagnóstico de Trastorno
>  Psiquiátrico, Criterios DSM IV (Subclasificacion)(4)(g)/Diagnosis of Psychia
> tric Disorders, DSM-IV criteria (sub-classification)(4)(g)"
note: label truncated to 80 characters

. cap noi label variable cnt_otros_probl_at_sm_or "Recuento de Otros Problemas 
> de Atención Vinculados a Salud Mental/Count of Other problems linked to Menta
> l Health"
note: label truncated to 80 characters

. cap noi label variable escolaridad_rec "Escolaridad: Nivel Educacional(d) Norm
> alizado a Progresión de Tratamientos/Educational Attainment(d) & Normalized F
> ollowing Progression of Treatments"
note: label truncated to 80 characters

. cap noi label variable tipo_de_plan_2 "Tipo de Plan del Último Registro/Type 
> of Plan of the Last Entry"

. cap noi label variable motivodeegreso_mod_imp "Motivo de Egreso (con abandono
>  temprano y tardío)(Imputados KNN & Lógico) del Último Registro(b)/Cause of D
> ischarge (with late and early withdrawal)(Imputed KNN & Logic) of the Last En
> try(b)"
note: label truncated to 80 characters

. cap noi label variable rn_hash "Número de Tratamientos por Usuario (menor, tr
> atamiento más antiguo)/Number of Treatments by User (less, older treatment)"
note: label truncated to 80 characters

. cap noi label variable n_hash "Número total de Tratamientos por Usuario/Total
>  Number of Treatments by User"

. cap noi label variable cum_dias_trat_sin_na "Suma acumulada de Días de Tratam
> iento por Usuario/Cumulative Days of Treatment by User"
note: label truncated to 80 characters

. cap noi label variable mean_cum_dias_trat_sin_na "Promedio acumulado de Días 
> de Tratamiento por Usuario/Cumulative Average Days of Treatment by User"
note: label truncated to 80 characters

. cap noi label variable cum_diff_bet_treat "Suma acumulada de Diferencia en Dí
> as con Tratamiento Siguiente por Usuario/Cumulative sum of Days of difference
>  between the Next Treatment by User"
note: label truncated to 80 characters

. cap noi label variable mean_cum_diff_bet_treat "Promedio acumulado de Diferen
> cia en Días entre Tratamientos por Usuario/Cumulative Average Days of Differe
> nces Between Treatments By User"
note: label truncated to 80 characters

. cap noi label variable tipo_de_plan_2_1 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (1st Treatment)"

. cap noi label variable tipo_de_plan_2_2 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (2nd Treatment)"

. cap noi label variable tipo_de_plan_2_3 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (3rd Treatment)"

. cap noi label variable tipo_de_plan_2_4 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (4th Treatment)"

. cap noi label variable tipo_de_plan_2_5 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (5th Treatment)"

. cap noi label variable tipo_de_plan_2_6 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (6th Treatment)"

. cap noi label variable tipo_de_plan_2_7 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (7th Treatment)"

. cap noi label variable tipo_de_plan_2_8 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (8th Treatment)"

. cap noi label variable tipo_de_plan_2_9 "Tipo de Plan del Último Registro/Typ
> e of Plan of the Last Entry (9th Treatment)"

. cap noi label variable tipo_de_plan_2_10 "Tipo de Plan del Último Registro/Ty
> pe of Plan of the Last Entry (10th Treatment)"

. cap noi label variable motivodeegreso_mod_imp_1 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (1st Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_2 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (2nd Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_3 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (3rd Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_4 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (4th Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_5 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (5th Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_6 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (6th Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_7 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (7th Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_8 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (8th Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_9 "Cause of Discharge (with lat
> e and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (9th Treatm
> ent)"
note: label truncated to 80 characters

. cap noi label variable motivodeegreso_mod_imp_10 "Cause of Discharge (with la
> te and early withdrawal)(Imputed KNN & Logic) of the Last Entry(b) (10th Trea
> tment)"
note: label truncated to 80 characters

. cap noi label variable dias_treat_imp_sin_na_1 "Days of Treatment (1st Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_2 "Days of Treatment (2nd Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_3 "Days of Treatment (3rd Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_4 "Days of Treatment (4th Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_5 "Days of Treatment (5th Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_6 "Days of Treatment (6th Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_7 "Days of Treatment (7th Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_8 "Days of Treatment (8th Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_9 "Days of Treatment (9th Treatm
> ent)"

. cap noi label variable dias_treat_imp_sin_na_10 "Days of Treatment (10th Trea
> tment)"

. cap noi label variable diff_bet_treat_1 "Diff Between Treatments (1st Treatme
> nt)"

. cap noi label variable diff_bet_treat_2 "Diff Between Treatments (2nd Treatme
> nt)"

. cap noi label variable diff_bet_treat_3 "Diff Between Treatments (3rd Treatme
> nt)"

. cap noi label variable diff_bet_treat_4 "Diff Between Treatments (4th Treatme
> nt)"

. cap noi label variable diff_bet_treat_5 "Diff Between Treatments (5th Treatme
> nt)"

. cap noi label variable diff_bet_treat_6 "Diff Between Treatments (6th Treatme
> nt)"

. cap noi label variable diff_bet_treat_7 "Diff Between Treatments (7th Treatme
> nt)"

. cap noi label variable diff_bet_treat_8 "Diff Between Treatments (8th Treatme
> nt)"

. cap noi label variable diff_bet_treat_9 "Diff Between Treatments (9th Treatme
> nt)"

. cap noi label variable diff_bet_treat_10 "Diff Between Treatments (10th Treat
> ment)"

. cap noi label variable cum_dias_trat_sin_na_1 "Cum. Days of Treatment (1st Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_2 "Cum. Days of Treatment (2nd Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_3 "Cum. Days of Treatment (3rd Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_4 "Cum. Days of Treatment (4th Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_5 "Cum. Days of Treatment (5th Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_6 "Cum. Days of Treatment (6th Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_7 "Cum. Days of Treatment (7th Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_8 "Cum. Days of Treatment (8th Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_9 "Cum. Days of Treatment (9th Tr
> eatment)"

. cap noi label variable cum_dias_trat_sin_na_10 "Cum. Days of Treatment (10th 
> Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_1 "Avg. Cum. Days of Treatme
> nt (1st Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_2 "Avg. Cum. Days of Treatme
> nt (2nd Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_3 "Avg. Cum. Days of Treatme
> nt (3rd Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_4 "Avg. Cum. Days of Treatme
> nt (4th Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_5 "Avg. Cum. Days of Treatme
> nt (5th Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_6 "Avg. Cum. Days of Treatme
> nt (6th Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_7 "Avg. Cum. Days of Treatme
> nt (7th Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_8 "Avg. Cum. Days of Treatme
> nt (8th Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_9 "Avg. Cum. Days of Treatme
> nt (9th Treatment)"

. cap noi label variable mean_cum_dias_trat_sin_na_10 "Avg. Cum. Days of Treatm
> ent (10th Treatment)"

. cap noi label variable cum_diff_bet_treat_1 "Cum. Diff Between Treatments (1s
> t Treatment)"

. cap noi label variable cum_diff_bet_treat_2 "Cum. Diff Between Treatments (2n
> d Treatment)"

. cap noi label variable cum_diff_bet_treat_3 "Cum. Diff Between Treatments (3r
> d Treatment)"

. cap noi label variable cum_diff_bet_treat_4 "Cum. Diff Between Treatments (4t
> h Treatment)"

. cap noi label variable cum_diff_bet_treat_5 "Cum. Diff Between Treatments (5t
> h Treatment)"

. cap noi label variable cum_diff_bet_treat_6 "Cum. Diff Between Treatments (6t
> h Treatment)"

. cap noi label variable cum_diff_bet_treat_7 "Cum. Diff Between Treatments (7t
> h Treatment)"

. cap noi label variable cum_diff_bet_treat_8 "Cum. Diff Between Treatments (8t
> h Treatment)"

. cap noi label variable cum_diff_bet_treat_9 "Cum. Diff Between Treatments (9t
> h Treatment)"

. cap noi label variable cum_diff_bet_treat_10 "Cum. Diff Between Treatments (1
> 0th Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_1 "Avg. Cum. Diff Between Trea
> tments (1st Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_2 "Avg. Cum. Diff Between Trea
> tments (2nd Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_3 "Avg. Cum. Diff Between Trea
> tments (3rd Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_4 "Avg. Cum. Diff Between Trea
> tments (4th Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_5 "Avg. Cum. Diff Between Trea
> tments (5th Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_6 "Avg. Cum. Diff Between Trea
> tments (6th Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_7 "Avg. Cum. Diff Between Trea
> tments (7th Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_8 "Avg. Cum. Diff Between Trea
> tments (8th Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_9 "Avg. Cum. Diff Between Trea
> tments (9th Treatment)"

. cap noi label variable mean_cum_diff_bet_treat_10 "Avg. Cum. Diff Between Tre
> atments (10th Treatment)"

. cap noi label variable diff_bet_treat "Días de diferencia con el Tratamiento 
> Posterior/Days of difference between the Next Treatment"
note: label truncated to 80 characters

. cap noi label variable dias_treat_imp_sin_na "Días de Tratamiento (valores pe
> rdidos en la fecha de egreso se reemplazaron por la diferencia con 2019-11-13
> )/Days of Treatment (missing dates of discharge were replaced with difference
>  from 2019-11-13)"
note: label truncated to 80 characters

. cap noi save "C:\Users\CISS Fondecyt\Mi unidad\Alvacast\SISTRAT 2019 (github)
> \CONS_C1_df_dup_JUL_2020.dta", replace
file C:\Users\CISS Fondecyt\Mi unidad\Alvacast\SISTRAT 2019 (github)\CONS_C1_df
> _dup_JUL_2020.dta saved

. cap noi drop id id_mod nombre_centro consentimiento_informado
variable consentimiento_informado not found

. cap noi drop id id_mod nombre_centro

. cap noi save "C:\Users\CISS Fondecyt\Mi unidad\Alvacast\SISTRAT 2019 (github)
> \CONS_C1_df_dup_JUL_2020_exp.dta", replace
file C:\Users\CISS Fondecyt\Mi unidad\Alvacast\SISTRAT 2019 (github)\CONS_C1_df
> _dup_JUL_2020_exp.dta saved

. 
end of do-file

sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Chile.1252  LC_CTYPE=Spanish_Chile.1252   
[3] LC_MONETARY=Spanish_Chile.1252 LC_NUMERIC=C                  
[5] LC_TIME=Spanish_Chile.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] magrittr_1.5            gridExtra_2.3           readxl_1.3.1           
 [4] forcats_0.5.0           purrr_0.3.4             readr_1.3.1            
 [7] tibble_3.0.1            tidyverse_1.3.0         choroplethrAdmin1_1.1.1
[10] choroplethrMaps_1.0.1   acs_2.1.4               XML_3.99-0.3           
[13] tidylog_1.0.1           dplyr_1.0.0             treemapify_2.5.3       
[16] ggiraph_0.7.0           chilemapas_0.2          sf_0.9-3               
[19] finalfit_1.0.1          lsmeans_2.30-0          emmeans_1.4.7          
[22] RColorBrewer_1.1-2      panelr_0.7.3            lme4_1.1-23            
[25] Matrix_1.2-18           data.table_1.12.8       codebook_0.9.2         
[28] Statamarkdown_0.4.5     devtools_2.3.0          usethis_1.6.1          
[31] sqldf_0.4-11            RSQLite_2.2.0           gsubfn_0.7             
[34] proto_1.0.0             broom_0.7.12            zoo_1.8-8              
[37] rbokeh_0.5.1            janitor_2.0.1           plotly_4.9.2.1         
[40] kableExtra_1.1.0        Hmisc_4.4-0             Formula_1.2-3          
[43] survival_3.2-3          lattice_0.20-41         ggplot2_3.3.2          
[46] stringr_1.4.0           stringi_1.4.6           tidyr_1.1.0            
[49] knitr_1.29              matrixStats_0.56.0      boot_1.3-28            
[52] here_0.1               

loaded via a namespace (and not attached):
  [1] estimability_1.3        coda_0.19-4             acepack_1.4.1          
  [4] bit64_0.9-7             multcomp_1.4-13         rpart_4.1-15           
  [7] generics_0.0.2          callr_3.7.0             TH.data_1.0-10         
 [10] mice_3.9.0              ggfittext_0.9.0         DiagrammeR_1.0.6.1.9000
 [13] chron_2.3-55            bit_1.1-15.2            webshot_0.5.2          
 [16] xml2_1.3.2              lubridate_1.7.9         httpuv_1.5.4           
 [19] assertthat_0.2.1        xfun_0.29               hms_0.5.3              
 [22] jquerylib_0.1.4         data.tree_0.7.11        evaluate_0.14          
 [25] promises_1.1.1          dbplyr_1.4.4            randomizr_0.20.0       
 [28] DBI_1.1.0               tmvnsim_1.0-2           htmlwidgets_1.5.1      
 [31] jsonvalidate_1.1.0      ellipsis_0.3.1          import_1.1.0           
 [34] crosstalk_1.1.0.1       backports_1.1.8         V8_3.1.0               
 [37] markdown_1.1            vctrs_0.3.1             remotes_2.4.2          
 [40] abind_1.4-5             withr_2.4.3             pryr_0.1.4             
 [43] checkmate_2.0.0         ggmap_3.0.0             prettyunits_1.1.1      
 [46] mnormt_2.0.0            cluster_2.1.0           lazyeval_0.2.2         
 [49] crayon_1.3.4            crul_0.9.0              labeling_0.3           
 [52] pkgconfig_2.0.3         units_0.6-6             nlme_3.1-148           
 [55] pkgload_1.1.0           nnet_7.3-14             RJSONIO_1.3-1.4        
 [58] rlang_1.0.1             lifecycle_0.2.0         sandwich_2.5-1         
 [61] httpcode_0.3.0          modelr_0.1.8            cellranger_1.1.0       
 [64] tcltk_4.0.2             rprojroot_1.3-2         shinyFiles_0.8.0.9003  
 [67] carData_3.0-5           reprex_2.0.1            base64enc_0.1-3        
 [70] processx_3.5.2          rjson_0.2.20            png_0.1-7              
 [73] viridisLite_0.3.0       clisymbols_1.2.0        bitops_1.0-7           
 [76] KernSmooth_2.23-17      visNetwork_2.0.9        pander_0.6.3           
 [79] blob_1.2.2              classInt_0.4-3          jpeg_0.1-8.1           
 [82] shinyAce_0.4.1          scales_1.1.1            memoise_1.1.0          
 [85] plyr_1.8.6              hexbin_1.28.1           compiler_4.0.2         
 [88] snakecase_0.11.0        cli_3.1.1               patchwork_1.0.1        
 [91] ps_1.3.3                htmlTable_2.0.0         MASS_7.3-51.6          
 [94] tidyselect_1.1.0        highr_0.8               jtools_2.0.5           
 [97] yaml_2.2.1              radiant.model_1.3.12    latticeExtra_0.6-29    
[100] ggrepel_0.8.2           grid_4.0.2              tools_4.0.2            
[103] rmapshaper_0.4.4        rio_0.5.16              RgoogleMaps_1.4.5.3    
[106] parallel_4.0.2          rstudioapi_0.11         uuid_0.1-4             
[109] foreign_0.8-80          NeuralNetTools_1.5.2    pdp_0.7.0              
[112] gistr_0.5.0             farver_2.0.3            digest_0.6.25          
[115] shiny_1.5.0             geojsonlint_0.4.0       Rcpp_1.0.4.6           
[118] car_3.0-12              later_1.1.0.1           writexl_1.3            
[121] httr_1.4.2              gdtools_0.2.2           WDI_2.6.0              
[124] psych_1.9.12.31         colorspace_1.4-1        rvest_0.3.5            
[127] fs_1.4.2                radiant.data_1.3.9      ranger_0.12.1          
[130] splines_4.0.2           statmod_1.4.34          sp_1.4-2               
[133] xgboost_1.1.1.1         sessioninfo_1.1.1       systemfonts_0.2.3      
[136] xtable_1.8-4            jsonlite_1.7.0          nloptr_1.2.2.1         
[139] testthat_2.3.2          R6_2.4.1                pillar_1.4.6           
[142] htmltools_0.5.2         mime_0.9                glue_1.4.1             
[145] fastmap_1.1.0           minqa_1.2.4             class_7.3-17           
[148] codetools_0.2-18        maps_3.3.0              pkgbuild_1.1.0         
[151] mvtnorm_1.1-1           curl_4.3                zip_2.0.4              
[154] openxlsx_4.1.5          rmarkdown_2.11          desc_1.2.0             
[157] munsell_0.5.0           e1071_1.7-3             labelled_2.5.0         
[160] haven_2.3.1             reshape2_1.4.4          gtable_0.3.0