SISTRAT- Fondecyt 1191282

SUD treatment and ER admissions, hospitalizations, and death among adult patients in Chile
Welcome to the repository documenting the construction of the treatment information system (SISTRAT) datasets. Here you can find the processes and actions taken to standardize and prepare the data for analysis by the project's investigators.


SISTRAT Datasets

This page covers the following main topics:

  1. Encryption of RUTs and Generation of HASHs

  2. Data Preparation and Standardization of C1

  3. Associations & Analytic Exercises

3a.1. Effect of residential versus ambulatory treatment for substance use disorders on readmission risk in a register-based national retrospective cohort - Main analyses

3a.2. Effect of residential versus ambulatory treatment for substance use disorders on readmission risk in a register-based national retrospective cohort - Supplemental analyses

3b.1. Treatment outcome and readmission risk among women in women-only versus mixed-gender drug treatment programs in Chile - Main

3b.2. Treatment outcome and readmission risk among women in women-only versus mixed-gender drug treatment programs in Chile - Supplemental

3c. Living with (consolidation)

  4. Data Preparation and Standardization of TOP or Profile of Treatment Results

  5. Chilean prosecutor’s office Data merge

  6. Webinar “¿Qué sabemos de los programas de tratamiento de drogas en Chile?” (What do we know about Chilean substance use treatments?)


The main processes are summarized in the following figures.


Figure 1. Diagram of data preparation


Elements of the flowchart, transcribed from the figure:

►ACTORS:
- FONDECYT Analyst
- SENDA's Professional
- IT Professional

►PHASES:
- Phase 1 = Entries with unique IDs
- Phase 2 = Generate entries of unique events
- Phase 3 = Data cleaning and generation of unique treatments

►DECISIONS (Yes/No):
- HASH-Keys (masked ID) with more than one SENDA ID?
- Does the discrepancy come from an encryption error or from an error in SENDA's dataset?
- Does the discrepancy affect the identification of unique users, treatments and state of treatments?

►STEPS:
- Dataset with duplicated data
- Original ID in the original dataset
- Analyse the origin of discrepancies
- Ask SENDA's professional
- Send an e-mail with discrepancies
- Send doubts to SENDA's professional
- Contact the developer of the encryption
- Generate modifications to the encrypter
- Changes in the application for the retrieval of IDs from DEIS datasets
- See the origin and ask third parties
- Establish the criteria that identify each user with more confidence
- Approximate them until reaching a criterion that identifies each user effectively
- E.g., through probabilistic match
- Discard
- Keep corrected entries
- Logical & probabilistic imputation
- Identify duplicated treatments
- Duplicated/overlapped entries
- Distinguish entries by unique events within treatments
- Institutional validations of SENDA's professional
- Validations of entries in previous datasets
- Protocols, algorithms & institutional procedures of case examination
- Cases with row numbers and user identifiers from the dataset processed by FONDECYT professionals will be contrasted with the original ID
- Add to a dataset
- Normalization of dataset & cleansing in relevant variables
- Collapse events within treatments into individual treatments
- Dataset for query
- Up to this point, we must have the events within each treatment differentiated and without duplicated events

►ACTIONS (e.g.,):
- Define variables
- Standardize dates
- Standardize programs & plans
- Correct ages
- Gender & sex related to plans
- Normalize days of treatment

►MUST CONSIDER THAT:
1. Users can have more than one admission and treatment, but some of them can be duplicated due to insufficient or wrong information that would be completed in a later entry.
2. Discarded information should be available in a separate dataset, to query in case we should impute values of other variables.
3. The latest registry of admission must be considered (understood as the registry that contains a date of discharge, comes from a more recent yearly dataset, or, all else being equal, is the entry that comes last in that yearly dataset).

►INVARIANT TO USER:
- HASH-Key (hash_key)
- Sex (sexo_2)
- Age (edad)
- Nationality (nacionalidad)

►INVARIANT TO TREATMENTS:
- Center ID (id_centro)
- Motive of Admission (origen_de_ingreso)
- Date of Admission (fech_ing)

►VARIANT:
- Treatment Days (dias_trat)
- Date of Discharge (fech_egres)
- Educational Attainment (educacion)
- No. of Children

►SPECIFIC GOAL N° 1:
Describe the incidence rate of readmissions by health conditions among everyone admitted to a public treatment during the study period, comparing these rates with those of the general population with similar demographic characteristics.

VARIABLES

►Outcomes:
- Readmission to treatments

►Exposure:
- Treatment Outcome (administrative discharge, early or late drop-out, referral)
- Identify referrals that are part of the same treatment

►Effect-modifiers:
- Sex
- Age
- Substance of Admission (e.g., polydrug user)
- Type of treatment plan or program

►Covariates:
- Marital Status
- Educational Attainment
- Occupational Status
- Age of Onset of Drug Use
- Frequency of Consumption of the Main Substance
- Motive of Admission to Treatment
- Psychiatric Comorbidity
- Region
- Type of treatment

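The third consideration in Figure 1 (prefer the latest registry of an admission) and the second one (keep discarded information in a separate dataset) can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's actual code: the column names `hash_key`, `fech_ing` and `fech_egres` come from the figure, while `dataset_year`, `row` and the function `pick_latest` are hypothetical names introduced here.

```python
from collections import defaultdict
from datetime import date


def pick_latest(entries):
    """Split entries into (kept, discarded) per hash_key and admission date."""
    groups = defaultdict(list)
    for e in entries:
        groups[(e["hash_key"], e["fech_ing"])].append(e)

    kept, discarded = [], []
    for group in groups.values():
        # Preference order (an assumption based on the note in Figure 1):
        # has a discharge date, then newer yearly dataset, then later row.
        best = max(
            group,
            key=lambda e: (e["fech_egres"] is not None, e["dataset_year"], e["row"]),
        )
        kept.append(best)
        # Discarded rows are returned, not dropped, so they stay queryable.
        discarded.extend(e for e in group if e is not best)
    return kept, discarded


# Hypothetical toy data: user "a1" has the same admission in two yearly datasets.
entries = [
    {"hash_key": "a1", "fech_ing": date(2015, 3, 1), "fech_egres": None,
     "dataset_year": 2015, "row": 1},
    {"hash_key": "a1", "fech_ing": date(2015, 3, 1), "fech_egres": date(2015, 9, 1),
     "dataset_year": 2016, "row": 2},
    {"hash_key": "b2", "fech_ing": date(2016, 1, 10), "fech_egres": date(2016, 4, 10),
     "dataset_year": 2016, "row": 3},
]
kept, discarded = pick_latest(entries)
```

In this toy example the entry without a date of discharge ends up in `discarded`, while the completed entry from the later yearly dataset is kept.
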
Figure 2. STROBE Diagram


Elements of the STROBE flow diagram, transcribed from the figure:

►SAMPLE SIZES ALONG THE FLOW:
- SUD treatments from different yearly datasets: n = 163,146
- Discarded cases that share the same values in 103 variables: n = 37,496 (1.1)
- SUD treatments once duplicated variables were discarded: n = 125,650
- Discarded cases that share the same values in 13 variables: n = 7,561 (1.3)
- n = 118,089
- Discarded the case that was in a type of plan under probation/parole: n = 1
- n = 118,088
- n = 118,036
- Discarded 417 entries of 381 distinct HASHs & dates of admission: n = 117,619
- Different admissions of each user without overlaps: n = 117,388
- Different admissions of each user without overlaps and valid cases: n = 117,212
- Database without overlaps or intermediate events marked by referrals: n = 109,756

►1ST PART OF DEDUPLICATION: EXPLORATORY APPROACH
- SUD treatments with different values in 13 variables
- Invalid or missing ages were filled with information of TOP datasets
- Defined unique dates of birth for users that had more than one
- Standardized & normalized variables relevant for the study
- Cases that share the same values in 103 variables:
  - 63,206 cases are entries that repeat the same information in two rows
  - 7,305 cases are entries that repeat the same information in three rows
  - 1,116 cases are entries that repeat the same information in four rows
  - 175 cases are entries that repeat the same information in five rows
  - 48 cases are entries that repeat the same information in six rows
  - 7 cases are entries that repeat the same information in seven rows
- Cases that share the same values in 13 variables:
  - 14,926 cases are entries that repeat the same information in two rows
  - 147 cases are entries that repeat the same information in three rows
- The 13 variables: HASH Key, Masked Identifier (RUT); Date of Admission to Treatment; Type of Center; Type of Program; Type of Plan; Program Financed by SENDA; Main Substance of Consumption; Other Substances (1); Other Substances (2); Other Substances (3); Frequency of Consumption of the Main Substance; Starting Substance; Age of Onset of Drug Use

►2ND PART OF DEDUPLICATION: DEFINITION OF INDIVIDUAL TREATMENTS AND RELATED EVENTS
- Unique combination of HASH & Date of Admission
- Entries with same HASH Key & Date of Admission
- Kept most recent cases, except for ~3 cases
- The rows that corresponded to particular cases:
  - 4118 and 4842, discarded the first
  - 38147 and 38755, discarded the first
  - 9875 and 7161, discarded the last
- Overlaps in treatment ranges & same HASH Key
- After 1st discard, or kept earliest treatments in overlapped cases
- After application of criteria provided by SENDA professional
- Changed the Date of Discharge of Negative Days of Treatment
- Actions on overlapped cases:
  - Imputed treatment days, and replaced the date of discharge
  - Kept the earliest treatment
  - Discarded the earliest treatment
  - Subtracted days from the date of discharge of the last treatment
- Most of these cases corresponded to overlaps:
  - With more than two cases involved, or with missing values
  - With imputed treatment days
  - Without center ID or name of the center
- Cases with different values in normalized & standardized variables
- Cases with different values in 17 normalized & standardized variables
- Variable list: Age, SENDA ID, Date of Birth, Type of Plan, Age of Onset of Drug Use, Pregnancy Status, Days of Treatment, Main Substance of Consumption, Other Substances (1, 2 and 3), Starting Substance, Marital Status, Occupational Status, Occupational Category, Motive of Admission to Treatment, Educational Attainment, Route of Administration of Main Substance, Frequency of Consumption of the Main Substance
- Variable list: Masked Identifier, Date of Admission to Treatment, Center ID, Primary or Main Substance of Consumption at Admission, Other Substances (1, 2 & 3), Starting Substance, Age of Onset of Drug Use, Marital Status, Occupational Status, Occupational Category, Age in groups, Motive of Admission to Treatment, Educational Attainment, Route of Administration of the Primary or Main Substance, Frequency of Consumption of the Primary or Main Substance

►3RD PART OF DEDUPLICATION: STANDARDIZATION OF VARIABLES AND EXPLORATION OF SPACES BETWEEN TREATMENTS
- Data Editing/Wrangling:
  - Recoded route of administration depending on the primary substance
  - Replaced first DSM-IV and ICD-10 diagnostics with the second and third, if empty
- Normalization of more than one user-invariant value in users:
  - Sex
  - Nationality
  - Ethnicity
  - Starting Substance
- Normalization of more than one user-invariant value in users (more complex):
  - Age of Onset of Drug Use
  - Age of Onset of Drug Use Primary Substance
- Correction of ties in values of variables of Rule-Based Replacements:
  - Sex
  - Age of Onset of Drug Use
  - Age of Onset of Drug Use Primary Substance
- Standardization of variables to provide for internal studies of the project:
  - Geographical (Communes & Regions)
  - Related to Drug Use
  - Related to social support and socioeconomic variables
  - Related to dependent children
- Delete entries with >1095 days of treatment:
  - Cases that started before 2010
  - Cases that rounded the yearly change of the databases

►4TH PART OF DEDUPLICATION: COLLAPSE OF CONTINUOUS ENTRIES OF REFERRALS INTO TREATMENTS
- <45 days of difference with a posterior entry & Referral as a cause of discharge
- Discard cases with >1095 days of treatment (& no intermediate treatments)
- Entries with <45 days of difference with a posterior entry & Referral as a cause of discharge (n = 3,067)
- Involved other entries that were surrounding continuous entries (n = 12,945)
- Kept treatments that were in the middle of a trajectory of a user (n = 71; users = 71)

►COUNTS ATTACHED TO NUMBERED STEPS:
- (1.2) n ≈ 11
- (1.4 & 1.5) n ≈ 215
- (1.6) n ≈ 8,702
- (1.7) n ≈ 5,222
- (2.1) n = 52
- (2.2) n = 797
- (2.3) n ≈ 1,448
- (2.4) n ≈ 11
- (2.5) n ≈ 29,257
- (2.6) n ≈ 31,384
- (3.01 & 3.02) n ≈ 44,550
- (3.03) n ≈ 2,123
- (3.04) n = 541
- Unnumbered: n = 6,809; n = 647
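The fourth part of the deduplication in Figure 2 (collapsing continuous referral entries into treatments, then discarding treatments longer than 1,095 days) could look roughly like the sketch below. It is a minimal illustration, not the project's implementation. It assumes that the 45-day difference is measured from the date of discharge of one entry to the date of admission of the next, that overlaps and missing dates were already resolved, and that each list holds the entries of a single user. The key `cause` and both function names are hypothetical; `fech_ing` and `fech_egres` come from Figure 1.

```python
from datetime import date

MAX_GAP_DAYS = 45          # threshold named in Figure 2
MAX_TREATMENT_DAYS = 1095  # threshold named in Figure 2


def collapse_referrals(entries):
    """Merge one user's consecutive entries linked by a referral."""
    treatments = []
    for e in sorted(entries, key=lambda e: e["fech_ing"]):
        prev = treatments[-1] if treatments else None
        if (
            prev is not None
            and prev["cause"] == "referral"
            and (e["fech_ing"] - prev["fech_egres"]).days < MAX_GAP_DAYS
        ):
            # Same treatment: extend it to the later discharge and cause.
            prev["fech_egres"] = e["fech_egres"]
            prev["cause"] = e["cause"]
        else:
            treatments.append(dict(e))  # copy, so the input is not mutated
    return treatments


def drop_long(treatments):
    """Keep treatments lasting at most MAX_TREATMENT_DAYS days."""
    return [
        t for t in treatments
        if (t["fech_egres"] - t["fech_ing"]).days <= MAX_TREATMENT_DAYS
    ]


# Hypothetical toy data for one user, deliberately out of order.
entries = [
    {"fech_ing": date(2016, 2, 1), "fech_egres": date(2016, 5, 1), "cause": "referral"},
    {"fech_ing": date(2015, 1, 1), "fech_egres": date(2015, 3, 1), "cause": "referral"},
    {"fech_ing": date(2015, 3, 20), "fech_egres": date(2015, 8, 1), "cause": "dropout"},
]
treatments = drop_long(collapse_referrals(entries))
```

Here the first two 2015 entries are 19 days apart and linked by a referral, so they collapse into one treatment running from 2015-01-01 to 2015-08-01; the 2016 entry stays separate because the preceding treatment ended in a drop-out.
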

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".