SAT0587 The Art of Imputing Missing Data of Disease and Function Activity in Rheumatoid Arthritis Registries

Abstract

Background: Large observational studies are becoming more common in rheumatoid arthritis (RA). Disease registers [1] allow the effectiveness and safety of RA treatments to be analysed in real-world populations, but observational studies suffer from missing data. To minimise bias, imputing missing data has been shown to be superior to complete case analysis [2]. Although some imputation methods have been studied in clinical trials of rheumatic diseases [3] and in small registers [4], the various imputation techniques have never been systematically compared in large registers.

Objectives: To compare the effects of available imputation methods on the estimated values and on the RA remission rate for missing disease activity measures in large registers.

Methods: We used 1000 patients with complete data for disease activity (Disease Activity Score (DAS28) and Clinical Disease Activity Index (CDAI)) at baseline (treatment initiation) and at 6, 12, and 24 months after initiation of abatacept or a tumor necrosis factor inhibitor (TNFi) from an existing register collaboration (PANABA).

Simulation procedure: Values were deleted randomly and imputed with three types of imputation methods: (1) methods imputing forward in time, such as Last Observation Carried Forward (LOCF) or Linear Forward Extrapolation (LFE); (2) methods considering data both forward and backward in time, such as Nearest Available Observation (NAO), Linear Extrapolation (LE) or Polynomial Extrapolation (PE); and (3) methods using computer-intensive multi-individual imputations, such as Linear Mixed Effects cubic regression (LME3) and Multiple Imputation by Chained Equations (MICE). We conducted a simulation study by performing this procedure 1000 times and computing the mean difference between the true and the imputed values, and between the true remission rates (CDAI and DAS28) and the imputed ones.

Results: Results are summarised in Fig. 1. At baseline, all methods underestimated the true values by at least 20%. Despite this, LME3 and MICE provided estimates of baseline remission rates with less than 3% error. For follow-up data (missingness at 6, 12, or 24 months), NAO, LE and PE led to a relative bias in the mean values of 15% and an almost unbiased remission rate. LOCF and LFE respectively over- and underestimated the mean imputed values by up to 20%, leading respectively to a non-negligible under- and overestimation of the remission rate. Although LME3 and MICE had low bias in estimating the mean values, they narrowed the distribution of the imputed values and thus strongly underestimated the remission rate.
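
The delete-impute-compare loop described above can be illustrated with a minimal Python sketch. It is not the authors' code: the trajectories are synthetic (not PANABA data), only LOCF and NAO are implemented, values are deleted at follow-up visits only, and the function names (`simulate_das28`, `locf`, `nao`, `run_simulation`) are hypothetical. The DAS28 < 2.6 remission cut-off is the conventional one and is an assumption here, since the abstract does not state the threshold used. Ties in NAO are broken toward the later visit, which is an arbitrary choice of this sketch.

```python
import numpy as np

VISITS = np.array([0, 6, 12, 24])   # months since treatment initiation
DAS28_REMISSION = 2.6               # assumed conventional cut-off (remission if < 2.6)


def simulate_das28(n_patients, rng):
    """Synthetic, purely illustrative trajectories that improve after baseline."""
    baseline = rng.normal(5.0, 1.0, size=n_patients)
    drop = rng.normal(1.8, 0.8, size=n_patients)      # patient-specific improvement
    shape = np.array([0.0, 0.7, 0.9, 1.0])            # share of improvement reached per visit
    noise = rng.normal(0.0, 0.5, size=(n_patients, VISITS.size))
    return np.clip(baseline[:, None] - drop[:, None] * shape + noise, 0.0, None)


def locf(row):
    """Last Observation Carried Forward: fill a gap with the latest earlier value."""
    out = row.copy()
    for j in range(1, out.size):
        if np.isnan(out[j]):
            out[j] = out[j - 1]
    return out


def nao(row):
    """Nearest Available Observation: fill a gap with the value closest in time."""
    out = row.copy()
    observed = np.flatnonzero(~np.isnan(row))
    for j in np.flatnonzero(np.isnan(row)):
        dist = np.abs(VISITS[observed] - VISITS[j])
        # Ties are broken toward the later visit (arbitrary choice of this sketch).
        out[j] = row[observed[np.flatnonzero(dist == dist.min())[-1]]]
    return out


def run_simulation(n_patients=1000, n_reps=200, miss_prob=0.3, seed=0):
    """Return {method: (mean bias of imputed values, mean error in remission rate)}."""
    rng = np.random.default_rng(seed)
    methods = {"LOCF": locf, "NAO": nao}
    bias = {m: [] for m in methods}
    rem_err = {m: [] for m in methods}
    for _ in range(n_reps):
        truth = simulate_das28(n_patients, rng)
        mask = np.zeros(truth.shape, dtype=bool)
        # Delete follow-up values completely at random; baseline stays observed.
        mask[:, 1:] = rng.random((n_patients, VISITS.size - 1)) < miss_prob
        holed = np.where(mask, np.nan, truth)
        true_rem = np.mean(truth[:, 1:] < DAS28_REMISSION)
        for name, impute in methods.items():
            filled = np.apply_along_axis(impute, 1, holed)
            bias[name].append(np.mean(filled[mask] - truth[mask]))
            rem_err[name].append(np.mean(filled[:, 1:] < DAS28_REMISSION) - true_rem)
    return {m: (float(np.mean(bias[m])), float(np.mean(rem_err[m]))) for m in methods}
```

Calling `run_simulation()` returns, for each method, the mean difference between imputed and true values and the mean difference between imputed and true follow-up remission rates. On these synthetic improving trajectories, LOCF carries higher earlier values forward, so it tends to overestimate disease activity and underestimate remission; the magnitudes depend entirely on the assumed data-generating model and say nothing about the register results reported above.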

Publication
Annals of the Rheumatic Diseases, 78, page 1386