Using systematic data categorisation to quantify the types of data collected in clinical trials
Brighton, UK

Speaker: Gordon Fernie

This talk was given at the International Clinical Trials Methodology Conference (ICTMC), Brighton, UK, 6th – 9th October 2019, held at the Hilton Brighton Metropole.

Parallel Session 7A – Improving Trial Performance


Data collection consumes a large proportion of trial resources. Each data item requires time and effort for collection, processing and quality control. In general, more data means a heavier burden for trial staff and participants, and a higher cost for the trial. Data are generally collected for three broad reasons:

  • To answer the main research question (a primary outcome is specified and drives sample size calculations).
  • To supplement the primary outcome with secondary outcomes.
  • To monitor safety, maintain quality and meet regulatory and data management needs.

Here we report the results of a collaborative Trial Forge project which measured the proportion of data fitting these three broad categories, across 18 trials run from 5 institutions in Ireland and the UK.


We developed a standard operating procedure to categorise data. We categorised all variables collected on trial data collection forms from 18 mainly publicly funded randomised controlled trials, including clinical trials of an investigational medicinal product and surgical trials. Categorisation was done independently in pairs: one person with in-depth knowledge of the trial, the other independent of the trial. Disagreements were resolved through reference to the trial protocol and discussion, with the project team consulted if necessary.


Primary outcome data accounted for a mean of 11.2% (median: 5%) of all data items collected. Secondary outcomes constituted a mean of 42.5% (median: 39.9%) of data items. Non-outcome data represented a mean of 36.5% (median: 32.4%) of data items collected.
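
As an illustration of how summary figures of this kind can be derived, the minimal sketch below computes the percentage of data items per category within each trial, then takes the mean and median of those percentages across trials. It is not the project's actual procedure or code: the trial names, item counts, category labels and the `category_proportions` helper are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical category labels for the three broad reasons data are collected.
CATEGORIES = ("primary", "secondary", "non_outcome")


def category_proportions(item_categories):
    """Return the percentage of one trial's data items that fall in each category."""
    counts = Counter(item_categories)
    total = sum(counts.values())
    return {c: 100 * counts[c] / total for c in CATEGORIES}


# Made-up example data: one category label per data item on a trial's forms.
trials = {
    "trial_A": ["primary"] * 2 + ["secondary"] * 10 + ["non_outcome"] * 8,
    "trial_B": ["primary"] * 1 + ["secondary"] * 9 + ["non_outcome"] * 10,
    "trial_C": ["primary"] * 6 + ["secondary"] * 8 + ["non_outcome"] * 6,
}

# Step 1: percentages within each trial.
per_trial = {name: category_proportions(items) for name, items in trials.items()}

# Step 2: mean and median of those percentages across trials.
summary = {
    c: {
        "mean": mean(p[c] for p in per_trial.values()),
        "median": median(p[c] for p in per_trial.values()),
    }
    for c in CATEGORIES
}

for category, stats in summary.items():
    print(f"{category}: mean {stats['mean']:.1f}%, median {stats['median']:.1f}%")
```

In this made-up data one trial devotes far more of its items to the primary outcome than the others, so the mean for that category sits above the median. A gap between mean and median, as in the primary outcome figures reported above, is what such skew across trials would produce.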


Our study highlights that the proportion of data collected to answer the main research question is small in comparison with the other data collected, and that much of the remainder is non-outcome data. We discuss the implications, including whether such data collection is excessive or has detrimental effects on a trial.
