<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">OJEpi</journal-id><journal-title-group><journal-title>Open Journal of Epidemiology</journal-title></journal-title-group><issn pub-type="epub">2165-7459</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/ojepi.2015.53020</article-id><article-id pub-id-type="publisher-id">OJEpi-57821</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Medicine&amp;Healthcare</subject></subj-group></article-categories><title-group><article-title>
 
 
  Choosing a Method to Reduce Selection Bias: A Tool for Researchers
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Keeble</surname><given-names>Claire</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Law</surname><given-names>Graham Richard</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Barber</surname><given-names>Stuart</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Baxter</surname><given-names>Paul D.</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib></contrib-group><aff id="aff2"><addr-line>Department of Statistics, University of Leeds, Leeds, UK</addr-line></aff><aff id="aff1"><addr-line>Division of Epidemiology and Biostatistics, University of Leeds, Leeds, UK</addr-line></aff><author-notes><corresp id="cor1">* E-mail: <email>c.m.keeble@leeds.ac.uk</email> (CK)</corresp></author-notes><pub-date pub-type="epub"><day>07</day><month>07</month><year>2015</year></pub-date><volume>05</volume><issue>03</issue><fpage>155</fpage><lpage>162</lpage><history><date date-type="received"><day>8</day>	<month>May</month>	<year>2015</year></date><date date-type="rev-recd"><day>accepted</day>	<month>6</month>	<year>July</year>	</date><date date-type="accepted"><day>9</day>	<month>July</month>	<year>2015</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). 
http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Selection bias is well known to affect surveys and epidemiological studies. Numerous methods have been proposed to reduce its effects, so many that researchers may be unclear which method is most suitable for their study; the wide choice may even deter some researchers, for fear of choosing a sub-optimal approach. We propose a straightforward tool to inform researchers of the most promising methods available to reduce selection bias and to assist the search for an appropriate method given their study design and details. We demonstrate the tool using three examples where selection bias may occur; the tool quickly eliminates inappropriate methods and guides the researcher towards those to consider implementing. If more studies consider selection bias and adopt methods to reduce it, valuable time and resources will be saved, which should lead to more focused research towards disease prevention or cure.
 
</p></abstract><kwd-group><kwd>Selection Bias</kwd><kwd> Participation Bias</kwd><kwd> Non-Response</kwd><kwd> Methodology</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Selection bias is known to affect health surveys and epidemiological studies [<xref ref-type="bibr" rid="scirp.57821-ref1">1</xref>] and can cause results from different studies on the same area of research to disagree or conclude contradictory findings [<xref ref-type="bibr" rid="scirp.57821-ref2">2</xref>] . However even in recent studies, selection bias is still sometimes ignored or dismissed [<xref ref-type="bibr" rid="scirp.57821-ref3">3</xref>] . If selection bias can be reduced across studies and lead to more consistent findings, time and resources could be saved since fewer repeated studies would be required. These savings could be used to develop more focused areas of research, contributing to increased knowledge.</p><p>With many selection bias reducing methods to choose from, researchers may find the process daunting and possibly be deterred from implementing a method, for fear of choosing an unsuitable approach. The implementation of a method to reduce selection bias may also be viewed by researchers as an undesirable feature of their study, which could lead to criticisms of their study design and data collection, or potentially reduced chances of publication. There may also be time constraints, whereby study results need to be presented or published within a specific time frame, or a funded piece of research completed. In these instances, research into selection bias and methods to reduce it may not be prioritized. This work aims to provide reassurance to researchers that applying a method to reduce selection bias is a positive aspect of their study which should be encouraged whenever selection bias is suspected. Consideration of selection bias as a possibility should be routine. 
An appropriate method should be applied and used as either a sensitivity analysis to reassure readers of the study results, or to produce findings with reduced bias. We aim to draw attention to the available methods to reduce selection bias and provide sources for further reading. We intend to give guidance for selecting a method, structured in such a way that it is applicable for any study or survey potentially affected by selection bias.</p><p>Various methods, which will be discussed here, have been suggested to reduce selection bias, each with their own requirements and assumptions. Some methods require there to be additional data available external to the study [<xref ref-type="bibr" rid="scirp.57821-ref4">4</xref>] - [<xref ref-type="bibr" rid="scirp.57821-ref7">7</xref>] , some require data regarding the non-participants [<xref ref-type="bibr" rid="scirp.57821-ref8">8</xref>] , and some assume that the variable associated with selection is known and measured [<xref ref-type="bibr" rid="scirp.57821-ref9">9</xref>] - [<xref ref-type="bibr" rid="scirp.57821-ref11">11</xref>] . Each method will be briefly introduced and sources given for further reading. A tool in the form of a straightforward flowchart is also provided to aid the selection of an appropriate method, depending upon the details of a study and any additional information available. Three examples are presented which demonstrate the flowchart. Exploration into the suggested methods can then be conducted, allowing the researcher to select a method to reduce selection bias more easily. We hope this tool encourages the use of such methods and consequently leads to results less affected by selection bias.</p></sec><sec id="s2"><title>2. 
Methods to Reduce Selection Bias</title><p><xref ref-type="table" rid="table1">Table 1</xref> summarizes the main methods used in the literature to reduce selection bias [<xref ref-type="bibr" rid="scirp.57821-ref3">3</xref>] which will be included in this guidance, and gives suggestions for further reading through original articles, examples or comprehensive summaries. These methods have similar themes, such as using the variable associated with selection in the analysis, weighting responses or predicting the effects of the bias. Additional, less frequently used methods are of course applicable, as are any new methods which are not yet widely used, but this tool is designed to be a starting point which can be developed through time and which should be useful for most types of research affected by selection bias.</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Current key methods used to reduce selection bias</title></caption><table><thead><tr><th align="center" valign="middle" >Method</th><th align="center" valign="middle" >Brief description</th></tr></thead><tbody>
<tr><td align="center" valign="middle" >Adjust for Selection Bias [<xref ref-type="bibr" rid="scirp.57821-ref9">9</xref>] [<xref ref-type="bibr" rid="scirp.57821-ref12">12</xref>]</td><td align="center" valign="middle" >Include the variable associated with selection in the analysis, to reduce selection bias in a similar way to confounding [<xref ref-type="bibr" rid="scirp.57821-ref13">13</xref>].</td></tr>
<tr><td align="center" valign="middle" >Bias Breaking [<xref ref-type="bibr" rid="scirp.57821-ref5">5</xref>]</td><td align="center" valign="middle" >A method which produces bias-adjusted estimates for the odds ratios in case control studies.</td></tr>
<tr><td align="center" valign="middle" >Imputation [<xref ref-type="bibr" rid="scirp.57821-ref8">8</xref>]</td><td align="center" valign="middle" >Often multiple imputation; replace missing values with reasonable estimates using the collected data.</td></tr>
<tr><td align="center" valign="middle" >Population Data [<xref ref-type="bibr" rid="scirp.57821-ref7">7</xref>]</td><td align="center" valign="middle" >Use population data in place of control data in a case control study.</td></tr>
<tr><td align="center" valign="middle" >Post-Stratification [<xref ref-type="bibr" rid="scirp.57821-ref14">14</xref>]</td><td align="center" valign="middle" >Classify unmatched samples of cases and controls based on their values on one or more of the variables in the study. Similar to stratified sampling or frequency matching.</td></tr>
<tr><td align="center" valign="middle" >Predict the Bias [<xref ref-type="bibr" rid="scirp.57821-ref15">15</xref>] - [<xref ref-type="bibr" rid="scirp.57821-ref17">17</xref>]</td><td align="center" valign="middle" >Use information from non-participants to try to predict the amount of bias present.</td></tr>
<tr><td align="center" valign="middle" >Propensity Score [<xref ref-type="bibr" rid="scirp.57821-ref10">10</xref>]</td><td align="center" valign="middle" >Can be used to match cases and controls, or as an additional covariate during analysis.</td></tr>
<tr><td align="center" valign="middle" >Sensitivity Analysis [<xref ref-type="bibr" rid="scirp.57821-ref6">6</xref>] [<xref ref-type="bibr" rid="scirp.57821-ref12">12</xref>]</td><td align="center" valign="middle" >A method for estimating the direction and magnitude of the bias.</td></tr>
<tr><td align="center" valign="middle" >Stratification [<xref ref-type="bibr" rid="scirp.57821-ref11">11</xref>]</td><td align="center" valign="middle" >Calculate estimates conditional on at least one other variable, which can lead to unbiased estimates within strata.</td></tr>
<tr><td align="center" valign="middle" >Weighting [<xref ref-type="bibr" rid="scirp.57821-ref4">4</xref>]</td><td align="center" valign="middle" >Usually inverse probability weighting; use external data to assign each subject a weight which is the inverse of their probability of selection, to allow them to represent non-participants.</td></tr>
</tbody></table></table-wrap><p><xref ref-type="table" rid="table2">Table 2</xref> gives the data requirements of each method, including whether it requires the variable associated with selection to be known and recorded, whether population data external to the study are required, or whether data regarding the non-participants who declined the study are needed. Where more than one category has been ticked, this indicates the method can be adapted for use when an alternative source for the required data is available. For the variable associated with selection to be collected, it is assumed the variable is known and can be recorded during the study. There may be instances where this variable is unknown, cannot be collected, or is impractical to record. This includes data which are expensive to collect or sensitive, and cases where the variable has not been identified. However, there may be instances where a proxy may be used instead. Population data are information external to the study, for example from a database or alternative records. Sources may include census data or hospital registries. These data are assumed to be unbiased and to represent the entire population of interest. Non-participant data are basic characteristics recorded from those who were unwilling to participate. These are usually data from the subjects themselves, but may also be from external sources similar to those used to collect the population data.</p><p>Although the three data categories used in <xref ref-type="table" rid="table2">Table 2</xref> are sourced from different places (the participants, the population and non-participants respectively), there are relationships between them. 
For example, if the original potential participants are representative of the population of interest, and relevant information is known for all non-participants, then the non-participant data in conjunction with the participant data could be used to approximate the population data. Therefore under certain circumstances it may be possible to use a different column from <xref ref-type="table" rid="table2">Table 2</xref> for the data source, other than the one(s) ticked. <xref ref-type="table" rid="table2">Table 2</xref> and the consequent tool can be interpreted as a generalization or guide, which can be adapted by the researcher if these conditions are met.</p><p>Although each of the methods in <xref ref-type="table" rid="table1">Table 1</xref> is designed to reduce selection bias, they do so using different techniques and assumptions. Therefore, a method which may be optimal for one study may not be suitable for another. Some are also aimed at particular study designs, for example two were developed specifically for case control studies [<xref ref-type="bibr" rid="scirp.57821-ref5">5</xref>] [<xref ref-type="bibr" rid="scirp.57821-ref7">7</xref>] .</p><p>Several of the methods in <xref ref-type="table" rid="table1">Table 1</xref> have been developed or derived from one another. For example, the bias-breaking method [<xref ref-type="bibr" rid="scirp.57821-ref5">5</xref>] is a form of post-stratification, which is a type of stratification, and the propensity score is derived from stratification. However, their suitability as a method to reduce selection bias differs between studies. There are also similarities amongst some of the methods. For example, predicting the amount of bias present is similar to a sensitivity analysis, and several of the methods also began in survey literature [<xref ref-type="bibr" rid="scirp.57821-ref4">4</xref>] . 
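</p><p>The data requirements in <xref ref-type="table" rid="table2">Table 2</xref> can also be read as a simple lookup from the data a study has available to the methods worth shortlisting. The following is a minimal Python sketch of that reading, offered as an illustration rather than as the authors' tool. The names <monospace>TABLE_2</monospace> and <monospace>shortlist_methods</monospace> are hypothetical, and the sketch assumes that any one ticked data source is enough for a method to be shortlisted. It does not encode the first flowchart question (whether selection bias is plausible at all) or restrictions to particular study designs.</p>
<preformat>
```python
# Data sources ticked for each method in Table 2. Assumption: a method is
# shortlisted when at least one of its ticked sources is available.
SELECTION_VARIABLE = "selection variable"
POPULATION_DATA = "population data"
NON_PARTICIPANT_DATA = "non-participant data"

TABLE_2 = {
    "Adjust for Selection Bias": {SELECTION_VARIABLE},
    "Bias Breaking": {POPULATION_DATA},
    "Imputation": {NON_PARTICIPANT_DATA},
    "Population Data": {POPULATION_DATA},
    "Post-Stratification": {SELECTION_VARIABLE},
    "Predict the Bias": {SELECTION_VARIABLE, POPULATION_DATA, NON_PARTICIPANT_DATA},
    "Propensity Score": {SELECTION_VARIABLE},
    "Sensitivity Analysis": {SELECTION_VARIABLE, POPULATION_DATA},
    "Stratification": {SELECTION_VARIABLE},
    "Weighting": {POPULATION_DATA, NON_PARTICIPANT_DATA},
}


def shortlist_methods(available):
    """Return, in alphabetical order, the Table 2 methods for which at
    least one ticked data source appears in `available`."""
    available = set(available)
    return sorted(
        name
        for name, sources in TABLE_2.items()
        if sources.intersection(available)
    )


# Hypothetical study in which only external population data are available.
print(shortlist_methods({POPULATION_DATA}))
```
</preformat>
<p>For a study with only population data, this lookup returns Bias Breaking, Population Data, Predict the Bias, Sensitivity Analysis and Weighting, the same five methods taken forward in Example 3. Each shortlisted method still needs to be checked against its own assumptions and the study design.</p><p>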
<xref ref-type="fig" rid="fig1">Figure 1</xref> gives an example of a flowchart based on <xref ref-type="table" rid="table2">Table 2</xref> which could be used by researchers to shortlist potential selection bias reducing methods for further investigation. Researchers could extend this flowchart to meet their specific needs for the variables or datasets they encounter, or alternatively disciplines could form a subject-specific chart to which new methods could be added over time.</p><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> The required data for the methods summarised in <xref ref-type="table" rid="table1">Table 1</xref></title></caption><table><thead><tr><th align="center" valign="middle" ></th><th align="center" valign="middle" >Selection variable</th><th align="center" valign="middle" >Population data</th><th align="center" valign="middle" >Non-participant data</th></tr></thead><tbody>
<tr><td align="center" valign="middle" >Adjust for Selection Bias</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td></tr>
<tr><td align="center" valign="middle" >Bias Breaking</td><td align="center" valign="middle" ></td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" ></td></tr>
<tr><td align="center" valign="middle" >Imputation</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td><td align="center" valign="middle" >✓</td></tr>
<tr><td align="center" valign="middle" >Population Data</td><td align="center" valign="middle" ></td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" ></td></tr>
<tr><td align="center" valign="middle" >Post-Stratification</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td></tr>
<tr><td align="center" valign="middle" >Predict the Bias</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" >✓</td></tr>
<tr><td align="center" valign="middle" >Propensity Score</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td></tr>
<tr><td align="center" valign="middle" >Sensitivity Analysis</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" ></td></tr>
<tr><td align="center" valign="middle" >Stratification</td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td></tr>
<tr><td align="center" valign="middle" >Weighting</td><td align="center" valign="middle" ></td><td align="center" valign="middle" >✓</td><td align="center" valign="middle" >✓</td></tr>
</tbody></table></table-wrap></sec><sec id="s3"><title>3. Examples</title><p>Three examples of hypothetical studies follow which utilize the flowchart (<xref ref-type="fig" rid="fig1">Figure 1</xref>) to determine a suitable method to apply. The flowchart begins in the top-left corner, shown using a bold outline. To answer the first question in the flowchart, the requirements for selection bias to occur must be known. For selection bias to be present, there must be the exposure and outcome of interest, and these must both affect whether or not an individual is selected to participate or self-selects (participates) in the study. This selection variable must then be conditioned on, which it often will be since only those who have participated can be studied [<xref ref-type="bibr" rid="scirp.57821-ref1">1</xref>] .</p><p>Once the flowchart has provided a list of possible methods to explore further, it is the responsibility of the researcher to consider each in turn to see which method is most suitable for their particular study. 
The assumptions of each method must be considered, along with the details of the specific study.</p><sec id="s3_1"><title>3.1. Example 1</title><p>A randomized controlled trial (RCT) is conducted for a new hayfever tablet. Hayfever sufferers are recruited and randomly allocated to either the drug group or the placebo group. The new tablet produces some unexpected side effects and some participants in the drug arm suffer from fainting or severe vomiting. Half of the participants in the drug arm withdraw from the study, as they decide that their hayfever symptoms are preferable to the side effects. The flowchart can be used to see which methods for selection bias may be worth further consideration.</p><p>・ Is the study potentially affected by selection bias? The association of interest is from the new tablet, or treatment group, to the severity of the hayfever symptoms. For potential selection bias, both the treatment group and the hayfever symptoms need to influence selection into the study. The side effects from the tablet causing withdrawal from the study mean that the treatment group does affect inclusion in the analysis and hence ‘selection’. However, hayfever sufferers were randomly allocated to either the drug or placebo group, so the severity of hayfever symptoms was balanced between the two treatment groups and therefore the severity of the symptoms did not affect selection into the study. 
Since only the treatment group and not the severity of the symptoms affects selection, selection bias is not a problem here, and the results can be analyzed as usual without the need for a selection bias reducing method.</p><fig id="fig1"  position="float"><label><xref ref-type="fig" rid="fig1">Figure 1</xref></label><caption><title> A flowchart tool for researchers: Which methods are suitable to reduce selection bias</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-1890169x5.png"/></fig></sec><sec id="s3_2"><title>3.2. Example 2</title><p>A study is conducted investigating the association between coffee and migraines. Postal surveys are randomly sent to households in the United Kingdom (UK), with a return envelope enclosed. In addition to questions about migraines and coffee consumption, the survey also includes basic demographic data such as sex, age, general location, employment status etc. Those who drink coffee may be more likely to respond, to find out whether they are increasing their chances of migraines, whereas people who do not drink coffee may not be as interested. Those who suffer from migraines may also be more interested in the survey than those who do not. Previous studies have shown that older people are generally more likely to participate in surveys [<xref ref-type="bibr" rid="scirp.57821-ref18">18</xref>] - [<xref ref-type="bibr" rid="scirp.57821-ref22">22</xref>] . Let age and coffee consumption be positively correlated and let migraines be more common in older people. The flowchart can guide the researchers involved in the study towards any possible methods to reduce selection bias.</p><p>・ Is the study potentially affected by selection bias? Coffee is the exposure of interest and migraines are the outcome of interest. 
Since coffee drinkers and migraine sufferers are more likely to return the survey, selection is affected by both the exposure and outcome of interest, and therefore selection bias is possible.</p><p>・ Is the variable associated with selection recorded? Coffee consumption, migraine occurrences and age all affect survey returns, and are all recorded in the survey. However, only variables which are not the exposure or outcome of interest can be used in the analysis to reduce selection bias; age in this instance. Therefore, the following methods can be considered further:</p><p>- Adjust for the variable associated with selection: Age can be added to the analysis, for example as a variable in a regression model between coffee and migraines.</p><p>- Stratification: The analysis can be conducted within age strata, for example by analyzing in age groups of ten years, to reduce the effect of age on selection.</p><p>- Post-stratification: Migraine sufferers can be matched to non-sufferers by their age.</p><p>- Predict the bias: The analysis could be conducted with and without the age variable, to predict how selection bias is affected by age.</p><p>- Propensity score: Age can be incorporated into the analysis to match migraine sufferers to non-sufferers, or the propensity score can be calculated using all the variables and included in the analysis.</p><p>・ Therefore, all of these methods are possible.</p></sec><sec id="s3_3"><title>3.3. Example 3</title><p>A UK case control study is conducted which investigates the association between excessive alcohol consumption and brain tumors. Researchers therefore attempt to recruit cases who have brain tumors and controls who do not. The retrospective nature of the study design means that data are then collected on each participant regarding their alcohol consumption, in this instance over an extended previous period of time. 
The exposure of interest in the case control study is blinded to the participants and the interviewers used, to reduce the effects of other biases such as interviewer bias. The blinding here is in the form of an extended questionnaire, including questions regarding several possible exposures such as alcohol, smoking, mobile phone use, exercise routines and family history. This leads to some participants intentionally avoiding questions, such as those to which they have undesirable answers. For example, heavy smokers ignore the question regarding the number of cigarettes smoked daily, those who do not exercise often miss the question regarding the number of hours exercise completed per week, and frequent drinkers avoid the question about alcohol consumption.</p><p>Let the data from the questionnaire be available, along with a national database regarding the number of people with brain tumors in the UK. The Office for National Statistics (ONS) also records data for adult drinking habits [<xref ref-type="bibr" rid="scirp.57821-ref23">23</xref>] . The flowchart can be used to determine which methods may be suitable to reduce selection bias.</p><p>・ Is the study potentially affected by selection bias? It is well-documented that cases are often more likely to participate in a study than controls, since they have additional motivation to find a cure or an explanation for their condition [<xref ref-type="bibr" rid="scirp.57821-ref24">24</xref>] [<xref ref-type="bibr" rid="scirp.57821-ref25">25</xref>] . Therefore the outcome is affecting self-selection into the study. Next the exposure of interest, alcohol consumption, is being recorded only for those who are willing to declare their consumption levels; in this instance, those who consume amounts not deemed to be excessive. Therefore, inclusion in the study analysis depends upon the exposure level. 
Since only those who are willing to participate in the study and who answer the question regarding alcohol consumption are used to investigate the association between excessive alcohol consumption and brain tumors, participation is conditioned on and so selection bias is a possibility.</p><p>・ Is the variable associated with selection recorded? The variables associated with selection are the exposure and outcome themselves, hence methods which use the variable associated with selection in the analysis to reduce selection bias are not suitable here.</p><p>・ Are relevant population data available? The national database for brain tumors and ONS data for drinking habits are available. Therefore, the following methods can be considered further:</p><p>- Sensitivity analysis: The external population data could be used to estimate the magnitude and direction of bias, although unfortunately the drinking habits of those in the population with brain tumors are unknown. This method may be possible.</p><p>- Weighting: If there are no heavy-drinking participants who answer the question about alcohol consumption, then a weight cannot be applied to this category and the weighting method would be unsuitable.</p><p>- Bias-breaking: For this method, a variable must be identified which separates the exposure from the selection criteria, but unfortunately, the risk factor is the variable which is determining selection into the analysis and hence this method is not suitable.</p><p>- Predict the bias: Information from non-participants is not available as such, but could possibly be derived from the population data available in conjunction with the participant data. This method may be possible.</p><p>- Population data: The method requires there to be data regarding the population size, the number of cases and the number of exposed, all of which are recorded in the national database or in the ONS records. 
Therefore this method is a possible option.</p><p>These examples have shown how the flowchart can quickly eliminate potential methods and guide the researcher towards a subset of methods for further consideration.</p></sec></sec><sec id="s4"><title>4. Discussion</title><p>Selection bias can be problematic for surveys and a range of study designs [<xref ref-type="bibr" rid="scirp.57821-ref26">26</xref>] , but particularly those which are retrospective such as case control studies [<xref ref-type="bibr" rid="scirp.57821-ref2">2</xref>] , as seen in Example 3. Biased results can lead to incorrect findings and the unnecessary repetition of studies, wasting valuable time and resources which could instead be used to fund additional research into diseases or their cure.</p><p>Any form of bias can be viewed as a negative aspect of a study, but action should be taken to reduce as much bias as possible within the results of a study. This work aims to highlight the importance of bias reduction, specifically selection bias, and provide researchers with a summary of methods currently available to reduce selection bias, along with references for further reading.</p><p>A user-friendly flowchart tool has been provided, which can be adapted for particular research areas or depending upon the data resources available, to aid the selection of an appropriate method. The tool is not designed to identify one method to use, but instead guide the researcher to a subset of methods for further consideration. This could be viewed as a limitation, but contemplation of the requirements for each method is necessary. The optimal method depends upon specific details relating to an individual study and would require a complicated flowchart which would be more difficult to use. However, subject-specific flowcharts could be created. 
This tool is therefore a straightforward flowchart, applicable to a range of study designs, provoking consideration of selection bias while providing references for further reading. We hope demonstration of this versatile tool through examples and raised awareness results in more consideration of selection bias and consequently the implementation of appropriate methods.</p></sec><sec id="s5"><title>5. Conclusion</title><p>Bias reduction is an important part of any study and this work raises awareness of selection bias in particular. A straightforward flowchart, with summaries of the current methods to reduce selection bias, has been provided to guide researchers towards a suitable method, in the hope that more accurate results are generated from studies.</p></sec><sec id="s6"><title>Funding</title><p>Claire Keeble is funded by an MRC Capacity Building Studentship. Paul D Baxter, Stuart Barber and Graham Richard Law are funded by HEFCE. The funding sources had no involvement in the study design, in the collection, analysis and interpretation of data, in the writing of the report or the decision to submit the article for publication.</p></sec></body><back><ref-list><title>References</title><ref id="scirp.57821-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Hernan, M., Hernandez-Diaz, S. and Robins, J. (2004) A Structural Approach to Selection Bias. Epidemiology, 15, 615-625. http://dx.doi.org/10.1097/01.ede.0000135174.63482.43</mixed-citation></ref><ref id="scirp.57821-ref2"><label>2</label><mixed-citation publication-type="book" xlink:type="simple">Hennekens, C.H. and Buring, J.E. (1987) Screening. In: Mayrent, S.L., Ed., Epidemiology in Medicine, Little, Brown and Co., Boston, 327-345.</mixed-citation></ref><ref id="scirp.57821-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Keeble, C., Barber, S., Law, G. and Baxter, P. (2013) Participation Bias Assessment in Three High Impact Journals. 
Sage Open, 3, 1-5. http://dx.doi.org/10.1177/2158244013511260</mixed-citation></ref><ref id="scirp.57821-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Horvitz, D. and Thompson, D. (1952) A Generalization of Sampling without Replacement from a Finite Universe. Journal of the American Statistical Association, 47, 663-685.  
http://dx.doi.org/10.1080/01621459.1952.10483446</mixed-citation></ref><ref id="scirp.57821-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Geneletti, S., Richardson, S. and Best, N. (2009) Adjusting for Selection Bias in Retrospective, Case-Control Studies. Biostatistics, 10, 17-31. http://dx.doi.org/10.1093/biostatistics/kxn010</mixed-citation></ref><ref id="scirp.57821-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Geneletti, S., Mason, A. and Best, N. (2011) Adjusting for Selection Effects in Epidemiologic Studies: Why Sensitivity Analysis Is the Only “Solution”. Epidemiology, 22, 36-39. http://dx.doi.org/10.1097/EDE.0b013e3182003276</mixed-citation></ref><ref id="scirp.57821-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Keeble, C., Barber, S., Baxter, P., Parslow, R. and Law, G. (2014) Reducing Participation Bias in Case-Control Studies: Type 1 Diabetes in Children and Stroke in Adults. Open Journal of Epidemiology, 4, 129-134. http://dx.doi.org/10.4236/ojepi.2014.43018</mixed-citation></ref><ref id="scirp.57821-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Sterne, J., White, I., Carlin, J., Spratt, M., Royston, P., Kenward, M., et al. (2009) Multiple Imputation for Missing Data in Epidemiological and Clinical Research: Potential and Pitfalls. BMJ, 338, 2393-2397. http://dx.doi.org/10.1136/bmj.b2393</mixed-citation></ref><ref id="scirp.57821-ref9"><label>9</label><mixed-citation publication-type="book" xlink:type="simple">Breslow, N. and Day, N. (1980) Chapter 3: General Considerations for the Analysis of Case-Control Studies. In: Breslow, N.E. and Day, N.E., Eds., Statistical Methods in Cancer Research, IARC Scientific Publications, 84-119.</mixed-citation></ref><ref id="scirp.57821-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Rosenbaum, P. and Rubin, D. 
(1983) The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, 70, 41-55. http://dx.doi.org/10.1093/biomet/70.1.41</mixed-citation></ref><ref id="scirp.57821-ref11"><label>11</label><mixed-citation publication-type="journal" xlink:type="simple"><name name-style="western"><surname>Sarndal</surname><given-names> C.E. </given-names></name>,<etal>et al</etal>. (<year>1992</year>)<article-title>Methods for Estimating the Precision of Survey Estimates when Imputation Has Been Used</article-title><source> Survey Methodology</source><volume> 18</volume>,<fpage> 241</fpage>-<lpage>252</lpage>.</mixed-citation></ref><ref id="scirp.57821-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Kleinbaum, D., Morgenstern, H. and Kupper, L. (1981) Selection Bias in Epidemiological Studies. American Journal of Epidemiology, 113, 452-463.</mixed-citation></ref><ref id="scirp.57821-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Hosmer Jr., D., Lemeshow, S. and Sturdivant, R. (2013) Applied Logistic Regression. John Wiley &amp; Sons, Hoboken. http://dx.doi.org/10.1002/9781118548387</mixed-citation></ref><ref id="scirp.57821-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Schlesselman, J. (1982) Case-Control Studies: Design, Conduct, Analysis. Oxford University Press, New York.</mixed-citation></ref><ref id="scirp.57821-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Hatch, E., Kleinerman, R., Linet, M., Tarone, R., Kaune, W., Auvinen, A., et al. (2000) Do Confounding or Selection Factors of Residential Wiring Codes and Magnetic Fields Distort Findings of Electromagnetic Field Studies? Epidemiology, 11, 189-198. 
http://dx.doi.org/10.1097/00001648-200003000-00019</mixed-citation></ref><ref id="scirp.57821-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Madigan, M., Troisi, R., Potischman, N., Brogan, D., Gammon, M., Malone, K., et al. (2000) Characteristics of Respondents and Non-Respondents from a Case-Control Study of Breast Cancer in Younger Women. International Journal of Epidemiology, 29, 793-798. http://dx.doi.org/10.1093/ije/29.5.793</mixed-citation></ref><ref id="scirp.57821-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Wrensch, M. (2000) Are Prior Head Injuries or Diagnostic X-Rays Associated with Glioma in Adults? The Effects of Control Selection Bias. Neuroepidemiology, 19, 234-244. http://dx.doi.org/10.1159/000026261</mixed-citation></ref><ref id="scirp.57821-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Hara, M., Higaki, Y., Imaizumi, T., Taguchi, N., Nakamura, K., Nanri, H., et al. (2010) Factors Influencing Participation Rate in a Baseline Survey of a Genetic Cohort in Japan. Journal of Epidemiology, 20, 40-45. http://dx.doi.org/10.2188/jea.JE20090062</mixed-citation></ref><ref id="scirp.57821-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Perez, D., Nie, J., Ardern, C., Radhu, N. and Ritvo, P. (2013) Impact of Participant Incentives and Direct and Snowball Sampling on Survey Response Rate in an Ethnically Diverse Community: Results from a Pilot Study of Physical Activity and the Built Environment. Journal of Immigrant and Minority Health, 15, 207-214. http://dx.doi.org/10.1007/s10903-011-9525-y</mixed-citation></ref><ref id="scirp.57821-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Koloski, N., Jones, M., Eslick, G. and Talley, N. (2013) Predictors of Response Rates to a Long Term Follow-Up Mail out Survey. PLoS ONE, 8, e79179. 
http://dx.doi.org/10.1371/journal.pone.0079179</mixed-citation></ref><ref id="scirp.57821-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">McLean, S., Paxton, S., Massey, R., Mond, J., Rodgers, B. and Hay, P. (2014) Prenotification but Not Envelope Teaser Increased Response Rates in a Bulimia Nervosa Mental Health Literacy Survey: A Randomized Controlled Trial. Journal of Clinical Epidemiology, 67, 870-876. http://dx.doi.org/10.1016/j.jclinepi.2013.10.013</mixed-citation></ref><ref id="scirp.57821-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Nota, S., Strooker, J. and Ring, D. (2014) Differences in Response Rates between Mail, E-Mail, and Telephone Follow-Up in Hand Surgery Research. Hand, 9, 504-510. http://dx.doi.org/10.1007/s11552-014-9618-x</mixed-citation></ref><ref id="scirp.57821-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Office for National Statistics (2013) Adult Drinking Habits in Great Britain. http://www.ons.gov.uk/ons/rel/ghs/opinions-and-lifestyle-survey/adult-drinking-habits-in-great-britain–2013/stb-drinking-2013.html</mixed-citation></ref><ref id="scirp.57821-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Galea, S. and Tracy, M. (2007) Participation Rates in Epidemiologic Studies. Annals of Epidemiology, 17, 643-653. http://dx.doi.org/10.1016/j.annepidem.2007.03.013</mixed-citation></ref><ref id="scirp.57821-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Li, Y., Wang, W., Wu, Q., van Velthoven, M., Chen, L., Du, X., et al. (2015) Increasing the Response Rate of Text Messaging Data Collection: A Delayed Randomized Controlled Trial. Journal of the American Medical Informatics Association, 22, 51-64.</mixed-citation></ref><ref id="scirp.57821-ref26"><label>26</label><mixed-citation publication-type="other" xlink:type="simple">Thadhani, R. and Tonelli, M. 
(2006) Cohort Studies: Marching Forward. Clinical Journal of the American Society of Nephrology, 1, 1117-1123. http://dx.doi.org/10.2215/CJN.00080106</mixed-citation></ref></ref-list></back></article>