Quality appraisal of clinical practice guidelines on physical restraints in ICU: a systematic review
Patients in the intensive care unit (ICU) often require invasive tubes and catheters as part of their treatment, such as tracheal tubes, central venous catheters (CVCs), and urinary catheters (1). Because they are believed to limit patients’ movement (2), physical restraints have long been used as a protective measure in the ICU to prevent accidental removal of life-support catheters (3). As a result, the physical restraint rate is 23.4 times higher in the ICU than in general wards (4). The rate of physical restraint use varies between countries. In the United States, a study that included 68 ICUs reported a physical restraint rate of 33% (5,6). In China, where several scholars have conducted local surveys over the past decade, the rate ranged from 35.1% to 77.2% (7-10). Physical restraints are widely used in ICUs around the world (11).
With the establishment of the “biopsychosocial medical model”, patient care has gradually shifted from a “disease-centered” approach to holistic care centered on patients’ physical and psychological health (12). Therefore, when using physical restraints on ICU patients, medical staff should consider their physical and psychological effects. Unfortunately, studies have reported that physical restraints may lead to delirium (13) and neurovascular complications (e.g., redness, impaired limb movement, oedema, and colour changes) (14), and may worsen agitation (15). Moreover, although the main reason for using physical restraints in the ICU is to prevent extubation (16), some studies (17-19) have pointed out that physical restraints can increase patients’ anxiety and irritability and thereby increase, rather than reduce, the risk of unplanned extubation and falls, so they cannot ensure patient safety.
High-quality clinical practice guidelines (CPGs) can enhance the quality of health care by presenting recommendations to decision makers (20). Nurses are the primary decision makers in applying physical restraints (3,21). Reducing the use of physical restraints therefore requires changing nurses’ perceptions and increasing their knowledge of the proper use of physical restraints, guided by high-quality CPGs (22). Although physical restraints CPGs have been published by different organizations, their quality remains unknown. Our study aims to evaluate the quality of physical restraints guidelines from methodological and reporting perspectives and to inform future practice improvement and guideline development. We conducted the systematic review following the PRISMA reporting checklist (available at https://apm.amegroups.com/article/view/10.21037/apm-21-2851/rc) (23).
Data sources and search strategy
On Nov 21, 2021, two reviewers (RL, YL) independently searched MEDLINE (via PubMed), Web of Science, Embase, Cumulative Index to Nursing and Allied Health Literature (CINAHL), China National Knowledge Infrastructure (CNKI), and Wanfang Data from inception, using search terms such as “physical restraint”, “intensive care unit”, “critical care”, “acute care”, and “practice guideline”. The details of the search strategies are shown in Appendix 1. We also searched Google and relevant websites (GIN, NICE, SIGN, RNAO, AHRQ, AACN) using “physical restraint” as the search term.
Inclusion and exclusion criteria
We included guidelines that met the following criteria: (I) the article met the definition of a guideline proposed by the Institute of Medicine (IOM) in 1990 or 2011 (24,25). When evidence quality is low or very low, guideline panels may label their documents as consensus statements (26), so in this study we also included evidence-based statements; (II) the guideline focused on physical restraint; (III) the guideline covered critical care or acute care settings (general residential care settings were excluded); (IV) the guideline focused on adult patients; (V) the guideline was published between January 2001 and November 2021 in Chinese or English.
Our exclusion criteria were: (I) articles that focused on chemical restraints or in which recommendations on physical restraints could not be clearly distinguished; (II) previous versions of a guideline, if the same group/organization had published more than one version; (III) translated versions, brief versions, or interpretations of guidelines.
Study selection
Two reviewers (RL and YL) independently screened the titles and abstracts against the inclusion and exclusion criteria using the bibliographic software EndNote and then screened the full texts. Disagreements were resolved by consensus and discussion with a third reviewer (XJ). Prior to the screening, a pilot test was performed until agreement on the screening process was reached.
Data extraction
Data were extracted by two reviewers (RL and YL) using a standardized form covering: organization, publication date, country/region, systematic literature search, recommendation formulation method, evidence quality grading, conflicts of interest, and funding.
Quality appraisal of guidelines
Two reviewers (RL and YL) independently used the Appraisal of Guidelines for Research and Evaluation II (AGREE II) tool (27-30) to assess the methodological quality of the included guidelines. According to the AGREE II manual (27), the tool contains 23 items in six domains. Each item was rated on a seven-point scale (from 1, strongly disagree, to 7, strongly agree). The domain quality score (0 to 100%) was calculated according to the formula provided in the AGREE II manual: the item scores of the two reviewers were aggregated and the total was scaled as a percentage of the highest possible score for that domain, which in effect averages the scores of the two reviewers. In addition, the mean of the six standardized domain scores was used as the overall guideline assessment and to classify guidelines (31,32): “high quality”, score >80%; “moderate quality”, score 50–80%; “low quality”, score <50%.
Two reviewers (RL and YL) independently used the Reporting Items for Practice Guidelines in Healthcare (RIGHT) checklist (33) to assess the reporting quality of the included guidelines. The checklist contains 22 items in seven domains. Items rated as “reported” were scored as 1, and items rated as “not reported” were scored as 0. The percentage of the total possible score was calculated to obtain the overall reporting assessment, which was classified as: “well-reported”, score >80%; “moderate-reported”, score 50–80%; “low-reported”, score <50%.
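The scaled domain score described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors’ own calculation: the function names and the example ratings are hypothetical, and the formula used is the standard AGREE II scaling, (obtained score − minimum possible score) / (maximum possible score − minimum possible score). The domain scores for guideline 5 are taken from Table 2.

```python
from typing import Sequence

MIN_RATING, MAX_RATING = 1, 7  # AGREE II seven-point item scale


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Return the scaled AGREE II score (in percent) for one domain.

    `ratings` holds one inner sequence per reviewer, containing that
    reviewer's item ratings (1 to 7) for the domain.
    """
    n_reviewers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every reviewer must rate the same items")
    obtained = sum(sum(r) for r in ratings)
    lowest = MIN_RATING * n_items * n_reviewers
    highest = MAX_RATING * n_items * n_reviewers
    return 100 * (obtained - lowest) / (highest - lowest)


def classify(score: float) -> str:
    """Apply the thresholds used in this review (>80, 50-80, <50)."""
    if score > 80:
        return "high quality"
    if score >= 50:
        return "moderate quality"
    return "low quality"


# Hypothetical ratings: two reviewers, three items in one domain.
example_domain = [[5, 4, 4], [4, 4, 3]]
print(scaled_domain_score(example_domain))  # (24 - 6) / (42 - 6) = 50.0

# Overall score = mean of the six scaled domain scores (guideline 5, Table 2).
rnao_domains = [72.22, 63.89, 84.38, 94.44, 68.75, 33.33]
overall = sum(rnao_domains) / len(rnao_domains)
print(round(overall, 2), classify(overall))  # 69.5 moderate quality
```

Summing over reviewers before scaling gives the same result as scaling each reviewer’s total and then averaging, because every reviewer rates the same number of items.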
Statistical analysis
Intra-class correlation coefficients (ICCs) for the two reviewers’ AGREE II scores and RIGHT reporting scores were calculated in SPSS 25.0 to test inter-reviewer agreement. An ICC >0.75 indicates good reliability (34).
A total of 635 articles were retrieved, and 74 duplicate articles were excluded; a further 542 articles were excluded after reading the titles and abstracts. After full-text screening, 13 articles were excluded and six guidelines were included (35-40). The selection process is shown in Figure 1.
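For readers without SPSS, the agreement statistic can be approximated with a short Python sketch. The text does not state which ICC model was selected in SPSS, so the choice below of ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, and the function name and ratings are hypothetical.

```python
from typing import Sequence


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores[i][j]` is the rating given to rated item i by rater j.
    Assumed model; the review does not report which ICC form was used.
    """
    n = len(scores)     # number of rated items
    k = len(scores[0])  # number of raters
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical item ratings (1 to 7) from two reviewers.
ratings = [(5, 5), (4, 4), (6, 5), (2, 3), (7, 7), (3, 3)]
icc = icc_2_1(ratings)
print(f"ICC(2,1) = {icc:.3f}; good reliability: {icc > 0.75}")
```

The function returns a point estimate only; the 95% confidence intervals reported alongside the ICCs would need the F-distribution and are left out of this sketch.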
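The screening counts can be checked with simple arithmetic. The short Python sketch below uses only the numbers reported above; the variable names are illustrative.

```python
# Counts reported in the selection process.
retrieved = 635
duplicates = 74
excluded_title_abstract = 542
excluded_full_text = 13

after_deduplication = retrieved - duplicates                        # 561 records screened
full_text_assessed = after_deduplication - excluded_title_abstract  # 19 full texts read
included = full_text_assessed - excluded_full_text                  # 6 guidelines included

print(after_deduplication, full_text_assessed, included)  # 561 19 6
```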

The ICC values for two reviewers were 0.820 (95% CI: 0.757–0.868) in the AGREE II and 0.837 (95% CI: 0.792–0.873) in the RIGHT checklist, both of which indicate good reliability.
Characteristics of guidelines
The six guidelines were published in the United States (37,38,40), the United Kingdom (35,36), and Canada (39). Only two guidelines addressed updating: the guideline developed by The University of Iowa (38) was published in 2012 and updated in 2016, and the guideline developed by the Intensive Care Society (35) was published in 2021 and is scheduled to be updated in 2024. None of the remaining guidelines has been updated or has a stated plan for updating. Five guidelines (36-40) mentioned systematic literature retrieval, but only one (39) described the guideline development process and literature search strategies in detail. Three guidelines (37,39,40) formulated recommendations through expert consensus and used an evidence grading system (Cochrane, SIGN, and a self-defined system, respectively). The remaining guidelines did not mention a recommendation formulation method or an evidence grading system. One guideline (38) reported no relevant conflicts of interest for the developers, and two guidelines (39,40) reported funding sources. Specific information is shown in Table 1.
Table 1
No. of guideline | Developing organization | Publication date | Country/region | Systematic literature retrieval | Recommendation formulation method | Evidence quality grading | Conflicts | Funding |
1 (35) | Intensive Care Society | Mar 2021 | UK | Not reported | Not reported | No | Not reported | Not reported |
2 (36) | BACCN | Sep–Oct 2004 | UK | Yes | Not reported | No | Not reported | Not reported |
3 (37) | ACCCM | Nov 2003 | USA | Yes | Expert consensus | Cochrane | Not reported | Not reported |
4 (38) | The University of Iowa | Feb 2016 | USA | Yes | Not reported | No | No | Not reported |
5 (39) | RNAO | Feb 2012 | Canada | Yes | Expert consensus | SIGN | Not reported | Ontario Ministry of Health and Long-Term Care |
6 (40) | HIGN | 2012 | USA | Yes | Expert consensus | Self-defined | Not reported | The Hartford Institute for Geriatric Nursing, New York University College of Nursing |
BACCN, British Association of Critical Care Nurses; ACCCM, American College of Critical Care Medicine Task Force; RNAO, Registered Nurses’ Association of Ontario; HIGN, Hartford Institute for Geriatric Nursing.
Quality of included guidelines
Methodological quality
The mean AGREE II score of the included guidelines was 41.92%, with a range of 31.89–69.50%. No guideline was rated “high quality”. Only one guideline, developed by the Registered Nurses’ Association of Ontario (RNAO), was rated “moderate quality”, with a mean AGREE II score of 69.50%; the remaining five guidelines were of “low quality”. The overall AGREE II scores (average of six domains) for each guideline are shown in Figure 2. “Clarity of Presentation” had the highest mean domain score (69.91%), and “Applicability” had the lowest (21.53%). The AGREE II scores for each domain are shown in Figure 3, and the scores of all guidelines in the six domains are shown in Table 2.

Table 2
No. of guideline | Scope and purpose | Stakeholder involvement | Rigor of development | Clarity of presentation | Applicability | Editorial independence | Average |
1 (35) | 50.00% | 41.67% | 5.21% | 63.89% | 35.42% | 0.00% | 32.70% |
2 (36) | 44.44% | 41.67% | 26.04% | 75.00% | 4.17% | 0.00% | 31.89% |
3 (37) | 55.56% | 41.67% | 41.67% | 75.00% | 14.58% | 0.00% | 38.08% |
4 (38) | 50.00% | 36.11% | 43.75% | 52.78% | 2.08% | 41.67% | 37.73% |
5 (39) | 72.22% | 63.89% | 84.38% | 94.44% | 68.75% | 33.33% | 69.50% |
6 (40) | 72.22% | 38.89% | 17.71% | 58.33% | 4.17% | 58.33% | 41.61% |
Average | 57.41% | 43.98% | 36.46% | 69.91% | 21.53% | 22.22% | 41.92% |
AGREE II, Appraisal of Guidelines for Research and Evaluation II.
Reporting quality
The mean reporting rate for the guidelines was 41.0%, with a range of 24.7–77.7%. Only one guideline, developed by RNAO, was “moderate-reported”, with a mean reporting rate of 77.7%; the remaining five guidelines were all rated as “low-reported”. The overall reporting rates of the six guidelines are shown in Figure 2. Among the seven domains, “Information” had the highest reporting rate (66.7%), while “Funding and conflict-of-interest statements and management” had the lowest (16.7%). Among the 10 key items, the reporting rates of item 14c “other consideration”, item 18b “role of funder”, and item 19b “management of conflict of interest” were 0%, and the highest reporting rate was 50%, for item 12 “evidence quality assessment approach”. The reporting rates for each domain are shown in Figure 4, and the reporting rates of all guidelines in the seven domains are shown in Table 3.

Table 3
No. of guideline | Information | Background | Evidence | Recommendations | Review and quality assurance | Funding and conflict-of-interest statements and management | Other information of the guideline | Average |
1 (35) | 83.3% | 75.0% | 0.0% | 14.3% | 0.0% | 0.0% | 0.0% | 24.7% |
2 (36) | 83.3% | 37.5% | 0.0% | 14.3% | 50.0% | 0.0% | 66.7% | 36.0% |
3 (37) | 66.7% | 62.5% | 20.0% | 42.9% | 50.0% | 0.0% | 66.7% | 44.1% |
4 (38) | 66.7% | 37.5% | 0.0% | 28.6% | 0.0% | 25.0% | 33.3% | 27.3% |
5 (39) | 83.3% | 100.0% | 100.0% | 85.7% | 50.0% | 25.0% | 100.0% | 77.7% |
6 (40) | 16.7% | 50.0% | 60.0% | 42.9% | 0.0% | 50.0% | 33.3% | 36.1% |
Average | 66.7% | 60.4% | 30.0% | 38.1% | 25.0% | 16.7% | 50.0% | 41.0% |
RIGHT, Reporting Items for Practice Guidelines in Healthcare.
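The “Average” column of Table 3 can be reproduced from the domain reporting rates. The Python sketch below assumes that the overall reporting rate is the unweighted mean of the seven domain rates; that assumption is not stated explicitly in the methods, but it matches the tabulated averages to one decimal place. The names in the code are illustrative.

```python
# Domain reporting rates (%) copied from Table 3, in column order.
DOMAIN_RATES = {
    "1 (35)": [83.3, 75.0, 0.0, 14.3, 0.0, 0.0, 0.0],
    "2 (36)": [83.3, 37.5, 0.0, 14.3, 50.0, 0.0, 66.7],
    "3 (37)": [66.7, 62.5, 20.0, 42.9, 50.0, 0.0, 66.7],
    "4 (38)": [66.7, 37.5, 0.0, 28.6, 0.0, 25.0, 33.3],
    "5 (39)": [83.3, 100.0, 100.0, 85.7, 50.0, 25.0, 100.0],
    "6 (40)": [16.7, 50.0, 60.0, 42.9, 0.0, 50.0, 33.3],
}


def overall_reporting_rate(domain_rates: list[float]) -> float:
    """Unweighted mean of the domain rates (assumed aggregation rule)."""
    return sum(domain_rates) / len(domain_rates)


def classify_reporting(rate: float) -> str:
    """Apply the thresholds used in this review (>80, 50-80, <50)."""
    if rate > 80:
        return "well-reported"
    if rate >= 50:
        return "moderate-reported"
    return "low-reported"


overall = {g: overall_reporting_rate(r) for g, r in DOMAIN_RATES.items()}
for guideline, rate in overall.items():
    print(guideline, round(rate, 1), classify_reporting(rate))

mean_rate = sum(overall.values()) / len(overall)
print("Mean reporting rate:", round(mean_rate, 1))  # 41.0
```

Because the domains contain different numbers of items, this unweighted mean is not the same as the share of all checklist items reported, which weights larger domains more heavily.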
CPGs, developed based on the best research evidence, provide clinicians with clear and comprehensive recommendations and are an important bridge between research evidence and clinical practice (41). The AGREE II tool and the RIGHT checklist focus on different aspects of quality; although some items overlap, used together they give a more comprehensive picture of the quality of a guideline. It has been shown that AGREE II scores and RIGHT reporting rates have a high positive correlation, and that writing CPGs based on the RIGHT checklist can improve AGREE II scores and thus the quality of guidelines (32,42). Therefore, this study provides suggestions for the future development of high-quality CPGs by evaluating the quality of physical restraints CPGs.
The overall methodological and reporting quality of the six guidelines included in this study was low. Only the guideline developed by RNAO in 2012 was rated as “moderate quality” in terms of methodological quality and “moderate-reported” in terms of reporting quality. The quality of CPGs affects clinical practice (41), so there is a need to develop high-quality CPGs on physical restraints to better guide clinical practice. There are two categories of CPGs: evidence-based CPGs (EB-CPGs) and non-EB-CPGs (43). One study found that the quality of EB-CPGs was significantly higher than that of non-EB-CPGs (26). EB-CPGs can be developed even if only low-quality evidence is available (43). Thus, in the future, high-quality physical restraints CPGs should be developed based on the best available evidence and in strict accordance with the methodology of evidence-based guidelines.
Applicability of recommendations in clinical practice is a key element of guideline translation (44). The overall mean score for “Applicability” of physical restraints CPGs in AGREE II was only 21.53% (Table 2), with three guidelines (36,38,40) scoring <10%, which indicates serious neglect of guideline applicability. Studies have summarized facilitators of guideline application (44), including the provision of guideline implementation tools (e.g., executive summaries, brochures), presentation of guidelines in a short format, and presentation of guidelines in a digital format. It is recommended that future CPGs focus on application as well as on the evidence. In particular, developers should analyze the advantages and disadvantages of applying the recommendations, provide supporting tools, consider the potential resource implications of applying the recommendations, and propose monitoring and/or auditing criteria.
For funding and conflict of interest statements, physical restraints CPGs scored low on the corresponding items of both AGREE II and the RIGHT checklist. Three guidelines (35-37) reported neither financial support nor information about conflicts of interest. Conflicts of interest are one of the most important factors affecting the reliability of guidelines (24,45). Studies have shown that financial ties exist between guideline authors, panelists, and pharmaceutical companies (46). Therefore, guideline developers should regulate the management and reporting of funding and conflicts of interest, and researchers and administrators need to actively improve management policies and develop corresponding reporting specifications to move physical restraints CPGs toward greater objectivity, fairness, and transparency.
The opinions and preferences of the target population, procedures for updating the guideline, and the sources and evaluation of evidence are further factors that affect the quality of guidelines. The overall reporting rate for RIGHT item 14a, “Describe whether values and preferences of the target population(s) were considered in the formulation of each recommendation”, was 33.3% for physical restraints CPGs. Few guidelines clearly describe how patients’ perspectives and preferences were considered, but clinical experience suggests that patients’ feelings and perceptions, and close collaboration with health care professionals, play a critical role in physical restraints practice. Thus, without knowledge of patient preferences, the implementation of the guidelines is likely to be affected. Similar to the results of other guideline evaluations (47), descriptions of the updating procedures were poor, with only two guidelines (37,39) mentioning an update schedule. The WHO guideline handbook states that, although no fixed duration of validity applies to recommendations, the minimum period for guideline updates is 2 years and the maximum period is 5 years (48). Therefore, it is important to update the guidelines in a timely manner. The mean scores for “Rigor of development” in AGREE II (36.46%) and “Evidence” in the RIGHT checklist (30.00%) were low, which indicates that the guidelines handled the sources and evaluation of evidence poorly. Inclusion and exclusion criteria for evidence should be clear and strictly implemented, and formal tools or methods (e.g., Jadad scale, GRADE method) should be used to assess the strength of the evidence. Guidelines should also clearly state the limitations of the evidence, balance the pros and cons of the available evidence, and give supporting data. A clear presentation and description of the evidence will help clinical staff make good decisions based on a synthesis of the evidence when applying the recommendations.
In addition, attention should be paid to the completeness of guideline reporting. For example, for the systematic review process, the entire literature search should be described in detail, and a complete search strategy should be provided in an appendix rather than just a few key words.
This study used a systematic literature search to comprehensively explore the methodological and reporting quality of physical restraints CPGs and to make recommendations. Our findings provide clinical experts and methodologists with an overview of the methodological and reporting quality of physical restraints guidelines, which may contribute to the development and updating of future guidelines and promote standardization of physical restraints practice. However, this study had several limitations. First, only two researchers were involved in the quality appraisal process, which may have affected the accuracy of the results. Second, we only searched for guidelines published in Chinese and English, so we may not have included all relevant guidelines.
In general, the methodological and reporting quality of physical restraints guidelines was low, and high-quality guidelines need to be developed or updated to guide clinical practice. The domains of applicability, funding and conflict of interest statements, opinions and preferences of the target population, procedures for updating the guideline, and sources and evaluation of evidence still need improvement. In the development of guidelines, more detailed and specific methodological descriptions are needed.
Funding: Chongqing Medical Scientific Research Project (Joint Project of Chongqing Health Commission and Science and Technology Bureau) (No. 2019ZDXM024).
