Genre: A reflective essay reflects critically on personal experience and opinion in light of broader literature, theories or subject materials. As conventions and expectations may differ across contexts, always check with your lecturer for the specific conventions of the genre.
Context: This short reflective essay and reply was written in response to a weekly assessment task in an atypical development unit that required students to reflect on their own position in relation to the following question:
Do Barbie Dolls affect girls' body image? If you had (or have) a young daughter, would you allow her to play with Barbie or Bratz dolls? Why or why not?
Response: Barbie Dolls and Body Image: Just Child’s Play? This title links to the topic of the writing and raises a question that implies a thesis.
This rubric delineates specific expectations about an essay assignment to students and provides a means of assessing completed student essays.
Grading rubrics can be of great benefit to both you and your students. For you, a rubric saves time and decreases subjectivity: specific criteria are explicitly stated, facilitating the grading process and increasing your objectivity. For students, grading rubrics help them to meet or exceed expectations, to view the grading process as fair, and to set goals for future learning.

To help your students meet or exceed the expectations of the assignment, be sure to discuss the rubric with them when you assign an essay. It is helpful to show them examples of written pieces that meet and do not meet the expectations. As an added benefit, because the criteria are explicitly stated, using the rubric decreases the likelihood that students will argue about the grade they receive. The explicitness of the expectations helps students know exactly why they lost points on the assignment and aids them in setting goals for future improvement.
Category | Exceeds Standard | Meets Standard | Nearly Meets Standard | Does Not Meet Standard | No Evidence | Score |
---|---|---|---|---|---|---|
Reflect personal learning stretch in Science Project | Shows great depth of knowledge and learning, reveals feelings and thoughts, abstract ideas reflected through use of specific details. | Relates learning with research and project, personal and general reflections included, uses concrete language. | Does not go deeply into the reflection of learning, generalizations and limited insight, uses some detail. | Little or no explanation or reflection on learning, no or few details to support reflection. | Shows no evidence of learning or reflection. | |
Organization: Structural Development of the Idea | Writer demonstrates logical and subtle sequencing of ideas through well-developed paragraphs; transitions are used to enhance organization. | Paragraph development present but not perfected. | Logical organization; organization of ideas not fully developed. | No evidence of structure or organization. | | |
Conclusion | The conclusion is engaging and restates personal learning. | The conclusion restates the learning. | The conclusion does not adequately restate the learning. | Incomplete and/or unfocused. | | |
Mechanics | No errors in punctuation, capitalization and spelling. | Almost no errors in punctuation, capitalization and spelling. | Many errors in punctuation, capitalization and spelling. | Numerous and distracting errors in punctuation, capitalization and spelling. | Not applicable | |
Usage | No errors in sentence structure and word usage. | Almost no errors in sentence structure and word usage. | Many errors in sentence structure and word usage. | Numerous and distracting errors in sentence structure and word usage. | Not applicable | |
Rubrics allow for quicker and more consistent marking. This can be extremely helpful in reflection, which can feel as if it needs to be assessed by instinct alone. A well-defined rubric will make marking of reflection systematic and support both you and the reflectors.
Term | Definition |
---|---|
Rubric | A tool to help in assessing students’ work, which usually includes three essential features: evaluative criteria, quality definitions of the criteria at particular levels, and a scoring strategy (Dawson, 2017) |
Holistic rubric | For every grade level or mark, gives an overall description of competence, without a breakdown into individual criteria. |
Analytic rubric | For every grade level or mark, describes the level of competence for each assessment criterion. |
There are many general benefits from using a rubric, which extend beyond reflection. For facilitators a rubric can:
Moreover, students report that having a well-defined rubric available before they engage with an assessment makes it clearer what is expected of them. Other benefits can be:
While the usefulness of rubrics is widely accepted, some critics argue that rubrics can fail to make marking easier, since students’ work does not always fit into the predefined categories and must then be assessed holistically rather than as a set of components. Moreover, it is argued that a piece of work is often more than the sum of its parts.
These are both fair criticisms. Sometimes you will receive reflections that are hard to mark against your criteria or are indeed better than your rubric would suggest. However, having a rubric will give you a place to start for these reflections.
If you find that your rubric consistently misses aspects of the work, this suggests the criteria need to be updated.
When choosing your rubric, there are two general approaches: holistic and analytical.
For each level of performance highlighted in the rubrics, it can be helpful to provide an example of that level (for example a series of reflective sentences or an extract).
The holistic rubric gives a general description of the different performance levels, for example novice, apprentice, proficient, or distinguished.
The levels can take many different names, and you can choose as many levels as you find appropriate. A common recommendation is to use the same number of levels as there are grades available to students, for example a level for failing and a level for each passing grade.
The analytic rubric allows you to identify a reflector’s performance against each of your chosen and well-defined assessment criteria.
This can be helpful for you in the marking process and when giving feedback to the reflector as you can tell them exactly what areas they are performing well in and need to improve on.
You may consider giving a student a mark for each criterion and take an average of that for the overall mark. Alternatively, predefine a weight or a set of points available for each criterion and calculate the overall mark according to this. If the latter method is used, you should also make the weightings available to students at the same time as the rubric.
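The weighted alternative can be made concrete with a short sketch. This is only an illustration: the criterion names, weights, and marks below are invented, and a real rubric may score levels rather than percentages.

```python
# Minimal sketch of a weighted analytic-rubric mark.
# Criterion names, weights, and marks are illustrative assumptions only.

def overall_mark(marks: dict, weights: dict) -> float:
    """Combine per-criterion marks (0-100) into one mark using weights that sum to 1."""
    if set(marks) != set(weights):
        raise ValueError("marks and weights must cover the same criteria")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[c] * marks[c] for c in marks)

marks = {"depth of reflection": 70, "use of evidence": 60, "clarity": 80}
weights = {"depth of reflection": 0.5, "use of evidence": 0.3, "clarity": 0.2}
print(overall_mark(marks, weights))  # → 69.0
```

Publishing the weights alongside the rubric, as suggested above, lets students see exactly how the overall mark is composed.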
It is unlikely that the first rubric you make is going to capture everything you need, and you may find you need to update it. This is natural for rubrics in all areas, and especially around the area of reflection, which for many is new. Revisiting your rubric is particularly worth doing after the first time it is used.
When using your rubric you can ask yourself:
A rubric that works well for you has a lot of value, but to ensure you have an optimal rubric it is important that others using it would give the same grade to the same reflection as you do – that is, that your rubric has inter-rater reliability.
This is important for two reasons:
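One common empirical check of inter-rater reliability between two markers is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch, with invented grade labels and data:

```python
from collections import Counter

def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters assigning categorical grades to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty lists of grades")
    n = len(rater_a)
    # Observed proportion of exact agreement.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal grade frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Two markers grading the same eight reflections (made-up data).
marker_1 = ["pass", "pass", "fail", "merit", "merit", "pass", "fail", "merit"]
marker_2 = ["pass", "merit", "fail", "merit", "merit", "pass", "fail", "pass"]
print(round(cohens_kappa(marker_1, marker_2), 3))  # → 0.619
```

Values near 1 indicate strong agreement; values near 0 suggest the rubric's criteria are being interpreted differently and may need refining. For more than two raters, or for ordinal grades, measures such as Fleiss' kappa or an intra-class correlation are more appropriate.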
Moon’s (2004) four levels of reflective writing.
These four levels distinguish between four types of written accounts you might see a reflector produce.
In this case the top three levels might pass a reflective assignment, whereas descriptive writing would not.
Taken from Jennifer Moon’s book: A Handbook of Reflective and Experiential Learning (2004)
Descriptive writing | This account is descriptive and it contains little reflection. It may tell a story but from one point of view at a time and generally one point at a time is made. Ideas tend to be linked by the sequence of the account / story rather than by meaning. The account describes what happened, sometimes mentioning past experiences, sometimes anticipating the future – but all in the context of an account of the event. There may be references to emotional reactions but they are not explored and not related to behaviour. The account may relate to ideas or external information, but these are not considered or questioned and the possible impact on behaviour or the meaning of events is not mentioned. There is little attempt to focus on particular issues. Most points are made with similar weight. The writing could hardly be deemed to be reflective at all. It could be a reasonably written account of an event that would serve as a basis on which reflection might start, though a good description that precedes reflective accounts will tend to be more focused and to signal points and issues for further reflection. |
Descriptive account with some reflection | This is a descriptive account that signals points for reflection while not actually showing much reflection. The basic account is descriptive in the manner of description above. There is little addition of ideas from outside the event, reference to alternative viewpoints or attitudes to others, comment and so on. However, the account is more than just a story. It is focused on the event as if there is a big question or there are questions to be asked and answered. Points on which reflection could occur are signalled. There is recognition of the worth of further exploring but it does not go very far. In other words, asking the questions makes it more than a descriptive account, but the lack of attempt to respond to the questions means that there is little actual analysis of the events. The questioning does begin to suggest a ‘standing back from the event’ in (usually) isolated areas of the account. The account may mention emotional reactions, or be influenced by emotion. Any influence may be noted, and possibly questioned. There is a sense of recognition that this is an incident from which learning can be gained, but the reflection does not go sufficiently deep to enable the learning to begin to occur. |
Reflective writing (level 1) | There is description but it is focused with particular aspects accentuated for reflective comment. There may be a sense that the material is being mulled around. It is no longer a straight-forward account of an event, but it is definitely reflective. There is evidence of external ideas or information and where this occurs, the material is subjected to reflection. The account shows some analysis and there is recognition of the worth of exploring motives or reasons for behaviour. Where relevant, there is willingness to be critical of the action of self or others. There is likely to be some self-questioning and willingness also to recognise the overall effect of the event on self. In other words, there is some ‘standing back’ from the event. There is recognition of any emotional content, a questioning of its role and influence and an attempt to consider its significance in shaping the views presented. There may be recognition that things might look different from other perspectives, and that views can change with time or with the emotional state. The existence of several alternative points of view may be acknowledged but not analysed. In other words, in a relatively limited way the account may recognise that frames of reference affect the manner in which we reflect at a given time but it does not deal with this in a way that links it effectively to issues about the quality of personal judgement. |
Reflective writing (level 2) | Description now only serves the process of reflection, covering the issues for reflection and noting their context. There is clear evidence of standing back from an event and there is mulling over and internal dialogue. The account shows deep reflection, and it incorporates a recognition that the frame of reference with which an event is viewed can change. A metacognitive stance is taken (i.e. critical awareness of one’s own processes of mental functioning – including reflection). The account probably recognises that events exist in a historical or social context that may be influential on a person’s reaction to them. In other words, multiple perspectives are noted. Self-questioning is evident (an ‘internal dialogue’ is set up at times) deliberating between different views of personal behaviour and that of others. The view and motives of others are taken into account and considered against those of the writer. There is recognition of the role of emotion in shaping the ideas and recognition of the manner in which different emotional influences can frame the account in different ways. There is recognition that prior experience, thoughts (own and other’s) interact with the production of current behaviour. There is observation that there is learning to be gained from the experience and points for learning are noted. There is recognition that the personal frame of reference can change according to the emotional state in which it is written, the acquisition of new information, the review of ideas and the effect of time passing. |
These four levels highlight four distinct approaches to reflective journaling. While they were developed specifically for journal use, the levels generalise to other types of written reflection.
This rubric was developed by Chabon and Lee-Wilkerson (2006) to evaluate the reflective journals of students undertaking a graduate degree in communication sciences and disorders.
Level | Description | Example |
---|---|---|
Level 1: Descriptive | Students demonstrate acquisition of new content from significant learning experiences. Journal entry provides evidence of gaining knowledge, making sense of new experiences, or making linkages between old and new information. | “I didn’t know that many of the traditions I believed were based in Anglo-American roots. I thought that all cultures viewed traditions similarly.” |
Level 2: Empathetic | Students demonstrate thoughts about or challenges to beliefs, values, and attitudes of self and others. Journal entry provides examples of self-projection into the experiences of other, sensitivity towards the values and beliefs of others, and/or tolerance for differences. | “I felt badly when I heard the derogatory terms used so freely when I visited the South.” |
Level 3: Analytic | Students demonstrate the application of learning to a broader context of personal and professional life. Journal entry provides evidence of student’s use of readings, observations, and discussions to examine, appraise, compare, contrast, plan for new actions or response, or propose remedies to use in and outside structured learning experiences. | “I was able to observe nursing staff interact with a patient whose first language was Tagalog and was diagnosed with altered mental status. The nurses employed many of the strategies that we have read about and discussed in class.” |
Level 4: Metacognitive | Students demonstrate examination of the learning process, showing what learning occurred, how learning occurred, and how newly acquired knowledge or learning altered existing knowledge. Journal entry provides examples of evaluation or revision of real and fictitious interactions. | “I found myself forming impressions about a child’s language abilities and made myself stop until I got additional information as suggested in class discussions.” |
Reflection Evaluation for Learners’ Enhanced Competencies Tool (REFLECT) rubric.
This analytic rubric was developed, empirically tested, and improved by Wald et al. (2012). It was developed specifically for medical education, but can easily be used elsewhere. The rubric draws on theoretical work on reflection from thinkers such as Moon, Schön, Boud and Mezirow.
This rubric has been used in empirical studies and a high inter-rater reliability has been established.
There are two components to the rubric: the standard rubric and an additional axis. The second axis is used when a reflector reaches ‘critical reflection’; it distinguishes between two types of learning that reflection can help surface.
Adding the additional axis can help you differentiate what kind of learning the student has obtained, and reminds us that reflection does not always need to create new practice: becoming aware of why one’s practice works can be equally valuable.
Standard Rubric
Superficial descriptive writing approach (fact reporting, vague impressions) without reflection or introspection | Elaborated descriptive writing approach and impressions without reflection | Movement beyond reporting or descriptive writing to reflecting (i.e. attempting to understand, question, or analyse the event) | Exploration and critique of assumptions, values, beliefs, and/or biases, and the consequences of action (present and future) | |
Sense of writer being partially present | Sense of writer being partially present | Sense of writer being largely or fully present | Sense of writer being fully present | |
No description of the disorienting dilemma, conflict, challenge, or issue of concern | Absent or weak description of the disorienting dilemma, conflict, challenge, or issue of concern | Description of the disorienting dilemma, conflict, challenge, or issue of concern | Full description of the disorienting dilemma, conflict, challenge, or issue of concern that includes multiple perspectives, exploring alternative explanations, and challenging assumptions | |
Little or no recognition or attention to emotions | Recognition but no exploration or attention to emotions | Recognition, exploration, and attention to emotions | Recognition, exploration, attention to emotions, and gain of emotional insight | |
No analysis or meaning making | Little or unclear analysis or meaning making | Some analysis and meaning making | Comprehensive analysis and meaning making | |
Poorly addresses the assignment question and does not provide a compelling rationale for choosing an alternative | Partial or unclear addressing of assignment question; does not provide a compelling rationale for choosing an alternative | Clearly answers the assignment question or, if relevant, provides a compelling rationale for choosing an alternative | Clearly answers the assignment question or, if relevant, provides a compelling rationale for choosing an alternative
Axis II for critical reflection
Frames of reference or meaning structures are transformed. Requires critical reflection; integration of new learning into one’s identity, informing future perceptions, emotions, attitudes, insights, meanings, and actions. Conveys a clear sense of a breakthrough. | Frames of reference or meaning structures are confirmed. Requires critical reflection. |
This rubric from Jones (n.d.) gives another approach to marking reflection. Using five criteria, it captures much of what is relevant when marking reflection, and highlights clear qualities for each level of reflection.
| | | | |
---|---|---|---|---|
Language is unclear and confusing throughout. Concepts are either not discussed or are presented inaccurately. | There are frequent lapses in clarity and accuracy | Minor, infrequent lapses in clarity and accuracy. | The language is clear and expressive. The reader can create a mental picture of the situation being described. Abstract concepts are explained accurately. Explanation of concepts makes sense to an uninformed reader. | |
Most of the reflection is irrelevant to student and/or course learning goals. | Student makes attempts to demonstrate relevance, but the relevance is unclear to the reader. | The learning experience being reflected upon is relevant and meaningful to student and course learning goals. | The learning experience being reflected upon is relevant and meaningful to student and course learning goals. | |
Reflection does not move beyond description of the learning experience(s). | Student makes attempts at applying the learning experience to understanding of self, others, and/or course concepts but fails to demonstrate depth of analysis. | The reflection demonstrates student attempts to analyse the experience but analysis lacks depth. | The reflection moves beyond simple description of the experience to an analysis of how the experience contributed to student understanding of self, others, and/or course concepts. | |
No attempt to demonstrate connections to previous learning or experience. | There is little to no attempt to demonstrate connections between the learning experience and previous other personal and/or learning experiences. | The reflection demonstrates connections between the experience and material from other courses; past experience; and/or personal goals. | The reflection demonstrates connections between the experience and material from other courses; past experience; and/or personal goals. | |
No attempt at self-criticism. | There is some attempt at self-criticism, but the self-reflection fails to demonstrate a new awareness of personal biases, etc. | The reflection demonstrates the ability of the student to question their own biases, stereotypes, preconceptions. | The reflection demonstrates the ability of the student to question their own biases, stereotypes, preconceptions, and/or assumptions and define new modes of thinking as a result.
Chabon, S. and Lee-Wilkerson, D. (2006). Use of journal writing in the assessment of CSD students’ learning about diversity: A method worthy of reflection. Communication Disorders Quarterly, 27(3), 146-158.
Dawson, P. (2017). Assessment rubrics: Towards clearer and more replicable design, research and practice. Assessment & Evaluation in Higher Education, 42(3), 347-360.
Jones, S. (n.d.). Using reflection for assessment. Office of Service Learning, IUPUI.
Kohn, A. (2006). The trouble with rubrics. English Journal, 95(4).
Moon, J.A. (2004). A handbook of reflective and experiential learning: Theory and practice. Routledge.
Wald, H.S., Borkan, J.M., Scott Taylor, J., Anthony, D., and Reis, S.P. (2012). Fostering and evaluating reflective capacity in medical education: Developing the REFLECT rubric for assessing reflective writing. Academic Medicine, 87(1), 41-50.
Assessing reflection or reflective processes can be particularly challenging. A few examples of this challenge are:
As there is not just one type of student in your classes or programs, there is no single answer to designing high-quality techniques for assessing reflection. You must design your reflection assignments, as well as your assessments, with careful consideration of your own context.
A few things to consider when you are designing your assessment strategies are:
Hatton and Smith (1995).
Hatton and Smith described four progressive levels of reflection, with each higher level indicating deeper reflective processes.
Ash and Clayton describe a guided process for facilitating and assessing reflection. These researchers focus specifically on service learning, but their model could be applied to other types of learning experiences.
Element | Description |
---|---|
Mechanics | Consistently avoids typographical, spelling and grammatical errors. |
Connection to Experience | Makes clear the connection(s) between the experience and the dimension being discussed. |
Accuracy | Makes statements of fact that are accurate and supported with evidence; for academic articulated learning statements, accurately identifies, describes, and applies appropriate academic principle(s). |
Clarity | Consistently expands on and expresses ideas in alternative ways, provides examples/illustrations. |
Relevance | Describes learning that is relevant to the articulated learning statement category and keeps the discussion specific to the learning being articulated. |
Depth | Addresses the complexity of the problem; answers important question(s) that are raised; avoids over-simplifying when making connections. |
Breadth | Gives meaningful consideration to alternative points of view and interpretations. |
Logic | Demonstrates a line of reasoning that is logical, with conclusions or goals that follow clearly from it. |
Significance | Draws conclusions, sets goals that address a (the) major issue(s) raised by the experience. |
BMC Medical Education, volume 20, Article number: 331 (2020)
The main objective of this study is the development of a short, reliable, easy-to-use assessment tool with the aim of providing feedback on the reflective writings of medical students and residents.
This study took place in a major tertiary academic medical center in Beirut, Lebanon. Seventy-seven reflective essays written by 18 residents in the Department of Family Medicine at the American University of Beirut Medical Center (AUBMC) were graded by three raters using the newly developed scale to assess its reliability. Following a comprehensive search and analysis of the literature, and based on their experience in reflective grading, the authors developed a concise 9-item scale for grading reflective essays through repeated cycles of development and analysis. Inter-rater reliability (IRR) was determined using intra-class correlation coefficients (ICC) and Krippendorff’s Alpha.
The inter-rater reliability of the new scale ranges from moderate to substantial, with an ICC of 0.78 (95% CI 0.64–0.86, p < 0.01); Krippendorff’s Alpha was 0.49.
The newly developed scale, GRE-9, is a short, concise, easy-to-use and reliable grading tool for reflective essays that has demonstrated moderate to substantial inter-rater reliability. It will enable raters to grade reflective essays objectively and to provide informed feedback to residents and students.
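An intra-class correlation of this kind can be computed from an essays × raters matrix of scores. The sketch below implements ICC(2,1) (two-way random effects, absolute agreement, single rater), one common choice; the study does not specify which ICC form it used, and the scores here are invented:

```python
def icc_2_1(ratings: list) -> float:
    """ICC(2,1) from a subjects-by-raters matrix (every subject rated by every rater)."""
    n, k = len(ratings), len(ratings[0])  # subjects (essays), raters
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]
    # Two-way ANOVA sums of squares and mean squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Three raters scoring five essays on a 1-5 scale (made-up numbers).
scores = [
    [4, 4, 5],
    [2, 2, 2],
    [3, 4, 3],
    [5, 5, 5],
    [1, 2, 1],
]
print(round(icc_2_1(scores), 2))  # → 0.92
```

In practice, an established statistics library implementation that also reports confidence intervals is preferable when publishing reliability figures.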
Reflective practice within medical education is considered an essential aspect of lifelong self-directed learning and has become a crucial element of medical programs at all levels as they move towards competence-based curricula [ 1 ]. The idea of reflective practice was first established by Schon in 1987 and is characterized by three stages: awareness of thoughts and feelings, critical analysis of a condition, and development of a new viewpoint of the situation [ 2 ]. Hence, it follows that reflection allows the development and integration of new knowledge into practice, leading to the core experience of greater professional competence [ 3 ]. A growing body of research with regard to reflection in the medical education literature has highlighted the relationship between reflective capacity and the enhancement of physician competence [ 4 , 5 , 6 , 7 ].
Given the beneficial consequences of reflection [ 8 ], medical educators have sought to explore a variety of methods for fostering and assessing reflection in learners, ranging from one-to-one mentoring [ 9 ] to guided discussions [ 10 ], digital approaches like video cases [ 11 ] and written methods like reflective portfolios, journal and essay writings [ 9 , 12 ]. Reflective writing was reported to be one of the most extensively and widely used forms of reflective teaching in medical education [ 13 , 14 ]. Reflective capacity within these reflective writing exercises can be assessed through various qualitative and quantitative tools [ 15 ]. Despite the presence of diverse methods, there is still a lack of best practices [ 15 ]. With the proliferation of reflective writing in promoting and assessing reflection [ 16 ], the need for a valid and reliable evaluative tool that can be effectively applied to assess students’ levels of reflection was strongly called for [ 17 ].
Existing modalities of reflection evaluation identified in the literature include scales (“paper and pencil” forms with responses scored by respondents), qualitative analysis (including thematic coding and more elaborate analysis moving beyond themes into models), and analytical instructional rubrics (theory-based delineation of dimensions or levels of an assessed construct) [ 17 ]. Given that quantitative tools are primarily used in research for curriculum improvement and to guide feedback in reflection practice, analytic rubric assessment tools are widely chosen for assessing reflective writings [ 15 ]. These rubrics provide precise data across multiple reflective dimensions that educators can refer to when presenting feedback [ 15 ], leading to a remarkable improvement in students’ reflective capacity [ 18 ].
These analytic rubric models are based on theoretical frameworks [ 17 ]. Among the major theoretical foundations is Mezirow’s (1991) reflective taxonomy, which explains reflection as a basis for the transformative process in learning [ 19 ]. Theoretical foundations for such analytic rubrics also include Schon’s (1983) [ 20 ] focus on progression from knowing-in-action, to reflection-in-action, experimentation and reflection-on-action (post-experience reflection), as well as Boud and colleagues’ (1985) accentuation of feelings in the reflective process [ 21 ]. For instance, Kember et al. [ 22 ] proposed four methodological requirements for a reflection measurement tool: focused and direct assessment of reflection, avoidance of construct-irrelevant variance in the conceptualization of the reflection themes, specification of the method and procedure in enough detail to ensure transparency and replication, and appropriate reliability testing. Their methodological requirements were mainly based on Mezirow’s (1991) conceptualization of reflection [ 12 ], and they incorporated four categories: habitual action, understanding, reflection and critical reflection.
Notable limitations and challenges with regard to the available coding systems and tools used in the analysis and assessment of reflective writings are documented in the literature [ 17 ]. Some published rubrics for reflective narrative analysis are limited in scope as well as in validity [ 23 , 24 , 25 ]. Others lack a reliable structured worksheet that assesses levels of reflection, are relatively difficult to use, and score low on reliability outcomes [ 16 , 26 ]. Quantitative evidence supporting the psychometric properties of some of these reflective assessment tools is limited [ 5 ], and evidence regarding the inter-rater reliability of those tools is still preliminary [ 17 , 22 ]. Given these challenges and limitations, and the notion that reflection is hard to measure and assess directly [ 12 ], it becomes imperative to develop simpler tools that are short, concise, include well-defined descriptors, and are easily accessible for analysis and interpretation with a high level of objectivity. Despite the implementation of rater training efforts for the reflective writings of students, rater variability in scoring remains a source of concern [ 27 , 28 ]. Given that students’ approaches to learning might be affected by the type of assessment strategy used [ 29 , 30 ], unreliable assessment strategies can lead to unfair results. Hence, a reliable assessment tool is needed to decrease the incidence of rater variability [ 6 ].
Consequently, this study serves as a step in filling this research gap by developing an empirically tested and concise new reflective writing assessment tool and exploring its inter-rater reliability with the aim of establishing a reliable measure of reflective writing.
Overview of study design, instruments and procedures.
For the past several years, the reflective essays written by Family Medicine residents during 3 years of training (between June 2014 and May 2017) were graded by their advisors using a simple grading scale. This scale consists of a simple guide on what the advisor should consider while grading (Additional file 1). Over those years, numerous faculty members raised concerns about the ambiguity of this scale and the lack of standardization in grading. Hence, the goal of developing a new, concise and reliable tool to evaluate reflective essays written by residents at the AUBMC was established. Since the essays are part of a routine formative assessment activity, the study qualified as exempt from review by the Institutional Review Board (IRB) at the AUBMC.
In an effort to improve the reflective-essay grading process at the AUBMC, a new scale was developed by faculty members at the Department of Family Medicine. The development of the scale began with an exploratory literature review; three Family Medicine physicians versed in medical teaching, curriculum development, and reflective writing assessment screened and discussed an initial pool of items for relevance. The literature review covered existing theoretical models of reflection, reflective writing pedagogy, elements of reflective practice, and existing assessment modalities in health professions education.
The analytic instructional rubric model was chosen as the evaluative model for developing our reflective tool. The analytic rubric model outlines diverse reflective dimension levels and assessment criteria, defines benchmarks for each of these levels, yields quantitative scores, and can be used for both formative and summative purposes [17, 31, 32]. Analytic rubrics also provide total and domain-specific scores, in this case allowing educators to identify global and domain-specific deficits in learners' reflective skills and to provide specific, constructive feedback accordingly [15].
In the first cycle of constructing the initial reflection rubric, the rubric was based on a comprehensive analysis of relevant theoretical models of reflection as well as existing reflection assessment measures [33]. After a wide range of elements had been considered, agreement was reached to incorporate levels of reflection associated with criteria based on the theories of Mezirow [19], Schon [2] and Boud and colleagues [21]. More specifically, the framework of the REFLECT rubric, a rigorously developed, theory-informed analytic rubric, was the starting point for item selection [17]. The REFLECT rubric rests on comprehensive, logical and credible theoretical principles drawn from Mezirow's [19] conceptualization of the different levels of reflection, Schon's [2] theory of the reflective practitioner, and Boud and colleagues' [21] reflective analysis. The four levels of reflective capacity of the REFLECT rubric are: (1) habitual action, (2) thoughtful action, (3) reflection, and (4) critical reflection, which incorporates transformative learning (new understanding) and confirmatory learning (confirming one's frames of reference or meaning structure). These four reflective levels include core processes spanning the writing spectrum and the writer's presence: recognizing "disorienting" dilemmas, critically analyzing assumptions, attending to emotions, and deriving meaning [17]. On the basis of these theoretical levels of reflection, our analytic instructional rubric was developed following an accepted methodology: listing criteria, designating quality levels, creating a rubric draft, and revising and refining the draft to target each of the four levels. The items of the Groningen Reflection Ability Scale (GRAS) provided a further theoretical framework for the newly developed scale.
These included the three emergent thematic factors of personal reflection in that existing assessment measure: self-reflection, empathetic reflection and reflective communication [1]. Items of this measure are grounded in the reflection literature and cover three substantive aspects of personal reflection in the context of medical practice and education. Self-reflection focuses on the exploration and appraisal of experience and forms a basis for the individual to frame or reframe their thoughts, feelings, beliefs, norms or methods. Empathetic reflection focuses on contextual understanding and appraisal when engaging in empathetic placement and thinking about the position of others, such as patients and colleagues. Reflective communication concerns handling feedback and discussion, dealing with interpersonal differences, and taking responsibility for one's statements and actions [1].
The first cycle yielded the first draft of the newly developed scale, consisting of 11 items. The scale then underwent a second cycle that included the previous 3 physicians along with 3 additional physicians. Each physician was asked to use the new 11-item scale to grade 3 reflective essays and to provide feedback on the objectivity of the scale, the time needed to grade with it, the clarity of the items, and its ease of use. After several meetings and discussions among the faculty members, some items were reformulated and 2 items were dropped, namely "comments on response" and "critical thinking". The remaining items map onto the four levels of reflective capacity of the REFLECT rubric and the three thematic factors of personal reflection of the GRAS. Table 1 shows this mapping.
A session was then conducted to standardize the scoring: a rationale was presented, scoring discrepancies were resolved, and a scoring consensus was reached. The first 2 items of the scale, which are descriptive, carry a maximum grade of 1, whereas the rest, which are analytical, carry a maximum grade of 2, for a maximum total score of 16. The items are followed by a guide that clarifies each point with the aim of facilitating and standardizing the grading process. The scale, referred to as the Grading Reflective Essays-9 (GRE-9) from here on, consists of 9 items. Table 2 includes the scoring per item as well as the guidance for grading. Eighteen Family Medicine residents training in a four-year program at the AUBMC were asked to write reflective essays based on incidents from their medical practice. Papers were then randomly coded by the authors. Before applying the new scale, several meetings were held for the raters to discuss debatable points and to unify the grading system as much as possible. The raters were asked to read each entire essay and then fragment it into parts corresponding to each of the items assessed in the GRE-9. The presence and quality of each item criterion was then assessed, with a partial point given when an item criterion was mentioned but not explored in depth and a full score given for critical reflection. All 3 raters were asked to provide overall feedback on the ease of use of the scale in terms of item clarity and time needed to grade.
Sample size calculations for reliability assessment, based on testing for a statistical difference between moderate (i.e., 0.40) and high (i.e., 0.75) kappa values using an alpha of 0.05 and a beta error rate of 0.2, yield an estimated sample size ranging from 77 down to 28 as trait prevalence varies between 10 and 50% [34]. Given that kappa values can be transferred to Krippendorff's alpha [35], a sample size of 77 provides the power needed to detect a statistically significant Krippendorff's alpha.
Since the minimally acceptable level of Krippendorff's alpha is 0.40, three raters per subject are sufficient, because increasing the number of raters beyond 3 has little effect on the power of hypothesis tests [34, 36]; as such, three raters were chosen to evaluate the reflective pieces.
Similarly, for the intra-class correlation coefficient (ICC), the minimum sample size (n) was estimated through a power analysis in the PASS software, using a formula derived from previous studies [36, 37]. With three raters (k = 3), an acceptable reliability of 0.0 and an expected reliability of 0.2, power set to at least 80% and alpha set to 0.05, the formula yields a minimum sample size of approximately 60 to 61 participants. Given that increasing the number of subjects is the more effective strategy for maximizing power [38], and given that the estimated sample size of 77 provides the power needed to detect a statistically significant Krippendorff's alpha, 77 reflective pieces were used to calculate the ICC as well.
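The minimum-sample-size calculation described above can be illustrated with a short sketch. This is a hypothetical reimplementation of the Walter, Eliasziw and Donner approach [36, 37], not the actual PASS computation; the function `icc_sample_size` and its one-sided critical values are our own assumptions:

```python
from math import ceil, log
from statistics import NormalDist

def icc_sample_size(k, rho0, rho1, alpha=0.05, power=0.80):
    """Approximate minimum number of subjects for an ICC reliability
    study with k raters, testing H0: ICC = rho0 vs H1: ICC = rho1
    (one-sided), following Walter, Eliasziw and Donner's formula."""
    z_a = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_b = NormalDist().inv_cdf(power)
    # Ratio used in the variance-stabilising transform of the ICC
    c0 = (1 + k * rho0 / (1 - rho0)) / (1 + k * rho1 / (1 - rho1))
    n = 1 + 2 * k * (z_a + z_b) ** 2 / ((k - 1) * log(c0) ** 2)
    return ceil(n)

# Parameters reported in the text: k = 3 raters, acceptable
# reliability 0.0, expected reliability 0.2, 80% power, alpha = 0.05
print(icc_sample_size(3, 0.0, 0.2))  # -> 61, matching the ~60-61 in the text
```

Note how the required n grows sharply when the expected reliability is closer to the acceptable level, which is why detecting a small ICC needs many more subjects than detecting a large one.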
Consequently, three family physicians who had been involved in the second cycle of scale development evaluated the 77 reflective essays. Rater 1 is a full-time faculty member and professor who has been in practice for 24 years and is well versed in assessing reflective capacity. The other two raters are part-time faculty members, one in practice for 15 years (Rater 2) and the other for 5 years (Rater 3).
The three raters were asked to grade 3 essays at a time, every other day. Each rater was asked to record the number of each essay, the grade assigned, and the time needed to read and grade it. The essays were reviewed in the same order by all raters. Anonymity was assured by randomly assigning alphanumeric codes to the essays.
Descriptive statistics were used to quantify the level of higher order processing evident in each reflective piece. Interrater reliability was assessed for each level of cognitive processing as well as for the highest level of cognitive processing evident within each entry using ICC and Krippendorff’s alpha.
Among the six different equations or models for calculating the ICC, Model 2 was selected for this study, since each reflective piece was assessed by all raters, who are considered representative of a larger population of potential raters, with the expectation that the results may be generalized to other raters with similar characteristics. More specifically, to examine the inter-rater reliability of continuous variables, such as each rater's total score across the 9 questions of the GRE-9, an ICC based on a two-way random-effects model (ANOVA) with absolute agreement was calculated. Given that the aim was to generalize the reliability results to any raters who share the characteristics of the selected raters, a two-way random-effects model was appropriate, as it specifies that each subject (each student's reflective essay) is evaluated by the same set of k independent raters, randomly sampled from a larger population of raters with similar characteristics [35]. Also, given that absolute agreement concerns the extent to which raters assign identical scores [39, 40], the absolute-agreement type was chosen to ensure greater precision. To judge acceptable levels of reliability, the authors used the criteria established by Landis and Koch [35] for the strength of reliability reported through the ICC: values below 0 represent poor agreement, 0.01–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement, and 0.81–1.00 almost perfect agreement and thus high reliability. With regard to Krippendorff's alpha, values range from 0 to 1, where 0 represents perfect disagreement and 1 perfect agreement.
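To make the model choice concrete, a two-way random-effects, absolute-agreement, average-measures ICC (often denoted ICC(2,k) or ICC(A,k)) can be computed directly from the ANOVA mean squares. The sketch below is illustrative only; `icc_2k` and the toy scores are our own constructions, not the authors' analysis code, which presumably used a statistical package:

```python
import numpy as np

def icc_2k(ratings):
    """ICC for a two-way random-effects model with absolute agreement,
    average measures, over an n-subjects x k-raters matrix (no missing
    values)."""
    r = np.asarray(ratings, dtype=float)
    n, k = r.shape
    grand = r.mean()
    ss_rows = k * ((r.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((r.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_err = ((r - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)             # subjects mean square
    msc = ss_cols / (k - 1)             # raters mean square
    mse = ss_err / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (msc - mse) / n)

# Toy example: three raters scoring four essays in near-perfect agreement
scores = [[10, 11, 10], [8, 8, 9], [14, 13, 14], [6, 6, 6]]
print(round(icc_2k(scores), 2))  # -> 0.99
```

Because the rater (column) mean square enters the denominator, systematic leniency or severity of a rater lowers this absolute-agreement ICC, which is exactly why it is stricter than a consistency-type ICC.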
According to Krippendorff [41], α ≥ .800 is the required value, and α ≥ .667 is considered acceptable. However, it has also been reported that the cut-off scores provided by Landis and Koch (1977) [35] can be transferred to Krippendorff's alpha [42].
The average score across the 3 raters was 10.03 ± 2.44, with averages for raters 1, 2 and 3 of 9.38 ± 2.94, 9.68 ± 2.39, and 11.04 ± 2.44, respectively. The average time needed to read and grade an essay of about 500 words was 4.5 min (SD = 2.0; range 2–12 min).
The obtained ICC was 0.78, indicating substantial agreement (95% CI 0.64–0.86; F(76, 152) = 5.781, p < 0.01) (Table 3).
To examine the inter-rater reliability of ordinal variables for more than 2 raters, Krippendorff's alpha was also calculated. Total agreement among the 3 raters was 0.49 (moderate) (Table 4).
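As a rough illustration of how Krippendorff's alpha compares observed disagreement with the disagreement expected by chance, the sketch below implements alpha for complete (no-missing) data using the interval metric (squared differences). This is a simplified stand-in written for this example; the study's ordinal-metric computation would weight category ranks differently:

```python
def kripp_alpha_interval(units):
    """Krippendorff's alpha, interval metric, for complete data.
    `units` is a list of units (e.g., essays), each a list of the
    scores the raters assigned to that unit."""
    pairable = [u for u in units if len(u) >= 2]
    n = sum(len(u) for u in pairable)  # total pairable values
    # Observed disagreement: within-unit pairwise squared differences
    d_o = sum(
        2 * sum((a - b) ** 2 for i, a in enumerate(u) for b in u[i + 1:])
        / (len(u) - 1)
        for u in pairable
    ) / n
    # Expected disagreement: squared differences over all value pairs
    vals = [v for u in pairable for v in u]
    d_e = 2 * sum((a - b) ** 2 for i, a in enumerate(vals)
                  for b in vals[i + 1:]) / (n * (n - 1))
    return 1.0 - d_o / d_e

print(kripp_alpha_interval([[1, 2], [3, 3]]))  # -> 0.727... (= 8/11)
```

Perfect within-essay agreement makes the observed disagreement zero and alpha exactly 1, while raters who disagree as much within essays as across them drive alpha toward 0.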
The 3 raters provided positive feedback on the usability of the GRE-9 as was inferred from the clarity of the items, the grades assigned to each item, the time needed to grade, and the elaborative guide provided with the scale that explains each individual item.
Excluding item 2, the ICC estimates and Krippendorff's alpha values for the 9 items ranged from 0.41 to 0.70 and from 0.18 to 0.49, respectively (Tables 3 and 4). The lowest ICC and Krippendorff's alpha were obtained for item 2, "What is special about this event - State clearly the reason for choosing this event in particular" (ICC = −0.06, Krippendorff's alpha = −0.17). Item 5, "Understanding of the event - Express what was good and bad about the experience; interpretation of the situation at present with justifications (factors/knowledge influencing judgment)", showed low agreement among the raters (ICC = 0.41, Krippendorff's alpha = 0.18). Item 3, "Feelings when it happened - Describe personal thoughts and feelings of the resident while the event was happening" (attending to feelings), obtained the highest ICC (0.70) and the second highest Krippendorff's alpha (0.47).
The GRE-9, the new rubric for evaluating reflection in medical education, is a short, easy-to-use, and reliable assessment tool. Its items are grounded in theory and clearly presented. A distinguishing feature of the GRE-9 is that its 8th item, "Reference to old experience and to others", is stated explicitly in the tool, unlike in the REFLECT rubric. The explicit inclusion of this item accords with Mezirow's [19] theoretical underpinnings of reflection, which focus on recognizing and expressing the internal states and conditions of others and the attitudes held towards them. The item also accords with Schon's [2] observation that, in professional work, reflectivity focuses on social observation, interaction, and the meaning given to various interactions. This item is particularly relevant for gaining insight into the cultural context of Lebanon: the collectivistic culture of Lebanon emphasizes the significance of others and of the interactions formed with them [43]. Hence, although the GRE-9's theoretical framework was based on theoretical underpinnings formulated in the West, and despite the notion that reflective assessment tools can be used globally irrespective of raters' contextual and educational backgrounds [44], the cultural relevance of the 8th item is an advantage of the GRE-9 and of its applicability in a specific cultural context.
The REFLECT rubric was the starting point for developing the GRE-9 items. The advantage of the REFLECT rubric over the GRE-9 lies in its already established validity evidence in similar contexts of measuring medical students' reflective capacity [17, 45], as well as in the slightly higher inter-rater reliability it obtained in previous studies compared with the GRE-9 [15, 17]. The REFLECT rubric can also be regarded as more elaborate, as it incorporates three additional grading levels: "Writing spectrum", i.e., exploration and critique of assumptions, values, beliefs, and/or biases, and the consequences of action (present and future); "Presence", i.e., the sense of the writer being fully present; and "Attention to assignment", i.e., whether the writer answers the assignment question or, if relevant, provides a compelling rationale for choosing an alternative [17]. As such, REFLECT not only measures reflective ability but also indicates the extent to which the student was involved and engaged in the reflective and writing process, thus providing richer information on the credibility of what the medical student reported and wrote as well as on the student's reflective ability. These grading levels are not present in the GRE-9. However, the advantage of the GRE-9 over the REFLECT rubric lies in the simplicity and clarity of its items, in contrast to the reported need to revise the REFLECT rubric's items for clarity [15]. Also, while the REFLECT rubric is designated as a formative rubric without grading [17], the formative and summative nature of the GRE-9 supports both analyzing the quality of reflection and assigning numbers to the reflection levels, providing an anchor for discussing the outcomes of a student's reflection level.
This is in line with the call in research for the need to incorporate quantitative and summative reflective assessment rubrics into the learning processes [ 46 ].
It has been reported that reflective tools founded on previous reflective assessment work and theoretical frameworks provide the premise for different raters to reach the same interpretive results when assessing reflection [17]. The literature has shown that rubrics can be utilized internationally across diverse educators, cultural backgrounds and education curricula [6, 44]. For instance, a study by Lucas et al. [44] found that a reflective rubric for assessing pharmacy students' reflective essays, used by three raters from different educational backgrounds and cultural contexts, maintained high rater agreement, indicating that the rubric is a reliable tool that can be applied across different educational settings irrespective of context or curriculum. Consequently, it is plausible that the GRE-9 will not only be useful within the AUBMC context but can also be introduced into the medical education literature for assessing reflective essays by other professionals.
The results of this study yielded moderate to substantial inter-rater reliability for the GRE-9 based on the ICC and Krippendorff's alpha. In fact, the inter-rater reliability of the GRE-9 scale (ICC of 0.78) is good in comparison with previous studies; for instance, it was only slightly lower than the inter-rater reliability reported by Lucas et al. [6], who found an average-measures ICC of 0.81 for their own reflective rubric for assessing pharmacy students' reflective thinking, and slightly lower than the five-rater ICC (alpha) of 0.80 for the REFLECT rubric composite of medical students' written reflective essays [45].
The lack of inter-rater agreement on item 2, "What is special about this event - State clearly the reason for choosing this event in particular", might be attributed to the subjectivity involved in rating this item. The reason for choosing to report a particular event in a reflective essay might not be stated explicitly but only implied, so subjectivity and variability among the raters assessing the implied reason might arise. The low inter-rater agreement for item 5, "Understanding of the event - Express what was good and bad about the experience; interpretation of the situation at present with justifications (factors/knowledge influencing judgment)", could stem from the difficulty of detecting a clear and full description of the disorienting dilemmas, issues of concern incorporating multiple perspectives, alternative explanations and challenged assumptions within written essays, especially since identical criterion phrasings might have been used when reporting the different levels of description of a conflict and its conclusions. The high inter-rater agreement for item 3, "Feelings when it happened - Describe personal thoughts and feelings of the resident while the event was happening" (attending to feelings), can be explained by the fact that attending to and dwelling on emotions is an important aspect of reflection [2]; residents might therefore have focused on expressing their emotions explicitly and clearly in their reflective writing, allowing the raters to detect emotions and feelings easily.
In a previous study investigating the reliability, feasibility, and responsiveness of a categorization scheme for assessing pharmacy students' levels of reflection [12], the mean time needed merely to categorize one essay was 3 min; this covered only the grading procedure and was reported to be likely to increase if used in formative assessments, while still being considered reasonable based on the feasibility test [12]. The time taken to grade using the GRAS was around 10 min [1]. Hence, the time needed to read and grade the reflective essays in this study suggests that the GRE-9 is a reasonably fast method for assessing the level of reflection in written reflective essays. This reasonable timing will be especially important in teaching settings [12].
Although the new instrument's validity was not tested empirically, it is worth noting that, theoretically, the newly developed scale incorporates themes that match those in validated instruments measuring the reflection of medical students. For instance, as stated previously, the reflection themes in the newly developed instrument match the thematic structure of the Reflection Evaluation for Learners' Enhanced Competencies Tool (REFLECT) [17]. Similarly, the thematic underpinnings of the newly developed scale overlap with the three emergent thematic factors of personal reflection of the Groningen Reflection Ability Scale (GRAS): self-reflection, empathetic reflection and reflective communication [1]. Specifically, the items (1) What happened, (2) What is special about this event, (3) Feelings when it happened, (5) Understanding of the event, and (6) Congruence of actions and beliefs are compatible with the self-reflection criterion. Items (4) What was the outcome for the concerned and (7) New thoughts and feelings after reflection are compatible with the empathetic reflection criterion, and items (8) Reference to old experience and others and (9) How this incident will affect future role are consistent with the reflective communication criterion, which integrates reflective behavior, openness to feedback and discussion, taking responsibility for one's own statements and actions, and ethical accountability [1]. Consequently, given that the items of the GRE-9 scale are conceptually and thematically based on solid theoretical underpinnings and match both the three essential aspects of personal reflection in medical practice and education of the GRAS tool and the four reflective levels of the REFLECT tool, the content validity of the scale can be assumed.
The moderate to substantial inter-rater reliability of the GRE-9 allows raters to grade reflective essays objectively and to provide informed feedback to residents and students. It is important to note, however, that developing a reliable reflective tool alone is not the sole solution to the problem of rater variability and fair assessment of reflective capability. Unfairness and variability can also be addressed by structurally aligning raters on the interpretation of reflective rubrics through robust rater-training programs [12], by increasing the number of observations for more accurate reflection assessment [5], and by assessing multiple reflective samples per student so as to support sound conclusions about reflective capacity [5].
A number of limitations can be noted. Primarily, this study was conducted solely at the AUBMC, which might limit the generalizability of the results. The study also evaluated a single reflective-writing sample per student; this too might limit generalizability, because reflective writing is considered context-dependent and many skills can be assessed in medical education, so a single writing sample might not provide an accurate estimate of a student's reflective competency [5]. Future research with multiple writings will be needed to test the rubric's psychometrics further [6]. In addition, the sample size was not sufficient to assess construct validity, which limited the rigor of the psychometric properties obtained for the scale. Finally, given that the reflective essays were an assignment required of Family Medicine residents during their 3 years of training, performance bias might have influenced the write-up of the essays as residents sought approval from their advisors or peers [47].
The GRE-9 is a reliable, concise, simple grading tool that has demonstrated moderate to substantial inter-rater reliability, enabling raters to grade reflective essays objectively and provide informed feedback to residents and students. Future research should investigate the validity of the scale empirically, for instance by exploring its external validity through comparison of its scores with other reflection scales and measures of reflective practice outcomes, and should use a larger sample to ensure more rigorous psychometric soundness of the scale.
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
American University of Beirut-Medical Center
Intra-class correlation coefficients
Institutional Review Board
Inter-rater reliability
Grading Reflective Essays
Aukes LC, Geertsma J, Cohen-Schotanus J, Zwierstra RP, Slaets JPJ. The development of a scale to measure personal reflection in medical practice and education. Med Teach. 2007;29(2-3):177–82.
Schön D. Educating the Reflective Practitioner. In: Educating the reflective practitioner; 1987.
Droege M. The role of reflective practice in pharmacy. Educ Health. 2003;16(1).
Arntfield SL, Slesar K, Dickson J, Charon R. Narrative medicine as a means of training medical students toward residency competencies. Patient Educ Couns. 2013;91(3):280–6.
Moniz T, Arntfield S, Miller K, Lingard L, Watling C, Regehr G. Considerations in the use of reflective writing for student assessment: issues of reliability and validity. Med Educ. 2015;49(9):901–8.
Lucas C, Bosnic-Anticevich S, Schneider CR, Bartimote-Aufflick K, McEntee M, Smith L. Inter-rater reliability of a reflective rubric to assess pharmacy students’ reflective thinking. Curr Pharm Teach Learn. 2017;9(6):989–95.
Hess BJ, Lipner RS, Thompson V, Holmboe ES, Graber ML. Blink or think: can further reflection improve initial diagnostic impressions? Acad med; 2015.
Plaza CM, Draugalis JLR, Slack MK, Skrepnek GH, Sauer KA. Use of reflective portfolios in health sciences education. Am J Pharm Educ. 2007;71(2).
Borgstrom E, Morris R, Wood D, Cohn S, Barclay S. Learning to care: medical students’ reported value and evaluation of palliative care teaching involving meeting patients and reflective writing. BMC Med Educ. 2016;16(1):306.
Dexter S, Mann K. Enhancing learners’ attitudes toward reflective practice. Medical Teacher; 2013.
Koole S, Dornan T, Aper L, De Wever B, Scherpbier A, Valcke M, et al. Using video-cases to assess student reflection: development and validation of an instrument. BMC Med Educ. 2012.
Wallman A, Lindblad AK, Hall S, Lundmark A, Ring L. A categorization scheme for assessing pharmacy students’ levels of reflection during internships. Am J Pharm Educ. 2008;72(1).
Wald HS, Reis SP. Beyond the margins: reflective writing and development of reflective capacity in medical education. J Gen Intern Med. 2010;25(7):746–9.
Wear D, Zarconi J, Garden R, Jones T. Reflection in/and writing: pedagogy and practice in medical education. Acad Med. 2012;87(5):603–9.
Miller-Kuhlmann R, Osullivan PS, Aronson L. Essential steps in developing best practices to assess reflective skill: a comparison of two rubrics. Med Teach. 2016;38(1):75–81.
Plack MM, Driscoll M, Blissett S, McKenna R, Plack TP. A method for assessing reflective journal writing. J Allied Health. 2005;34(4):199–208.
Wald HS, Borkan JM, Taylor JS, Anthony D, Reis SP. Fostering and evaluating reflective capacity in medical education: developing the REFLECT rubric for assessing reflective writing. Acad Med. 2012;87(1):41–50.
Tawanwongsri W, Phenwan T. Reflective and feedback performances on Thai medical students’ patient history-taking skills. BMC Med Educ. 2019;19(1):141.
Mezirow J. Transformative dimensions of adult learning. JosseyBass higher and adult education series; 1991.
Schön DA. The reflective practitioner. Basic books; 1983.
Boud D, Keogh R, Walker D. Promoting reflection in learning: a model. In: Boundaries of Adult Learning; 2002.
Peterkin A, Roberts M, Kavanagh L, Havey T. Narrative means to professional ends: new strategies for teaching CanMEDS roles in Canadian medical schools. Can Fam Physician. 2012;58(10):e563–9.
Kember D, Mckay J, Sinclair K, Kam Yuet Wong F. A four-category scheme for coding and assessing the level of reflection in written work. Assess Eval High Educ. 2008;33(4):369–79.
Acknowledgements.
The authors wish to thank Dr. Hani Tamim, Associate Professor of Internal Medicine at the American University of Beirut-Medical Center, for assistance with the statistical analysis, and Dr. Alexandra Ghadieh for her work on the literature review during the early stages of tool development.
Funding.
This study received no funding.
Authors and affiliations.
Department of Family Medicine, American University of Beirut-Medical Center, Riad El-Solh, P.O. Box 11-0236, Beirut, 1107 2020, Lebanon
Nisrine N. Makarem, Basem R. Saab, Grace Maalouf, Umayya Musharafieh, Fadila Naji & Diana Rahme
Department of Psychology, Haigazian University, Rue Mexique, Kantari, Riad el Solh, P.O. Box 11-1748, Beirut, 11072090, Lebanon
Dayana Brome
Contributions.
BS conceived and supervised the study. NM contributed to the literature review and study design, supervised the grading process, conducted the data entry, and had primary responsibility for writing the paper. DB contributed to the statistical analysis and interpretation of the results and to writing the discussion section. UM, FN, and DR graded the essays, reviewed the manuscript, and provided valuable feedback. GM reviewed the manuscript and advised on its further development. All authors except DB were involved in the scale development. The author(s) read and approved the final manuscript.
Correspondence to Basem R. Saab.
Ethics approval and consent to participate.
Not applicable.
Competing interests.
The authors declare that they have no competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article.
Makarem, N.N., Saab, B.R., Maalouf, G. et al. Grading reflective essays: the reliability of a newly developed tool- GRE-9. BMC Med Educ 20, 331 (2020). https://doi.org/10.1186/s12909-020-02213-2
Received: 18 December 2019
Accepted: 28 August 2020
Published: 25 September 2020
ISSN: 1472-6920