How to Obtain the Validity and Reliability of an Essay Test?

Relax... there's a test

To obtain the validity and reliability of the essay test I am constructing, I first have to look at the definitions of validation and reliability.

Validation is the process of accumulating evidence that supports the appropriateness of the inferences made from student responses to the test. To make sure that the essay test I give to students is valid, there are two things to do:

First, I clearly state the purpose and objectives of the test. For example, the objectives of my essay test are:

1/ students write an organized paragraph

2/ students show logical development of ideas

3/ students use correct grammar and mechanics

4/ students demonstrate style and quality of expression.


Writing out these objectives supports content validity, because the test then clearly defines the achievement that I intend to measure.

Next, I develop scoring criteria that address each objective. If one of the objectives is not represented in the score categories, then the rubric does not give the evidence needed to examine that objective. If some of the scoring criteria are not related to the objectives, then the appropriateness of the assessment and the rubric is in question. The scoring rubric also supports criterion validity, since it lets me see precisely to what extent my students have actually reached each test criterion. Here is my scoring rubric:





Essay has an effective introductory paragraph
Topic sentence/thesis statement is stated
Essay has apparent body paragraphs
Essay has a satisfactory concluding paragraph
Ideas are concrete and well developed
Supporting details are relevant and sufficient
Essay reflects complete thought (cohesiveness)
Essay demonstrates syntactic variety and rhetorical fluency
Essay uses correct English writing conventions (punctuation and spelling)
Essay uses a wide range of vocabulary
Essay shows good register and is concise
Essay is written in a neat and legible format


Although the criteria in the rubric may seem too detailed, all of them are related to the four objectives mentioned above. Using a detailed scoring rubric supports the validity of my assessment. Indeed, good grading practices can also increase the reliability of essay tests. Also, as far as I am concerned, a valid assessment is by necessity reliable.
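The alignment check described above (every objective has at least one criterion, and every criterion points to an objective) can be sketched in a few lines of Python. This is only an illustration: the assignment of each criterion to an objective number below is my own assumed grouping, and the name `check_alignment` is hypothetical.

```python
# The four objectives of the essay test, numbered as in the list above.
OBJECTIVES = {
    1: "write an organized paragraph",
    2: "show logical development of ideas",
    3: "use correct grammar and mechanics",
    4: "demonstrate style and quality of expression",
}

# ASSUMPTION: this criterion-to-objective mapping is illustrative only.
RUBRIC = {
    "effective introductory paragraph": 1,
    "topic sentence/thesis statement is stated": 1,
    "apparent body paragraphs": 1,
    "satisfactory concluding paragraph": 1,
    "ideas are concrete and well developed": 2,
    "supporting details are relevant and sufficient": 2,
    "complete thought (cohesiveness)": 2,
    "syntactic variety and rhetorical fluency": 4,
    "correct English writing conventions": 3,
    "wide range of vocabulary": 4,
    "good register and concise": 4,
    "neat and legible format": 3,
}


def check_alignment(objectives, rubric):
    """Return (objectives with no criterion, criteria with no known objective)."""
    covered = set(rubric.values())
    uncovered = sorted(o for o in objectives if o not in covered)
    orphans = sorted(c for c, o in rubric.items() if o not in objectives)
    return uncovered, orphans


uncovered, orphans = check_alignment(OBJECTIVES, RUBRIC)
print("Objectives without criteria:", uncovered)
print("Criteria without an objective:", orphans)
```

If both lists come back empty, the rubric and the objectives line up. A non-empty list points to the objective or criterion that needs another look.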


Reliability refers to the consistency of assessment scores. If my test is reliable, a student will get the same score regardless of when he/she completed the test, when the response was scored, and who scored the response. Two forms of reliability that matter in classroom assessment and in rubric development both involve rater (or scorer) reliability. Rater reliability generally refers to the consistency of scores that are assigned by two independent raters (interrater reliability) and of scores that are assigned by the same rater at different points in time (intrarater reliability).

Sometimes I check interrater reliability by asking my Teaching Assistant (TA) to score my students' essays using the scoring rubric I prepared previously, or by grading papers together with the TA for clarity of evaluation and time efficiency. This shows whether there is a great discrepancy between the TA's scoring and mine. If our scores do not show a great discrepancy, I can say that my test is reliable.

Other times, I check intrarater reliability by scoring the same essays again several weeks later. If the earlier scores do not show big discrepancies with the second scoring, then my evaluation is reliable.
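As a rough sketch of how such a discrepancy check could be done, the Python snippet below compares two sets of scores for the same essays. The scores, the one-point tolerance, and the function name `rater_agreement` are all made up for illustration; what counts as a "great discrepancy" is a judgment call that depends on the rubric scale.

```python
def rater_agreement(scores_a, scores_b, tolerance=1):
    """Summarize how closely two score lists for the same essays agree."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("Both score lists must be non-empty and equally long.")
    diffs = [abs(a - b) for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    return {
        # share of essays given exactly the same score
        "exact": sum(d == 0 for d in diffs) / n,
        # share of essays whose scores differ by at most `tolerance` points
        "adjacent": sum(d <= tolerance for d in diffs) / n,
        # average size of the disagreement, in rubric points
        "mean_abs_diff": sum(diffs) / n,
        # positions of essays worth re-reading together
        "flagged": [i for i, d in enumerate(diffs) if d > tolerance],
    }


# Hypothetical scores on a 1-5 scale for five essays.
teacher = [4, 3, 5, 2, 4]
assistant = [4, 3, 3, 2, 5]

print(rater_agreement(teacher, assistant))
```

The same function works for an intrarater check: pass the first scoring and the second scoring of the same rater instead of the scores of two different raters.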
