Tuesday, June 5, 2012

ASSIGNMENT 4


Assignment 4
1. A test is reliable when it can predict precisely the current ability of a test taker. Can you give more explanation or an example of a reliable test?
2. There are several methods to measure the reliability of a test, such as test-retest, split-half, equivalent test, Kuder-Richardson, Cronbach's alpha, and judgment reliability. Please choose one and explain when, why, and how it is used.
 
The deadline for the first posting is Thursday midnight and for the second posting Sunday midnight.
Please give your comments here!

25 comments:

Unknown said...

No. 1
A test is reliable when it can predict precisely the current ability of a test taker. Reliability is defined as the extent to which a questionnaire, test, observation, or any measurement procedure produces the same results on repeated trials. In short, it is the stability or consistency of scores over time or across raters. Keep in mind that reliability pertains to scores, not people. Thus, in research we would never say that someone was reliable. In other words, if a measurement device or procedure consistently assigns the same score to individuals or objects with equal values, the device is considered reliable. Researchers must establish the reliability of their measurement devices in order to be certain that they are obtaining a systematic and consistent record of the variation in X and Y. So, I conclude that reliability is one of the most important elements of test quality. It has to do with the consistency, or reproducibility, of an examinee's performance on the test.
For example: consider judges in a platform diving competition. The extent to which they agree on the scores for each contestant is an indication of reliability. Similarly, the degree to which an individual’s responses (i.e., their scores) on a survey would stay the same over time is also a sign of reliability.

No. 2
Well, I will explain one of the methods to measure the reliability of a test, namely the equivalent test. Equivalence is measured through a parallel-forms procedure in which one administers alternative forms of the same measure to either the same group or a different group of respondents. In practice the parallel-forms procedure is seldom implemented, as it is difficult, if not impossible, to verify that two tests are indeed parallel (i.e., have equal means, variances, and correlations with other measures). Indeed, it is difficult enough to develop one well-constructed instrument to measure the construct of interest, let alone two. Another situation in which equivalence is important is when the measurement process entails subjective judgments or ratings made by more than one person. Equivalence reliability is estimated by administering both forms of the exam to the same group of examinees. While the time between the two test administrations should be short, it does need to be long enough so that examinees' scores are not affected by fatigue. The examinees' scores on the two test forms are correlated in order to determine how similarly the two forms function. This reliability estimate is a measure of how consistent examinees' scores can be expected to be across test forms.
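To illustrate the correlation step described above, here is a minimal Python sketch of an equivalent-forms reliability estimate; the form names and scores are hypothetical, invented only for illustration:

```python
import numpy as np

# Hypothetical scores of the same ten examinees on two parallel forms.
form_a = np.array([55, 62, 70, 48, 90, 66, 75, 58, 83, 71])
form_b = np.array([58, 60, 73, 50, 88, 64, 78, 55, 80, 74])

# The equivalent-forms reliability estimate is the Pearson correlation
# between the two sets of scores.
reliability = np.corrcoef(form_a, form_b)[0, 1]
print(f"Equivalent-forms reliability estimate: {reliability:.2f}")
```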

Answered by:
NURUL AZIZAH
A1D209090

La Ode Karmin said...

1. Test reliability is the aspect of test quality concerned with whether or not a test produces consistent results. Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways.
2. Cronbach's alpha is a coefficient of reliability. It is commonly used as a measure of the internal consistency or reliability of a psychometric test score for a sample of examinees. The measure can be viewed as an extension of the Kuder-Richardson Formula 20 (KR-20), which is an equivalent measure for dichotomous items. Theoretically, alpha varies from zero to 1, since it is the ratio of two variances. This rule should be applied with caution when alpha has been computed from items that systematically violate its assumptions. Furthermore, the appropriate degree of reliability depends upon the use of the instrument. For example, an instrument designed to be used as part of a battery of tests may be intentionally designed to be as short as possible, and therefore somewhat less reliable. Other situations may require extremely precise measures with very high reliabilities. Cronbach's alpha can be written as a function of the number of test items and the average inter-correlation among the items. Additionally, if the average inter-item correlation is low, alpha will be low. As the average inter-item correlation increases, Cronbach's alpha increases as well (holding the number of items constant).
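As a rough illustration of how alpha is computed from an examinee-by-item score matrix, here is a small Python sketch; the item scores are made up, and the variance convention (ddof=1) is an assumption rather than something fixed by the comment above:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (examinees x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical scores of five examinees on four rated items.
scores = [[3, 4, 3, 5],
          [2, 2, 3, 2],
          [4, 5, 5, 4],
          [1, 2, 1, 2],
          [3, 3, 4, 4]]
print(f"alpha = {cronbach_alpha(scores):.2f}")
```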

sari amiliya said...

1. Reliability
Reliability is also relevant to language testing, since tests are used in the learning process. Conventionally, reliability is defined as the characteristic of a test that has the ability to produce consistent measurements. Reliability does not refer to the test itself as a measurement instrument but to the result of the measurement, in the form of consistent scores. Reliability is one of the most important elements of test quality. It has to do with the consistency, or reproducibility, of an examinee's performance on the test. For example, if you were to administer a test with high reliability to an examinee on two occasions, you would be very likely to reach the same conclusions about the examinee's performance both times. A test with poor reliability, on the other hand, might result in very different scores for the examinee across the two test administrations. If a test yields inconsistent scores, it may be unethical to take any substantive actions on the basis of the test. There are several methods for computing test reliability, including test-retest reliability, parallel-forms reliability, decision consistency, internal consistency, and inter-rater reliability. For many criterion-referenced tests, decision consistency is often an appropriate choice.

2. Test Re-test Method

Calculating the level of reliability by the retest method requires using the same test with the same group of participants. To estimate test-retest reliability, you must administer a test form to a single group of examinees on two separate occasions. Typically, the two administrations are only a few days or a few weeks apart; the time should be short enough so that the examinees' skills in the area being assessed have not changed through additional learning. The relationship between the examinees' scores from the two different administrations is estimated, through statistical correlation, to determine how similar the scores are. This type of reliability demonstrates the extent to which a test is able to produce stable, consistent scores across time.
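A minimal sketch of that correlation step, with invented scores for the two administrations, might look like this in Python:

```python
import numpy as np

# Hypothetical scores of the same eight examinees on the first and second
# administrations of the same test, a few weeks apart.
first  = np.array([70.0, 55, 88, 62, 45, 77, 90, 66])
second = np.array([72.0, 53, 85, 65, 48, 75, 92, 63])

# Pearson correlation: covariance divided by the product of the
# standard deviations of the two administrations.
cov = np.mean((first - first.mean()) * (second - second.mean()))
r = cov / (first.std() * second.std())
print(f"Test-retest reliability estimate: {r:.2f}")
```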

Safriana Asrawi said...

Name :Safriana Asrawi
Reg. NO :A1D2 09014

Number 1
In my opinion, a test is reliable if it gives results that stay the same when the test is repeated. A test is said to be reliable if its results show consistency. In other words, if students are given the same test at different times, each student will remain in the same order (ranking) within the group.

Number 2
The retest method is used to avoid preparing two series of tests. With this technique the tester has only one series of tests, but it is administered twice. Because only a single test is administered on two occasions, this method can be called a single-test, double-trial method. The results of the two administrations are then compared using correlation.
For tests that mainly tap knowledge (memory) and comprehension, this method is less suitable because the test takers will still remember the items they have seen. Therefore, a time interval should separate the first administration from the second. If the interval is too short, many students will still remember the material. Conversely, if the interval is too long, the conditions of the two administrations will differ, and the students themselves may have learned something in the meantime. Of course, these factors will also affect the reliability.

SUYANTI said...

Number 1
Reliability is the characteristic of a test that has the ability to produce consistent measurements, even when it is used with different groups. Reliability does not concern the test as a measuring tool but the scores it produces. Reliability is established empirically, using statistical methods; the statistics express the degree of consistency as a correlation coefficient. So reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways.

Number 2
Equivalent reliability is gauged by comparing two different tests that were created using the same content. This is accomplished by creating a large pool of test items that measure the same quality and then randomly dividing the items into two separate tests. The two tests should then be administered to the same subjects at the same time.
To apply this method we have to develop two equivalent tests that produce the same, or equal, scores for the same group of participants without repeating the use of a single test. The equivalence should cover every aspect of test construction, from the number and kind of questions to the difficulty level of the test, so that the two tests are truly equal and able to give equivalent results.
Because of the strict requirements of equivalence, equivalent tests are mainly used in the development of standardized tests. With two equivalent tests, the timing of the administrations is not a problem in the reliability context. The correlation coefficient is obtained by correlating the two groups of scores that are produced.

Salmatian Safiuddin said...

SALMATIAN SAFIUDDIN A1D2 09 056
1. I would like to give more explanation about reliability. Reliability is a property of a test that is distinct from validity. It is concerned with scores, so the assessment is based on the scores: the consistency of the scores whenever and wherever the test is given. The characteristic of reliability is consistency of scores from time to time. It is an empirical property, related to statistical data: the statistics express the degree of consistency as a correlation coefficient. In other words, a test is considered reliable if we get the same result repeatedly. There are several ways to measure reliability, such as test-retest reliability, split-half reliability, scorer reliability, equivalent forms, the Kuder-Richardson formula, and so on. For example:
assume that you gave your students a history test yesterday and then give the test again today. You find that the students scored very high on the first day and very low on the second day. The test could therefore be called unreliable.
2. Test-Retest Reliability
In this method, the test is given to the students and then, after it has been administered and scored, it is given again at another time; that is, the test is administered twice at two different points in time. We test the same group of subjects again and then compare the consistency of the scores. Some characteristics of this method are:
- There is no announcement that the test will be given again at a later date.
- The conditions should be the same, and the interval between the first and second tests should be short (no more than about a month).
- The test is given to the students at least twice.
- There is no treatment after the first administration of the test.

Nurul Atma said...

1. Reliability is the level of consistency of the scores achieved by the same people on the same test when tested at different times. Thus reliability refers to the characteristics of the scores rather than the test. Based on this definition, it can be concluded that a test is reliable if it has the ability to produce consistent measurements. This means that whenever such a test is used it will give the same, or nearly the same, result. Reliability is expressed in numerical form, usually as a coefficient: a high coefficient means high reliability, and conversely, if the coefficient of a test is low, its reliability is also low.
An example of reliability: the score of student A on an initial test is lower than that of student B. Then the test is conducted again the next day, and the score of student A is still lower than that of student B.

2. Equivalent method. This method uses two tests that have the same objectives, level of difficulty, and composition but different items. They are given to the same students. In English this is called the alternate-forms method (parallel forms). Through this method, two parallel tests, such as reading test series A (whose reliability is to be estimated) and reading test series B, are given to the same group of students. Then the test results are correlated. The correlation coefficient of the two sets of results indicates the reliability of reading test series A: if the coefficient is high, the test is reliable. The strength of this method is that the students face two different sets of items, so there is no recall effect. However, this method requires more time because the tester must prepare two series of tests.

Unknown said...

AIDIL AKBAR
A1D208092

No. 1
A test can be said to be reliable if it gives results that are credible and not contradictory. The reliability of a test is the degree to which it measures its target consistently. Reliability is expressed in numerical form, usually as a coefficient; a high coefficient means high reliability.
For example : If a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways.


No. 2
I will explain the split-half method. In the split-half method, a test is given once and divided into two halves that are scored separately; the scores on one half of the test are then compared with the scores on the remaining half to estimate the reliability.
Why use split-half?
Split-half reliability is a useful measure when it is impractical or undesirable to assess reliability with two tests or with two test administrations (because of limited time or money).
How to use split-half?
1st: Divide the test into halves. The most commonly used way to do this is to assign the odd-numbered items to one half of the test and the even-numbered items to the other; this is called odd-even reliability.

2nd: Find the correlation between the scores on the two halves using the Pearson r formula.

3rd: Adjust the correlation using the Spearman-Brown formula, which steps the estimate up to the reliability of the full-length test. The longer a test, the more reliable it is, so it is necessary to apply the Spearman-Brown formula to a test that has effectively been shortened, as it is in split-half reliability.
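A minimal Python sketch of the three steps above, using a hypothetical item-score matrix, could look like this:

```python
import numpy as np

# Hypothetical 0/1 item scores: 6 examinees x 10 items.
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
])

# Step 1: odd-even split (columns 0, 2, 4, ... vs columns 1, 3, 5, ...).
odd_half  = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

# Step 2: Pearson correlation between the two half-test scores.
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Step 3: Spearman-Brown step-up to the full test length.
r_full = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.2f}, full-test reliability = {r_full:.2f}")
```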

zulwan said...

1. A test is reliable when it can predict precisely the current ability of a test taker. This means that a reliable test matches what we give with what we want to measure; in other words, the test is valid, because if we want to predict precisely the current ability of a test taker, the test we give must be valid, and when the test is used again we should get the same results. So reliability is a measurement of whether a test measures consistently what it is supposed to measure. An example of a reliable test: if student A's score on an initial test is lower than student B's, then when the measurement is repeated, student A's score is again lower than student B's. The result is said to be stable: the students keep the same positions relative to the other members of the group.
2. Test-retest:
We estimate test-retest reliability when we administer the same test to the same sample on two different occasions. This approach assumes that there is no substantial change in the construct being measured between the two occasions. The amount of time allowed between measures is critical. We know that if we measure the same thing twice, the correlation between the two observations will depend in part on how much time elapses between the two measurement occasions. The shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation. This is because the two observations are related over time: the closer in time the two administrations are, the more similar the factors that contribute to error. Since this correlation is the test-retest estimate of reliability, you can obtain considerably different estimates depending on the interval.

Nosmalasari said...

No. 1
A test is reliable when it can predict precisely the current ability of a test taker. Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same.
A test may be reliable but not valid, but a test cannot be valid without being reliable. While validity may be supported by logical argument, reliability is established empirically, through statistical calculation. This calculation is intended to show the degree of consistency in the form of a correlation coefficient.
Reliability shows the extent to which test scores are free from errors of measurement. No test is perfectly reliable, because random errors cause scores to vary or be inconsistent from time to time and from situation to situation. The goal is to try to minimize these inevitable errors of measurement and thus increase reliability.

No.2
I will explain the equivalent method of measuring the reliability of a test. Equivalent (or parallel) tests are two tests that have the same purpose, level of difficulty, and composition but contain different items.
In using this parallel method the tester must prepare two tests, and the same group of participants takes both at the same time. The equivalence must cover every aspect of test construction, from the format and the amount of working time to the scope of the content, the number and type of test items, the difficulty level of the test, and so on. The weakness of this method is the hard work involved, because the tester must prepare two series of tests and needs a long time to try out both of them.

Nosmalasari
A1d2 09 004

sartika Dae said...

Number 1
Reliability means that a test has the ability to produce precise measurements that do not change when it is used repeatedly on the same target. Therefore, the same test participant should obtain the same score when given the same test again, or a different form of the test that is similar in many aspects; the scores obtained should be equal, or likely to be similar, even when the test is taken at a different time. Reliability is demonstrated using a statistical calculation that aims to show the degree of consistency in the form of a correlation coefficient.
Number 2
Test-Retest method
The retest method means giving the same test to a group of subjects twice, with an interval in between. The assumption is that the same test will produce relatively similar scores. In the implementation there should be an interval between the first and second tests. This interval can be short (several days) or long. A short interval can result in a higher coefficient, because memory can affect performance when the test is taken a second time. Conversely, a long interval allows many changes to happen both in the administration of the test and in the participants themselves: the condition of a participant at the second testing is no longer the same as at the first, because learning occurs in the interval between the first and second testing, and there may also be changes in experience, motivation, and so on.

Sartika Dae (A1D2 09 028)

Nasmah Riyani said...

NASMAH RIYANI(A1D209022)

1. Well, to answer number one, we should know what reliability means. In statistics, reliability refers to the consistency of a measure. A measure is said to have high reliability if it produces consistent results under consistent conditions: whenever such a test is used it will give the same, or nearly the same, result. Reliability is expressed in numerical form, usually as a coefficient; a high coefficient means high reliability, and if the coefficient is low, the reliability is low too.
For example : If the score of student A in an initial test is lower than student B. Then the test is conducted again the next time and the score of student A is still lower than student B.

2. I choose the split-half method. There are some questions:
What does this method mean? The method treats the two halves of a measure as alternate forms. It provides a simple solution to the problem that the parallel-forms method faces: the difficulty of developing alternate forms.

It involves:
Administering a test to a group of individuals
Splitting the test in half
Correlating scores on one half of the test with scores on the other half of the test

The correlation between these two split halves is used in estimating the reliability of the test. This half-test reliability estimate is then stepped up to the full test length using the Spearman-Brown prediction formula.

How do we use it? There are several ways of splitting a test to estimate reliability. For example, a 40-item vocabulary test could be split into two subtests, the first one made up of items 1 through 20 and the second made up of items 21 through 40. However, the responses from the first half may be systematically different from responses in the second half due to an increase in item difficulty and fatigue.

Deltateknikkendari said...

1. Reliability of a Test
Reliability is one of the most important elements of test quality. It has to do with the consistency, or reproducibility, of an examinee's performance on the test. For example, if you were to administer a test with high reliability to an examinee on two occasions, you would be very likely to reach the same conclusions about the examinee's performance both times. A test with poor reliability, on the other hand, might result in very different scores for the examinee across the two test administrations. If a test yields inconsistent scores, it may be unethical to take any substantive actions on the basis of the test.
2. A Kind of Reliability
Test-retest Reliability
When will it be used?
Test-retest reliability is estimated when we administer the same test to the same sample on two different occasions. This approach assumes that there is no substantial change in the construct being measured between the two occasions. The amount of time allowed between measures is critical.
Why will it be used?
It is used because this type of reliability demonstrates the extent to which a test is able to produce stable, consistent scores across time.
How will it be used?
Test-retest reliability is a statistical technique used to estimate components of measurement error by repeating the measurement process on the same subjects, under conditions as similar as possible, and comparing the observations. The term reliability in this context refers to the precision of the measurement (i.e. small variability in the observations that would be made on the same subject on different occasions) but is not concerned with the potential existence of bias. In the context of surveys, test-retest is usually in the form of an interview-re-interview procedure, where the survey instrument is administered on multiple occasions (usually twice), and the responses on these occasions are compared.



Dewa Gd Karyadi W. Putra
A1D209086

Sri Yuliani .M (A1 D2 09 006) said...

name : SRI YULIANI .M
reg.no : A1 D2 09 006

1. Reliability means the consistency with which a test measures what it should measure. Reliability is necessary but not sufficient as a condition for the validity of a test: for a test to be valid it must be reliable, but a reliable test is not necessarily valid. Reliability refers to the consistency of the scores achieved by the same individuals when they are retested with the same test on different occasions, or with a different set of equivalent items.
For example: we give our students a reading test in the first meeting, and in the second meeting we give the same reading test again. Both tests are the same. When we get the scores, we find that the students' scores are the same in the first meeting and the second meeting, so we can conclude that the test is reliable.

2. Equivalent test
The equivalent test is also called the "double-test, double-trial" method. From the beginning the researcher develops two parallel (equivalent) instruments; the two instruments are constructed on the basis of common objectives, level of difficulty, and composition, but with different items. With this method, two parallel tests, such as writing test series A (whose reliability is to be estimated) and series B, are given to the same group of students, and then the results are correlated. The correlation coefficient of the two tests shows the reliability coefficient of writing test series A; if the coefficient is high, the test is reliable and can be used as a reliable measure. In using this parallel method the tester must prepare two tests, and the same group of students takes both at the same time.

Al Adawiyah Sumam said...

1. Reliability
Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways.

2. Split-half method:
This method treats the two halves of a measure as alternate forms. It provides a simple solution to the problem that the parallel-forms method faces: the difficulty in developing alternate forms.
It involves:
• Administering a test to a group of individuals
• Splitting the test in half
• Correlating scores on one half of the test with scores on the other half of the test
The correlation between these two split halves is used in estimating the reliability of the test. This half-test reliability estimate is then stepped up to the full test length using the Spearman-Brown prediction formula.
There are several ways of splitting a test to estimate reliability. For example, a 40-item vocabulary test could be split into two subtests, the first one made up of items 1 through 20 and the second made up of items 21 through 40. However, the responses from the first half may be systematically different from responses in the second half due to an increase in item difficulty and fatigue.
In splitting a test, the two halves would need to be as similar as possible, both in terms of their content and in terms of the probable state of the respondent. The simplest method is to adopt an odd-even split, in which the odd-numbered items form one half of the test and the even-numbered items form the other. This arrangement guarantees that each half will contain an equal number of items from the beginning, middle, and end of the original test.

AL ADAWIYAH SUMAM
A1D2 09012

Astufiani said...

ASTUFIANI (A1D209076)
No.1
Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways.
To gauge test-retest reliability, the test is administered twice at two different points in time. This kind of reliability is used to assess the consistency of a test across time. This type of reliability assumes that there will be no change in the quality or construct being measured. Test-retest reliability is best used for things that are stable over time, such as intelligence. Generally, reliability will be higher when little time has passed between tests.
No.2
The Cronbach alpha method is used when the assumptions required by the KR-20 and KR-21 methods cannot be met, particularly for essay-type research tests, including tests of writing, or judgments made on an ordered scale. In such tests the scores given usually vary according to the weight, completeness, and quality of an answer. Reliability in this case can be estimated with the Cronbach's alpha coefficient. The application of this method is quite simple, requiring only the calculation of the variance of each item. An essay answer is a unity and does not consist of separate items, so it is assessed on the basis of its components, with an ordering and weighting that can be tailored to the needs; for writing, the assessment components include content, organization, language, vocabulary, and spelling.
Cronbach Alpha/Coefficient Alpha
The Cronbach Alpha/Coefficient Alpha formula is a general formula for estimating the reliability of a test consisting of items on which different scoring weights may be assigned to different responses.
Why use this method: we also use inter-scorer reliability, which measures the degree of agreement between persons scoring a subjective test (like an essay exam) or rating an individual. In the latter case, this type of reliability is most often used when scorers have to observe and rate the actions of participants in a study. It reveals how well the scorers agreed when rating the same set of things. Other names for this type of reliability are inter-rater reliability or inter-observer reliability.
How to use this method can be seen in an example: in Table 2 (of the cited study), the percentages of agreement between the raters for each occasion (day) are presented in two ways: first, for the case in which agreement meant an exact match between raters in their assigned ratings; second, for the case in which agreement was defined more leniently as either exact agreement or a difference between the two raters' scores of not more than one point in either direction. (This latter definition of agreement has been used fairly often in the estimation of inter-rater agreement on some types of measures, such as parent-infant interaction scales [Goodwin & Sandall, 1988].) As would be expected, percentages of agreement are lower when agreement is defined in the more conservative way (exact match). The results shown in Table 2 demonstrate that the median percentage of agreement for the 6 days, when agreement was defined as an exact match, was 20%; the median percentage of agreement for the 6 days, when the more liberal definition of agreement was used, was 80%. (Percentages of agreement were not calculated for the total scores in Table 1 because this approach to reliability estimation is rarely used when the range of scores is large; here, the total scores could range from 6 to 42.)
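As a hypothetical illustration of the two agreement definitions mentioned above (exact match versus agreement within one point), the following Python sketch computes both percentages for two raters; the ratings are invented and are not the data from the cited study:

```python
import numpy as np

# Hypothetical ratings of ten performances by two raters on a 1-7 scale.
rater1 = np.array([4, 5, 3, 6, 2, 5, 4, 7, 3, 6])
rater2 = np.array([4, 6, 3, 5, 2, 5, 5, 7, 2, 6])

diff = np.abs(rater1 - rater2)

# Conservative definition: exact match between the two raters.
exact_agreement = np.mean(diff == 0) * 100

# Lenient definition: scores differ by no more than one point.
within_one_agreement = np.mean(diff <= 1) * 100

print(f"Exact agreement:      {exact_agreement:.0f}%")
print(f"Within-one agreement: {within_one_agreement:.0f}%")
```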

ST. NURJANNAH SONDENG said...

St. Nurjannah. Sondeng (A1 D2 09 044)
Number 1
Reliability is an essential component of validity but, on its own, is not a sufficient measure of validity. A test can be reliable but not valid, whereas a test cannot be valid yet unreliable. Reliability, in simple terms, describes the repeatability and consistency of a test. Testing for reliability is about exercising an application so that failures are discovered and removed before the system is deployed. Because the different combinations of alternate pathways through an application are high, it is unlikely that you can find all potential failures in a complex application.
Number 2
I want to try to explain the Kuder-Richardson method.
The Kuder-Richardson method is applied to teacher-made tests. Why KR? Because it is used to test the reliability of binary measurements, such as exam questions, to see whether the items within the instrument obtain the same binary (no/yes, right/wrong) results over a population of test takers.
How is this method used? It is closely related to Cronbach's alpha: when applied to binary data, Cronbach's alpha produces the same value as K-R 20. As alpha has wider applicability, it has increasingly replaced K-R 20 as a measurement of agreement or internal consistency.
For example:
We have 4 multiple-choice questions (T1 to T4) administered to 5 students, where 0 represents a wrong answer and 1 a correct answer:

            T1       T2       T3       T4
Student 1   wrong    correct  correct  wrong
Student 2   correct  correct  correct  correct
Student 3   wrong    correct  wrong    wrong
Student 4   wrong    wrong    correct  wrong
Student 5   correct  correct  correct  correct

Coded as 0/1, the data set to be used is:

0 1 1 0
1 1 1 1
0 1 0 0
0 0 1 0
1 1 1 1

K-R 20 = 0.7536
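A small Python sketch of the K-R 20 computation on the 5-student, 4-item matrix above; using the population variance of the total scores reproduces the quoted value of 0.7536 (the helper function name is just illustrative):

```python
import numpy as np

def kr20(items):
    """K-R 20 for a (students x items) matrix of 0/1 scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                      # number of items
    p = items.mean(axis=0)                  # proportion correct per item
    q = 1 - p                               # proportion wrong per item
    total_var = items.sum(axis=1).var()     # population variance of totals
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

# 0 = wrong, 1 = correct, as in the table above (students 1-5, items T1-T4).
data = [[0, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [1, 1, 1, 1]]
print(f"K-R 20 = {kr20(data):.4f}")   # prints 0.7536
```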

Zurrahmah said...

Nama : Zurrahmah
Reg. No. : A1D2 09 008
Number 1
Reliability can be defined as the degree of consistency of the scores achieved by the same people on the same test when tested at different times. Consistent scores can also be obtained with different items that are similar in various aspects. A measurement is reliable if repeated administration to the same group of subjects yields relatively the same results. This is also stated by Fraenkel, Wallen & Hyun (2012) in their book "How to Design and Evaluate Research in Education": "Reliability refers to the consistency of scores or answers from one administration of an instrument to another, and from one set of items to another."

Number 2
• Judgment reliability
Judgment or inter-observer reliability is used to assess the degree to which different observers give consistent estimates of the same phenomenon. It is also the method to apply when we want to improve the accuracy of the reliability calculation for essay tests.

We need this method because essay tests are evaluated on the basis of their components, with weighting, and we cannot completely eliminate the subjectivity of the assessment; this subjectivity affects the level of reliability, however slightly, as reflected in the calculation of the reliability coefficients. That is why we need to apply this method.

In the application of this method, each test taker's work is rated by more than one appraiser, at least two people. Each rater independently conducts his or her own assessment, on the basis of predetermined criteria. If the scores given by the assessors are correlated, the result shows the reliability coefficient of the essay test.

ZULKIFLI said...

In my opinion:
1. Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways. Also, a reliable measure, that is, one that is measuring something consistently, may not be measuring what you want to be measuring. For example, while there are many reliable tests of specific abilities, not all of them would be valid for predicting, say, job performance. In terms of accuracy and precision, reliability is analogous to precision, while validity is analogous to accuracy.
2. I will explain one of the methods to measure the reliability of a test, namely the split-half method. The split-half method refers to a form of internal reliability in which the consistency of item responses is determined by comparing scores on half of the items with scores on the other half of the items. This method treats the two halves of a measure as alternate forms. It provides a simple solution to the problem that the parallel-forms method faces: the difficulty of developing alternate forms.

It involves:
Administering a test to a group of individuals, splitting the test in half, and correlating scores on one half of the test with scores on the other half. The correlation between these two split halves is used in estimating the reliability of the test. This half-test reliability estimate is then stepped up to the full test length using the Spearman-Brown prediction formula. There are several ways of splitting a test to estimate reliability. For example, a 40-item vocabulary test could be split into two subtests, the first one made up of items 1 through 20 and the second made up of items 21 through 40. However, the responses from the first half may be systematically different from responses in the second half due to an increase in item difficulty and fatigue.

In splitting a test, the two halves would need to be as similar as possible, both in terms of their content and in terms of the probable state of the respondent. The simplest method is to adopt an odd-even split, in which the odd-numbered items form one half of the test and the even-numbered items form the other. This arrangement guarantees that each half will contain an equal number of items from the beginning, middle, and end of the original test.
zulkifli (A1D208070)

Sitti Rahmawati said...

Name : Sitti Rahmawati
Reg. Number : A1D2 09 078

1. A test is reliable when it can predict precisely the current ability of a test taker. Reliability is defined as a test characteristic: the ability to produce constant measurements that do not change when the test is used time after time on the same object. Strictly speaking, reliability is not a property of the test as a measuring instrument but of the result of the measurement: scores that are constant and do not change. With that characteristic, the same test participants should obtain the same, or almost the same, score if they retake the same test at a different time. In general, absolute reliability is regarded as only theoretical, because in fact there is almost never a perfectly constant measurement result, without any difference, especially for tests that involve many aspects, as in language teaching. There are several methods we can use to estimate the level of reliability, such as test-retest reliability, split-half reliability, scorer reliability, equivalent forms, and the Kuder-Richardson formula.
For example: we gave a test to some students and they obtained high scores. Some days later we gave the same test to the same students, but this time they obtained low scores; this means that the test we used is not reliable.

2. Test retest method.
To find the level of reliability with this method, we give the same test to the same participants at a different time. The correlation coefficient, which shows how constant the test scores are, is obtained by correlating the two sets of scores resulting from the same test. The higher the correlation coefficient obtained, the higher the constancy of the test results. In using this method, we need to consider the time interval between the two administrations of the test.

wafa laila najah (A1D209074) said...

1. The reliability of a research instrument concerns the extent to which the instrument yields the same results on repeated trials. Although unreliability is always present to a certain extent, there will generally be a good deal of consistency in the results of a quality instrument gathered at different times. The tendency toward consistency found in repeated measurements is referred to as reliability. A test may be reliable but not valid, but a test cannot be valid without being reliable. For example, if I use a yardstick that is mislabeled to measure the distance from tee to hole in golf on holes of different lengths, the results will be neither reliable nor valid. If you use the same stick to measure football fields that are all the same length, the result will be reliable (repeatable, consistent) but not valid (the wrong number of yards). There is no test that is unreliable yet valid (measures what we are looking for).


2. I will explain one of the methods to measure the reliability of a test, namely the split-halves method. This method is more practical in that it does not require two administrations of the same test or of an alternative form. In the split-halves method, the total number of items is divided into halves, and a correlation is taken between the two halves. This correlation only estimates the reliability of each half of the test. It is necessary then to use a statistical correction to estimate the reliability of the whole test. This correction is known as the Spearman-Brown prophecy formula (Carmines & Zeller, 1979): Pxx" = 2Pxx' / (1 + Pxx'), where Pxx" is the reliability coefficient for the whole test and Pxx' is the split-half correlation.

Example
If the correlation between the halves is .75,
the reliability for the total test is:

Pxx" = [(2) (.75)]/(1 + .75) = 1.5/1.75 = .857

There are many ways to divide the items in an instrument into halves. The most typical way is to assign the odd numbered items to one half and the even numbered items to the other half of the test. One drawback of the split-halves method is that the correlation between the two halves is dependent upon the method used to divide the items.

Winda Mulandari Thahir said...

1. Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways. A reliable test is another characteristic that needs to be associated with a language test as one of the instruments used in teaching. Conventionally, test reliability is the characteristic of being able to produce measurements that do not change.
2. Re-test method
Calculating the level of reliability by the retest method requires using the same test twice with the same group of test takers. The correlation coefficient, which indicates how consistent the scores produced by the test are, can be obtained by correlating the two series of scores that result from repeating the same test. A short interval between administrations can result in a higher coefficient, because memory is one of the factors that may affect the ability to do the test a second time.

Name : Winda Mulandari Thahir
Reg. No : A1D209 060

SUCI WAHYUNI RUSDIN said...

1. Reliability is the consistency of a series of measurements, or of a measuring instrument, when the measurements are performed with the same device repeatedly. The reliability of a test is its level of regularity (consistency), namely the extent to which the test can be trusted to produce a steady score that remains relatively unchanged even when the test is administered in different situations. The reliability of a test is the degree to which the test measures its target consistently. Reliability is expressed in numerical form, usually as a coefficient; a high coefficient means high reliability.
2. Split-half method
In the split-half method, the test whose reliability is being assessed needs to be administered only once to a group of test participants. To calculate the reliability coefficient, all the answers of each participant are divided into two parts, usually by separating the answers to the even-numbered items from the answers to the odd-numbered items. Thus each participant effectively produces scores on two equivalent tests, although each is actually just half of the original test. The use of this method requires a test consisting of enough items that it can be divided into two reasonably balanced halves that can be studied empirically.

NAME : SUCI WAHYUNI RUSDIN
REG. NO : A1D209082

Himaya said...

1. A reliability test concerns language tests used in the conduct of conventional language teaching. Reliability is defined as the characteristic of a test that has the ability to generate a steady measurement, one that does not change if the test is used repeatedly on the same target.
For example, a TOEFL-style vocabulary test given to third-grade junior high school students to determine the number of words the students know: to find out whether the test is reliable or not, the test is administered repeatedly to the same people.
2. The retest method is the use of the same test twice with the same group of participants. The correlation coefficient, which shows how consistent the scores produced by the test are, can be obtained by correlating the two series of scores resulting from repeating the same test.
a. When is the retest method used? The repetition may take place within a short time, such as a few days, or over a longer period of up to several months.
b. Why must the retest method be used with care? Because changes in ability between the two administrations of the same test can result in a lower reliability coefficient, which indicates a lower level of reliability.
c. How is the retest method used? The retest is carried out by minimizing the memory factor and the factors that influence the development of ability, both of which tend to obscure the reliability coefficient obtained from comparing the resulting scores.
NAME :HIMAYA
REG.NO :A1D208094

Siti Finartin said...

Siti Finartin (A1D2 09 088)
No.1
Reliability is one of the most important elements of test quality. It has to do with the consistency, or reproducibility, of an examinee's performance on the test. Reliability is a term used to describe the properties of tests and measures. The reliability of a test or measure refers to its degree of stability, consistency, and repeatability. Reliability also refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait, then each time the test is administered to a subject, the results should be approximately the same. Unfortunately, it is impossible to calculate reliability exactly, but it can be estimated in a number of different ways.
No.2
I will explain about Test-Retest Reliability
To gauge test-retest reliability, the test is administered twice at two different points in time. This kind of reliability is used to assess the consistency of a test across time. This type of reliability assumes that there will be no change in the quality or construct being measured. Test-retest reliability is best used for things that are stable over time, such as intelligence. Generally, reliability will be higher when little time has passed between tests. Test-retest reliability refers to the test's consistency among different administrations. To determine the coefficient for this type of reliability, the same test is given to a group of subjects on at least two separate occasions. If the test is reliable, the scores that each student receives on the first administration should be similar to the scores on the second. We would expect the relationship between the first and second administration to be a high positive correlation. One major concern with test-retest reliability is what has been termed the memory effect. This is especially true when the two administrations are close together in time. For example, imagine taking a short 10-question test on vocabulary and then ten minutes later being asked to complete the same test. Most of us will remember our responses, and when we begin to answer again, we may just answer the way we did on the first test rather than reading through the questions carefully. This can create an artificially high reliability coefficient as subjects respond from their memory rather than from the test itself. When a pre-test and post-test for an experiment are the same, the memory effect can play a role in the results.


Please contact Karmin and Dewa if your comments are not published in this blog! (Word verification makes your comments more trustworthy.)

 