ANSWER MARKUP ON COMPUTER-ASSISTED
Ying-Ju Jenny Peng, Duke University
North Carolina, USA
ABSTRACT
This paper intends to illustrate how markup can be efficiently used in CALL and
identify the difficulties from an algorithmic, computer science point of view.
When computers are confronted with text (CAI, machine translation, information retrieval, etc.), spelling errors greatly decrease the accuracy of performance. Thanks to expert systems, artificial intelligence methods, and the like, computers now have the potential to assist teachers in language learning. But how can computer-aided instruction (CAI) apply to language learning if computers are not able to handle language errors?
Judging a student response is one of the most difficult parts of a computer-assisted lesson to design and program. It takes care and experience to program good judging routines that provide the instructor with the necessary information about the student response for making decisions concerning feedback.
Feedback should be positive, should avoid negative statements, and should never demean the student. The slowest students, whose confidence and attitudes are already low, will suffer the greatest discouragement. Feedback should be corrective. It should provide the student with information to improve future performance; simply saying "incorrect" after a response is not desirable. Response markup, or answer markup, which is the subject of this paper, can be used to solve this kind of problem.
CALICO Journal, Volume 10 Number 3 31
There are a number of possible kinds of feedback:
• Text feedback
• Graphic feedback
• Markup feedback
• Error-contingent feedback
We will mainly discuss the third and fourth types of feedback: markup and error-contingent feedback.
Types of Text Markup
Two levels of text markup in CALL can be distinguished: word level and character level.
Word level markup provides feedback to the student on the positioning and inclusion of words within a multi-word response. The Grammar Checker and the Style Checker are also kinds of word level markup. We will leave aside grammatical errors, since their recovery requires an accurate syntactic analysis of texts by computers, which in turn requires accurate semantic processors and global understanding.
Here we will focus on the second type of text markup, character level markup. Character level markup gives feedback on the positioning and inclusion of characters within a word; it is the so-called "spell checker."
INSTRUCTIONAL VALUE OF TEXT MARKUP
Answer markup is a kind of graphic feedback, which is used when a response is partially correct. Special symbols indicate errors and missing information.
It might be argued that markup is unnecessary because the courseware can simply display the correct answer beside the response and invite the student to discover where he went wrong by comparison. Furthermore, the student would benefit by having performed the comparison himself, rather than having it done for him by the machine.
A secondary objection is that the use of markup requires the student to learn the meanings of markup symbols, and may contribute to his confusion if he forgets what they mean.
However, the real advantage of text markup is evident in situations where the comparison requires more than one glance. For example, students learning English as a foreign language, even when presented with a correct version of a word they misspelled, sometimes have difficulty finding and correcting all the errors in the misspelling. This may be especially true when their native language is not represented by a Latin alphabet. Presumably these students find the comparison tedious and give up after the first error has been found (Nesbit, 1986).
RELATED ERROR DETECTION PROGRAMS
Several programs are available which provide feedback on text characteristics, such as Bell Laboratories' WRITER'S WORKBENCH (Cherry, 1980; Kiefer & Smith, 1983).
WORKBENCH can determine how accessible or readable a text is and calculate the percentage of passive verbs; however, these text characteristics have to do with style, not correctness or errors. A text that has a poor readability score, or a large percentage of passive verbs, could still be a correct text.
Although WORKBENCH does provide feedback on certain errors, such as unpaired quotation marks, the accidental repetition of a word, or errors in using "a" or "an," it is not designed specifically to deal with ungrammatical text.
IBM’s CRITIQUE (Heidorn, Jensen, Miller, Byrd, and Chodorow, 1982) can detect several kinds of errors, for example subject/verb agreement, non-parallel form, and pronoun agreement, and will provide feedback on how such errors might be corrected.
All the programs mentioned above perform word level markup, yet they differ in design because some of the programs are meant to be used in an educational environment rather than business settings. We are interested in programs that teach the skill of correcting errors, whereas CRITIQUE and WORKBENCH are designed toward making business writing more efficient and error-free. In CALL systems we need to provide quick feedback to the students regarding the errors in their answers.
Current spell check technology offers many alternative strategies for providing suitable feedback on responses. The best known of these products is CALIS (Computer Assisted Language Instructional System). CALIS is a computer-based language instructional system specifically designed for use in foreign language learning. Its spelling checker is the focus of the sections that follow.
TYPOLOGY OF SPELLING ERRORS
If texts are seen as strings of symbols, spelling errors can be generally thought of as alterations of strings. Several types of errors, which are due to completely different causes, must receive appropriate modeling and treatment.
There are two basic types of error. The first is mainly due to technical problems in equipment, such as errors in input devices, transmission errors, or information storage problems. We call these noise errors because they superimpose "noise" on texts. The second type is due to users mistyping on keyboards (a key is typed twice, the finger slips to a neighboring key, two keys are inverted, etc.). We will call these typographical errors. Figure 1 shows more detail (Veronis, 1988).
There are several possible markup strategies for a misspelled response:
• No markup, and the student is only told whether or not the whole word is misspelled.
• No markup, but the student is shown the target response.
• Partial markup, misspelled parts of the word are highlighted. (We will discuss this method in the following section.)
• Full markup showing all information necessary to correct the word.
ANSWER MARKUP IN CALIS
CALIS uses the "partial markup" method. Answer markup is an example of error-contingent feedback and is used when the response is partially correct.
However, even when a response is totally wrong, feedback specific to the error is useful.
The following is an example of the automatic markup facility of CALIS. In the first response the symbols indicate that a word (Thomas) is missing. On the student's second try, the symbols indicate that the last name is spelled incorrectly, and also indicate which particular character is wrong by setting that character in bold. On the student's third try, the symbols indicate that the last name is missing one character by placing a dashed line at the position of that character. On the fourth try, the symbols indicate that two characters are transposed by flashing those two characters. On the student's fifth try, the response is correct.
Algorithm in the CALIS spelling checker
CALIS uses an algorithm to display a set of "edit markups"; that is, a set of symbols displayed near the student's answer indicating what areas of the answer are wrong and the four operations needed to correct the answer. These operations are insertion, deletion, substitution, and transposition of characters. The markup symbols do not indicate the exact correction needed; they only give the student hints. The point is that for a brief time the student can focus attention on learning the correct spelling of the parts of the original response that were incorrect.
This spelling checker does not explain the student's error in grammatical or semantic terms. The student is expected to consider the editing hints given by the markup symbols and produce a new answer.
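As a rough illustration of this idea (a sketch, not the actual CALIS code), the four operations can be recovered by backtracking through a Damerau-Levenshtein table and rendered as a line of hypothetical symbols aligned under the answer: a blank for a correct character, 'x' for a wrong one, '_' for a missing one, '*' for an extra one, and '~~' under a transposed pair.

```python
def markup(answer: str, model: str) -> str:
    """Return a symbol line aligned under `answer` describing the edits
    needed to reach `model`. Symbols (hypothetical, not CALIS's own):
    ' ' correct, 'x' wrong, '_' missing, '*' extra, '~~' transposed pair."""
    m, n = len(answer), len(model)
    # Damerau-Levenshtein table with unit costs, for simplicity.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if answer[i-1] == model[j-1] else 1
            d[i][j] = min(d[i-1][j] + 1,        # delete from answer
                          d[i][j-1] + 1,        # insert into answer
                          d[i-1][j-1] + cost)   # substitute or match
            if (i > 1 and j > 1 and answer[i-1] == model[j-2]
                    and answer[i-2] == model[j-1]):
                d[i][j] = min(d[i][j], d[i-2][j-2] + 1)  # transposition
    # Backtrace, emitting one symbol per answer position (two for '~~').
    out, i, j = [], m, n
    while i > 0 or j > 0:
        if (i > 1 and j > 1 and answer[i-1] == model[j-2]
                and answer[i-2] == model[j-1]
                and d[i][j] == d[i-2][j-2] + 1):
            out.append('~~'); i -= 2; j -= 2
        elif (i > 0 and j > 0
                and d[i][j] == d[i-1][j-1] + (answer[i-1] != model[j-1])):
            out.append(' ' if answer[i-1] == model[j-1] else 'x')
            i -= 1; j -= 1
        elif j > 0 and d[i][j] == d[i][j-1] + 1:
            out.append('_'); j -= 1
        else:
            out.append('*'); i -= 1
    return ''.join(reversed(out))

print(markup("form", "from"))  # prints ' ~~ '
```

The backtrace checks the transposition case first, so an inverted pair of keys is reported as a single '~~' hint rather than as two substitutions.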
Edit Distance Algorithm in Misspellings
The problem of recognizing misspelled words is viewed as a variation of the sequence comparison problem. It is based on evaluation of the answer by an "edit distance" algorithm (Nesbit, 1990). This algorithm compares the model with the answer and returns a string of edit markup symbols which indicate the smallest set of edit operations required to correct the answer.
Suppose we are given two strings x and y, composed of characters from an alphabet, and that a sequence of edit operations can be used to convert string x into string y. A cost can be assigned to each edit operation which reflects the intended application. Thus, each sequence of edit operations which converts x to y has an associated total cost, which is simply the sum of the costs of all editing operations in the sequence. For example, take the two strings "lkout" and "layout," with the edit operations and costs defined as follows (Nesbit, 1990):
• Deletion of any character in x is 2.
• Insertion of any member of alphabet into any position of x is 2.
• Substitution of any character in x with any member of alphabet is 3.
The string "lkout" can be converted to "layout" by many different routes. One could first apply the deletion operation five times to delete the entire string, and then apply the insertion operation six times to get the desired result. The total cost of this edit sequence would be (2 × 5) + (2 × 6) = 22.
A shorter route would be to substitute "a" for "k," then insert "y" before "o," obtaining a total cost of 3 + 2 = 5. This is the lowest-cost sequence of edit operations.
Therefore, under these conditions the edit distance between the strings is 5. One should keep in mind that there is often more than one sequence of edit operations, and thus more than one markup, corresponding to the edit distance between a pair of strings.
The sequence found by the procedure will depend on the precedence it gives to individual editing operations.
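The worked example can be verified with a short dynamic-programming sketch of the edit distance computation. The costs are the ones given above (deletion 2, insertion 2, substitution 3); the function name is ours, not CALIS's.

```python
def edit_distance(x: str, y: str,
                  del_cost: int = 2, ins_cost: int = 2,
                  sub_cost: int = 3) -> int:
    """Minimum total cost of converting x into y, using the example
    costs from the text: deletion 2, insertion 2, substitution 3."""
    m, n = len(x), len(y)
    # d[i][j] = cheapest way to turn the first i chars of x
    # into the first j chars of y.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * del_cost          # delete everything
    for j in range(1, n + 1):
        d[0][j] = j * ins_cost          # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i-1][j] + del_cost,                       # delete x[i-1]
                d[i][j-1] + ins_cost,                       # insert y[j-1]
                d[i-1][j-1] + (0 if x[i-1] == y[j-1]
                               else sub_cost))              # substitute
    return d[m][n]

print(edit_distance("lkout", "layout"))  # prints 5
```

The all-delete-then-all-insert route costs (2 × 5) + (2 × 6) = 22, but the table automatically finds the cheaper substitute-plus-insert route at cost 5.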
An important advantage of using the edit distance algorithm for recognition of a misspelled response in CAI is that it can be customized for a particular application by varying the edit costs. Insertion and deletion costs can be assigned for specific characters, and substitution costs for specific letter pairs.
For example, native Japanese speakers learning English often confuse "b" and "v," so courseware aimed at this group might assign an especially low cost to substitutions between these characters. (For readers interested in a more thorough understanding of the basic algorithm, see Kruskal & Sankoff [1983] and Nesbit [1990].) This kind of edit markup scheme is for Roman alphabet answers. The method is language independent in the sense that it does not depend on grammar or semantics, but it may not be suitable for languages of Asia which use non-linear writing styles or non-Roman scripts, such as Thai, Burmese, Chinese, and Japanese.
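One way to realize this kind of customization (a sketch under our own naming, not CALIS's actual implementation) is to make the substitution cost a function of the letter pair:

```python
def edit_distance_custom(x: str, y: str, sub_cost, ins_del_cost: int = 2) -> int:
    """Edit distance where the substitution cost depends on the letter
    pair, so confusable pairs can be made cheaper than ordinary ones."""
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * ins_del_cost
    for j in range(1, n + 1):
        d[0][j] = j * ins_del_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i-1][j] + ins_del_cost,
                          d[i][j-1] + ins_del_cost,
                          d[i-1][j-1] + sub_cost(x[i-1], y[j-1]))
    return d[m][n]

def jp_sub_cost(a: str, b: str) -> int:
    """0 for a match, a low cost for the confusable b/v pair
    (relevant to native Japanese speakers), 3 otherwise."""
    if a == b:
        return 0
    return 1 if {a, b} == {"b", "v"} else 3

print(edit_distance_custom("bery", "very", jp_sub_cost))  # prints 1
```

With the ordinary cost of 3 the misspelling "bery" would be judged as far from "very" as any other single substitution; the pair-specific cost makes it the nearest candidate, as the text suggests.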
CALL software is designed to be used in educational rather than business settings. By contrast, "The Spell" in FrameMaker uses the first two of the schemes mentioned in the section above:
• No markup, and the student is only told whether or not the whole word is misspelled.
• No markup, but the student is shown the target response.
"The Spell" constructs a list of alternatives based on the amount of difference between the mistyped word and the possible intended words. If a misspelling like "worm" occurs, then "work" or "word" would be the most likely alternatives, differing in only one letter, and "wood" would be less likely, because it differs in two letters. Such methods work well when there is only one plausible alternative, but when the words are short, the number of satisfactory alternatives gets fairly large.
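The ranking described here can be sketched as ordering dictionary words by unit-cost edit distance. This captures the spirit of the method only; the word list is hypothetical and this is not FrameMaker's actual code.

```python
def unit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute each cost 1),
    computed with a rolling one-row table."""
    m, n = len(a), len(b)
    d = list(range(n + 1))
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            prev, d[j] = d[j], min(d[j] + 1,                  # delete
                                   d[j-1] + 1,                # insert
                                   prev + (a[i-1] != b[j-1])) # substitute
    return d[n]

def rank_alternatives(misspelling: str, wordlist: list[str]) -> list[str]:
    """Order candidate words by how little they differ from the input:
    the smaller the difference, the more likely the alternative."""
    return sorted(wordlist, key=lambda w: unit_distance(misspelling, w))

print(rank_alternatives("worm", ["wood", "word", "work"]))
# prints ['word', 'work', 'wood']
```

As the text notes, "work" and "word" (one letter away) come out ahead of "wood" (two letters away), and for short words such a list quickly grows too long to be useful feedback.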
When transpositions ("form" for "from," for instance) occur, the difference method automatically treats them as less likely, because they create a difference of two letters.
Mishit letters tend to be those near the target letter. From an educational point of view, this kind of feedback is not suitable for language learning students. In CALL we are interested in programs that teach the skill of error correction, whereas "The Spell" in FrameMaker is geared toward making business writing more efficient and error-free.
CONCLUSIONS
In most instructional situations students are required to enter text in order to answer a question; thus there is a need both for recognition of misspellings and for feedback about the nature of the error.
Answer markup based on edit distance is an approach to enhancing the user interface of an instructional system, such as the spelling checker in CALIS. As we have seen, several variations of the basic procedure are possible, and, as the discussion of Asian languages shows, no single version will be optimal for all instructional situations.