Aptitude testing. In the past, Ms. Wheeler had administered the Primary Measures of Music Audiation (Gordon, 1986), a test of developmental music aptitude. She stated that it yielded useful information, helping her to identify children who had high aptitude but were low achieving so that she could push those students to reach their potential. However, Ms. Wheeler stated that administering and scoring the test for 90 students in one grade level was too time-consuming to be justified by the one or two underperforming students she felt she might discover. She has offered to allow her student teachers to administer it for the experience and the data, but none of them have taken her up on the offer (DW Initial Interview, p. 14).

Performances. Ms. Wheeler considered group musical performances for an audience to be a form of assessment--a chance to show a completed product (DW Initial Interview, pp. 6-7).

However, Ms. Wheeler has moved away from formal performances for her younger grades, instead offering informances--chances for parents of children in grades 1 and 2 to come see a music class. Despite some misgivings, Ms. Wheeler continued to prepare her kindergarten students for a performance as part of a “family day” celebration that was a longstanding school tradition. Grades 3 and 4 staged “performance level” (DW Initial Interview, p. 6) programs with singing, Orff instruments, and movement, and fifth graders produced a musical (this year, it was Oliver!). Ms. Wheeler expressed concerns about the rehearsal time required to achieve “performance level”:

I’d rather not do the programs, because it is taking a break in the middle of what I’m trying to teach, basically… It wasn’t just teaching the song for the song’s sake, which I did within the curriculum and [while still] teaching [music] skills, but we brought it to a performance level and then performed it… so that spiraling [of curriculum] can’t continue, because you have to take that one part to a certain level… [now the students are] lacking some skills, so I’m having to go back (DW Initial Interview, pp. 6-7).

Although Ms. Wheeler considered performances to be a form of assessment, they did not result in records of individual musical skills or abilities, except perhaps the video-recording of solo singing or instrument playing, which was not collected for assessment purposes or evaluated in any way.

When music learning was assessed. Most assessments were embedded as a part of normal music instruction. During an activity, Ms. Wheeler would build in an opportunity for students to demonstrate some musical skill and record their participation or a score to rate their achievement. For example, one day in recorders, students composed eight-beat B sections to a song they were working on playing. Then, the whole class played the A sections, and individual students took turns performing their B sections (DW Journal 2/1, p. 1). Ms. Wheeler marked on a class list which students chose to play their B sections for the class, but she did not evaluate playing ability or the student’s composition itself. Another example was a game played a few times in kindergarten during which the children were “messengers” who delivered different colored hearts (letters) to each other as part of a song (e.g., DW Field Notes 2/5, p. 3). Then, Ms. Wheeler would sing, “Who has the purple heart?” and the child (or children) with purple would sing back, “I have the purple heart” as a way to practice for when Ms. Wheeler assessed their singing voice development at a future date. However, Ms. Wheeler did not record their participation or rate their singing achievement. Composition activities offered rich opportunities for informal assessment, as students experimented by playing their ideas on recorders (self-assessment), talked to one another about questions they had (peer coaching/assessment), and asked Ms. Wheeler for feedback. Ms. Wheeler also assessed the compositions formally using a checklist.

Although many assessments were embedded in instructional activities, this was not always the case. Some assessments were whole-class activities in and of themselves, such as self-assessments, the “Rocket Notes” note-reading quizzes, and other written assessments about music concepts or information. Rarely, students would be pulled aside for assessments, such as in kindergarten during centers time or in fourth grade when students went individually to another room to perform a playing test for a video camera, or when they played in duets and trios for Ms. Wheeler while everyone else practiced.

In seven weeks of observations, I saw repeated use of assessments. According to my field notes and corroborated by Ms. Wheeler’s journals, each class meeting featured multiple activities that offered the opportunity to assess music knowledge and skills. Many of these activities were whole-group, and Ms. Wheeler circulated around the classroom about once a week with a class list to mark children who had not yet achieved a targeted skill. The fourth grade class I observed did written work, such as a composition, self-evaluation, or work in their recorder notebooks, two to three times a month. At least one activity per class (and, particularly in kindergarten, usually more) would allow smaller groups or individuals to demonstrate what they knew and could do. These included times that children sang or played alone or in small groups, or that they worked individually on dry-erase boards, on instruments, or with manipulatives.

In Ms. Wheeler’s journals, there were consistent references to assessment activities that I had not identified as assessments when I coded my field notes. For example, there were a number of times that Ms. Wheeler checked the whole group, halves of the class, small groups, or even individuals on a particular musical skill or behavior but did not record what she saw. I did not code this activity (informally checking for participation and/or comprehension) as an assessment, because it did not result in any kind of descriptive information about an individual that could be used later to adapt instruction to individual differences. Another example of an activity that Ms. Wheeler called an assessment in her journal that I did not code as an assessment in my field notes was composing a song as a whole class and having individual students contribute portions (e.g., treble clef, time signature, rhythm or tonal patterns). While allowing individual students to contribute such information would offer a chance to check the class’s understanding, Ms. Wheeler did not record which student volunteered information or what information was contributed by whom. Therefore, I viewed this activity and others like it as examples of well-delivered whole-group instruction, rather than as assessments of individual skills, knowledge, and abilities.

In summary, some type of assessment activity was present in nearly every class I observed. More complex assessments like compositions, formal written assessments, recorder playing tests, and tests of singing voice development were undertaken less frequently, only once each in the seven-week course of this study. Self-assessments and portfolios were cumulative and presented to the students at the end of each trimester, and the observation period included times during which students worked on self-assessments and completed written work that was placed in their portfolios. Rating scales and/or checklists were completed once or twice a week regarding specific demonstrations of musical skills, although Ms. Wheeler often chose simply to mark who participated. Performances for audiences did not take place in the observation period, but Ms. Wheeler indicated that grades K, 3, 4, and 5 performed once a year, and the kindergarten class I observed was starting to prepare music for their performance.

Scoring Assessments and Tracking Results

Checklists and rating scales. Ms. Wheeler’s assessments typically were some form of checklist or rating scale. For many assessments, Ms. Wheeler simply checked “yes” or “no” on class lists to record if a child was participating or demonstrating a particular skill. Danielle designed her own rating scales. The following scale was used to evaluate kindergarten singing:

–  –  –

Kindergarten students also were rated on their abilities to make up a rhythm pattern in the context of a triple meter chant.

P+ for a pattern with correct solfege and meter
P for a pattern in triple meter on a neutral syllable or with incorrect solfege
P- for a response that was not in the rhythmic context (DW Journal 1/22, p. 1).

Ms. Wheeler designed checklists to evaluate summative assessments, such as the final recorder-playing test. Fourth grade students went into the hallway one at a time and played for a video camera. Ms. Wheeler took the video home, watched each example, and rated it with a yes/no checklist of the following:

–  –  –

The checklist also included a space for comments. Grades on the recorder unit were determined exclusively by this summative recorder-playing test and were based on whether the child successfully performed a song in the grade category they wanted. That is, if they wanted an “A” they had to play a more difficult song than if they were playing for a “B” (DW Journal 1/15, p. 2). A chart of which songs could be played for what grade was posted in the classroom for a few weeks prior to testing (DW Journal 3/1, p. 5).

Ms. Wheeler also used formal criterion-based assessment of written compositions in fourth grade. The students wrote a song for their recorders and Danielle evaluated it by using a yes/no checklist of the following:

–  –  –

Notes properly placed on the staff? (DW Journal 3/1, p. 4).

This checklist was on the board for the students as they were composing. Providing the checklist assisted students as they composed, but also resulted in this activity encompassing only the lower levels of thinking on Bloom’s taxonomy. Bloom’s taxonomy stratifies levels of thought, beginning with knowledge, comprehension, and application, and then progressing to analysis, synthesis, and evaluation. When students follow a checklist step-by-step, they are at most applying what they know to a prescribed task.

Observational assessments. Ms. Wheeler described one of her assessment methods as “observational notes” (DW Initial Interview, p. 16). For example, she would circulate during recorders, notice students who were not demonstrating a particular skill (e.g., wrong hand on top), and jot their names down. Fourth grade students also completed playing assessments in duets and trios. For this activity, Ms. Wheeler hung a large chart in the hall listing the songs students could choose to play. The chart was organized by difficulty level. Student duets or trios who played a certain song correctly (pass/fail) were allowed to sign their names under that song in the hall. Ms. Wheeler, her student teacher, and a guest teacher took advantage of these opportunities to give constructive feedback and individual assistance. Some children responded well to the idea of trying for higher levels of challenge, including one duet team who chose to play melody and improvised harmony based on chord tones (DW Think Aloud 2/15, pp. 15-16). These observational assessments were formative and interactional and did not result in any data other than the pass/fail list.

Written tests. Although Ms. Wheeler administered other written tests and stored them in portfolios for grades 1 through 5, the only examples I saw were the one-minute “Rocket Notes” note-reading tests. “Rocket Notes” were scored as a total number of correct responses out of the 40 possible responses. Each child selected a personal goal for the next test, which was administered once a week for six weeks. Tests on different days had the same content (notes on the treble staff) but the information was presented in a different order so that students were not just memorizing. Ms. Wheeler graphed responses to track if each student was improving, and this led to conversations with individual students who were not improving or who had consistently low scores.

Methods for eliciting response. Ms. Wheeler indicated that she spent time in kindergarten teaching behaviors that allowed her to assess musical skills. In the observation period, I observed the following methods of teaching children to respond: letter/heart messenger game (echo singing), scan across room (individual response, but very fast), boys respond then girls respond, responses by section of chairs, small group singing into microphones, small groups playing instruments, movement responses (including fluid movement, beat movement, and thumbs up/thumbs down), response cards (each child has their own card and points to pictures or holds up the card), and popsicle sticks laid on the floor representing rhythm notation. Although many of these methods were still used in fourth grade, written responses on paper and individual white boards were added, and the most prevalent mode of assessable response was individuals playing instruments. Ms. Wheeler stated that routine was crucial to the success of assessment activities, especially in kindergarten. For example, she attributed reduced participation in small-group singing to an interruption in routine (several snow days on music days) (DW Think Aloud 2/15, p.


One day in kindergarten, Ms. Wheeler used centers time to facilitate individual assessments of singing voice development (in the hall) and instrument skills (in the classroom) (DW Field Notes 2/12, pp. 3-4). Her journals did not mention the centers until the week after they took place, when I specifically asked her about them. She reported, “I only do them on a ‘party week’ [this day was the kindergarten Valentine’s Day class party]. So they play centers six times a year” (DW Journal 2/19, p. 1). In a conversation between classes, Ms. Wheeler told me that she planned centers for these “crazy days” because she felt the students would have difficulty with whole-group direct instruction (DW Field Notes 2/12, p. 3). She also stated, “Most of the time, I am just observing [the centers], and I like to do that, because it lets me know the personality of the child… There’s [also] that observational piece of when four people sit down at that drum, and all of them are playing the macro beat--Oh, Cool! And I’ll make comments to them. Or they’ll pull me up to see the pattern that they wrote on the board…” (DW Think Aloud 2/15, p.

