When an LSA required an improvised response, there was no teaching mode, and students’ responses could vary in their degree of correctness. In such cases, Ms. Stevens designed a different rating system. For example, in third grade, Hailey sang an improvised Major tonic or Major dominant pattern as a prompt (HS Field Notes 3/9, p. 1). The students decided whether the prompt was tonic or dominant and responded with a different pattern of the same variety as an answer. Hailey rated their responses as follows:

If a student was able to improvise a tonal pattern with correct solfege and pitches, I marked it with a “+”. If a student improvised a tonal pattern in tune and function but with incorrect solfege applied, I marked it with a “(+).” If a student improvised a pattern that used correct solfege (e.g., “DO-MI-DO” for a major tonic) but did not sing correct pitches (or didn’t use a singing voice), I marked it with a “(-)”. If a student gave a response that was not sung and did not use correct solfege, I marked it with a “-” (HS


Embedded assessments. Ms. Stevens used four-point rating scales to score embedded assessment activities. She designed her own scales to measure exactly the musical behavior she wanted to track:

HS: Let’s say… it’s first grade and we are improvising rhythm patterns, just with neutral syllables. If they can do it consistently in my tempo and meter, it’s a 4. A 3 would be mostly there, but maybe there’s a little bobble where they change the meter or something like that. A 2 would be they came up with something different [from my prompt], but not quite rhythmically… you know… all there. And then, a 1 would be not at all. Well... I kind of do that differently with that one, maybe it’s not a good example. A one would be… no rhythm at all. Usually for that I’ll make a note… if they just [echo my prompt], I’ll make a note of that, because they weren’t able to discriminate that what they were


KS: But doing the same as you might show a metric context, though.

HS: Right… but I’m assessing if they can create something different… So if it’s just echoing the rhythm patterns, then 4 would be they can do the rhythm consistently in my tempo and my meter. 3 would be mostly there, but maybe one mistake. 2 would be they did a pattern in my tempo and meter, but maybe they changed a beat or two and 1 would be totally not in tempo or meter (HS Initial Interview, p. 6).

As the nuances between the above “creating” and “echoing” scales demonstrate, designing rating scales explicitly for each activity allowed Ms. Stevens to track specific musical behaviors at particular performance levels. Moreover, her consistent use of a four-point scale meant that she was not reinventing the wheel with each new rating system. “I tend to stick with that. It’s easier for me to keep track of in my mind, when I’m having to write them all down quickly” (HS Initial Interview, p. 6).

Hailey used four-point rating scales at least once per class to track musical progress on a variety of musical tasks. After a child’s solo response, Ms. Stevens would simply record “their turn” as a numeral 1, 2, 3, or 4 in her grade book or Palm Pilot. These data were then transferred to the assessment spreadsheet on her computer. More samples of rating scales used during the observation period included the following:

Melodic improvisation over chord roots: [rating scale not recoverable]

Tracking individual responses this frequently and with this level of detail facilitated Ms. Stevens’ quest to know her students as musicians and people.

I find record-keeping of students’ achievement to be EXTREMELY helpful... If I didn’t keep records of assessments, I would have no tangible information on which to base my expectations of students, measure their progress, or gauge where we need to go next in the learning process. [For example] I was surprised that Hiroyuki was able to play the chord roots perfectly, based on his singing achievement, but it was not surprising based on his tonal aptitude score as indicated by IMMA. The other students who achieved at a level “4” did not surprise me, as they have shown high achievement in previous assessments. I was not surprised that Mario struggled, as he does with many skills in music (which is not surprising given the issues we talked about- new to the school, to the country, probably fairly new to English). I was surprised that Shanelle achieved at a level “1” because she typically does much better than that. I would be curious to see how she did with the activity on a future day, as we all have our “off” days! (HS Journal 2/23,


The quality and quantity of data Hailey amassed also allowed her to monitor the success of her teaching, tailor her instruction to meet students’ needs, and plan future lessons. Designing her own four-point rating scales meant not only that the scale was convenient to use, but also that it measured what she needed it to.

Necessity of individual response. “The most important factor in the ability to assess…. You have to hear [students] alone. If you don’t hear them alone, you don’t know what they can do” (HS Final Interview, p. 2). Although she used observation of the class as a whole and informal group assessments to guide her teaching, Hailey’s journal entries mention only those assessments based on individual responses. “I don’t feel I can accurately assess things if [students] are doing it together, because they could be imitating each other” (HS Initial Interview, p. 7). Ms. Stevens designed at least one embedded assessment activity and used LSAs every day as ways to elicit individual responses. “You can’t really individualize instruction if [students] don’t have opportunities to do things alone, and you have no idea what they CAN do, because you have never heard them alone…” (HS Think Aloud 2, p. 1). Individual response was integral to Ms. Stevens’ practice of assessment.

Challenges to assessment. Ms. Stevens faced considerable challenges as she worked to score and track students’ progress in music.

[Elementary general music teachers] see so many students, and often we don’t get the same amount of planning time in our school day as a classroom teacher. It’s really hard to get back and look at all the assessments that you’re doing. That’s my main challenge. So, I have three hundred to four hundred students in a week. When do I sit down and really examine that assessment data? That’s my main challenge (HS Initial Interview, p.


Due to these challenges, Hailey had to be thorough, accurate, and organized with her record-keeping. She talked about how her assessment practices required considerable multi-tasking:

You’ve gotta be able to have your eyes on the kids, make sure they are all behaving… You have to be able to keep your own teaching plans in your head so that you can keep rolling while you are monitoring [the students]. AND you’ve got to be able to keep track of what each child is doing [musically]. And you have to keep track, written or in your mind, [of] exactly how each student did. I think you have to have a huge ability to multitask… (HS Final Interview, p. 3).

During a think-aloud, we watched a clip of third-grade students singing improvised melodies over chord roots. While she was watching children sing, Ms. Stevens commented: “My memory is so bad… I remember, wow, Selina that day did something that was really cool. But remembering what it was, is gone. Seeing so many students, teaching so many classes, it’s like everything just kind of filters through” (HS Think Aloud 1, p. 3). On another occasion, Hailey facilitated whole-class songwriting as practice for a future small-group composition activity. Individual students suggested chunks of melody, and Hailey notated the song and provided a harmonic framework.

In the moment I made mental notes on who created what kinds of “chunks,” BUT now I cannot remember who created what! I remember being impressed that the second student created such a clear dominant pattern for the second measure, but I can’t remember who it was! This is why I like to take notes and/or document assessments... (HS Journal 3/23,


Hailey also needed to be self-motivated to track her students’ progress with music learning. Elementary music teachers in her district were philosophically divided regarding assessment (HS Initial Interview, pp. 2-3). “[P]eople like me… believe we can teach specific skills--we can break down these things that we can teach and assess. And other people that think [music] needs to just be a conceptual, holistic, experiential thing” (HS Initial Interview, p. 3).

Furthermore, there was little administrative oversight of elementary music grading practices:

[Y]ou could just make up the grades that go on the report card. You could be doing NO assessment, truly, whatsoever, of your students. It would be really easy… You could just say that everybody is grade level. In fact, I have heard that there are a couple of teachers in this district that do that. [Grade] everyone as proficient (HS Final Interview, p. 6).

To integrate assessment practices into her teaching, Ms. Stevens had to be self-motivated, keep detailed records, multi-task while teaching, and find the time to review the results of assessments so they could inform her instruction.

Summary of scoring and tracking the results of assessments. To score and track music learning, Ms. Stevens typically designed her own four-point rating scales so they would be easy to use and valid for her purposes. These rating scales were used to evaluate the embedded assessments that constituted the majority of the assessment activities in Hailey’s classroom. Ms. Stevens infrequently used written quizzes, which were scored as the number of correct answers out of the number of possible answers, and aptitude tests, which resulted in percentile rankings. Daily LSAs were scored by using tally marks or an adapted rating system that described the nuances possible in students’ responses. Data from aptitude tests, four-point rating scales, and quizzes were entered into a grading spreadsheet, and LSA progress was tracked in the LSA binder. Hailey believed that individual response was necessary for an assessment to be accurate. She faced challenges to her assessment practices, including a large number of students, limited contact time, lack of support from colleagues and administration, and the need to multi-task as she collected data.

Impact of Assessment on Differentiation of Instruction.

Ms. Stevens used the results of her assessments to track individual progress in music learning and to guide her instruction of each student.

I think it’s important to go back and study the results of the assessment to see who is achieving with that particular skill. And the kids who achieved it need to be pushed on to something that is going to keep them more challenged. The kids who didn’t quite achieve that skill obviously need some remediation, they need some re-teaching and reinforcement, maybe they need to backtrack… So I [use] assessments to then decide what each individual child needs from that point on, whether it’s to advance or to have more experiences with the content they hadn’t yet mastered (HS Initial Interview, p. 8).

Differentiation inextricably intertwined with assessment practices. The tapestry of Ms. Stevens’ music teaching included nearly omnipresent threads of assessment and differentiated instruction. To me as an observer, these threads were often so intertwined as to be somewhat indistinguishable. Hailey described her view of the role of assessment in differentiating instruction:

I think [assessment] forces you to hear individual students, to see where they are achieving, it forces you to keep track of [achievement] so you know where they all are, and hopefully [assessment] is informing the decisions that you are making as you are proceeding with what the kids need (HS Final Interview, p. 7).

Differentiated instruction as a natural consequence of assessment. Ms. Stevens’ assessments of student abilities resulted in differentiation of instruction. She differentiated her instruction both while teaching in the moment and as she planned new learning opportunities for the future. The metaphor of a tapestry again seems apt, as Hailey rarely differentiated on the basis of a single assessment experience, but seemed to maintain multiple assessment threads for each student: aptitude, singing voice development, and rhythmic and tonal achievement, to name a few. These threads were woven together in the moment and in planning, both for individual students and for whole classes.

Ms. Stevens’ journal entries and my field notes are replete with descriptions of instructional decisions made in the moment based on either past or present assessments to differentiate instruction. One day, the first grade students played a singing game that featured three phrases echoed by individual student singers (HS Field Notes 2/23, p. 4).¹² The echoed responses, sung on words as part of the song, offered different difficulty levels [1. Do re mi do, 2. Mi mi fa sol, 3. So la ti do sol].

¹² This is the activity described in the opening vignette of this dissertation document.

I was originally planning on letting the students choose [who sang next], but based on the wide range of singing abilities in this class, I decided to choose which student would sing which echo. This enabled me to give the students who had showed consistent, accurate use of singing voice the challenging phrase and those that hadn’t shown as much consistent, accurate use of singing voice one of the easier phrases to sing…

Ms. Stevens used past assessments of students’ singing abilities to determine their level of challenge in this activity. She also weighed personality factors when she decided on level of


