Curriculum-Based Measurement in Reading (CBM-R)

Passage Reading Fluency

Cost

The sections below cover: Technology, Human Resources, and Accommodations for Special Needs; Service and Support; Purpose and Other Implementation Information; and Usage and Reporting.

Cost for 1 school:
$25.00 for 30 probes per grade level, 30 word-count scoring sheets per grade level (grades 1-7), and manual.

$10.00 fee to make 1 copy of the materials.

There are no other costs and no continuing costs.

Testers will require 1-4 hours of training.

Paraprofessionals can administer the test.

Testing accommodations should be consistent with those specified on the student’s IEP for high-stakes testing and implemented consistently for every progress monitoring occasion across the school year.

Vanderbilt University
PMB # 228
110 Magnolia Circle, Suite 418
Nashville, TN 37203

Field-tested training manuals are available and provide all necessary implementation information.

For questions and to order CBM-R Passage Reading Fluency, contact:

Lynn Davies
Phone: 615-343-4782
Lynn.a.davies@vanderbilt.edu

CBM-R Passage Reading Fluency is a progress monitoring tool based on Curriculum Based Measurement (CBM).

Students are presented with a grade-level passage and have 1 minute to read it aloud. The score is the number of words read correctly.
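The scoring rule just described (words read correctly in 1 minute) can be sketched as follows; the function name and the marked-word input format are illustrative assumptions, not part of the published materials.

```python
def words_correct_per_minute(marked_words):
    """Score a 1-minute CBM-R probe.

    marked_words: list of (word, read_correctly) pairs for every word
    the student attempted within the 1-minute timing (an assumed input
    format for illustration).
    Returns the raw score: the number of words read correctly.
    """
    return sum(1 for _word, correct in marked_words if correct)

# Example: a student attempts five words and misreads one.
probe = [("the", True), ("dog", True), ("ran", False), ("fast", True), ("home", True)]
print(words_correct_per_minute(probe))  # 4
```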

The tool provides information on student performance in English.

Administration of the test takes 1–2.5 minutes per individual student, depending on the type of progress monitoring measure. Scoring takes an additional 2–5 minutes.

30 alternate forms are available per grade per reading measure.

The raw score is the number correct. Percentile scores and developmental benchmarks are also available.

This measure is recommended as an indicator of reading competence at grades 2-4.

 

Reliability of the Performance Level Score

Grades 1–7: Full bubble

 

Type of Reliability: Correlation between odd and even scores (internal consistency)

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 68 | 0.94 | 0.94
2 | 59 | 0.91 | 0.91
3 | 62 | 0.94 | 0.94
4 | 77 | 0.92 | 0.92
5 | 76 | 0.95 | 0.95
6 | 80 | 0.98 | 0.98
7 | 55 | 0.94 | 0.94

SEM: not reported.

 

Type of Reliability: Stability

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 65 | 0.93 | 0.93
2 | 59 | 0.94 | 0.94
3 | 62 | 0.93 | 0.93
4 | 60 | 0.98 | 0.98

SEM: not reported.

 

Information (including normative data) / Subjects: Internal consistency sample: 42% African American; 56% subsidized lunch; 6% learning disabilities. Stability sample: 43% African American; 42% subsidized lunch; 1% IEPs.

Reliability of the Slope

Grades 1–7: Full bubble

Type of Reliability: HLM

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 81 | 0.77 | 0.77
2 | 92 | 0.73 | 0.73
3 | 101 | 0.92 | 0.92
4 | 87 | 0.82 | 0.82
5 | 82 | 0.81 | 0.81
6 | 125 | 0.90 | 0.90
7 | 69 | 0.89 | 0.89

SEM: not reported.

Information (including normative data) / Subjects: 29% African American; 42% subsidized lunch; 5% learning disabilities. Weekly assessments over 6 months (14–21 assessments; mean = 18).

 

Validity of the Performance Level Score

Grades 1–7: Full bubble

 

Type of Validity: Concurrent validity
Test or Criterion: Stanford Achievement Test: Comprehension

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 89 | 0.59 | 0.59
2 | 121 | 0.63 | 0.63
3 | 125 | 0.88 | 0.88
4 | 88 | 0.91 | 0.91
5 | 69 | 0.92 | 0.92
6 | 81 | 0.82 | 0.82
7 | 100 | 0.88 | 0.88

Information (including normative data) / Subjects: 29% African American; 42% subsidized lunch; 5% learning disabilities.

Type of Validity: Concurrent validity
Test or Criterion: WRMT-Word Identification

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 89 | 0.86 | 0.86
2 | 121 | 0.81 | 0.81
3 | 125 | 0.94 | 0.94
4 | 88 | 0.89 | 0.89
5 | 69 | 0.78 | 0.78
6 | 81 | 0.88 | 0.88
7 | 100 | 0.90 | 0.90

Information (including normative data) / Subjects: See above.

 

 

Type of Validity: Predictive validity
Test or Criterion: Stanford Achievement Test: Comprehension

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 89 | 0.82 | 0.82
2 | 121 | 0.80 | 0.80
3 | 125 | 0.82 | 0.82
4 | 88 | 0.86 | 0.86
5 | 69 | 0.80 | 0.80
6 | 81 | 0.87 | 0.87
7 | 100 | 0.88 | 0.88

Information (including normative data) / Subjects: See above.

Type of Validity: Predictive validity
Test or Criterion: WRMT-Word Attack

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 77 | 0.71 | 0.71
2 | 79 | 0.77 | 0.77
3 | 73 | 0.81 | 0.81
4 | 75 | 0.82 | 0.82

Information (including normative data) / Subjects: 43% African American; 48% subsidized lunch; 1% IEPs.

Type of Validity: Predictive validity
Test or Criterion: WRMT-Word Identification

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 77 | 0.86 | 0.86
2 | 79 | 0.72 | 0.72
3 | 73 | 0.93 | 0.93
4 | 75 | 0.87 | 0.87

Information (including normative data) / Subjects: See above.

Type of Validity: Predictive validity
Test or Criterion: WRMT-Comprehension

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 77 | 0.79 | 0.79
2 | 79 | 0.74 | 0.74
3 | 73 | 0.84 | 0.84
4 | 75 | 0.82 | 0.82

Information (including normative data) / Subjects: See above.

Type of Validity: Predictive validity
Test or Criterion: WRMT-Total Reading

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 77 | 0.88 | 0.88
2 | 79 | 0.74 | 0.74
3 | 73 | 0.83 | 0.83
4 | 75 | 0.91 | 0.91

Information (including normative data) / Subjects: See above.

 

Predictive Validity of the Slope of Improvement

Grades 1–7: Full bubble

 

Type of Validity: Predictive validity
Test or Criterion: Iowa Test of Basic Skills, concurrent with end of progress monitoring

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 50 | 0.45 | 0.45
2 | 49 | 0.62 | 0.62
3 | 48 | 0.57 | 0.57
4 | 51 | 0.56 | 0.56
5 | 58 | 0.60 | 0.60
6 | 50 | 0.61 | 0.61
7 | 56 | 0.59 | 0.59

Information (including normative data) / Subjects: 56% African American; 62% subsidized lunch; 11% learning disabilities. Weekly assessments over 6 months (14–21 assessments; mean = 18).

Type of Validity: Predictive validity
Test or Criterion: Tennessee Comprehensive Assessment Profile, concurrent with end of progress monitoring

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 67 | 0.58 | 0.58
2 | 46 | 0.55 | 0.55
3 | 49 | 0.67 | 0.67
4 | 55 | 0.62 | 0.62
5 | 60 | 0.67 | 0.67
6 | 50 | 0.59 | 0.59
7 | 58 | 0.55 | 0.55

Information (including normative data) / Subjects: 48% African American; 54% subsidized lunch; 8% learning disabilities. Weekly assessments over 6 months (14–21 assessments; mean = 18).

 

Bias Analysis Conducted

Grades 1–7: No

Disaggregated Reliability and Validity Data

Grades 1–7: No

Alternate Forms

Grade 1: dash; Grades 2–4: Full bubble; Grades 5–7: dash

Provide evidence that alternate forms are of equal and controlled difficulty or, if IRT based, provide evidence of item or ability invariance (attach documentation of direct evidence).

At each grade level, each alternate form was designed to be of equivalent difficulty as follows.

Passages were written to ensure readability within 1 grade level of the designated grade level.

For passage reading fluency, alternate-form/test-retest reliability is 0.93 for a sample of students with disabilities and 0.94 for a sample of students without disabilities. At grade 2, the respective coefficients are 0.92 and 0.90; at grade 3, 0.97 and 0.95; and at grade 4, 0.91 and 0.89.

In addition, across published studies and available databases, Fuchs and colleagues, in their program of research, administered a larger set of 40 probes per grade level to more than 1,000 students every week. Using these databases, they eliminated, separately at each grade level, passages for which scores exceeded the student's standard error of estimate for more than 25% of students, deriving a final set of 30 passages at each grade level.
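The passage-winnowing rule described above can be sketched as follows; the data structures and function name are illustrative assumptions, and the exact screening procedure used by Fuchs and colleagues may differ in detail.

```python
def winnow_passages(scores, expected, keep=30, max_flag_rate=0.25):
    """Sketch of the passage-screening rule described above.

    scores: dict mapping passage_id -> {student_id: observed score}.
    expected: dict mapping student_id -> (predicted_score,
        standard_error_of_estimate) for that student.
    A passage is eliminated when the share of students whose score on
    it deviates from their predicted score by more than their standard
    error of estimate exceeds max_flag_rate; up to `keep` surviving
    passages are retained.
    """
    kept = []
    for pid, by_student in scores.items():
        flags = sum(
            1 for sid, s in by_student.items()
            if abs(s - expected[sid][0]) > expected[sid][1]
        )
        if flags / len(by_student) <= max_flag_rate:
            kept.append(pid)
    return kept[:keep]

# Illustrative data: passage p2 is far off both students' predictions.
scores = {"p1": {"A": 52, "B": 58}, "p2": {"A": 70, "B": 80}}
expected = {"A": (50, 5), "B": (60, 5)}
print(winnow_passages(scores, expected))  # ['p1']
```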

Type of Reliability: Correlation between odd and even scores (internal consistency)

Grade | n (range) | Coefficient Range | Coefficient Median
1 | 68 | 0.94 | 0.94
2 | 59 | 0.91 | 0.91
3 | 62 | 0.94 | 0.94
4 | 77 | 0.92 | 0.92
5 | 76 | 0.95 | 0.95
6 | 80 | 0.98 | 0.98
7 | 55 | 0.97 | 0.94

Information (including normative data) / Subjects: 42% African American; 56% subsidized lunch; 6% learning disabilities.

What is the number of alternate forms of equal and controlled difficulty? 30

Rates of Improvement Specified

Grades 1–7: Full bubble

1. Is minimum acceptable growth (slope of improvement or average weekly increase in score by grade level) specified in manual or published materials?

Yes (Note: These norms are based on academically representative samples).

a. Specify the growth standards:

Normative Passage Reading Fluency Data for RTI Decision Making

Grade | Sample Size | Risk: Level | Risk: 5–8 Week Slope | Response: Projected End-Year Benchmark | Response: Slope of Improvement
1 | 202 | <5 | <1.75 | 50 | 2.0
2 | 324 | <15 | <1.00 | 75 | 1.50
3 | 309 | <50 | <0.75 | 100 | 0.75
4 | 326 | <70 | <0.50 | 125 | 0.50
5 | 179 | <80 | <0.40 | 130 | 0.40
6 | 247 | <90 | <0.35 | 150 | 0.35
7 | 136 | <90 | <0.30 | 150 | 0.30
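The cutoffs in this table lend themselves to a simple lookup. The sketch below transcribes the risk cutoffs from the table; note that the published materials do not state here whether the level and slope criteria are applied jointly or separately, so this illustration (an assumption) flags a student when either falls below the cutoff.

```python
# Risk cutoffs transcribed from the normative table above:
# performance level and 5-8 week slope below which a student is
# designated at risk, by grade.
RISK_CUTOFFS = {
    1: (5, 1.75),
    2: (15, 1.00),
    3: (50, 0.75),
    4: (70, 0.50),
    5: (80, 0.40),
    6: (90, 0.35),
    7: (90, 0.30),
}

def at_risk(grade, level, weekly_slope):
    """Flag a student as at risk when either the performance level or
    the 5-8 week slope falls below the grade's cutoff (an assumed
    either/or rule for illustration)."""
    level_cut, slope_cut = RISK_CUTOFFS[grade]
    return level < level_cut or weekly_slope < slope_cut

# A 2nd grader reading 10 words correct with a slope of 1.2 words/week:
print(at_risk(2, 10, 1.2))  # True (level 10 is below the cutoff of 15)
```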

b. Basis for specifying minimum acceptable growth:

Norm-referenced

Normative profile:

Representation: National
Date: 1990-2000
Number of States: 6
Size: 1,723
Gender: 49% Male, 51% Female
SES: 36% Low, 43% Middle, 21% High
Race/Ethnicity: 39% White, 36% Black, 25% Unknown
ELL: 12%
Disability classification: 7%

End-of-Year Benchmarks

Grades 1–7: Full bubble

1. Are benchmarks for minimum acceptable end-of-year performance specified in your manual or published materials?

Yes (Note: These norms are based on academically representative samples).

a. Specify the end-of-year performance standards:

Normative Passage Reading Fluency Data for RTI Decision Making

Grade | Sample Size | Risk: Level | Risk: 5–8 Week Slope | Response: Projected End-Year Benchmark | Response: Slope of Improvement
1 | 202 | <5 | <1.75 | 50 | 2.0
2 | 324 | <15 | <1.00 | 75 | 1.50
3 | 309 | <50 | <0.75 | 100 | 0.75
4 | 326 | <70 | <0.50 | 125 | 0.50
5 | 179 | <80 | <0.40 | 130 | 0.40
6 | 247 | <90 | <0.35 | 150 | 0.35
7 | 136 | <90 | <0.30 | 150 | 0.30

b. Basis for specifying minimum acceptable end-of-year performance:

Norm-referenced

c. Specify the benchmarks:

Normative Passage Reading Fluency Data for RTI Decision Making

Grade | Sample Size | Risk: Level | Risk: 5–8 Week Slope | Response: Projected End-Year Benchmark | Response: Slope of Improvement
1 | 202 | <5 | <1.75 | 50 | 2.0
2 | 324 | <15 | <1.00 | 75 | 1.50
3 | 309 | <50 | <0.75 | 100 | 0.75
4 | 326 | <70 | <0.50 | 125 | 0.50
5 | 179 | <80 | <0.40 | 130 | 0.40
6 | 247 | <90 | <0.35 | 150 | 0.35
7 | 136 | <90 | <0.30 | 150 | 0.30

d. Basis for specifying these benchmarks?

Norm-referenced

Normative profile:

Representation: National
Date: 1990-2000
Number of States: 6
Size: 1,723
Gender: 49% Male, 51% Female
SES: 36% Low, 43% Middle, 21% High
Race/Ethnicity: 39% White, 36% Black, 25% Unknown
ELL: 12%
Disability classification: 7%

Sensitive to Student Improvement

Grade 1: dash; Grades 2–7: Full bubble

Describe evidence that the monitoring system produces data that are sensitive to student improvement (i.e., when student learning actually occurs, student performance on the monitoring tool increases on average):

Slopes on the progress-monitoring tool are significantly greater than zero; slopes differ significantly among students with learning disabilities and low-, average-, and high-achieving students; and slopes are steeper when effective practices (e.g., peer-assisted learning strategies) are in place. Slopes are also significantly greater than zero separately at grades 2, 3, and 4, as well as at grades 5, 6, and 7.

Please note that this evidence is direct, based on Curriculum-Based Measurement in Reading: Passage Reading Fluency. The pertinent references are: (1) Fuchs, L.S. (2003). Assessing treatment responsiveness: Conceptual and technical issues. Learning Disabilities Research and Practice, 18, 172-186; and (2) Fuchs, L.S., Fuchs, D., Hamlett, C.L., Walz, L., & Germann, G. (1993). Formative evaluation of academic progress: How much growth can we expect? School Psychology Review, 22, 27-48.

Also, please note that we have direct evidence from a randomized control trial that performance on these probes is sensitive to treatment effects (i.e., student learning), which are also revealed on other technically sound measures. RCT results hold at each grade separately: grades 2, 3, and 4, as well as grades 5, 6, and 7. A pertinent reference is: Fuchs, L.S., Fuchs, D., & Hamlett, C.L. (1989). Effects of instrumental use of curriculum-based measurement to enhance instructional programs. Remedial and Special Education, 10(2), 43-52.

Decision Rules for Changing Instruction

Grades 1–7: Full bubble

Specification of validated decision rules for when changes to instruction need to be made: When the trend line through the student’s most recent 7-10 data points is steeper than the goal line, the teacher increases the student’s year-end goal. When the trend line through the student’s most recent 7-10 data points is less steep than the goal line, the teacher introduces an instructional change.
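As a rough illustration, the trend-versus-goal comparison in these decision rules can be sketched with an ordinary least-squares trend line; the published materials specify the graphing procedure, and the function names and data format here are illustrative assumptions.

```python
def trend_slope(points):
    """Ordinary least-squares slope through (week, score) data points;
    a minimal stand-in for the trend line drawn through the student's
    most recent 7-10 graphed scores."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    num = sum((x - mx) * (y - my) for x, y in points)
    den = sum((x - mx) ** 2 for x, _ in points)
    return num / den

def decision(recent_points, goal_slope):
    """Apply the stated rule: a trend steeper than the goal line means
    raise the year-end goal; a flatter trend means introduce an
    instructional change; otherwise continue as planned."""
    slope = trend_slope(recent_points)
    if slope > goal_slope:
        return "raise goal"
    if slope < goal_slope:
        return "change instruction"
    return "continue"

# A student gaining 2 words/week against a goal line of 1.5 words/week:
print(decision([(1, 10), (2, 12), (3, 14)], 1.5))  # raise goal
```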

Evidentiary basis for these decision rules: Randomized control trials showing consequential validity, i.e., when these decision rules are used, teacher planning and student achievement improve.

The following RCTs relied on the specified decision rules:

Fuchs, L.S., Deno, S.L., & Mirkin, P.K. (1984). The effects of frequent curriculum-based measurement and evaluation on pedagogy, student achievement, and student awareness of learning. American Educational Research Journal, 21, 449-460.

Fuchs, L.S. (1988). Effects of computer-managed instruction on teachers’ implementation of systematic monitoring programs and student achievement. Journal of Educational Research, 81, 294-304.

Decision Rules for Increasing Goals

Grades 1–7: Full bubble

Specification of validated decision rules for when increases in goals need to be made: When the trend line through the student’s most recent 7-10 data points is steeper than the goal line, the teacher increases the student’s year-end goal. When the trend line through the student’s most recent 7-10 data points is less steep than the goal line, the teacher introduces an instructional change.

Evidentiary basis for these decision rules:

Randomized control trials showing consequential validity, i.e., when these decision rules are used, teacher planning and student achievement improve. 

The following RCTs relied on the specified decision rules:

Fuchs, L.S., Fuchs, D., & Deno, S.L. (1985). The importance of goal ambitiousness and goal mastery to student achievement.  Exceptional Children, 52, 63-71.

Fuchs, L.S. (1988). Effects of computer-managed instruction on teachers’ implementation of systematic monitoring programs and student achievement. Journal of Educational Research, 81, 294-304.

Improved Student Achievement

Grades 1–7: Full bubble

Description of evidence that teachers’ use of the tool results in improved student achievement based on an empirical study that provides this evidence.

Study: Fuchs, L.S., Deno, S.L., & Mirkin, P.K. (1984). The effects of frequent curriculum-based measurement and evaluation on pedagogy, student achievement, and student awareness of learning. American Educational Research Journal, 21, 449-460.

Sample:

Number of students in product/experimental condition: 18 teachers; 64 students

Number of students in control condition: 21 teachers; 77 students

Characteristics of students in sample and how they were selected for participation in study:  

Subjects were 39 New York City Public School special education teachers who volunteered to participate in a project in which they would receive inservice training. Teachers were based in seven buildings, with four to seven teachers per school. Within each school, teachers were assigned randomly to experimental and contrast groups, and each teacher then selected three or four pupils for this project.

In the experimental group, teachers (3 men, 15 women) had taught special education for an average of 3.79 years (SD = 2.85). The students' (51 boys, 13 girls) age-appropriate grade level averaged 5.79 (SD = 1.66); 49% were placed in programs for emotionally handicapped students, 32% in programs for brain-injured students, and 19% in resource programs.

Contrast teachers (2 men, 19 women) had taught handicapped children for an average of 3.59 years (SD = 2.72). The contrast students' (57 boys, 20 girls) age-appropriate grade level averaged 5.45 (SD = 1.65); 52% were placed in programs for emotionally handicapped students, 32% in resource programs, and 15% equally distributed across programs for brain-injured, physically handicapped, and educable mentally retarded children.

Statistical tests revealed that experimental and contrast groups were similar with respect to teachers' sex and experience as well as students' sex and grade level. However, there was a relation between treatment group and the distribution of children among program types, χ²(4) = 24.31, p < .001, with a much greater percentage of brain-injured children in the experimental group.

Design: Used random assignment

Unit of assignment: Teachers

Unit of analysis: Teachers

Duration of product implementation: November through May

Describe analysis: For student learning outcomes, a two-way multivariate analysis of covariance (MANCOVA) was conducted on the posttest reading variables, using teacher as the unit of analysis. ANCOVAs were used to follow up the significant MANCOVA. The ANCOVAs included teacher training as a blocking factor only to control for a known source of error. Given the absence of a statistically significant interaction between the measurement/evaluation and teacher-trainer factors in the multivariate analysis, further discussion of the teacher-trainer conditions would be extraneous to the purpose of the study.

Fidelity:

Description of when and how fidelity of treatment information was obtained: Teacher trainers met individually with each experimental teacher every week to inspect data collection documents, graphs, and teachers’ instructional plan sheets and to immediately correct any violation of the progress-monitoring study protocol.

Results on the fidelity of treatment implementation measure: Because teacher trainers reviewed data collection documents, graphs, and instructional plan sheets with each experimental teacher every week, any violation of the progress-monitoring study protocol was corrected within a week, and fidelity was maintained throughout implementation.

Measures:

External outcome measures used in the study, along with psychometric properties:

Measure Name

Reliability Statistics

Passage Reading Test

Internal consistency reliability=0.96

Stanford Diagnostic Reading Test-Structural Analysis

Internal consistency reliability=0.93-0.95

Stanford Diagnostic Reading Test-Reading Comprehension

Internal consistency reliability=0.96

Results:

Results of the study:

Effect sizes for each outcome measure:

Measure name

Effect size

Passage Reading Test

0.84

Stanford-Structural Analysis

0.87

Stanford-Reading Comprehension

1.05
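For context, effect sizes of this kind are standardized mean differences. A generic pooled-standard-deviation computation (Cohen's d; not necessarily the exact formula used in the study, which analyzed covariate-adjusted means) looks like:

```python
import statistics

def cohens_d(treatment, control):
    """Standardized mean difference (Cohen's d with pooled sample SD);
    a generic formula for illustration, not the study's computation."""
    n1, n2 = len(treatment), len(control)
    s1 = statistics.variance(treatment)  # sample variance (n-1 denominator)
    s2 = statistics.variance(control)
    pooled_sd = (((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd
```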

Summary of conclusions and explanation of conditions to which effects should be generalized: Data-based individualization improves teachers’ instruction and student learning, when teachers’ accuracy of implementation is supported.

Other related references or information

Fuchs, L.S., Fuchs, D., & Stecker, P.M. Effects of curriculum-based measurement on teachers’ instructional planning. Journal of Learning Disabilities, 22, 51-59.

Fuchs, L.S., Fuchs, D., Hamlett, C.L., & Ferguson, C. (1992). Effects of expert system consultation within curriculum-based measurement using a reading maze task. Exceptional Children, 58, 436-450.

Improved Teacher Planning

Grades 1–7: Full bubble

Description of evidence that teachers’ use of the tool results in improved planning based on an empirical study that provides this evidence.

Study: Fuchs, L.S., Deno, S.L., & Mirkin, P.K. (1984). The effects of frequent curriculum-based measurement and evaluation on pedagogy, student achievement, and student awareness of learning. American Educational Research Journal, 21, 449-460.

Sample:

Number of students in product/experimental condition: 18 teachers; 64 students

Number of students in control condition: 21 teachers; 77 students

Characteristics of students in sample and how they were selected for participation in study: Subjects were 39 New York City Public School special education teachers who volunteered to participate in a project in which they would receive inservice training. Teachers were based in seven buildings, with four to seven teachers per school. Within each school, teachers were assigned randomly to experimental and contrast groups, and each teacher then selected three or four pupils for this project.

In the experimental group, teachers (3 men, 15 women) had taught special education for an average of 3.79 years (SD = 2.85). The students' (51 boys, 13 girls) age-appropriate grade level averaged 5.79 (SD = 1.66); 49% were placed in programs for emotionally handicapped students, 32% in programs for brain-injured students, and 19% in resource programs.

Contrast teachers (2 men, 19 women) had taught handicapped children for an average of 3.59 years (SD = 2.72). The contrast students' (57 boys, 20 girls) age-appropriate grade level averaged 5.45 (SD = 1.65); 52% were placed in programs for emotionally handicapped students, 32% in resource programs, and 15% equally distributed across programs for brain-injured, physically handicapped, and educable mentally retarded children.

Statistical tests revealed that experimental and contrast groups were similar with respect to teachers' sex and experience as well as students' sex and grade level. However, there was a relation between treatment group and the distribution of children among program types, χ²(4) = 24.31, p < .001, with a much greater percentage of brain-injured children in the experimental group.

Design: Used random assignment

Unit of assignment: Teachers

Unit of analysis: Teachers

Duration of product implementation: November through May

Describe analysis: For teacher planning outcomes: On the Structure of Instruction Rating scale, a one-between, one-within (time) analysis of variance was used. On the teacher decision-making survey, data were analyzed via chi-square statistics.

Fidelity:

Description of when and how fidelity of treatment information was obtained: Teacher trainers met individually with each experimental teacher every week to inspect data collection documents, graphs, and teachers' instructional plan sheets and to immediately correct any violation of the progress-monitoring study protocol. Teacher trainers indexed fidelity using the Accuracy of Implementation Rating Scale (AIRS). Fidelity was 100%, because (as mentioned) all errors were corrected within a week, at the weekly meetings with the teacher trainers.

The AIRS is based on (a) observations of teachers collecting CBM data (i.e., accuracy of directions, accuracy of timing and marking of errors, accuracy of scoring, accuracy of graphing); (b) inspection of graphs to verify conformance with data-utilization rules (i.e., accuracy of the timing of teaching changes and goal changes); and (c) inspection of teacher planning sheets to confirm that instructional plans were changed in accord with the timing of the planning changes noted on graphs and represented major instructional changes. The data collected on each of the three dimensions are combined to form a 5-point accuracy rating (1 = low; 5 = high) for (a), (b), and (c) separately.

Teacher trainers relied on the AIRS to rate each teacher's performance on a different student each week for (a) (measurement/scoring) and on all participating students each week for (b) (data-utilization rules) and (c) (the timing and appropriateness of instructional changes). Teacher trainers did reliability checks on each other for (a) three times during the first 2 weeks of implementation, three times at the mid-point of implementation, and three times in the last 2 weeks of implementation, with 89% agreement. Research assistants independently recoded 25% of AIRS ratings in each of these timeframes for (b) and (c), with 94% agreement.

In the first month of implementation, the mean ratings for (a), (b), and (c) were 3.79, 4.02, and 3.85, respectively; at the mid-point of the study, the mean ratings were 4.92, 4.07, and 4.45; and in the last 2 weeks of implementation, the mean ratings were 4.97, 4.09, and 4.63.

Results on the fidelity of treatment implementation measure: See above.

Measures:

External outcome measures used in the study, along with psychometric properties:

Measure Name

Reliability Statistics

Structure of Instruction Rating Scale

Internal consistency reliability=0.88-0.89

Results:

Results of the study:

Effect sizes for each outcome measure:

Measure name

Effect size

Structure of Instruction Rating Scale

0.47

Summary of conclusions and explanation of conditions to which effects should be generalized: Data-based individualization improves teachers’ instruction and student learning, when teachers’ accuracy of implementation is supported.

Other related references or information:

Fuchs, L.S., Fuchs, D., & Stecker, P.M. Effects of curriculum-based measurement on teachers’ instructional planning. Journal of Learning Disabilities, 22, 51-59.