AIMSweb

Area: Math Concepts and Applications

 

Cost

M-CAP is included in a subscription to AIMSweb Pro Math or AIMSweb Pro Complete, which ranges from $4.00 to $6.00 per student per year.

Every AIMSweb subscription provides unlimited access to the AIMSweb online system, which includes:

  • AIMSweb assessments for universal screening and progress monitoring
  • Data management and reporting
  • Browser-based scoring
  • Training manuals
  • Administration and scoring manuals

Technology, Human Resources, and Accommodations for Special Needs

Internet access is required for full use of this product.

Testers will require 1-2 hours of training.

Paraprofessionals can administer the test.

Alternate forms are available in Spanish.

Service and Support

Pearson
19500 Bulverde Road
San Antonio, TX 78259
Phone: 866-313-6194
Visit AIMSweb.com

General information:
866-313-6194, option 2
sales@aimsweb.com

Tech support:
866-313-6194, option 1
aimswebsupport@pearson.com

Field-tested training manuals that provide administration, scoring, and implementation information are included with AIMSweb subscriptions.

Ongoing technical support is provided.

Professional development opportunities are available.

Purpose and Other Implementation Information

M-CAP is a brief (8-10 minute), standardized assessment of proficiency in math concepts and applications that can be administered to groups or to individual students. It uses an open-ended, fill-in-the-blank response format and consists of 33 alternate forms per grade for grades 2-8.

The mathematics domains assessed include: number sense, operations, patterns and relationships, data and probability, measurement, data and statistics, geometry, and algebra.

Usage and Reporting

Scores reported include the raw score (number of points earned), national percentiles (grades 2-8) and normative performance levels by grade and season, individual student growth percentiles by grade and season (based on rates of improvement, ROI), and success probability scores (cut scores that indicate a 50% or 80% probability of passing the state test). Local norms are also available.

Reports provide instructional links to enVisionMath, focusMATH, Prentice Hall Mathematics (grades 6-8), SuccessMaker Math, digits, and KeyMath-3 Diagnostic Assessment, as well as analysis of strengths and weaknesses by NCTM and Common Core domains.

 

Reliability of the Performance Level Score: Convincing Evidence

| Type of Reliability | Grade | n | Coefficient (median) | SEM |
|---|---|---|---|---|
| Alternate form | 2 | 1,064 | 0.86 | 4.6 |
| Alternate form | 3 | 965 | 0.81 | 4.5 |
| Alternate form | 4 | 1,026 | 0.80 | 5.1 |
| Alternate form | 5 | 867 | 0.84 | 4.3 |
| Alternate form | 6 | 858 | 0.86 | 4.4 |
| Alternate form | 7 | 912 | 0.88 | 4.2 |
| Alternate form | 8 | 858 | 0.86 | 4.0 |
| Inter-rater | 2 | 60 | 0.99 | |
| Inter-rater | 3 | 60 | 0.99 | |
| Inter-rater | 4 | 60 | 0.99 | |
| Inter-rater | 5 | 60 | 0.99 | |
| Inter-rater | 6 | 60 | 0.99 | |
| Inter-rater | 7 | 59 | 0.99 | |
| Inter-rater | 8 | 60 | 0.97 | |

Alternate-form coefficients are average inter-probe correlations in the standardization sample. Gender: 50% female, 50% male. Ethnicity: 9% African American, 2% American Indian, 3% Asian, 26% Hispanic, 60% White (non-Hispanic), 1% other. Household income: 50% low, 23% middle, 27% high (the national distribution is 33% per level). Region: 7% Northeast, 31% Midwest, 52% South, 11% West.

Inter-rater coefficients are based on cases pulled at random from the standardization sample. Sex: 50% male, 50% female. Ethnicity: 12% African American, 4% Asian, 25% Hispanic, 58% White, 2% other. Household income: 51% low, 27% middle, 21% high. Region: 10% Northeast, 28% Midwest, 52% South, 10% West.
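The alternate-form coefficient above is an average inter-probe correlation. A minimal Python sketch of that statistic is shown below; it assumes a complete students-by-probes score matrix, whereas the standardization study used matrix sampling, so the actual computation would have had to handle missing cells.

```python
import numpy as np

def average_interprobe_correlation(scores):
    """scores: 2-D array with one row per student and one column per probe.
    Returns the mean of all pairwise correlations between probe columns."""
    r = np.corrcoef(np.asarray(scores, float), rowvar=False)  # probe-by-probe matrix
    pairwise = r[np.triu_indices_from(r, k=1)]                # unique probe pairs
    return pairwise.mean()
```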

 

Reliability of the Slope: Convincing Evidence

| Type of Reliability | Grade | n | Coefficient (median) | SEM | Information / Subjects |
|---|---|---|---|---|---|
| Split-half (odd & even data points) | 2 | 6,632 | 0.78 | 0.15 | Average month span per student = 6.9 (range 3-11); average data points per student = 15.2 (range 10-52) |
| Split-half (odd & even data points) | 3 | 7,739 | 0.78 | 0.14 | Average month span per student = 7.0 (range 3-11); average data points per student = 15.5 (range 10-63) |
| Split-half (odd & even data points) | 4 | 7,553 | 0.79 | 0.16 | Average month span per student = 7.0 (range 3-11); average data points per student = 15.6 (range 10-66) |
| Split-half (odd & even data points) | 5 | 7,047 | 0.77 | 0.12 | Average month span per student = 7.1 (range 3-11); average data points per student = 15.3 (range 10-67) |
| Split-half (odd & even data points) | 6 | 4,212 | 0.76 | 0.13 | Average month span per student = 7.2 (range 3-11); average data points per student = 15.4 (range 10-48) |
| Split-half (odd & even data points) | 7 | 2,906 | 0.75 | 0.14 | Average month span per student = 7.2 (range 3-11); average data points per student = 15.3 (range 10-44) |
| Split-half (odd & even data points) | 8 | 3,083 | 0.78 | 0.13 | Average month span per student = 7.0 (range 3-11); average data points per student = 14.8 (range 10-43) |
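The split-half method can be sketched as follows: fit one trend line to each student's odd-numbered data points and another to the even-numbered points, then correlate the two slope estimates across students. This is a minimal illustration under that reading of the table, not AIMSweb's actual code; the source does not say whether a Spearman-Brown correction was applied, so none is shown.

```python
import numpy as np

def ols_slope(weeks, scores):
    """Ordinary least-squares slope of scores regressed on week numbers."""
    return np.polyfit(weeks, scores, 1)[0]

def split_half_slope_reliability(students):
    """students: iterable of (weeks, scores) pairs, one per student, each with
    at least four data points so both halves support a slope estimate."""
    odd, even = [], []
    for weeks, scores in students:
        w, y = np.asarray(weeks, float), np.asarray(scores, float)
        odd.append(ols_slope(w[0::2], y[0::2]))    # 1st, 3rd, 5th, ... points
        even.append(ols_slope(w[1::2], y[1::2]))   # 2nd, 4th, 6th, ... points
    return np.corrcoef(odd, even)[0, 1]
```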

Validity of the Performance Level Score: Convincing Evidence

| Type of Validity | Grade (season) | Test or Criterion | n | Coefficient (median) |
|---|---|---|---|---|
| Predictive | 3 (fall) | NCEGT | 553 | 0.63 (0.60) |
| Predictive | 4 (fall) | ISAT | 700 | 0.67 (0.60) |
| Predictive | 5 (fall) | ISAT | 752 | 0.57 (0.60) |
| Predictive | 6 (fall) | ISAT | 631 | 0.76 (0.78) |
| Predictive | 7 (fall) | ISAT | 723 | 0.61 (0.74) |
| Predictive | 8 (fall) | ISAT | 640 | 0.69 (0.74) |
| Predictive | 3 (winter) | NCEGT | 733 | 0.67 (0.64) |
| Predictive | 4 (winter) | ISAT | 699 | 0.60 (0.56) |
| Predictive | 5 (winter) | ISAT | 736 | 0.60 (0.63) |
| Predictive | 6 (winter) | ISAT | 855 | 0.75 (0.74) |
| Predictive | 7 (winter) | ISAT | 942 | 0.66 (0.76) |
| Predictive | 8 (winter) | ISAT | 783 | 0.71 (0.73) |
| Construct | 3 (spring) | NCEGT | 736 | 0.64 (0.64) |
| Construct | 4 (spring) | ISAT | 665 | 0.64 (0.58) |
| Construct | 5 (spring) | ISAT | 746 | 0.60 (0.65) |
| Construct | 6 (spring) | ISAT | 959 | 0.78 (0.78) |
| Construct | 7 (spring) | ISAT | 930 | 0.71 (0.80) |
| Construct | 8 (spring) | ISAT | 784 | 0.73 (0.76) |

NCEGT: North Carolina End-of-Grade Test. ISAT: Illinois Standards Achievement Test. See the tables below for descriptions of the fall and winter samples; the spring sample is largely the same.

 

PREDICTIVE VALIDITY FALL SAMPLE description
Date: 2009-2010. Number of states: 2. Region: Midwest, South. All values below are percentages; gender columns are male (M), female (F), and unknown (U); ethnicity columns are White (W), African American (AA), Hispanic (H), Asian/Pacific Islander (A/P), other (O), and unknown (Unk).

| Grade | Sample Size | M | F | U | W | AA | H | A/P | O | Unk |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 553 | 37 | 39 | 24 | 39 | 18 | 9 | 4 | 3 | 26 |
| 4 | 700 | 46 | 45 | 9 | 62 | 6 | 7 | 1 | 2 | 21 |
| 5 | 752 | 50 | 45 | 5 | 75 | 7 | 6 | 1 | 2 | 9 |
| 6 | 631 | 53 | 43 | 4 | 58 | 6 | 11 | 2 | 1 | 22 |
| 7 | 723 | 46 | 49 | 4 | 58 | 4 | 7 | 2 | 1 | 28 |
| 8 | 640 | 48 | 43 | 9 | 59 | 4 | 6 | 4 | 1 | 25 |

 

PREDICTIVE VALIDITY WINTER SAMPLE description
Date: 2009-2010. Number of states: 2. Region: Midwest, South. Column abbreviations as in the fall sample table; all values are percentages.

| Grade | Sample Size | M | F | U | W | AA | H | A/P | O | Unk |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 731 | 40 | 41 | 19 | 35 | 25 | 12 | 4 | 4 | 21 |
| 4 | 700 | 46 | 45 | 9 | 61 | 8 | 7 | 1 | 2 | 21 |
| 5 | 736 | 50 | 45 | 5 | 77 | 7 | 4 | 1 | 2 | 9 |
| 6 | 855 | 52 | 46 | 3 | 53 | 10 | 15 | 3 | 2 | 16 |
| 7 | 942 | 47 | 50 | 3 | 54 | 8 | 11 | 3 | 3 | 22 |
| 8 | 783 | 50 | 44 | 6 | 58 | 7 | 10 | 3 | 3 | 18 |

 

Content Validity:
The M-CAP content design was based on the National Council of Teachers of Mathematics (NCTM) standards, as well as on the principles set forth in the National Research Council (NRC) report Adding It Up, which indicates that the curricula for grades K-8 should comprise a number of domains, of which an understanding of number concepts and operations is deemed critical. The M-CAP domains (depending on the grade) are as follows: Number Sense; Operations; Patterns & Relationships; Measurement; Geometry; Data & Probability; Algebra; Probability; and Data & Statistics. The content coverage of the Stanford Achievement Test, Tenth Edition (Stanford 10) served as a general guideline for determining the proportion of items by learning domain at each grade level. The Stanford 10 was chosen because it is one of the most widely used norm-referenced assessments of mathematics achievement in the United States. See the table below for the breakdown of content by grade:

[Table omitted: breakdown of M-CAP content coverage by domain across grades 2-8.]

 

Predictive Validity of the Slope of Improvement: Unconvincing Evidence

| Type of Validity | Grade | Test or Criterion | n | Coefficient | Information / Subjects |
|---|---|---|---|---|---|
| Predictive | 3 | OAT (Ohio Achievement Test) Math | 58 | 0.42 | Partial correlation of slope with criterion, controlling for initial M-CAP level. Cases from the AIMSweb user database. Average months of progress monitoring: 5.6 (range 4-9); average PM scores per student: 13.2 (range 10-30). |
| Predictive | 4 | TAKS (Texas Assessment of Knowledge and Skills) Math | 40 | 0.68 | Partial correlation of slope with criterion, controlling for initial M-CAP level. Cases from the AIMSweb user database. Average months of progress monitoring: 5.5 (range 3-8); average PM scores per student: 15.4 (range 10-23). |
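A partial correlation of slope with a criterion, controlling for initial level, can be reproduced with the residualization method: regress both the slope and the criterion on the initial score, then correlate the residuals. The sketch below is a generic illustration with hypothetical array names, not the analysis code used for this table.

```python
import numpy as np

def partial_corr(x, y, control):
    """Correlation between x and y after removing the linear effect of
    the control variable from each."""
    x, y, control = (np.asarray(v, float) for v in (x, y, control))
    def residualize(v):
        coeffs = np.polyfit(control, v, 1)       # OLS of v on the control
        return v - np.polyval(coeffs, control)   # residuals
    return np.corrcoef(residualize(x), residualize(y))[0, 1]

# Hypothetical usage: slopes of improvement vs. state-test scores,
# controlling for each student's initial M-CAP level.
# r = partial_corr(slopes, state_test_scores, initial_levels)
```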

 

Disaggregated Reliability and Validity Data: Data Unavailable

Alternate Forms: Partially Convincing Evidence

1. Evidence that alternate forms are of equal and controlled difficulty or, if IRT based, evidence of item or ability invariance:

During field testing of M-CAP, several probes were administered to each student using a matrix sampling design. As shown in the table below, the probes within a grade have consistent means and standard deviations.

Means and Standard Deviations of Raw Scores on M-CAP Probes, by Grade
(Each row below is one probe; the seven column pairs give the M and SD for Grades 2, 3, 4, 5, 6, 7, and 8, in that order.)
21.2 9.7 16.1 7.2 22.8 10.5 14.4 7.2 18.1 8.9 16.7 9.3 13.6 6.6
21.3 10.8 16.3 7.1 22.9 10.4 14.5 8.9 18.2 9.1 17.6 13.9 14.1 8.3
21.4 9.0 16.8 6.9 22.9 9.5 14.6 7.7 18.5 9.0 18.0 13.4 14.2 7.9
21.4 10.2 17.2 7.0 23.1 9.2 14.7 7.9 18.7 8.7 18.5 10.0 14.5 7.3
21.5 9.9 17.4 7.4 23.1 9.5 14.8 6.8 18.7 10.3 18.6 10.3 14.6 7.6
21.5 10.6 17.9 7.4 23.6 10.0 14.9 7.4 19.1 9.6 18.7 11.0 14.9 8.3
21.6 10.5 18.5 8.8 24.2 9.2 15.0 8.9 19.2 9.5 19.0 9.9 15.0 7.1
21.7 9.5 18.5 7.7 24.3 10.3 15.1 8.7 19.3 9.5 19.2 10.3 15.0 7.1
21.8 11.0 18.6 7.2 24.5 8.5 15.2 8.1 19.3 10.5 19.4 9.5 15.4 7.4
21.9 9.3 18.6 7.0 24.7 8.7 15.2 7.4 19.9 10.2 19.8 11.6 15.4 8.8
22.1 10.0 18.8 7.6 25.3 9.7 15.2 9.2 20.0 9.4 19.9 11.7 15.6 8.2
22.2 9.9 19.0 9.2 25.5 7.9 15.4 7.4 20.2 10.6 19.9 11.3 15.7 8.8
22.2 9.3 19.8 8.9 25.6 9.2 15.5 7.3 20.8 10.1 20.4 11.0 16.0 8.3
22.4 9.0 19.9 7.1 25.6 9.8 15.5 8.0 21.1 10.0 20.6 9.2 16.3 9.1
22.8 11.7 20.0 8.3 25.8 9.1 15.7 8.7 21.2 9.2 20.6 9.6 16.3 8.3
22.9 11.1 20.7 7.7 26.3 10.4 15.8 8.4 21.3 10.1 21.3 7.6 16.4 8.6
22.9 9.6 20.7 8.4 26.5 8.8 15.9 8.3 21.7 10.3 21.3 6.6 16.4 8.9
22.9 10.1 21.2 8.2 26.5 10.6 16.0 7.1 22.0 9.5 21.7 8.7 16.7 9.4
23.8 9.6 21.2 7.8 27.0 9.0 16.4 9.1 22.0 10.2 21.8 8.6 16.9 9.3
23.9 10.2 21.3 9.1 27.2 8.1 16.7 9.4 22.1 9.9 21.8 7.5 17.7 9.1
24.0 9.8 21.4 9.9 27.2 10.2 16.8 9.1 22.3 9.0 22.2 7.8 18.7 10.3
24.4 9.7 21.4 9.3 27.3 10.5 17.0 9.9 22.4 10.3 22.8 7.2 18.8 8.7
24.4 10.5 21.5 9.5 27.6 10.3 17.7 9.7 22.5 9.1 22.9 10.5 18.8 9.6
24.9 10.2 21.5 9.0 28.0 8.0 17.8 9.9 22.6 9.1 23.0 10.0 19.0 9.8
25.2 10.8 21.7 9.3 28.5 9.5 18.0 10.6 22.8 9.5 23.0 10.8 19.1 9.5
25.4 10.1 21.9 9.0 28.9 9.0 18.1 9.8 22.8 9.8 23.4 9.8 19.2 8.4
25.6 9.8 22.1 9.2 29.0 8.7 18.1 10.2 23.1 9.8 23.5 10.4 19.3 8.9
25.8 11.2 22.1 8.5 29.6 9.2 18.6 9.9 23.3 9.3 24.0 10.5 19.5 9.5
26.0 10.1 22.2 8.6 29.6 9.5 18.6 11.0 24.2 8.8 24.2 10.2 20.0 10.4
26.4 10.3 22.5 9.3 30.4 8.7 19.3 11.0 25.5 9.3 25.4 11.0 20.2 9.9
Mean: 23.2 10.1 19.9 8.3 26.1 9.4 16.2 8.8 21.1 9.6 21.0 10.0 16.8 8.6
SD: 1.7 0.7 1.9 0.9 2.2 0.8 1.4 1.2 1.9 0.5 2.2 1.7 2.0 1.0

2. Number of alternate forms of equal and controlled difficulty:

30 per grade

Sensitive to Student Improvement: Convincing Evidence

1. Describe evidence that the monitoring system produces data that are sensitive to student improvement (i.e., when student learning actually occurs, student performance on the monitoring tool increases on average).

The sensitivity of the AIMSweb M-CAP monitoring system to student improvement was assessed by comparing students who received instructional intervention in mathematics (as indicated by the fact that they were progress-monitored with M-CAP) with students from the same school who did not receive interventions (i.e., were not progress-monitored with M-CAP). Improvement during the year was measured for all students by comparing scores on fall and spring administrations of the benchmark M-CAP assessments, which are identical in content coverage and administration procedure to the progress monitoring (PM) assessments but share no items with them. Sensitivity to improvement was evaluated by comparing the average M-CAP score gains of the students with and without intervention.

At each grade level, one school with a sufficient number of students being progress-monitored was selected. All data were obtained from the 2010-2011 school year.

An independent-samples t test was computed at each grade level to compare the improvement scores of students with and without progress monitoring. The results were statistically significant (p < .05) at each grade level. More detailed information about the samples and results is presented in Table 1 below.

To assess the possibility that group differences resulted from practice effects (as the PM group was administered from 10 to 31 probes), score gains within the PM group were regressed on the number of administrations, controlling for the span of time from initial to final administration. All but one coefficient were non-significant, indicating that practice effects were negligible.
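The group comparison described above is an ordinary independent-samples t test on fall-to-spring gain scores. A minimal sketch with made-up gains (the study data are not reproduced here):

```python
import numpy as np
from scipy import stats

# Hypothetical fall-to-spring M-CAP gains, for illustration only.
gains_pm = np.array([0.5, 0.7, 0.4, 0.6, 0.3])     # progress-monitored group
gains_no_pm = np.array([0.3, 0.4, 0.2, 0.5, 0.1])  # comparison group

t, p = stats.ttest_ind(gains_pm, gains_no_pm)      # independent-samples t test
print(f"t = {t:.2f}, p = {p:.3f}")
```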

To address the possibility that group differences resulted from differences in initial level of performance (as baseline performance is typically used to determine who needs PM), ROI by initial score level from a nationally representative sample is presented. Table 2 below shows ROIs from the nationally representative sample by score level (defined as percentile ranges centered on the 10th, 25th, 50th, 75th, and 90th percentiles). ROIs are about the same across score levels in each grade, indicating that ROIs are not expected to be higher solely on the basis of initial performance. Thus, it seems reasonable to rule out initial performance as an explanation for the PM group gains. As further evidence that the PM group experienced elevated growth rates, the ROI for the PM group in each grade is greater than the ROI at every score level in the nationally representative sample, whereas the no-PM group ROIs are about the same as those in the national sample.

Because the PM group received instructional intervention, additional instruction is expected to lead to more learning, and M-CAP score gains were significantly greater in the PM group than in the no-PM group, it is reasonable to conclude that the AIMSweb M-CAP assessment measures are sensitive to student improvement.

Table 1. Means and SDs of average improvement by group, independent-samples t-test results, and p values for the effect of the number of PM administrations.

| Grade | Total students | Students with PM | Month span: Avg | Min | Max | PM data points: Avg | Min | Max | Improvement, no PM: M | SD | Improvement, PM: M | SD | t | p | p (partial correlation with # of PM administrations) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 221 | 31 | 5.4 | 3 | 10 | 15.0 | 10 | 31 | 0.38 | 0.20 | 0.50 | 0.23 | 3.08 | <.01 | 0.44 |
| 3 | 176 | 26 | 8.0 | 6 | 9 | 15.1 | 10 | 18 | 0.25 | 0.15 | 0.35 | 0.14 | 3.19 | <.01 | <.01 |
| 4 | 283 | 32 | 8.6 | 7 | 9 | 18.7 | 10 | 30 | 0.13 | 0.19 | 0.26 | 0.14 | 5.18 | <.01 | 0.79 |
| 5 | 416 | 23 | 4.5 | 3 | 6 | 12.1 | 10 | 18 | 0.09 | 0.14 | 0.16 | 0.13 | 2.24 | <.05 | 0.13 |
| 6 | 314 | 38 | 6.9 | 3 | 8 | 15.2 | 10 | 27 | 0.20 | 0.18 | 0.32 | 0.26 | 2.82 | <.01 | 0.30 |
| 7 | 327 | 58 | 7.0 | 6 | 7 | 10.1 | 10 | 12 | 0.28 | 0.18 | 0.38 | 0.16 | 4.11 | <.01 | 0.46 |
| 8 | 266 | 51 | 8.1 | 6 | 9 | 13.2 | 10 | 18 | 0.11 | 0.14 | 0.16 | 0.17 | 2.15 | 0.03 | 0.35 |

Table 2. Median ROI by initial score level: M-CAP

| Fall Percentile | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
|---|---|---|---|---|---|---|---|
| 1-17 | 0.28 | 0.22 | 0.19 | 0.08 | 0.14 | 0.19 | 0.11 |
| 18-32 | 0.33 | 0.22 | 0.17 | 0.08 | 0.17 | 0.19 | 0.11 |
| 33-62 | 0.39 | 0.22 | 0.17 | 0.08 | 0.17 | 0.22 | 0.11 |
| 63-83 | 0.47 | 0.22 | 0.14 | 0.08 | 0.17 | 0.25 | 0.11 |
| 84-99 | 0.42 | 0.25 | 0.14 | 0.06 | 0.19 | 0.22 | 0.08 |

 

End-of-Year Benchmarks: Convincing Evidence

1. Are benchmarks for minimum acceptable end-of-year performance specified in your manual or published materials?

Yes.

a. Specify the end-of-year performance standards:

Varies by grade (near the 15th percentile on national norms).

b. Basis for specifying minimum acceptable end-of-year performance:

Criterion-referenced

c. Specify the benchmarks:

Varies by grade (near the 45th percentile on national norms). M-CAP benchmarks were established through empirical research on the relationship of scores to success on state mathematics tests, using data from 20 states (see the State Prediction User’s Guide (2011) for a full description of this research). The cut score for minimum acceptable performance is the score that indicates a 50% probability of passing the typical state mathematics test.
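The published description does not give the statistical details of how the cut scores were derived. One standard way to obtain a score associated with a 50% probability of passing is logistic regression, sketched below with hypothetical data; this illustrates the general technique, not Pearson's documented procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: each row is one student's M-CAP score, paired with
# a 1/0 indicator of passing the state mathematics test.
mcap = np.array([[5], [8], [12], [15], [18], [20], [24], [28], [33], [38]])
passed = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])

model = LogisticRegression().fit(mcap, passed)

# P(pass) = 0.5 where the linear predictor is zero:
# intercept + coef * score = 0  =>  score = -intercept / coef
cut_50 = -model.intercept_[0] / model.coef_[0, 0]
print(f"Score with an estimated 50% probability of passing: {cut_50:.1f}")
```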

d. Basis for specifying these benchmarks:

Norm-referenced/Criterion-referenced

Normative profile:

Representation: National
Date: 2009-2010
Number of States: approximately 40
Size: 53,927
Gender: 51% Male, 49% Female
SES: 43% free/reduced lunch
Race/Ethnicity: 59% White, 19% Black, 17% Hispanic, 4% Asian/Pacific Islander, 2% Other

Rates of Improvement Specified: Convincing Evidence

1. Is minimum acceptable growth (slope of improvement or average weekly increase in score by grade level) specified in manual or published materials?

No, AIMSweb specifies the median value of growth by grade and score level based on the national norm sample. Users determine what growth rate is required on an individual basis.

a. Specify the growth standards:

N/A

b. Basis for specifying minimum acceptable growth:

Other

2. Normative profile:

Representation: National
Date: 2009-2010
Number of States: approximately 40
Size: 53,927
Gender: 51% Male, 49% Female
SES: 43% free/reduced lunch
Race/Ethnicity: 59% White, 19% Black, 17% Hispanic, 4% Asian/Pacific Islander, 2% Other

3. Procedure for specifying criterion for adequate growth:

To get the most value from progress monitoring, AIMSweb recommends the following: (1) establish a time frame, (2) determine the level of performance expected, and (3) determine the criterion for success. Typical time frames include the duration of the intervention or the end of the school year. An annual time frame is typically used when IEP goals are written for students who are receiving special education. An AIMSweb goal can be written as: "In 34 weeks (1 academic year), the student will write correct answers to concept and application problems, earning 40 points on grade 5 M-CAP probes."

The criterion for success may be set according to standards, local norms, national norms, or a normative rate of improvement (ROI). The team may want to compare a student's performance to district/local norms; that is, to compare the student's scores to those of his or her peers in the context of daily learning. The last approach uses a normative ROI: with the formula Initial Score + (Expected ROI x Number of Weeks), an average rate of weekly improvement obtained from a normative database is multiplied by the number of weeks in the time frame and added to the initial score to determine the criterion for success. For detailed information and direction for setting goals, see Progress Monitoring Strategies for Writing Individual Goals in General Curriculum and More Frequent Formative Evaluation (Shinn, 2002b).
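As a worked example of this formula (with hypothetical numbers), a student starting at 12 points whose normative expected ROI is 0.8 points per week, monitored over a 34-week annual time frame, would have a criterion of about 39-40 points:

```python
def criterion_for_success(initial_score, expected_roi, weeks):
    """Initial Score + (Expected ROI x Number of Weeks)."""
    return initial_score + expected_roi * weeks

# Hypothetical values: initial score 12, expected ROI 0.8 points/week, 34 weeks.
print(criterion_for_success(12, 0.8, 34))  # 39.2
```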

Decision Rules for Changing Instruction: Convincing Evidence

Specification of validated decision rules for when changes to instruction need to be made:

The newest version of the AIMSweb online system, to be released for piloting in the fall of 2012 and made available to all users no later than the fall of 2013, applies a statistical procedure to the student's monitoring scores in order to provide empirically based guidance about whether the student is likely to meet, fall short of, or exceed the goal. The calculation procedure (presented below) is fully described in the AIMSweb Progress Monitoring Guide (Pearson, 2012) and can be implemented immediately by AIMSweb users who create a spreadsheet or simple software program. Once the new AIMSweb online system is fully distributed, the user will not have to do any calculations to obtain this data-based guidance.

The decision rule is based on a 75% confidence interval for the student's predicted score at the goal date. This confidence interval is student-specific and takes into account the number and variability of monitoring scores and the duration of monitoring. Starting at the sixth week of monitoring, when there are at least four monitoring scores, the AIMSweb report following each monitoring administration includes one of the following statements:

  • Statement A: "The student is projected to not reach the goal." This statement appears if the confidence interval is completely below the goal score.
  • Statement B: "The student is projected to exceed the goal." This statement appears if the confidence interval is completely above the goal score.
  • Statement C: "The student is on track to reach the goal. The projected score at the goal date is between X and Y" (where X and Y are the bottom and top of the confidence interval). This statement appears if the confidence interval includes the goal score.

If Statement A appears, the user has a sound basis for deciding that the current intervention is not sufficient and a change to instruction should be made. If Statement B appears, there is an empirical basis for deciding that the goal is not sufficiently challenging and should be increased. If Statement C appears, the student's progress is not clearly different from the aimline, so there is not a compelling reason to change the intervention or the goal; however, the presentation of the confidence-interval range enables the user to see whether the goal is near the upper or lower limit of the range, which would signal that the student's progress is trending below or above the goal.

A 75% confidence interval was chosen for this application because it balances the costs of the two types of decision errors. Incorrectly deciding that the goal will not be reached (when in truth it will be reached) has a moderate cost: an intervention that is working will be replaced by a different intervention. Incorrectly deciding that the goal may be reached (when in truth it will not be reached) also has a moderate cost: an ineffective intervention will be continued rather than replaced. Because both kinds of decision errors have costs, it is appropriate to use a modest confidence level.
 
Calculation of the 75% confidence interval for the score at the goal date:

1. Calculate the trend line. This is the ordinary least-squares regression line through the student's monitoring scores.
2. Calculate the projected score at the goal date. This is the value of the trend line at the goal date.
3. Calculate the standard error of estimate (SEE) of the projected score at the goal date, using the following formula:

   SEE = sqrt[ (sum[(y - y')^2] / (k - 2)) x (1 + 1/k + (GW - mean(w))^2 / sum[(w - mean(w))^2]) ]

   where k = number of completed monitoring administrations, w = week number of a completed administration, GW = week number of the goal date, y = monitoring score, and y' = predicted monitoring score at that week (from the student's trend line). The means and sums are calculated across all of the completed monitoring administrations up to that date.
4. Add and subtract 1.25 times the SEE to the projected score, and round to the nearest whole numbers.
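Because the text notes that users can implement this procedure themselves in a spreadsheet or simple program, here is a minimal Python sketch of the four steps as described above (function and variable names are my own, not AIMSweb's):

```python
import numpy as np

def goal_projection(weeks, scores, goal_week, goal_score):
    """Apply the four-step procedure: OLS trend line, projected score at the
    goal date, SEE of that projection, and a 75% CI of +/- 1.25 * SEE.
    Requires at least four scores (k - 2 appears in the denominator)."""
    w = np.asarray(weeks, float)
    y = np.asarray(scores, float)
    k = len(w)                                    # completed administrations
    slope, intercept = np.polyfit(w, y, 1)        # step 1: OLS trend line
    projected = slope * goal_week + intercept     # step 2: value at goal date
    resid = y - (slope * w + intercept)
    see = np.sqrt((resid @ resid / (k - 2)) *     # step 3: SEE formula above
                  (1 + 1/k + (goal_week - w.mean())**2 / ((w - w.mean())**2).sum()))
    lo = round(projected - 1.25 * see)            # step 4: 75% CI, rounded
    hi = round(projected + 1.25 * see)
    if hi < goal_score:
        return f"Projected to not reach the goal (interval {lo}-{hi})."
    if lo > goal_score:
        return f"Projected to exceed the goal (interval {lo}-{hi})."
    return f"On track: projected score at the goal date is between {lo} and {hi}."

# Hypothetical example: six weekly scores, goal of 40 points at week 34.
print(goal_projection([1, 2, 3, 4, 5, 6], [10, 12, 11, 14, 15, 16], 34, 40))
```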
 
Evidentiary basis for these decision rules: The decision rules are statistically rather than empirically based. The guidance statements that result from applying the 75% confidence interval to the projected score are correct probabilistic statements under certain assumptions:

  • The student's progress can be described by a linear trend line. If the pattern of the student's monitoring scores is obviously curvilinear, then the projected score based on a linear trend will likely be misleading. We provide training in the AIMSweb Progress Monitoring Guide about the need for users to take non-linearity into account when interpreting progress-monitoring data.
  • The student will continue to progress at the same rate as they have been progressing to that time. This is an unavoidable assumption for a decision system based on extrapolating from past growth.

Even though the rules are not derived from data, it is useful to observe how they work in a sample of real data. For this purpose, we selected random samples of students in the AIMSweb 2010-2011 database who were progress-monitored on either Reading Curriculum-Based Measurement (R-CBM) or Math Computation (M-COMP). All students scored below the 25th percentile in the fall screening administration of R-CBM or M-COMP. The R-CBM sample consisted of 1,000 students (200 each at grades 2 through 6) who had at least 30 monitoring scores, and the M-COMP sample included 500 students (100 per grade) with a minimum of 28 monitoring scores. This analysis was only a rough approximation, because we did not know each student's actual goal or whether the intervention or goal was changed during the year.

To perform the analyses, we first set an estimated goal for each student by using the ROI at the 85th percentile of AIMSweb national ROI norms to project their score at their 30th monitoring administration. Next, we defined "meeting the goal" as having a mean score on the last three administrations (e.g., the 28th through 30th administrations of R-CBM) that was at or above the goal score. At each monitoring administration for each student, we computed the projected score at the goal date and the 75% confidence interval for that score, and recorded which of the three decision statements was generated (projected not to meet goal, projected to exceed goal, or on track/no change).
 
In this analysis, accuracy of guidance to change (that is, accuracy of projections that the student will not reach the goal or will exceed the goal) reached a high level (80%) by about the 13th to 15th monitoring administration, on average. The percentage of students receiving guidance to not change (i.e., their trendline was not far from the aimline) would naturally tend to decrease over administrations as the size of the confidence interval decreased. At the same time, however, there was a tendency for the trendline to become closer to the aimline over time as it became more accurately estimated, and this worked to increase the percentage of students receiving the “no change” guidance. 

Decision Rules for Increasing Goals: Convincing Evidence

Specification of validated decision rules for when increases in goals need to be made: The statistical procedure and decision rules are the same as those described under Decision Rules for Changing Instruction above. In particular, if the report states that the student is projected to exceed the goal (Statement B: the 75% confidence interval for the projected score at the goal date lies completely above the goal score), there is an empirical basis for deciding that the goal is not sufficiently challenging and should be increased.

Evidentiary basis for these decision rules: See the evidentiary basis presented under Decision Rules for Changing Instruction; the same assumptions and analysis apply.

Improved Student Achievement: Data Unavailable

Improved Teacher Planning: Data Unavailable