aimswebPlus Math
Number Comparison Fluency–Pairs
Cost
aimswebPlus™ is a subscription-based tool. Three subscription types are available: aimswebPlus Complete is $8.50 per student and includes all measures. aimswebPlus Reading is $6.50 per student and includes early literacy and reading measures. aimswebPlus Math is $6.50 per student and includes early numeracy and math measures.
Technology, Human Resources, and Accommodations for Special Needs
Test accommodations that are documented in a student’s Individual Education Plan (IEP) are permitted with aimswebPlus. However, not all measures allow for accommodations. Number Comparison Fluency–Pairs (NCF–P) is an individually administered, timed test that employs strict time limits, in part, to generate rate-based scores. As such, valid interpretation of national norms, which are an essential aspect of decision-making during benchmark testing, depends on strict adherence to the standard administration procedures. The following accommodations are allowed for Number Comparison Fluency–Pairs during screening and progress monitoring: enlarging test forms and modifying the environment (e.g., special lighting, adaptive furniture).
Service and Support
NCS Pearson, Inc. Training manuals are included and should provide all implementation information. Pearson provides ongoing technical support by phone and email, as well as a user group forum where questions can be asked and answered.
Purpose and Other Implementation Information
aimswebPlus is a brief and valid assessment system for monitoring reading and math skills. Normative data were collected in 2013–14 on a combination of fluency measures that are sensitive to growth as well as new standards-based assessments of classroom skills. The resulting scores and reports inform instruction and help improve student performance in Grades 2 through 8, while the Early Literacy and Early Numeracy measures provide ecologically valid and developmentally appropriate information about foundational reading and math abilities for students in Kindergarten and Grade 1.
The student sees rows of number pairs on each test page. Starting with the first row, the student points to and names the larger number in each pair. Each NCF–P form contains 50 items, presented in 5 rows of number pairs per page. Twenty unique progress monitoring (PM) forms are available; PM testing is conducted at teacher-determined intervals.
Usage and Reporting
While the Kindergarten and Grade 1 measures are administered individually, most of the Grades 2 through 8 measures can be taken online by entire classes. Once testing is complete, summary or detailed reports for students, classrooms, and districts can be generated immediately, and the math and reading composite scores can be used to estimate the risk that students or classes will not meet end-of-year goals. aimswebPlus reports also offer score interpretation information based on foundational skills for college and career readiness, learning standards, and other guidelines; Lexile® and Quantile® information; and recommendations for appropriate teaching resources.
Raw scores and percentile scores (based on grade norms) are provided. Local norms are also available. NCF–P is a timed measure that assesses fluency of foundational math skills. Performance is reported as the raw number-correct score.
Reliability of the Performance Level Score
Grade  1 

Rating 
Reliability Coefficients for Number Comparison Fluency–Pairs, Grade 1
| Type of Reliability | Grade | n (range) | Coefficient Range | Coefficient Median | SEM |
| --- | --- | --- | --- | --- | --- |
| Alternate form | 1 | 206–223 | 0.86–0.89 | 0.88 | 2.36 |
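The reported SEM is consistent with the classical test theory formula SEM = SD × sqrt(1 − r). The sketch below is only a plausibility check: the source does not state which standard deviation was used, so the value 6.8 (the mean form SD reported in the alternate-forms table later in this document) is an assumption, and the function name is illustrative.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Assumed inputs: SD = 6.8 (mean form SD), r = 0.88 (median alternate-form coefficient).
sem = standard_error_of_measurement(sd=6.8, reliability=0.88)
print(round(sem, 2))  # 2.36
```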
Reliability of the Slope
Grade  1 

Rating 
Validity of the Performance Level Score
Grade  1 

Rating 
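aimswebPlus Math NCF–P Score Predictive Validity Coefficient, by Grade and Criterion Measure (TCAP = Tennessee Comprehensive Assessment Program; CA = aimswebPlus Concepts & Applications described in GOM 4)
| Criterion | Grade | N | Correlation: UnAdj | Correlation: Adj^1 | Gender %: F | Gender %: M | Race %: B | Race %: H | Race %: O | Race %: W | ELL %: Yes | % Free/Reduced Lunch: 68–100 | % Free/Reduced Lunch: 34–67 | % Free/Reduced Lunch: 0–33 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TCAP | 1 (Fall) | 55 | 0.56 | 0.72 | 53 | 47 | 2 | 25 | 0 | 73 | 24 | 0 | 100 | 0 |
| CA | 1 (Fall) | 801 | 0.56 | 0.56 | 50 | 50 | 13 | 25 | 10 | 51 | 9 | 36 | 33 | 32 |
^1 Correlation adjusted for range restriction.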
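aimswebPlus Math NCF–P Score Concurrent Validity Coefficient, by Grade and Criterion Measure (TCAP = Tennessee Comprehensive Assessment Program; CA = aimswebPlus Concepts & Applications described in GOM 4)
| Criterion | Grade | N | Correlation: UnAdj | Correlation: Adj^1 | Gender %: F | Gender %: M | Race %: B | Race %: H | Race %: O | Race %: W | ELL %: Yes | % Free/Reduced Lunch: 68–100 | % Free/Reduced Lunch: 34–67 | % Free/Reduced Lunch: 0–33 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TCAP | 1 (Spring) | 55 | 0.60 | 0.58 | 53 | 47 | 2 | 25 | 0 | 73 | 24 | 0 | 100 | 0 |
| CA | 1 (Spring) | 801 | 0.56 | 0.56 | 50 | 50 | 13 | 25 | 10 | 51 | 9 | 36 | 33 | 32 |
^1 Correlation adjusted for range restriction.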
Predictive Validity of the Slope of Improvement
Grade  1 

Rating 
The predictive validity of the Number Comparison Fluency–Pairs (NCF–P) slope was assessed using the correlation of the annual NCF–P rate of improvement (NCF–P_ROI) with spring Concepts & Applications (CA_Spring) test scores, after controlling for fall NCF–P (NCF–P_Fall) performance. The model used is shown here:
$$\mathrm{CA}_{\mathrm{Spring}} = \mathrm{Intercept} + \beta_1 \times \text{NCF–P}_{\mathrm{Fall}} + \beta_2 \times \text{NCF–P}_{\mathrm{ROI}} + \varepsilon$$
A positive and statistically significant β_2 indicates that for a given fall NCF–P score, students with higher NCF–P ROIs had higher spring CA scores.
Concepts & Applications (CA) is an individually administered, untimed (nonspeeded) math measure that assesses conceptual knowledge and math problem solving skills. It is standards based, with items that align to the Common Core State Standards of Mathematics. CA is used exclusively for screening (benchmarking) and is not part of the progress monitoring system. There are three CA benchmark forms (fall, winter, and spring), each with from 25 items. Scores are reported as a total number correct.
Predictive validity of the fall-to-spring rate of improvement, by Grade 1 measure
| Measure | N | B_2 | SE | T | p |
| --- | --- | --- | --- | --- | --- |
| NCF–P | 800 | 7.8 | 0.85 | 8.9 | <.01 |
| MFF–1D | 800 | 5.9 | 0.80 | 7.4 | <.01 |
| MFF–T | 800 | 6.7 | 0.57 | 11.7 | <.01 |
Bias Analysis Conducted
Grade  1 

Rating  No 
Disaggregated Reliability and Validity Data
Grade  1 

Rating  No 
Alternate Forms
Grade  1 

Rating 
What is the number of alternate forms of equal and controlled difficulty? 20
To maximize the equivalency of the alternate test forms used for progress monitoring, each form was developed from the same set of test specifications (i.e., test blueprints). The specifications indicate which skills to measure, the number of items per skill, and how to sequence the skills on each form. Each submission document previously provided contains test blueprint information for each measure, including item counts by skill.
Twenty-four alternate NCF–P forms per grade were developed from these specifications and administered to students from across the U.S. For this study, each student completed three NCF–P forms. Twenty-four sets of forms were defined per grade, with each form appearing in two sets. Each set included the winter benchmark form for that grade and two alternate forms. In half of the sets, the same forms were presented but in reverse order. For example, Set 1A = Winter, PM1, PM2 forms, while Set 1B = Winter, PM2, PM1 forms. The winter form was used as an anchor form and to control for sampling differences across sets. Counterbalancing controlled for order effects.
Sets were randomly assigned to students by spiraling sets within grade at each testing site.
Form equivalency is further evaluated by comparing the mean difficulty of each form. Two methods are used here to describe comparability of form difficulty: effect size and percentage of total score variance attributable to form.
The effect size (ES) for each form is the mean of the form minus the weighted average across all forms divided by the pooled SD:
$$\mathrm{ES} = \frac{\bar{x}_i - \bar{X}}{\mathrm{SD}_{\mathrm{pooled}}}$$
Effect sizes less than 0.30 are considered small. Most effect sizes were less than 0.15 for the NCF–P forms.
The percentage of the total score variance attributable to test form was computed by dividing the between-form variance by the sum of the pooled within-form variance and the between-form variance. The percentage of test score variance attributable to forms was less than 2%.
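The two comparability statistics described above can be sketched as follows. This is an illustration, not the publisher's procedure: the source does not say how the pooled SD or the between-form variance were estimated, so pooling within-form variances by (n − 1) and using the sample variance of the form means are both assumptions. The three forms are taken from the Grade 1 alternate-form table; because they are only a subset, the results will not match the published full-sample values.

```python
import math
from statistics import variance

# form number -> (n, mean, SD); a three-form subset used only for illustration
FORMS = {4: (57, 27.2, 6.28), 5: (28, 28.6, 9.61), 6: (62, 29.6, 4.26)}


def weighted_mean(forms):
    """Average of the form means, weighted by sample size."""
    total_n = sum(n for n, _, _ in forms.values())
    return sum(n * mean for n, mean, _ in forms.values()) / total_n


def pooled_sd(forms):
    """Assumption: pool within-form variances weighted by (n - 1)."""
    numerator = sum((n - 1) * sd ** 2 for n, _, sd in forms.values())
    denominator = sum(n - 1 for n, _, _ in forms.values())
    return math.sqrt(numerator / denominator)


def effect_size(form_mean, grand_mean, sd_pooled):
    """Form mean minus the weighted average, divided by the pooled SD."""
    return (form_mean - grand_mean) / sd_pooled


def percent_variance_due_to_form(forms):
    """Between-form variance as a percentage of (within + between)."""
    # Assumption: between-form variance = sample variance of the form means.
    between = variance([mean for _, mean, _ in forms.values()])
    within = pooled_sd(forms) ** 2
    return 100.0 * between / (within + between)


grand = weighted_mean(FORMS)
sd_p = pooled_sd(FORMS)
for form, (_, mean, _) in FORMS.items():
    print(form, round(effect_size(mean, grand, sd_p), 2))
print(round(percent_variance_due_to_form(FORMS), 2))
```

With this definition the effect size is signed: a form that is harder than average (lower mean) gets a negative value.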
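Alternate Form Means and Standard Deviations for Number Comparison Fluency–Pairs, Grade 1
| Measure | Form | n | Mean | SD | ES |
| --- | --- | --- | --- | --- | --- |
| NCF–P | 4 | 57 | 27.2 | 6.28 | 0.18 |
| NCF–P | 5 | 28 | 28.6 | 9.61 | 0.02 |
| NCF–P | 6 | 62 | 29.6 | 4.26 | 0.17 |
| NCF–P | 7 | 37 | 28.6 | 6.14 | 0.02 |
| NCF–P | 8 | 47 | 28.5 | 7.40 | 0.01 |
| NCF–P | 9 | 53 | 29.4 | 6.66 | 0.14 |
| NCF–P | 10 | 71 | 28.9 | 5.65 | 0.07 |
| NCF–P | 11 | 50 | 28.1 | 7.71 | 0.05 |
| NCF–P | 12 | 63 | 27.7 | 6.17 | 0.11 |
| NCF–P | 13 | 39 | 29.5 | 7.01 | 0.16 |
| NCF–P | 14 | 39 | 28.6 | 5.46 | 0.02 |
| NCF–P | 15 | 62 | 29.2 | 4.40 | 0.11 |
| NCF–P | 16 | 47 | 27.2 | 6.85 | 0.18 |
| NCF–P | 17 | 37 | 28.6 | 7.18 | 0.02 |
| NCF–P | 18 | 70 | 28.2 | 6.66 | 0.04 |
| NCF–P | 19 | 57 | 27.7 | 7.76 | 0.11 |
| NCF–P | 20 | 28 | 29.3 | 9.41 | 0.13 |
| NCF–P | 21 | 52 | 28.0 | 6.47 | 0.07 |
| NCF–P | 22 | 50 | 27.3 | 7.12 | 0.17 |
| NCF–P | 23 | 63 | 28.8 | 7.07 | 0.05 |
| | | Mean | 28.5 | 6.8 | 0.09 |
| | | SD | 0.76 | 1.30 | |
Percentage variance: 1.25%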
Rates of Improvement Specified
Grade  1 

Rating 
Is minimum acceptable growth (slope of improvement or average weekly increase in score by grade level) specified in your manual or published materials?
Yes
Specify the growth standards:
aimswebPlus provides student growth percentiles (SGP) by grade and initial (fall and winter) performance level for establishing growth standards. An SGP indicates the percentage of students in the national sample whose seasonal (or annual) rate of improvement (ROI) fell at or below a specified ROI. Separate SGP distributions are computed for each of five levels of initial (fall or winter) performance.
When setting a performance goal for a student, the system automatically generates feedback as to the appropriateness of the goal. An SGP < 50 is considered Insufficient; an SGP between 50 and 85 is considered Closes the Gap; an SGP between 85 and 97 is considered Ambitious; and an SGP > 97 is considered Overly Ambitious. aimswebPlus recommends setting performance goals that represent rates of growth between the 85th and 97th SGP. However, the user ultimately determines what growth rate is required on an individual basis.
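The goal-feedback bands can be expressed as a small lookup. This is a sketch rather than the aimswebPlus implementation: the function name is hypothetical, and because the text lists 85 as the edge of two bands, assigning an SGP of exactly 85 to Ambitious (in line with the recommended 85th to 97th range) is an assumption.

```python
def classify_goal(sgp: float) -> str:
    """Map a student growth percentile (0-100) to the goal-feedback label."""
    if not 0 <= sgp <= 100:
        raise ValueError("SGP must be between 0 and 100")
    if sgp < 50:
        return "Insufficient"
    if sgp < 85:
        return "Closes the Gap"
    if sgp <= 97:  # assumption: 85 and 97 both count as Ambitious
        return "Ambitious"
    return "Overly Ambitious"


for value in (40, 60, 90, 99):
    print(value, classify_goal(value))
```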
What is the basis for specifying minimum acceptable growth?
Norm-referenced
If norm-referenced, describe the normative profile.
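Demographic Characteristics of the aimswebPlus Norm Sample, Kindergarten and Grade 1
| Subject | Grade | Sex: F | Sex: M | Race: B | Race: H | Race: O | Race: W | SES (F/R lunch): Low | SES (F/R lunch): Mod | SES (F/R lunch): High |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Math | K | 0.50 | 0.50 | 0.14 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |
| Math | 1 | 0.50 | 0.50 | 0.13 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |
| Reading | K | 0.50 | 0.50 | 0.14 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |
| Reading | 1 | 0.50 | 0.50 | 0.13 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |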
Representation: National
Date: 2013–2014
Number of States: 11
Regions: 4
Gender: 50% male, 50% female
SES: Low, middle, high, free and reduced lunch
Please describe other procedures for specifying adequate growth:
To get the most value from progress monitoring, aimswebPlus recommends the following: (1) establish a time frame, (2) determine the level of performance expected, and (3) determine the criterion for success. Typical time frames include the duration of the intervention or the end of the school year. An annual time frame is typically used when IEP goals are written for students who are receiving special education. For example, aimswebPlus goals can be written as follows: In 34 weeks, the student will compare numbers and answer computational problems to earn a score of 30 points on Grade 4 Number Sense Fluency forms.
End-of-Year Benchmarks
Grade  1 

Rating 
Are benchmarks for minimum acceptable end-of-year performance specified in your manual or published materials?
Yes
Specify the end-of-year performance standards:
What is the basis for specifying minimum acceptable end-of-year performance?
Norm-referenced
Specify the benchmarks:
Percentage of students below proficient level on state test.
What is the basis for specifying these benchmarks?
Norm-referenced
If norm-referenced, describe the normative profile:
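Demographic Characteristics of the aimswebPlus Norm Sample, Kindergarten and Grade 1
| Subject | Grade | Sex: F | Sex: M | Race: B | Race: H | Race: O | Race: W | SES (F/R lunch): Low | SES (F/R lunch): Mod | SES (F/R lunch): High |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Math | K | 0.50 | 0.50 | 0.14 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |
| Math | 1 | 0.50 | 0.50 | 0.13 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |
| Reading | K | 0.50 | 0.50 | 0.14 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |
| Reading | 1 | 0.50 | 0.50 | 0.13 | 0.25 | 0.10 | 0.51 | 0.32 | 0.32 | 0.36 |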
Representation: National
Date: 2013–2014
Number of States: 11
Regions: 4
Gender: 50% male, 50% female
SES: Low, middle, high, free and reduced lunch
Sensitive to Student Improvement
Grade  1 

Rating 
Describe evidence that the monitoring system produces data that are sensitive to student improvement (i.e., when student learning actually occurs, student performance on the monitoring tool increases on average).
Sensitivity to improvement was assessed by demonstrating that annual performance gains were statistically significant and moderate in size as expressed in fall standard deviation units. A gain expressed in SD units that exceeds 0.3 can be considered moderate (see Cohen, J., 1988. Statistical Power Analysis for the Behavioral Sciences (Second Edition). Lawrence Erlbaum Associates.)
NCF–P fall and spring benchmark means, SDs, paired-sample t, and annual gain represented as fall standard deviation units
| Measure | Mean: Fall | Mean: Spring | SD: Fall | SD: Spring | N | Paired t | p | Gain/SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NCF–P | 23.6 | 29.8 | 7.50 | 6.20 | 2000 | 45.0 | <0.01 | 0.83 |
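The Gain/SD value can be reproduced from the reported means and the fall SD. The sketch below uses only the figures given for NCF–P; the function name is illustrative.

```python
def gain_in_fall_sd_units(fall_mean: float, spring_mean: float, fall_sd: float) -> float:
    """Annual gain (spring minus fall) expressed in fall standard deviation units."""
    if fall_sd <= 0:
        raise ValueError("fall_sd must be positive")
    return (spring_mean - fall_mean) / fall_sd


gain = gain_in_fall_sd_units(fall_mean=23.6, spring_mean=29.8, fall_sd=7.50)
print(round(gain, 2))  # 0.83
```

A value of 0.83 exceeds the 0.3 threshold that the text cites for a moderate gain.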
Decision Rules for Changing Instruction
Grade  1 

Rating 
Does your manual or published materials specify validated decision rules for when changes to instruction need to be made?
Yes
Specify the decision rules:
aimswebPlus applies a statistical procedure to the student’s progress monitoring scores in order to provide empirically based guidance about whether the student is likely to meet, fall short of, or exceed his/her goal. The calculation procedure (presented below) is fully described in the aimsweb Progress Monitoring Guide (Pearson, 2012). aimswebPlus users will not have to do any calculations; the online system does this automatically. The decision rule is based on a 75% confidence interval for the student’s predicted score at the goal date. This confidence interval is student-specific and takes into account the number and variability of progress monitoring scores and the duration of monitoring. Starting at the sixth week of monitoring (when there are at least four monitoring scores), the aimswebPlus report following each progress monitoring administration includes one of the following statements:
A. “The student is projected to not reach the goal.” This statement appears if the confidence interval is completely below the goal score.
B. “The student is projected to exceed the goal.” This statement appears if the confidence interval is completely above the goal score.
C. “The student is projected to be near the goal. The projected score at the goal date is between X and Y” (where X and Y are the bottom and top of the confidence interval). This statement appears if the confidence interval includes the goal score.
If Statement A appears, the user has a sound basis for deciding that the current intervention is not sufficient and a change to instruction should be made. If Statement B appears, there is an empirical basis for deciding that the goal is not sufficiently challenging and should be increased. If Statement C appears, the student’s progress is not clearly different from the aimline, so there is not a compelling reason to change the intervention or the goal; however, the presentation of the confidence-interval range enables the user to see whether the goal is near the upper limit or lower limit of the range, which would signal that the student’s progress is trending below or above the goal.
A 75% confidence interval was chosen for this application because it balances the costs of the two types of decision errors. Incorrectly deciding that the goal will not be reached (when in truth it will be reached) has a moderate cost: an intervention that is working will be replaced by a different intervention. Incorrectly deciding that the goal may be reached (when in truth it will not be reached) also has a moderate cost: an ineffective intervention will be continued rather than being replaced. Because both kinds of decision errors have costs, it is appropriate to use a modest confidence level.
Calculation of the 75% confidence interval for the score at the goal date:
Calculate the trend line. This is the ordinary least-squares regression line through the student’s monitoring scores.
Calculate the projected score at the goal date. This is the value of the trend line at the goal date.
Calculate the standard error of estimate (SEE) of the projected score at the goal date, using the following formula:
$$\mathrm{SEE}_{\text{predicted score}} = \sqrt{\frac{\sum_{i=1}^{k} (y_i - y'_i)^2}{k-2}} \times \sqrt{1 + \frac{1}{k} + \frac{\left(GW - \frac{\sum_{i=1}^{k} w_i}{k}\right)^2}{\sum_{i=1}^{k} \left(w_i - \frac{\sum_{j=1}^{k} w_j}{k}\right)^2}}$$
where k = number of completed monitoring administrations, w = week number of a completed administration, GW = week number of the goal date, y = monitoring score, y’ = predicted monitoring score at that week (from the student’s trend line). The means and sums are calculated across all of the completed monitoring administrations up to that date. Add and subtract 1.25 times the SEE to the projected score, and round to the nearest whole numbers.
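The calculation steps above, together with the three report statements, can be sketched in a few lines. This is an illustration of the described procedure, not Pearson's code: the function name, argument names, return layout, and the sample scores are hypothetical. The 1.25 multiplier and the four-score minimum come from the text. Python's `round` resolves exact .5 ties to the even integer, which may differ from the system's rounding.

```python
import math

MIN_SCORES = 4  # the text says statements start once four scores exist


def goal_projection(weeks, scores, goal_week, goal_score, multiplier=1.25):
    """Return (low, high, statement) for the projected score at the goal date."""
    k = len(weeks)
    if k != len(scores) or k < MIN_SCORES:
        raise ValueError("need matching lists with at least four scores")
    w_bar = sum(weeks) / k
    y_bar = sum(scores) / k
    sxx = sum((w - w_bar) ** 2 for w in weeks)
    if sxx == 0:
        raise ValueError("week numbers must not all be identical")
    # Step 1: ordinary least-squares trend line through the monitoring scores
    slope = sum((w - w_bar) * (y - y_bar) for w, y in zip(weeks, scores)) / sxx
    intercept = y_bar - slope * w_bar
    # Step 2: projected score = value of the trend line at the goal week
    projected = intercept + slope * goal_week
    # Step 3: standard error of estimate of the projected score
    sse = sum((y - (intercept + slope * w)) ** 2 for w, y in zip(weeks, scores))
    see = math.sqrt(sse / (k - 2)) * math.sqrt(
        1 + 1 / k + (goal_week - w_bar) ** 2 / sxx
    )
    low = round(projected - multiplier * see)
    high = round(projected + multiplier * see)
    if high < goal_score:
        statement = "The student is projected to not reach the goal."
    elif low > goal_score:
        statement = "The student is projected to exceed the goal."
    else:
        statement = (
            "The student is projected to be near the goal. "
            f"The projected score at the goal date is between {low} and {high}."
        )
    return low, high, statement


# Hypothetical monitoring data: six weekly scores, goal of 30 at week 20
print(goal_projection([1, 2, 3, 4, 5, 6], [12, 14, 13, 16, 17, 17], 20, 30))
```

The interval widens as the goal week moves further from the mean of the monitored weeks, which is why early projections tend to produce the "near the goal" statement.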
What is the evidentiary basis for these decision rules?
The decision rules are statistically rather than empirically based. The guidance statements that result from applying the 75% confidence interval to the projected score are correct probabilistic statements under two assumptions. First, the student’s progress can be described by a linear trend line. If the pattern of the student’s monitoring scores is obviously curvilinear, then the projected score based on a linear trend will likely be misleading. We provide training in the aimsweb Progress Monitoring Guide about the need for users to take nonlinearity into account when interpreting progress-monitoring data. Second, the student will continue to progress at the same rate as they have been progressing to that time. This is an unavoidable assumption for a decision system based on extrapolating from past growth.
Even though the rules are not derived from data, it is useful to observe how they work in a sample of real data. For this purpose, we selected random samples of students in the aimsweb 2010–2011 database who were progress monitored on either Reading Curriculum-Based Measurement (R-CBM) or Math Computation (M-COMP). All students selected scored below the 25th percentile in the fall screening administration of R-CBM or M-COMP. The R-CBM sample consisted of 1,000 students (200 at each of Grades 2 through 6) who had at least 30 monitoring scores, and the M-COMP sample included 500 students (100 at each of Grades 2 through 6) with a minimum of 28 monitoring scores. This analysis was only a rough approximation, because we did not know each student’s actual goal or whether the intervention or goal was changed during the year.
To perform the analyses, we first set an estimated goal for each student by using the ROI at the 85th percentile of aimsweb national ROI norms to project their score at their 30th monitoring administration. Next, we defined “meeting the goal” as having a mean score on the last three administrations (e.g., the 28th through 30th administrations of R-CBM) that was at or above the goal score. At each monitoring administration for each student, we computed the projected score at the goal date and the 75% confidence interval for that score, and recorded which of the three decision statements was generated (projected not to meet goal, projected to exceed goal, or on-track/no-change).
In this analysis, accuracy of guidance to change (that is, accuracy of projections that the student will not reach the goal or will exceed the goal) reached a high level (80%) by about the 13th to 15th monitoring administration, on average. The percentage of students receiving guidance to not change (i.e., their trendline was not far from the aimline) would naturally tend to decrease over administrations as the size of the confidence interval decreased. At the same time, however, there was a tendency for the trendline to become closer to the aimline over time as it became more accurately estimated, and this worked to increase the percentage of students receiving the “no change” guidance.
Decision Rules for Increasing Goals
Grade  1 

Rating 
Does your manual or published materials specify validated decision rules for when to increase goals?
Yes
Specify the decision rules:
The same statistical approach described under Decision Rules for Changing Instruction (GOM 9 above) applies to the decisions about increasing a goal. aimswebPlus provides the following guidance for deciding whether to increase a performance goal:
If the student is projected to exceed the goal and there are at least 12 weeks remaining in the schedule, consider raising the goal.
What is the evidentiary basis for these decision rules?
See GOM 9 evidentiary basis information above.
Improved Student Achievement
Grade  1 

Rating 
Improved Teacher Planning
Grade  1 

Rating 