Test of Early Numeracy: Quantity Discrimination
The AIMSweb Test of Early Numeracy (TEN) is included in a subscription to AIMSweb Pro Math or AIMSweb Pro Complete; subscriptions range from $4.00 to $6.00 per student per year.
Every AIMSweb subscription provides unlimited access to the AIMSweb online system, which includes:

AIMSweb assessments for universal screening and progress monitoring

Data management and reporting

Browser-based scoring

Training manuals

Administration and scoring manuals

Internet access is required for full use of product services.
Testers require 2–4 hours of training.
Paraprofessionals can administer the test.

Pearson
19500 Bulverde Road
San Antonio, TX 78259
Phone: 866-313-6194
Visit AIMSweb.com
General Information:
866-313-6194, option 2
sales@aimsweb.com
Tech support:
866-313-6194, option 1
aimswebsupport@pearson.com
Access to field-tested training manuals, which provide administration, scoring, and implementation information, is included with AIMSweb subscriptions.
Ongoing technical support is provided.
Professional Development opportunities are available.

The Test of Early Numeracy (TEN) consists of four measures:
Oral Counting: Student counts aloud for 1 minute.
Number Identification: Student names numbers up to 10 (or 20), presented in random order, for 1 minute.
Quantity Discrimination: Student indicates which of two numbers up to 10 (or 20) is greater, for 1 minute.
Missing Number: Student says the value of the missing number from a sequence of three numbers.
33 standardized and individually administered alternate forms per grade for Kindergarten and grade 1.

Scores reported: raw score; national percentiles (K and grade 1) and normative performance levels by grade and season; individual student growth percentiles by grade and season (based on rates of improvement, ROI); and success probability scores (cut scores that indicate a 50% or 80% probability of passing the state test). Local norms are also available.

Reliability of the Performance Level Score
Type                      Grade   Coefficient   Source
Interscorer               1       0.99          Clarke and Shinn (2004)
Alternate form (Fall)     1       0.93          Clarke and Shinn (2004)
Alternate form (Winter)   1       0.92          Clarke and Shinn (2004)
Test-retest (2 weeks)     1       0.96          Clarke and Shinn (2004)
Test-retest (13 weeks)    1       0.85          Clarke and Shinn (2004)
Test-retest (26 weeks)    1       0.86          Clarke and Shinn (2004)
Test-retest               K       0.80          Chard et al. (2005)
Test-retest               1       0.91          Chard et al. (2005)
Reliability of the Slope
Type: Ratio of observed variance to true variance, estimated through HLM
Sample: Grade K, aggregated across ethnicities (n = 102)
Coefficient: 0.75 (reliability of slope for the total sample)
Note: Reliability of slope based on 10 observational data points collected over the course of 10 weeks (one data point per week).
Validity of the Performance Level Score
Type         Grade   Criterion measure                                  Coefficient
Concurrent   1       Number Knowledge Test (Fall)                       0.80
Concurrent   1       Woodcock-Johnson Math Applications Test (Winter)   0.71
Concurrent   1       MCBM (Winter)                                      0.71
Concurrent   1       Woodcock-Johnson Math Applications Test (Spring)   0.79
Concurrent   1       MCBM (Spring)                                      0.75
Predictive   1       MCBM (Winter)                                      0.76
Predictive   1       MCBM (Spring)                                      0.70
Predictive   1       Woodcock-Johnson Math Applications Test (Spring)   0.79
Predictive   K       Number Knowledge Test (Spring)                     0.68
Predictive   K       SAT 9 (Spring)                                     0.73
Concurrent   1       Number Knowledge Test (Fall)                       0.61
Concurrent   1       Missing Number (Winter)                            0.74
Concurrent   1       Number Knowledge Test (Winter)                     0.61
Concurrent   1       Number Identification                              0.81
Concurrent   1       Missing Number (Spring)                            0.68
Concurrent   1       Number Knowledge Test (Spring)                     0.47
Concurrent   1       SAT 9 (Spring)                                     0.40
Concurrent   1       Number Identification                              0.77
Concurrent   K       Number Identification (Winter)                     0.79
Concurrent   K       Missing Number (Winter)                            0.71
Concurrent   K       Number Knowledge Test (Winter)                     0.54
Concurrent   K       Number Identification (Spring)                     0.70
Concurrent   K       Missing Number (Spring)                            0.70
Concurrent   K       Number Knowledge Test                              0.69
Concurrent   K       SAT 9                                              0.71
Predictive Validity of the Slope of Improvement
Type: Predictive
Grade: Kindergarten
Criterion: Grade 1 MCBM Spring Benchmark (n = 102)
Coefficient: 0.71 (predictive validity of slope for the total sample)
Note: Slope based on 10 data points collected over the course of 10 weeks (one data point per week).
Bias Analysis Conducted
Disaggregated Reliability and Validity Data
Disaggregated Reliability of the Slope
Type: Ratio of observed variance to true variance, estimated through HLM
Sample: Grade K, disaggregated by ethnicity (n = 102)
Range: 0.38 to 0.92
Detail: Reliability of slope for the Caucasian subsample = 0.80; African American subsample = 0.38 (likely due to a very small subsample); Hispanic subsample = 0.92.
Note: Reliability of slope based on 10 observational data points collected over the course of 10 weeks (one data point per week).
Disaggregated Predictive Validity of the Slope of Improvement
Type: Predictive
Grade: Kindergarten
Criterion: Grade 1 MCBM Spring Benchmark (n = 102)
Range: 0.62 to 0.73
Detail: Predictive validity of slope for the Caucasian subsample = 0.73; African American subsample = 0.62; Hispanic subsample = 0.70.
Note: Slope based on 10 data points collected over the course of 10 weeks (one data point per week).
1. Evidence that alternate forms are of equal and controlled difficulty or, if IRT based, evidence of item or ability invariance:
Early Numeracy Curriculum-Based Measurement reliability for all testing sessions (Quantity Discrimination): 0.99, 0.93, 0.92, 0.96, 0.85, 0.86.
2. Number of alternate forms of equal and controlled difficulty:
30 alternate forms in grades K and 1.
Rates of Improvement Specified
1. Is minimum acceptable growth (slope of improvement or average weekly increase in score by grade level) specified in manual or published materials?
Yes.
a. Specify the growth standards (weekly rate of improvement, by grade and percentile):

Grade K:
  Percentile   Weekly ROI
  90           0.0
  75           0.2
  50           0.4
  25           0.4
  10           0.3
  Mean         0.3

Grade 1:
  Percentile   Weekly ROI
  90           0.2
  75           0.3
  50           0.4
  25           0.4
  10           0.4
  Mean         0.3
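The weekly ROI values above translate directly into projected scores. A minimal sketch (the baseline score, the 18-week period, and the function name are illustrative, not AIMSweb values; the ROI dictionary is an excerpt of the grade 1 rows above):

```python
# Grade 1 weekly rates of improvement by percentile, from the table above
GRADE1_ROI = {90: 0.2, 75: 0.3, 50: 0.4, 25: 0.4, 10: 0.4}

def projected_score(baseline, weekly_roi, weeks):
    """Project an end-of-period score from a baseline score and a weekly ROI."""
    return baseline + weekly_roi * weeks

# A hypothetical grade 1 student starting at 10 correct, improving at the
# 50th-percentile rate over an 18-week monitoring period
end = projected_score(10, GRADE1_ROI[50], 18)  # about 17 correct per minute
```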
b. Basis for specifying minimum acceptable growth:
Criterion-referenced and norm-referenced.
2. Normative profile:
Representation: National
Date: 2001–2008
Number of States: 49 & DC
3. Procedure for specifying criterion for adequate growth:
AIMSweb TEN derives its standards for adequate growth in two ways. First, rates of improvement (ROI) are calculated using the composite normative sample of AIMSweb customers. Year 1 normative data are being compiled, with information on relative standing and rates of improvement provided in continuously updated normative tables as shown on the following pages. Spring AIMSweb TEN data are currently being collected. Second, AIMSweb users can identify their own criterion-referenced rates of progress by linking their AIMSweb TEN scores to Mathematics CBM (MCBM) measures and/or their state-mandated high-stakes mathematics tests. See Hintze, Ryan, and Stoner (2003) for an example of how empirical linkages can be used for goal setting in CBM.
By establishing the predictive relationship between TEN and any accepted high-stakes criterion test, AIMSweb users can use the score that predicts passing with an 80% or 90% probability.
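One crude way to illustrate the second approach, finding the screening score associated with a chosen probability of passing a criterion test, is an empirical cut-score search. This is a sketch only: AIMSweb's actual linking uses its test correlation feature, and the scores and outcomes below are made up.

```python
def cut_score(scores, passed, target=0.80):
    """Smallest screening score at which the observed pass rate among
    students scoring at or above it reaches the target probability.
    A crude empirical stand-in for a formal statistical linking."""
    pairs = sorted(zip(scores, passed))
    for i, (score, _) in enumerate(pairs):
        outcomes = [p for _, p in pairs[i:]]
        if sum(outcomes) / len(outcomes) >= target:
            return score
    return None  # no score group reaches the target pass rate

# Hypothetical screening scores and pass (1) / fail (0) criterion outcomes
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
print(cut_score(scores, passed, 0.80))  # 5
```

In practice a logistic regression of pass/fail on screening score would give a smoother estimate, but the idea is the same: pick the score whose predicted pass probability first reaches the target.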
EndofYear Benchmarks
1. Are benchmarks for minimum acceptable endofyear performance specified in your manual or published materials?
Yes.
a. Specify the endofyear performance standards:
Customers can: (1) define their own benchmark targets based on norm tables or other data; (2) use AIMSweb presets, which are based on the score at the 50th percentile of the AIMSweb National Norms; (3) use DIBELS presets; or (4) use the AIMSweb test correlation feature to generate benchmark targets that predict success on high-stakes testing.
b. Basis for specifying minimum acceptable endofyear performance:
Norm-referenced and criterion-referenced.
Normative profile:
Representation: National
Date: 2001–2008
Number of States: 49 & DC
Size: 315,866
Sensitive to Student Improvement
1. Describe evidence that the monitoring system produces data that are sensitive to student improvement (i.e., when student learning actually occurs, student performance on the monitoring tool increases on average).
In order to assess the sensitivity of AIMSweb TEN to student improvement, data from one year were analyzed for students in Kindergarten and grade 1 who were receiving Tier 2 RTI supplemental instruction. These students were drawn from an overall sample of 320 students attending two primary schools in the Northeast. Students were chosen for Tier 2 intervention based on their TEN scores in universal screening at one of the three benchmark testing periods (fall, winter, and spring).
The intervention was provided four times a week by a remedial math teacher, in sessions of 30 to 40 minutes. The structured intervention, Number Worlds, is ungraded but is differentiated by level. Students in Kindergarten were provided with intensified instruction at Levels A through C, which focus on counting and conceptual structure for single-digit numbers, and the relationship of number concepts to the formal symbol system. Students in grade 1 were provided intensified instruction at Level D, which addresses number sense, number pattern and relationship, addition, subtraction, geometry, measurement, and data analysis and applications.
A series of single-sample t tests was computed at each grade level and for each TEN measure, comparing the average rate of improvement (ROI) for students in intervention to the mean ROI for students in general education who were not receiving supplemental intervention.
Results were significant (p < 0.05) for each measure at the Kindergarten level. On the Quantity Discrimination measure, Kindergarten students receiving RTI support (n = 15) outperformed their general education peers by 0.77 responses correct per minute per week.
At grade 1, the rate of improvement on Quantity Discrimination for students in RTI was not significantly different from that of their general education peers: students receiving RTI support (n = 18) improved at a rate of 0.29 responses correct per minute per week, compared with 0.30 for general education peers.
Results of these analyses provide evidence that the AIMSweb TEN assessment measures are sensitive to validated interventions.
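The single-sample t tests described above reduce to a short computation. In this sketch the ROI values and the reference mean are made up for illustration; only the test itself matches the analysis described:

```python
import math

def one_sample_t(sample, mu0):
    """t statistic for testing whether a sample mean (e.g., mean weekly ROI
    of students in intervention) differs from a fixed reference mean mu0
    (e.g., the general-education mean ROI)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (mean - mu0) / math.sqrt(var / n)

# Hypothetical weekly ROIs for five students in intervention, compared with
# a general-education mean ROI of 0.30
t = one_sample_t([0.8, 0.9, 1.0, 1.1, 1.2], 0.30)  # t ≈ 9.9
```

The resulting t statistic is then compared with the t distribution on n − 1 degrees of freedom to judge significance.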
Decision Rules for Changing Instruction
Specification of validated decision rules for when changes to instruction need to be made: The newest version of the AIMSweb online system, to be released for piloting in the fall of 2012 and made available to all users no later than the fall of 2013, applies a statistical procedure to the student's monitoring scores in order to provide empirically based guidance about whether the student is likely to meet, fall short of, or exceed the goal. The calculation procedure (presented below) is fully described in the AIMSweb Progress Monitoring Guide (Pearson, 2012) and can be implemented immediately by AIMSweb users if they create a spreadsheet or simple software program. Once the new AIMSweb online system is fully distributed, the user will not have to do any calculations to obtain this data-based guidance.
The decision rule is based on a 75% confidence interval for the student's predicted score at the goal date. This confidence interval is student-specific and takes into account the number and variability of monitoring scores and the duration of monitoring. Starting at the sixth week of monitoring, when there are at least four monitoring scores, the AIMSweb report following each monitoring administration includes one of the following statements:
A. "The student is projected to not reach the goal." Appears if the confidence interval is completely below the goal score.
B. "The student is projected to exceed the goal." Appears if the confidence interval is completely above the goal score.
C. "The student is on track to reach the goal. The projected score at the goal date is between X and Y" (where X and Y are the bottom and top of the confidence interval). Appears if the confidence interval includes the goal score.
If Statement A appears, the user has a sound basis for deciding that the current intervention is not sufficient and a change to instruction should be made. If Statement B appears, there is an empirical basis for deciding that the goal is not sufficiently challenging and should be increased. If Statement C appears, the student's progress is not clearly different from the aimline, so there is not a compelling reason to change the intervention or the goal; however, the confidence-interval range enables the user to see whether the goal is near the upper or lower limit of the range, which would signal that the student's progress is trending below or above the goal.
A 75% confidence interval was chosen for this application because it balances the costs of the two types of decision errors. Incorrectly deciding that the goal will not be reached (when in truth it will be) has a moderate cost: an intervention that is working will be replaced by a different intervention. Incorrectly deciding that the goal may be reached (when in truth it will not be) also has a moderate cost: an ineffective intervention will be continued rather than replaced. Because both kinds of decision errors have costs, it is appropriate to use a modest confidence level.
Calculation of the 75% confidence interval for the score at the goal date:
1. Calculate the trend line: the ordinary least-squares regression line through the student's monitoring scores.
2. Calculate the projected score at the goal date: the value of the trend line at the goal date.
3. Calculate the standard error of estimate (SEE) of the projected score at the goal date, using the following formula:
   SEE = sqrt( [sum((y − y′)²) / (k − 2)] × [1 + 1/k + (GW − mean(w))² / sum((w − mean(w))²)] )
   where k = number of completed monitoring administrations, w = week number of a completed administration, GW = week number of the goal date, y = monitoring score, and y′ = predicted monitoring score at that week (from the student's trend line). The means and sums are calculated across all of the completed monitoring administrations up to that date.
4. Add and subtract 1.25 times the SEE to the projected score, and round to the nearest whole numbers.
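As the guide notes, this procedure can be implemented in a spreadsheet or a simple program. A minimal Python sketch of the calculation and the three guidance statements (function and variable names are ours, not AIMSweb's; this is an illustration, not the system's code):

```python
import math

def projection_with_ci(weeks, scores, goal_week, z=1.25):
    """OLS trend line through (week, score) pairs, the projected score at
    the goal week, and the 75% confidence interval (± 1.25 × SEE)."""
    k = len(scores)
    if k < 4:
        raise ValueError("need at least four monitoring scores")
    mw = sum(weeks) / k
    my = sum(scores) / k
    sxx = sum((w - mw) ** 2 for w in weeks)
    slope = sum((w - mw) * (y - my) for w, y in zip(weeks, scores)) / sxx
    intercept = my - slope * mw
    projected = intercept + slope * goal_week
    # Residual sum of squares around the trend line (k - 2 degrees of freedom)
    sse = sum((y - (intercept + slope * w)) ** 2 for w, y in zip(weeks, scores))
    see = math.sqrt((sse / (k - 2)) * (1 + 1 / k + (goal_week - mw) ** 2 / sxx))
    return projected, round(projected - z * see), round(projected + z * see)

def guidance(lo, hi, goal):
    """Map the confidence interval [lo, hi] to one of the three statements."""
    if hi < goal:
        return "projected to not reach the goal"   # Statement A
    if lo > goal:
        return "projected to exceed the goal"      # Statement B
    return "on track to reach the goal"            # Statement C
```

For example, six weekly scores of 10, 12, 13, 15, 16, 18 project well past a goal of 40 at week 30, so the B statement would be issued.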
Evidentiary basis for these decision rules: The decision rules are statistically rather than empirically based. The guidance statements that result from applying the 75% confidence interval to the projected score are correct probabilistic statements under two assumptions:
1. The student's progress can be described by a linear trend line. If the pattern of the student's monitoring scores is obviously curvilinear, then the projected score based on a linear trend will likely be misleading. The AIMSweb Progress Monitoring Guide provides training on the need for users to take nonlinearity into account when interpreting progress-monitoring data.
2. The student will continue to progress at the same rate as up to that time. This is an unavoidable assumption for any decision system based on extrapolating from past growth.
Even though the rules are not derived from data, it is useful to observe how they work in a sample of real data. For this purpose, we selected random samples of students in the AIMSweb 2010–2011 database who were progress monitored on either Reading Curriculum-Based Measurement (RCBM) or Math Computation (MCOMP). All students scored below the 25th percentile in the fall screening administration of RCBM or MCOMP. The RCBM sample consisted of 1,000 students (200 each at grades 2 through 6) who had at least 30 monitoring scores, and the MCOMP sample included 500 students (100 per grade) with a minimum of 28 monitoring scores. This analysis was only a rough approximation, because we did not know each student's actual goal or whether the intervention or goal was changed during the year. To perform the analyses, we first set an estimated goal for each student by using the ROI at the 85th percentile of the AIMSweb national ROI norms to project the student's score at the 30th monitoring administration.
Next, we defined "meeting the goal" as having a mean score on the last three administrations (e.g., the 28th through 30th administrations of RCBM) that was at or above the goal score. At each monitoring administration for each student, we computed the projected score at the goal date and the 75% confidence interval for that score, and recorded which of the three decision statements was generated (projected not to meet the goal, projected to exceed the goal, or on track / no change).
In this analysis, accuracy of guidance to change (that is, accuracy of projections that the student will not reach the goal or will exceed the goal) reached a high level (80%) by about the 13th to 15th monitoring administration, on average. The percentage of students receiving guidance to not change (i.e., their trendline was not far from the aimline) would naturally tend to decrease over administrations as the size of the confidence interval decreased. At the same time, however, there was a tendency for the trendline to become closer to the aimline over time as it became more accurately estimated, and this worked to increase the percentage of students receiving the “no change” guidance.
Decision Rules for Increasing Goals
Specification of validated decision rules for when increases in goals need to be made: The same statistical procedure described under "Decision Rules for Changing Instruction" governs goal increases. Starting at the sixth week of monitoring, the AIMSweb report compares the 75% confidence interval for the student's projected score at the goal date to the goal score. If the confidence interval is completely above the goal score ("The student is projected to exceed the goal," Statement B), there is an empirical basis for deciding that the goal is not sufficiently challenging and should be increased.
Evidentiary basis for these decision rules: As described under "Decision Rules for Changing Instruction," the rules are statistically rather than empirically based, and their behavior was examined in random samples of students progress monitored on RCBM and MCOMP in the AIMSweb 2010–2011 database.
Improved Student Achievement
Improved Teacher Planning