
P-O-S-E(c) Correlations with Benchmarks (Fountas & Pinnell), Northwest Evaluation Association / Measures of Academic Progress (NWEA / MAP) and New York State English Language Arts Test (NYS ELA); Internal Reliability.



Common Core State Standards (CCSS) identify short vowel proficiency by Grade 2 as a foundational skill for literacy. It is hypothesized that the P-O-S-E(c), a focused instrument for assessing short vowel proficiency, should correlate significantly with more global measures of literacy, of which short vowel proficiency is an intrinsic component, such as the Fountas and Pinnell Benchmark, the Northwest Evaluation Association MAP RIT Reading and the New York State ELA examination.


Multiple correlations were run on the Mineola U.F.S.D. 2012-2013 Grade 3 P-O-S-E(c) Baseline and RTI Error scores (N=191), Baseline and RTI Fountas and Pinnell Benchmarks, Baseline and RTI Northwest Evaluation Association / Measures of Academic Progress (NWEA / MAP) scores, and year-end NYS ELA scale scores. Correlations were statistically highly significant for all paired measures for the total N=191 population (Table 1). External correlations with the P-O-S-E(c) are negative in sign because P-O-S-E(c) results are calculated as percent error scores, so a higher P-O-S-E(c) score indicates lower proficiency.
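
The sign convention can be illustrated with a minimal sketch, assuming the reported coefficients are Pearson product-moment correlations (the page does not name the coefficient used). The data and the variable names `error_pct` and `benchmark` are hypothetical and are not study data. Scoring the same students by percent error rather than percent correct only flips the sign of r; its magnitude is unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical illustration data, NOT taken from the Mineola study.
error_pct = [2.5, 5.0, 10.0, 20.0, 35.0]   # percent error on a short vowel test
benchmark = [16, 14, 13, 9, 6]             # reading levels coded as numbers

# Correlation using percent error (as the P-O-S-E(c) is scored).
r_error = pearson_r(error_pct, benchmark)

# Same students, rescored as percent correct (100 - error).
r_correct = pearson_r([100 - e for e in error_pct], benchmark)

print(f"r using percent error:   {r_error:.3f}")
print(f"r using percent correct: {r_correct:.3f}")
```

In this sketch `r_error` is negative and `r_correct` is its mirror image, which is why the strength of the relationship is read from the absolute value of the coefficient.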


Table 2 presents key correlations for each of the 10 classes comprising the 2012-13 Mineola P-O-S-E(c) project.


When Mineola 2012-13 data are compared with Plainview-Old Bethpage (POB) validation data of 2006-7, the P-O-S-E(c) Baseline v. Benchmark Baseline correlations are r=-0.62 for POB (n=78) and r=-0.69 for Mineola 2012-13 (N=191). The comparison of P-O-S-E(c) correlations with ELA, r=-0.35 for POB 2006-7 (n=275) v. r=-0.57 for Mineola 2012-13 (n=191), may not be valid as CCSS did not apply in 2006-7.


Table 3 shows correlations among the baseline P-O-S-E(c) and the F & P Benchmark, NWEA MAP RIT Reading and the NYS ELA scale and raw scores for Grade 3 (n=180) in the Mineola U.F.S.D. for the academic year 2013-2014. Benchmark/P-O-S-E(c) correlations are consistent with Mineola 2012-13 and POB 2006-7: Coefficients of Determination are 0.38 (2006-7, 2013-14) and 0.48 (2012-13). The NWEA/P-O-S-E(c) correlation is lower than that obtained in Mineola 2012-13. The ELA Scale Scores v. P-O-S-E(c) correlations are identical with Mineola 2012-13. All these correlations are statistically significant (p < .00001).
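
The Coefficient of Determination is the square of the correlation coefficient, that is, the proportion of variance in one measure accounted for by the other. As a brief check, the sketch below squares the two Baseline v. Benchmark correlations stated earlier on this page (r=-0.62 for POB 2006-7 and r=-0.69 for Mineola 2012-13). The 2013-14 correlation is not quoted in the text, so it is not included, and the dictionary labels are illustrative only.

```python
def coefficient_of_determination(r):
    """Proportion of shared variance implied by a correlation coefficient r."""
    return r ** 2

# Correlations as stated in the text (P-O-S-E(c) Baseline v. Benchmark Baseline).
reported_r = {
    "POB 2006-7": -0.62,
    "Mineola 2012-13": -0.69,
}

# Square each coefficient and round to two decimals, as reported on the page.
r_squared = {
    label: round(coefficient_of_determination(r), 2)
    for label, r in reported_r.items()
}

for label, value in r_squared.items():
    print(f"{label}: r^2 = {value}")
```

Squaring removes the negative sign, so the values 0.38 and 0.48 describe the strength of the relationship regardless of whether scores are expressed as errors or as correct responses.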


Table 4 shows correlations among the baseline and RTI P-O-S-E(c) and the F & P Benchmark, NWEA MAP RIT Reading and the NYS ELA scale and raw scores for Grade 3 (n=180) in the Mineola U.F.S.D. for the academic year 2013-2014. An administrative decision limited P-O-S-E(c) RTI testing to only those Grade 3 students with P-O-S-E(c) error scores >=12.5%. In order to provide matched student data sets for correlational analysis across tests, the number is reduced to n=96.
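
The reduction to a matched data set can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field names (`pose_baseline_error_pct`, `pose_rti_error_pct`, `ela_scale`) and the four student records are hypothetical, and the actual procedure used to build the n=96 set is not described on this page.

```python
THRESHOLD_PCT = 12.5  # cut-off stated in the text: error scores >= 12.5%

def matched_records(records, required_fields):
    """Keep students at or above the error threshold who also have a
    score on every measure needed for the correlational analysis."""
    kept = []
    for rec in records:
        baseline = rec.get("pose_baseline_error_pct")
        if baseline is None or baseline < THRESHOLD_PCT:
            continue  # below the cut-off, so not retested at RTI
        if all(rec.get(field) is not None for field in required_fields):
            kept.append(rec)
    return kept

# Hypothetical records, NOT study data.
students = [
    {"id": "A", "pose_baseline_error_pct": 5.0, "pose_rti_error_pct": None, "ela_scale": 320},
    {"id": "B", "pose_baseline_error_pct": 12.5, "pose_rti_error_pct": 8.0, "ela_scale": 300},
    {"id": "C", "pose_baseline_error_pct": 30.0, "pose_rti_error_pct": 15.0, "ela_scale": None},
    {"id": "D", "pose_baseline_error_pct": 20.0, "pose_rti_error_pct": 10.0, "ela_scale": 290},
]

matched = matched_records(students, ["pose_rti_error_pct", "ela_scale"])
matched_ids = [rec["id"] for rec in matched]
print(matched_ids)
```

Student A falls below the threshold and student C lacks one of the required scores, so only B and D remain. Restricting the range of scores in this way typically changes the size of the correlations, which is why like-for-like comparisons across years need the same restriction.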


Table 5 presents, for Grade 3 2012-13, the same inter-test correlations displayed in Table 4 for Grade 3 2013-14, limiting the analysis to students with baseline P-O-S-E(c) error scores >=12.5%. This restriction facilitates a comparison of inter-test correlations for Grade 3 in both years.




Construct validity is the appropriateness of inferences made on the basis of observations or measurements (often test scores), specifically whether a test measures the intended construct. A test is deemed valid if the assessment measures what it purports to measure. Validity is a metric of how an assessment fulfills its stated objective.


Convergent validity examines the relationship between an assessment’s test scores and the scores of other instruments measuring similar variables.

For example, a Fountas and Pinnell Benchmark study (2012) reports F & P Benchmark correlations with the Slosson Oral Reading Test-Revised (SORT-R3), a list of 200 words in increasing order of difficulty administered individually to students, of 0.69 and 0.62 for fiction and non-fiction books, respectively. The corresponding correlations with the Degrees of Reading Power® (DRP), a norm-referenced assessment made up of nonfiction text passages formatted using a "cloze" technique, were 0.44 and 0.42 for fiction and non-fiction books, respectively.


Both SORT-R3 and DRP entail assessment based on grasp of morphological elements of literacy. The P-O-S-E(c) presents comparable convergent validity as an instrument based on assessment of phonological elements of literacy, specifically short vowels.


An NWEA study, "A Study of the Alignment of the NWEA RIT Scale with the New York State (NYS) Testing Program" (November, 2013), reports a correlation of 0.73 between the NWEA MAP and the Spring 2013 NYS ELA for Grade 3. This is comparable to the value of 0.69 (n=191) obtained in the P-O-S-E(c) study for the same RTI 2013 variables.


In 2013, the P-O-S-E(c) error scores demonstrated target Grade 3 (n=191) RTI correlations of -0.64 with the F & P Benchmark, -0.46 with the NWEA MAP (reading) and -0.54 with the 2013 NYS ELA.


Internal consistency / reliability is a measure used to evaluate the degree to which different test items that probe the same construct produce similar results.


Table 6 presents correlative data on internal consistency / reliability of the P-O-S-E(c) based on an n of 275 Grade 3 students in the Plainview-Old Bethpage Central School District 2006-7.


Cronbach alpha for the entire test of 120 items is 0.96. The Cronbach alphas for the Spelling half (60 items) and Reading half (60 items) were 0.93 and 0.92, respectively. The correlation between forms (Spelling v. Reading) was 0.77. The full test Spearman-Brown coefficient was 0.87 while the Guttman split-half coefficient was 0.85. A matrix (120 x 120) of inter-item correlations was calculated, generating a mean of 0.16 for the 120 items, with values ranging from -0.10 to +0.66.
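
Two of these figures can be cross-checked from the others with standard formulas. The sketch below applies the Spearman-Brown prophecy formula to the reported between-forms correlation (0.77), and the standardized form of Cronbach's alpha to the reported mean inter-item correlation (0.16 over 120 items). How the original coefficients were actually computed is not stated on this page; in particular, using the standardized alpha here is an assumption made for illustration, and the inputs are rounded values, so the results are approximate.

```python
def spearman_brown(r_half, k=2):
    """Spearman-Brown prophecy formula: predicted reliability when a test
    is lengthened by a factor k (k=2 steps up from half to full length)."""
    return k * r_half / (1 + (k - 1) * r_half)

def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized Cronbach's alpha from the number of items and the
    mean inter-item correlation."""
    return n_items * mean_inter_item_r / (1 + (n_items - 1) * mean_inter_item_r)

# Inputs as reported in the text.
r_between_forms = 0.77      # Spelling v. Reading halves
n_items = 120
mean_inter_item_r = 0.16

full_test_sb = spearman_brown(r_between_forms)
alpha_std = standardized_alpha(n_items, mean_inter_item_r)

print(f"Spearman-Brown full-test coefficient: {full_test_sb:.2f}")
print(f"Standardized alpha:                   {alpha_std:.2f}")
```

Both results round to the reported values (0.87 and 0.96). The second calculation also shows how a modest mean inter-item correlation can still yield a high alpha when the test contains many items.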




Table 1

Table 2

Table 3

Table 4

Table 5

Table 6
