Mainland China was the big winner in the newly released scores on the Program for International Student Assessment, which tests 15-year-old students in dozens of countries in math, reading and science every three years. With 600,000 students from 79 countries and school systems taking the exam in 2018, four provinces in China, which for PISA constitute mainland China, were collectively ranked No. 1 in all three subjects.

But there is good reason to view the scores from mainland China with skepticism, and that’s the subject of this post by Tom Loveless, an expert on student achievement, testing, education policy and K-12 school reform.

A former sixth-grade teacher and Harvard policy professor, Loveless was a senior fellow in governance studies and director of the Brown Center on Education Policy at the Washington-based nonprofit Brookings Institution. He wrote 16 volumes of “The Brown Center Report on American Education,” an annual report analyzing important trends in education.

This isn’t the first time Loveless has commented on PISA scores from China. In 2013, he wrote a post questioning the No. 1 ranking of Shanghai in the 2012 PISA. In that test administration, U.S. students performed no better than average among 65 countries and education systems, as usual.

When the 2012 scores were released in late 2013, the Organization for Economic Cooperation and Development, which sponsors PISA, said the schools that were used in the Shanghai sample represent the city’s 15-year-old population. Loveless, then at the Brookings Institution, and some China experts said that migrant children were routinely excluded from schools in Shanghai, which is wealthier than the rest of China. The OECD has stood by the results.

Incidentally, in the 2018 PISA results, Singapore was second in all three subjects. U.S. students ranked eighth in reading, 11th in science and 30th in math, with scores that have not significantly changed since PISA began in 2000.

By Tom Loveless

The 2018 PISA results are out. Generally, countries scored within an expected range given their past records. Except one. The scores are astonishing for B-S-J-Z, an acronym for the four Chinese provinces that participated: Beijing, Shanghai, Jiangsu and Zhejiang. Out of 77 international systems, B-S-J-Z scored No. 1 in all three subjects: reading, math, and science.

The four Chinese provinces taking PISA changed from 2015 to 2018, with Zhejiang taking the place of Guangdong. The 2018 group’s scores are dramatically higher than those of the 2015 group (which appropriately is called B-S-J-G). In fact, the differences are so large that they are bound to raise eyebrows.

B-S-J-Z’s scores are 61 scale score points higher (494 versus 555) in reading, 60 points higher (531 versus 591) in math, and a whopping 72 points higher (518 versus 590) in science. How uncommon are differences like these? To answer that question, I examined PISA data from 2006-2015.

For each three-year test interval, I computed the changes for each country on the three PISA tests and converted them to absolute values. That produced 497 observations, with a mean of 9.5 points and standard deviation of 8.6.
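The procedure described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the function name and the input layout are assumptions, the scores in the example are made-up placeholders rather than real PISA results, and the choice of the sample standard deviation is also an assumption.

```python
from statistics import mean, stdev


def absolute_interval_changes(scores):
    """Return absolute score changes between consecutive test cycles.

    `scores` maps (country, subject) to a list of scale scores ordered by
    test year, one entry per three-year cycle (hypothetical layout).
    """
    changes = []
    for series in scores.values():
        # Pair each cycle with the next one and keep the size of the change.
        for earlier, later in zip(series, series[1:]):
            changes.append(abs(later - earlier))
    return changes


# Placeholder values for illustration only; these are NOT real PISA scores.
toy_scores = {
    ("Country A", "reading"): [500, 495, 505],
    ("Country A", "math"): [510, 512, 500],
    ("Country B", "reading"): [480, 490, 487],
}

toy_changes = absolute_interval_changes(toy_scores)
toy_mean = mean(toy_changes)
toy_sd = stdev(toy_changes)  # sample standard deviation (an assumption)
```

Run over every country, subject and interval from 2006 to 2015, the same steps would yield the full set of observations summarized above.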

So the typical change in a nation’s scores is about 10 points. The differences between the 2015 and 2018 Chinese participants are at least six times that amount. The differences are also roughly seven or more times the standard deviation of all interval changes. Highly unusual.
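The arithmetic behind that comparison is easy to check. The short Python sketch below uses only the figures quoted in this post. The variable and function names are illustrative, and dividing each gap by the mean and by the standard deviation is one plausible reading of the comparison, not necessarily the exact calculation behind it.

```python
# Scale scores quoted in this post: (2015 B-S-J-G, 2018 B-S-J-Z).
scores = {
    "reading": (494, 555),
    "math": (531, 591),
    "science": (518, 590),
}

MEAN_ABS_CHANGE = 9.5  # mean absolute interval change, 2006-2015
SD_ABS_CHANGE = 8.6    # standard deviation of those changes


def compare(old, new):
    """Return the gap and its ratio to the mean and SD of past changes."""
    diff = new - old
    return diff, diff / MEAN_ABS_CHANGE, diff / SD_ABS_CHANGE


results = {subject: compare(old, new) for subject, (old, new) in scores.items()}

for subject, (diff, vs_mean, vs_sd) in results.items():
    print(f"{subject}: +{diff} points, "
          f"{vs_mean:.1f}x the mean change, {vs_sd:.1f}x the SD")
```

Every gap is more than six times the mean change. For math, the smallest gap, the ratio to the standard deviation comes out at about 6.98, which rounds to seven.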

A reasonable hypothesis is that changing the provinces participating in PISA, even if it was just one out of a group of four, influenced the test scores. Indeed, when I originally composed a thread for Twitter on this topic, I overlooked the change and treated the 2015-2018 score differences as if the participating provinces were the same. I apologize for the error. My mistake does underscore, however, the larger issue: that PISA scores from China should be viewed skeptically.

Why was Guangdong, China’s most populous province, dropped from participating and Zhejiang added? Is it only a coincidence that scores soared after the change?

The past PISA scores of Chinese provinces have been called into question (by me and others) because of the culling effect of hukou, China’s household registration system, on the population of 15-year-olds, and because the OECD allows China to approve which provinces can be tested. In 2009, PISA tests were administered in 12 Chinese provinces, including several rural areas, but only scores from Shanghai were released.

Three years later, the BBC reported, “The Chinese government has so far not allowed the OECD to publish the actual data.” To this day, the data have not been released.

The OECD responded to past criticism by attacking critics and conducting data reviews behind closed doors. A cloud hangs over PISA scores from Chinese provinces. I urge the OECD to release, as soon as possible, the results of any quality checks of 2018 data that have been conducted, along with scores, disaggregated by province, from both the 2015 and 2018 participants.

The credibility of international assessments rests on the transparency of test procedures, including how participants are selected and the rules for reporting test results. The OECD risks undermining the credibility of PISA by not being open about its conduct of the assessment in China.