Why do reputable sources keep false data? How did this happen?

In 2004, Dr. Hwang Woo Suk became famous for his breakthrough discovery of cloned human embryos, and his work was published in the prestigious journal Science. But the discovery turned out to be too good to be true: Dr. Hwang had fabricated the data. Science publicly retracted the paper and assembled a team to investigate what went wrong.

Retractions are often in the news. The high-profile discovery of a room-temperature superconductor was retracted on November 7, 2023. A series of retractions toppled the president of Stanford University on July 19, 2023. Major early COVID-19 studies were found to have serious data problems and were retracted on June 4, 2020.

Retractions are typically viewed negatively: as science not working properly, as an embarrassment to the institutions involved, or as a flaw in the peer-review process. They can be all of those things. But they can also be part of the story of science working properly: finding and correcting errors, and publicly acknowledging when information turns out to be wrong.

A far more pernicious problem arises when information is not, and cannot be, retracted. Many seemingly authoritative sources contain erroneous information. Sometimes the errors are intentional, but sometimes they are not; after all, to err is human. Often there is no mechanism for correction or retraction, meaning that information known to be wrong stays on the books without any indication of its flaws.

This is a particularly damaging problem with government information, which is often treated as a source of reliable data but is prone to error and often has no means of retracting it.

Patent fictions and fraud

Consider patents, documents that contain many technical details that can be useful to scientists. There is no way to retract a patent. And patents often contain errors: although examiners review patents before issuance, they do not verify that the scientific data in the patent is correct.

Moreover, the U.S. Patent and Trademark Office allows patentees to include fictitious experiments and data in patents. This practice, known as “prophetic examples,” is widespread: about 25% of life science patents contain fictitious experiments. The Patent Office requires that prophetic examples be written in the present or future tense, while real experiments can be written in the past tense. But this distinction is lost on nonspecialists, including scientists, who tend to assume that a phrase like “X and Y are mixed at 300 degrees Celsius to achieve a 95 percent yield” describes a real experiment.

Nearly a decade after Science retracted the article on cloned human cells, Dr. Hwang was granted a U.S. patent on his discredited discovery. Unlike the journal article, this patent has not been retracted. The patent office did not investigate the accuracy of the data (indeed, it issued the patent well after the data's inaccuracy was publicly recognized), and nothing on the face of the patent indicates that it contains information that has been retracted elsewhere.

The U.S. Patent and Trademark Office issued a patent to Theranos on December 18, 2018, three months after the company was liquidated following a series of investigations and lawsuits detailing its fraud. The patent has not been retracted, and it carries no notice that the information it contains is erroneous.

This is not an anomaly. In a similar example, Elizabeth Holmes, the former (now jailed) CEO of Theranos, holds patents on her thoroughly discredited claims about a small device that can quickly run multiple tests on a small blood sample. Some of these patents were granted after major newspapers had reported on Theranos’ fraud.

Long-lived bad information

Such false data can be deeply misleading to readers. The system of retractions in scientific journals is not without its critics, but it compares favorably to the alternative of no retractions at all. Without retractions, readers do not know that the information in front of them is incorrect.

My colleague Sumi Kim and I conducted a study of patent-article pairs. We looked at cases where the same scientists published the same information in both a journal article and a patent, and the journal article was later retracted. We found that while citations to the articles dropped steeply after they were retracted, citations to the patents containing the same incorrect information did not decrease.

This was probably because scientific journals place a large red “retracted” notice on retracted articles, informing the reader that the information is incorrect. In contrast, patents have no retraction mechanism, so incorrect information continues to spread.

There are many other examples where seemingly authoritative information is known to be incorrect. The Environmental Protection Agency publishes emissions data provided by companies but not verified by the agency. Similarly, the Food and Drug Administration disseminates official drug information that is generated by drug manufacturers and published without evaluation by the agency.

Consequences of failing to retract information

When incorrect information cannot be easily corrected, there are also economic consequences. The FDA publishes a list of patents that cover brand-name drugs. The FDA will not approve a generic drug unless the manufacturer can prove that every patent covering the drug has expired, is not infringed, or is invalid.

The problem is that the patent list is generated by brand-name drug manufacturers, who have an incentive to list patents that do not actually cover their drugs. This increases the burden on generic drug manufacturers. The list is not reviewed by the FDA or anyone else, and there are few mechanisms for anyone other than the brand-name drug manufacturer to get the FDA to remove a patent from the list.

Even when retractions are possible, they are effective only if readers pay attention. Financial data are sometimes retracted and corrected, but those corrections are not timely. “Markets don’t tend to respond to data revisions,” Paul Donovan, chief economist at UBS Global Wealth Management, told the Wall Street Journal, referring to governments’ revisions to gross domestic product data.

Misinformation is a growing problem. There are no easy answers to solve it. But there are steps that will almost certainly help. One relatively simple step is for trusted sources of data, such as government agencies, to follow the lead of academic journals and create a mechanism to retract erroneous information.