Scientist discovers deleted coronavirus data from China

Thirteen genetic sequences – isolated from people infected with COVID-19 at the start of the pandemic in China – were mysteriously deleted from an online database last year but have now been recovered.

Jesse Bloom, a computational biologist and viral evolution specialist at the Fred Hutchinson Cancer Research Center in Seattle, found that the sequences had been removed from an online database at the request of scientists in Wuhan, China. But with some internet sleuthing, he was able to retrieve copies of the data stored on Google Cloud.

The sequences do not fundamentally change scientists’ understanding of the origins of COVID-19 – including the thorny question of whether the coronavirus spread naturally from animals to humans or if it escaped in a laboratory accident. But their removal adds to concerns that Chinese government secrecy has hampered international efforts to understand how COVID-19 emerged.

Bloom’s results were released Tuesday in a preprint paper that has not yet been evaluated by other scientists. “I think this is definitely an attempt to hide the sequences,” he told BuzzFeed News.

Bloom learned of the deleted data after reading a paper by a team led by Carlos Farkas at the University of Manitoba in Canada on some of the early genetic sequences of SARS-CoV-2. Farkas’ paper described sequences sampled from hospital outpatients as part of a project by Wuhan researchers who were developing diagnostic tests for the virus. But when Bloom tried to download the sequences from the Sequence Read Archive (SRA), an online database maintained by the U.S. National Institutes of Health, he received error messages saying they had been deleted.

Bloom realized that copies of SRA data are also kept on servers managed by Google, and he was able to work out the URLs where the missing sequences could be found in the cloud. In this way, he recovered 13 genetic sequences that could help answer questions about the evolution of the coronavirus and where it came from.

Bloom found that the deleted sequences, like others collected at later dates outside the city, were more similar to bat coronaviruses – believed to be the ultimate ancestors of the virus that causes COVID-19 – than to the sequences related to the Huanan seafood market in Wuhan. This adds to earlier suggestions that the seafood market may have been an early victim of COVID-19 rather than the place where the coronavirus first passed from animals to humans.

“This is a very interesting study done by Dr Bloom, and in my opinion the analysis is absolutely correct,” Farkas told BuzzFeed News via email. Scott Gottlieb, former head of the Food and Drug Administration, also praised the results on Twitter.

But some scientists were less impressed. “It really doesn’t add anything to the origins debate,” Robert Garry of Tulane University in New Orleans told BuzzFeed News by email. Garry argued that the Huanan Market or other Wuhan markets could still be the source of COVID-19.

Bloom is one of 18 scientists who in May published a letter criticizing the joint WHO-China study on the origins of SARS-CoV-2. The scientists argued that the WHO-China report failed to give proper consideration to the competing ideas that the coronavirus spread naturally from animals to humans or escaped from a laboratory – a theory the report deemed “extremely unlikely”. After the publication of the WHO-China report, the United States and 13 other governments complained that it “did not have access to complete and original data and samples”.

The deleted viral sequences were first uploaded to the SRA in early March 2020, around the time that researchers led by Yan Li and Tiangang Liu of Wuhan University posted a preprint describing their work using genetic sequencing to diagnose COVID-19. A few days earlier, the Chinese State Council had ordered that all documents related to COVID-19 be approved centrally.

The sequences were then removed from the SRA in June, around the time the final version of the paper appeared in a scientific journal. According to the NIH, the authors requested the removal. “The requester indicated that the sequence information had been updated, was submitted to another database and wanted the data to be deleted from SRA to avoid version control issues,” NIH spokesperson Amanda Fine told BuzzFeed News via email.

However, it is not known whether the sequences have since been uploaded to another database.

“There is no plausible scientific reason for the deletion,” Bloom wrote in his preprint, arguing that the sequences were probably “deleted to obscure their existence.” This suggested, he wrote, “a less than sincere effort to trace the early spread of the epidemic.”

Although the sequences were removed, Garry pointed out that the key genetic mutations they contained were still published in a table in the Wuhan team’s published paper. “Jesse Bloom hasn’t found exactly anything new that isn’t already in the scientific literature,” Garry told BuzzFeed News, accusing Bloom of writing his preprint in “an inflammatory way that is unscientific and unnecessary.”

Bloom wrote to the researchers in Wuhan asking why the sequences were removed but received no response. Li and Liu also did not immediately respond to a request for comment from BuzzFeed News.

This isn’t the first time scientists have worried about the deletion of data that could help answer questions about the origins of COVID-19. The main database of coronavirus sequence information maintained by the Wuhan Institute of Virology – which is the subject of speculation about a possible “lab leak” of the virus – was taken offline in September 2019. When the members of the WHO-China team who studied the origins of the pandemic visited the institute in February, they were told that the database, which is said to have included data from 22,000 coronavirus samples and sequence records, had been deleted after repeated hacking attempts.
