How censoring China’s open-source coders might backfire


On May 18, thousands of software developers in China woke up to find that their open-source code hosted on Gitee, a state-backed Chinese competitor to the international code repository platform GitHub, had been locked and hidden from public view.

Later that day, Gitee published a statement explaining that the locked code was being manually inspected, because all open-source code would now need to be reviewed before it could be published. The company wrote that it “didn’t have a choice.” Gitee did not respond to MIT Technology Review’s question about why it made the change. However, it is widely believed that the Chinese government has imposed another layer of heavy-handed censorship.

For the open-source software community in China, which celebrates transparency and global collaboration, the move has come as a shock. Code was supposed to be apolitical. These developers fear that the change will discourage people from contributing to open-source projects and that China’s software industry could suffer from a lack of collaboration.

“Code review in OSS aims to improve code quality and build trust between developers. Adding politics to code review will harm both, and eventually backlash against the open-source movement of China,” says Han Xiao, the Berlin-based founder of Jina AI, a commercial open-source software company.

The rise of Gitee

GitHub, founded in 2008 and acquired by Microsoft in 2018, is the go-to platform developers around the world use to publish their code and then critique and learn from each other. Open-source software is code that is publicly available, as opposed to proprietary code held by individuals or companies. Of the 73 million people using GitHub as of 2021, 7.5 million are based in China, making them the largest group outside the United States.

But that level of dependence on the platform made the Chinese government wary, especially since American sanctions against Huawei in 2019 reminded it how much the nation still relies on certain foreign companies and services. GitHub is one such company.

At the same time, the open-source industry was growing fast in China. Tencent and Alibaba were among the first to release their own versions of GitHub. Gitee, backed by the open-source community OSChina, began to lead the domestic competition. So in 2020, China’s Ministry of Industry and Information Technology contracted a consortium of companies and universities, led by Gitee, to grow the existing repository into a “Chinese independent open-source hosting platform.” Gitee now boasts over 8 million users.

Over time, developers began to prefer Gitee to GitHub for a variety of reasons, including performance and cost, as well as protection from foreign interference.

For Daniel Bovensiepen Li, a Beijing-based research scientist, Gitee’s main advantage is its location in mainland China, which makes it faster and more reliable. He says that Gitee’s performance is significantly better than that of GitLab, a similar overseas platform. Li has 24 projects hosted on Gitee that were affected by the latest change.

Institutions that have close ties to the government are more likely to use Gitee. “The military, public universities, and state-owned companies–they are concerned with the fact that GitHub is eventually owned by Microsoft, an American company,” says Thomas Yao, the Shanghai-based founder of GitCafe, one of China’s earliest GitHub-like websites, which he sold to Tencent in 2016. He says that students and amateur developers may be discouraged from using GitHub due to the high costs and difficulty in finding reliable VPN services in China.

The impact

For now, there’s little clue as to what prompted the change, but censorship of certain types of language–profanity, pornography, and politically sensitive words–has been creeping up on the platform for a while. On Gitee’s official and public feedback page, there are multiple user complaints about how projects were censored for unclear reasons, possibly because technical language was mistaken for a sensitive word.

The immediate result of Gitee’s May 18 change was that public projects hosted on the platform suddenly became unavailable without notice. Users complained that this disrupted services or even ruined their business deals. Developers must now submit an application confirming that their code does not violate Chinese law or infringe copyrights before it can be made public again.

Li went through the manual review for all his projects on Gitee, and so far 22 out of 24 have been restored. “Yet, I assume that this review process isn’t a one-time event. So the question is whether the friction of hosting projects in the future will increase,” he says. Still, Li believes that users will stay, since there is no better domestic alternative. “When you code, you are also writing comments or setting up variable names,” Yao says. Developers, he says, would have to think about whether their code could trigger the sensitive-words list while they are writing it.

With almost every other aspect of the internet, the Chinese way of building its own alternative has worked well in recent years. But with open-source software, which is a direct product of cross-border collaboration, China seems to have hit a wall.

“This push to insulate the domestic open-source community from risks arising from the global community is something that very much goes against the core proposition of open-source tech development,” says Rebecca Arcesati, an analyst at the Mercator Institute for China Studies and coauthor of a report on China’s bet on open-source.

Chinese developers don’t want their voices to be silenced in the global software development conversation. They may also be uncomfortable with China’s current direction.

Cutting off China’s global ties early could disrupt the country’s rapid growth in open-source software before it can reap the economic benefits. This is part of a larger concern hanging over China’s tech sector as the government has increased regulations in recent years: is China sacrificing the long-term advantages of tech in exchange for short-term impacts? “We are not there yet.”
