Open Source Highlights

Trends and Insights from GitHub 2022

We analyzed more than 5 billion rows of GitHub event data, and this report presents the results. It covers notable findings about open source software on GitHub in 2022, including:

Top languages in the open source world over the past four years

This chart ranks programming languages by year from 2019 to 2022. The ranking is based on each language's share of new repositories, that is, the number of new repositories using the language divided by the number of all new repositories.
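
The share-based ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the query used for the report: the function name `rank_languages`, the `(repo, language)` input shape, and the sample data are all assumptions made for the example.

```python
from collections import Counter

def rank_languages(new_repos):
    """Rank languages by their share of new repositories in one year.

    new_repos: list of (repo_name, primary_language) pairs (hypothetical shape).
    Returns a list of (rank, language, share) tuples, best rank first.
    """
    new_repos = list(new_repos)
    total = len(new_repos)  # denominator: all new repositories
    counts = Counter(lang for _, lang in new_repos if lang)
    shares = {lang: n / total for lang, n in counts.items()}
    # Sort by share (descending); break ties alphabetically for stable output.
    ordered = sorted(shares.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, lang, share) for rank, (lang, share) in enumerate(ordered, start=1)]

# Made-up sample data, for illustration only.
sample = [("a", "JavaScript"), ("b", "Python"), ("c", "JavaScript"), ("d", "Java")]
ranking = rank_languages(sample)
for rank, lang, share in ranking:
    print(f"#{rank} {lang}: {share:.0%}")
```

Running the same function once per year and comparing ranks gives the year-over-year movement shown in the chart.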


Python surpassed Java and moved to #3 in 2021. TypeScript rose from #10 to #6, and SCSS rose from #39 to #19. The rise of SCSS suggests that open source projects that value front-end expressiveness are gradually gaining popularity. Ruby and R both dropped considerably in the rankings over the years.
Additional Notes

Rankings of back-end programming languages

The programming languages used in a pull request reflect which languages developers used. To find out the most popular back-end programming languages, we queried the distribution of programming languages by new pull requests from 2019 to 2022 and took the top 10 for each year.
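
A minimal sketch of the "top 10 per year" step, assuming the pull request data has already been reduced to `(year, language)` pairs restricted to back-end languages. The function name and the sample records are hypothetical.

```python
from collections import Counter, defaultdict

def top_languages_by_year(pull_requests, n=10):
    """Return the n languages with the most new pull requests for each year.

    pull_requests: iterable of (year, language) pairs (hypothetical shape).
    """
    per_year = defaultdict(Counter)
    for year, language in pull_requests:
        per_year[year][language] += 1
    return {
        year: [lang for lang, _ in counts.most_common(n)]
        for year, counts in sorted(per_year.items())
    }

# Made-up sample data, for illustration only.
sample_prs = [
    (2021, "Python"), (2021, "Python"), (2021, "Go"),
    (2022, "Java"), (2022, "Python"), (2022, "Python"),
]
top = top_languages_by_year(sample_prs, n=1)
print(top)
```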


Python and Java rank #1 and #2 respectively. In 2021, Go overtook Ruby to rank #3. Rust has been trending upward for several years, ranking #9 in 2022.

Geographic distribution of developer behavior

We queried the number of various events that occurred throughout the world from January 1 to September 30, 2022 and identified the top 10 countries by the number of events triggered by developers in these countries. The chart displays the proportion of each event type by country or region.
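
As a rough sketch of this calculation, the snippet below picks the countries with the most events and then computes each event type's share within a country. The `(country, event_type)` input shape, the function name, and the sample events are assumptions for illustration. The report does not say how developers were mapped to countries.

```python
from collections import Counter, defaultdict

def event_shares_by_country(events, top_n=10):
    """Compute per-country event type shares for the top_n most active countries.

    events: iterable of (country_code, event_type) pairs (hypothetical shape).
    """
    per_country = defaultdict(Counter)
    for country, event_type in events:
        per_country[country][event_type] += 1
    totals = {c: sum(counts.values()) for c, counts in per_country.items()}
    # Most events first; ties broken by country code.
    top = sorted(totals, key=lambda c: (-totals[c], c))[:top_n]
    return {
        c: {etype: n / totals[c] for etype, n in per_country[c].items()}
        for c in top
    }

# Made-up sample data, for illustration only.
sample_events = [
    ("US", "PushEvent"), ("US", "PullRequestReviewEvent"),
    ("US", "PushEvent"), ("US", "WatchEvent"),
    ("CN", "WatchEvent"), ("CN", "PushEvent"),
]
shares = event_shares_by_country(sample_events, top_n=1)
print(shares)
```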


The events triggered in the top 10 countries account for about 23.27% of all GitHub events. However, developers from these countries make up only 10% of all developers.
🇺🇸 US developers are most likely to review code, with a PullRequestReviewEvent share of 6.15%.
🇨🇳 Chinese developers like to star repositories, with 17.23% for WatchEvent and 2.7% for ForkEvent.
🇩🇪 German developers like to open issues and comment, with IssueEvent and CommentEvent accounting for 4.18% and 12.66% respectively.
🇰🇷 Korean developers prefer pushing directly to repositories (PushEvent).
🇯🇵 Japanese developers are most likely to submit code via pull requests, with a PullRequestEvent share of 10%.

Developer behavior distribution on weekdays and weekends

We queried the distribution of each event type over the seven days of the week.
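
One way to compute such a distribution is sketched below, assuming events arrive as `(event_type, ISO 8601 timestamp)` pairs. The function name, the input shape, and the sample timestamps are hypothetical, and the sketch ignores time zones.

```python
from collections import Counter, defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def weekday_distribution(events):
    """For each event type, return the share of events on each day of the week.

    events: iterable of (event_type, iso_timestamp) pairs (hypothetical shape).
    """
    per_type = defaultdict(Counter)
    for event_type, timestamp in events:
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        day = WEEKDAYS[datetime.fromisoformat(timestamp).weekday()]
        per_type[event_type][day] += 1
    result = {}
    for event_type, counts in per_type.items():
        total = sum(counts.values())
        result[event_type] = {day: counts[day] / total for day in WEEKDAYS}
    return result

# Made-up sample data: two Tuesdays, one Wednesday, one Saturday.
sample_week = [
    ("PullRequestEvent", "2022-03-01T10:00:00"),
    ("PullRequestEvent", "2022-03-01T12:00:00"),
    ("PullRequestEvent", "2022-03-02T09:30:00"),
    ("PullRequestEvent", "2022-03-05T09:00:00"),
]
distribution = weekday_distribution(sample_week)
print(distribution["PullRequestEvent"])
```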

The distribution of specific events


Pull Request Event, Pull Request Review Event, and Issues Event all peak on Tuesdays and are lowest on weekends. Push Event, Watch Event, and Fork Event activity is similar on weekdays and weekends, while Pull Request Review Event differs the most. This suggests that Watch Events and Fork Events are more personal behaviors, Pull Request Review Events are more work-related, and Push Events are used more in personal projects.

The most active repositories over the past four years

Here we looked up the top 20 active repositories for each year from 2019 to 2022 and counted how many times each repository made the list. Repository activity is ranked by the number of developers participating in collaborative events.
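
The counting of listings can be sketched as follows. The report does not spell out which event types count as collaborative, so the sketch simply assumes the input has already been filtered to those events. The `(year, repo, developer)` shape, the function name, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def count_listings(events, top_n=20):
    """Count how many yearly top_n lists each repository appears in.

    events: iterable of (year, repo, developer) tuples (hypothetical shape),
    assumed to contain collaborative events only.
    """
    developers = defaultdict(set)
    for year, repo, developer in events:
        developers[(year, repo)].add(developer)

    per_year = defaultdict(list)
    for (year, repo), people in developers.items():
        per_year[year].append((len(people), repo))

    listings = Counter()
    for year, repos in per_year.items():
        # Most distinct developers first; ties broken by repository name.
        repos.sort(key=lambda item: (-item[0], item[1]))
        for _, repo in repos[:top_n]:
            listings[repo] += 1
    return listings

# Made-up sample data, for illustration only.
sample_activity = [
    (2021, "org/a", "alice"), (2021, "org/a", "bob"), (2021, "org/b", "alice"),
    (2022, "org/a", "alice"),
    (2022, "org/c", "alice"), (2022, "org/c", "bob"), (2022, "org/c", "carol"),
]
print(count_listings(sample_activity, top_n=1))
```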

Who gave the most stars in 2022

We queried the developers who gave the most stars in 2022, took the top 20, and filtered out accounts of suspected bots. If a developer's number of star events divided by the number of starred repositories is equal to or greater than 2, we suspect this user to be a bot.

  • #1: 136 stars per day
  • #2: 133 stars per day
  • #3: 76 stars per day
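
The bot heuristic described above (star events divided by starred repositories, flagged at 2 or more) can be written as a small check. The function name, the parameter names, and the handling of accounts with zero starred repositories are assumptions for this sketch.

```python
def is_suspected_bot(star_events, starred_repos, threshold=2.0):
    """Flag an account whose star events per starred repository reach the threshold.

    A ratio of 2 or more means the account starred the same repositories
    repeatedly on average, which is unusual for a human.
    """
    if starred_repos == 0:
        # Assumption: an account with no starred repositories is not flagged.
        return False
    return star_events / starred_repos >= threshold

# Made-up accounts: (login, star_events, starred_repos).
accounts = [("user-a", 200, 100), ("user-b", 150, 100), ("user-c", 0, 0)]
humans = [login for login, events, repos in accounts
          if not is_suspected_bot(events, repos)]
print(humans)
```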

The most active developers since 2011

We queried the top 20 most active developers per year since 2011. This time we didn't filter out bot events.


We found that bots make up a steadily growing share of the most active accounts. Bots started to overtake humans in 2013 and reached over 95% in 2022.
Appendix

Term Description

About GitHub events

GitHub events are triggered by user actions, like starring a repository or pushing code.

About time range

In this report, the 2022 data covers January 1, 2022 to September 30, 2022. When comparing 2022 data with another year, we use year-on-year analysis.
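
One reading of this is that each year is compared over the same January 1 to September 30 window. The sketch below follows that reading, which is an assumption rather than something the report states. The function names and the sample dates are hypothetical.

```python
from datetime import date

def in_report_window(day, year):
    """True if the date falls between January 1 and September 30 of the given year."""
    return date(year, 1, 1) <= day <= date(year, 9, 30)

def year_on_year_growth(event_dates, year, previous_year):
    """Relative change in event count between the same window of two years."""
    current = sum(1 for d in event_dates if in_report_window(d, year))
    previous = sum(1 for d in event_dates if in_report_window(d, previous_year))
    if previous == 0:
        return None  # growth is undefined without a baseline
    return (current - previous) / previous

# Made-up event dates; the November 2021 event falls outside the window.
sample_dates = [
    date(2021, 2, 1), date(2021, 6, 1), date(2021, 11, 15),
    date(2022, 1, 10), date(2022, 5, 5), date(2022, 9, 30),
]
growth = year_on_year_growth(sample_dates, 2022, 2021)
print(growth)
```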

About bot events

Bot-triggered events account for a growing percentage of GitHub events. However, these events are not the focus of this report. We filtered out most of the bot-initiated events by matching regular expressions.
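
The report does not list the regular expressions it used. As an illustration only, the sketch below filters on a few naming conventions: GitHub App accounts have logins ending in `[bot]`, while the `-bot` suffix and `bot-` prefix patterns are guesses at common conventions. The names `BOT_PATTERN` and `is_bot_login` are hypothetical.

```python
import re

# Hypothetical patterns; the report's actual expressions are not published here.
BOT_PATTERN = re.compile(r"(\[bot\]$|-bot$|^bot-)", re.IGNORECASE)

def is_bot_login(login):
    """True if the login matches one of the assumed bot naming patterns."""
    return BOT_PATTERN.search(login) is not None

# Made-up event actors, for illustration only.
actors = ["dependabot[bot]", "alice", "renovate-bot", "robotnik"]
human_actors = [login for login in actors if not is_bot_login(login)]
print(human_actors)
```

Name-based filtering like this misses bots with ordinary-looking logins, which is consistent with the report saying that most, not all, bot events were filtered out.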

How we classify technical fields by topics

We do exact matching and fuzzy matching based on the repository topics. Exact matching means that at least one of the repository's topics is identical to the word. Fuzzy matching means that at least one of the repository's topics contains the word.

Topic | Exact matching | Fuzzy matching
GitHub Actions | actions | github-action, gh-action
Low Code | | low-code, lowcode, nocode, no-code
Web3 | | web3
Database | db | database, databases, nosql, newsql, sql, mongodb, neo4j
AI | ai, aiops, aiot | artificial-intelligence, machine-intelligence, computer-vision, image-processing, opencv, computervision, imageprocessing, voice-recognition, speech-recognition, voicerecognition, speechrecognition, speech-processing, machinelearning, machine-learning, deeplearning, deep-learning, transferlearning, transfer-learning, mlops, text-to-speech, tts, speech-synthesis, voice-synthesis, robot, robotics, sentiment-analysis, natural-language-processing, nlp, language-model, text-classification, question-answering, knowledge-graph, knowledge-base, gan, gans, generative-adversarial-network, generative-adversarial-networks, neural-network, neuralnetwork, neuralnetworks, neural-network, dnn, tensorflow, PyTorch, huggingface, transformers, seq2seq, sequence-to-sequence, data-analysis, data-science, object-detection, objectdetection, data-augmentation, classification, action-recognition
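
The two matching modes can be sketched as below, using the Database terms from the table as the example. The function name `matches_field` and the lowercasing of topics are assumptions for the illustration.

```python
def matches_field(repo_topics, exact_terms, fuzzy_terms):
    """True if any topic equals an exact term or contains a fuzzy term."""
    topics = [topic.lower() for topic in repo_topics]
    # Exact matching: a topic is identical to the word.
    if any(topic in exact_terms for topic in topics):
        return True
    # Fuzzy matching: a topic contains the word as a substring.
    return any(term in topic for topic in topics for term in fuzzy_terms)

# Terms taken from the Database row of the table.
DATABASE_EXACT = {"db"}
DATABASE_FUZZY = {"database", "databases", "nosql", "newsql",
                  "sql", "mongodb", "neo4j"}

print(matches_field(["db"], DATABASE_EXACT, DATABASE_FUZZY))              # exact hit
print(matches_field(["graph-database"], DATABASE_EXACT, DATABASE_FUZZY))  # fuzzy hit
print(matches_field(["dbt"], DATABASE_EXACT, DATABASE_FUZZY))             # no hit
```

Keeping a short word like `db` in the exact list avoids false positives such as `dbt`, which a substring match on `db` would otherwise catch.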