If you want to build another insight tool that analyzes data from a source other than GitHub, you're in the right place. This workshop will show you that it's not too hard to get insights from big data, especially big data with real-time inserts and updates.
What you will learn
1. Methodology to set up an insight system
With enough abstraction, setting up an insight system like OSS Insight usually takes 3 steps:
- Find the data source (both historical and real-time data).
- Load data to TiDB (or any other HTAP database).
- Get insights with SQL.
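The three steps above can be sketched in a few lines. This is only a minimal illustration under loud assumptions: the event rows are made-up sample data, the `events` table and both helper functions are hypothetical names, and Python's built-in `sqlite3` stands in for TiDB so the snippet runs without any setup. With a real TiDB cluster you would connect through a MySQL-compatible driver instead.

```python
import sqlite3

# Step 1: find the data source.
# Assumption: these rows are made-up samples shaped like GitHub events
# ("WatchEvent" is the GitHub event type emitted when a repo is starred).
HISTORICAL_EVENTS = [
    (1, "PushEvent", "example/alpha", "2022-01-01"),
    (2, "WatchEvent", "example/alpha", "2022-01-01"),
    (3, "WatchEvent", "example/beta", "2022-01-02"),
    (4, "WatchEvent", "example/alpha", "2022-01-02"),
]


def load_events(conn, events):
    """Step 2: load the data into the database (SQLite stands in for TiDB)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY, type TEXT, repo TEXT, created_at TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", events)
    conn.commit()


def stars_per_repo(conn):
    """Step 3: get an insight with plain SQL (stars per repository)."""
    rows = conn.execute(
        "SELECT repo, COUNT(*) AS stars FROM events "
        "WHERE type = 'WatchEvent' "
        "GROUP BY repo ORDER BY stars DESC, repo"
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
load_events(conn, HISTORICAL_EVENTS)
print(stars_per_repo(conn))
```

The point of the sketch is the shape of the pipeline, not the storage engine: once the data is in a SQL database, each insight is one query.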
2. Knowledge about HTAP databases
It’s best to run your insight tool on a Hybrid Transactional and Analytical Processing (HTAP) database, which makes it easy to handle both of the following:
- Acting as a primary RDBMS that serves highly concurrent requests and real-time inserts and updates
- Providing analytical capability to get insights
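To make the two roles above concrete, here is a small sketch in which one table receives row-level writes and also answers an aggregate query. Everything in it is an assumption for illustration: the `repo_stats` table, the helper names, and the sample values are hypothetical, and the built-in `sqlite3` module stands in for an HTAP database only so the code runs as-is. It says nothing about performance, which is where an HTAP database actually differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repo_stats ("
    "repo TEXT PRIMARY KEY, stars INTEGER, language TEXT)"
)


def upsert_repo(conn, repo, stars, language):
    """Transactional role: insert or update a single row as data arrives."""
    conn.execute(
        "REPLACE INTO repo_stats (repo, stars, language) VALUES (?, ?, ?)",
        (repo, stars, language),
    )
    conn.commit()


def stars_by_language(conn):
    """Analytical role: aggregate over the whole table."""
    rows = conn.execute(
        "SELECT language, SUM(stars) FROM repo_stats "
        "GROUP BY language ORDER BY language"
    )
    return rows.fetchall()


# Made-up sample rows.
upsert_repo(conn, "example/alpha", 10, "Go")
upsert_repo(conn, "example/beta", 5, "Rust")
before = stars_by_language(conn)

# A real-time update to an existing row is visible to the next analytical query.
upsert_repo(conn, "example/alpha", 12, "Go")
after = stars_by_language(conn)

print(before)
print(after)
```

Because both roles use the same table, the analytical query sees the update immediately, with no export step in between.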
We have a 10-minute blog post that explains why we chose TiDB to support OSS Insight. To save you some time, the following diagram shows the architectural differences before and after we adopted TiDB:
Using this architecture means that we don’t need to learn a traditional big data stack (MySQL with sharding -> industrial ETL tools -> OLAP databases -> write-back to MySQL), set up and manage all of that infrastructure, and still end up with only a T+1 (next-day) analysis result.
You can load the data into MySQL instead of TiDB; however, expect performance issues with analytical queries as the data grows.
Ready to learn more? Click the link below and join a workshop. Each one will follow the 3 steps above! 🏃🏃🏃
Join a Workshop!
We have implemented the Mini OSS Insight workshop and are thinking about creating three other workshops. You can try building them yourself with their historical and real-time APIs (a bit of a challenge, but not too hard :-).
- Mini OSS Insight
- NFT Insight
- Hacker News Insight
- Twitter Insight (not ready)
- Stack Overflow Insight (not ready)
If you want to talk further about OSS Insight, please join our offline workshop, where you can get help: Apply!