2022 Year in Review
In 2022, BGPKIT as an open-source organization made significant progress. As the founder, I am grateful for all the opportunities and would like to take this time to appreciate all the milestones we achieved. In this post, I will go through some notable changes we made in 2022 and take a look at what we are excited about for 2023.
BGPKIT Parser
There were a number of major features added to BGPKIT Parser in 2022:
v0.7.0 added support for filtering messages by many fields and for reading from uncompressed files.
v0.7.1 added better examples of parallel MRT file processing with rayon.
v0.7.2 added filtering by multiple peer_ip values.
v0.8.0 includes many internal refactorings and brought in the new oneio library to improve the developer experience.
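To give a feel for what filtering by multiple peer IPs means, here is a minimal Python sketch that uses only the standard library. It is not the parser's implementation or API: the `filter_by_peer_ips` helper and the dictionary records are hypothetical stand-ins for parsed BGP messages, and the sketch assumes a message matches when its peer address equals any of the listed addresses.

```python
import ipaddress


def filter_by_peer_ips(elems, peer_ips):
    """Yield only the elements whose peer_ip is one of the given addresses.

    Hypothetical helper for illustration; not part of any BGPKIT library.
    """
    # Parse the addresses once so that different spellings of the same
    # address (for example, compressed vs. expanded IPv6) compare equal.
    wanted = {ipaddress.ip_address(ip) for ip in peer_ips}
    for elem in elems:
        if ipaddress.ip_address(elem["peer_ip"]) in wanted:
            yield elem


# Hand-written records standing in for parsed BGP messages (assumed shape).
elems = [
    {"peer_ip": "192.0.2.1", "prefix": "198.51.100.0/24"},
    {"peer_ip": "2001:db8::1", "prefix": "2001:db8:100::/48"},
    {"peer_ip": "192.0.2.99", "prefix": "203.0.113.0/24"},
]

matched = list(filter_by_peer_ips(elems, ["192.0.2.1", "2001:0db8:0:0:0:0:0:1"]))
print([e["prefix"] for e in matched])
```

Comparing parsed addresses rather than raw strings is a design choice of this sketch, so that an IPv6 peer written in expanded form still matches its compressed form.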
BGPKIT Broker
In 2022, we revised the BGPKIT Broker backend to crawl for estimated file sizes in addition to other fields like timestamps and URLs. This allows us to keep track of MRT file size changes and helps users pick suitable collectors, especially at a time when RIB dumps from a single collector can exceed 1 GB in size.
Figure: RIB sizes over time. The blue line is rrc00 and the yellow line is route-views2.
We also made major revisions to our Rust SDK, adding features like .latest() to get the latest MRT files from each collector.
https://github.com/bgpkit/bgpkit-broker/releases/tag/v0.5.0
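Conceptually, getting the latest file from each collector boils down to grouping files by collector and keeping the newest one. The Python sketch below illustrates only that idea with hand-written records. It does not call the Broker API or the Rust SDK, and the `latest_per_collector` helper, the field names, and the example.org URLs are all assumptions made for illustration.

```python
def latest_per_collector(items):
    """Return a dict mapping each collector to its most recent item.

    Hypothetical helper for illustration; not part of the BGPKIT Broker SDK.
    """
    latest = {}
    for item in items:
        current = latest.get(item["collector"])
        # Keep the item with the largest start timestamp per collector.
        if current is None or item["ts_start"] > current["ts_start"]:
            latest[item["collector"]] = item
    return latest


# Hand-written records with assumed field names and placeholder URLs.
items = [
    {"collector": "rrc00", "ts_start": 1672531200, "url": "https://example.org/rrc00/a"},
    {"collector": "route-views2", "ts_start": 1672531200, "url": "https://example.org/rv2/a"},
    {"collector": "rrc00", "ts_start": 1672534800, "url": "https://example.org/rrc00/b"},
]

latest = latest_per_collector(items)
for collector, item in sorted(latest.items()):
    print(collector, item["ts_start"], item["url"])
```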
Plus, if you would like to deploy a broker instance yourself, we have put significant effort into improving our self-hosting documentation.
https://github.com/bgpkit/bgpkit-broker-backend/blob/main/deployment/README.md
Python Bindings
In addition to the core Rust code base for the parser and broker, we have also added Python bindings for various Rust SDKs. Users can easily parse MRT files directly using the pybgpkit Python library. It has also proven to be usable on cloud-based Jupyter notebooks like Google Colab (examples).
https://github.com/bgpkit/pybgpkit
Monocle
To tie things together, we have also developed our first investigative tool, monocle, to help users quickly find relevant BGP announcements with a suite of easy-to-use utilities. Users with the Rust toolchain installed can run cargo install monocle to install the tool.
https://github.com/bgpkit/monocle
Users can use the following subcommands:
parse: parse single MRT files, remotely or locally
search: find and filter BGP messages across multiple public collectors
time: convert between local time strings and Unix timestamps
whois: find out AS names, ASNs, registration countries, and organizations
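For readers who have not worked with Unix timestamps much, the kind of conversion the time subcommand performs can be sketched in a few lines of standard-library Python. This is not monocle's code: the `to_unix` and `to_rfc3339` helpers are hypothetical, and the sketch assumes that a time string without a UTC offset should be read as UTC.

```python
from datetime import datetime, timezone


def to_unix(time_string):
    """Convert an ISO 8601 / RFC 3339 style string to a Unix timestamp in seconds."""
    dt = datetime.fromisoformat(time_string)
    if dt.tzinfo is None:
        # Assumption of this sketch: strings without an offset are UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def to_rfc3339(unix_ts):
    """Convert a Unix timestamp in seconds to a UTC time string."""
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc).isoformat()


print(to_unix("2022-01-01T00:00:00+00:00"))  # 1640995200
print(to_rfc3339(1640995200))  # 2022-01-01T00:00:00+00:00
```

Having both directions handy is useful when a time window from an incident report needs to be turned into the numeric timestamps that data queries usually expect.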
Web/API and Infrastructure
In 2022, we started to experiment with new cloud-based infrastructure, especially cloud-based databases, for better API stability and developer experience. We ended up selecting Supabase as our PostgreSQL production host, with a self-hosted instance for backup. We are happy with the performance and cost of Supabase and even more excited about the capabilities it brings, such as user authentication, cloud storage, and local development of schema changes.
Based on the new infrastructure, we have started to test a new integrated API system (still in alpha). This allows us to put all of our data access and processing end-points into one location. Based on the new API, we have also developed a newer version of the BGPKIT Broker statistics page: https://alpha.stats.bgpkit.com/
New Datasets
Apart from providing SDKs and data APIs, we have also started providing free access to historical archives of some data that we find interesting. We previously blogged about one of these datasets, the peer-stats dataset:
https://blog.bgpkit.com/peer-stats-dataset/
Here is a list of our currently available datasets at https://data.bgpkit.com/
[peer-stats](https://data.bgpkit.com/peer-stats/): route collector peer statistics (IP, ASN, v4/v6 prefix counts)
[as2rel](https://data.bgpkit.com/as2rel/): AS-level relationships, using all available collectors
[pfx2as](https://data.bgpkit.com/pfx2as/): prefix-to-AS mapping, using all available collectors
[ihr-hegemony](https://data.bgpkit.com/ihr/hegemony/ipv4/global/): mirror of IIJ-IHR's global hegemony score dataset (big shout-out to Romain and Internet Health Report for producing this data)
All of the above datasets are free to use for research or commercial purposes, and here is the acceptable usage agreement.
More Public Repositories
More open-source code repositories are available, some experimental and some for data analysis. You can check out the full list here:
https://github.com/orgs/bgpkit/repositories
Founder's Notes
In late 2022, I joined Cloudflare to continue working on routing security for the public benefit. During the first few months, we built and shipped our new route-leak detection system under the public Radar platform. The BGPKIT suite is now used in production for the BGP data analysis pipeline at Cloudflare. While working full-time now, I am still committed to maintaining the software suite and bringing new features to BGPKIT. For example, this year I ported the Kafka support we use at Cloudflare back to the BGPKIT Broker backend. Folks at Cloudflare are doing great things in the open-source realm, and the BGPKIT software suite will continue to become more useful to BGP enthusiasts and remain completely open-source.
Quote from Cloudflare's route-leak detection system blog.
Looking ahead to 2023, here are a few things that I am really excited to work on for the BGPKIT suite:
continue improving the infrastructure of the system and adding new data processing pipelines
continue improving the parser's performance and reliability
add support for new RFCs to the parser (e.g. RFC 9234 for route-leak prevention)
productionize the new API and stats website
write more examples and documentation (don't we all love that)
And yeah, we will continue to be open-source first!
If you like what we do here, please consider subscribing to our blog. For all code repositories, check out our GitHub page.