<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[BGPKIT Blog]]></title><description><![CDATA[All about BGP data processing tools and resources.]]></description><link>https://blog.bgpkit.com</link><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 08:42:16 GMT</lastBuildDate><atom:link href="https://blog.bgpkit.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[monocle v1.1.0]]></title><description><![CDATA[Monocle v1.1.0 focuses on interface consistency and day-to-day usability. This release simplifies feature gates, standardizes data refresh APIs, and adds quality-of-life improvements across parsing, search, and configuration workflows.
TL;DR

Feature...]]></description><link>https://blog.bgpkit.com/monocle-v110</link><guid isPermaLink="true">https://blog.bgpkit.com/monocle-v110</guid><category><![CDATA[bgp]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Sat, 14 Feb 2026 17:00:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1770763366951/57b2b87b-0db6-4898-8f47-5c262064c780.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Monocle <code>v1.1.0</code> focuses on interface consistency and day-to-day usability. This release simplifies feature gates, standardizes data refresh APIs, and adds quality-of-life improvements across parsing, search, and configuration workflows.</p>
<h2 id="heading-tldr">TL;DR</h2>
<ul>
<li><p>Feature flags are now simplified to <code>lib</code>, <code>server</code>, and <code>cli</code>.</p>
</li>
<li><p>Config and update flows are more consistent (<code>config update</code>, <code>config backup</code>, <code>config sources</code>, and <code>--no-update</code>).</p>
</li>
<li><p>Data refresh APIs are standardized across ASInfo, AS2Rel, RPKI, and Pfx2as.</p>
</li>
<li><p>Cache TTL defaults are unified at 7 days, with clearer staleness reporting.</p>
</li>
<li><p>Parse/search workflows gain multi-value filters, negation filters, field selection, ordering, timestamp format control, and local cache support.</p>
</li>
</ul>
<h2 id="heading-whats-new">What's New</h2>
<h3 id="heading-simpler-feature-flags">Simpler feature flags</h3>
<p>The crate feature model is now reduced to three options:</p>
<ul>
<li><p><code>lib</code>: complete library functionality (database + lenses + display)</p>
</li>
<li><p><code>server</code>: WebSocket server support (implies <code>lib</code>)</p>
</li>
<li><p><code>cli</code>: full command-line binary (implies <code>lib</code> and <code>server</code>)</p>
</li>
</ul>
<p>This replaces the previous multi-tier setup and makes dependency selection easier for downstream users.</p>
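<p>For library consumers, picking a feature tier in <code>Cargo.toml</code> might look like the sketch below. The version requirement and the <code>default-features</code> handling are assumptions for illustration, not taken from the release notes.</p>

```toml
[dependencies]
# Library-only use: skip the CLI binary and WebSocket server dependencies.
# (Hypothetical: adjust if `lib` is already among the crate's default features.)
monocle = { version = "1.1", default-features = false, features = ["lib"] }
```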
<h3 id="heading-standardized-data-refresh-apis">Standardized data refresh APIs</h3>
<p>Database refresh behavior is now more uniform across ASInfo, AS2Rel, RPKI, and Pfx2as:</p>
<ul>
<li><p>Consistent <code>needs_*_refresh(ttl)</code> checks</p>
</li>
<li><p>A shared <code>RefreshResult</code> shape with source and load details</p>
</li>
<li><p>Standardized naming (<code>refresh_*</code>) with compatibility aliases where needed</p>
</li>
<li><p>URL and local-path loading paths available across repositories</p>
</li>
</ul>
<p>This update reduces API drift and makes maintenance code paths more predictable.</p>
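<p>The <code>needs_*_refresh(ttl)</code> idea can be pictured with a small std-only sketch. This is not Monocle's actual implementation; the function name and signature here are hypothetical.</p>

```rust
use std::time::{Duration, SystemTime};

// Hypothetical sketch of a `needs_*_refresh(ttl)`-style check:
// a dataset is stale when its last update is older than the TTL.
fn needs_refresh(last_updated: Option<SystemTime>, ttl: Duration) -> bool {
    match last_updated {
        // Never loaded: always refresh.
        None => true,
        Some(ts) => SystemTime::now()
            .duration_since(ts)
            .map(|age| age > ttl)
            // Clock went backwards: refresh to be safe.
            .unwrap_or(true),
    }
}

fn main() {
    let seven_days = Duration::from_secs(7 * 24 * 3600);
    // Updated an hour ago: fresh under a 7-day TTL.
    let recent = SystemTime::now() - Duration::from_secs(3600);
    assert!(!needs_refresh(Some(recent), seven_days));
    // Never populated: always needs a refresh.
    assert!(needs_refresh(None, seven_days));
}
```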
<h3 id="heading-config-command-updates">Config command updates</h3>
<p>Configuration and maintenance commands now use clearer naming:</p>
<ul>
<li><p><code>monocle config db-refresh</code> -&gt; <code>monocle config update</code></p>
</li>
<li><p><code>monocle config db-backup</code> -&gt; <code>monocle config backup</code></p>
</li>
<li><p><code>monocle config db-sources</code> -&gt; <code>monocle config sources</code></p>
</li>
</ul>
<p>The global no-refresh toggle was also renamed for consistency:</p>
<ul>
<li><code>--no-refresh</code> -&gt; <code>--no-update</code></li>
</ul>
<p>The following command shows the active configuration, cache TTL settings, database status, and server defaults:</p>
<pre><code class="lang-bash">monocle config
</code></pre>
<p>Example output:</p>
<pre><code class="lang-text">Monocle Configuration
=====================

General:
  Config file:    /home/user/.monocle/monocle.toml
  Data dir:       /home/user/.monocle/

Cache TTL:
  ASInfo:         7 days
  AS2Rel:         7 days
  RPKI:           7 days
  Pfx2as:         7 days

Database:
  Path:           /home/user/.monocle/monocle-data.sqlite3
  Status:         exists
  Size:           512.47 MB
  Schema:         initialized (v3)
  ASInfo:         120953 records (updated: 2026-02-02 19:54:01 UTC)
  AS2Rel:         877937 records (updated: 2026-02-02 14:25:34 UTC)
  RPKI:           796899 ROAs, 962 ASPAs (updated: 2026-02-10 19:53:50 UTC)
  Pfx2as:         1580626 records (updated: 2026-02-02 20:02:10 UTC)
</code></pre>
<h3 id="heading-better-cache-control-defaults">Better cache control defaults</h3>
<p>All major data sources now support configurable cache TTL with a 7-day default. This applies to ASInfo, AS2Rel, RPKI, and Pfx2as.</p>
<p><code>monocle config sources</code> now reports staleness based on TTL, so it is easier to see what needs updating.</p>
<p>The following command shows per-source status, staleness, and last update recency:</p>
<pre><code class="lang-bash">monocle config sources
</code></pre>
<p>Example output:</p>
<pre><code class="lang-text">Data Sources:

  Name         Status          Stale      Last Updated
  ------------------------------------------------------------
  asinfo       120953 records  yes        a week ago
  as2rel       877937 records  yes        a week ago
  rpki         797861 records  no         2 hours ago
  pfx2as       1580626 records yes        a week ago

Configuration:
  ASInfo cache TTL: 7 days
  AS2Rel cache TTL: 7 days
  RPKI cache TTL:   7 days
  Pfx2as cache TTL: 7 days
</code></pre>
<h3 id="heading-rpki-improvements">RPKI improvements</h3>
<p>Monocle now supports fetching ROAs via RTR (RPKI-to-Router), including endpoint override support and fallback behavior.</p>
<p>The following command refreshes only RPKI data and uses the provided RTR endpoint for this run instead of the default configured source.</p>
<pre><code class="lang-bash">monocle config update --rpki --rtr-endpoint rtr.rpki.cloudflare.com:8282
</code></pre>
<h3 id="heading-parse-and-search-enhancements">Parse and search enhancements</h3>
<p><code>parse</code> and <code>search</code> gained several output and filtering improvements:</p>
<ul>
<li><p>Multi-value filters with OR semantics</p>
</li>
<li><p>Negation filters using <code>!</code></p>
</li>
<li><p>Validation for ASN/prefix filter inputs</p>
</li>
<li><p><code>--fields</code> for column selection</p>
</li>
<li><p><code>--order-by</code> and <code>--order</code> for sorted output</p>
</li>
<li><p><code>--time-format</code> for unix or RFC3339 display</p>
</li>
<li><p><code>search --cache-dir</code> local file + broker query caching</p>
</li>
</ul>
<p>The following command searches one hour of updates starting at <code>2024-01-01</code>, filters for prefix <code>1.1.1.0/24</code>, and caches downloaded MRT files plus broker query results under <code>/tmp/mrt-cache</code> for faster repeat runs.</p>
<pre><code class="lang-bash">monocle search -t 2024-01-01 -d 1h -p 1.1.1.0/24 --cache-dir /tmp/mrt-cache
</code></pre>
<p>The following command uses multi-value filters with negation to exclude two origin ASNs while also matching either of two peer ASNs:</p>
<pre><code class="lang-bash">monocle search -t 2024-01-01 -d 1h -o <span class="hljs-string">'!13335,!15169'</span> -J 174,2914
</code></pre>
<p>Negation and positive values cannot be mixed within the same filter field.</p>
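<p>That mixing rule can be sketched as a small validation routine. This is an illustrative std-only sketch, not Monocle's actual parsing code; the type and function names are hypothetical.</p>

```rust
// Sketch of parsing a multi-value ASN filter such as "!13335,!15169"
// (exclude) or "174,2914" (include), rejecting mixed forms.
#[derive(Debug, PartialEq)]
enum AsnFilter {
    Include(Vec<u32>),
    Exclude(Vec<u32>),
}

fn parse_asn_filter(input: &str) -> Result<AsnFilter, String> {
    let mut include = Vec::new();
    let mut exclude = Vec::new();
    for part in input.split(',') {
        let part = part.trim();
        if let Some(rest) = part.strip_prefix('!') {
            exclude.push(rest.parse::<u32>().map_err(|e| e.to_string())?);
        } else {
            include.push(part.parse::<u32>().map_err(|e| e.to_string())?);
        }
    }
    match (include.is_empty(), exclude.is_empty()) {
        (false, true) => Ok(AsnFilter::Include(include)),
        (true, false) => Ok(AsnFilter::Exclude(exclude)),
        // Both kinds present (or input empty): reject.
        _ => Err("cannot mix negation and positive values".to_string()),
    }
}

fn main() {
    assert_eq!(
        parse_asn_filter("!13335,!15169"),
        Ok(AsnFilter::Exclude(vec![13335, 15169]))
    );
    // Mixing `!13335` with a plain `15169` is an error.
    assert!(parse_asn_filter("!13335,15169").is_err());
}
```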
<h2 id="heading-breaking-changes-and-migration-notes">Breaking Changes and Migration Notes</h2>
<h3 id="heading-1-feature-flag-migration">1) Feature flag migration</h3>
<p>If you previously used feature tiers like <code>database</code>, <code>lens-core</code>, <code>lens-bgpkit</code>, <code>lens-full</code>, or <code>display</code>, switch to:</p>
<ul>
<li><p><code>lib</code> for library use</p>
</li>
<li><p><code>server</code> for WebSocket API use</p>
</li>
<li><p><code>cli</code> for full command-line use</p>
</li>
</ul>
<h3 id="heading-2-cli-and-subcommand-renames">2) CLI and subcommand renames</h3>
<p>Update scripts and automation:</p>
<ul>
<li><p><code>--no-refresh</code> -&gt; <code>--no-update</code></p>
</li>
<li><p><code>config db-refresh</code> -&gt; <code>config update</code></p>
</li>
<li><p><code>config db-backup</code> -&gt; <code>config backup</code></p>
</li>
<li><p><code>config db-sources</code> -&gt; <code>config sources</code></p>
</li>
</ul>
<h3 id="heading-3-parsesearch-filter-type-updates-library-api">3) Parse/search filter type updates (library API)</h3>
<p><code>ParseFilters</code> moved from scalar optional fields to vector-based values for multi-value and negation support. Library consumers should update filter construction accordingly.</p>
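<p>As a rough illustration of the shape change (field names here are hypothetical stand-ins, not Monocle's exact definitions), migrating a scalar filter to the vector form looks like this:</p>

```rust
// Old style: one optional scalar per filter field.
struct ParseFiltersOld {
    origin_asn: Option<u32>,
}

// New style: vectors, enabling multi-value (OR) and negation filters.
struct ParseFiltersNew {
    origin_asn: Vec<u32>,
}

// A straightforward migration wraps the old scalar into a vector.
fn migrate(old: ParseFiltersOld) -> ParseFiltersNew {
    ParseFiltersNew {
        origin_asn: old.origin_asn.into_iter().collect(),
    }
}

fn main() {
    let new = migrate(ParseFiltersOld { origin_asn: Some(13335) });
    assert_eq!(new.origin_asn, vec![13335]);
    let empty = migrate(ParseFiltersOld { origin_asn: None });
    assert!(empty.origin_asn.is_empty());
}
```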
<h2 id="heading-additional-improvements">Additional Improvements</h2>
<ul>
<li><p>AS name rendering now prefers PeeringDB naming fields before falling back to AS2Org/core names</p>
</li>
<li><p>Data refresh logging now shows specific reasons (empty vs outdated)</p>
</li>
<li><p>Example layout was reorganized to one example per lens</p>
</li>
</ul>
<h2 id="heading-full-change-list">Full Change List</h2>
<p>See the v1.1.0 section in the repository's <code>CHANGELOG.md</code> for the complete list of changes.</p>
]]></content:encoded></item><item><title><![CDATA[BGPKIT Parser v0.14.0 Release and v0.13.0 Highlights]]></title><description><![CDATA[We are pleased to announce the release of BGPKIT Parser v0.14.0. This update introduces support for negative filters and the RPKI-to-Router (RTR) protocol. We also want to highlight key features from the recent v0.13.0 release, including enhanced deb...]]></description><link>https://blog.bgpkit.com/bgpkit-parser-v0140-release</link><guid isPermaLink="true">https://blog.bgpkit.com/bgpkit-parser-v0140-release</guid><category><![CDATA[rpki]]></category><category><![CDATA[bgp]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Thu, 25 Dec 2025 00:47:46 GMT</pubDate><content:encoded><![CDATA[<p>We are pleased to announce the release of <strong>BGPKIT Parser v0.14.0</strong>. This update introduces support for negative filters and the RPKI-to-Router (RTR) protocol. We also want to highlight key features from the recent v0.13.0 release, including enhanced debugging tools.</p>
<h2 id="heading-v0140-features">v0.14.0 Features</h2>
<h3 id="heading-negative-filter-support">Negative Filter Support</h3>
<p>A frequent request has been the ability to filter <em>out</em> specific data points. We have added support for negative filters across most filter types, allowing exclusion of specific origins, prefixes, peers, or communities.</p>
<p>In the CLI, use the <code>!=</code> operator. For example, to process all records <em>except</em> those originating from AS 13335:</p>
<pre><code class="lang-bash">bgpkit-parser https://spaces.bgpkit.org/parser/update-example.gz --filter <span class="hljs-string">"origin_asn!=13335"</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766623389524/e052484f-bc35-4d6e-ab6c-ba4727750396.png" alt class="image--center mx-auto" /></p>
<p>In Rust code, use the <code>!</code> prefix:</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> parser = BgpkitParser::new(<span class="hljs-string">"..."</span>)
    .add_filter(<span class="hljs-string">"!origin_asn"</span>, <span class="hljs-string">"13335"</span>)
    .add_filter(<span class="hljs-string">"!peer_ip"</span>, <span class="hljs-string">"192.168.1.1"</span>);
</code></pre>
<p>Supported negative filters include <code>!origin_asn</code>, <code>!prefix</code>, <code>!peer_ip</code>, <code>!peer_asn</code>, <code>!type</code>, <code>!as_path</code>, <code>!community</code>, and <code>!ip_version</code>.</p>
<h3 id="heading-rpki-rtr-protocol-support">RPKI RTR Protocol Support</h3>
<p>We have added support for the RPKI-to-Router (RTR) protocol, covering both version 0 (<a target="_blank" href="https://datatracker.ietf.org/doc/html/rfc6810">RFC 6810</a>) and version 1 (<a target="_blank" href="https://datatracker.ietf.org/doc/html/rfc8210">RFC 8210</a>).</p>
<p>The new <code>models::rpki::rtr</code> and <code>parser::rpki::rtr</code> modules allow developers to build custom RTR clients or servers.</p>
<p>Here is how you can use the library to connect to an RTR server and request data:</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> bgpkit_parser::models::rpki::rtr::*;
<span class="hljs-keyword">use</span> bgpkit_parser::parser::rpki::rtr::{read_rtr_pdu, RtrEncode, RtrError};
<span class="hljs-keyword">use</span> std::net::TcpStream;
<span class="hljs-keyword">use</span> std::io::Write;

<span class="hljs-comment">// 1. Connect to the RTR server</span>
<span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> stream = TcpStream::connect(<span class="hljs-string">"rtr.rpki.cloudflare.com:8282"</span>)?;

<span class="hljs-comment">// 2. Send a Reset Query to request the full database</span>
<span class="hljs-keyword">let</span> reset_query = RtrResetQuery::new_v1();
stream.write_all(&amp;reset_query.encode())?;

<span class="hljs-comment">// 3. Read the response PDUs</span>
<span class="hljs-keyword">loop</span> {
    <span class="hljs-keyword">match</span> read_rtr_pdu(&amp;<span class="hljs-keyword">mut</span> stream)? {
        RtrPdu::IPv4Prefix(p) =&gt; {
            <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Received IPv4 ROA: {}/{}-{} -&gt; AS{}"</span>, 
                p.prefix, p.prefix_length, p.max_length, p.asn);
        }
        RtrPdu::EndOfData(_) =&gt; <span class="hljs-keyword">break</span>,
        _ =&gt; {}
    }
}
</code></pre>
<p>We have included a fully functional RTR client example that connects to a server, fetches ROAs, and performs route validation.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766623418936/d4509343-f9bd-46ca-b5aa-9a06ddf9620b.png" alt class="image--center mx-auto" /></p>
<p>You can run the example yourself:</p>
<pre><code class="lang-bash">cargo run --example rtr_client -- rtr.rpki.cloudflare.com 8282
</code></pre>
<h2 id="heading-in-case-you-missed-it-v0130">In Case You Missed It: v0.13.0</h2>
<p>The v0.13.0 release introduced several improvements for debugging and analyzing MRT data.</p>
<h3 id="heading-record-level-output">Record-Level Output</h3>
<p>The CLI supports inspecting individual MRT records rather than just parsed BGP elements. This aids in debugging parser issues or analyzing raw MRT files.</p>
<p>Switch to record-level output with <code>--level records</code> and choose a format (e.g., JSON):</p>
<pre><code class="lang-bash">bgpkit-parser https://spaces.bgpkit.org/parser/update-example.gz --level records --format json
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766623443381/f7c2d7e2-864a-454c-ae3e-eb9dad7c3b28.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-raw-bytes-access">Raw Bytes Access</h3>
<p>For developers, <code>RawMrtRecord</code> now includes a <code>header_bytes</code> field, and the <code>raw_bytes</code> field has been renamed to <code>message_bytes</code>. This provides access to the exact bytes of the MRT header and the message body as they appeared on the wire, enabling byte-for-byte export and debugging without re-encoding.</p>
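<p>One way to picture the byte-for-byte export: concatenate the header and message bytes back together. The struct below is an illustrative stand-in, not bgpkit-parser's actual <code>RawMrtRecord</code> definition.</p>

```rust
// Illustrative stand-in for a record carrying the raw on-the-wire bytes.
struct RawRecord {
    header_bytes: Vec<u8>,
    message_bytes: Vec<u8>,
}

// Re-assemble the exact wire bytes without re-encoding anything.
fn to_wire_bytes(rec: &RawRecord) -> Vec<u8> {
    let mut out = Vec::with_capacity(rec.header_bytes.len() + rec.message_bytes.len());
    out.extend_from_slice(&rec.header_bytes);
    out.extend_from_slice(&rec.message_bytes);
    out
}

fn main() {
    let rec = RawRecord {
        header_bytes: vec![0x00, 0x01],
        message_bytes: vec![0xAB, 0xCD, 0xEF],
    };
    assert_eq!(to_wire_bytes(&rec), vec![0x00, 0x01, 0xAB, 0xCD, 0xEF]);
}
```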
<h3 id="heading-other-improvements">Other Improvements</h3>
<ul>
<li><p><strong>Testing &amp; Fuzzing</strong>: Added a <code>cargo-fuzz</code> harness and initial fuzz targets.</p>
</li>
<li><p><strong>Performance</strong>: Continued optimizations for faster processing.</p>
</li>
</ul>
<p>Check out the full <a target="_blank" href="https://github.com/bgpkit/bgpkit-parser/blob/main/CHANGELOG.md">CHANGELOG</a> for more details.</p>
<hr />
<p><em>Happy Parsing!</em></p>
]]></content:encoded></item><item><title><![CDATA[BGPKIT Broker v0.9.0: Better Pagination and New Collectors]]></title><description><![CDATA[We are excited to announce the release of BGPKIT Broker v0.9.0. This release introduces total count support for efficient pagination, making it easier to build data exploration interfaces and manage large result sets. We also expand our collector cov...]]></description><link>https://blog.bgpkit.com/bgpkit-broker-v090</link><guid isPermaLink="true">https://blog.bgpkit.com/bgpkit-broker-v090</guid><category><![CDATA[bgp]]></category><category><![CDATA[Rust]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Sat, 01 Nov 2025 14:54:12 GMT</pubDate><content:encoded><![CDATA[<p>We are excited to announce the release of BGPKIT Broker v0.9.0. This release introduces total count support for efficient pagination, making it easier to build data exploration interfaces and manage large result sets. We also expand our collector coverage with two new Internet Exchange points.</p>
<h2 id="heading-previously-on-bgpkit-broker">Previously on BGPKIT Broker</h2>
<p>Since the major v0.7 release that unified the architecture into a single SQLite-backed CLI application, BGPKIT Broker has been serving the community with stable uptime and continuous data coverage. The v0.8 series brought several enhancements including automated backup systems, configuration validation, and convenient query shortcuts.</p>
<p>However, one common challenge remained for developers building interfaces on top of Broker: pagination. When querying large time ranges, users needed to fetch all results to know the total count, making it inefficient to build proper pagination controls or display result statistics.</p>
<h2 id="heading-v09-efficient-pagination-support">V0.9: Efficient Pagination Support</h2>
<p>Version 0.9.0 addresses pagination needs with total count support at both the SDK and API levels. This allows applications to fetch result counts independently from the data itself, enabling responsive user interfaces without over-fetching data.</p>
<h3 id="heading-sdk-querytotalcount-method">SDK: <code>query_total_count()</code> Method</h3>
<p>The SDK now provides a dedicated <code>query_total_count()</code> method that returns the total number of matching items without retrieving the actual data. This is useful when you need to know how many results exist before deciding how to paginate through them.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> bgpkit_broker::BgpkitBroker;

<span class="hljs-meta">#[tokio::main]</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">main</span></span>() {
    <span class="hljs-keyword">let</span> broker = BgpkitBroker::new()
        .ts_start(<span class="hljs-string">"2024-10-01"</span>)
        .ts_end(<span class="hljs-string">"2024-10-02"</span>)
        .project(<span class="hljs-string">"route-views"</span>);

    <span class="hljs-comment">// Get total count without fetching items</span>
    <span class="hljs-keyword">let</span> total = broker.query_total_count().<span class="hljs-keyword">await</span>.unwrap();
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Total matching items: {}"</span>, total);

    <span class="hljs-comment">// Now fetch paginated results</span>
    <span class="hljs-keyword">let</span> result = broker
        .page_size(<span class="hljs-number">100</span>)
        .page(<span class="hljs-number">1</span>)
        .query()
        .<span class="hljs-keyword">await</span>
        .unwrap();

    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Retrieved {} items out of {} total"</span>, 
             result.data.len(), 
             result.total.unwrap_or(<span class="hljs-number">0</span>));
}
</code></pre>
<p>This method executes a <code>COUNT(*)</code> query against the database with the same filters as your main query, returning just the number rather than all the data. For large time ranges with thousands of results, this can save significant bandwidth and processing time.</p>
<h3 id="heading-api-total-field-in-search-results">API: Total Field in Search Results</h3>
<p>The <code>/v3/search</code> endpoint now includes a <code>total</code> field in every response, providing the total count of matching items alongside the paginated results. This enhancement makes it straightforward to build pagination controls in web interfaces or CLI tools.</p>
<pre><code class="lang-bash">curl <span class="hljs-string">"https://api.bgpkit.com/v3/broker/search?ts_start=2024-10-01&amp;ts_end=2024-10-02&amp;page_size=10&amp;page=1"</span>
</code></pre>
<p>Response:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"total"</span>: <span class="hljs-number">1234</span>,
  <span class="hljs-attr">"data"</span>: [
    {
      <span class="hljs-attr">"ts_start"</span>: <span class="hljs-number">1727740800</span>,
      <span class="hljs-attr">"ts_end"</span>: <span class="hljs-number">1727740800</span>,
      <span class="hljs-attr">"collector"</span>: <span class="hljs-string">"route-views2"</span>,
      <span class="hljs-attr">"project"</span>: <span class="hljs-string">"route-views"</span>,
      <span class="hljs-attr">"url"</span>: <span class="hljs-string">"..."</span>,
      ...
    }
  ]
}
</code></pre>
<p>With the <code>total</code> field, client applications can:</p>
<ul>
<li><p>Display "showing X of Y results" to users</p>
</li>
<li><p>Calculate the number of pages needed for pagination</p>
</li>
<li><p>Decide whether to fetch all results or paginate</p>
</li>
<li><p>Provide accurate progress indicators for data processing</p>
</li>
</ul>
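<p>For instance, deriving the number of pages from <code>total</code> is simple ceiling division (a client-side sketch, not part of the Broker SDK):</p>

```rust
// Number of pages needed to cover `total` items at `page_size` per page.
// Ceiling division; assumes total + page_size does not overflow u64.
fn page_count(total: u64, page_size: u64) -> u64 {
    if page_size == 0 {
        return 0;
    }
    (total + page_size - 1) / page_size
}

fn main() {
    // The example response above reports total = 1234 with page_size = 10.
    assert_eq!(page_count(1234, 10), 124);
    assert_eq!(page_count(1000, 100), 10);
    assert_eq!(page_count(0, 100), 0);
}
```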
<h3 id="heading-new-routeviews-collectors">New RouteViews Collectors</h3>
<p>This release adds support for two new RouteViews collectors:</p>
<ul>
<li><p><strong>hkix.hkg</strong>: Hong Kong Internet Exchange (HKIX) collector in Hong Kong</p>
</li>
<li><p><strong>ix-br.gru</strong>: <a target="_blank" href="http://IX.br">IX.br</a> (<a target="_blank" href="http://PTT.br">PTT.br</a>) in São Paulo</p>
</li>
</ul>
<p>These additions enhance BGPKIT Broker's global coverage, providing more diverse vantage points into internet routing behavior. Users who update to v0.9.0 can run <code>bgpkit-broker update</code> to automatically bootstrap data for these new collectors.</p>
<h2 id="heading-behind-the-scenes-improvements">Behind the Scenes Improvements</h2>
<p>Beyond the user-facing features, we also made several code improvements:</p>
<ul>
<li><p>Refactored bootstrap download logging to centralize progress reporting and eliminate redundant code paths</p>
</li>
<li><p>Updated the oneio dependency to v0.20.0 with optimized feature flags for better handling of rustls providers</p>
</li>
<li><p>Enhanced test coverage overall</p>
</li>
</ul>
<h2 id="heading-upgrading-to-v090">Upgrading to v0.9.0</h2>
<h3 id="heading-for-sdk-users">For SDK Users</h3>
<p>Update your <code>Cargo.toml</code> to use the latest v0.9 release:</p>
<pre><code class="lang-toml"><span class="hljs-section">[dependencies]</span>
<span class="hljs-attr">bgpkit-broker</span> = <span class="hljs-string">"0.9"</span>
</code></pre>
<p>Then run:</p>
<pre><code class="lang-bash">cargo update
</code></pre>
<p>The new <code>query_total_count()</code> method is immediately available for use. The <code>total</code> field in <code>BrokerQueryResult</code> is backward compatible as an optional field.</p>
<h3 id="heading-for-self-hosted-instances">For Self-Hosted Instances</h3>
<p>If you run your own BGPKIT Broker instance, update to the latest version:</p>
<pre><code class="lang-bash">cargo install --force bgpkit-broker --version 0.9.0 --features cli
</code></pre>
<p>Or pull the latest Docker image:</p>
<pre><code class="lang-bash">docker pull bgpkit/bgpkit-broker:latest
</code></pre>
<p>After upgrading, run the update command to add the new collectors:</p>
<pre><code class="lang-bash">bgpkit-broker update --db-path your-database.sqlite3
</code></pre>
<p>The update process will automatically detect and bootstrap historical data for the two new collectors.</p>
<h2 id="heading-looking-forward">Looking Forward</h2>
<p>BGPKIT Broker continues to serve as a fundamental component for BGP data processing pipelines. With v0.9.0, we focused on making the developer experience better for building applications that need pagination and result statistics. We continue to maintain the public Broker instance at <a target="_blank" href="https://api.bgpkit.com/v3/broker"><code>https://api.bgpkit.com/v3/broker</code></a> with 99.996% uptime, and encourage users to deploy on-premise instances for production pipelines to reduce external dependencies.</p>
<p>For the full v0.9.0 release notes, please check out our <a target="_blank" href="https://github.com/bgpkit/bgpkit-broker/releases/tag/v0.9.0">GitHub release page</a>. If you have any comments, please drop us a message on <a target="_blank" href="https://twitter.com/bgpkit">Twitter</a>, <a target="_blank" href="https://infosec.exchange/@bgpkit">Mastodon</a>, <a target="_blank" href="https://bsky.app/profile/bgpkit.com">Bluesky</a> or <a target="_blank" href="mailto:contact@bgpkit.com">email</a>.</p>
<h2 id="heading-get-started">Get Started</h2>
<p>Ready to try out the new features? Here are some resources:</p>
<ul>
<li><p><strong>GitHub</strong>: <a target="_blank" href="http://github.com/bgpkit/bgpkit-broker">github.com/bgpkit/bgpkit-broker</a></p>
</li>
<li><p><strong>Examples</strong>: Browse <a target="_blank" href="https://github.com/bgpkit/bgpkit-broker/tree/main/examples">example code</a> demonstrating common usage patterns</p>
</li>
<li><p><strong>Public API</strong>: Try queries at <a target="_blank" href="https://api.bgpkit.com">https://api.bgpkit.com</a></p>
</li>
</ul>
<p>Have questions or feedback? Open an issue on <a target="_blank" href="https://github.com/bgpkit/bgpkit-broker/issues">GitHub</a> or join discussions in our <a target="_blank" href="https://discord.gg/XDaAtZsz6b">Discord Channel</a>.</p>
<h2 id="heading-8jslg">💖</h2>
<p>If you find our libraries and services useful, we would highly appreciate if you consider sponsoring us on <a target="_blank" href="https://github.com/sponsors/bgpkit">GitHub</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Run BGPKIT on Cloudflare Containers]]></title><description><![CDATA[For the longest time, I’ve been using Cloudflare exclusively for web/API hosting and relatively lightweight tasks. For most of my work on BGP, there is really not much that I can accomplish with just JavaScript/TypeScript (except maybe working wi...]]></description><link>https://blog.bgpkit.com/run-bgpkit-on-cloudflare-containers</link><guid isPermaLink="true">https://blog.bgpkit.com/run-bgpkit-on-cloudflare-containers</guid><category><![CDATA[#cloudflare-containers]]></category><category><![CDATA[bgp]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[cloudflare-worker]]></category><category><![CDATA[Docker]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Mon, 30 Jun 2025 01:06:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/1cqIcrWFQBI/upload/e175d8cdd35a205e176085ff9a7899c9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For the longest time, I’ve been using <a target="_blank" href="https://developers.cloudflare.com/">Cloudflare</a> exclusively for web/API hosting and relatively lightweight tasks. For most of my work on BGP, there is not much I can accomplish with just JavaScript/TypeScript (except maybe working with the <a target="_blank" href="https://ris-live.ripe.net/">RIS Live</a> WebSocket). The computationally intensive nature of most BGP data processing doesn't naturally fit within the typical Cloudflare developer platform.</p>
<p>This changes with the recent <a target="_blank" href="https://blog.cloudflare.com/containers-are-available-in-public-beta-for-simple-global-and-programmable/">announcement</a> of <a target="_blank" href="https://developers.cloudflare.com/containers/">Cloudflare Containers</a>. In short, it allows developers to build and run custom containers on Cloudflare’s platform, enabling heavy workloads to mix with other platform primitives in a unified deployment.</p>
<p>In this blog post, I will show you how to build a BGP data search API with BGPKIT and deploy it on Cloudflare Containers. The source code is available <a target="_blank" href="https://github.com/bgpkit/bgpkit-cf-containers">on GitHub</a>.</p>
<h2 id="heading-bgp-data-search-api">BGP Data Search API</h2>
<p>For this example, I will show you how to build a straightforward HTTP API that accepts search parameters, lets BGPKIT fetch and parse BGP archives, and returns the parsed messages.</p>
<p>The API accepts four parameters: <code>collector</code>, <code>prefix</code>, <code>ts_start</code>, and <code>ts_end</code>, to filter and parse BGP archives efficiently.</p>
<ul>
<li><p><code>collector</code>: the BGP route collector ID to use (e.g. <code>rrc00</code> or <code>route-views2</code>). We want to limit the search to a single collector.</p>
</li>
<li><p><code>prefix</code>: the IP prefix the BGP updates must be relevant to. An open-ended search will burn through resources quickly, but you can opt out of this requirement.</p>
</li>
<li><p><code>ts_start</code> and <code>ts_end</code>: the starting and ending timestamps. The goal is to limit the search to parsing a very small number of MRT files (e.g., giving both the same value will do). Large-scale data crunching is best left to an environment with more CPU power.</p>
</li>
</ul>
<p>The parameters are defined in a struct so they can be passed to an <code>axum</code> GET route:</p>
<pre><code class="lang-rust"><span class="hljs-meta">#[derive(Deserialize, Serialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">Params</span></span> {
    collector: <span class="hljs-built_in">String</span>,
    prefix: <span class="hljs-built_in">String</span>,
    ts_start: <span class="hljs-built_in">String</span>,
    ts_end: <span class="hljs-built_in">String</span>,
}
</code></pre>
<p>With the given parameters, we first find the relevant MRT files by setting timestamp and collector filters on a <code>BgpkitBroker</code> instance:</p>
<pre><code class="lang-rust">        <span class="hljs-keyword">let</span> files = <span class="hljs-keyword">match</span> bgpkit_broker::BgpkitBroker::new()
            .ts_end(ts_end.clone())
            .ts_start(ts_start.clone())
            .collector_id(collector.clone())
            .query(){
            <span class="hljs-literal">Ok</span>(items) =&gt; items,
            <span class="hljs-literal">Err</span>(e) =&gt; {
                <span class="hljs-keyword">return</span> Json(<span class="hljs-built_in">Result</span> {
                    error: <span class="hljs-literal">Some</span>(e.to_string()),
                    data: <span class="hljs-built_in">vec!</span>[],
                    meta: <span class="hljs-literal">None</span>,
                });
            }
        };
</code></pre>
<p>For each file, we parse the whole MRT file and collect the BGP updates relevant to the target prefix:</p>
<pre><code class="lang-rust">        <span class="hljs-keyword">for</span> file <span class="hljs-keyword">in</span> files {
            <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> parser = <span class="hljs-keyword">match</span> bgpkit_parser::BgpkitParser::new(file.url.as_str()){
                <span class="hljs-literal">Ok</span>(parser) =&gt; parser,
                <span class="hljs-literal">Err</span>(e) =&gt; {
                    <span class="hljs-keyword">return</span> Json(<span class="hljs-built_in">Result</span> {
                        error: <span class="hljs-literal">Some</span>(e.to_string()),
                        data: <span class="hljs-built_in">vec!</span>[],
                        meta: <span class="hljs-literal">None</span>,
                    });
                }
            };

            parser = <span class="hljs-keyword">match</span> parser.add_filter(<span class="hljs-string">"prefix"</span>, prefix.as_str()){
                <span class="hljs-literal">Ok</span>(parser) =&gt; parser,
                <span class="hljs-literal">Err</span>(e) =&gt; {
                    <span class="hljs-keyword">return</span> Json(<span class="hljs-built_in">Result</span> {
                        error: <span class="hljs-literal">Some</span>(e.to_string()),
                        data: <span class="hljs-built_in">vec!</span>[],
                        meta: <span class="hljs-literal">None</span>,
                    });
                }
            };
            items.extend(parser.into_elem_iter());
        }
</code></pre>
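<p>Conceptually, the <code>"prefix"</code> filter is a bitwise prefix-containment test on each element's announced prefix. A simplified IPv4-only sketch of the idea (the real bgpkit-parser filter also handles IPv6 and additional matching options):</p>

```rust
use std::net::Ipv4Addr;

/// Return true if `addr` falls inside the IPv4 prefix `net`/`len`.
/// Simplified illustration of what a prefix filter checks; the real
/// bgpkit-parser filter also supports IPv6 and related match modes.
fn in_prefix(addr: Ipv4Addr, net: Ipv4Addr, len: u8) -> bool {
    if len == 0 {
        return true; // 0.0.0.0/0 matches everything
    }
    let mask = u32::MAX << (32 - len as u32);
    (u32::from(addr) & mask) == (u32::from(net) & mask)
}

fn main() {
    let net: Ipv4Addr = "103.228.200.0".parse().unwrap();
    assert!(in_prefix("103.228.200.42".parse().unwrap(), net, 24));
    assert!(!in_prefix("103.228.201.1".parse().unwrap(), net, 24));
}
```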
<p>Because the BGPKIT parser and broker code are synchronous, we need to wrap the code above in a blocking task in order to use it in async web frameworks like <code>axum</code>.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> result = tokio::task::spawn_blocking(<span class="hljs-keyword">move</span> || {
   <span class="hljs-comment">// THE BLOCKING CODE PIECES</span>
}).<span class="hljs-keyword">await</span>.unwrap();
</code></pre>
<p>Please see the full source code here for more details:</p>
<p><a target="_blank" href="https://github.com/bgpkit/bgpkit-cf-containers/blob/main/container-src/src/main.rs">https://github.com/bgpkit/bgpkit-cf-containers/blob/main/container-src/src/main.rs</a></p>
<h2 id="heading-cloudflare-containers-deployment">Cloudflare Containers Deployment</h2>
<p>Now that we have working Rust-based API code, we need to (1) containerize the code and (2) put a Cloudflare Containers wrapper around it for deployment.</p>
<p>The container definition is a typical two-stage build, with a builder stage that compiles the binary and a minimal runtime stage that runs the executable. The two-stage build is practically necessary, as Cloudflare Containers <a target="_blank" href="https://developers.cloudflare.com/containers/platform-details/#limits">has limits</a> on the size of each image and the total storage per account. The smaller the image, the better.</p>
<pre><code class="lang-dockerfile"><span class="hljs-comment"># ---- Build Stage ----</span>
<span class="hljs-keyword">FROM</span> rust:<span class="hljs-number">1.86</span> AS builder

<span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>

<span class="hljs-comment"># Install build dependencies</span>
<span class="hljs-keyword">RUN</span><span class="bash"> apt-get update &amp;&amp; apt-get install -y pkg-config libssl-dev</span>

<span class="hljs-comment"># Build application</span>
<span class="hljs-keyword">COPY</span><span class="bash"> container-src/Cargo.lock container-src/Cargo.toml ./</span>
<span class="hljs-keyword">COPY</span><span class="bash"> container-src/src ./src</span>
<span class="hljs-keyword">RUN</span><span class="bash"> cargo build --release</span>

<span class="hljs-comment"># ---- Runtime Stage ----</span>
<span class="hljs-keyword">FROM</span> debian:bookworm-slim

<span class="hljs-comment"># Install minimal runtime dependencies</span>
<span class="hljs-keyword">RUN</span><span class="bash"> apt-get update &amp;&amp; apt-get install -y ca-certificates &amp;&amp; rm -rf /var/lib/apt/lists/*</span>

<span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>

<span class="hljs-comment"># Copy the compiled binary from the builder stage</span>
<span class="hljs-keyword">COPY</span><span class="bash"> --from=builder /app/target/release/bgpkit-cf-container /app/bgpkit-cf-container</span>

<span class="hljs-keyword">EXPOSE</span> <span class="hljs-number">3000</span>

<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"/app/bgpkit-cf-container"</span>]</span>
</code></pre>
<p>The rest of the task is to build an API app for Workers and configure Containers. The following config is pretty much all that is needed for the Workers script to build and push the container image and create a <a target="_blank" href="https://developers.cloudflare.com/durable-objects/">Durable Object</a> to coordinate and run Containers.</p>
<pre><code class="lang-json">    <span class="hljs-string">"containers"</span>: [
        {
            <span class="hljs-attr">"class_name"</span>: <span class="hljs-string">"BgpkitContainer"</span>,
            <span class="hljs-attr">"image"</span>: <span class="hljs-string">"./Dockerfile"</span>,
            <span class="hljs-attr">"max_instances"</span>: <span class="hljs-number">5</span>
        }
    ],
    <span class="hljs-string">"durable_objects"</span>: {
        <span class="hljs-attr">"bindings"</span>: [
            {
                <span class="hljs-attr">"class_name"</span>: <span class="hljs-string">"BgpkitContainer"</span>,
                <span class="hljs-attr">"name"</span>: <span class="hljs-string">"BGPKIT_CONTAINER"</span>
            }
        ]
    },
    <span class="hljs-string">"migrations"</span>: [
        {
            <span class="hljs-attr">"new_sqlite_classes"</span>: [
                <span class="hljs-string">"BgpkitContainer"</span>
            ],
            <span class="hljs-attr">"tag"</span>: <span class="hljs-string">"v1"</span>
        }
    ]
</code></pre>
<p>The main Workers script is only 22 lines long:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Container, getContainer } <span class="hljs-keyword">from</span> <span class="hljs-string">'@cloudflare/containers'</span>;
<span class="hljs-keyword">import</span> { Hono } <span class="hljs-keyword">from</span> <span class="hljs-string">"hono"</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> BgpkitContainer <span class="hljs-keyword">extends</span> Container {
    defaultPort = <span class="hljs-number">3000</span>;
    sleepAfter = <span class="hljs-string">'5m'</span>;
}

<span class="hljs-comment">// Create Hono app with proper typing for Cloudflare Workers</span>
<span class="hljs-keyword">const</span> app = <span class="hljs-keyword">new</span> Hono&lt;{
    Bindings: { BGPKIT_CONTAINER: DurableObjectNamespace&lt;BgpkitContainer&gt; };
}&gt;();

app.get(<span class="hljs-string">"/search"</span>, <span class="hljs-keyword">async</span> (c) =&gt; {
    <span class="hljs-keyword">if</span> (!c.req.query(<span class="hljs-string">'collector'</span>) || !c.req.query(<span class="hljs-string">'prefix'</span>) || !c.req.query(<span class="hljs-string">'ts_start'</span>) || !c.req.query(<span class="hljs-string">'ts_end'</span>)) {
        <span class="hljs-keyword">return</span> c.json({ error: <span class="hljs-string">"Missing required query parameters: collector, prefix, ts_start, ts_end"</span> }, <span class="hljs-number">400</span>);
    }
    <span class="hljs-keyword">const</span> container = getContainer(c.env.BGPKIT_CONTAINER);
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> container.fetch(c.req.raw);
});

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> app;
</code></pre>
<p>The important pieces are:</p>
<ul>
<li><p>The <code>class BgpkitContainer extends Container</code> block defines the port to use and configures how long the container keeps running after the last query. In this example, containers are killed after 5 minutes of inactivity. It is crucial to realize that Cloudflare Containers are not a drop-in replacement for other container deployment platforms like fly.io or Railway: Containers workloads are intended to be short-lived (ping me if this changes) and scale horizontally with request volume.</p>
</li>
<li><p>The <code>getContainer</code> function tries to reach a container. If the intended container is overloaded, it may create a new container on demand. You may instead use the <code>getRandom</code> function to round-robin across containers. See <a target="_blank" href="https://developers.cloudflare.com/containers/scaling-and-routing/">the docs</a> for more.</p>
</li>
<li><p>The <code>container.fetch(c.req.raw)</code> call forwards the request, including its query parameters, to the running container, which then handles it.</p>
</li>
</ul>
<h2 id="heading-example-queries">Example Queries</h2>
<p>The following example reaches the Workers script (handled by Hono), which then reaches the container to run the actual BGP data crunching task. (This URL won’t actually work, as we don’t have the budget to provide such a service openly. Feel free to deploy it on your own account to try it out.)<br /><a target="_blank" href="https://bgpkit-cf-containers.bgpkit.workers.dev/search?collector=rrc00&amp;prefix=103.228.200.0/24&amp;ts_start=1751231488&amp;ts_end=1751231488">https://EXAMPLE.bgpkit.workers.dev/search?collector=rrc00&amp;prefix=1.1.1.0/24&amp;ts_start=1751231488&amp;ts_end=1751231488</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751244889255/8aba97e4-f3b3-4932-a344-263aeec5de5b.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751244904587/7bb5e796-3c7b-4f14-991d-ec6914cb03b3.png" alt class="image--center mx-auto" /></p>
]]></content:encoded></item><item><title><![CDATA[BGPKIT Broker 0.7 Release]]></title><description><![CDATA[BGPKIT Broker is a fundamental component to our design of a all-purpose BGP data processing pipeline. In short, it is a BGP data file meta information "broker" that tells the data consumers what MRT files from RouteViews and RIPE RIS are available fo...]]></description><link>https://blog.bgpkit.com/bgpkit-broker-07-release</link><guid isPermaLink="true">https://blog.bgpkit.com/bgpkit-broker-07-release</guid><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Sat, 22 Jun 2024 23:41:16 GMT</pubDate><content:encoded><![CDATA[<p><a target="_blank" href="https://github.com/bgpkit/bgpkit-broker">BGPKIT Broker</a> is a fundamental component to our design of a all-purpose BGP data processing pipeline. In short, it is a BGP data file meta information "broker" that tells the data consumers what MRT files from RouteViews and RIPE RIS are available for any given time range in question. It commonly serves as a data input entry point for data pipelines.</p>
<p>For instance, here is a simple diagram for a system that creates a semi-real-time BGP data stream with BGPKIT Broker and Parser (a very common use case for these two libraries).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719080509165/c088686a-fef6-4e42-828d-e6c72f55fdd4.png" alt="Sample workflow diagram where BGPKIT Broker indexes meta information from MRT archives and BGPKIT parser parses these files to BGP messages." class="image--center mx-auto" /></p>
<p>BGPKIT Broker periodically crawls the MRT data pages of the RIPE RIS and RouteViews collectors and indexes the meta information into a database. Downstream consumers can then query for and retrieve new files and process them into BGP messages.</p>
<h1 id="heading-previously-on-bgpkit-broker">Previously on BGPKIT Broker</h1>
<p>In BGPKIT Broker versions 0.1 through 0.6, a working broker instance consisted of three individual components: a crawler, a Postgres database, and an API.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719081403845/fb270169-e102-4007-9faf-4b154c88adf7.png" alt class="image--center mx-auto" /></p>
<p>Each of the three components runs independently and requires its own configuration, cronjobs, and deployment. For example, to run BGPKIT Broker v0.6, a user needs to configure and run:</p>
<ul>
<li><p>a PostgreSQL database with proper credentials and schema set up;</p>
</li>
<li><p>a cronjob instance that periodically crawls the data sources, with optional locks to prevent overlapping executions in case a crawl becomes slow;</p>
</li>
<li><p>an API application, likely sitting behind a configured reverse proxy like Caddy, to serve the data.</p>
</li>
</ul>
<p>It's fun and exciting to set all this up for the first time, but it quickly becomes tiring and overly complex for repeated setups or for new users bootstrapping an instance.</p>
<h1 id="heading-v07-one-cli-app-that-does-everything">V0.7: one CLI app that does everything</h1>
<p>We completely revamped the BGPKIT Broker architecture in V0.7 to merge all the functionality needed for a running Broker instance into a single command-line application: <code>bgpkit-broker</code>. V0.7 provides one application to configure, run, debug, and query everything in BGPKIT Broker.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719081897617/02deb7b7-f1f6-4d1d-89e7-b8f3fd37bd8e.png" alt class="image--center mx-auto" /></p>
<p>To achieve this redesign, we made some major changes to our architecture.</p>
<h2 id="heading-sqlite-instead-of-postgresql">SQLite instead of PostgreSQL</h2>
<p>There are two major concerns when choosing a backend database for BGPKIT Broker: performance and portability.</p>
<h3 id="heading-sqlite-is-more-than-fast-enough">SQLite is more than fast enough</h3>
<p>BGPKIT Broker indexes metadata for <strong>all collectors</strong> from RouteViews and RIPE RIS, including the time, URL, type, and size of every RIB dump and updates MRT file from these two public archives. Dating all the way back to 1999, we have indexed metadata for roughly 48 million MRT files.</p>
<p>With a single index on the file timestamps, we can search data files in less than 0.5s for any query, which is more than fast enough for our use cases. We admit we spent time on early optimization; in the end, the simple schema outweighs the small performance gains.</p>
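<p>The effect of that single timestamp index is easy to see in miniature: with timestamps kept in sorted order, a time-range lookup becomes two binary searches rather than a full scan. A toy illustration (not the actual Broker query code):</p>

```rust
/// Toy illustration of a timestamp range lookup over a sorted index,
/// analogous to what the SQLite index on file timestamps provides.
/// Returns the half-open slice of entries with ts_start <= ts < ts_end.
fn range_lookup(sorted_ts: &[i64], ts_start: i64, ts_end: i64) -> &[i64] {
    let lo = sorted_ts.partition_point(|&t| t < ts_start);
    let hi = sorted_ts.partition_point(|&t| t < ts_end);
    &sorted_ts[lo..hi]
}

fn main() {
    // Timestamps of hypothetical MRT files, 15 minutes (900s) apart.
    let index: Vec<i64> = (0..10).map(|i| 1_700_000_000 + i * 900).collect();
    let hits = range_lookup(&index, 1_700_000_900, 1_700_003_600);
    assert_eq!(hits.len(), 3); // files at +900, +1800, +2700 seconds
}
```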
<h3 id="heading-backup-and-bootstrap-with-just-one-file">Backup and bootstrap with just one file</h3>
<p>In terms of portability, we cannot appreciate enough the beauty of a single-file database like SQLite. In our current production setup, we periodically back up the database, and doing so literally involves just copying a single file to another directory (well, we also upload it to Cloudflare R2 for safekeeping).</p>
<p>Portability also means users can move their instance anywhere they want with ease. This is definitely the case for V0.7, where new users can bootstrap by simply downloading a SQLite file (our CLI provides all of that functionality) and move to new locations by <code>scp</code>-ing it anywhere they desire.</p>
<p>Here is a video demonstrating bootstrapping a local BGPKIT Broker SQLite database with the new <code>bgpkit-broker bootstrap</code> command.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/SsCyfQ5q0G0">https://youtu.be/SsCyfQ5q0G0</a></div>
<h2 id="heading-new-file-notification-via-nats">New file notification via NATS</h2>
<p>Before V0.7, pipelines that need to continuously process new MRT files had to periodically "pull" data from a BGPKIT Broker instance and keep track of the latest files processed. We consider this a hassle developers should not have to deal with, and thus introduced a new <a target="_blank" href="https://nats.io/"><code>NATS</code></a>-based message channel that lets data consumers subscribe to a public or private NATS channel to which a Broker instance publishes new-file notifications.</p>
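<p>To make the pull model concrete, here is a rough sketch of the bookkeeping a pull-based consumer had to carry around (hypothetical types, not BGPKIT code), which the push-based notifications make unnecessary:</p>

```rust
use std::collections::HashSet;

/// Hypothetical bookkeeping for a pull-based consumer: poll the broker,
/// then filter out file URLs that were already processed.
struct PullTracker {
    seen: HashSet<String>,
}

impl PullTracker {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    /// Return only the URLs not seen before, marking them as processed.
    fn new_files(&mut self, polled_urls: Vec<String>) -> Vec<String> {
        polled_urls
            .into_iter()
            .filter(|u| self.seen.insert(u.clone()))
            .collect()
    }
}

fn main() {
    let mut tracker = PullTracker::new();
    let first = tracker.new_files(vec!["a.mrt".into(), "b.mrt".into()]);
    assert_eq!(first.len(), 2);
    // Second poll overlaps with the first; only the new file comes back.
    let second = tracker.new_files(vec!["b.mrt".into(), "c.mrt".into()]);
    assert_eq!(second, vec!["c.mrt".to_string()]);
}
```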
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719092321929/7cd3fd1f-309e-4e3f-a7b3-660d8dd9aff5.png" alt class="image--center mx-auto" /></p>
<p>We dedicated <code>nats.broker.bgpkit.com</code> as the public endpoint for any NATS consumer to connect to. Whenever a new file becomes available in Broker, it publishes a new-file notification to the public channel with all the metadata from the database entry. Consumers (e.g. data pipelines) can use <code>NatsNotifier::new(None).start_subscription()</code> to start waiting for new files. The following snippet shows how a simple pipeline can use this feature in a loop.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> notifier = <span class="hljs-keyword">match</span> NatsNotifier::new(url).<span class="hljs-keyword">await</span> {
    <span class="hljs-literal">Ok</span>(n) =&gt; n,
    <span class="hljs-literal">Err</span>(e) =&gt; {
        error!(<span class="hljs-string">"{}"</span>, e);
        <span class="hljs-keyword">return</span>;
    }
};
<span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Err</span>(e) = notifier.start_subscription(subject).<span class="hljs-keyword">await</span> {
    error!(<span class="hljs-string">"{}"</span>, e);
    <span class="hljs-keyword">return</span>;
}
<span class="hljs-keyword">while</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Some</span>(item) = notifier.next().<span class="hljs-keyword">await</span> {
    <span class="hljs-keyword">if</span> pretty {
        <span class="hljs-built_in">println!</span>(<span class="hljs-string">"{}"</span>, serde_json::to_string_pretty(&amp;item).unwrap());
    } <span class="hljs-keyword">else</span> {
        <span class="hljs-built_in">println!</span>(<span class="hljs-string">"{}"</span>, item);
    }
}
</code></pre>
<p>We also implemented a simple new-file watcher in the app as the <code>bgpkit-broker live</code> subcommand. It starts a subscription to the public BGPKIT NATS endpoint and prints out new file data as it arrives on the channel.</p>
<h2 id="heading-one-command-to-serve-and-update">One command to serve and update</h2>
<p>As mentioned previously, the new <code>bgpkit-broker</code> application includes everything one needs to start an instance. Once the database is bootstrapped to a local SQLite file (via the <code>bgpkit-broker bootstrap &lt;FILENAME&gt;</code> command), all it takes to start an auto-updating API is to run <code>bgpkit-broker serve &lt;FILENAME&gt;</code>.</p>
<pre><code class="lang-bash">bgpkit-broker serve --<span class="hljs-built_in">help</span>
Serve the Broker content via RESTful API

Usage: bgpkit-broker serve [OPTIONS] &lt;DB_PATH&gt;

Arguments:
  &lt;DB_PATH&gt;  broker db file location

Options:
  -i, --update-interval &lt;UPDATE_INTERVAL&gt;  update interval <span class="hljs-keyword">in</span> seconds [default: 300]
      --no-log                             <span class="hljs-built_in">disable</span> logging
  -b, --bootstrap                          bootstrap the database <span class="hljs-keyword">if</span> it does not exist
      --env &lt;ENV&gt;                          
  -s, --silent                             <span class="hljs-built_in">disable</span> bootstrap progress bar
  -h, --host &lt;HOST&gt;                        host address [default: 0.0.0.0]
  -p, --port &lt;PORT&gt;                        port number [default: 40064]
  -r, --root &lt;ROOT&gt;                        root path, useful <span class="hljs-keyword">for</span> configuring docs UI [default: /]
      --no-update                          <span class="hljs-built_in">disable</span> updater service
      --no-api                             <span class="hljs-built_in">disable</span> API service
  -h, --<span class="hljs-built_in">help</span>                               Print <span class="hljs-built_in">help</span>
  -V, --version                            Print version
</code></pre>
<p>The <code>serve</code> subcommand also starts a thread that periodically crawls the data sources and updates the SQLite database to make sure the API always serves up-to-date data.</p>
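<p>The updater thread boils down to a crawl-then-sleep loop keyed to <code>--update-interval</code>. A minimal standard-library sketch of that shape (the real implementation runs inside <code>serve</code> and writes to the SQLite database):</p>

```rust
use std::thread;
use std::time::Duration;

/// Minimal sketch of a periodic updater loop: run `crawl` every
/// `interval`, up to `rounds` iterations (the real service loops forever).
fn run_updater<F: FnMut()>(rounds: u32, interval: Duration, mut crawl: F) {
    for _ in 0..rounds {
        crawl(); // in the real service: crawl collectors, upsert new rows
        thread::sleep(interval);
    }
}

fn main() {
    let mut updates = 0;
    run_updater(3, Duration::from_millis(1), || updates += 1);
    assert_eq!(updates, 3);
}
```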
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719093388805/e69d3d56-f582-4318-831b-ce54f3269280.png" alt class="image--center mx-auto" /></p>
<p>Notice that error message? It's by design: by default the service tries to connect to the new-file notification channel, but no NATS URL is configured. We use the <code>BGPKIT_BROKER_NATS_URL</code> environment variable to configure which NATS channel to use.</p>
<p>We also allow users to optionally configure a heartbeat URL to monitor the data-updating status. After every successful data-crawling run, Broker executes an HTTP GET against the URL in <code>BGPKIT_BROKER_HEARTBEAT_URL</code> if that environment variable is set. This is useful for monitoring the running status of a Broker instance without setting up a separate cronjob.</p>
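<p>The heartbeat logic amounts to an environment-gated GET after each successful crawl. Below is a sketch of that decision with the HTTP call stubbed out; <code>heartbeat_target</code> is a hypothetical helper, not the actual Broker code:</p>

```rust
/// Hypothetical helper mirroring the heartbeat behavior: after a
/// crawl, return the URL to GET, if one is configured and the crawl
/// succeeded.
fn heartbeat_target(env_value: Option<String>, crawl_ok: bool) -> Option<String> {
    if !crawl_ok {
        return None; // only ping after a successful crawl
    }
    env_value.filter(|u| !u.is_empty())
}

fn main() {
    // Reading the variable from the environment:
    let url = std::env::var("BGPKIT_BROKER_HEARTBEAT_URL").ok();
    // With no variable set, no heartbeat is sent.
    assert_eq!(heartbeat_target(None, true), None);
    assert_eq!(
        heartbeat_target(Some("https://example.com/ping".into()), true).as_deref(),
        Some("https://example.com/ping")
    );
    let _ = url; // the real service would issue an HTTP GET to this URL
}
```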
<p>We use <a target="_blank" href="https://betterstack.com/">Better Stack's Uptime</a> monitoring service for page and heartbeat monitoring, and the public Broker instance is running V0.7 with the heartbeat URL set to this service. All status information can be found at <a target="_blank" href="https://status.bgpkit.com/">https://status.bgpkit.com/</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719093878224/65a3e5a0-c751-4ab4-927f-3c595eb7ba54.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-production-ready-on-prem-deployment">Production-ready, on-prem deployment</h1>
<p>Although BGPKIT Broker has not yet reached V1.0, we consider it feature-complete and production-ready. Ever since V0.2, we have made our best effort not to introduce breaking changes, and the service has served the community with stable uptime. We believe all libraries running in production should be at least 1.0, and thus <strong>we will release V1.0 this summer</strong>.</p>
<p>We also made significant efforts in the V0.7 release to make BGPKIT Broker as portable as possible. New users can spin up a fully functioning Broker instance with just two commands, <code>bgpkit-broker bootstrap</code> and <code>bgpkit-broker serve</code>, all within 5 minutes. With V0.7 released, we <strong>encourage all data pipeline designers to deploy a Broker instance on-premises</strong>, ensuring data pipelines are self-contained and external dependencies are reduced as much as possible. We will also continue to maintain our public instance to the best of our ability (we are currently at <strong>99.996% uptime</strong>). Thanks to <a target="_blank" href="https://github.com/sponsors/bgpkit">our sponsors</a>, we are able to keep these services up, and we plan to continue serving the community the same way for the foreseeable future.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719094868864/68d427b1-cb8a-4c4a-8f64-c1d5e052d02a.png" alt class="image--center mx-auto" /></p>
<p>For the full V0.7 release notes, please check out our <a target="_blank" href="https://github.com/bgpkit/bgpkit-broker/releases/tag/v0.7.0">GitHub release page</a>. If you have any comments, please drop us a message at <a target="_blank" href="https://twitter.com/bgpkit">Twitter</a>, <a target="_blank" href="https://infosec.exchange/@bgpkit">Mastodon</a>, or <a target="_blank" href="mailto:contact@bgpkit.com">email</a>.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💖</div>
<div data-node-type="callout-text">If you find our libraries and services useful, we would highly appreciate it if you consider <a target="_blank" href="https://github.com/sponsors/bgpkit">sponsoring us on GitHub</a>.</div>
</div>]]></content:encoded></item><item><title><![CDATA[Command-line Routing Stats with Monocle and Cloudflare Radar API]]></title><description><![CDATA[BGPKIT monocle is a command-line utility program that helps users quickly pull Internet routing-related information from publicly available sources.
https://github.com/bgpkit/monocle
In BGPKIT monocle version V0.5, we add support for querying Cloudfl...]]></description><link>https://blog.bgpkit.com/monocle-cloudflare-radar</link><guid isPermaLink="true">https://blog.bgpkit.com/monocle-cloudflare-radar</guid><category><![CDATA[bgp]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[routing]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Sun, 21 Apr 2024 18:49:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1713727053371/76c6c9ba-de3a-4e06-a621-3e3fe6d689e3.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>BGPKIT <code>monocle</code> is a command-line utility program that helps users quickly pull Internet routing-related information from publicly available sources.</p>
<p><a target="_blank" href="https://github.com/bgpkit/monocle">https://github.com/bgpkit/monocle</a></p>
<p>In BGPKIT <code>monocle</code> V0.5, we add support for querying <a target="_blank" href="https://radar.cloudflare.com/">Cloudflare Radar</a>'s new BGP <a target="_blank" href="https://developers.cloudflare.com/api/operations/radar-get-bgp-routes-stats">routing statistics</a> and <a target="_blank" href="https://developers.cloudflare.com/api/operations/radar-get-bgp-pfx2as">prefix-to-origin mapping</a> APIs, the same APIs that power the <a target="_blank" href="https://radar.cloudflare.com/routing">Cloudflare Radar routing section</a>. <code>monocle</code> users can now quickly get an overview of routing stats for any given ASN, country, or the whole Internet. Users can also quickly look up prefix origins and examine their RPKI validation status as well as prefix visibility on the global routing table.</p>
<h2 id="heading-using-monocle-radar">Using <code>monocle radar</code></h2>
<p>We added a new <code>monocle radar</code> command group in V0.5, which contains the following two subcommands:</p>
<ul>
<li><p><code>monocle radar stats [QUERY]</code>: get routing stats (like prefix count and RPKI-invalid count) for a given country or ASN.</p>
</li>
<li><p><code>monocle radar pfx2as [QUERY] [--rpki-status valid|invalid|unknown]</code>: get the prefix-to-origin mapping for a given prefix or ASN.</p>
</li>
</ul>
<pre><code class="lang-plaintext">mingwei@terrier ~ % monocle radar
Cloudflare Radar API lookup (set CF_API_TOKEN to enable)

Usage: monocle radar &lt;COMMAND&gt;

Commands:
  stats   get routing stats
  pfx2as  look up prefix to origin mapping on the most recent global routing table snapshot
  help    Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version
</code></pre>
<h3 id="heading-cloudflare-api-token-needed">Cloudflare API token needed</h3>
<p>Since the <code>monocle radar</code> command queries data using the Cloudflare Radar public API, we also need to specify a user API token in the <code>CF_API_TOKEN</code> environment variable. Obtaining an API token is free and requires only a Cloudflare account. Interested users can follow the <a target="_blank" href="https://developers.cloudflare.com/radar/get-started/first-request/">official tutorial</a> to obtain a token. The environment variable can be set in a <code>.env</code> file in the current directory, or in <code>~/.bashrc</code>, <code>~/.profile</code>, etc.</p>
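<p>The token lookup can be pictured as a small pure function over the two possible sources. This is a hypothetical sketch, not monocle's actual code, and the precedence shown (process environment over the <code>.env</code> file) is an assumption for illustration:</p>

```rust
/// Hypothetical sketch of resolving CF_API_TOKEN: prefer the process
/// environment, fall back to the contents of a `.env`-style file. The
/// precedence shown here is an assumption for illustration.
fn resolve_token(process_env: Option<&str>, dotenv_contents: &str) -> Option<String> {
    if let Some(tok) = process_env {
        if !tok.is_empty() {
            return Some(tok.to_string());
        }
    }
    // Minimal .env parsing: KEY=VALUE lines, '#' starts a comment line.
    dotenv_contents
        .lines()
        .map(str::trim)
        .filter(|l| !l.starts_with('#'))
        .filter_map(|l| l.split_once('='))
        .find(|(k, _)| k.trim() == "CF_API_TOKEN")
        .map(|(_, v)| v.trim().to_string())
}

fn main() {
    let dotenv = "# radar credentials\nCF_API_TOKEN=abc123\n";
    assert_eq!(resolve_token(None, dotenv).as_deref(), Some("abc123"));
    assert_eq!(resolve_token(Some("envtok"), dotenv).as_deref(), Some("envtok"));
    assert_eq!(resolve_token(None, ""), None);
}
```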
<h3 id="heading-monocle-radar-stats"><code>monocle radar stats</code></h3>
<p>Users can query the routing statistics for a given country or ASN. For example, <code>monocle radar stats us</code> returns the routing stats for the United States, while <code>monocle radar stats 174</code> returns the stats for Cogent (<code>AS174</code>).</p>
<p>The displayed table is divided into three rows: one for overall counts, and one each for IPv4-specific and IPv6-specific counts. For each row, we show the following fields:</p>
<ul>
<li><p><code>origins</code>: the number of origin ASes registered in the given country</p>
</li>
<li><p><code>prefixes</code>: the number of prefixes originated by the given ASN or ASes registered in the given country</p>
</li>
<li><p><code>rpki_valid/invalid/unknown</code>: the number of RPKI valid/invalid/unknown prefix routes (prefix-origin mappings) on the global routing table, and their percentage of all routes.</p>
</li>
</ul>
<p><img src="https://github.com/bgpkit/monocle/assets/659667/d83c4d5e-ee79-4342-afec-163428a799b1" alt /></p>
<h3 id="heading-monocle-radar-pfx2as"><code>monocle radar pfx2as</code></h3>
<p>Users can query the prefix-to-origin API to get the mapping of origin ASes and their originated prefixes on the global routing table.</p>
<p>In the following example, <code>monocle radar pfx2as 174 --rpki-status invalid</code>, we ask for all prefixes originated by <code>AS174</code> whose RPKI validation status is invalid. The command returns the list of RPKI-invalid prefixes originated by <code>AS174</code> at the time the dataset was generated.</p>
<p><img src="https://github.com/bgpkit/monocle/assets/659667/30ef0f5e-056e-4070-87dd-4e7bef6d436d" alt /></p>
<h3 id="heading-questions-it-can-answer-now-more-in-the-future">Questions it can answer now (more in the future)</h3>
<p>Here is a selected list of questions that the <code>monocle radar</code> command can answer:</p>
<ul>
<li><p>How many ASes are there on the Internet that announce at least one prefix? (81,770)</p>
</li>
<li><p>How many of these ASes announce only IPv6 prefixes? (6,853)</p>
</li>
<li><p>How many prefixes are there on the global routing table? (1,205,218)</p>
</li>
<li><p>How many prefixes does <code>AS400644</code> announce? (1)</p>
</li>
<li><p>Which AS(es) originates <code>1.1.1.0/24</code>? (AS13335)</p>
</li>
<li><p>How many prefixes originated by <code>AS174</code> are NOT covered by some RPKI ROA? (a lot, 94%+)</p>
</li>
<li><p>How about the RPKI valid ratio for the Philippines? (77%, nice!)</p>
</li>
</ul>
<h2 id="heading-powered-by-cloudflare-radar-free-api">Powered by Cloudflare Radar free API</h2>
<blockquote>
<p>Cloudflare Radar is a hub that showcases global Internet traffic, attack, and technology trends and insights.</p>
</blockquote>
<p>Where <a target="_blank" href="https://radar.cloudflare.com/">Cloudflare Radar</a> shines is its data openness. Everything you see on the Cloudflare Radar website is powered by its free, <a target="_blank" href="https://developers.cloudflare.com/api/operations/radar-get-bgp-pfx2as-moas">publicly available APIs</a>. It's a treasure trove, and all users need is a <a target="_blank" href="https://developers.cloudflare.com/radar/get-started/first-request/">free API token</a> to access everything.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1713725141394/dc2f6a63-3476-40aa-b1df-e2212ca2829c.png" alt class="image--center mx-auto" /></p>
<p>At BGPKIT, we think we can further improve the usability of the API by exposing it as a proper Rust SDK: <a target="_blank" href="https://github.com/bgpkit/radar-rs"><code>radar-rs</code></a>. This is our (unofficial) effort to bring Cloudflare Radar's rich data to Rust developers. For example, <code>monocle radar</code> is powered by this SDK.</p>
]]></content:encoded></item><item><title><![CDATA[2022 Year in Review]]></title><description><![CDATA[In 2022, BGPKIT as an open-source organization made significant progress. As the founder, I am grateful for all the opportunities and would like to take this time to appreciate all the milestones we achieved. In this post, I will go through some notabl...]]></description><link>https://blog.bgpkit.com/2022-year-in-review</link><guid isPermaLink="true">https://blog.bgpkit.com/2022-year-in-review</guid><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Tue, 31 Jan 2023 17:14:22 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1637769270420-e02b7419a721?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=MnwxMTc3M3wwfDF8c2VhcmNofDMwfHwyMDIyJTIwZmlyZXdvcmtzfGVufDB8fHx8MTY3NTE4MDUyNw&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=2000" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In 2022, BGPKIT as an open-source organization made significant progress. As the founder, I am grateful for all the opportunities and would like to take this time to appreciate all the milestones we achieved. In this post, I will go through some notable changes we made in 2022, and take a look at what we are excited about for the year of 2023.</p>
<hr />
<h2 id="heading-bgpkit-parser">BGPKIT Parser</h2>
<p>There were a number of major features added to BGPKIT Parser in 2022:</p>
<ul>
<li><p><a target="_blank" href="https://github.com/bgpkit/bgpkit-parser/releases/tag/v0.7.0">v0.7.0</a> added support for filtering messages by many fields, and allowing reading from uncompressed files.</p>
</li>
<li><p><a target="_blank" href="https://github.com/bgpkit/bgpkit-parser/releases/tag/v0.7.1">v0.7.1</a> added better examples of parallel MRT files processing with <code>rayon</code>.</p>
</li>
<li><p><a target="_blank" href="https://github.com/bgpkit/bgpkit-parser/releases/tag/v0.7.2">v0.7.2</a> added filtering by multiple <code>peer_ip</code>s.</p>
</li>
<li><p><a target="_blank" href="https://github.com/bgpkit/bgpkit-parser/releases/tag/v0.8.0">v0.8.0</a> included many internal refactorings and brought in the new <a target="_blank" href="https://github.com/bgpkit/oneio">oneio</a> library to improve developer experience.</p>
</li>
</ul>
<h2 id="heading-bgpkit-broker">BGPKIT Broker</h2>
<p>In 2022, we revised the BGPKIT Broker backend to crawl estimated file sizes in addition to other fields like timestamps and URLs. This allows us to track MRT file size changes and helps users pick suitable collectors, especially in an age when a single RIB dump from one collector can exceed 1 GB in size.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107884486/2ab47564-5e22-4073-ba39-880356abad8f.png" alt="2022 Year in Review" class="image--center mx-auto" /></p>
<p>Figure of RIB sizes over time. The blue line is for rrc00, and the yellow line is for route-views2.</p>
<p>We also made major revisions to our Rust SDK, adding features like <code>.latest()</code> to get the latest MRT files from each collector.</p>
<p><a target="_blank" href="https://github.com/bgpkit/bgpkit-broker/releases/tag/v0.5.0?ref=blog.bgpkit.com">https://github.com/bgpkit/bgpkit-broker/releases/tag/v0.5.0</a></p>
<p>Plus, if you would like to deploy a broker instance yourself, we have significantly improved our self-hosting documentation.</p>
<p><a target="_blank" href="https://github.com/bgpkit/bgpkit-broker-backend/blob/main/deployment/README.md?ref=blog.bgpkit.com">https://github.com/bgpkit/bgpkit-broker-backend/blob/main/deployment/README.md</a></p>
<h2 id="heading-python-bindings">Python Bindings</h2>
<p>In addition to the core Rust code bases for the parser and broker, we have also added Python bindings for various Rust SDKs. Users can easily parse MRT files directly using the <code>pybgpkit</code> Python library. It has also proven usable on cloud-based Jupyter notebooks like Google Colab (<a target="_blank" href="https://colab.research.google.com/drive/1AuNnzT43LYAZNnp1muhJTy0rvv3qbCX6">examples</a>).</p>
<p><a target="_blank" href="https://github.com/bgpkit/pybgpkit?ref=blog.bgpkit.com">https://github.com/bgpkit/pybgpkit</a></p>
<h2 id="heading-monocle">Monocle</h2>
<p>To tie things together, we have also developed our first investigative tool, <code>monocle</code>, to help users quickly find relevant BGP announcements with a suite of easy-to-use utilities. Users with the Rust toolchain installed can run <code>cargo install monocle</code> to install the tool.</p>
<p><a target="_blank" href="https://github.com/bgpkit/monocle?ref=blog.bgpkit.com">https://github.com/bgpkit/monocle</a></p>
<p>Users can use the following subcommands:</p>
<ul>
<li><p><code>parse</code>: parse single MRT files, remotely or locally</p>
</li>
<li><p><code>search</code>: find and filter BGP messages across multiple public collectors</p>
</li>
<li><p><code>time</code>: convert between local time string and Unix timestamps</p>
</li>
<li><p><code>whois</code>: look up AS names, ASNs, registration countries, and organizations</p>
</li>
</ul>
<h2 id="heading-webapi-and-infrastructure">Web/API and Infrastructure</h2>
<p>In 2022, we started experimenting with new cloud-based infrastructure, especially cloud-based databases, for better API stability and developer experience. We ended up selecting <a target="_blank" href="https://supabase.com/">Supabase</a> as our PostgreSQL production host, with a self-hosted instance for backup. We are happy with the performance and cost of Supabase and excited about the capabilities it brings, such as user authentication, cloud storage, and local dev schema changes.</p>
<p>Based on the new infrastructure, we have started to test a new integrated API system (<a target="_blank" href="https://alpha.api.bgpkit.com/docs/">still in alpha</a>). This allows us to put all of our data access and processing endpoints in one location. Based on the new API, we have also developed a newer version of the BGPKIT Broker statistics page: <a target="_blank" href="https://alpha.stats.bgpkit.com/">https://alpha.stats.bgpkit.com/</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107908351/45dc8204-0836-445f-987d-9422ab6f83ba.png" alt="2022 Year in Review" /></p>
<h2 id="heading-new-datasets">New Datasets</h2>
<p>Apart from providing SDKs and data APIs, we have also started offering free access to historical archives of datasets we find interesting. We previously blogged about one of them, the peer-stats dataset:</p>
<p><a target="_blank" href="https://blog.bgpkit.com/peer-stats-dataset/">https://blog.bgpkit.com/peer-stats-dataset/</a></p>
<p>Here is a list of our currently available datasets at <a target="_blank" href="https://data.bgpkit.com/">https://data.bgpkit.com/</a></p>
<ul>
<li><p><a target="_blank" href="https://data.bgpkit.com/peer-stats/"><code>peer-stats</code></a>: route collector peers statistics (IP, ASN, v4/v6 prefix counts)</p>
</li>
<li><p><a target="_blank" href="https://data.bgpkit.com/as2rel/"><code>as2rel</code></a>: AS-level relationship, using all available collectors</p>
</li>
<li><p><a target="_blank" href="https://data.bgpkit.com/pfx2as/"><code>pfx2as</code></a>: prefix-to-AS mapping, using all available collectors</p>
</li>
<li><p><a target="_blank" href="https://data.bgpkit.com/ihr/hegemony/ipv4/global/"><code>ihr-hegemony</code></a>: mirror of <a target="_blank" href="https://ihr.iijlab.net/ihr/en-us">IIJ-IHR</a>'s global hegemony score dataset (big shout out to <a target="_blank" href="https://twitter.com/romain_fontugne">Romain</a> and <a target="_blank" href="https://twitter.com/ihr_alerts">Internet Health Report</a> for producing this data)</p>
</li>
</ul>
<p>All the above datasets are free to use for research or commercial purposes; here is the <a target="_blank" href="https://bgpkit.com/aua">acceptable usage agreement</a>.</p>
<h2 id="heading-more-public-repositories">More Public Repositories</h2>
<p>More code repositories are now publicly available, some experimental and some for data analysis. You can check out the full list here:</p>
<p><a target="_blank" href="https://github.com/orgs/bgpkit/repositories?ref=blog.bgpkit.com">https://github.com/orgs/bgpkit/repositories</a></p>
<hr />
<h2 id="heading-founders-notes">Founder's Notes</h2>
<p>In late 2022, I joined Cloudflare to continue working on routing security for the public benefit. During the first few months, we built and shipped a <a target="_blank" href="https://blog.cloudflare.com/route-leak-detection-with-cloudflare-radar/">new route-leak detection system</a> under the public <a target="_blank" href="https://blog.cloudflare.com/route-leak-detection-with-cloudflare-radar/">Radar</a> platform. The BGPKIT suite is now used in production in Cloudflare's BGP data analysis pipeline. While working full-time, I remain committed to maintaining the software suite and bringing new features to BGPKIT. For example, this year I ported the Kafka support used at Cloudflare back to the BGPKIT Broker backend. Folks at Cloudflare are doing great things in the open-source realm, and the BGPKIT software suite will continue to become more useful to BGP enthusiasts while remaining completely open source.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107913152/9dd6537d-f586-4157-969c-02b29817b0f7.png" alt="2022 Year in Review" /></p>
<p>Quote from Cloudflare's <a target="_blank" href="https://blog.cloudflare.com/route-leak-detection-with-cloudflare-radar">route-leak detection system blog</a>.</p>
<p>Looking ahead to 2023, here are a few things I am really excited to work on for the BGPKIT suite:</p>
<ol>
<li><p>continue improving the system's infrastructure and add new data processing pipelines</p>
</li>
<li><p>continue improving the parser's performance and reliability</p>
</li>
<li><p>add support for new RFCs in the parser (e.g. <a target="_blank" href="https://datatracker.ietf.org/doc/rfc9234/">RFC 9234</a> for route-leak prevention)</p>
</li>
<li><p>productionizing the new API and stats website</p>
</li>
<li><p>write more examples and documentation (don't we all love that)</p>
</li>
</ol>
<p>And yeah, we will continue to be open-source first!</p>
<p><a target="_blank" href="https://bgpkit.com"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107914304/6ab31119-6c46-444b-af15-037893bec7ad.png" alt="2022 Year in Review" /></a></p>
<hr />
<p>If you like what we do here, please consider subscribing to our blog. For all code repositories, check out our GitHub page.</p>
<p><a target="_blank" href="https://github.com/bgpkit?ref=blog.bgpkit.com">https://github.com/bgpkit</a></p>
]]></content:encoded></item><item><title><![CDATA[Introducing Peer-Stats Dataset]]></title><description><![CDATA[Public BGP data collector projects like RouteViews and RIPE RIS provide valuable research and operational information for understanding BGP and detecting Internet routing anomalies.
There are many BGP routers involved in BGP collection projects.
A pr...]]></description><link>https://blog.bgpkit.com/introducing-peer-stats-dataset</link><guid isPermaLink="true">https://blog.bgpkit.com/introducing-peer-stats-dataset</guid><category><![CDATA[Announcement]]></category><category><![CDATA[dataset]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Mon, 16 May 2022 15:20:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/oyXis2kALVg/upload/b778f6ab1e2175a448ab9d663a92f3ea.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Public BGP data collector projects like RouteViews and RIPE RIS provide valuable research and operational information for understanding BGP and detecting Internet routing anomalies.</p>
<p><strong>There are many BGP routers involved in BGP collection projects.</strong></p>
<p>A project includes many "collectors," each serving as a collection point for messages from several active BGP peers in different networks. Some bigger collectors collect BGP data from more than one hundred BGP routers. For example, RIPE RIS <code>rrc00</code> has 112 active BGP peers at the time of writing. <em>(To learn about the complete list of BGP peers from all collectors, try the </em><a target="_blank" href="https://github.com/bgpkit/bgpkit-labs/tree/main/collector-peers"><em>experimental tools</em></a><em> we developed.)</em></p>
<p><strong>Sometimes, too many BGP peers may become problematic.</strong></p>
<p>Not all peers provide the same amount of data. Some peers are so-called "full-feed" peers, which provide their full routing tables to the collector; in a routing table dump file from a collector, we can observe the full tables of these peers. Other peers only provide a limited number of routing entries, which do not represent their complete routing state. Projects that try to rebuild full routing tables, e.g., BGP hijack detection or other anomaly detectors, prefer full-feed peers as their data source.</p>
<p>At times, we are only interested in data from certain peers. For example, when studying the routing data of a particular network that peers with BGP data collectors, we can pull that network's data directly from the collectors. However, it can be troublesome to learn which collectors have data from which peers. RIPE RIS provides a <a target="_blank" href="https://stat.ripe.net/data/ris-peers/data.json">nice API</a> for querying such info, but we couldn't find one for RouteViews.</p>
<p>Historical data for such information is also missing. Unfortunately, for the researchers who want to study the evolution of the data collectors, even RIPE RIS's peers API could not help with that.</p>
<h2 id="heading-introducing-bgpkit-peer-stats-dataset">Introducing BGPKIT Peer-Stats Dataset</h2>
<p>The <code>Peer-Stats</code> dataset is a publicly available, free-to-use dataset that provides daily collector peer information for all RouteViews and RIPE RIS collectors, covering ten years of history.</p>
<p><a target="_blank" href="https://data.bgpkit.com/peer-stats/">https://data.bgpkit.com/peer-stats/</a></p>
<p>The data includes the following fields for each peer of a BGP collector:</p>
<ol>
<li><p><code>asn</code>: Autonomous System Number of the collector peer</p>
</li>
<li><p><code>ip</code>: the IP address of the collector peer</p>
</li>
<li><p><code>num_v4_pfxs</code>: the number of IPv4 prefixes propagated from the collector peer</p>
</li>
<li><p><code>num_v6_pfxs</code>: the number of IPv6 prefixes propagated from the collector peer</p>
</li>
<li><p><code>num_connected_asns</code>: the number of connected (immediate next hop) ASes from the collector peer</p>
</li>
</ol>
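<p>The fields above map naturally onto a small Rust struct. The following is a sketch (field names mirror the JSON keys; a real consumer would likely derive serde's <code>Deserialize</code> on it):</p>

```rust
// Sketch of a type mirroring one peer-stats record.
// Field names follow the JSON keys of the dataset.
#[derive(Debug, Clone, PartialEq)]
pub struct PeerStats {
    pub asn: u32,
    pub ip: String,
    pub num_v4_pfxs: u64,
    pub num_v6_pfxs: u64,
    pub num_connected_asns: u64,
}

fn main() {
    // Sample values taken from the rrc00 example shown later in this post.
    let peer = PeerStats {
        asn: 328474,
        ip: "102.67.56.1".to_string(),
        num_v4_pfxs: 919_443,
        num_v6_pfxs: 0,
        num_connected_asns: 330,
    };
    println!("{:?}", peer);
}
```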
<p>The dataset is organized in the following directory structure:</p>
<pre><code class="lang-plaintext">- collector
    - year
        - month
            - data files
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107921150/87aafcb1-473c-461f-b1ce-8cf720d94bf7.png" alt="Introducing Peer-Stats Dataset" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107922421/c8ab2ed9-44fc-4366-a2c9-2fcb081e3de9.png" alt="Introducing Peer-Stats Dataset" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107923722/7040d959-bef8-4c0e-b53d-c38edcbea05f.png" alt="Introducing Peer-Stats Dataset" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107925125/a52bbb29-0fe1-41e4-afc4-d5331aade117.png" alt="Introducing Peer-Stats Dataset" /></p>
<p>Screenshots of the dataset file listing site.</p>
<p>Each data file is in JSON format (see the section below) and compressed with bzip2. Users can easily use tools like <code>bzcat</code> and <code>jq</code> to view the data files. For example, you can run the following command to view the peer-stats data for the collector <code>rrc00</code> on 2022-05-01.</p>
<pre><code class="lang-bash">curl <span class="hljs-string">"https://data.bgpkit.com/peer-stats/rrc00/2022/05/rrc00-2022-05-01-1651363200.bz2"</span> --silent | bzcat | jq
</code></pre>
<pre><code class="lang-json">{
  <span class="hljs-attr">"collector"</span>: <span class="hljs-string">"rrc00"</span>,
  <span class="hljs-attr">"peers"</span>: {
    <span class="hljs-attr">"102.67.56.1"</span>: {
      <span class="hljs-attr">"asn"</span>: <span class="hljs-number">328474</span>,
      <span class="hljs-attr">"ip"</span>: <span class="hljs-string">"102.67.56.1"</span>,
      <span class="hljs-attr">"num_connected_asns"</span>: <span class="hljs-number">330</span>,
      <span class="hljs-attr">"num_v4_pfxs"</span>: <span class="hljs-number">919443</span>,
      <span class="hljs-attr">"num_v6_pfxs"</span>: <span class="hljs-number">0</span>
    },
    <span class="hljs-attr">"103.102.5.1"</span>: {
      <span class="hljs-attr">"asn"</span>: <span class="hljs-number">131477</span>,
      <span class="hljs-attr">"ip"</span>: <span class="hljs-string">"103.102.5.1"</span>,
      <span class="hljs-attr">"num_connected_asns"</span>: <span class="hljs-number">184</span>,
      <span class="hljs-attr">"num_v4_pfxs"</span>: <span class="hljs-number">895482</span>,
      <span class="hljs-attr">"num_v6_pfxs"</span>: <span class="hljs-number">0</span>
    },
...
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107926443/3827c05a-72b9-49b6-9ad9-76b3a6847e4e.png" alt="Introducing Peer-Stats Dataset" /></p>
<p>Because all the data files are generated against the midnight UTC RIB dump of the day, you can also easily construct a URL to a data file for any particular date using the following template.</p>
<p><code>https://data.bgpkit.com/peer-stats/{COLLECTOR}/{YEAR}/{MONTH}/{COLLECTOR}-{YEAR}-{MONTH}-{DAY}-{MIDNIGHT_TIMESTAMP}.bz2</code></p>
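<p>A small helper can fill in this template. The sketch below is std-only, so the caller must supply the midnight Unix timestamp for the date (e.g. 1651363200 for 2022-05-01):</p>

```rust
/// Build a peer-stats data file URL following the template above.
/// `midnight_ts` is the Unix timestamp of 00:00:00 UTC on the given date;
/// std has no calendar math, so the caller supplies it directly.
fn peer_stats_url(collector: &str, year: u16, month: u8, day: u8, midnight_ts: i64) -> String {
    format!(
        "https://data.bgpkit.com/peer-stats/{c}/{y}/{m:02}/{c}-{y}-{m:02}-{d:02}-{ts}.bz2",
        c = collector,
        y = year,
        m = month,
        d = day,
        ts = midnight_ts
    )
}

fn main() {
    // Reproduces the rrc00 example URL from earlier in this post.
    println!("{}", peer_stats_url("rrc00", 2022, 5, 1, 1651363200));
}
```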
<h2 id="heading-open-source">Open-source</h2>
<p>We also open-sourced the data collection command-line tool source code on GitHub. Feel free to check it out and run it on your infrastructure if needed.</p>
<p><a target="_blank" href="https://github.com/bgpkit/peer-stats">https://github.com/bgpkit/peer-stats</a></p>
<hr />
<h2 id="heading-credits-and-sponsorship">Credits and Sponsorship</h2>
<p>The original idea for this work came from our extensive discussion with Romain Fontugne (follow him on Twitter at <a target="_blank" href="https://twitter.com/romain_fontugne">@romain_fontugne</a>) from IIJ. This work is made possible by IIJ's generous sponsorship.</p>
<p>Please consider sponsoring us on GitHub if you find our work valuable and would like to see more open-source code and datasets on BGP.</p>
<p><a target="_blank" href="https://github.com/sponsors/bgpkit?ref=blog.bgpkit.com">https://github.com/sponsors/bgpkit</a></p>
]]></content:encoded></item><item><title><![CDATA[KhersonTelecom Outage and Connectivity Change]]></title><description><![CDATA[Internet service in Russian-occupied Kherson, Ukraine was disabled at 16:12 UTC (6:12pm local) on Saturday, 30 April. #UkraineRussiaWar  
Khersontelecom service was restored ~24hrs later via Russian transit from nearby Crimea. pic.twitter.com/uN31jLr...]]></description><link>https://blog.bgpkit.com/2022-05-01-khersontelecom-connectivity-change</link><guid isPermaLink="true">https://blog.bgpkit.com/2022-05-01-khersontelecom-connectivity-change</guid><category><![CDATA[Outage]]></category><category><![CDATA[Case Study]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Tue, 03 May 2022 18:06:05 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1606765962248-7ff407b51667?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=MnwxMTc3M3wwfDF8c2VhcmNofDJ8fGludGVybmV0JTIwb3V0YWdlfGVufDB8fHx8MTY1MTYwMDE2OQ&amp;ixlib=rb-1.2.1&amp;q=80&amp;w=2000" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p><img src="https://images.unsplash.com/photo-1606765962248-7ff407b51667?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=MnwxMTc3M3wwfDF8c2VhcmNofDJ8fGludGVybmV0JTIwb3V0YWdlfGVufDB8fHx8MTY1MTYwMDE2OQ&amp;ixlib=rb-1.2.1&amp;q=80&amp;w=2000" alt="KhersonTelecom Outage and Connectivity Change" /></p>
<p>Internet service in Russian-occupied Kherson, Ukraine was disabled at 16:12 UTC (6:12pm local) on Saturday, 30 April. <a target="_blank" href="https://twitter.com/hashtag/UkraineRussiaWar?src=hash&amp;ref_src=twsrc%5Etfw">#UkraineRussiaWar</a>  </p>
<p>Khersontelecom service was restored ~24hrs later via Russian transit from nearby Crimea. <a target="_blank" href="https://t.co/uN31jLrzEc">pic.twitter.com/uN31jLrzEc</a></p>
<p>— Doug Madory (@DougMadory) <a target="_blank" href="https://twitter.com/DougMadory/status/1521102562509873152?ref_src=twsrc%5Etfw">May 2, 2022</a></p>
</blockquote>
<p><code>AS47598</code> experienced an outage shortly after <code>2022-04-30T16:10:00</code> and resumed connectivity around <code>2022-05-01T16:15:00</code> (both UTC). After the outage, <code>AS47598</code> connected via a different upstream provider, <code>AS201776</code>.</p>
<p>The upstream provider change can also be seen on IIJ's <a target="_blank" href="https://ihr.iijlab.net/ihr/en-us/networks/AS47598?af=4&amp;last=3&amp;date=2022-05-01&amp;rov_tb=routes">Internet Health Report</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107933707/425f328f-54b6-4794-bac8-04dbec0f5267.png" alt="KhersonTelecom Outage and Connectivity Change" /></p>
<h2 id="heading-prefixes">Prefixes</h2>
<p><code>AS47598</code> announces only one prefix, <code>91.206.110.0/23</code> (data from <a target="_blank" href="https://bgp.he.net/AS47598#_prefixes">Hurricane Electric</a>), and most of the BGP announcements are for this prefix. However, after the provider change, there were also announcements for an IPv6 prefix, <code>5bce:6e00::/23</code>. It is possible that this is a V4-translated prefix propagated to a V6 collector peer, but we do not know for sure.</p>
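<p>One way to sanity-check the V4-translation guess is to note that the four octets of <code>91.206.110.0</code>, written in hexadecimal, are exactly the leading 32 bits of <code>5bce:6e00::/23</code>. A quick std-only Rust check makes the correspondence explicit:</p>

```rust
use std::net::Ipv4Addr;

/// Render an IPv4 prefix with its four octets written as hexadecimal,
/// in the shape of the IPv6-looking prefix seen in the announcements.
fn v4_as_hex_prefix(addr: Ipv4Addr, len: u8) -> String {
    let [a, b, c, d] = addr.octets();
    format!("{:02x}{:02x}:{:02x}{:02x}::/{}", a, b, c, d, len)
}

fn main() {
    let addr: Ipv4Addr = "91.206.110.0".parse().unwrap();
    // 91 = 0x5b, 206 = 0xce, 110 = 0x6e, 0 = 0x00
    println!("{}", v4_as_hex_prefix(addr, 23)); // prints "5bce:6e00::/23"
}
```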
<p>We can also confirm the outage of the prefix with RIPEstat’s <a target="_blank" href="https://stat.ripe.net/widget/routing-history#w.resource=91.206.110.0/23&amp;w.starttime=2022-04-25T00:00:00&amp;w.endtime=2022-05-02T00:00:00">Routing History data widget</a>. (uncheck the <code>No low visibility</code> box to reveal the outage).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107934897/b052b200-7cde-411b-b66f-01a2b0a133ed.png" alt="KhersonTelecom Outage and Connectivity Change" /></p>
<h2 id="heading-bgp-messages">BGP Messages</h2>
<p>We can visualize the overall BGP announcements volume with <a target="_blank" href="https://radar.cloudflare.com/asn/47598?date_filter=last_7_days">Cloudflare Radar’s AS-level page</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107935965/49d3f261-a859-45da-a31f-5d6dc92ad3aa.png" alt="KhersonTelecom Outage and Connectivity Change" /></p>
<p>The following are the UTC timestamps for the corresponding BGP message spikes.</p>
<ul>
<li><code>2022-04-30T16:10:00</code>: announcements with old provider <code>12883</code> in the paths.</li>
<li><code>2022-05-01T16:00:00</code>: announcements with the new provider <code>201776</code> in the paths.</li>
<li><code>2022-05-03T10:45:00</code>: similar announcements with a new provider in the paths.</li>
</ul>
<p>During the first gap (2022-04-30T16:15:00 to 2022-05-01T16:00:00), there were no BGP updates for the prefix or from the ASN.</p>
<p>The old provider paths look like the following, where <code>AS12883</code> is the next hop for <code>AS47598</code>. See the full list of messages from <code>rrc00</code> here: <a target="_blank" href="https://gist.github.com/digizeph/c58b77f755d7fec8a7969807fb17d5ba">https://gist.github.com/digizeph/c58b77f755d7fec8a7969807fb17d5ba</a>.</p>
<pre><code class="lang-plaintext">207564 56655 3257 12883 47598
</code></pre>
<p>The new provider paths look like the following, where <code>AS12389</code> and <code>AS201776</code> are the next hops toward <code>AS47598</code>. See the full list of messages from <code>rrc00</code> here: <a target="_blank" href="https://gist.github.com/digizeph/896a4a7e4de23082b496b92ab5bdab5b">https://gist.github.com/digizeph/896a4a7e4de23082b496b92ab5bdab5b</a></p>
<pre><code class="lang-plaintext">207564 28824 28824 1299 12389 201776 47598
</code></pre>
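<p>The upstream change can be read mechanically off such paths: the AS immediately preceding the origin is the origin's upstream from that vantage point. A small helper (a sketch; it assumes whitespace-separated paths of plain ASNs, no AS sets) extracts it:</p>

```rust
/// Return the AS immediately preceding `origin` in a whitespace-separated
/// AS-path string, i.e. the origin's upstream as seen from this vantage point.
fn upstream_of(path: &str, origin: u32) -> Option<u32> {
    let asns: Vec<u32> = path
        .split_whitespace()
        .filter_map(|s| s.parse().ok())
        .collect();
    // Find the origin, then take the AS right before it (if any).
    let pos = asns.iter().position(|&a| a == origin)?;
    if pos == 0 { None } else { Some(asns[pos - 1]) }
}

fn main() {
    // AS paths taken from the rrc00 message dumps linked above.
    let old_path = "207564 56655 3257 12883 47598";
    let new_path = "207564 28824 28824 1299 12389 201776 47598";
    println!("old upstream: {:?}", upstream_of(old_path, 47598)); // Some(12883)
    println!("new upstream: {:?}", upstream_of(new_path, 47598)); // Some(201776)
}
```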
<h2 id="heading-bgp-data-tooling">BGP Data Tooling</h2>
<p>The analysis is done using a privately hosted open-source BGPKIT parser web API. You can host it on your infrastructure, and the source code is freely available at <a target="_blank" href="https://github.com/bgpkit/pybgpkit-api">https://github.com/bgpkit/pybgpkit-api</a>. Comments and feedback are welcome!</p>
<hr />
<h3 id="heading-update-on-2022-05-04t092000-pacific">Update on 2022-05-04T09:20:00 Pacific</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107937096/a9b672b5-10ab-4ab1-94c5-29985a49de0d.png" alt="KhersonTelecom Outage and Connectivity Change" /></p>
<blockquote>
<p><a target="_blank" href="https://twitter.com/hashtag/Kherson?src=hash&amp;ref_src=twsrc%5Etfw">#Kherson</a> — Internet connectivity is returning to the occupied city in South of <a target="_blank" href="https://twitter.com/hashtag/Ukraine?src=hash&amp;ref_src=twsrc%5Etfw">#Ukraine</a>, after an outage since Saturday. <a target="_blank" href="https://twitter.com/Cloudflare?ref_src=twsrc%5Etfw">@Cloudflare</a> data shows growth in requests since 04:15 UTC, and telecom connection was confirmed by the Ukrainian Vice PM <a target="_blank" href="https://twitter.com/FedorovMykhailo?ref_src=twsrc%5Etfw">@FedorovMykhailo</a>. <a target="_blank" href="https://t.co/JxT3kcM234">pic.twitter.com/JxT3kcM234</a></p>
<p>— Cloudflare Radar (@CloudflareRadar) <a target="_blank" href="https://twitter.com/CloudflareRadar/status/1521812037055176705?ref_src=twsrc%5Etfw">May 4, 2022</a></p>
</blockquote>
<p>The upstreams for <code>AS47598</code> have reverted to the original ASes, and traffic has started returning to normal.</p>
]]></content:encoded></item><item><title><![CDATA[Parallel MRT Files Parsing with BGPKIT]]></title><description><![CDATA[In this post, we will talk about how to implement a Rust workflow that can process a large number of BGP data files as fast as we can. We will use BGPKIT Parser and Broker for data collection and parsing, and Rayon crate for parallelization of the co...]]></description><link>https://blog.bgpkit.com/parallel-mrt-files-parsing-with-bgpkit</link><guid isPermaLink="true">https://blog.bgpkit.com/parallel-mrt-files-parsing-with-bgpkit</guid><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Wed, 16 Mar 2022 04:50:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1713724265729/20e637ad-993e-42a2-9e62-d4a4996b41b9.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this post, we will talk about how to implement a Rust workflow that can process a <strong>large number</strong> of BGP data files <strong>as fast as we can</strong>. We will use BGPKIT <a target="_blank" href="https://bgpkit.com/parser">Parser</a> and <a target="_blank" href="https://bgpkit.com/broker">Broker</a> for data collection and parsing, and <a target="_blank" href="https://github.com/rayon-rs/rayon">Rayon</a> crate for parallelization of the code.</p>
<h2 id="heading-task-overview">Task Overview</h2>
<p>Before we begin talking about the code design, we first need to introduce the data we are dealing with. We want to process BGP data collected by various collectors, saved in compressed binary MRT format, and archived to files at a fixed interval. In this post, we use <a target="_blank" href="http://archive.routeviews.org/">RouteViews</a> archive data as an example. The average data file size ranges from 2 MB to 10 MB across collectors (the <a target="_blank" href="http://archive.routeviews.org/route-views.amsix/bgpdata/2021.10/UPDATES/">AMSIX collector</a>, for example, has pretty large files).</p>
<p>For processing, we use the simplest possible task: <em>sum the number of MRT records in all the files</em>. We want to download and process all the updates files for one hour from all the collectors in the RouteViews project. Here is the estimated amount of data we are dealing with:</p>
<ul>
<li><p>35 collectors</p>
</li>
<li><p>5-minute interval — 12 files per collector</p>
</li>
<li><p><strong>420 total number of files</strong> to download and process</p>
</li>
<li><p><strong>840MB to 4.2GB total download size</strong> (it’s somewhere in between)</p>
</li>
</ul>
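<p>These back-of-the-envelope numbers follow from a couple of multiplications, sketched here in Rust:</p>

```rust
/// Number of updates files produced by `collectors` collectors that each
/// dump one file every `interval_min` minutes, over `hours` hours.
fn total_files(collectors: u64, interval_min: u64, hours: u64) -> u64 {
    collectors * (hours * 60 / interval_min)
}

fn main() {
    let files = total_files(35, 5, 1);
    // Average file sizes range from roughly 2 MB to 10 MB.
    println!("{} files, {} MB to {} MB total", files, files * 2, files * 10);
}
```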
<p>Ok! Now that we know what we want to do and have a sense of the overall workload, let's get coding!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107940405/4f445f71-9d38-46fb-8b91-91d3ddff0c66.jpeg" alt /></p>
<p>Photo by <a target="_blank" href="https://unsplash.com/@glenncarstenspeters?utm_source=medium&amp;utm_medium=referral">Glenn Carstens-Peters</a> on <a target="_blank" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral">Unsplash</a></p>
<hr />
<h2 id="heading-1-sequential-parsing">1. Sequential Parsing</h2>
<p>Our first attempt is to design and implement a naive sequential workflow as described below:</p>
<ol>
<li><p>find all BGP updates files within the hour of interest</p>
</li>
<li><p>iterate through each file, parse the MRT data and count the number of records</p>
</li>
<li><p>sum all record counts and print out the result</p>
</li>
</ol>
<p>For this sequential workflow, we will need to pull in two dependencies into <code>Cargo.toml</code>:</p>
<pre><code class="lang-ini"><span class="hljs-section">[dependencies]</span>
<span class="hljs-attr">bgpkit-parser</span> = <span class="hljs-string">"0.7.2"</span>
<span class="hljs-attr">bgpkit-broker</span> = <span class="hljs-string">"0.3.2"</span>
</code></pre>
<p>The <code>bgpkit-broker</code> handles looking for updates files within the hour, while <code>bgpkit-parser</code> handles parsing each individual file.</p>
<h3 id="heading-finding-files">Finding files</h3>
<p>BGPKIT Broker indexes all available BGP MRT data archive files from both RouteViews and RIPE RIS in close-to-real-time. For each data file, it saves the following information:</p>
<ul>
<li><p><code>project</code>: <code>route-views</code> or <code>riperis</code></p>
</li>
<li><p><code>collector</code>: the collector ID, e.g. <code>rrc00</code> or <code>route-views2</code></p>
</li>
<li><p><code>url</code>: the URL to the corresponding MRT file</p>
</li>
<li><p><code>timestamp</code>: the UNIX time of the <em>start time</em> of the MRT data file.</p>
</li>
</ul>
<p>With all this information indexed, we can query the backend and retrieve file information as needed. BGPKIT Broker provides a <a target="_blank" href="https://docs.broker.bgpkit.com">RESTful API</a>, a <a target="_blank" href="https://github.com/bgpkit/bgpkit-broker">Rust API</a>, and a <a target="_blank" href="https://pypi.org/project/pybgpkit/">Python API</a>. Here we use the Rust API to pull in the information we need:</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> broker = BgpkitBroker::new_with_params(
    <span class="hljs-string">"https://api.broker.bgpkit.com/v1"</span>,
    QueryParams {
        start_ts: <span class="hljs-literal">Some</span>(<span class="hljs-number">1640995200</span>),
        end_ts: <span class="hljs-literal">Some</span>(<span class="hljs-number">1640998799</span>),
        project: <span class="hljs-literal">Some</span>(<span class="hljs-string">"route-views"</span>.to_string()),
        data_type: <span class="hljs-literal">Some</span>(<span class="hljs-string">"update"</span>.to_string()),
        ..<span class="hljs-built_in">Default</span>::default()
    });

<span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> &amp;broker {
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"processing {:?}..."</span>, &amp;item);
}
</code></pre>
<p>The above block queries the broker and prints out the metadata of all retrieved files. The <code>BgpkitBroker::new_with_params</code> call accepts two parameters: the endpoint of the broker instance and the filtering criteria. In this example, we search for all BGP updates files from RouteViews with timestamps between <code>2022-01-01T00:00:00</code> and <code>2022-01-01T00:59:59</code> UTC. It prints output like the following:</p>
<pre><code class="lang-plaintext">processing BrokerItem { collector_id: "route-views.telxatl", timestamp: 1640997000, data_type: "update", url: "http://archive.routeviews.org/route-views.telxatl/bgpdata/2022.01/UPDATES/updates.20220101.0030.bz2" }...
processing BrokerItem { collector_id: "route-views.uaeix", timestamp: 1640997000, data_type: "update", url: "http://archive.routeviews.org/route-views.uaeix/bgpdata/2022.01/UPDATES/updates.20220101.0030.bz2" }...
processing BrokerItem { collector_id: "route-views.wide", timestamp: 1640997000, data_type: "update", url: "http://archive.routeviews.org/route-views.wide/bgpdata/2022.01/UPDATES/updates.20220101.0030.bz2" }...
processing BrokerItem { collector_id: "route-views2", timestamp: 1640997900, data_type: "update", url: "http://archive.routeviews.org/bgpdata/2022.01/UPDATES/updates.20220101.0045.bz2" }...
</code></pre>
<h3 id="heading-parse-each-mrt-file">Parse each MRT file</h3>
<p>Previously, in the for loop, we only printed out the retrieved metadata of the MRT files. Now let's add the actual parsing of the files into the loop. By design, the code is very simple:</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> sum: <span class="hljs-built_in">usize</span> = <span class="hljs-number">0</span>;
<span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> &amp;broker {
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"processing {}..."</span>, &amp;item.url);
    <span class="hljs-keyword">let</span> parser = BgpkitParser::new(&amp;item.url).unwrap();
    <span class="hljs-keyword">let</span> count = parser.into_record_iter().count();
    sum += count;
}
</code></pre>
<p>We first define a mutable variable <code>sum</code> outside the loop. Then, for each file, we create a new parser instance with <code>BgpkitParser::new(&amp;item.url)</code>. Since our goal is to count the number of records, we call the parser's <code>.into_record_iter()</code> function to create an iterator over the records of the file, and then <code>.count()</code> to get the number of records. Lastly, we add the count to the overall <code>sum</code> variable.</p>
<h3 id="heading-run-and-timing">Run and timing</h3>
<p>For testing, I use a fairly powerful VM on a host with an AMD 3950X CPU (32 threads), then build the release build and time the release run. The runtime includes downloading the MRT files to my machine over a 1 Gbps downlink in Southern California.</p>
<pre><code class="lang-bash">cargo build --release
time cargo run --release --bin sequential
</code></pre>
<p>It ended up taking about <strong>1 minute and 23 seconds</strong> to sequentially parse all 144 MRT updates files available from RouteViews for the first hour of 2022 (UTC).</p>
<pre><code class="lang-plaintext">total number of records for 144 files is 10554212

real    1m23.081s
user    0m39.535s
sys     0m1.006s
</code></pre>
<h2 id="heading-2-parallel-parsing">2. Parallel Parsing</h2>
<p>Since the parsing of each file is completely independent of each other, we can parse the files in parallel and then sum up the count for each thread at the end. In Rust with Rayon, this conversion is very simple.</p>
<p>Let's first add Rayon as a dependency:</p>
<pre><code class="lang-ini"><span class="hljs-section">[dependencies]</span>
<span class="hljs-attr">bgpkit-parser</span> = <span class="hljs-string">"0.7.2"</span>
<span class="hljs-attr">bgpkit-broker</span> = <span class="hljs-string">"0.3.2"</span>
<span class="hljs-attr">rayon</span> = <span class="hljs-string">"1.5.1"</span>
</code></pre>
<p>Then we change the broker code slightly to first collect the metadata of all files into a vector.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> items = broker.into_iter().collect::&lt;<span class="hljs-built_in">Vec</span>&lt;BrokerItem&gt;&gt;();
</code></pre>
<p>This enables us to fully utilize <code>rayon</code>'s syntactic sugar to turn our sequential code into a parallel one.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> sum: <span class="hljs-built_in">usize</span> = items.par_iter().map(|item| {
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"processing {}..."</span>, &amp;item.url);
    <span class="hljs-keyword">let</span> parser = BgpkitParser::new(&amp;item.url).unwrap();
    <span class="hljs-keyword">let</span> count = parser.into_record_iter().count();
    count
}).sum();
</code></pre>
<p>The key difference here is the call to <code>.par_iter()</code>. It turns a sequential iterator into a parallel iterator, which by default utilizes all available cores on the host machine for scheduling. We then call <code>.map()</code> to define the parsing steps for each file, and <code>.sum()</code> at the end to add all the results up.</p>
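Rayon hides the mechanics, but the computation it schedules is a classic fan-out/fan-in: split the work, process pieces concurrently, then add up the partial results. As a rough illustration using only the standard library (the `parallel_sum` helper and the choice of four chunks are made up for this sketch, not part of rayon or the example above):

```rust
use std::thread;

// The same fan-out/fan-in shape that `par_iter().map(...).sum()` provides,
// sketched with only the standard library: split the work into chunks,
// process each chunk on its own thread, then add up the partial sums.
fn parallel_sum(items: &[u64]) -> u64 {
    // Aim for roughly four chunks; `max(1)` guards against empty input.
    let chunk_size = ((items.len() + 3) / 4).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}
```

With rayon, all of the chunking, spawning, and joining above collapses into the single `.par_iter()` call.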
<p>The final result is approximately <strong>10x faster</strong> than the sequential version, and it took only <strong>8 seconds</strong> to parse all MRT files and get the record counts.</p>
<pre><code class="lang-plaintext">total number of records is 10554212

real    0m8.086s
user    0m42.569s
sys     0m1.068s
</code></pre>
<hr />
<p>The full source code of the two examples is available on GitHub. Feel free to poke around and tweak it as you wish.</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="c23ba39968c6cb4e1ad323520540010f"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/digizeph/c23ba39968c6cb4e1ad323520540010f" class="embed-card">https://gist.github.com/digizeph/c23ba39968c6cb4e1ad323520540010f</a></div><p> </p>
<p><a target="_blank" href="https://github.com/bgpkit/bgpkit-tutorials/tree/main/parallel-parsing?ref=blog.bgpkit.com">https://github.com/bgpkit/bgpkit-tutorials/tree/main/parallel-parsing</a></p>
]]></content:encoded></item><item><title><![CDATA[Real-time RIS Live Data with BGPKIT Parser]]></title><description><![CDATA[In terms of real-time BGP data processing, RIPE NCC provides a great data source: Routing Information Service Live (RIS Live).
To begin with, here is what RIS Live by the creators:

RIS Live is a feed that offers BGP messages in real-time. It collect...]]></description><link>https://blog.bgpkit.com/real-time-bgp-data-processing-2-ris-live</link><guid isPermaLink="true">https://blog.bgpkit.com/real-time-bgp-data-processing-2-ris-live</guid><category><![CDATA[Tutorial]]></category><category><![CDATA[sdk]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Fri, 12 Nov 2021 20:27:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107948155/2527e18c-e4f9-46f1-90e0-1a697b83fc81.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In terms of real-time BGP data processing, <a target="_blank" href="https://www.ripe.net/">RIPE NCC</a> provides a great data source: <a target="_blank" href="https://ris-live.ripe.net/">Routing Information Service Live (RIS Live)</a>.</p>
<p>To begin with, here is how <a target="_blank" href="https://ris-live.ripe.net/">RIS Live</a> is described by its creators:</p>
<blockquote>
<p>RIS Live is a feed that offers BGP messages in real-time. It collects information from the RIS Route Collectors (RRCs) and uses a WebSocket JSON API to monitor and detect routing events around the world. A non-interactive full stream (“firehose”) is also available.</p>
</blockquote>
<p>In essence, RIS Live provides:</p>
<ul>
<li><p>a WebSocket interface to stream BGP messages in real-time</p>
</li>
<li><p>the ability to subscribe to “sub-streams” with custom filters</p>
</li>
<li><p>JSON-encoded BGP messages as the stream payload</p>
</li>
<li><p>a “firehose” HTTPS stream interface, for clients that do not want to work with WebSocket.</p>
</li>
</ul>
<p>In this post, we will discuss how to use the RIS Live stream in practice.</p>
<h2 id="heading-ris-live-message-format">RIS Live Message Format</h2>
<p>RIS Live has <em>client messages</em> and <em>server messages</em>.</p>
<p>Client messages are used to set up or tear down “subscriptions”, which essentially tell the server what kind of BGP messages a client would like to receive, allowing the server to send only the messages of interest to the client.</p>
<p>A server acknowledges the requests from the client and afterwards starts streaming the requested data back to the client. At a high level, a server sends either <code>ris_message</code> or <code>ris_error</code> messages. The <code>ris_message</code> is the main payload that we are interested in, while the <code>ris_error</code> message provides debugging information for the scenarios where a stream or subscription fails.</p>
<h2 id="heading-subscribe-to-a-websocket-stream">Subscribe to a WebSocket Stream</h2>
<p>RIS Live provides great flexibility for clients to specify or narrow down the messages of interest, allowing both the server and the client to process fewer messages during a streaming session:</p>
<ul>
<li><p><code>host</code>: only messages collected from a particular RRC (e.g. <code>rrc21</code>)</p>
</li>
<li><p><code>type</code>: only messages of a given type, e.g. <code>UPDATE</code> , <code>OPEN</code></p>
</li>
<li><p><code>require</code> : only messages containing a given key, e.g. <code>withdrawals</code> will return only message that contains any withdrawn prefixes</p>
</li>
<li><p><code>peer</code>: messages from a particular BGP peer</p>
</li>
<li><p><code>path</code> : ASN or pattern to match the AS Path attribute in BGP update messages</p>
</li>
<li><p><code>prefix</code>: only messages containing information for a given prefix</p>
</li>
<li><p><code>moreSpecific</code> and <code>lessSpecific</code>: only messages that are the subprefix or super-prefix of the specified prefix</p>
</li>
<li><p><code>includeRaw</code>: whether to include the Base64-encoded RAW BGP messages</p>
</li>
</ul>
<p><img src="https://miro.medium.com/max/968/1*klDmwtEfQyda7HAAaG9PhA.png" alt="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107947080/2021b63c-f4b9-42b1-9ee7-34403e220ebf.png" /></p>
<p>Example subscription message composer on RIS Live official site</p>
<p>As an example, let’s take a look at the following message from the official manual:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"host"</span>: <span class="hljs-string">"rrc01"</span>,
  <span class="hljs-attr">"type"</span>: <span class="hljs-string">"UPDATE"</span>,
  <span class="hljs-attr">"require"</span>: <span class="hljs-string">"announcements"</span>,
  <span class="hljs-attr">"path"</span>: <span class="hljs-string">"64496,64497$"</span>
}
</code></pre>
<p>This subscription matches messages that are:</p>
<ul>
<li><p>collected by <code>rrc01</code></p>
</li>
<li><p>BGP UPDATE messages</p>
</li>
<li><p>have at least one announced prefix</p>
</li>
<li><p>the last two hops of the AS Path are 64496 and 64497 (the origin)</p>
</li>
</ul>
<p>The <code>ris_message</code> consists of “common header” fields and “data” fields (although they appear at the same JSON level).</p>
<p>The “common header” fields are present for all message sub-types, including <code>timestamp</code>, <code>peer</code>, <code>peer_asn</code>, <code>id</code>, <code>host</code>, and <code>type</code>. The rest are data fields that depend on the type of the message. For most people, the <code>UPDATE</code> message is what they need. The following JSON block is an example message pulled directly from the demo site.</p>
<p>Example JSON formatted RIS message:</p>
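The embedded example is not reproduced here, so below is a hand-assembled sketch of the shape of a `ris_message`, built only from the fields listed above and the announcement described next. Treat every value as a placeholder except the next hop, prefixes, and origin ASN, which come from the description below; the upstream hop `64511` is a documentation-range ASN, and the real field set on the live feed may differ.

```json
{
  "type": "ris_message",
  "data": {
    "timestamp": 1636745000.0,
    "peer": "placeholder-peer-ip",
    "peer_asn": "placeholder-peer-asn",
    "id": "placeholder-message-id",
    "host": "rrc21",
    "type": "UPDATE",
    "path": [64511, 132354],
    "announcements": [
      {
        "next_hop": "37.49.237.228",
        "prefixes": ["103.249.208.0/23", "103.14.184.0/24"]
      }
    ]
  }
}
```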
<p>This example shows a BGP announcement of AS132354 originating two prefixes <code>103.249.208.0/23</code> and <code>103.14.184.0/24</code> , with the next hop to be <code>37.49.237.228</code>. At this point, the information we see here is pretty similar to what we can see from other BGP MRT reader’s output (e.g. from <code>bgpdump</code> or <code>bgpreader</code>), just in JSON format.</p>
<h2 id="heading-websocket-or-firehose">WebSocket or Firehose?</h2>
<p>Given that RIS Live provides both WebSocket and HTTP firehose interfaces, one would naturally wonder which one is the right choice for their application. Here is a brief comparison between the two in the context of RIS Live.</p>
<p><strong>WebSocket</strong></p>
<p>Good:</p>
<ul>
<li><p>easy to customize stream by composing a simple JSON subscribe message</p>
</li>
<li><p>works with existing WebSocket tooling in languages like Python and JavaScript</p>
</li>
</ul>
<p>Bad:</p>
<ul>
<li><p>requires extra library dependencies to work with WebSocket</p>
</li>
<li><p>need to write somewhat lengthy code to get started (compared to the firehose)</p>
</li>
</ul>
<p><strong>Firehose</strong></p>
<p>Good:</p>
<ul>
<li><p>easy to consume by simply calling GET request on the URL</p>
</li>
<li><p>a one-line command (e.g. a simple <code>curl</code> call) can start the stream; no complex script needed</p>
</li>
</ul>
<p>Bad:</p>
<ul>
<li><p>customizing stream is doable with <code>XRIS-SUBSCRIBE</code> HTTP request header, but feels clunky and limited</p>
</li>
<li><p>in my personal tests, the stream got disconnected often because the client could not keep up with the data producer; this did not happen in the WebSocket tests.</p>
</li>
</ul>
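To illustrate the one-liner point above, starting the firehose can be as simple as a single `curl` call. The endpoint path below is the one documented on the RIS Live site at the time of writing (verify it before relying on it); `--max-time` and `head` are added here only to bound the otherwise endless demo stream:

```shell
# Stream JSON-encoded BGP messages from the RIS Live firehose.
# --max-time 10 stops the stream after 10 seconds (demo only);
# head prints just the first few messages.
curl -s --max-time 10 "https://ris-live.ripe.net/v1/stream/?format=json" | head -n 3
```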
<h3 id="heading-summary">Summary</h3>
<p>If your application can afford additional dependencies and some extra code, WebSocket is the better choice. The official RIS Live manual also implies that WebSocket is the currently formally-supported streaming method.</p>
<hr />
<h2 id="heading-ris-live-coding-example-with-bgpkit-parser">RIS Live Coding Example with BGPKIT Parser</h2>
<p><img src="https://images.unsplash.com/photo-1534665482403-a909d0d97c67?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=MnwxMTc3M3wwfDF8c2VhcmNofDIyfHxjb2Rpbmd8ZW58MHx8fHwxNjQ3Mjg2MDUz&amp;ixlib=rb-1.2.1&amp;q=80&amp;w=2000" alt="Person coding with MacBook Pro" /></p>
<p>Now that we have a basic idea of what RIS Live is and its basic message format, we can get started on some code that actually uses RIS Live to do something useful.</p>
<p>In the following example, we will build a small monitoring service that alerts us when Facebook's operators announce their DNS IP prefixes (see what happened before <a target="_blank" href="https://blog.cloudflare.com/october-2021-facebook-outage/">here</a>). We are going to build the service in Rust with <a target="_blank" href="https://bgpkit.com/parser">BGPKIT Parser</a> and the WebSocket library <a target="_blank" href="https://github.com/snapview/tungstenite-rs">Tungstenite</a>.</p>
<p>First, let's collect some basic information about what we are going to monitor:</p>
<ol>
<li><p>Facebook’s autonomous system number is 32934. So we will watch for all messages originated from AS32934.</p>
</li>
<li><p>Facebook’s DNS server IP prefixes involved in the previous incident are <code>129.134.30.0/23</code> and <code>185.89.218.0/23</code>. So we want to carefully watch these two prefixes in our monitoring system.</p>
</li>
<li><p>We want to use one of the RIPE RIS collectors’ data for monitoring; <code>rrc21</code> is a good choice since it is the collector used in RIS Live’s demonstration. You can easily extend this service by tweaking the subscription message later.</p>
</li>
</ol>
<p>OK, we are good to go. Let’s do it!</p>
<h2 id="heading-setting-up-the-stream">Setting up the stream</h2>
<p>We picked the Tungstenite library as our WebSocket library of choice, partly because it has a very straightforward API design.</p>
<p>Let’s first connect to the WebSocket server by calling the <code>connect</code> function with a WebSocket URL. One thing to note is that the URL scheme here is <code>ws</code> as opposed to the <code>wss</code> mentioned in the RIS Live documentation. For some reason, Tungstenite did not work with the <code>wss</code> protocol (over SSL) in our setup.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> tungstenite::{connect, Message};
<span class="hljs-keyword">use</span> url::Url;

<span class="hljs-keyword">const</span> RIS_LIVE_URL: &amp;<span class="hljs-built_in">str</span> = <span class="hljs-string">"ws://ris-live.ripe.net/v1/ws/?client=rust-bgpkit-parser"</span>;
<span class="hljs-keyword">let</span> (<span class="hljs-keyword">mut</span> socket, _response) =
    connect(Url::parse(RIS_LIVE_URL).unwrap())
    .expect(<span class="hljs-string">"Can't connect to RIS Live websocket server"</span>);
</code></pre>
<p>Now, with a socket ready, we will first send a subscription message to let server know that we want some messages and we are ready to receive.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> msg = json!({<span class="hljs-string">"type"</span>: <span class="hljs-string">"ris_subscribe"</span>, <span class="hljs-string">"data"</span>: {<span class="hljs-string">"host"</span>: <span class="hljs-string">"rrc21"</span>}}).to_string();
socket.write_message(Message::Text(msg)).unwrap();
</code></pre>
<p>Here we composed a simple subscription message that limits the stream to messages from the <code>rrc21</code> collector only.</p>
<h2 id="heading-parsing-json-messages">Parsing JSON messages</h2>
<p>At this point, we have a WebSocket connection to RIS Live server, and have sent out a subscription message to the server. The server should be sending back messages anytime now, and we are ready to consume the stream.</p>
<p>We would like to code the following behavior:</p>
<ol>
<li><p>continuously reading the websocket messages;</p>
</li>
<li><p>parse JSON string into internal BGP structs;</p>
</li>
<li><p>check whether each message contains origins (withdrawal-only messages do not contain AS paths, and thus no origins either);</p>
</li>
<li><p>if the origin AS is AS32934, and the announced prefix is <code>129.134.30.0/23</code> or <code>185.89.218.0/23</code> , then we print out the message to output.</p>
</li>
</ol>
<pre><code class="lang-rust"><span class="hljs-keyword">loop</span> {
    <span class="hljs-keyword">let</span> msg = socket.read_message().expect(<span class="hljs-string">"Error reading message"</span>).to_string();
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Ok</span>(elems) = parse_ris_live_message(msg.as_str()) {
        <span class="hljs-keyword">for</span> elem <span class="hljs-keyword">in</span> elems {
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Some</span>(origins) = elem.origin_asns.as_ref() {
                <span class="hljs-keyword">if</span> origins.contains(&amp;<span class="hljs-number">32934</span>) &amp;&amp;
                    ( elem.prefix.to_string() == <span class="hljs-string">"129.134.30.0/23"</span> ||
                        elem.prefix.to_string() == <span class="hljs-string">"185.89.218.0/23"</span> )
                {
                    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"{}"</span>, elem);
                }
            }
        }
    }
}
</code></pre>
<p>The full example code can be found here:</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="fcac3027555c0b744ea0b3a11197b694"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/digizeph/fcac3027555c0b744ea0b3a11197b694" class="embed-card">https://gist.github.com/digizeph/fcac3027555c0b744ea0b3a11197b694</a></div><p> </p>
<hr />
<h2 id="heading-building-more-with-bgpkit-tools">Building More with BGPKIT Tools</h2>
<p>As introduced in our <a target="_blank" href="https://blog.bgpkit.com/real-time-bgp-data-processing-1-bmp-and-openbmp-9d9ac142846a">previous blog post</a>, we added support of real-time BMP stream to BGPKIT Parser as well. Combining with RIPE RIS Live, and RouteViews BMP stream, we can build a powerful real-time BGP monitoring service directly within BGPKIT Parser. We also offer indexing and processing of historical BGP data as well with <a target="_blank" href="https://bgpkit.com/broker">BGPKIT Broker</a>.</p>
<p>Our goal at BGPKIT is to design, develop, and deploy the most developer-friendly BGP data processing toolkit. To learn more about our offerings, please check out our website and official <a target="_blank" href="https://twitter.com/bgpkit">Twitter account</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Real-time BMP with BGPKIT Parser]]></title><description><![CDATA[Real-time BGP data processing is very critical on building monitoring services that can detect BGP issues quickly with minimum delay and react to anomalies quickly and mitigate potential issues.
We are creating a new series of posts describing how we...]]></description><link>https://blog.bgpkit.com/real-time-bgp-data-processing-1-bmp-and-openbmp</link><guid isPermaLink="true">https://blog.bgpkit.com/real-time-bgp-data-processing-1-bmp-and-openbmp</guid><category><![CDATA[Tutorial]]></category><category><![CDATA[sdk]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Wed, 10 Nov 2021 20:13:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107952280/8c41a036-afd9-42b5-8965-4e123ce1cdf5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Real-time BGP data processing is very critical on building monitoring services that can detect BGP issues quickly with minimum delay and react to anomalies quickly and mitigate potential issues.</p>
<p>We are creating a new series of posts describing how we design our software to work with real-time BGP data streams. As an opening, we will describe how we handle data streams with BMP protocol and OpenBMP messages.</p>
<hr />
<h1 id="heading-bmp">BMP</h1>
<p>The BGP Monitoring Protocol (BMP) is a protocol that allows monitoring of BGP devices.</p>
<p>The <a target="_blank" href="https://datatracker.ietf.org/doc/html/rfc7854">RFC7854</a> describes the purpose of BMP as:</p>
<blockquote>
<p>Many researchers and network operators wish to have access to the contents of routers’ BGP Routing Information Bases (RIBs) as well as a view of protocol updates the router is receiving. This monitoring task cannot be realized by standard protocol mechanisms. Prior to the introduction of BMP, this data could only be obtained through screen scraping.</p>
<p>BMP provides access to the Adj-RIB-In of a peer on an ongoing basis and a periodic dump of certain statistics the monitoring station can use for further analysis. From a high level, BMP can be thought of as the result of multiplexing together the messages received on the various monitored BGP sessions.</p>
</blockquote>
<p>There are multiple types of BMP messages, each serving different purposes.</p>
<ul>
<li><p>Peer up and down notification: notification about the status of peering sessions to a monitored router;</p>
</li>
<li><p>Initiation message: informs the monitoring station of the router's vendor, software version, and so on;</p>
</li>
<li><p>Termination message: provides information on why a monitored router is terminating a session;</p>
</li>
<li><p><strong>Route monitoring</strong>: initial synchronization of the routing table;</p>
</li>
<li><p><strong>Route mirroring</strong>: verbatim duplication of messages as received.</p>
</li>
</ul>
<p>For real-time BGP data processing, we are specifically interested in the route monitoring and route mirroring messages, as they provide the routing information encoded as actual BGP messages.</p>
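To make this more concrete, every BMP message starts with a small common header defined in RFC 7854 (Section 4.1): a 1-byte version, a 4-byte big-endian message length, and a 1-byte message type (route monitoring is type 0). The sketch below decodes it with just the standard library; `parse_bmp_common_header` is a hypothetical helper for illustration, not BGPKIT Parser's API:

```rust
// Decode the 6-byte BMP common header (RFC 7854 §4.1):
// version (u8), total message length (u32, big-endian), message type (u8).
fn parse_bmp_common_header(buf: &[u8]) -> Option<(u8, u32, u8)> {
    if buf.len() < 6 {
        return None; // not enough bytes for a common header
    }
    let version = buf[0];
    let length = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]);
    let msg_type = buf[5]; // 0 = Route Monitoring, 4 = Initiation, ...
    Some((version, length, msg_type))
}
```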
<hr />
<h1 id="heading-openbmp">OpenBMP</h1>
<p><a target="_blank" href="https://www.openbmp.org/">OpenBMP</a> is a software implementation of the BMP protocol. It is an open-source project created by Cisco and currently maintained by nice folks from <a target="_blank" href="https://www.caida.org/">CAIDA/UCSD</a> and <a target="_blank" href="http://routeviews.org/">RouteViews</a>. It is implemented in C++ and can be used with any compliant BMP sender (e.g., a router).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107951150/87df6776-6cc7-4c6f-9cc6-7d5778df00c5.png" alt /></p>
<p>Architecture graph of OpenBMP</p>
<p>OpenBMP provides multiple formats for outputting the BMP messages collected from the connected routers, one of which is the <code>raw_bmp</code> format, a thin wrapper around the raw BMP messages. The <code>raw_bmp</code> format provides the best performance and allows us to handle the BMP messages directly without having to write a different parser for the plaintext messages.</p>
<p>RouteViews currently provides an OpenBMP Kafka stream that carries BMP messages from their collectors.</p>
<hr />
<h1 id="heading-bgpkit-parser-with-bmpopenbmp-support">BGPKIT Parser with BMP/OpenBMP Support</h1>
<p>We develop BGPKIT Parser to provide a one-stop solution for handling all parsing tasks regarding BGP data. Supporting real-time data like BMP is a very important milestone for us.</p>
<p>We have recently developed the full support for BMP messages, and partial support for OpenBMP messages (for <code>raw_bmp</code> type only). This enables us to start working with real-time BMP streams like RouteViews’ Kafka stream.</p>
<p>Below is example code that takes RouteViews’ Kafka OpenBMP stream and parses the messages into internal data structures:</p>
<pre><code class="lang-rust"><span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> reader = Cursor::new(<span class="hljs-built_in">Vec</span>::from(kafka_payload));
<span class="hljs-keyword">let</span> header = parse_openbmp_header(&amp;<span class="hljs-keyword">mut</span> reader).unwrap();
<span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Ok</span>(msg) = parse_bmp_msg(&amp;<span class="hljs-keyword">mut</span> reader) {
    info!(<span class="hljs-string">"Parsing OK: {:?}"</span>, msg.common_header.msg_type);
    <span class="hljs-keyword">match</span> msg.message_body {
        MessageBody::RouteMonitoring(m) =&gt; {
            dbg!(m.bgp_update);
        }
        _ =&gt; {}
    }
}
</code></pre>
<p>Here is a breakdown of what it does:</p>
<ul>
<li><p>it first creates a bytes reader from the raw Kafka message payload;</p>
</li>
<li><p>then it parses the OpenBMP message header, which contains some basic information about the BMP session;</p>
</li>
<li><p>then it calls the <code>parse_bmp_msg</code> function to parse the embedded raw BMP message, and prints out the BGP update messages if the parsing is successful.</p>
</li>
</ul>
<p>Here is a full code example:</p>
<div class="gist-block embed-wrapper" data-gist-show-loading="false" data-id="fcac3027555c0b744ea0b3a11197b694"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a href="https://gist.github.com/digizeph/fcac3027555c0b744ea0b3a11197b694" class="embed-card">https://gist.github.com/digizeph/fcac3027555c0b744ea0b3a11197b694</a></div><p> </p>
<p>We have published the SDK on <a target="_blank" href="https://crates.io/crates/bgpkit-parser">crates.io</a> and <a target="_blank" href="https://github.com/bgpkit/bgpkit-parser">GitHub</a>. Feel free to check out the example code at <a target="_blank" href="https://github.com/bgpkit/bgpkit-parser/blob/main/examples/real-time-routeviews-kafka-openbmp.rs">examples/routeviews-kafka.rs</a> if you are interested.</p>
]]></content:encoded></item><item><title><![CDATA[Introducing BGPKIT Parser]]></title><description><![CDATA[BGPKIT Parser is an open-source Rust-based MRT/BGP data parser that takes a MRT formatted binary file and turns it into BGP messages. It is one of the most important building block software that enables BGP data processing and analysis tasks.
Design ...]]></description><link>https://blog.bgpkit.com/introducing-bgpkit-parser</link><guid isPermaLink="true">https://blog.bgpkit.com/introducing-bgpkit-parser</guid><category><![CDATA[sdk]]></category><category><![CDATA[Announcement]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Mon, 01 Nov 2021 20:02:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1713724369716/9c629b9d-8cd8-4696-81e8-bae1d01866c5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107955421/af004e2c-ca5b-4e17-9f59-7c8ac4aed58b.png" alt /></p>
<p>BGPKIT Parser is an open-source Rust-based MRT/BGP data parser that takes an <a target="_blank" href="https://datatracker.ietf.org/doc/html/rfc6396">MRT</a>-formatted binary file and turns it into BGP messages. It is one of the most important building blocks that enable BGP data processing and analysis tasks.</p>
<h2 id="heading-design-and-features">Design and Features</h2>
<p>As mentioned in our previous post <a target="_blank" href="https://medium.com/bgpkit/introducing-bgpkit-broker-b734dac4661e">introducing BGPKIT Broker</a>, the most used BGP data collection projects, RouteViews and RIPE RIS, both publish their collected BGP data in MRT format on their data platform. The BGPKIT Parser is designed to handle parsing tasks for these data sources.</p>
<p>We design our parser to strictly follow the industry standard, e.g. <a target="_blank" href="https://datatracker.ietf.org/doc/html/rfc4271">RFC4271</a> and <a target="_blank" href="https://datatracker.ietf.org/doc/html/rfc6396">RFC6396</a>. BGPKIT Parser is also designed with the following goals:</p>
<ul>
<li><p><strong>performant</strong>: comparable to C-based implementations like <code>bgpdump</code> or <code>bgpreader</code>.</p>
</li>
<li><p><strong>actively maintained</strong>: we consistently introduce feature updates and bug fixes, and support most of the relevant BGP RFCs.</p>
</li>
<li><p><strong>ergonomic API</strong>: a three-line for loop can already get you started.</p>
</li>
<li><p><strong>battery-included</strong>: ready to handle remote or local, <code>bzip2</code> or <code>gz</code> data files out of the box.</p>
</li>
<li><p><strong>open-source</strong>: we want people to use our parser freely, and we can continue to develop and improve it based on community feedback.</p>
</li>
</ul>
<p>To demonstrate how easy it is to get started using BGPKIT Parser, check out the example below where we print out all BGP messages from a remote MRT file on RouteViews:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107956349/be057666-0f62-4a94-aa1a-07d39bd9ec93.png" alt /></p>
<p>Example of reading a remote MRT file from RouteViews and print out BGP messages. Code available at <a target="_blank" href="https://gist.github.com/digizeph/9977371653f39a459ff3ae507dc3636c">https://gist.github.com/digizeph/9977371653f39a459ff3ae507dc3636c</a></p>
<p>A number of things happen when we call <code>for elem in BgpkitParser::new(url)</code>:</p>
<ol>
<li><p>it creates a new BgpkitParser struct instance with the provided URL to the data file;</p>
</li>
<li><p>it tries to retrieve the content of the remote file and download the raw compressed bytes into memory;</p>
</li>
<li><p>it determines the compression type by file suffix and calls the corresponding decompression library to create a buffered reader;</p>
</li>
<li><p>it then creates an iterator (used by the for loop) that continuously returns newly parsed items until it reaches the end of the data stream.</p>
</li>
</ol>
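The suffix-based dispatch in step 3 can be sketched in a few lines of plain Rust. This is only an illustration of the idea; the `detect_compression` helper and `Compression` enum are invented here and are not BGPKIT Parser's actual internals:

```rust
// Pick a decompression strategy based on the file suffix, as in step 3 above.
#[derive(Debug, PartialEq)]
enum Compression {
    Bzip2, // ".bz2" files, e.g. RouteViews updates
    Gzip,  // ".gz" files, e.g. RIPE RIS dumps
    None,  // plain, uncompressed MRT
}

fn detect_compression(url: &str) -> Compression {
    if url.ends_with(".bz2") {
        Compression::Bzip2
    } else if url.ends_with(".gz") {
        Compression::Gzip
    } else {
        Compression::None
    }
}
```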
<p>We determined that it is worth the extra binary file size to bring in the network and compression libraries into the project so that the library users will never have to worry about handling data downloading and decompression by themselves again.</p>
<h2 id="heading-the-future">The Future</h2>
<p>The future of BGPKIT Parser lies in continuous performance and stability improvements, as well as some exciting features that we are currently planning. Some of the upcoming features include:</p>
<ol>
<li><p>adding the capability to handle <strong>real-time data streams</strong> coming from RIPE RIS Live and RouteViews’ Kafka BMP stream;</p>
</li>
<li><p>supporting <strong>data serialization</strong> back to MRT files (reverse-parsing), which allows users to produce customized MRT files after data processing;</p>
</li>
<li><p>adding <strong>WASM support</strong> to allow BGP data parsing directly on the web with JavaScript.</p>
</li>
</ol>
<p>Because we are building our software in Rust, we can effortlessly tap into Rust’s great software ecosystem and continue introducing new features and improvements. The future of BGPKIT Parser is exciting, and we can’t wait to bring more features for you to try out!</p>
<p>For more details about the BGPKIT Parser, check out our GitHub repo and our website. Feedback is highly appreciated!</p>
<p><a target="_blank" href="https://github.com/bgpkit/bgpkit-parser?ref=blog.bgpkit.com">https://github.com/bgpkit/bgpkit-parser</a></p>
]]></content:encoded></item><item><title><![CDATA[Introducing BGPKIT Broker]]></title><description><![CDATA[BGPKIT Broker is a data API service that focuses on building a BGP data file index to enable searching for public/private BGP data files with custom filters. It is one of the building block components designed by BGPKIT to facilitate BGP data process...]]></description><link>https://blog.bgpkit.com/introducing-bgpkit-broker</link><guid isPermaLink="true">https://blog.bgpkit.com/introducing-bgpkit-broker</guid><category><![CDATA[sdk]]></category><category><![CDATA[Announcement]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Sun, 31 Oct 2021 20:04:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1713724495767/66755057-6104-4d39-b6a0-224d3acde139.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>BGPKIT Broker is a data API service that focuses on building a BGP data file index to enable searching for public/private BGP data files with custom filters. It is one of the building block components designed by BGPKIT to facilitate BGP data processing with ease.</p>
<h2 id="heading-the-first-step-to-investigate-a-bgp-event">The first step to investigate a BGP event.</h2>
<p>Imagine this scenario: a malicious player has just attempted to hijack an IP prefix using BGP announcements, and you want to learn what exactly happened during the victim network’s half-hour downtime. How would you start investigating?</p>
<p><strong>Collecting evidence</strong>. Luckily for us, the actors on the Internet, good and bad, always leave traces behind if they use BGP as their method. A number of <strong>reputable public BGP route collectors</strong> have been operating for years, collecting all BGP messages received from their connected router peers and dumping them into regular dump files. The most widely used projects are <a target="_blank" href="http://routeviews.org/">RouteViews</a> and <a target="_blank" href="https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris/ris-raw-data">RIPE RIS Data</a>.</p>
<p><strong>Files are all over the place.</strong> There are more than 60 different data collectors from the two projects alone, and each publishes its data on a separate site. The two projects also use different data file structures and compression algorithms for their dump files. It is not hard to find one data file covering an event of interest from one collector, but it is a real hassle to gather URLs for <strong>all data files</strong> that include information within a specified time range.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107965030/0b1e9b43-53b2-4e84-8918-da35b467f39c.png" alt="Collector’s data is published as compressed MRT files at regular frequency." /></p>
<p>Collector’s data is published as compressed MRT files at regular frequency.</p>
<h2 id="heading-bgpkit-broker-a-bgp-data-file-index-api-service">BGPKIT Broker — A BGP Data File Index API Service</h2>
<p>We designed the BGPKIT Broker to solve one and only one problem: quickly collecting links to the BGP data files that match a given filtering criteria.</p>
<p>You can filter BGP data using multiple criteria:</p>
<ul>
<li><p><code>start_ts</code>: UNIX timestamp that all files must be dumped after</p>
</li>
<li><p><code>end_ts</code>: UNIX timestamp that all files must be dumped before</p>
</li>
<li><p><code>data_type</code>: the type of the data file, can be <code>update</code> or <code>rib</code></p>
</li>
<li><p><code>collector</code>: the collector ID that the files are generated from</p>
</li>
<li><p><code>project</code>: the data collection project, can be <code>route-views</code> or <code>riperis</code></p>
</li>
<li><p><code>page</code> and <code>page_size</code>: the pagination control for collecting a large number of files</p>
</li>
</ul>
<p>Here is an example REST API call: <a target="_blank" href="https://api.broker.bgpkit.com/v1/search?data_type=update&amp;start_ts=1633046400&amp;end_ts=1633132800&amp;collector=rrc00&amp;project=riperis&amp;page=2&amp;page_size=3">https://api.broker.bgpkit.com/v1/search?data_type=update&amp;start_ts=1633046400&amp;end_ts=1633132800&amp;collector=rrc00&amp;project=riperis&amp;page=2&amp;page_size=3</a></p>
<p>It asks for all <strong>updates</strong> files dumped between <strong>1633046400</strong> and <strong>1633132800</strong> from <strong>RIPE RIS</strong>’s collector <strong>rrc00</strong>. It also requests the <strong>second page</strong> of the results, with <strong>3 items per page</strong>.</p>
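<p>Building such a query URL programmatically is straightforward. Below is a minimal Rust sketch using only the standard library; the struct and function names are ours for illustration and are not part of any BGPKIT SDK:</p>

```rust
/// Filter parameters for a broker v1 search call; the field names
/// mirror the query parameters listed above. (Illustrative only.)
struct BrokerQuery {
    data_type: String,
    start_ts: i64,
    end_ts: i64,
    collector: String,
    project: String,
    page: u32,
    page_size: u32,
}

impl BrokerQuery {
    /// Render the filters as a full v1 search URL.
    fn to_url(&self) -> String {
        format!(
            "https://api.broker.bgpkit.com/v1/search?data_type={}&start_ts={}&end_ts={}&collector={}&project={}&page={}&page_size={}",
            self.data_type, self.start_ts, self.end_ts,
            self.collector, self.project, self.page, self.page_size
        )
    }
}

fn main() {
    // Same filters as the example call above: updates files from
    // RIPE RIS collector rrc00, second page, 3 items per page.
    let query = BrokerQuery {
        data_type: "update".to_string(),
        start_ts: 1633046400,
        end_ts: 1633132800,
        collector: "rrc00".to_string(),
        project: "riperis".to_string(),
        page: 2,
        page_size: 3,
    };
    println!("{}", query.to_url());
}
```

The resulting URL can be fetched with any HTTP client; the broker returns a JSON list of matching data file URLs.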
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107966233/628714ec-3287-41c0-9662-9705831d6746.png" alt /></p>
<p><strong>DEPRECATED</strong>: BGPKIT Broker API service is freely available to use, hosted at <a target="_blank" href="https://api.broker.bgpkit.com/v1/">https://api.broker.bgpkit.com/v1/</a>. The documentation is available at <a target="_blank" href="https://docs.broker.bgpkit.com/">https://docs.broker.bgpkit.com/</a></p>
<p><strong>The BGPKIT Broker API and SDK have been upgraded to V2. The V1 examples are left up for legacy services that depend on them. Please check out the current API documentation for more:</strong> <a target="_blank" href="https://api.broker.bgpkit.com/v2/"><strong>https://api.broker.bgpkit.com/v2/</strong></a></p>
<h2 id="heading-bgpkit-broker-rust-api">BGPKIT Broker Rust API</h2>
<p>The BGPKIT Broker API service is built entirely in Rust, and of course we also developed a native Rust API to access the broker data with ease.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107967522/678c704b-c588-4db7-9e68-ef380e72b263.png" alt /></p>
<p>Example BGPKIT Broker Rust API call</p>
<p>The Rust API is open source under the MIT license, and with the free API, you can already build your own workflow today!</p>
<p><a target="_blank" href="https://github.com/bgpkit/bgpkit-broker?ref=blog.bgpkit.com">https://github.com/bgpkit/bgpkit-broker</a></p>
<h2 id="heading-easy-on-premise-deployment">Easy On-premise Deployment</h2>
<p>For an API service like this, especially one that collects and indexes over 10 years of data, one might imagine that deployment would be complex and slow.</p>
<p>For BGPKIT Broker, we spent extra effort to make the API deployment process as quick as possible, and efficient in resource consumption. For reference, we bootstrapped the entire database in <strong>under 5 minutes</strong>, and it takes <strong>less than 1 GB of storage</strong> in a PostgreSQL database. The whole database and API run smoothly with <strong>less than 500 MB of RAM</strong>. This allows us to deploy extra instances under heavy load without costing a fortune.</p>
<p>Users who need dedicated resource allocation for query performance can contact us for private API hosting that is not shared with others. An enterprise option is also available for on-premise deployment, with customization consultation. If you are interested in testing it out, feel free to shoot us an email at contact@bgpkit.com.</p>
<p>For more information, check out our website!</p>
<p><a target="_blank" href="https://bgpkit.com/broker">https://bgpkit.com/broker</a></p>
]]></content:encoded></item><item><title><![CDATA[BGPKIT Journey Started]]></title><description><![CDATA[BGPKIT is a small-team start-up that aims to provide comprehensive tool suite to facilitate companies building on-premise BGP data monitoring services. We started our journey of building the best BGP data toolkit for developers in October, 2021. Here...]]></description><link>https://blog.bgpkit.com/bgpkit-journey-started</link><guid isPermaLink="true">https://blog.bgpkit.com/bgpkit-journey-started</guid><category><![CDATA[Announcement]]></category><dc:creator><![CDATA[Mingwei Zhang]]></dc:creator><pubDate>Tue, 26 Oct 2021 23:17:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107976858/2727128b-5d43-493a-8e71-ac6eb3a4eec8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a target="_blank" href="https://bgpkit.com/">BGPKIT</a> is a small-team start-up that aims to provide a comprehensive tool suite to help companies build on-premise BGP data monitoring services. We started our journey of building the best BGP data toolkit for developers in October 2021. Here is a brief glance at what we are working on and the values we strive to provide.</p>
<p><a target="_blank" href="https://bgpkit.com"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1677107976045/a522a54e-02a5-44a0-81e7-2124b5de428a.png" alt /></a></p>
<h1 id="heading-bgp">BGP</h1>
<p>Border Gateway Protocol (BGP) is the de facto inter-domain routing protocol used by every major company on the Internet. <strong>The main purpose of BGP is to allow companies to exchange IP prefix reachability information.</strong> In other words, companies tell other companies what IP blocks they have, and how to reach those IP blocks. This is the key functionality that enables the Internet.</p>
<p>Because the purpose of BGP is to exchange information and let everyone know how to reach certain IP blocks, BGP messages must be propagated globally and publicly. A number of data sources provide BGP data archives (e.g. <a target="_blank" href="http://www.routeviews.org/routeviews/">RouteViews</a> and <a target="_blank" href="https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris/ris-raw-data">RIPE RIS</a>), as well as real-time data like BGP <a target="_blank" href="https://lg.he.net/">looking glass</a>es or <a target="_blank" href="https://ris-live.ripe.net/">live BGP data streams</a>.</p>
<p>The aforementioned data sources contain a wealth of information if you know what to look for. For example, by looking at BGP information, researchers can <a target="_blank" href="https://www.caida.org/catalog/datasets/as-relationships/">infer companies’ relationships</a> and monitor <a target="_blank" href="https://www.bgpmon.net/">security</a> <a target="_blank" href="https://radar.qrator.net/">incidents</a>. Having a handy toolkit enables people to build successful businesses around BGP data.</p>
<h1 id="heading-bgpkit">BGPKIT</h1>
<p>At BGPKIT, we build software tools to process BGP data and reveal insights from BGP messages. We aim to provide the best developer experience and enable customers to build their own BGP data processing pipelines and monitoring services on-premise.</p>
<h2 id="heading-complete-tool-suite">Complete Tool Suite</h2>
<p>Our goal is to build a complete tool suite for BGP data processing: data collection, parsing, analysis, programmable and visual interfaces, and data warehousing. Everything you need to handle BGP data.</p>
<h2 id="heading-rust-implementation">Rust Implementation</h2>
<p>To achieve the best performance and security, we choose to focus on building our tools using the <a target="_blank" href="https://www.rust-lang.org/">Rust programming language</a>.</p>
<p>We believe that Rust’s ecosystem is now mature enough that building modern features like data streaming, async data workflows, parallel processing, web APIs, or even porting the entire codebase to WASM and running it in browsers is an achievable task with reasonable effort.</p>
<h2 id="heading-powerful-extensibility">Powerful Extensibility</h2>
<p>At BGPKIT, we design our libraries to provide powerful APIs and help customers customize workflows to meet individual needs. We strive to provide the most ergonomic interfaces, allowing library consumers to easily integrate our libraries into theirs.</p>
<h2 id="heading-embrace-open-source">Embrace Open-Source</h2>
<p>We also believe that good tools empower people, so we open-sourced our building-block libraries under a very permissive license, so that any interested party can explore and build their own ideas free of charge and limitation.</p>
<p>We are excited to start this journey of building, and we hope to have the chance to work with more people and build more dreams together! Follow us here, on <a target="_blank" href="https://twitter.com/bgpkit">Twitter</a>, or visit our website to learn more!</p>
]]></content:encoded></item></channel></rss>