At Georgia Tech, Prof. Alberto Dainotti and Dr. Zachary Bischof teach a newly developed graduate course on Internet Data Science. Its goal is for students to learn about cutting-edge networking research, with a focus on Internet measurement techniques and datasets, so that they can perform novel analyses and execute a final project of their own design.
One such project, by Manasvini Sethuraman, compared the Censys dataset with data from the ANT (Analysis of Network Traffic) Internet Census. While both projects scan the IPv4 address space, a key difference is that the ANT Internet Census relies on ICMP pings while Censys uses transport-layer protocols (TCP/UDP). This difference in methodology gives the two datasets distinct views of Internet “liveness”.
Figure 1: IODA’s country-level outage map, Feb 26 to March 2, 2024
Manasvini’s project was of particular interest to Alberto and Zachary due to its relevance to detecting Internet outages. Their lab’s outage detection platform, IODA (Internet Outage Detection and Analysis at http://ioda.live), provides a public dashboard of Internet connectivity worldwide (Figure 1) and offers insight into events such as severe weather or government-ordered shutdowns in countries like Iran. Online since 2016, IODA is used by a broad set of users, including human rights organizations (e.g., Amnesty International, Freedom House), government agencies and intergovernmental organizations (e.g., US FCC, United Nations), journalists, and researchers from industry and academia. The platform includes multiple “connectivity signals” for detecting Internet outages, one of which, the Active Probing signal, relies on ICMP pings to measure connectivity and identify outages. Incorporating additional probing techniques, such as those used by Censys, could allow IODA to discover more hosts. However, the extent to which these additional hosts would improve IODA’s outage detection was unclear. So, this project set out to answer the question:
Can we improve IODA’s outage detection by leveraging Censys?
IODA’s Active Probing signal detects outages using the methodology of Trinocular. The main idea of Trinocular is to use Bayesian inference to determine the connectivity status at the granularity of a /24 block of IP addresses. To detect Internet outages accurately and quickly with very few probes, this method requires the average availability of a block (measured as the ratio between the number of responding hosts and the total number of hosts in the /24 block) to be at least 0.3. For blocks with low availability (<0.3), outage detection requires additional probing and can be much slower. Having a larger number of responsive hosts in a /24 makes outage detection faster and more reliable.
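To make the role of availability concrete, the following Python sketch shows a simplified Bayesian belief update in the spirit of Trinocular. It is not IODA’s or Trinocular’s actual implementation: the function names, the starting belief of 0.9, the 0.1 “down” threshold, and the 1% chance of a reply from a down block are all assumptions chosen for illustration.

```python
def availability(responding_hosts: int, total_hosts: int) -> float:
    """Ratio of responding hosts to total hosts in a /24 block."""
    if total_hosts <= 0:
        raise ValueError("total_hosts must be positive")
    return responding_hosts / total_hosts


def update_belief_up(belief_up: float, probe_answered: bool, avail: float,
                     p_reply_when_down: float = 0.01) -> float:
    """One Bayes update of the belief that a block is up.

    p_reply_when_down is an assumed small probability of seeing a reply
    even though the block is down (illustrative value, not from the source).
    """
    if probe_answered:
        like_up, like_down = avail, p_reply_when_down
    else:
        like_up, like_down = 1.0 - avail, 1.0 - p_reply_when_down
    numerator = like_up * belief_up
    denominator = numerator + like_down * (1.0 - belief_up)
    if denominator == 0.0:
        return belief_up  # observation impossible under both hypotheses
    return numerator / denominator


def probes_until_down(avail: float, start_belief: float = 0.9,
                      down_threshold: float = 0.1,
                      max_probes: int = 100) -> int:
    """Count consecutive unanswered probes needed to conclude 'down'.

    start_belief, down_threshold and max_probes are assumed values.
    """
    belief = start_belief
    for n in range(1, max_probes + 1):
        belief = update_belief_up(belief, False, avail)
        if belief < down_threshold:
            return n
    return max_probes


if __name__ == "__main__":
    for a in (0.1, 0.3, 0.5):
        print(f"availability={a}: {probes_until_down(a)} unanswered probes")
```

Under these assumed parameters, a block with availability 0.5 is declared down after far fewer unanswered probes than a block with availability 0.1, which illustrates why low-availability blocks are slower to monitor.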
Unfortunately, some networks drop all incoming ICMP traffic, and others contain a significant number of hosts that ignore ICMP probes. In such cases, IODA is currently unable to detect outages. Incorporating Censys’ TCP/UDP probing techniques could help address this issue. However, before investing significant time and resources into modifying IODA’s active probing systems, we first wanted to estimate the potential benefits. Comparing nine snapshots from both datasets across two years, we found that millions of /24 blocks were uniquely identified by each technique (Figure 2), indicating significant potential benefits from combining the two.
Figure 2: /24 block coverage comparison of ANT and Censys.
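The coverage comparison can be sketched as a set operation over /24 blocks. The snippet below is a minimal illustration, not the project’s actual pipeline: it assumes each dataset has already been reduced to a list of responsive IPv4 addresses, and the function names and sample addresses (taken from the reserved documentation ranges) are hypothetical.

```python
import ipaddress
from typing import Dict, Iterable, Set


def slash24_of(ip: str) -> str:
    """Return the /24 block containing an IPv4 address, e.g. '192.0.2.0/24'."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def coverage_overlap(icmp_ips: Iterable[str],
                     tcpudp_ips: Iterable[str]) -> Dict[str, Set[str]]:
    """Split observed /24 blocks into ICMP-only, TCP/UDP-only, and shared."""
    icmp_blocks = {slash24_of(ip) for ip in icmp_ips}
    tcpudp_blocks = {slash24_of(ip) for ip in tcpudp_ips}
    return {
        "icmp_only": icmp_blocks - tcpudp_blocks,
        "tcpudp_only": tcpudp_blocks - icmp_blocks,
        "both": icmp_blocks & tcpudp_blocks,
    }


if __name__ == "__main__":
    # Hypothetical responsive addresses from each probing technique.
    icmp = ["192.0.2.1", "192.0.2.200", "198.51.100.7"]
    tcpudp = ["198.51.100.99", "203.0.113.5"]
    for name, blocks in coverage_overlap(icmp, tcpudp).items():
        print(name, sorted(blocks))
```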
Improving /24 Block Coverage
To understand whether TCP/UDP probes could improve block coverage, we compared the average availability of each /24 block using only ICMP versus using both ICMP and TCP/UDP probes. We found a significant reduction in the number of /24 blocks with availability less than 0.3 (1.8M with both versus 2.3M with ICMP only), as shown in Figure 3. When the availability of a block is higher, its connectivity status can be determined more reliably and quickly with fewer probes. Additionally, 328.5K /24 blocks were found only in the Censys dataset, i.e., these blocks could only be discovered via TCP/UDP probes. Of these, 106K could likely be reliably monitored for outage detection. Overall, adding TCP/UDP probes has the potential to reduce the number of unreliable /24 blocks by about 800K, which represents about 5% of all /24 blocks in IPv4.
Figure 3: Block availability, comparing ICMP versus ICMP + TCP + UDP
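As a rough illustration of this comparison, the Python sketch below merges the responsive hosts seen by each technique and counts how many blocks fall under the 0.3 availability threshold before and after merging. The data layout (a mapping from /24 block to the set of responsive host offsets), the function names, and the use of 256 as the denominator are assumptions made for the example, not details of the actual analysis.

```python
from typing import Dict, Set

THRESHOLD = 0.3   # availability threshold from the text
BLOCK_SIZE = 256  # assumed denominator: all addresses in a /24

Responsive = Dict[str, Set[int]]  # /24 block -> responsive host offsets


def merge_responsive(a: Responsive, b: Responsive) -> Responsive:
    """Union the responsive hosts per block without mutating the inputs."""
    merged: Responsive = {}
    for source in (a, b):
        for block, hosts in source.items():
            merged.setdefault(block, set()).update(hosts)
    return merged


def block_availability(responsive: Responsive) -> Dict[str, float]:
    """Availability per block as responsive hosts over BLOCK_SIZE."""
    return {block: len(hosts) / BLOCK_SIZE
            for block, hosts in responsive.items()}


def count_low_availability(responsive: Responsive) -> int:
    """Number of blocks whose availability is below THRESHOLD."""
    return sum(1 for a in block_availability(responsive).values()
               if a < THRESHOLD)


if __name__ == "__main__":
    # Hypothetical observations for two blocks.
    icmp = {"192.0.2.0/24": set(range(50))}
    tcpudp = {"192.0.2.0/24": set(range(40, 100)),
              "198.51.100.0/24": set(range(100))}
    combined = merge_responsive(icmp, tcpudp)
    print("low-availability blocks, ICMP only:", count_low_availability(icmp))
    print("low-availability blocks, combined:", count_low_availability(combined))
```

In this toy input, the first block moves above the threshold once TCP/UDP responders are added, and the second block is visible only through TCP/UDP probing.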
Improving AS Coverage
We next mapped the /24 blocks to their corresponding autonomous systems (ASes) and compared the average availability of /24 blocks at the AS level. First, with TCP/UDP probes, we found many ASes that were not visible through ICMP probing (1.8K, or 2% of assigned ASes). Second, we found that the fraction of reliable blocks within an AS increased when adding TCP/UDP probes (Figure 4). Analyzing the geographic distribution of the newly discovered ASes, we found that many of these previously unmonitored blocks and ASes were located in regions that are often critically underrepresented in Internet research, including many regions in the Global South.
Figure 4: CDF of ASes vs number of reliable (availability >0.3) /24s within the AS
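The per-AS aggregation can be sketched as follows. This is a minimal example under stated assumptions: the block-to-AS mapping is supplied as a plain dictionary (a real analysis would derive it from routing data), the function name is hypothetical, the AS numbers come from the range reserved for documentation, and a block is treated as reliable when its availability is at least 0.3.

```python
from collections import defaultdict
from typing import Dict


def reliable_fraction_per_as(block_avail: Dict[str, float],
                             block_to_asn: Dict[str, int],
                             threshold: float = 0.3) -> Dict[int, float]:
    """Fraction of each AS's observed /24 blocks that meet the threshold."""
    totals: Dict[int, int] = defaultdict(int)
    reliable: Dict[int, int] = defaultdict(int)
    for block, avail in block_avail.items():
        asn = block_to_asn.get(block)
        if asn is None:
            continue  # block has no known origin AS in the assumed mapping
        totals[asn] += 1
        if avail >= threshold:
            reliable[asn] += 1
    return {asn: reliable[asn] / totals[asn] for asn in totals}


if __name__ == "__main__":
    # Hypothetical availabilities and block-to-AS mapping.
    avail = {"192.0.2.0/24": 0.5, "198.51.100.0/24": 0.1,
             "203.0.113.0/24": 0.3}
    mapping = {"192.0.2.0/24": 64496, "198.51.100.0/24": 64496,
               "203.0.113.0/24": 64497}
    print(reliable_fraction_per_as(avail, mapping))
```

Running the same function on ICMP-only availabilities and on combined availabilities would give the two distributions needed for a comparison like the one in Figure 4.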
Conclusion
In our work, we used Censys data to demonstrate the potential for improving /24 block coverage for outage detection by supplementing ICMP probes with TCP/UDP probing techniques. Though some of these improvements may seem small as percentages (5% of IPv4 address blocks, 2% of ASes), they are drawn from an extremely large set (the entire Internet) and are still significant. Enabling outage detection for these addresses and networks could potentially cover millions of Internet users. Our research team believes these results show promising potential, and we plan to incorporate Censys scanning techniques into IODA’s Active Probing signal.