BLOG ◆ JUN 29, 2026

Your DLP Stack Has a Blind Spot. It's the Public Internet.

Attack Surface Management, External Attack Surface Management

Data Protection Teams Have an Internet Visibility Problem

Data protection programs have gotten good at watching data move. You classify it, set retention on it, answer DSARs about it, and run DLP across email, SaaS, endpoints, and cloud storage to catch it leaving through a sanctioned channel. Every one of those controls answers the same question: where did our sensitive data go?

None of them answer a different question that regulators care about just as much: where is our sensitive data already reachable, right now, with little to no movement required?

A misconfigured database, a forgotten FTP server, an unauthenticated message broker, or an exposed AI integration doesn’t trip a DLP alert. No one emailed a file. No sharing policy was violated. No endpoint agent saw a copy. The data just sits on an Internet-facing service that your DLP stack was never pointed at.

Some examples found by Censys ARC:

Exposed pub/sub message queues containing financial data
Publicly accessible DICOM servers, specialized systems that store and retrieve medical images
Open FTP servers containing gigabytes of corporate data

Breach notification law rarely cares how exposure happened, only that unauthorized parties could reach regulated data. That exposure is in scope for you whether or not you own the app or infrastructure.

This is the gap Censys fills. DLP monitors movement. Censys discovers exposure. You need both, and below we show you exactly how to find the second, complete with queries you can run today.

Modern DLP focuses on control points and user behavior: movement and sharing. It’s time to expand that.

Building a Partnership Between Data Protection and Exposure Management

“Hold on!” you may say. “This simply doesn’t belong to me; this is for exposure management, vulnerability management, AppSec, and [five other teams].”

In many ways, you’re right. They open the tickets, they close the services.

But they answer “what is exposed.” Only data protection can answer “what regulated data does this exposure put at risk, and does it trigger a notification obligation?” An exposed Redis instance isn’t your host to patch, but if it holds session tokens or customer identifiers tied to a regulated workflow, the consequences are yours to own. The partnership is the point: Exposure/Vuln Management finds the door; you decide whether what’s behind it is reportable.

The goal is collaboration. Together you can answer questions neither team can answer alone:

Which Internet-facing assets are most likely to contain regulated data?
Which exposures create the greatest privacy risk?
Which findings should be prioritized for remediation?
Which exposures could trigger breach notification obligations if compromised?

What follows is a starter kit: five exposure classes, each with a Censys query to make the abstract concrete. Emphasis on “starter”: shape these queries into your own! Start by scoping each one to your organization by appending a clause for your ASN, netblock, certificates, or DNS names. Example additions:

and host.autonomous_system.name: "YOUR_ASN"
and host.ip: "125.8.0.0/13"
and host.services.cert.names: "google.com"
and host.dns.names: "google.com"
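If you plan to run all five classes on a schedule, it can help to apply the scope in one place rather than hand-editing each query. Here is a minimal sketch that only builds query strings; it does not call any Censys API. The helper name `scope_query` and the choice to join multiple scope clauses with `or` are assumptions for illustration, and the scope values are the placeholders from the examples above.

```python
def scope_query(base: str, scopes: list[str]) -> str:
    """Restrict a base query to hosts matching at least one scope clause."""
    clauses = [s.strip() for s in scopes if s.strip()]
    if not clauses:
        return base.strip()
    # Parentheses keep any "or" inside the base query from absorbing the scope.
    return f"({base.strip()}) and ({' or '.join(clauses)})"


# Placeholder scope values from the examples above; substitute your own.
MY_SCOPE = [
    'host.ip: "125.8.0.0/13"',
    'host.dns.names: "google.com"',
]

scoped = scope_query('host.services.protocol = "FTP"', MY_SCOPE)
print(scoped)
```

Wrapping both halves in parentheses is the design choice that matters here: several of the queries in this post contain an `or`, and without grouping the scope would attach to only part of the query.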

1. Exposed file transfer: HR records on a server everyone forgot

FTP is old, boring, and still online at thousands of organizations because a payroll or benefits hand-off five years ago was never decommissioned. The directory still holds onboarding packets, tax IDs, and benefits exports. DLP will never see it.

Find FTP services that don’t require TLS, the ones most likely to be both legacy and leaking in cleartext:

host.services.protocol = "FTP" and not host.services.ftp.implicit_tls = true

*Millions of FTP servers without implicit TLS, spanning nearly every country on earth*

Pivot to the higher-fidelity finding: file shares that expose their contents to anyone, with the filenames visible:

host.services.labels.value = "OPEN_DIRECTORY" and host.services.endpoints.open_directory.files.name: "payroll"

Swap “payroll” for the terms that map to your regulated data, such as tax, benefits, export, backup, .sql. Censys surfaces the file names in an open directory, so a data protection analyst can judge sensitivity before anyone touches the host. That’s the whole workflow: classify the exposure, attribute it to your org or a processor, and escalate with context (Is encryption observed? Is it still needed? Does it create a notification obligation?).
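Swapping terms by hand gets tedious once your list of regulated-data keywords grows. A minimal sketch, assuming you keep that list in code, generates one open-directory query per term using the same fields as the query above. The function name and the rule that skips terms containing a double quote are illustrative assumptions, not part of any Censys tooling.

```python
OPEN_DIR_LABEL = 'host.services.labels.value = "OPEN_DIRECTORY"'
FILE_NAME_FIELD = "host.services.endpoints.open_directory.files.name"


def open_directory_queries(terms: list[str]) -> dict[str, str]:
    """Return one open-directory query per search term, keyed by the term."""
    queries = {}
    for term in terms:
        cleaned = term.strip()
        if not cleaned or '"' in cleaned:
            continue  # skip empty terms and ones that would break the quoting
        queries[cleaned] = f'{OPEN_DIR_LABEL} and {FILE_NAME_FIELD}: "{cleaned}"'
    return queries


for term, query in open_directory_queries(["payroll", "tax", "benefits", ".sql"]).items():
    print(f"{term}: {query}")
```

Keeping one query per term, rather than a single large `or`, makes it easier to tell which keyword produced a given hit when you hand the finding to an analyst.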

FTP Exposure Brief: Examining the 55-Year-Old Protocol Used by Millions

Censys ARC has the definitive measurement of FTP on the Internet.


2. Internet-facing databases: the classic breach, still happening

The largest accidental exposures in history have been open databases. The risk isn’t theoretical and it isn’t dated.

The naive query finds all of a database type. The useful query finds the ones answering to strangers. MongoDB returns a master handshake to an unauthenticated probe when access control isn’t enforced. That boolean is your high-signal finding:

host.services.mongodb.is_master.is_master = true

*You don’t want your MongoDB returning a master handshake on a critically vulnerable host!*

For Redis, it’s the same logic, different protocol. An instance that answers PONG to an unauthenticated PING took the command without credentials:

host.services.redis.ping_response = "PONG"

Broaden to every database-role asset Censys recognizes, then layer on your scope and a “recently appeared” review cadence to catch instances stood up during cloud migrations:

host.services.labels.value = "DATABASE"

Treat each hit as a high-priority privacy finding regardless of whether DLP ever alerted. Pivot from the host to what else runs on it (host.service_count, co-located services) to understand blast radius.
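The “recently appeared” review can start as a plain set difference between two runs of the same scoped query. This sketch assumes you have already exported the matching host IPs from each run into Python sets. The export step, the variable names, and the sample addresses (taken from documentation-reserved ranges) are all hypothetical.

```python
def newly_appeared(previous: set[str], current: set[str]) -> list[str]:
    """Hosts in the current result set that were absent from the previous one."""
    return sorted(current - previous)


# Hypothetical exports of host IPs from two runs of the same scoped query.
last_week = {"192.0.2.10", "192.0.2.11"}
this_week = {"192.0.2.10", "192.0.2.11", "198.51.100.7"}

for ip in newly_appeared(last_week, this_week):
    print(f"new database exposure to review: {ip}")
```

Anything this surfaces is a candidate for the migration-era instance described above: a database that did not answer last week and does now.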

3. Unauthenticated message queues: the live event stream nobody secured

Message brokers don’t look like data stores, which is exactly why they get exposed. But a pub/sub layer in a financial or healthcare app can carry transaction events, account identifiers, fraud alerts, and session telemetry. Live! An external subscriber can watch the operational heartbeat of the business.

NATS publishes whether it requires auth, so the exposure is unambiguous. A value of auth_required = false is a definitive unauthenticated finding, not an inference:

host.services.nats_io.auth_required = false

For MQTT, a broker that returns an accepted connection status to an anonymous probe took the connection without credentials:

host.services: (protocol = "MQTT" and
mqtt.connection_ack_return.return_value: "Accepted")

ZeroMQ exposes its socket handshake; a publisher socket (PUB) reachable on the open Internet means anyone can subscribe to whatever it’s streaming:

host.services: (protocol = "ZEROMQ" and
zeromq.handshake.socket_type = "PUB")

The data protection question for each: which business process owns the stream, and could those messages contain personal, financial, or regulated data? Engineering owns the broker; you own the consequence.

Unauthenticated Message Queues are a Problem

Censys ARC investigates unauthenticated message queues, finds chaos.


4. Exposed MCP servers: AI integrations that advertise a path to your data

This is the newest and least-watched class, and it’s the strongest argument for putting data protection on the Internet-exposure map. Model Context Protocol (MCP) servers connect AI assistants to tools and data: databases, file stores, ticketing, CRM, finance, and more.

*Some of these systems are primarily designed to read and write data on behalf of connected clients.*

Crucially, the protocol doesn’t require authentication by default. As of late April 2026, Censys ARC counted 12,520 Internet-accessible MCP services across 8,758 IPs. The number has skyrocketed since, to over 2.5 million MCP web endpoints.

An exposed MCP server advertises its own capabilities. Censys parses the tool and resource metadata, so you can read what an external client could discover:

host.services.endpoints.mcp.tools.name: *

Now make it a data protection query. Find MCP servers advertising tools or resources that name sensitive data stores:

host.services.endpoints: (mcp.tools.name: "database" or
mcp.tools.name: "customer" or mcp.resources.uri: "file://")

Run the same hunt against web properties to catch MCP exposed over HTTP(S) front ends:

web.endpoints.mcp.tools.name: "query"
or web.endpoints.mcp.resources.content: *

The metadata alone (tool names, resource URIs, even embedded prompts) can reveal which regulated systems an AI integration can reach. A DLP policy can’t protect data when a newly shipped AI tool publishes a direct route to it. This is a data governance finding, and it’s the kind of exposure that didn’t exist for your program eighteen months ago.
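Once you have the advertised tool names in hand, a first triage pass can be as simple as keyword matching. A minimal sketch, assuming you have exported tool names per host into a plain dictionary: the export shape, the sample hosts, and the helper name are illustrative assumptions rather than a Censys format, and the keywords are the ones used in the queries above.

```python
SENSITIVE_KEYWORDS = ("database", "customer", "query")


def flag_sensitive_tools(tools_by_host: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each host to the advertised tool names that contain a sensitive keyword."""
    flagged = {}
    for host, tool_names in tools_by_host.items():
        hits = [
            name for name in tool_names
            if any(keyword in name.lower() for keyword in SENSITIVE_KEYWORDS)
        ]
        if hits:
            flagged[host] = hits
    return flagged


# Hypothetical export: tool names advertised by two MCP servers.
observed = {
    "192.0.2.20": ["query_customer_db", "get_weather"],
    "192.0.2.21": ["search_docs"],
}
print(flag_sensitive_tools(observed))
```

Substring matching is deliberately crude. It will over-flag, which is usually the right bias for a first pass that a data protection analyst then reviews by hand.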

Finally, if you really want to cast a wide net, search for any exposure bearing the Censys ARC label AI and do the hunting yourself:

host.services.labels.value = "AI" or web.labels.value = "AI"

MCP Servers on the Internet

Censys ARC surveils over 2.5 million MCP web endpoints.


5. Healthcare and other regulated-data systems

Some exposures map straight to a regulated data category, which makes them automatic critical findings. Medical imaging and clinical systems are the headline example. Censys provides convenient labels for identifying these.

host.services.labels.value = "MEDICAL"
or host.services.labels.value = "MEDICAL_DEVICE"

Pair that with a hunt for the lightweight web viewers that imaging teams stand up “temporarily” and forget. Take the same query, wrap its two label conditions in parentheses so the added condition applies to both, and append the label for login pages and web UIs on healthcare-scoped infrastructure:

and host.services.labels.value = "LOGIN_PAGE"

Censys ARC has identified everything from unauthenticated access to medical images (DICOM), to exposed patient record logins (EMR/EHR).

The Global State of Internet of Healthcare Things (IoHT) Exposures on Public-Facing Networks

Censys ARC’s State of Internet of Healthcare Things (IoHT)


What These Five Examples Have in Common

File transfer, databases, message queues, AI integrations, medical systems.

Wildly different technologies, one truth: sensitive data doesn’t only leak through the channels DLP was built to watch. It leaks through forgotten infrastructure, vendor-managed systems, operational telemetry, and brand-new AI tooling. Your DLP stack is pointed inward at movement, not outward at the vast open Internet.

Anthropic released MCP in 2024, and already there are 2.5 million public endpoints. The boom in AI-assisted building and tools isn’t slowing down.

ASM answers what is exposed. Data protection answers what regulated data this exposure affects, and whether it’s reportable. Neither answer is complete alone. The queries above are how you start producing your half.

The shift is small and the consequence is large. Stop asking only “do we have a policy that says this shouldn’t be public?” Start asking “is there an Internet-accessible system exposing this right now?”

And then go run the query.

AUTHOR

Alex Gartner

Alex Gartner has led teams to uncover novel threats and build scalable data platforms for SecOps. Previously tackling sensitive missions for the U.S. Air Force, and serving as Sr. Engineering Manager of Security Research, he brings industry-leading data practices into detection engineering. SQL everything.
