Capturing Traffic Once and Making That Traffic Available to Multiple Tools

March 13, 2008

I’ve been obsessed for a while now with the idea of a networking and security tool that captures network data once and makes that data available to any kind of tool that asks for it. It’s not my idea; a buddy of mine named Eric, who I’ve since lost touch with, told me about it over lunch at Maggiano’s many years ago. I’ve been thinking about it ever since.

Anyway, Richard Bejtlich just put up an interesting post about something similar. One of his readers asked him whether he’d thought of a single capture box that runs multiple applications. This is not quite the same thing, but the core idea is the same: capture the data once, re-use it many times.

But the way I see this playing out is more like an interface to the data running on a single box, which is accessed from many separate tools, rather than multiple applications running on the capture box itself. Storage is getting cheaper all the time, as are computing resources, so the idea here for these boxes would be to:

Capture ALL traffic for a given segment (full packet captures).
Store it for as long as needed.
Present an open, agnostic interface to the data, including real-time and/or historic views.
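
To make those three requirements a bit more concrete, here is a minimal Python sketch of what such a box might expose. Everything in it is an assumption for illustration: the names `CapturedPacket`, `CaptureStore`, `ingest`, `subscribe`, and `history` are made up, and storage is just an in-memory list rather than anything you would actually run on a busy segment.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class CapturedPacket:
    timestamp: float  # seconds since the epoch, recorded at capture time
    raw: bytes        # the full, unmodified frame


@dataclass
class CaptureStore:
    """Hypothetical capture box: store each packet once, serve many readers."""

    _packets: List[CapturedPacket] = field(default_factory=list)
    _subscribers: List[Callable[[CapturedPacket], None]] = field(default_factory=list)

    def ingest(self, packet: CapturedPacket) -> None:
        # Capture everything and keep it (in memory here, for illustration only).
        self._packets.append(packet)
        # Real-time view: hand the packet to every live subscriber.
        for callback in self._subscribers:
            callback(packet)

    def subscribe(self, callback: Callable[[CapturedPacket], None]) -> None:
        # Register a tool that wants packets as they arrive.
        self._subscribers.append(callback)

    def history(self, start: float, end: float) -> Iterator[CapturedPacket]:
        # Historic view: packets with start <= timestamp < end.
        return (p for p in self._packets if start <= p.timestamp < end)
```

The point of the split is that a tool can either call `subscribe` to watch traffic live or call `history` to replay a window later, and in both cases the packet was only captured and stored once.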

Richard actually mentioned a couple of options that I’m not familiar with, Solara Networks and Endace Ninja. I’ll have to check into them.

Another interesting idea that was brought up was the power of taps. The problem there is that a tap is only real-time, and the storage bit would still fall onto multiple systems. It just seems so wasteful to have multiple network and security tools all over the network creating their own copies of packet data, especially when those copies are often stored in a proprietary format.

Imagine (John Lennon style) if they all spoke a single data retrieval protocol where you could ask a common interface for raw, untainted packet data — but at a particular level. So one security product could just ask for port data via one type of query, and another one could ask for flow data, and another could be pulling a full replay of all layers. The cool part would be that the output of the query would be a filtered data stream that was uniquely useful to the requesting application.
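
Here is a rough Python sketch of the "ask at a particular level" part. It assumes packets have already been decoded into dictionaries with fields like `src_ip` and `payload`; the `Level` names, the `FIELDS` table, and the `query` function are all hypothetical, not part of any existing protocol.

```python
from enum import Enum
from typing import Dict, Iterable, Iterator


class Level(Enum):
    PORT = "port"  # just the protocol and ports
    FLOW = "flow"  # the 5-tuple that identifies a conversation
    FULL = "full"  # every field, including the payload


# Fields exposed at each reduced level; FULL is handled separately below.
FIELDS = {
    Level.PORT: ("proto", "src_port", "dst_port"),
    Level.FLOW: ("proto", "src_ip", "src_port", "dst_ip", "dst_port"),
}


def query(packets: Iterable[Dict], level: Level) -> Iterator[Dict]:
    """Yield a view of each packet containing only what the level exposes."""
    for pkt in packets:
        if level is Level.FULL:
            yield dict(pkt)  # a copy, so callers cannot alter the stored data
        else:
            yield {name: pkt[name] for name in FIELDS[level]}
```

Each requesting tool gets the same underlying packets, just projected down to the level it asked for.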

So if FooSecurityApp just needed flow data it could build a query to the Network Data Interface (NDI?) that only returned flow data, and in a clean, universal (compressed?) format. The idea being that it would save tons of bandwidth by giving you just what you needed.
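
As a sketch of what a flow-only answer could look like, the Python below rolls decoded packets up into one record per 5-tuple and then compresses the result. The packet dictionaries, the function names, and the choice of JSON plus zlib as the "clean, universal (compressed?)" format are all assumptions on my part; a real NDI could pick something else entirely.

```python
import json
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List


def summarize_flows(packets: Iterable[Dict]) -> List[Dict]:
    """Collapse packets into one record per (proto, src, sport, dst, dport)."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = (pkt["proto"], pkt["src_ip"], pkt["src_port"],
               pkt["dst_ip"], pkt["dst_port"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += len(pkt["payload"])  # payload bytes only
    ordered = sorted(flows.items(), key=lambda item: item[0])
    return [{"flow": list(key), **counters} for key, counters in ordered]


def encode_response(flow_records: List[Dict]) -> bytes:
    """Serialize flow records compactly, then compress them for the wire."""
    text = json.dumps(flow_records, separators=(",", ":"))
    return zlib.compress(text.encode("utf-8"))


def decode_response(blob: bytes) -> List[Dict]:
    """What the requesting app would do on receipt."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

The requesting app never sees a payload byte; it gets one small record per conversation, which is where the bandwidth saving would come from.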

And if a security tool decided it needed to see byte 13 of the TCP header on everything leaving the network from one machine, last Thursday between 1400 and 1430, it could build a query to get just that (and any requisite context, of course). Very little data would come back relative to pulling everything and filtering at the requesting app.
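
For the curious, here is roughly what that query might boil down to on the capture box, sketched in Python. I'm assuming "byte 13" is counted from zero, which makes it the TCP flags byte (the same one the BPF expression `tcp[13]` refers to), and that records are stored as `(timestamp, raw Ethernet frame)` pairs. The function names are made up, and the sketch ignores VLAN tags, IPv6, and IP fragments, and treats "from one machine" as a plain source-address match.

```python
import socket
import struct
from datetime import datetime
from typing import Iterable, Iterator, Optional, Tuple

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800
IPPROTO_TCP = 6


def tcp_byte_13(frame: bytes, src_ip: str) -> Optional[int]:
    """Return byte 13 of the TCP header for IPv4/TCP frames sent by src_ip."""
    if len(frame) < ETH_HEADER_LEN + 20:  # too short to hold an IPv4 header
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = frame[ETH_HEADER_LEN:]
    ihl = (ip[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ip[9] != IPPROTO_TCP or socket.inet_ntoa(ip[12:16]) != src_ip:
        return None
    tcp = ip[ihl:]
    if len(tcp) < 14:  # need at least offsets 0..13 of the TCP header
        return None
    return tcp[13]


def query_byte_13(
    records: Iterable[Tuple[datetime, bytes]],
    src_ip: str,
    start: datetime,
    end: datetime,
) -> Iterator[Tuple[datetime, int]]:
    """Yield (timestamp, byte) for matching frames with start <= ts < end."""
    for ts, frame in records:
        if start <= ts < end:
            value = tcp_byte_13(frame, src_ip)
            if value is not None:
                yield ts, value
```

What goes back over the wire is one timestamp and one byte per matching packet, instead of half an hour of full captures.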

Anyway, that’s taking it to an extreme but it seems like an interesting idea, if nothing else.

Thoughts?