Tags: anonymous communication, denoising model, flow correlation
Abstract:
Flow correlation is a common approach to de-anonymizing users of anonymous communication systems. However, unpredictable network noise, caused by multiple factors on the open Internet, raises the bar for existing correlation methods. Both traditional methods that compare statistical distances between data flows and deep learning methods such as convolutional neural networks lose accuracy because network noise changes the shape of the traffic. In this paper, we design a pre-processing model called ResTor that reduces noise before entering and exiting flows are correlated. ResTor uses byte accumulation sequences smoothed at fixed intervals as its fitting targets and applies a stacked auto-encoder architecture to remove noise in two phases. Experimental results show that exiting Tor flows processed by ResTor are closer to their corresponding entering flows, so the correlation task can be completed effectively even with traditional correlation methods: assisted by ResTor, cosine distance and other statistical metrics achieve lower computational overhead and higher correlation accuracy on Tor than the state-of-the-art method DeepCorr, especially when traffic is obfuscated.
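To make the flow representation and the cosine metric mentioned above concrete, here is a minimal sketch. It is not the ResTor model (the stacked auto-encoder is not reproduced here); it only illustrates how a flow might be turned into a byte accumulation sequence sampled at fixed intervals and how two such sequences can be compared. The function names, the packet format of (timestamp, size) pairs, and the interval width are assumptions made for illustration, not details taken from the paper.

```python
import math


def cumulative_bytes(packets, interval, n_bins):
    """Return the cumulative byte count at the end of each fixed interval.

    packets  : list of (timestamp_seconds, size_bytes) pairs (assumed format)
    interval : bin width in seconds (hypothetical parameter)
    n_bins   : number of intervals to keep; later packets are ignored
    """
    per_bin = [0] * n_bins
    for ts, size in packets:
        idx = int(ts // interval)
        if 0 <= idx < n_bins:
            per_bin[idx] += size

    # Running sum turns per-interval byte counts into an accumulation sequence.
    sequence = []
    total = 0
    for count in per_bin:
        total += count
        sequence.append(total)
    return sequence


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length sequences; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy flows: the exiting flow is a slightly delayed copy of the entering flow.
entering = [(0.1, 100), (0.4, 200), (1.2, 50)]
exiting = [(0.2, 100), (0.6, 200), (1.3, 50)]

seq_in = cumulative_bytes(entering, interval=0.5, n_bins=3)
seq_out = cumulative_bytes(exiting, interval=0.5, n_bins=3)
score = cosine_similarity(seq_in, seq_out)
```

In a setting like the one the abstract describes, the exiting sequence would first pass through the denoising model, and the pair with the highest score (equivalently, the smallest cosine distance) would be reported as correlated.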
ResTor: a Pre-Processing Model for Removing the Noise Pattern in Flow Correlation