A case for robust semi-experiments
Abstract
In this paper, we demonstrate that anomalies in Internet traces can significantly affect semi-experiments designed to determine the causes of the scaling behavior of traffic. A semi-experiment involves artificially modifying a specific aspect of a trace and studying the resulting change in scaling behavior. Using MAWI traces, we show that semi-experiments performed without addressing the presence of anomalies yield insights that contradict widely accepted theories of Internet traffic scaling behavior. For example, a direct semi-experimental analysis seems to suggest that removing large flows does not remove long-range dependent (LRD) behavior, and that the scaling behavior of MAWI traces is the same before and after the large flows are removed. Taken at face value, this observation challenges the well-known hypothesis that the heavy-tailed distribution of flow sizes is the primary cause of correlation at large time scales. To mitigate the impact of anomalies, we couple the semi-experiments with a recently proposed sketch-based procedure for robust estimation of scaling behavior. We term these "robust semi-experiments". Our analysis shows that a robust estimation procedure enables a meaningful semi-experimental analysis and that the conclusions drawn from the robust semi-experiments agree with well-established theories of Internet traffic scaling behavior. © 2010 IEEE.
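As an illustration of the kind of semi-experiment described above, the following minimal sketch removes large flows from a packet trace and estimates the Hurst parameter of the resulting byte-count series. It rests on several assumptions that are not taken from the paper: the packet record layout `(timestamp_seconds, flow_id, size_bytes)`, the function names, and the byte threshold are hypothetical, and the classical aggregated-variance estimator is used as a simple stand-in. It is not the sketch-based robust procedure mentioned in the abstract, nor necessarily the estimator the authors use.

```python
import math
from collections import defaultdict

# Hypothetical packet record: (timestamp_seconds, flow_id, size_bytes).

def remove_large_flows(packets, max_flow_bytes):
    """Semi-experiment step: drop every packet of any flow whose
    total size exceeds max_flow_bytes (threshold is an assumption)."""
    flow_bytes = defaultdict(int)
    for _, flow_id, size in packets:
        flow_bytes[flow_id] += size
    return [p for p in packets if flow_bytes[p[1]] <= max_flow_bytes]

def byte_counts(packets, bin_width, duration):
    """Bin the trace into a series of bytes per time interval."""
    n_bins = int(duration / bin_width)
    counts = [0.0] * n_bins
    for t, _, size in packets:
        i = int(t / bin_width)
        if 0 <= i < n_bins:
            counts[i] += size
    return counts

def hurst_aggregated_variance(series, block_sizes):
    """Stand-in estimator: for an LRD series, the variance of the
    block means scales as m**(2H - 2), so H = 1 + slope / 2, where
    slope is fitted on log(variance) against log(m)."""
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue
        means = [sum(series[k * m:(k + 1) * m]) / m for k in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((v - mu) ** 2 for v in means) / (n_blocks - 1)
        if var > 0:
            xs.append(math.log(m))
            ys.append(math.log(var))
    if len(xs) < 2:
        raise ValueError("need at least two usable block sizes")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0
```

A semi-experiment in this style would compare `hurst_aggregated_variance(byte_counts(trace, ...), ...)` with the same estimate computed on `remove_large_flows(trace, ...)`. A non-robust estimator such as this one is exactly the kind that anomalies in the trace can distort.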