Load distribution


Network monitoring infrastructures are extremely resource constrained. It is therefore highly desirable for such systems to be scalable, i.e., to let their operators incrementally add computing nodes in order to support more applications and to sustain a higher volume of traffic.

However, providing network monitoring applications with the ability to migrate across nodes is not trivial. Simply replicating the incoming traffic to other nodes is prohibitively expensive, since it requires additional bandwidth and may involve traffic replication devices and packet capture hardware. The alternative of splitting the traffic across several nodes, so that each application runs on all nodes simultaneously, is also undesirable, since not all monitoring applications can be implemented with distributed algorithms.

Furthermore, such infrastructures must deal efficiently with the hot spots that monitoring applications naturally create, since the events of interest are usually localized (e.g., in intrusion and anomaly detection). Such events can cause load to be distributed unevenly across the monitoring infrastructure. This scenario differs from the ones traditionally explored in distributed systems research [Casavant and Kuhl, 1988] in that monitoring applications are continuous and never finish. Therefore, in such an environment, the principal scheduling mechanism is task migration.

Borealis

Recently, this new problem has also been studied by the database research community in the context of data stream management systems. [Xing et al., 2005] propose a greedy load distribution algorithm that avoids overload and minimizes end-to-end latency by minimizing load variance and maximizing load correlation across nodes.
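The idea of greedy placement driven by load variance can be illustrated with a minimal sketch. This is not the algorithm of [Xing et al., 2005]: it is a simplified illustration that only considers the variance of each node's load time series and a peak-capacity limit, and it ignores load correlation across nodes. The function name `greedy_place`, the `capacity` parameter, and the input format (one list of load samples per operator) are assumptions made for this example.

```python
from statistics import pvariance


def greedy_place(operators, num_nodes, capacity):
    """Greedily assign operators to nodes (illustrative sketch only).

    operators: dict mapping a name to a list of load samples; all lists
               must have the same length.
    Returns (placement, node_load), where placement maps name -> node index
    and node_load holds the summed load series of each node.
    """
    length = len(next(iter(operators.values())))
    node_load = [[0.0] * length for _ in range(num_nodes)]
    placement = {}

    # Consider the heaviest operators first (ties keep insertion order).
    ordered = sorted(operators.items(), key=lambda kv: -sum(kv[1]))
    for name, series in ordered:
        best = None
        for n in range(num_nodes):
            combined = [a + b for a, b in zip(node_load[n], series)]
            if max(combined) > capacity:
                continue  # placing the operator here would overload the node
            # Prefer the node whose load variance grows the least; break
            # ties in favour of the node with the lower total load.
            delta = pvariance(combined) - pvariance(node_load[n])
            score = (delta, sum(combined))
            if best is None or score < best[0]:
                best = (score, n, combined)
        if best is None:
            raise RuntimeError(f"no node can host operator {name!r}")
        _, n, combined = best
        node_load[n] = combined
        placement[name] = n
    return placement, node_load
```

For instance, two operators whose load peaks alternate in time end up on the same node, because together they produce a flat load series, while a third operator that would exceed the capacity of that node is placed elsewhere.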

CoMo

In the area of network monitoring, CoMo proposes a two-stage application architecture that provides network monitoring infrastructures with migration capabilities [Sanjuàs-Cuxart et al., 2008]. The first stage of each application performs the computations that have severe real-time constraints and require access to the raw packet stream. Therefore, the first stage runs on the nodes equipped with specialized packet capture hardware (i.e., capture nodes). The main goal of the first stage is to perform traffic filtering and short-term aggregation, so that the second stage can be easily migrated to remote nodes.

The second stage continuously receives the results from the first stage and performs stateful, potentially more complex and longer-term computations. Applications that do not support migration can still run on the capture node. Resource management in the capture node is handled by means of load shedding, while simple load distribution techniques are used in the second stage to map each application to the available computing nodes.
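The two-stage split described above can be sketched as follows. This is a toy illustration under stated assumptions, not CoMo's actual API: the names `first_stage`, `SecondStage`, and `export_state`, the packet tuple layout `(timestamp, source, destination port, size)`, and the port filter are all hypothetical. The point is that the second stage only sees compact summaries and keeps its state in a form that can be handed to another node.

```python
from collections import defaultdict


def first_stage(packets, interval=1.0, port=80):
    """Runs on the capture node: filter packets, then aggregate per interval.

    packets: iterable of (timestamp, source, dst_port, size) tuples
             (hypothetical layout chosen for this example).
    Returns a sorted list of (interval_index, source, total_bytes).
    """
    summaries = defaultdict(int)
    for ts, src, dst_port, size in packets:
        if dst_port != port:
            continue  # traffic filtering
        summaries[(int(ts // interval), src)] += size  # short-term aggregation
    return sorted((idx, src, total) for (idx, src), total in summaries.items())


class SecondStage:
    """Stateful, longer-term computation that may run on a remote node."""

    def __init__(self, state=None):
        # Restoring from exported state is what makes migration possible.
        self.total_bytes = dict(state or {})

    def update(self, summaries):
        for _, src, nbytes in summaries:
            self.total_bytes[src] = self.total_bytes.get(src, 0) + nbytes

    def export_state(self):
        # Snapshot that can be sent to the node taking over the application.
        return dict(self.total_bytes)
```

Migrating the second stage in this sketch amounts to calling `export_state()` on the old node and constructing `SecondStage(state)` on the new one, after which the first stage sends its summaries to the new location.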

References

  • T. L. Casavant and J. G. Kuhl. A taxonomy of scheduling in general-purpose distributed computing systems. IEEE Trans. Software Eng., 14(2):141-154, 1988.
  • J. Sanjuàs-Cuxart, P. Barlet-Ros, G. Iannaccone, and J. Solé-Pareta. Distributed scheduling in large scale monitoring infrastructures. In Proc. of ACM CoNEXT Student Workshop, Dec. 2008.
