Martin Sauter authored · a3caa119

NaviX

License: MIT

Use case of NaviX

High Throughput Computing poses enormous challenges for storage systems, the processing compute clusters, and the network connections between them. The tremendously increasing amount of data that has to be processed in High Energy Physics creates a demand for additional resources and for the optimization of workloads. High throughput and short turnaround cycles are core requirements for the efficient processing of I/O-intensive workflows. Simply distributing the workflows across all kinds of resources leads to inefficient utilization and to overloaded network connections or storage systems. This is especially true for infrastructures where data providers and computing clusters are separated, such as cloud or HPC resources.

Caching mechanisms can solve these problems by caching data that is processed iteratively. However, since a resource pool usually comprises multiple resources, each providing its own cache, simple caching mechanisms are not sufficient: we observed that the caches get filled with the same data, which wastes resources and degrades performance.

We therefore suggest coordinating those distributed caches, so that both the scheduling of data to caches and the scheduling of workflows to resources or caches is optimized. The aim is to achieve data locality by bringing each workflow to the most suitable resource or cache.

For that purpose we developed NaviX, an intermediary between the workflow scheduler HTCondor and the data transfer protocol XRootD. XRootD was developed by the High Energy Physics community for streaming event-based data and already supports basic caching functionality: proxy servers cache files on the fly during access and redirect repeated accesses to the cached copy instead of the remote one. On top of that, XRootD managers coordinate data accesses and collect metadata from the underlying proxy or data servers. This hierarchical infrastructure makes it possible to use the metadata for redirecting data accesses to the most suitable cache proxy. NaviX integrates these coordination decisions into the overlying batch system HTCondor, influencing the job-to-resource matching with cache metadata.
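The matchmaking idea can be illustrated with a short, hypothetical Python sketch. The host names, file paths, and the `rank_caches` helper below are purely illustrative and not NaviX's actual API; they merely show how cache metadata (which files each proxy already holds) can be turned into a ranking that a batch system could use to prefer the host with the best data locality.

```python
# Hypothetical sketch of cache-aware matchmaking (not NaviX's actual code):
# rank XRootD proxy caches by their overlap with a job's input files.

def rank_caches(cache_contents, job_inputs):
    """Return cache hosts sorted by how many job input files they already hold.

    cache_contents: dict mapping cache host -> set of cached file paths
    job_inputs:     iterable of file paths the job will read
    """
    inputs = set(job_inputs)
    scores = {host: len(files & inputs) for host, files in cache_contents.items()}
    # Highest overlap first; ties broken by host name for determinism.
    return sorted(scores, key=lambda host: (-scores[host], host))


# Illustrative metadata: proxy-a already caches both input files of the job.
caches = {
    "proxy-a.example.org": {"/store/run1/a.root", "/store/run1/b.root"},
    "proxy-b.example.org": {"/store/run1/b.root"},
}
job = ["/store/run1/a.root", "/store/run1/b.root"]
print(rank_caches(caches, job))  # proxy-a is ranked first
```

In the real system this kind of score would not be computed from a static dictionary but from the metadata that the XRootD managers collect, and it would enter HTCondor's matchmaking rather than a simple sorted list.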

Requirements

  • HTCondor batch system
  • XRootD proxy caches managed by XRootD meta-manager
  • python (version 3.X) including development packages
  • python-setuptools
  • xrootd-client including development packages
  • six
  • pyyaml
  • xrootd-python (suitable for python version 3.X)

Installation

For RHEL based distributions

  1. Install the prerequisites Python and XRootD:
yum install python34 xrootd-client-devel python34-devel python34-setuptools
  2. Install the additional prerequisites six and pyyaml via pip3:
cd /usr/lib/python3.4/site-packages && python3.4 easy_install.py pip
pip3 install six pyyaml
  3. Get a Python 3 compatible version of pyxrootd:
git clone https://github.com/KIT-ETP-Computing/xrootd-python.git /opt/xrootd-python
export XRD_LIBDIR="/usr/lib64"
export XRD_INCDIR="/usr/include/xrootd"
cd /opt/xrootd-python && python3.4 setup.py install
  4. Clone the NaviX repository and start the service:
git clone https://gitlab.ekp.kit.edu/ETP-XRootD/NaviX.git
cd NaviX
python3 NaviX.py

Configuration

Please adapt the configuration files in the configs folder to your local setup. Layout and syntax are shown in the exemplary config files.
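As a rough illustration only, a config connecting the two systems might look like the following YAML fragment. All keys, hosts, and ports here are hypothetical; consult the shipped files in the configs folder for the actual layout and syntax.

```yaml
# Hypothetical example -- see the files in the configs folder for the real syntax.
htcondor:
  collector: condor-collector.example.org:9618      # hypothetical collector host
xrootd:
  meta_manager: root://xrootd-mgr.example.org:1094  # hypothetical meta-manager URL
```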

Links

Trivia