Principal HPC Systems Administration Specialist – File Systems

The selected individual will participate in the operation, administration, and maintenance of file systems for the supercomputing resources of the Argonne Leadership Computing Facility (ALCF). These include some of the largest (10s-100s of PB) and fastest (100s-1000s of GB/s) parallel file systems in the world. For the home file system, stability and robustness to failure are paramount. For the parallel file system, performance is the critical demand. The appointee will coordinate with the other storage and operations team members on system administration duties, participate in the design and planning of future machine upgrades (every 3–5 years), and investigate improved or novel approaches for providing file system and disk storage services. This position also includes SAN hardware responsibilities.

Position Requirements

Required skills:

  • Expertise in installing, configuring, tuning, and supporting Lustre file systems
  • Experience with networked, parallel and cluster file systems such as NFS, PVFS, Lustre, and GPFS
  • Experience in UNIX systems administration, with emphasis on Linux
  • Experience with systems administration tools and languages such as Perl, Python, C, PHP, and shell scripts
  • Expertise in SAN and related technologies
  • Comprehensive knowledge of and expertise in Internet resources, facilities, activities, and techniques
  • Comprehensive problem-solving skills
  • Comprehensive ability to work effectively as a member of a team
  • Comprehensive flexibility in dealing with assignments and in working on several projects simultaneously
  • Ability to work onsite for hardware support and other tasks that need to be completed
  • Must be able to pass an Office of Personnel Management National Background Investigations Bureau background investigation

Preferred, but not required:

  • Expertise in installation, management and use of software such as compilers, scientific applications, and job resource managers
  • Experience with DAOS

This position can be hired at one of two levels and the related requirements are as follows:

  • PT3: Bachelor’s degree + 4 years of experience, or a Master’s degree + 2 years, or equivalent
  • PT4: Bachelor’s degree + 6 years of experience, or a Master’s degree + 4 years, or equivalent