Northwestern University
Full Time
Closes: 1 July 2023
Applications have closed
HPC Lead Systems Engineer

This leadership role on Northwestern’s Research Computing Infrastructure (RCI) team supports researchers who use Northwestern’s High-Performance Computing (HPC) infrastructure. That suite of resources includes Quest, an HPC system with more than 50,000 cores that Northwestern researchers use to make cutting-edge discoveries through computational research and data science.

I am the hiring manager for this position and a member of WHPC. Please feel free to contact me directly with any questions at <janna dot nugent at northwestern dot edu>.

As HPC Lead Systems Engineer, you will actively support Northwestern’s HPC systems while leading the RCI team in best practices for developing, implementing, maintaining, and securing HPC cluster systems and solutions for computational research requirements, including AI/ML and data science. As a lead member of the RCI team, you will work closely with your teammates and the Research Computing Services team to develop a long-term strategy for the evolving Northwestern research enterprise.

Specific Responsibilities:

  • Leads implementation of best practices for HPC design, operations, automation, maintenance, and security
  • Leads the development of RCI team members’ technical skills through mentoring, instructing, directing, documenting, and coaching
  • Leads troubleshooting and diagnostics
  • Oversees the installation, maintenance, configuration, and integrity of Northwestern’s research infrastructure
  • Develops strong working relationships with our stakeholders and partners, prioritizing excellent customer service
  • Continually develops technical skills to meet the evolving needs of research computing, through both independent learning and attendance at conferences and workshops
  • Owns relationships with partners who provide technical expertise for specific engagements, developing effective working relationships and overseeing their work

This role offers a hybrid work schedule (currently two days on-site each week) and includes participation in regular 24/7 on-call HPC support rotations.