HPC Operations Engineer

Company: CoreWeave Europe


Job details

We are looking for people willing to work in two shifts from 7am to 9pm. The role is fully remote within Spain. Successful candidates will be expected to attend onboarding training at our US headquarters for up to 2 weeks within their first month of employment.

About the role:

The High Performance Computing Operations team is responsible for the day-to-day provisioning, management and uptime of CoreWeave's ever-expanding fleet of server nodes. Playing a central role in CoreWeave's growth strategy, this team is on the front line for configuration, updates and remote troubleshooting of our highest tier of supercomputing clusters and their networking, delivery platform and tooling dependencies. You will be in a daily battle with the forces of entropy to maximise the number of nodes CoreWeave can deliver to customers.

We are seeking curious, creative and persistent problem solvers to join our HPC Operations team and help us drive batches of server nodes through our provisioning and validation processes, while efficiently and effectively troubleshooting node or cluster problems as they arise. You will join a team of committed engineers working to deploy nodes as fast as they can be racked and turned on.

Key Responsibilities:

- Install, configure and maintain large-scale high-performance supercomputing clusters running state-of-the-art GPUs
- Troubleshoot hardware and software issues; escalate and coordinate as needed with data centre, network and platform teams to drive resolution
- Monitor and analyse system performance and take appropriate remediation actions for cloud health
- Approach your work with flexibility and optimism, anticipating shifting business and technical priorities
- Create and maintain documentation of team processes, knowledge and best practices for system management
- Think critically about your day-to-day work and work collaboratively to improve team processes and efficiency

Successful candidates typically share the following skills and experience:

- Experience troubleshooting or administering data centre or on-prem infrastructure (servers, storage, network or a mix)
- Strong understanding of Linux system administration and networking concepts
- Ability to troubleshoot hardware and software issues and perform system maintenance tasks consistently and reliably

Ideal candidates may also have experience in one or more of these:

- Software development or scripting languages (Bash, Python, PowerShell, etc.)
- Grafana, Prometheus, PromQL queries or similar observability platforms
- Data centre environments including server racks, HVAC systems and fiber trays
- Kubernetes administration

The salary for this position ranges from 34,000€ to 38,000€ plus competitive benefits. Pay is based on a number of factors including job-related knowledge, skills and experience.



Source: Jobleads

