Deloitte Lead InfraOps Engineer Senior Consultant in Phoenix, Arizona
Deloitte is offering AI-based infrastructure as a service (IaaS) as a "single-stop", end-to-end managed service for customers doing AV development/test and simulation. We will offer this service based on NVIDIA's DGX A100 SuperPOD reference design in on-prem or colocation configurations.
A key part of building this practice will be setting up an internal DGX SuperPOD reference and training environment as a "first prototype" in our Deloitte data center, with all the automation services "built in" so that it can be offered as a service (pay-by-the-drip consumption) to automotive customers.
The InfraOps Lead Engineer will work with our clients to:
Focus on deployment and data center operations supporting infrastructure services.
Lead a team of highly skilled infrastructure engineers.
Drive client conversations and help maintain and manage client HPC systems.
Own, delegate, and drive infrastructure-related problems and projects.
4+ years of experience installing, configuring, and troubleshooting Linux and Windows operating systems in global, large-scale environments.
4+ years of hands-on experience with the Kubernetes ecosystem and Helm-based application deployment patterns.
Strong understanding of container orchestration systems (K8s, Mesos, etc.).
Good programming skills in shell scripting, Ansible, Terraform, and Helm templates.
3+ years of hands-on experience with server hardware at the component level: SBIOS, VBIOS, firmware, IPMI, component burn-in process, and memory and CPU testing.
Fundamental knowledge and understanding of network protocols: TCP/IP, SMB, HTTP, NFS, FTP, SNMP, DHCP.
Experience with Server-side (GPU), DGX / SuperPOD installations, upgrades and maintenance.
Experience maintaining and managing HPC/InfiniBand environments.
Proficiency in basic switch setup, including interface, VLAN, and access/trunk port configuration.
BS in Computer Engineering or equivalent
Travel up to 30% of the time (Monday - Thursday/Friday). (While 30% of travel is a requirement of the role, due to COVID-19, non-essential travel has been suspended until further notice.)
Limited immigration sponsorship may be available.
Ability to quickly learn and evaluate new technologies.
Ability to influence and establish relationships with other software and IT functional groups such as development, server, storage and security teams.
Ability to respond quickly and effectively to problems.
Ability to document all reported malfunctions and the actions taken in response.
Ability to manage multiple client projects simultaneously.