Machine Learning Engineer Job at Evolve Group, Fremont, CA

  • Evolve Group
  • Fremont, CA

Job Description

Machine Learning Engineer

Tech start-up

San Francisco based

We’ve partnered with one of the most ambitious and technically rigorous AI research labs in the world. Based in San Francisco, this team is building foundation models entirely from scratch.

They are now hiring ML Infrastructure Engineers to design and scale the systems that power large-scale, distributed model training. If you’ve built infrastructure that runs across hundreds of GPUs, thrive under technical complexity, and want to work side-by-side with elite AI researchers, this is the role.

Key Responsibilities:

  • Build and scale distributed systems for large-scale model training across LLMs, vision, and robotics.
  • Set up and run large-scale training across many GPUs using tools like Kubernetes, DeepSpeed, and FSDP.
  • Troubleshoot system issues (GPU errors, network problems) and build tools to monitor and recover from failures.
  • Optimize PyTorch pipelines, sharding, and sampling strategies.
  • Collaborate closely with researchers to support novel model training at scale.

Requirements:

  • 3–15 years in ML infrastructure, systems, or research engineering roles.
  • Proven experience scaling distributed training for large models.
  • Strong skills in PyTorch, CUDA, NCCL, and Kubernetes.
  • Familiar with setting up distributed training clusters.
  • Deep understanding of PyTorch dataloaders, data sharding, and sampling.
  • Strong communicator with a collaborative, mission-driven mindset.

This is a fully in-person role based in San Francisco, ideal for engineers excited to build at the edge of what's possible in AI.

Job Tags

Immediate start
