Full Time Job / Senior Software Engineer - Machine Learning Infrastructure

Job Overview

Location
New York City, New York
Job Type
Full Time Job
Job ID
128584
Date Posted
1 year ago
Recruiter
Dennis Ruth
Job Views
207

Job Description

We’ll trust you to:

  • Work with Data Engineers and ML experts to understand their workflows and requirements, and use that understanding to inform the next set of features for Platform Services
  • Provide GPU management solutions to improve distributed training performance and resource usage efficiency
  • Enhance the user experience with mainstream and internal training frameworks
  • Design seamless workflows from model training to model inference
  • Solve and debug user issues
  • Provide operational and user-facing documentation
  • Provide performance analysis and capacity planning for clusters

You'll need to have:

  • 4+ years of programming experience with an object-oriented programming language (Go, Python, C++, Java, or JavaScript)
  • A degree in Computer Science, Engineering, or a similar field of study, or equivalent work experience
  • Experience with distributed systems, e.g., Kubernetes, Kafka, ZooKeeper, Spark
  • Experience with mainstream machine learning frameworks such as PyTorch and TensorFlow
  • Experience building and scaling Docker-based systems using Kubernetes, Swarm or Mesos
  • A strong sense of curiosity to solve new problems and keep learning new technologies

We'd love to see:

  • Experience with open-source ML infrastructure projects such as Kubeflow, Triton, MLflow, and Feast
  • Knowledge of authentication and authorization systems such as SPIFFE and SPIRE
  • Experience with cloud providers such as AWS, GCP or Azure
  • Experience with configuration management systems (Chef, Puppet, Ansible, or Salt)
  • Experience with continuous integration tools and technologies (Jenkins, Git, ChatOps)
  • Experience with data encryption
  • Experience working with GPU compute software and hardware
  • Ability to identify and perform OS and hardware-level optimizations
