AI Group - Platform Engineer

Job Overview

Location
New York City, New York
Job Type
Full Time Job
Job ID
118675
Date Posted
9 months ago
Recruiter
Dennis Ruth

Job Description

Bloomberg’s Data Science Platform provides a standard set of tooling and infrastructure to facilitate MLOps across the company, from experimentation, data engineering and training to inference. We provide scalable compute, specialized hardware and first-class support for a variety of workloads such as PyTorch, Spark, TensorFlow and Jupyter. We provide advanced features such as hyperparameter tuning as a service and are beginning to invest in model management and governance. The platform leverages containerization, container orchestration and cloud architecture, and is built on top of 100% open-source projects.
 
Having built an excellent foundational infrastructure layer for the Data Science Platform on top of open-source components like Kubernetes, Cloud Native Buildpacks, Kubeflow training operators, KServe, Spark, Argo and more, we are now looking to offer higher-level abstractions to facilitate and automate common workflows involved in the Model Development Lifecycle. Highlights from our upcoming roadmap focus on creating integrations built atop our infrastructure layers that allow for Continuous Training and Evaluation (CT/CE) of models, out-of-the-box AutoML solutions, Hypertune as a service, Drift Detection, and Model and Feature Stores.
 
That’s where you come in. As a member of the multi-disciplinary Data Science Platform team within the AI Group at Bloomberg, you’ll have the opportunity to make key technical decisions that improve the end-user experience of AI engineers.
 
While working on the team, the backbone of Bloomberg's up-and-coming AI products, you will have the opportunity to create a platform experience and tools that meet the requirements of our AI Researchers and Engineers, with a focus on creating a more cohesive, integrated and managed MLOps experience.
Our team makes extensive use of open source and is deeply involved in a number of communities like Cloud Native Buildpacks, Kubeflow, Paketo, Kyverno, CycloneDX, Sigstore, Anchore OSS and more. We collaborate widely with the industry and are rooted in Open Source!

More about us:

In the AI Group, we build data-driven, highly distributed, high-throughput systems, which are collectively called billions of times a day. Our engineers are responsible for architecting and implementing these services end-to-end, overcoming the unique challenges of machine learning systems in the financial domain, which demand high throughput, availability and consistency along with low latency. The AI Group is the central engineering group, with close to 200 researchers and engineers working together to build these data-driven customer-facing products, as well as AI infrastructure and algorithms used by engineers across the company.

We’ll trust you to:

  •  Interact with data engineers and ML experts across the company to understand their workflows and requirements, and use that understanding to inform the next set of features for the platform
  •  Collaborate with open-source communities and internal platform teams to build a cohesive MLOps experience
  •  Enhance the distributed training user experience using mainstream and internal training frameworks
  •  Design seamless workflows that facilitate a continuous cycle from model training to model inference
  •  Troubleshoot and debug user issues
  •  Provide operational and user-facing documentation

What we are looking for:

  •  A strong sense of curiosity for solving new problems and learning new technologies
  •  A passion for machine learning and for making it more accessible to engineers
  •  Proficiency in one or more languages (Go, Python, JavaScript or similar) and willingness to learn more as needed
  •  Exposure to container technologies like Buildpacks, Docker and BuildKit
  •  Ideally, experience using open-source ML infrastructure projects such as Kubeflow, MLflow, Feast, ONNX and DVC

Nice to haves:

  •  Open-source involvement such as a well-curated blog, accepted contributions, or community presence
  •  Experience with mainstream machine learning frameworks such as PyTorch and TensorFlow
  •  Passion for education, e.g. providing workshops for tenants
  •  Industry experience with machine learning teams
  •  Experience working with GPU compute software and hardware
  •  Experience with cloud providers such as AWS, GCP or Azure
  •  Experience with configuration management systems (Helm, Ansible, Terraform)
