Job Description
The Hendrix ML Platform team is dedicated to developing a robust, Spotify-wide platform for training and serving machine learning models. This platform streamlines the productionization of AI and ML models by mitigating the incidental complexities involved in creating backend services for serving predictions and training models.
What You'll Do
- Manage and maintain large-scale production Kubernetes clusters for ML workloads, including the ML platform infrastructure and the DevOps work needed to run it.
- Contribute to Spotify ML Platform SDK and build tools for various ML operations.
- Collaborate with Machine Learning Engineers (MLEs), researchers, and product teams to deliver scalable ML platform tooling on schedule and to specification.
- Work independently and collaboratively on squad projects that often require learning and applying new technologies that may go beyond existing skillsets.
- Design, document, and implement reliable, testable, and maintainable solutions for ML infrastructure capabilities.
Who You Are
- You have 1+ years of hands-on experience implementing production ML infrastructure at scale in Python, Go, or similar languages.
- You have knowledge of deep learning fundamentals, algorithms, and open-source tools such as Hugging Face, Ray, PyTorch, or TensorFlow.
- You have contributed to a production ML model or ML infrastructure.
- You have a general understanding of data processing for ML.
- You have experience with agile software processes and modular code design following industry standards.
Where You'll Be
- Location: Toronto.
- Flexibility to work remotely with some in-person meetings.
Additional Information
Today, we are the world’s most popular audio streaming subscription service.