Job Description
At Roche, you can show up as yourself, embraced for the unique qualities you bring. Our culture encourages personal expression, open dialogue, and genuine connections, where you are valued, accepted, and respected for who you are. This environment supports your personal and professional growth.
Our Mission
We aim to prevent, stop, and cure diseases, and ensure everyone has access to healthcare now and for future generations.
The Position
A healthier future, driven by innovation.
Galileo is Roche's strategic Informatics program, focused on enabling high-value AI use cases, primarily Generative AI (GenAI), through dedicated platforms and services. The program is establishing a Center of Excellence in AI.
The Use Case Delivery (UCD) Team comprises several delivery squads responsible for building innovative GenAI applications.
Role Overview: Data Engineer
We are seeking a highly skilled Data Engineer to join a new AI solutions development squad. The squad will build cutting-edge applications leveraging Large Language Models (LLMs), managing the end-to-end lifecycle from concept to operations.
Responsibilities:
- Generative AI Application Co-creation: Collaborate with AI engineers, data scientists, product owners, and developers to integrate LLMs into scalable, ethical, real-time applications, with a focus on user experience and relevance.
- Data Infrastructure Development and Integration: Design high-performance data pipelines for AI/GenAI applications, ensuring efficient data ingestion, transformation, storage, and retrieval.
- Vector Database Management: Work with vector databases like AWS OpenSearch or Azure AI Search for high-dimensional data retrieval.
- Cloud-Based Data Engineering: Build and maintain solutions using AWS (OpenSearch, S3) or Azure (Azure AI Search, Azure Blob Storage).
- Snowflake Implementation: Optimize data storage with Snowflake for scalable analytics.
- Data Processing & Transformation: Develop ETL/ELT pipelines for real-time and batch processing.
- Support AI Model Workflows: Collaborate with AI/ML engineers to ensure seamless data pipeline integration.
- Performance Optimization: Enhance data storage and processing strategies for efficiency and cost-effectiveness.
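The pipeline responsibilities above can be sketched, in heavily simplified form, as a small batch ETL step. This is illustrative only: the record schema and in-memory "warehouse" are hypothetical, and a production pipeline would use tools such as dbt or Apache Airflow rather than plain functions.

```python
# Minimal batch ETL sketch: extract raw records, transform (clean and
# filter), and load into an in-memory store. Illustrative only.

def extract():
    # Hypothetical raw source records (e.g. read from S3 or Blob Storage).
    return [
        {"id": 1, "text": " Hello World ", "valid": True},
        {"id": 2, "text": "", "valid": True},
        {"id": 3, "text": "Bad row", "valid": False},
    ]

def transform(records):
    # Keep valid, non-empty records; normalize whitespace and casing.
    return [
        {"id": r["id"], "text": r["text"].strip().lower()}
        for r in records
        if r["valid"] and r["text"].strip()
    ]

def load(records, warehouse):
    # Upsert by primary key into the target store.
    for r in records:
        warehouse[r["id"]] = r["text"]
    return warehouse

if __name__ == "__main__":
    store = load(transform(extract()), {})
    print(store)  # {1: 'hello world'}
```

The same extract/transform/load shape applies whether the target is Snowflake, OpenSearch, or blob storage; only the connectors change.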
Requirements:
- Experience: 5-7+ years of data engineering experience supporting AI/ML applications. Degree in Computer Science, Data Engineering, or a related field.
- Programming: Python, SQL, and vector database query languages.
- Databases: Relational, NoSQL, vector databases, Snowflake.
- Cloud Platforms: AWS (OpenSearch, S3, Lambda) or Azure (AI Search, Blob Storage, Automation).
- ETL/ELT Pipelines: Skills in dbt, Apache Airflow, or similar.
- APIs & Microservices: Experience in RESTful API design.
- Data Security & Governance: Knowledge of encryption and role-based access.
- DevOps: Git, CI/CD, Docker, Kubernetes, Infrastructure as Code (Terraform, CloudFormation).
- Generative AI Support: Experience with embeddings, RAG, and LLM fine-tuning data.
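As a rough illustration of the embeddings/RAG item above, the core retrieval step ranks stored document embeddings by cosine similarity to a query embedding. The vectors and document names below are toy values; in practice a vector database such as AWS OpenSearch or Azure AI Search performs this search at scale over high-dimensional embeddings.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, docs, k=2):
    # docs: mapping of doc id -> embedding vector (toy values here).
    ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
    return ranked[:k]

docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
print(top_k([1.0, 0.0, 0.0], docs))  # ['doc_a', 'doc_b']
```

In a RAG application, the top-ranked documents would then be passed to the LLM as context for grounded generation.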
Additional Information
- Relocation benefits are not provided.
About Roche
Roche is a global leader with over 100,000 employees dedicated to advancing science and healthcare. Roche has treated over 26 million people and conducted over 30 billion diagnostic tests.
Our Values
Innovation, collaboration, creativity, and ambition drive us to build a healthier future together.
Roche is an Equal Opportunity Employer.
Job Highlights
{job_highlight_markdown}