Senior Data Engineer

Location: Roivant Sciences, Inc., 151 West 42nd Street, 15th Floor, New York, NY 10036

At Roivant, we are passionate about discovering and developing new drugs to impact patients’ lives. Since its inception in 2014, Roivant has launched over 20 portfolio companies (Vants), overseen 5 successful IPOs, established a $3B partnership with a global pharmaceutical company, built a pipeline of over 40 assets across various modalities and therapeutic areas, and delivered 8 successful Phase 3 readouts.

Roivant is currently building new capabilities in drug discovery and expanding its existing development engine to become the world’s leading tech-enabled pharmaceutical company. Roivant’s drug discovery capabilities are driven by our computational discovery platform, which combines preeminent physics-based tools with deep expertise in machine learning to generate unprecedented predictive power that can tackle previously intractable discovery challenges. The tight integration of this computational platform with our experimental capabilities enables the rapid design and optimization of new drugs to address a wide range of targets for diseases with high unmet need.

We believe that the future of drug discovery lies in integrating predictive sciences, biology, and medicinal chemistry to accelerate the path to new medicines. This role is an opportunity to be an architect of this paradigm shift and generate transformative benefit for patients.

Position Summary: 

We are looking for an experienced Data Engineer to join our rapidly growing Discovery team. Our approach combines a cutting-edge physics-based computational platform with predictive machine learning, experimental biology, and medicinal chemistry to develop novel therapeutics. In this role, you will help build an integrated data platform that consolidates these highly diverse data sources and enables data-driven decision-making across the Roivant Discovery arm.


Responsibilities:

  • Develop and manage data infrastructure and pipelines for a high-performance computational drug discovery platform.
  • Work collaboratively with software engineering, high-performance computing, and data science teams to ingest and organize simulation, experimental, and third-party data.
  • Design data models that jointly optimize storage and retrieval while meeting drug discovery and business needs.
  • Contribute to end-to-end data processes, including automation, ELT/ETL, integration, management, and governance.
  • Integrate commercial, open-source, and/or purpose-built components to build a highly scalable hybrid cloud / on-premises data platform.
  • Contribute to shared tooling and standards, with a focus on best practices for data quality, monitoring, and logging.

Required Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, with 3-5 years’ experience
  • Proficiency in Python, SQL, and containerization (Docker / Singularity)
  • Experience with workflow orchestration and transformation frameworks (Airflow, Prefect, dbt)
  • Experience with high-performance computing clusters
  • Experience with data lake and data warehouse architectures
  • Excellent communication skills
  • Experience in the design and development of APIs

Additional Desirable Qualifications:

  • Hands-on experience with hybrid cloud / on-premises data ecosystems
  • Previous experience with small-molecule chemical data types
  • Experience with commercial drug discovery data repositories and electronic lab notebooks

Roivant Sciences provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.