Role overview
Your Team Responsibilities:
We are developing cutting-edge tools to identify and analyze the climate-change risk exposure of companies and real estate investments. Our models simulate natural and physical risks and translate them into insightful metrics for investors. A cornerstone of our climate modeling capabilities is our geospatial dataset of asset locations, which is integrated into multiple models, such as physical risk and biodiversity risk.
The team is seeking a Data Scientist to develop advanced analytical tools and models that draw new insights from our existing geospatial datasets. You will play a central role in transforming disparate asset-level data into interconnected structures, such as knowledge graphs, that enable a deeper understanding of physical infrastructure, natural capital, and climate risk relationships. This technical position requires strong expertise in data science, graph theory, and scalable computation.
Your Key Responsibilities:
- Contribute to and co-lead projects as part of the R&D effort on asset-level geospatial datasets, including physical infrastructure and natural capital.
- Design and implement methodologies to extract, link, and enrich entities across datasets, producing knowledge graphs and network-based representations of asset-level relationships. This work is done in Python or R, using machine learning frameworks, graph tools (e.g., NetworkX, Neo4j, graph neural networks), and statistical modeling.
- Design graph schemas and data models that represent entities such as assets, companies, locations, and environmental features, and the relationships between them.
- Develop and maintain robust pipelines for entity resolution, data linkage, and relationship inference across large-scale tabular and geospatial datasets.
- Collaborate with internal stakeholders and work with engineers to deploy models into production.
Your skills and experience that will help you excel:
- 5+ years of experience in (geospatial) data science, exposure/catastrophe (cat) modeling, or climate analytics.
- Advanced coding skills in Python and familiarity with data science libraries such as pandas, NumPy, SciPy, and scikit-learn (or the tidyverse in R).
- Deep expertise in machine learning and graph modeling, including:
  - Knowledge graph construction
  - Graph algorithms (clustering, centrality, community detection)
  - Graph databases (Neo4j, Neptune) or graph ML (PyG, DGL)
- Strong grounding in relational modeling, data linkage, and entity-resolution techniques.
- Experience with parallel computing frameworks such as Dask or multiprocessing is desirable.
- Hands-on experience with cloud environments (GCP, AWS, or Azure) and cloud-native geospatial workflows is desirable.
- Strong written and verbal communication skills.
- MSc/PhD in Computer Science, Data Science, Geospatial Science, Applied Mathematics, or Environmental Informatics.
About MSCI:
We are aware of recruitment scams in which fraudsters impersonating MSCI personnel may try to elicit personal information from job seekers. Read our full note on careers.msci.com.