Researcher tools: code, datasets, & models

An index of datasets, SDKs, APIs, and other open-source code created by Microsoft researchers and shared with the broader academic community. We also maintain a collection highlighting some of the tools you’ll find here.

Dataset Source Code

MarS 

MarS is a cutting-edge financial market simulation engine powered by the Large Market Model (LMM), a generative foundation model.

Dataset Source Code

MageBench 

MageBench is a benchmark for evaluating the reasoning and planning ability of large multimodal model agents. This benchmark currently includes three types of environments: WebUI, Sokoban, and Football, comprising a total of 483 different scenarios.…

Dataset Source Code

TamGen 

This is the implementation of the paper “TamGen: Target-aware Molecule Generation for Drug Design Using a Chemical Language Model”.

Download

RAD-DINO model 

RAD-DINO is a vision transformer model trained to encode chest X-rays using the self-supervised learning method DINOv2. RAD-DINO is described in detail in “RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision” (F. Pérez-García, H. Sharma, S.…

Download

MAIRA-2 model 

MAIRA-2 is a multimodal transformer designed for the generation of grounded or non-grounded radiology reports from chest X-rays. It is described in more detail in “MAIRA-2: Grounded Radiology Report Generation” (S. Bannur, K. Bouzid et al.,…