From Bench to Bedside: Generalizable AI Model for ADC Biomarker Evaluation in NSCLC

An overview of a key takeaway from our poster presented at AACR 2025.

Ryan Sargent
May 13, 2025

Background

Non-small cell lung cancer (NSCLC) is the one of the most lethal cancer types worldwide1, with treatment options that have historically been limited due to factors like late-stage diagnosis. Antibody-drug conjugates (ADCs) are a promising therapeutic approach that have recently transitioned from research laboratories into patient care. Unlike conventional chemotherapy, which affects cells throughout the body, ADCs combine chemotherapy drugs with monoclonal antibodies that target specific cells, improving efficacy and safety while providing new treatment options for cancer patients. 

Our study presented at this year's AACR meeting focused on two emerging ADC targets that are of particular interest for the treatment of NSCLC: TROP-2 and cMET. These biomarkers have shown early clinical success as therapeutic targets for NSCLC. To effectively implement ADCs targeting these markers in clinical practice, physicians will need standardized, efficient methods to identify which patients express sufficient levels of these markers to benefit from treatment. Patient selection is critical for therapeutic success, since ADCs are only effective when their targets are present. 

However, the challenge remains in developing reliable, reproducible methods to accurately quantify biomarker expression levels across diverse patient samples. To address this challenge, our team developed and evaluated a series of machine learning models to automatically quantify biomarker expression in NSCLC samples. These models provide an automated scoring system that can assist physicians in their evaluations of when to use targeted therapies. Our approach was designed to test model generalizability – we initially trained an expression scoring model on TROP-2 data, then evaluated performance on cMET samples as a proof-of-concept. 

A generalizable model for biomarker assessment could potentially eliminate the need to develop and validate separate models for each biomarker of interest, reducing development time and cost. 

Methods

In this study, we developed a series of AI models for ADC biomarker evaluation. Tissue microarrays (TMAs) were generated with tumor cores from 1,142 NSCLC patients at two institutions - Charité Berlin and University Hospital Cologne. The TMAs were first immunohistochemically (IHC) stained for TROP-2 and cMET, and then digitized for AI analysis. 

Our series of machine learning models performed three sequential analyses: 

  1. Nucleus detection to identify individual cells
  2. Cell classification to highlight tumor cells
  3. Cell expression scoring to quantify membrane expression in detected tumor cells

To validate cross-marker generalizability, the model was trained using TROP-2 stained slides and evaluated on both TROP-2 and cMET. Generalization should be enhanced by the pathology foundation model (RudolfV) backbone.

Results

Performance of the TROP-2 expression model on TROP-2 and cMET samples was evaluated by comparing model predictions against manual H-score evaluations performed by six pathologists on a hold-out dataset of 100 slides for each marker. H-scoring is a semi-quantitative method that is calculated by multiplying the percentage of positively stained cells with the intensity of staining: 

H-score = (1 × % of cells with weak intensity) + (2 × % of cells with moderate intensity) + (3 × % of cells with strong intensity)

This calculation produced H-scores ranging from 0-300, which were grouped into four expression ranges (negative=0-50, weak=50-100, moderate=100-200, strong=200-300). Higher scores indicate stronger biomarker expression, which is associated with higher likelihood of response to ADC therapy.

In evaluating the model predictions against pathologist evaluations, we found that the grouped H-scores predicted by the model demonstrated high agreement with the grouped pathologists' scores, with an average pair-wise Pearson correlation of 93% for both TROP-2 and cMET. This strong correlation indicates that our expression scoring model achieved excellent concordance with expert pathologist evaluation. 

Notably, the expression scoring model, which was trained exclusively on TROP-2, generalized effectively to cMET without requiring additional fine-tuning. These results support our earlier hypothesis that using RudolfV as the basis of the model enabled this cross-marker generalizability.

Conclusion

This study demonstrates the potential of AI models to address key challenges in ADC biomarker evaluation for NSCLC. The strong alignment between our model predictions and pathologist assessments demonstrates the value of our automated scoring approach.

More importantly, our model's ability to generalize between biomarkers highlights its adaptability. This establishes the potential for automated scoring approaches to facilitate rapid expansion to additional biomarkers, streamlining treatment decision pathways for NSCLC patients.

In the future, we plan to expand our approach to investigate additional ADC-relevant biomarkers and further study the generalization capabilities of our models.

___________________________________________________

  1. World Health Organization. "Lung Cancer." World Health Organization Fact Sheets, published  June 26, 2023. https://www.who.int/news-room/fact-sheets/detail/lung-cancer.

Related Articles

Meet Neelay Shah, Machine Learning Engineer on our Data Processing team!

29.4.2025

Neelay and the Data Processing team develop pipelines to process data at scale for use-cases such as training foundation models. They ensure that data and models can be processed in a distributed fashion, efficiently spreading computational work across multiple machines to allow for scaling when large data sets are involved. They also partner closely with the Data Science team who use the infrastructure developed by the Machine Learning Engineers to perform experiments for client projects.

Pathology's Evolution: How Aignostics Builds on Charité Berlin's 300-Year Legacy

27.3.2025

Aignostics started in 2018 within the walls of the Berlin Institute of Health at Charité – Universitätsmedizin Berlin, before ultimately spinning-out in 2020. But did you know that Charité Berlin has a rich history dating all the way back to 1710? Let’s explore some of that history, along with the history of pathology’s digital transformation, to understand the origins and drivers of Aignostics’ founding.

Meet Timo Haschler, Senior Scientific Project Manager on our Target and Biomarker Discovery team!

12.3.2025

Timo Haschler and his team work closely with clients to identify novel drug targets and biomarkers. This includes the multi-year collaboration we announced with Bayer in 2024, to identify novel cancer targets and co-create a target identification platform to enable better patient identification, stratification, and selection for clinical trials.