Patent SAR over curated datasets
Dense structure–activity relationships from real medicinal chemistry programs, not curated academic subsets.
Evaluating molecular binding prediction through patent-derived SAR
PKMYT1, CDK1
Canonical + synthetic
Pointwise, pairwise, listwise
Baselines + docking + cofolding
Most binding affinity benchmarks measure how well a model fits the measurements in a dataset, which does not reflect how predictions are used in practice. Drug discovery proceeds through local design decisions: selecting the next analogue, prioritising compounds within a series, and deciding whether a modification improves potency.
Emphasises ranking within analogue series — the decisions that actually drive lead optimisation.
Property-based baselines (molecular weight, lipophilicity) ensure models demonstrate real predictive power.
Evaluation mirrors real questions: 'Which analogue should we synthesise next?'
Ranking
Models perform best on ranking tasks (~50–63% pairwise accuracy), suggesting ranking is more tractable than absolute affinity prediction.
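For context on the pairwise figure, here is a minimal sketch of how pairwise accuracy can be computed for one analogue series. The function name, the tie handling, and the pIC50 values are assumptions made for illustration, not the benchmark's own implementation or data. A model that ordered pairs at random would be expected to score about 50% on this metric.

```python
from itertools import combinations


def pairwise_accuracy(y_true, y_pred):
    """Fraction of compound pairs whose potency order is predicted correctly.

    Assumption: pairs with tied measured values are skipped, and a tied
    prediction counts as a miss.
    """
    correct = 0
    total = 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # no defined order for this pair
        total += 1
        # Same sign on both differences means the pair is ordered correctly.
        if (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j]) > 0:
            correct += 1
    return correct / total if total else float("nan")


# Hypothetical pIC50 values for a four-compound series (illustrative only).
measured = [6.1, 7.4, 5.8, 8.0]
predicted = [6.5, 6.9, 6.0, 6.7]
print(pairwise_accuracy(measured, predicted))
```

In this toy series the predictions order five of the six pairs correctly, even though every predicted value is off in absolute terms.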
Regression
Continuous affinity prediction is poor: negative R² values mean the models do worse than always predicting the mean affinity.
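The link between a negative R² and the mean predictor follows from the standard definition, R² = 1 − SS_res / SS_tot. The sketch below uses that definition with hypothetical affinity values; the function name and the numbers are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant; otherwise SS_tot is zero and the
    division fails.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured affinities (illustrative only).
measured = [6.0, 7.0, 8.0]

mean_only = [7.0, 7.0, 7.0]  # always predict the mean
reversed_trend = [8.0, 7.0, 6.0]  # predictions with the trend inverted

print(r_squared(measured, mean_only))
print(r_squared(measured, reversed_trend))
```

Predicting the mean gives SS_res equal to SS_tot, so R² is exactly zero. Any model whose squared error exceeds that of the mean predictor lands below zero.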
Baselines
Property-based heuristics (molecular weight, cLogP) are surprisingly competitive, which suggests that current models lean heavily on simple molecular properties.
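One way to make such a comparison concrete is to treat a single property as if it were a model score and evaluate it with the same ranking metric. The sketch below does this with molecular weight; the compound values, the "heavier means more potent" heuristic, and all names are hypothetical and chosen only to illustrate the comparison.

```python
from itertools import combinations


def pairwise_accuracy(measured, scores):
    """Share of untied compound pairs that `scores` orders like `measured`."""
    pairs = [(i, j) for i, j in combinations(range(len(measured)), 2)
             if measured[i] != measured[j]]
    hits = sum((measured[i] - measured[j]) * (scores[i] - scores[j]) > 0
               for i, j in pairs)
    return hits / len(pairs)


# Hypothetical series: values are illustrative, not benchmark data.
measured_pic50 = [6.1, 7.4, 5.8, 8.0]
molecular_weight = [342.4, 371.5, 350.4, 398.5]  # baseline score: heavier = more potent
model_score = [6.5, 6.9, 6.0, 6.7]               # output of some hypothetical model

baseline_acc = pairwise_accuracy(measured_pic50, molecular_weight)
model_acc = pairwise_accuracy(measured_pic50, model_score)
margin = model_acc - baseline_acc  # how far the model clears the baseline

print(f"baseline={baseline_acc:.2f} model={model_acc:.2f} margin={margin:+.2f}")
```

In this toy case the molecular-weight heuristic matches the model exactly, so the margin is zero. A model only demonstrates predictive power beyond simple properties when that margin is clearly positive.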
SAR
SAR winner identification (~12–21% accuracy) remains extremely challenging — far below what is useful for real medicinal chemistry.
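The exact definition of winner identification is not spelled out here, so the sketch below rests on an assumption: the "winner" of an analogue series is its most potent compound, and a prediction counts as correct when the model's top-scored compound is that winner. Under this assumption, picking at random from a series of n compounds succeeds with probability 1/n, which gives a reference point for reading the accuracy. All names and values are hypothetical.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def winner_accuracy(series_list):
    """Fraction of series whose top-predicted compound is the true best.

    Assumed definition; each item is a (measured, predicted) pair of lists.
    """
    hits = sum(argmax(m) == argmax(p) for m, p in series_list)
    return hits / len(series_list)


def chance_level(series_list):
    """Expected accuracy when picking one compound per series at random."""
    return sum(1 / len(m) for m, _ in series_list) / len(series_list)


# Two hypothetical analogue series with pIC50-like values (illustrative only).
series_list = [
    ([6.1, 7.4, 5.8, 8.0], [6.5, 6.9, 6.0, 6.7]),  # true best: index 3, predicted: index 1
    ([5.0, 6.2, 5.5], [5.1, 6.0, 5.9]),            # true best: index 1, predicted: index 1
]

print(winner_accuracy(series_list))
print(chance_level(series_list))
```

Because the chance level depends on series size, an accuracy figure is easier to interpret alongside the number of compounds per series.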