PatentBind Benchmark


Greasier is Better

Property Baseline

Ranks by lipophilicity (cLogP).

A property-based baseline that scores molecules by computed lipophilicity (cLogP, calculated with RDKit). Greasier compounds often bind more strongly through the hydrophobic effect, but gaining potency through lipophilicity alone is undesirable in drug design.

Design Rationale

This baseline is expected to fail on lipophilic ligand efficiency (LLE) tasks, where lipophilicity is penalised. A model that cannot beat it on those tasks is not learning specificity.
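A minimal sketch of why ranking by cLogP tends to fail once lipophilicity is penalised. It assumes the common definition LLE = potency (for example pIC50) minus cLogP; the benchmark's exact formula is not stated here. The compounds and values are invented toy data, not PatentBind records.

```python
# Toy values invented for illustration only; not PatentBind data.
compounds = [
    {"name": "A", "potency": 8.0, "clogp": 5.5},
    {"name": "B", "potency": 7.5, "clogp": 2.0},
    {"name": "C", "potency": 6.5, "clogp": 3.5},
]


def lle(compound):
    # Assumed definition: LLE = potency (e.g. pIC50) - cLogP.
    return compound["potency"] - compound["clogp"]


# The baseline's ranking: greasiest compound first.
by_clogp = sorted(compounds, key=lambda c: c["clogp"], reverse=True)
# The ranking an LLE task rewards: most efficient compound first.
by_lle = sorted(compounds, key=lle, reverse=True)

clogp_order = [c["name"] for c in by_clogp]
lle_order = [c["name"] for c in by_lle]
print(clogp_order, lle_order)
```

In this toy set the compound the baseline ranks first is the one with the lowest LLE, which is the failure mode the design rationale describes.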

Evaluation Scores

Binary Classification

pointwise

AUPRC: 0.1779

Area under the precision-recall curve. More informative than AUROC when the classes are imbalanced.

AUROC: 0.2687

Area under the ROC curve. Measures discrimination ability across all thresholds.
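As a rough illustration of the two pointwise metrics above, the sketch below computes AUROC as the probability that a random active outscores a random inactive, and approximates AUPRC with average precision. The function names and toy data are hypothetical; the benchmark's own implementation (tie handling, interpolation) may differ.

```python
def auroc(labels, scores):
    """Fraction of (active, inactive) pairs where the active scores higher.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each active.

    A common estimator of AUPRC; ties are broken by input order here.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one active")
    return total / hits


# Toy data: 1 = active, 0 = inactive.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(auroc(labels, scores), average_precision(labels, scores))
```

Under this definition an AUROC of 0.5 corresponds to random ranking, and a value below 0.5 means actives tend to be scored lower than inactives.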

EF 1%: 0.0000

Enrichment factor at 1%. The hit rate among the top-scored 1% of compounds divided by the hit rate expected from random selection.

EF 5%: 0.0000

Enrichment factor at 5%. The hit rate among the top-scored 5% of compounds divided by the hit rate expected from random selection.
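The enrichment factor can be sketched as follows. This is an illustrative version under stated assumptions: the size of the top slice is obtained by rounding, with a minimum of one compound, and ties are broken by input order. The benchmark's exact rounding rule is not given here, and the function name and toy data are hypothetical.

```python
def enrichment_factor(labels, scores, fraction):
    """Hit rate in the top-scored fraction divided by the overall hit rate."""
    n = len(labels)
    n_actives = sum(labels)
    if n == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    # Assumed rounding rule: nearest integer, at least one compound.
    n_top = max(1, round(fraction * n))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top_hits = sum(labels[i] for i in order[:n_top])
    return (top_hits / n_top) / (n_actives / n)


# Toy library: 100 compounds, 10 actives, score equal to the index.
active_ids = {99, 97, 80, 70, 60, 50, 40, 30, 20, 10}
labels = [1 if i in active_ids else 0 for i in range(100)]
scores = list(range(100))

ef1 = enrichment_factor(labels, scores, 0.01)
ef5 = enrichment_factor(labels, scores, 0.05)
print(ef1, ef5)
```

An enrichment factor of 0, as reported above, means that no actives appear in the top-scored slice at all.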

Pairwise Ordinal

pairwise

Pairwise Accuracy: 0.6301

Fraction of pairs correctly ordered by ordinal rank.
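One way to read the pairwise metric is sketched below. It assumes that every pair of compounds with different ordinal ranks is evaluated, that a higher rank and a higher score both mean "more active", and that tied scores count as incorrect. How the benchmark actually selects pairs is not described here, so treat these choices as assumptions.

```python
from itertools import combinations


def pairwise_accuracy(ranks, scores):
    """Fraction of comparable pairs whose score order matches their rank order."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(ranks)), 2):
        if ranks[i] == ranks[j]:
            continue  # pairs with equal ordinal rank carry no ordering signal
        total += 1
        # Positive product: scores and ranks differ in the same direction.
        if (ranks[i] - ranks[j]) * (scores[i] - scores[j]) > 0:
            correct += 1
    if total == 0:
        raise ValueError("no pairs with different ordinal ranks")
    return correct / total


# Toy data: ordinal activity ranks and model scores for four compounds.
ranks = [3, 2, 1, 1]
scores = [0.9, 0.4, 0.5, 0.1]
acc = pairwise_accuracy(ranks, scores)
print(acc)
```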

SAR Winner (LLE)

listwise

mrr: 0.2178

top1_accuracy: 0.0042

SAR Winner (Ordinal)

listwise

mrr: 0.5534

top1_accuracy: 0.2788
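The listwise metrics are not defined on this page. The sketch below assumes the usual meanings: each list has one designated winner, mrr is the mean of 1/rank of that winner under the model's scores, and top1_accuracy is the fraction of lists where the winner is ranked first. The function names, the pessimistic tie handling, and the toy lists are all hypothetical.

```python
def winner_rank(scores, winner_index):
    """1-based rank of the winner by descending score.

    Ties are resolved pessimistically: tied competitors rank ahead.
    """
    w = scores[winner_index]
    ahead = sum(
        1 for k, s in enumerate(scores) if k != winner_index and s >= w
    )
    return 1 + ahead


def mrr_and_top1(groups):
    """groups: list of (scores, winner_index) tuples, one per list."""
    if not groups:
        raise ValueError("need at least one list")
    ranks = [winner_rank(scores, winner) for scores, winner in groups]
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    top1 = sum(1 for r in ranks if r == 1) / len(ranks)
    return mrr, top1


# Toy lists: model scores per candidate and the index of the true winner.
groups = [
    ([0.9, 0.2, 0.1], 0),  # winner ranked 1st
    ([0.3, 0.8, 0.5], 0),  # winner ranked 3rd
    ([0.1, 0.7, 0.4], 2),  # winner ranked 2nd
]
mrr, top1 = mrr_and_top1(groups)
print(mrr, top1)
```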
