PatentBind
Greasier is Better
Property Baseline
Ranks by lipophilicity (cLogP).

A property-based baseline that scores molecules by computed lipophilicity (cLogP via RDKit). Greasier compounds often bind more strongly through the hydrophobic effect, but potency gained this way is undesirable in drug design.
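A minimal sketch of how such a baseline could be scored. The description only says "cLogP via RDKit", so the use of RDKit's Crippen estimator (`Crippen.MolLogP`) is an assumption, and the helper names `clogp_from_smiles` and `rank_by_lipophilicity` are hypothetical. RDKit is imported lazily so the ranking step runs without it.

```python
from typing import Dict, List


def clogp_from_smiles(smiles: str) -> float:
    """Crippen cLogP for one SMILES string (assumed estimator; needs RDKit)."""
    from rdkit import Chem
    from rdkit.Chem import Crippen

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Crippen.MolLogP(mol)


def rank_by_lipophilicity(clogp_by_id: Dict[str, float]) -> List[str]:
    """Return compound IDs ordered from most to least lipophilic."""
    return sorted(clogp_by_id, key=lambda cid: clogp_by_id[cid], reverse=True)


# Illustrative cLogP values, not taken from the benchmark.
example = {"cmpd_a": 1.2, "cmpd_b": 4.7, "cmpd_c": 3.1}
ranking = rank_by_lipophilicity(example)
```

The baseline's prediction is simply this ordering: the greasiest compound is ranked as the most active.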

Design Rationale

This baseline is expected to fail specifically on LLE (lipophilic ligand efficiency) tasks, where lipophilicity is penalised. A model that cannot beat it on LLE tasks is not learning specificity.
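To illustrate why a lipophilicity ranking is expected to fail here, the sketch below assumes the common definition LLE = pIC50 − cLogP. The benchmark's exact definition is not stated in this page, and the compound values are made up for illustration.

```python
def lle(pic50: float, clogp: float) -> float:
    """Lipophilic ligand efficiency under the assumed definition pIC50 - cLogP."""
    return pic50 - clogp


# Hypothetical compounds with similar potency: (pIC50, cLogP).
compounds = {
    "cmpd_a": (7.0, 1.5),
    "cmpd_b": (7.2, 4.5),
    "cmpd_c": (7.1, 3.0),
}

# Ranking by LLE rewards potency that is not bought with lipophilicity.
by_lle = sorted(compounds, key=lambda c: lle(*compounds[c]), reverse=True)

# The baseline ranks by cLogP alone.
by_clogp = sorted(compounds, key=lambda c: compounds[c][1], reverse=True)
```

With potency roughly constant, the two orderings are exact opposites, which is the failure mode the baseline is meant to expose.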

Evaluation Scores

AUPRC: 0.1779

Area under the precision-recall curve. More informative when class balance is skewed.
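One common estimator of this area is average precision, sketched below in plain Python. Whether the benchmark uses average precision or a trapezoidal estimate is not stated, so treat the choice as an assumption. The function name and the toy data are illustrative.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each active (label 1)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without actives")
    # Sort indices by descending score; ties keep their input order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at this cut-off
    return total / n_pos


# Toy example: actives are ranked first and third.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

For a random ranking this quantity is close to the fraction of actives, so it should be read against the class balance.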

AUROC: 0.2687

Area under the ROC curve. Measures discrimination ability across all thresholds.
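The area under the ROC curve equals the probability that a randomly chosen active is scored above a randomly chosen inactive. The sketch below computes it directly from that pairwise definition. It is quadratic in the number of compounds and meant for illustration only; the function name is hypothetical.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (active, inactive) pairs where the active scores higher."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one active and one inactive")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tied scores count as half a win
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
value = auroc(labels, scores)
flipped = auroc(labels, [-s for s in scores])
```

Under this definition a value below 0.5 means the scores place actives below inactives more often than not, and negating the scores gives one minus the original value.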

EF 1%: 0.0000

Enrichment factor at 1%. The hit rate among the top-scoring 1% of compounds relative to the hit rate expected from random selection.

EF 5%: 0.0000

Enrichment factor at 5%. The hit rate among the top-scoring 5% of compounds relative to the hit rate expected from random selection.
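A sketch of the enrichment factor as the hit rate in the top fraction divided by the overall hit rate. How the benchmark rounds the size of the top fraction is not stated; rounding up with a minimum of one compound is an assumption here, and the data are illustrative.

```python
import math
from typing import Sequence


def enrichment_factor(
    labels: Sequence[int], scores: Sequence[float], fraction: float
) -> float:
    """Hit rate in the top `fraction` of the ranking over the overall hit rate."""
    n = len(labels)
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("enrichment factor is undefined without actives")
    # Assumed convention: round the cut-off up, keep at least one compound.
    n_top = max(1, math.ceil(fraction * n))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top_hits = sum(labels[i] for i in order[:n_top])
    return (top_hits / n_top) / (total_actives / n)


# Ten compounds, two actives; scores decrease with index.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
ef_20 = enrichment_factor(labels, scores, 0.2)
ef_20_reversed = enrichment_factor(labels, scores[::-1], 0.2)
```

A value of 1 corresponds to random selection, and a value of 0 means no actives landed in the top fraction at all.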

Adjacent Accuracy: 0.6095

Fraction of predictions within ±1 of the true ordinal rank. Reflects triage decisions.
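The definition above can be sketched as a tolerance-based accuracy over ordinal ranks. The function name and the example ranks are hypothetical, and ranks are assumed to be integers.

```python
from typing import Sequence


def accuracy_within(
    true_ranks: Sequence[int], pred_ranks: Sequence[int], tolerance: int = 1
) -> float:
    """Fraction of predictions whose rank is within `tolerance` of the truth."""
    if len(true_ranks) != len(pred_ranks):
        raise ValueError("rank sequences must have the same length")
    if not true_ranks:
        raise ValueError("at least one prediction is required")
    hits = sum(abs(t - p) <= tolerance for t, p in zip(true_ranks, pred_ranks))
    return hits / len(true_ranks)


true_ranks = [0, 1, 2, 3, 4]
pred_ranks = [0, 2, 4, 3, 1]
adjacent = accuracy_within(true_ranks, pred_ranks, tolerance=1)
exact = accuracy_within(true_ranks, pred_ranks, tolerance=0)
```

Setting the tolerance to zero gives the stricter exact-match variant, so the adjacent score can never be lower than the exact one.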

Concordance Index: 0.6051

Harrell's C statistic: the probability that a randomly chosen pair is correctly ordered.
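A minimal sketch of the concordance index for uncensored data. Pairs with equal true values are skipped as not comparable and tied predictions count as half; both conventions are assumptions, since the page does not say how ties are handled.

```python
from itertools import combinations
from typing import Sequence


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant = 0.0
    comparable = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        if t1 == t2:
            continue  # no true ordering to recover for this pair
        comparable += 1
        if p1 == p2:
            concordant += 0.5  # tied prediction counts as half
        elif (p1 > p2) == (t1 > t2):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Six pairs in total; only the (2, 3) pair is predicted in the wrong order.
c_index = concordance_index([1, 2, 3, 4], [0.1, 0.4, 0.3, 0.9])
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking.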

Kendall τ: 0.2203

Rank correlation averaged within assay groups.
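The sketch below shows one way to average a rank correlation within assay groups. It uses the simple tau-a variant, which does not correct for ties; the benchmark may use tau-b or a different weighting of groups. The group labels and helper names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Hashable, Sequence


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """(concordant - discordant) pairs divided by the total number of pairs."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two items")
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            s += 1
        elif prod < 0:
            s -= 1
    return s / (n * (n - 1) / 2)


def grouped_kendall_tau(
    groups: Sequence[Hashable], y_true: Sequence[float], y_pred: Sequence[float]
) -> float:
    """Unweighted mean of tau-a over groups with at least two compounds."""
    by_group = defaultdict(list)
    for g, t, p in zip(groups, y_true, y_pred):
        by_group[g].append((t, p))
    taus = []
    for pairs in by_group.values():
        if len(pairs) < 2:
            continue  # a single compound cannot be ranked
        t, p = zip(*pairs)
        taus.append(kendall_tau_a(t, p))
    if not taus:
        raise ValueError("no group has two or more compounds")
    return sum(taus) / len(taus)


groups = ["a1", "a1", "a1", "a2", "a2", "a2"]
y_true = [1, 2, 3, 1, 2, 3]
y_pred = [0.2, 0.5, 0.9, 0.1, 0.3, 0.2]
tau = grouped_kendall_tau(groups, y_true, y_pred)
```

Averaging within groups avoids comparing compounds across assays, whose measured values may not be on a common scale.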

Exact Accuracy: 0.2456

Fraction of predictions matching the exact ordinal rank.

Pairwise Accuracy: 0.6301

Fraction of pairs correctly ordered by ordinal rank.

mrr: 0.2178
top1_accuracy: 0.0042
mrr: 0.5534
top1_accuracy: 0.2788