Licenses & provenance
Unless stated otherwise, data are available under CC BY 4.0.
RetroRules integrates identifiers, chemical structures and annotations from the public databases MNXref, Rhea, USPTO, ChEBI and UniProt. These databases may have different licensing terms. Check the license of each referenced database before redistribution, especially for commercial use.
Current release — v3.0
Release v3.0 includes three datasets derived from MetaNetX, Rhea, and the USPTO. Each provides reaction templates at radii 0-10 (implicit hydrogens), matching the current web app. Files are TSV: one rule per line plus metadata columns.
Datasets
- MetaNetX-derived dataset: TSV archive (doi: 10.xxxx/coming.soon)
- Rhea-derived dataset: TSV archive (doi: 10.xxxx/coming.soon)
- USPTO-derived dataset: TSV archive (doi: 10.xxxx/coming.soon)
Format
- TEMPLATE_ID (str): Template identifier (i.e., RR:…).
- TEMPLATE (str): Reaction SMARTS pattern (implicit H).
- REACTIONS (list[str]): Source reaction identifiers (e.g., MNXR), semicolon-separated.
- REACTIONS_COUNT (int): Number of linked source reactions.
- ECS (list[str]): Associated EC numbers, comma-separated.
- ECS_COUNT (int): Number of distinct EC numbers.
- RADIUS_MIN (int): Minimum modeled radius level.
- RADIUS_MAX (int): Maximum modeled radius level.
- RADII (list[int]): Radius levels modeled (integers 0-10), comma-separated.
- SCORE (float): Enzyme-likeness score in [0, 1].
- VALID (bool): Template SMARTS validation flag (1/0).
- DATASETS (list[str]): Provenance tags (metanetx, rhea, uspto), comma-separated.
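The column types listed above can be applied when loading a file. The following is a minimal sketch rather than an official loader. It assumes that each TSV starts with a header row carrying exactly these column names, that empty list cells are empty strings, and that fields are not quoted. The helper names (split_list, parse_row, read_templates) and all values in the sample row, including the identifier and the SMARTS string, are hypothetical and only illustrate the layout.

```python
import csv
import io
from dataclasses import dataclass


def split_list(value, sep):
    """Split a delimited cell into stripped items; an empty cell gives []."""
    return [item.strip() for item in value.split(sep) if item.strip()]


@dataclass
class Template:
    template_id: str
    template: str
    reactions: list
    reactions_count: int
    ecs: list
    ecs_count: int
    radius_min: int
    radius_max: int
    radii: list
    score: float
    valid: bool
    datasets: list


def parse_row(row):
    """Convert one raw TSV row (dict of strings) into a typed Template."""
    return Template(
        template_id=row["TEMPLATE_ID"],
        template=row["TEMPLATE"],
        reactions=split_list(row["REACTIONS"], ";"),  # semicolon-separated
        reactions_count=int(row["REACTIONS_COUNT"]),
        ecs=split_list(row["ECS"], ","),
        ecs_count=int(row["ECS_COUNT"]),
        radius_min=int(row["RADIUS_MIN"]),
        radius_max=int(row["RADIUS_MAX"]),
        radii=[int(r) for r in split_list(row["RADII"], ",")],
        score=float(row["SCORE"]),
        valid=row["VALID"].strip() == "1",  # flag is documented as 1/0
        datasets=split_list(row["DATASETS"], ","),
    )


def read_templates(handle):
    """Read all templates from an open text handle (assumes a header row)."""
    # QUOTE_NONE is an assumption: it keeps SMARTS strings untouched.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [parse_row(row) for row in reader]


# Hypothetical sample: made-up values that only mirror the documented columns.
COLUMNS = [
    "TEMPLATE_ID", "TEMPLATE", "REACTIONS", "REACTIONS_COUNT", "ECS",
    "ECS_COUNT", "RADIUS_MIN", "RADIUS_MAX", "RADII", "SCORE", "VALID",
    "DATASETS",
]
SAMPLE_ROW = [
    "RR:EXAMPLE", "[C:1][O:2]>>[C:1]=[O:2]", "MNXR0001;MNXR0002", "2",
    "1.1.1.1,1.1.1.2", "2", "0", "2", "0,1,2", "0.75", "1", "metanetx,rhea",
]
SAMPLE = "\t".join(COLUMNS) + "\n" + "\t".join(SAMPLE_ROW) + "\n"

templates = read_templates(io.StringIO(SAMPLE))
```

For a downloaded file, the same function could be called on a handle opened with open(path, newline="", encoding="utf-8"). Filtering is then plain Python, for example keeping only rows where valid is true and a given radius appears in radii.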
Archives
Previous RetroRules releases.
rr02: Pre-parsed rule archives for RetroPath tools.
rr01: Initial RetroRules release (pre-parsed rules).
- MNX-based Dataset For RetroPath 2.0 Radii 1–8 Explicit Hydrogens: tar.gz on Zenodo