SynthBioData (Synthetic Biological Data)
A Python package for generating synthetic drug discovery data that mimics real-world scenarios using realistic molecular descriptors and target properties.
Important Notice
This package generates synthetic data for testing and educational purposes only.
The data produced does not represent real biological or chemical measurements and should not be used for clinical, regulatory, or production applications.
Quick Start
Get started with synthbiodata in just a few lines of code:
from synthbiodata import generate_sample_data
# Generate molecular descriptor data
df = generate_sample_data(data_type="molecular-descriptors")
print(f"Generated {len(df)} samples with {len(df.columns)} features")
# Generate ADME data
df_adme = generate_sample_data(data_type="adme")
print(f"Generated {len(df_adme)} samples with {len(df_adme.columns)} features")
Key Features
- Molecular Descriptors - Generate realistic molecular properties such as MW, LogP, TPSA, HBD, HBA, and more.
- ADME Data - Simulate Absorption, Distribution, Metabolism, and Excretion properties.
- Target Families - Support for GPCR, Kinase, Protease, and other protein families.
- Chemical Fingerprints - Generate binary chemical fingerprints as features.
- Configurable - Customize data generation parameters and distributions.
- Efficient - Built on Polars for fast data manipulation and processing.
Data Types
Molecular Descriptors
Generate synthetic molecular data with features like:
- Molecular weight, LogP, TPSA
- Hydrogen bond donors/acceptors
- Rotatable bonds, aromatic rings
- Chemical fingerprints
- Target protein families (GPCR, Kinase, Protease, etc.)
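To make the shape of such a dataset concrete, here is a minimal standard-library sketch of how rows with these features could be sampled. It is not the package's implementation: the function name `sample_descriptors`, the column names, and every distribution and range below are assumptions chosen purely for illustration.

```python
import random


def sample_descriptors(n_samples, seed=0):
    """Draw hypothetical molecular descriptor rows from simple distributions.

    All column names and ranges are illustrative assumptions, not the
    values used by synthbiodata.
    """
    rng = random.Random(seed)  # seeded generator for reproducible output
    families = ["GPCR", "Kinase", "Protease"]
    rows = []
    for _ in range(n_samples):
        rows.append({
            # Molecular weight in daltons, clamped to a plausible minimum
            "molecular_weight": max(100.0, rng.gauss(350.0, 75.0)),
            "logp": rng.gauss(2.5, 1.0),
            # Topological polar surface area, clamped at zero
            "tpsa": max(0.0, rng.gauss(80.0, 25.0)),
            "hbd": rng.randint(0, 5),    # hydrogen bond donors
            "hba": rng.randint(0, 10),   # hydrogen bond acceptors
            "rotatable_bonds": rng.randint(0, 12),
            "aromatic_rings": rng.randint(0, 4),
            # A short binary fingerprint; real fingerprints are much longer
            "fingerprint": [rng.randint(0, 1) for _ in range(16)],
            "target_family": rng.choice(families),
        })
    return rows


rows = sample_descriptors(5, seed=42)
print(f"Generated {len(rows)} rows with {len(rows[0])} features each")
```

Seeding the generator keeps the output reproducible, which is usually what you want from synthetic test data.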
ADME Data
Generate ADME (Absorption, Distribution, Metabolism, Excretion) data with:
- Absorption percentages
- Plasma protein binding
- Clearance rates and half-life
- Bioavailability predictions
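The properties listed above are not independent of one another, and a synthetic generator can reflect that. The sketch below is an illustration only, not the package's method: the function name `sample_adme`, the column names, and all value ranges are assumptions. It derives half-life from clearance using the standard one-compartment relation t½ = ln(2) · Vd / CL, and approximates bioavailability as the absorbed fraction reduced by a hypothetical first-pass loss.

```python
import math
import random


def sample_adme(n_samples, seed=0):
    """Draw hypothetical ADME rows; all ranges are illustrative assumptions."""
    rng = random.Random(seed)  # seeded generator for reproducible output
    rows = []
    for _ in range(n_samples):
        clearance = rng.uniform(0.5, 20.0)     # L/h
        volume = rng.uniform(10.0, 300.0)      # L, volume of distribution
        absorption = rng.uniform(20.0, 100.0)  # percent absorbed
        first_pass = rng.uniform(0.0, 0.6)     # fraction lost to first-pass metabolism
        rows.append({
            "absorption_pct": absorption,
            "plasma_protein_binding_pct": rng.uniform(50.0, 99.9),
            "clearance_l_per_h": clearance,
            # One-compartment model: half-life = ln(2) * Vd / CL
            "half_life_h": math.log(2) * volume / clearance,
            # Simplified estimate: absorbed fraction minus first-pass loss
            "bioavailability_pct": absorption * (1.0 - first_pass),
        })
    return rows


adme_rows = sample_adme(3, seed=7)
for row in adme_rows:
    print({name: round(value, 2) for name, value in row.items()})
```

Deriving half-life from clearance, rather than sampling the two independently, keeps each row internally consistent: a high-clearance compound always ends up with a proportionally shorter half-life for the same volume of distribution.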
⬇ Installation
Install synthbiodata using your preferred package manager.
📖 Documentation
Explore the docs:
- Quick Start - Get up and running quickly.
- User Guide - How SynthBioData works behind the scenes.
- API Reference - Complete API documentation.
- Examples - Detailed usage examples.