TITLE:
Machine Learning Based Virtual Screening for Biodegradable Polyesters
AUTHORS:
Navya Nori
KEYWORDS:
Biodegradability, Molecular Generation, Virtual Screening
JOURNAL NAME:
Journal of Materials Science and Chemical Engineering, Vol.12 No.8, August 22, 2024
ABSTRACT: Current estimates indicate that polyesters take more than 200 years to biodegrade. Polyesters are a crucial material across several industries, so sustainable alternatives are needed. Recent work in generative modeling has made it possible to produce large sets of candidate chemical structures, but current molecular screening methods are expensive, difficult to scale, and oversimplified. This work evaluates whether a molecule's biodegradability potential can be accurately predicted by a model trained on recent experimental data. In addition, three chemical descriptors of the final molecules were evaluated for their effects on biodegradability: molecular structure, bond types, and solubility. A Gradient Boosted Machine was trained on a dataset of 600 molecules with binary biodegradability labels. The classification model effectively captured the biodegradability property, yielding an Area Under the Receiver Operating Characteristic curve (AUROC) of 84% and an Area Under the Precision-Recall Curve (AUPRC) of 87%. An existing amortized synthetic tree generation model, SynNet, was then used to validate each molecule by demonstrating its chemical synthesizability and producing simple, interpretable synthesis pathways. This approach of filtering by prediction and chemical rule interpretation is inexpensive, highly scalable, and can capture the necessary complexity. Using this method, novel polyester candidates can be polymerized and made into sustainable fabrics, reducing environmental stress from textile-reliant industries.
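The two metrics reported in the abstract can be illustrated with a minimal pure-Python sketch. This is not the paper's evaluation code: the function names, the toy labels, and the toy scores are all hypothetical. AUPRC is approximated here by average precision, which is one common estimator and may differ from the one the author used, and tied scores are not given special handling in that function.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each
    positive example, with examples sorted by descending score."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 1 = biodegradable, 0 = not biodegradable.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(f"AUROC = {toy_auroc:.3f}, average precision = {toy_ap:.3f}")
```

On this toy data, 8 of the 9 positive/negative pairs are ranked correctly, so the AUROC is 8/9. In a screening workflow, the scores would be the classifier's predicted probabilities on held-out molecules.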