About BioLens
BioLens transforms unstructured scientific literature into structured, analyzable experimental data using large language models (LLMs). We aim to accelerate progress in aging biology and drug discovery by making experimental data more accessible, comparable, and reusable.
Our Mission
Scientific knowledge is often trapped in unstructured text, making it difficult to analyze systematically. By converting the scientific literature into standardized, machine-readable datasets, BioLens enables researchers to explore, compare, and build upon experimental findings at scale.
LLM-Powered Metadata Extraction
BioLens automatically identifies and extracts experimental metadata from scientific publications, including study design, interventions, outcomes, models, and effect sizes. This enables large-scale synthesis and cross-study comparison across the literature, streamlining data-driven discovery.
Extracted data are normalized into consistent formats and ontologies, ensuring interoperability and comparability within the dataset.
Workflow
BioLens's workflow comprises six sequential stages designed to transform raw text into high-quality, analyzable data.
Data Gathering
Identifying and collecting relevant scientific papers from sources such as PubMed and other open-access repositories.
Data Classification
Using LLM-driven classification to categorize papers by topic, intervention type, and experimental model, so that only relevant studies progress to extraction.
PDF Parsing
Converting publications into machine-readable formats to enable accurate downstream text analysis.
Experiment Extraction
Using advanced LLMs and customizable prompts to capture key metadata such as interventions, outcomes, and study parameters from each paper.
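As a rough illustration of what one structured record from this stage might look like, the sketch below defines a hypothetical schema in Python. The class name, field names, and helper function are assumptions made for illustration, not the actual BioLens schema; the fields simply mirror the metadata categories named above.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ExperimentRecord:
    """Hypothetical record for one extracted experiment (not the BioLens schema)."""
    source_id: str                        # identifier of the source paper
    intervention: str                     # compound, genetic manipulation, or supplement
    model: str                            # experimental model, e.g. a mouse strain
    outcome: str                          # measured outcome, e.g. "median lifespan"
    effect_size: Optional[float] = None   # None when the paper reports no effect size


def to_row(record: ExperimentRecord) -> dict:
    """Flatten a record into a plain dict, ready for a table or JSON export."""
    return asdict(record)


# Placeholder values only; these do not come from any real paper.
example = ExperimentRecord(
    source_id="PLACEHOLDER-ID",
    intervention="compound X",
    model="wild-type mouse",
    outcome="median lifespan",
)
```

Keeping `effect_size` optional lets a record stay in the dataset even when a paper omits that value, so completeness can be assessed later rather than at extraction time.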
Data Standardization
Normalizing extracted information into unified formats, ontologies, and controlled vocabularies to ensure consistency and cross-study comparability. For lifespan studies, effect sizes are additionally normalized to the mice's age at the start of the survival experiment, enabling fair cross-study comparisons between cohorts initiated at different life stages.
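The normalization formula is not given here, so the following is only a sketch of one plausible approach: expressing the change in median lifespan relative to the control group's remaining lifespan at the start of the experiment, rather than its total lifespan. The function name, its parameters, and the formula itself are assumptions for illustration, not BioLens's documented method.

```python
def remaining_lifespan_change(control_median: float,
                              treated_median: float,
                              start_age: float) -> float:
    """Percent change in median lifespan remaining after the experiment start.

    All three arguments must use the same time unit (e.g. days).
    Assumed formula, not taken from BioLens:
        100 * (treated_median - control_median) / (control_median - start_age)
    """
    remaining_control = control_median - start_age
    if remaining_control <= 0:
        raise ValueError("start_age must be below the control median lifespan")
    return 100.0 * (treated_median - control_median) / remaining_control
```

With purely illustrative numbers (control median 800 days, treated median 880 days), the same 80-day gain corresponds to a 10% change when the experiment starts at birth and a 40% change when it starts at 600 days. This is why the age at the start of the experiment matters when comparing cohorts.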
Reporting
Generating structured datasets and interactive visualizations, allowing researchers to explore findings and download data for further analysis.
Every stage incorporates rigorous quality assurance protocols to ensure precision, reproducibility, and reliability of the results.
Application: The Lifespan Project
To demonstrate BioLens's capabilities, we applied this workflow to mouse lifespan research. Our pipeline initially identified approximately 270,000 potentially relevant papers, which were then narrowed to a curated set of 600 lifespan-related publications recorded in our database.
This large-scale effort established one of the most comprehensive structured resources for lifespan experiments, capturing detailed information about interventions, outcomes, and survival effects in mouse models.
BioLens displays lifespan experiments involving genetic, pharmacological, or supplement-based interventions. The dataset also includes studies that report no timepoint data for the control and/or intervention groups. These experiments are retained when they provide reliable lifespan outcomes, ensuring comprehensive coverage while allowing users to assess data completeness for their specific analytical requirements.
Wild Type panel
The Wild Type panel includes (1) healthy wild-type (WT) mice (littermates or strain-matched) that age naturally without baseline genetic or disease manipulation, and (2) WT controls that receive at most vehicle or sham procedures, ensuring that survival effects reflect the tested intervention(s).
Progeroid panel
The Progeroid panel includes (1) validated progeroid rodent models with systemic premature-aging phenotypes, and (2) disease or accelerated-aging models shown to shorten whole-organism lifespan via aging-linked mechanisms.
Last updated: 2 September.
Collaborate With Us
We're actively expanding BioLens to include additional experimental domains beyond lifespan data. If you'd like to see other types of experimental data represented on the BioLens platform, or if you're interested in collaborating, we'd love to hear from you.
Contact us at: contact@biolens.io