🐟 Cluefish: Transcriptomics Workflow 🕵️‍♂️, 👖 denim: R Package for Compartmental Models, 🪖 Napoleonic Soldiers' DNA Reveals Lice-Borne Disease 🪳
Stay Updated with the Latest in Bioinformatics!
Issue: 98 | Date: 08 August 2025
👋 Welcome to the Bioinformer Weekly Roundup!
In this newsletter, we curate and bring you the most captivating stories, developments, and breakthroughs from the world of bioinformatics. Whether you are a seasoned researcher, a student, or simply curious about the intersection of biology and data science, we have got you covered. Subscribe now to stay ahead in the exciting realm of Bioinformatics!
🔬 Featured Research
This study investigates liver-mediated mechanisms underlying growth retardation in piglets using transcriptomics, metabolomics, and ATAC-seq. Piglets with low birth and weaning weights exhibited hepatic vacuolation and structural lesions. Key pathways enriched include PPAR signalling, glutathione metabolism, and ferroptosis, with elevated GCLM and related metabolites linked to ROS scavenging and improved growth. Integrative analysis revealed transcription factor networks and differentially accessible regions associated with liver development and disease.
In silico pharmacological analysis of Tinospora cordifolia compounds targeting African swine fever virus B175L | bioRxiv
This research explores Tinospora cordifolia-derived compounds as potential inhibitors of ASFV B175L, a protein that suppresses STING-mediated IFN-I signalling. GC-MS identified 124 compounds, of which 52 met drug-likeness criteria. A high-confidence AlphaFold3 model of B175L was used for virtual screening, which identified four top ligands. The compound benzaldehyde, 5-bromo-2-hydroxy-, (5-trifluoromethyl-2-pyridyl)hydrazone showed the strongest binding affinity (-8.2 kcal/mol), and molecular dynamics simulations confirmed stable interactions at the active site.
This study presents a single-cell transcriptomic atlas to evaluate a novel transcription factor combination (GATA2, GFI1B, FOS, REL, STAT5A) for reprogramming human fibroblasts into HSC-like cells. The 5TF recipe tripled CD34+ cell conversion efficiency. Long-read scRNA-seq revealed heterogeneous gene expression and isoform diversity, indicating partial reprogramming. A benchmarking framework mapped transcriptomic states relative to native and initial cell types, highlighting alternative splicing's role in reprogramming dynamics.
This study introduces a literature citation network approach for drug repurposing, analyzing 19,553 drug pairs using the Jaccard coefficient. A validation set from repoDB was used to assess similarity metrics, with literature-based Jaccard similarity outperforming others. The method identified de novo repurposing candidates by ranking drug pairs and applying a quantile threshold. The study emphasizes leveraging literature overlap via drug-target gene associations to enhance repurposing predictions.
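The Jaccard ranking described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the drug names, the publication IDs, and the `rank_pairs` helper are all hypothetical, and the study's quantile threshold is left out.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A & B| / |A | B|; defined here as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_pairs(drug_to_papers: dict) -> list:
    """Score every drug pair by literature overlap, highest similarity first."""
    pairs = [
        (x, y, jaccard(drug_to_papers[x], drug_to_papers[y]))
        for x, y in combinations(sorted(drug_to_papers), 2)
    ]
    return sorted(pairs, key=lambda item: item[2], reverse=True)


# Hypothetical toy input: drug -> set of publication IDs linked to it.
drug_papers = {
    "drug_a": {1, 2, 3, 4},
    "drug_b": {3, 4, 5},
    "drug_c": {9},
}

ranked = rank_pairs(drug_papers)
for first, second, score in ranked:
    print(f"{first} / {second}: {score:.2f}")
```

Pairs near the top of the ranking share the most literature and would be the first candidates to inspect.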
This study explores the use of defective distributed Bragg reflectors (DBRs) doped with three-level rubidium atoms to control optical transmission and group velocity. The authors investigate the role of spontaneously generated coherence and phase-dependent quantum interference in modulating the DBR's optical response. The work demonstrates how tuning laser and pumping field parameters can shift probe field behaviour from absorption to gain. Applications are discussed in the context of optical filtering and photonic device engineering.
This research analyses differentially expressed lncRNAs and mRNAs in cervical cancer using bioinformatics approaches. The study identifies key genes and pathways potentially involved in disease progression through enrichment analysis and network construction. The findings provide insights into molecular mechanisms and suggest targets for further investigation in cervical cancer biology.
The study investigates the antibacterial mechanism of luteolin using protein-protein interaction networks derived from databases and analysed via Cytoscape. Ten target proteins were identified, and network clustering revealed seven functional modules linked to bacterial resistance processes. Functional annotation suggests luteolin's involvement in inhibiting bacterial growth and inducing apoptosis, contributing to its bacteriostatic activity.
🛠️ Latest Tools
The study presents DNA-Storalator, a cross-platform simulator designed to emulate biological and computational processes in DNA-based data storage. It models synthesis, PCR, and sequencing stages, injecting errors such as insertions, deletions, and substitutions based on customizable rates. The tool supports clustering, reconstruction algorithms and integrates with external error-correcting codes. It also enables analysis of new datasets to build error models for future simulations.
Source code is available here.
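To make the idea of error injection concrete, here is a minimal sketch of a noisy channel that applies deletions, substitutions, and insertions at configurable per-base rates. It is not DNA-Storalator's code or API; the function name `inject_errors`, its parameters, and the simple independent-error model are assumptions made for illustration.

```python
import random

BASES = "ACGT"


def inject_errors(strand, sub_rate, ins_rate, del_rate, rng):
    """Return a noisy copy of `strand` under an independent per-base error model.

    Assumes del_rate + sub_rate <= 1. `rng` is a random.Random instance so
    that runs are reproducible.
    """
    out = []
    for base in strand:
        r = rng.random()
        if r < del_rate:
            continue  # deletion: drop this base entirely
        if r < del_rate + sub_rate:
            # substitution: replace with one of the three other bases
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
        if rng.random() < ins_rate:
            out.append(rng.choice(BASES))  # insertion after the current base
    return "".join(out)


rng = random.Random(42)
original = "ACGTACGTACGTACGT"
noisy = inject_errors(original, sub_rate=0.05, ins_rate=0.02, del_rate=0.02, rng=rng)
print(original)
print(noisy)
```

A fuller simulator would apply different rates at each of the synthesis, PCR, and sequencing stages, as the tool description suggests.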
The study introduces FIBOS, a software package available in R and Python, implementing the Occluded Surface algorithm for atomic-level protein packing analysis. It enables comparison between experimental protein structures and AlphaFold predictions, revealing variability patterns in predicted models. The tool supports enhanced structural analysis and is designed for integration into computational workflows.
The study presents Varan, a Python-based tool designed to streamline the preparation of cancer genomic studies for cBioPortal. It automates data formatting, variant filtering, metadata generation, and version control from raw VCF files. Varan supports SNV and CNV data, integrates annotation tools like VEP and OncoKB, and includes features for study updates, removals, and extractions. The tool aims to reduce manual intervention and improve reproducibility in cancer data management workflows.
Source code is available here.
The study introduces Cluefish, a semi-automated R workflow for interpreting transcriptomic data series using over-representation analysis on pre-clustered protein–protein interaction networks. It enhances biological function detection by merging clusters and recovering isolated genes through shared contexts. Applied to zebrafish and other toxicology datasets, Cluefish identified low-dose gene deregulation and functions missed by standard methods. The tool integrates with DRomics for dose–response modeling and supports broad organism applicability.
Source code is available here.
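For readers new to over-representation analysis, the core test is usually a one-sided hypergeometric test: given a cluster of genes, how surprising is its overlap with an annotated gene set? The sketch below shows that standard test only. It is not Cluefish's implementation (Cluefish is an R workflow), and the gene IDs and function names are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def ora_pvalue(cluster, term, background):
    """Enrichment p-value of an annotation term within a gene cluster."""
    background = set(background)
    cluster = set(cluster) & background  # keep only genes in the tested universe
    term = set(term) & background
    overlap = len(cluster & term)
    return hypergeom_sf(overlap, len(background), len(term), len(cluster))


# Hypothetical gene universe of 10 genes, with 5 annotated to a term.
background = [f"g{i}" for i in range(10)]
term_genes = ["g0", "g1", "g2", "g3", "g4"]
cluster_genes = ["g0", "g1", "g2", "g3", "g4"]

print(ora_pvalue(cluster_genes, term_genes, background))
```

In practice the p-values from many terms and clusters would also need multiple-testing correction, which is omitted here.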
The study introduces scMomer, a pretraining framework designed to learn multi-modal representations from single-cell data, even when some modalities are missing. It employs a three-stage strategy: unimodal representation learning, joint multi-omics modeling, and knowledge distillation to generate multi-modal-like embeddings from unimodal inputs. The architecture is modality-specific and supports various downstream tasks such as gene function prediction and drug response modeling. Experimental results show scMomer’s ability to capture cellular heterogeneity and generalize across diverse biological contexts.
Source code is available here.
novoStoic2.0 is an integrated computational framework developed to enhance metabolic pathway design by combining pathway synthesis, thermodynamic feasibility evaluation, and enzyme selection. It builds upon the original novoStoic platform by incorporating updated reaction databases and improved algorithms for identifying biologically plausible and energetically favourable pathways. The tool also facilitates enzyme assignment for synthetic routes, enabling more accurate and implementable metabolic engineering strategies. This makes novoStoic2.0 particularly valuable for designing biosynthetic routes in industrial biotechnology and synthetic biology applications.
The source code is available here.
COEXIST is a computational framework developed to integrate serial multiplexed tissue images at the single-cell level. It addresses limitations in multiplexed tissue imaging (MTI), where different biomarker panels are applied to consecutive thin tissue sections due to panel size or assay incompatibilities. COEXIST combines shared molecular profiles with spatial information to overcome challenges posed by biological heterogeneity and misaligned cell populations across slices. The framework improves spatial single-cell profiling, corrects miscalled cell phenotypes, and enables cross-platform comparisons, enhancing the resolution and utility of MTI data in studying complex tissue environments like tumours.
The source code is available here.
SuperEdgeGO is a graph representation learning framework designed to enhance protein function prediction by incorporating edge supervision into protein graphs. Unlike traditional graph convolution methods, which use residue contacts passively or without supervision, SuperEdgeGO introduces a supervised attention mechanism that explicitly encodes residue interactions into the graph representation. This approach improves the model's ability to capture structural features relevant to protein function. Comprehensive experiments show that SuperEdgeGO achieves state-of-the-art performance across three categories of protein functions, with ablation studies confirming the effectiveness of its edge supervision strategy.
The source code is available here.
MOH is a novel computational framework that constructs a multilayer multi-omics heterogeneous graph to improve single-cell clustering accuracy. It integrates diverse omics data—such as scRNA-seq and scATAC-seq—into a unified graph structure, capturing complex biological relationships across modalities. By leveraging graph-based learning and multi-layered data representation, MOH enhances the resolution of cell type identification and reveals subtle cellular heterogeneity. The framework is implemented in Python and R and is available on GitHub for reproducible analysis and further development.
The source code is available here.
denim is an R package designed for building and simulating deterministic discrete-time compartmental models with flexible dwell time distributions. It supports both parametric (e.g., exponential, gamma, Weibull, log-normal) and non-parametric specifications for the duration individuals spend in each compartment, allowing for non-Markovian dynamics. The package includes a domain-specific language (DSL) for defining transitions and supports various transition types such as fixed, probabilistic, and multinomial. This flexibility makes denim particularly useful for modeling infectious disease dynamics and other systems where memory and time-dependent transitions are critical.
The source code is available here.
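The idea behind flexible dwell times can be sketched without the package. Below is a minimal deterministic, discrete-time S-I-R model in which the time spent in the infectious compartment follows an arbitrary discrete distribution instead of a fixed per-step exit rate. This is not denim's DSL or API (denim is an R package); the function `simulate_sir`, its parameters, and the infection formula are assumptions chosen for illustration.

```python
import math


def simulate_sir(beta, dwell_pmf, s0, i0, steps):
    """Deterministic discrete-time SIR with an explicit dwell-time distribution.

    dwell_pmf[d] is the probability of staying infectious for exactly d + 1
    steps; it is assumed to sum to 1. Returns a list of (S, I, R) per step.
    """
    n = s0 + i0
    # scheduled[j] = number of infectious individuals leaving I after j + 1 steps
    scheduled = [i0 * p for p in dwell_pmf]
    s, r = float(s0), 0.0
    history = []
    for _ in range(steps):
        i = sum(scheduled)
        history.append((s, i, r))
        # Assumed force of infection: each susceptible escapes with exp(-beta * I / N).
        new_infections = s * (1.0 - math.exp(-beta * i / n))
        leaving = scheduled[0]
        scheduled = scheduled[1:] + [0.0]  # everyone moves one step closer to exit
        for d, p in enumerate(dwell_pmf):
            scheduled[d] += new_infections * p  # spread new cases over dwell times
        s -= new_infections
        r += leaving
    return history


# Hypothetical dwell-time distribution: 20% leave after 1 step, 50% after 2, 30% after 3.
history = simulate_sir(beta=0.5, dwell_pmf=[0.2, 0.5, 0.3], s0=990, i0=10, steps=50)
for step, (s, i, r) in enumerate(history[:5]):
    print(step, round(s, 1), round(i, 1), round(r, 1))
```

Because exits are scheduled by time since entry, the model has memory: two individuals in the same compartment can have different remaining durations, which is the non-Markovian behaviour the blurb refers to.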
📰 Community News
Researchers applied computational algorithms to de-identified clinical records from six University of California health centers to compare patients with and without endometriosis. They identified more than 600 statistically significant associations between endometriosis and other diagnoses—from known comorbidities like autoimmune diseases and infertility to unexpected links such as certain cancers, asthma, and eye disorders. The retrospective study underscores endometriosis as a multisystem condition and provides a detailed map of its clinical correlations to inform future diagnostic and management strategies.
A team at the Spanish National Cancer Research Centre developed a plasma proteome-based assay that leverages fluorogenic reactions and machine learning to quantify five amino acids in patient blood samples. In trials with 170 individuals, the test identified 78% of early-stage solid tumors with zero false positives and distinguished between cancer types. Preliminary data suggest unique immune-response protein signatures for different diseases and predictive ability for treatment response, with ongoing clinical trials in the UK, US, and China.
An international consortium led by Baylor College of Medicine created BigHorn, a machine-learning pipeline that predicts lncRNA–DNA interactions via “elastic” sequence patterns rather than strict motif matching. Analyzing over 27,000 samples across cancer types, they uncovered widespread dual regulation—lncRNAs that both promote transcription of target genes and stabilize their mRNAs (“coordinated regulation”). Focusing on the oncogenic lncRNA ZFAS1, they showed it transcriptionally activates and preserves DICER1 mRNA, tightly linking lncRNA expression to microRNA network control.
Prima Mente plans a Phase I/II trial evaluating its epigenetic-based therapy for Alzheimer’s, building on preclinical data showing modulation of DNA methylation patterns linked to amyloid and tau pathology. The study will enrol mild to moderate Alzheimer’s patients and assess safety, tolerability, biomarker changes in cerebrospinal fluid, and cognitive outcomes over 12 months. The approach targets histone-modifying enzymes and non-coding RNA regulators to restore neuronal gene expression profiles disrupted in the disease.
Researchers integrated genome-wide association analysis and plasma proteomics in over 200,000 older adults to dissect biological drivers of frailty. The GWAS identified multiple loci—some overlapping with known aging and immune pathways—and Mendelian randomization implicated inflammatory mediators. Proteomic profiling nominated protein biomarkers involved in extracellular matrix remodelling, mitochondrial function, and innate immunity. Combined omics highlighted novel targets for intervention and reinforced the multifactorial etiology of frailty.
Napoleon's doomed retreat: DNA from Vilnius mass grave reveals signs of foodborne and lice-borne fever | Phys.Org
Ancient DNA extracted from 13 teeth of Napoleonic soldiers buried in Vilnius was sequenced to screen for pathogens. Four individuals carried Salmonella enterica Paratyphi C DNA (30–970 unique reads), indicating paratyphoid fever, while two yielded Borrelia recurrentis fragments (320–4,060 reads), consistent with relapsing fever. No authenticated reads for Rickettsia prowazekii or Bartonella quintana were detected. Findings align with historical accounts of contaminated rations and lice infestation during the 1812 retreat.
A study applied spatial-point statistics to 21 Paleolithic artifacts (2,840 incisions) alongside ethnographic tallies and butchery marks to distinguish intentional “artificial memory systems” (AMS) from other engravings. While butchery traces clustered randomly and decorative motifs showed variable patterns, candidate AMS exhibited regular spacing and near-orthogonal orientations—mirroring ethnographic tally sticks used for calendrical or accounting purposes. Results support early quantification practices in Africa and Europe dating back 1.7 million years.
Investigators observed that breast cancer brain metastases capture mitochondria from adjacent neurons via tunnelling nanotubes. In murine and organotypic co-culture models, invading tumor cells acquired functional neuronal mitochondria, boosting oxidative phosphorylation and invasive capacity. Blocking nanotube formation reduced mitochondrial transfer and metastatic growth. The study suggests intercellular organelle exchange as a metabolic adaptation supporting tumor colonization of the brain.
📅 Upcoming Events
Digital pathology and AI are reshaping biopharma R&D by enabling faster discovery, smarter clinical trials, and broader patient access. The webinar highlights the transition from analog to computational pathology, emphasizing AI-driven diagnostics, enterprise-wide data workflows, and real-world case studies. It outlines strategies for integrating pathology AI aligned with therapeutic and translational science priorities.
📚 Educational Corner
The CLAV R package facilitates cluster validation by generating multiple random samples via simple splits or bootstrapping. It supports internal, relative, and external validation strategies and visualizes cluster profiles and mean distributions. A Shiny app complements the package, enabling interactive exploration of clustering outcomes across training and validation datasets.
This study applies H2O-based AutoML regression models to assess stock performance of Amazon, Google, and Meta in response to recent fiscal policies. Using time series data and predictive intervals, the analysis explores the effects of tariffs and subsidies on tech firms. The modeling pipeline includes feature engineering, calibration, and accuracy evaluation across multiple IDs.
Docker’s MCP Toolkit enables modular AI tool hosting via containerized MCP servers, integrated with VS Code’s Copilot Agent Mode. The toolkit supports secure, discoverable, and reusable development workflows, with standardized APIs for tasks like CI automation and GitHub data retrieval. It includes runtime isolation, resource limits, and image attestation for enhanced security.
The continue keyword in Python allows skipping the remainder of a loop iteration and proceeding to the next. It is typically used within conditional blocks to control flow in for and while loops. The tutorial outlines use cases, syntax, and common pitfalls associated with its application in iterative structures.
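A short example of both loop types follows. The functions and inputs are made up for illustration, and the while-loop pitfall shown (forgetting to advance the counter before `continue`) is a well-known one that the tutorial may or may not cover.

```python
def total_sequence_length(lines):
    """Sum the lengths of sequence lines, skipping blanks and '#' comments."""
    total = 0
    for line in lines:
        if not line or line.startswith("#"):
            continue  # skip the rest of this iteration, move to the next line
        total += len(line)
    return total


def count_gc_ignoring_n(seq):
    """Count G and C bases with a while loop, ignoring ambiguous 'N' bases."""
    i = 0
    gc = 0
    while i < len(seq):
        base = seq[i]
        i += 1  # advance BEFORE any `continue`, otherwise the loop never ends
        if base == "N":
            continue
        if base in "GC":
            gc += 1
    return gc


print(total_sequence_length(["ACGT", "", "#comment", "AC"]))
print(count_gc_ignoring_n("GNCNAT"))
```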
The article outlines strategies for testing R code that interacts with external systems such as LLMs, APIs, and databases. It emphasizes isolating core logic through interfaces and dependency injection, and simulating external dependencies with mocks or fakes. This approach keeps tests fast, reliable, and cost-effective without relying on live services.
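The pattern itself is language-agnostic. The sketch below shows it in Python rather than R, to match the other examples in this issue. The core function receives the external call as an argument, so a test can pass in a fake instead of a live service. All names here (`summarize_abstract`, `FakeLLM`) are hypothetical and not taken from the article.

```python
from typing import Callable


def summarize_abstract(text: str, complete: Callable[[str], str]) -> str:
    """Core logic: build a prompt, call the injected client, tidy the reply."""
    prompt = f"Summarize in one sentence: {text.strip()}"
    reply = complete(prompt)
    return reply.strip().rstrip(".") + "."


class FakeLLM:
    """Test double that records prompts and returns a canned reply."""

    def __init__(self, canned_reply: str):
        self.canned_reply = canned_reply
        self.prompts = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.canned_reply


# In a test, no network call is made: the fake stands in for the real client.
fake = FakeLLM("  A short summary  ")
result = summarize_abstract("  Some long abstract text  ", fake)
print(result)
print(fake.prompts)
```

In production code, the same function would be called with a thin wrapper around the real client, which is the only part that needs a live-service integration test.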
This tutorial demonstrates how to create animated maps in R using ggplot2, gganimate, and spatial data from the gapminder and sf packages. It visualizes life expectancy trends across the Americas from 1952 to 2007, integrating shapefiles for geographic boundaries and applying animation techniques to highlight temporal changes.
The post explores R vectors through the lens of functional programming, focusing on the flatmap operation. Using Nobel Prize data as an example, it illustrates how vectors can be manipulated to extract and transform nested information, highlighting their role in data structuring and transformation within R.
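For readers unfamiliar with flatmap, the operation maps each element to a sequence and then concatenates the results into one flat sequence. The sketch below is in Python rather than R, to match the other examples in this issue. The `flat_map` helper and the record layout are assumptions, and the two prize records are a tiny illustrative subset rather than the post's dataset.

```python
from itertools import chain


def flat_map(fn, items):
    """Apply `fn` to each item (it must return an iterable) and flatten one level."""
    return list(chain.from_iterable(fn(item) for item in items))


# Small illustrative subset of nested prize records.
prizes = [
    {"year": 1903, "category": "physics",
     "laureates": ["Henri Becquerel", "Pierre Curie", "Marie Curie"]},
    {"year": 1911, "category": "chemistry",
     "laureates": ["Marie Curie"]},
]

# One (year, name) row per laureate, instead of one nested record per prize.
rows = flat_map(
    lambda prize: [(prize["year"], name) for name in prize["laureates"]],
    prizes,
)
for row in rows:
    print(row)
```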
Connect with Us
Stay connected and engage with us on social media for daily updates, discussions, and more!
📬 Subscribe
Don't miss an issue! Subscribe to the Bioinformer Weekly Roundup and receive the latest insights directly in your inbox.
We hope you enjoyed this week's edition of the Bioinformer Weekly Roundup. Feel free to share it with your colleagues and friends who share your passion for bioinformatics!
Disclaimer: The information provided in this newsletter is for educational and informational purposes only and does not constitute professional advice.
Contact: bioinformatics@zifornd.com
Copyright © 2025, Bioinformer Weekly Roundup. All rights reserved.