-
AI Co-Scientist for Protein Function Prediction and Hypothesis Generation
Extending the Co-Scientist framework to build reasoning agents for protein function prediction using CEPI and BV-BRC datasets for priority pathogens
-
AI-Powered Analysis Pathfinder: From Research Questions to Data and Workflows
Enabling bench scientists to start with analysis goals and receive both relevant datasets and executable workflows, giving a complete path from question to execution
-
Assigning Functions to Uncharacterized Genes
Leveraging machine learning and AI to predict functions of characterized and uncharacterized genes using k-mers, ensemble methods, and LLM embeddings
-
Expanding Rhea for Automated Workflow Generation
Extending the Rhea platform with MCP+RAG to enable automated workflow generation leveraging Galaxy and BV-BRC for infectious disease analysis
-
Extracting Influenza Passaging Cell Type from GenBank Records
Fine-tuning question answering LLMs to extract passaging cell type metadata from unstructured influenza GenBank records
-
Extracting Training Data for Automated Workflow Generation
Creating structured question-answer training datasets from BRC resources and publications to train LLMs for bioinformatics workflow generation
-
Generative AI-Driven Workflow Design and Execution via MCP
Automating bioinformatics workflow design, refinement, and execution using generative AI models guided through the Model Context Protocol (MCP)
-
HiPerRAG for Literature-based Data Extraction on Priority Pathogens
Leveraging high-performance retrieval-augmented generation to extract and curate structured biological data for CEPI priority pathogens from scientific literature
-
InterWeb Outbreak Surveillance and Progress Monitoring
AI-powered outbreak surveillance aggregating web data sources including social media, news, case reports, and public health databases
-
PubMed Miner: AI-Powered Sequence Feature Extraction from Literature
Automated extraction of viral sequence features, mutations, and epitopes from PubMed literature using LLMs to accelerate outbreak response and genomic analysis
-
RDF Knowledge Graph Construction
Transforming Pathogen Data Network resources to RDF using large language models
-
StorySeq: Automated Sequence Narrative Generation
Automating sequence identification and contextualization using BLAST, database queries, and LLM narrative synthesis to accelerate pathogen and AMR gene discovery
-
Viral Structural Phylogenetics
Optimizing a protein language model to predict viral protein structures at scale
-
Your Project Here
Use this template to propose your own codeathon project