
Pranava Upparlapalli

About Me

I help teams turn complex biological data into clear, reproducible, and actionable insights. I design and run bioinformatics and genomics analyses that scale efficiently and deliver results that can be trusted across projects. By combining workflow automation with statistical and computational methods, I make it easier for research and industry teams to interpret data, validate findings, and make informed decisions. My MS in Bioinformatics and Computational Biology and my research in Dr. Xuan's lab strengthened my ability to build systems that handle large datasets with precision. I am a bioinformatics scientist who takes initiative, identifies gaps in analysis and process, and develops solutions that improve both accuracy and impact.

Building reliable analyses and workflows that turn data into results teams can trust.

Skills & Expertise

I specialize in turning raw sequencing data into results that research teams can trust. My skills span reproducible genomics analysis, machine learning for biological data, and production-ready pipelines that scale across HPC and cloud systems.

NGS

Next-Generation Sequencing (NGS)

Developed end-to-end RNA-seq and scRNA-seq pipelines that convert raw reads into expression matrices and QC reports used in cancer and disease studies.

NGS, RNA-Seq, scRNA-Seq, Library Prep, Variant Calling, FastQC, Trim Galore, STAR, HISAT2, FeatureCounts

Genomics

Genomic Data Analysis

Analyzed and interpreted genomic variation, integrating public databases and pathway tools to link variants to functional and regulatory impact.

GEO, dbSNP, NCBI, UCSC Browser, Ensembl, KEGG, IGV

Data Science

Bioinformatics & Statistical Analysis

Applied statistical modeling and data cleaning to large datasets, improving reproducibility and clarity in gene expression and variant studies.

Python, R, Linux/Bash, SQL, Pandas, NumPy, Matplotlib, ggplot2, Seurat, Scanpy, DESeq2

Machine Learning

Machine Learning in Biology

Built classification and prediction models using scikit-learn and PyTorch for transcriptomic and histopathology data, improving interpretability and accuracy.

Scikit-learn, ElasticNet, PyTorch, CNNs, ResNet, Deep Learning, Model Evaluation

GWAS-TWAS

GWAS and TWAS Pipelines

Integrated genotype, expression, and chromatin data using TWAS and eQTL mapping to identify trait-associated genes beyond traditional GWAS.

GWAS, TWAS, eQTL, Hi-C, PLINK, PrediXcan

Pipelines

Workflow Automation

Designed scalable workflows with Nextflow, Snakemake, Docker, and AWS that reduced runtime and ensured reproducibility across HPC and cloud systems.

Nextflow, Snakemake, Docker, Conda, Git, Slurm, HPC, AWS, MLflow

Work Experience

Graduate Researcher — Genomic Data Modeling

Dr. Xuan's Lab, UT Dallas (Jan 2024 — May 2025)

  • The lab needed reliable pipelines to connect genetic variation to expression and phenotype.
  • Built and validated machine learning models using GTEx and Hi-C data; engineered modular pipelines on HPC systems.
  • Improved the lab's ability to predict gene expression across tissues and interpret long-range regulatory effects, directly supporting translational genomics research.

Undergraduate Research Assistant — Antimicrobial Research

Sree Vidyanikethan Degree College (Aug 2020 — Mar 2021)

  • The project aimed to study Biancaea sappan for antioxidant and antibacterial potential, but initial assays gave inconsistent results and low yields.
  • Refined assay protocols, optimized growth media, and tested extraction conditions to improve reproducibility.
  • Achieved more reliable yields and confirmed antibacterial activity, giving the team stronger evidence to pursue natural compound studies further.

Projects & Pipelines

Computational strategies for deciphering biological heterogeneity, gene regulation, and clinical pathogenicity.

2024
scRNA-Seq, Scanpy, scVI

CRC-TME: Tumor Heterogeneity Analysis

Uncovering immune infiltration patterns in the Colorectal Tumor Microenvironment. Integrated 63k+ single-cell transcriptomes across cohorts using scVI for probabilistic batch correction and Scanpy for clustering.

2024
Nextflow, NF-Core, Docker

Yeast-Stress: Automated RNA-Seq Pipeline

Reproducible Nextflow pipeline for quantifying oxidative stress response in S. cerevisiae. Automates QC, alignment, and differential expression analysis in a containerized environment.

2024
R, DESeq2, TCGA

Pan-Cancer Expression Profiling

Standardized analysis workflow for five major cancer types (e.g., BRCA, KIRC). Automates normalization (DESeq2) and survival analysis to identify subtype-specific prognostic biomarkers.
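
To illustrate the normalization step, here is a minimal sketch of the median-of-ratios idea behind DESeq2's size factors. It is written in plain Python instead of R so it runs on its own. It is not the project's code: the function names `size_factors` and `normalize` and the toy count matrix are hypothetical. In practice the DESeq2 package handles this step.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    # Use only genes with nonzero counts in every sample (log(0) is undefined).
    usable = [gene for gene in counts if all(c > 0 for c in gene)]
    # Per-gene geometric mean across samples, computed in log space.
    log_geo_means = [sum(math.log(c) for c in gene) / n_samples for gene in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count in sample j to that gene's geometric mean.
        log_ratios = [math.log(gene[j]) - m for gene, m in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalize(counts):
    """Divide every count by its sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(gene, sf)] for gene in counts]

# Hypothetical toy matrix: 4 genes x 2 samples; sample 2 is sequenced twice as deep.
toy = [[10, 20], [100, 200], [5, 10], [0, 3]]
print(size_factors(toy))  # the second factor is twice the first
print(normalize(toy))
```

After normalization, the first three genes have equal values in both samples, which is the point of the method: depth differences are removed before expression is compared.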

2023
Computer Vision, PyTorch, ResNet

Gleason AI: Histology Classifier

Deep learning model for grading prostate cancer tissue. Trained a ResNet-50 CNN to identify Gleason patterns, using Grad-CAM to visualize the morphological features driving the diagnosis.

Experimental
NLP, Transformers

TinyVariant: Transformer Classifier

Experimental NLP model for classifying Variants of Uncertain Significance (VUS). Adapts attention mechanisms (BERT-like) to learn pathogenicity from genomic sequence context.

2024
Python CLI, Simulation

SeqMorph: Mutation Simulator

CLI tool for injecting synthetic mutations into sequencing data. Designed to stress-test the sensitivity of alignment algorithms against edge cases like indels and structural variants.
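
The general idea of synthetic mutation injection can be sketched in a few lines of Python. This is an illustration only, not SeqMorph's actual code or interface: the function `inject_mutations`, its parameters, and the event tuple layout are assumptions made for the example. It covers single-base substitutions, insertions, and deletions, and it leaves out structural variants.

```python
import random

BASES = "ACGT"

def inject_mutations(seq, n_mutations, seed=0):
    """Return (mutated_seq, events) for a DNA string.

    events are (position, kind, ref, alt) tuples in reference coordinates.
    n_mutations must not exceed len(seq).
    """
    rng = random.Random(seed)  # seeded so a given run can be reproduced
    cells = list(seq)          # one list cell per reference base
    events = []
    for pos in sorted(rng.sample(range(len(cells)), n_mutations)):
        ref = cells[pos]
        kind = rng.choice(["substitution", "insertion", "deletion"])
        if kind == "substitution":
            alt = rng.choice([b for b in BASES if b != ref])
        elif kind == "insertion":
            alt = ref + rng.choice(BASES)  # keep the base, add one after it
        else:
            alt = ""                       # drop the base
        # Cells may hold 0, 1, or 2 characters, so positions never shift.
        cells[pos] = alt
        events.append((pos, kind, ref, alt))
    return "".join(cells), events

mutated, events = inject_mutations("ACGTACGTACGTACGTACGT", 5, seed=42)
print(mutated)
for event in events:
    print(event)
```

Recording each event in reference coordinates gives a truth set, so an aligner's or variant caller's output can be scored against the mutations that were actually injected.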

2023
Algorithms, Optimization

CORE-seq: Sequence Compression

High-performance Python library for optimizing nucleotide storage. Reduces memory overhead during the pre-processing of large-scale genomic datasets for machine learning ingestion.
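
One common way to cut nucleotide memory use is 2-bit packing, which stores four bases per byte instead of one character per base. The sketch below shows that technique for illustration. It is an assumption about the kind of approach involved, not CORE-seq's actual implementation, and `pack` and `unpack` are hypothetical names. It handles only A, C, G, and T, so ambiguity codes such as N would need separate handling.

```python
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}
DECODE = "ACGT"

def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte.

    Returns (packed_bytes, length); the length is needed to ignore
    padding bits in the final byte when unpacking.
    """
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # Each base takes 2 bits; the first base of a byte sits in the lowest bits.
        out[i // 4] |= ENCODE[base] << (2 * (i % 4))
    return bytes(out), len(seq)

def unpack(data, length):
    """Inverse of pack: recover the original string."""
    return "".join(
        DECODE[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(length)
    )

packed, n = pack("ACGTACGTA")
print(len(packed), "bytes for", n, "bases")
print(unpack(packed, n))
```

A production library would likely vectorize this with NumPy to avoid the per-base Python loop.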

2021
Wet Lab, Assay Dev

Antimicrobial Assay Optimization

Refined extraction protocols for Biancaea sappan to improve yield of bioactive compounds. Validated antioxidant and antibacterial efficacy against pathogens via zone-of-inhibition assays.

2025
In-Silico, Docking

Nociception Study

Investigating gut microbial metabolites in nociception pathways. Performed molecular docking simulations to map the binding affinity of secondary metabolites to host pain receptors.

Education & Certifications

MS: Bioinformatics and Computational Biology

University of Texas at Dallas (UTD) — May 2025

Relevant Coursework:
  • Applied Bioinformatics
  • Statistics in Bioinformatics
  • Molecular Biology
  • Algorithms & Data Structures
  • Medical Image Analysis

Advanced Diploma: Bioinformatics

Bharati Vidyapeeth University (BVDU) — 2022

Relevant Coursework:
  • Biological Informatics
  • Biostatistics
  • Data Mining & ML
  • Molecular Modeling
  • R & Data Analytics

BSc: Microbiology, Biochemistry, Chemistry

Sri Venkateswara University (SVU) — 2020

Relevant Coursework:
  • Microbial Physiology
  • Medical Microbiology
  • Immunology
  • Biomolecules
  • Biotechnology

Certifications

AWS Educate: Cloud Computing 101

Completed foundational training on cloud computing infrastructure, services, deployment models, and best practices.

Hello Nextflow Certificate

Passed the Hello Nextflow test at the conclusion of the Nextflow training week (September 2025).

Contact Me

Open to opportunities in bioinformatics, computational biology, and data science. I also welcome collaborations on pipeline optimization and machine learning for genomics.