From Wet Lab to Code: My Journey into Bioinformatics

How I transitioned from pipetting and PCR to Python and phylogenetics, and what I learned analyzing SARS-CoV-2 genomes

The Pipette and the Terminal

I still remember the first time I successfully amplified the ATP8B1 gene fragment. After hours of optimizing PCR conditions, adjusting annealing temperatures, and troubleshooting failed reactions, those beautiful DNA bands finally appeared under UV light. That moment of seeing my target amplicon at exactly 486 bp on the agarose gel felt like pure victory. I had coaxed genetic information out of blood samples using nothing but precise volumes, careful timing, and a thermal cycler.

That was my world as an undergraduate biosciences student at COMSATS University Islamabad: micropipettes, gel electrophoresis, PCR tubes, and UV transilluminators. My final year project investigated the role of ATP8B1 gene variants in gallstone disease in Pakistani patients. I spent months in the lab extracting DNA, designing tetra-primer ARMS-PCR primers, running gels, and genotyping samples. It was meticulous, hands-on work that taught me patience, precision, and the fundamental principles of molecular biology.

But something was missing.

The Realization

As I analyzed my genotyping results, manually recording each band pattern and calculating allele frequencies in Excel, a question kept nagging at me: What happens to all this genomic data after we generate it?

My thesis concluded that the ATP8B1 rs766737165 (T>A) variant showed no association with gallstone disease in our population: every sample carried the wild-type TT genotype, so there was no variation to test. But I found myself wondering: What about other variants in this gene? What about the entire exome? How do researchers analyze thousands of variants simultaneously? How did scientists sequence the first SARS-CoV-2 genome in just weeks during the pandemic?

The answer, I realized, wasn't in more pipettes. It was in code.

Every major breakthrough I read about, from genome-wide association studies (GWAS) identifying disease genes to the rapid characterization of emerging pathogens, relied heavily on computational analysis. The wet-lab work I loved was essential, but it was only half the story. The other half was happening in terminals and programming languages I didn't know.

That realization was both exciting and terrifying. I had spent four years becoming proficient at PCR, gel electrophoresis, and DNA extraction. Now I was contemplating starting over in a completely different domain. Could a biology student with minimal coding experience really learn bioinformatics?

Taking the Leap

In late 2024, while writing up my thesis results, I made a decision that would change my research trajectory: I enrolled in the Applied Bioinformatics Specialization offered by UC San Diego on Coursera. The course was called "Hacking COVID-19," and it promised to teach genome assembly, annotation, and phylogenetics using real SARS-CoV-2 pandemic data.

I'll be honest—the first week was intimidating. Terms like "de Bruijn graphs," "k-mers," and "N50 statistics" felt as foreign as a new language. I was used to thinking about DNA as something tangible that I could see on a gel. Now I had to think about it as data—millions of short reads that needed to be computationally assembled into a genome.
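For readers meeting these terms for the first time, here is a minimal sketch of two of them in Python. It is not how SPAdes or QUAST implement anything; the function names `kmers` and `n50` are my own, and the inputs are toy values. A k-mer is simply every overlapping substring of length k, and N50 is the contig length at which the longest contigs, added up from largest to smallest, first cover at least half of the assembly.

```python
def kmers(sequence, k):
    """Return every overlapping substring of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def n50(contig_lengths):
    """Return the length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


print(kmers("ATGGCGT", 3))
print(n50([100, 200, 300, 400]))
```

With the toy contig lengths above, the two longest contigs (400 + 300) already cover more than half of the 1,000 total bases, so the N50 is 300.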

But I had an advantage I didn't initially recognize: my wet-lab experience gave me biological intuition that pure programmers often lack. When the course talked about paired-end sequencing, I immediately understood it because I'd worked with PCR primers that flank a target region. When we discussed open reading frames, I knew exactly what start and stop codons were from my genetics courses. The biology wasn't the hard part—learning to think computationally was.
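To make the open reading frame idea concrete, here is a small sketch, assuming a simplified definition: an ORF runs from an ATG to the first in-frame stop codon on the forward strand only. Real annotation tools do much more (both strands, alternative starts, gene models), and the function name `find_orfs` and the toy sequence are my own.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=2):
    """Return (start, end, orf) tuples for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon (inclusive).
    min_codons counts the start and stop codons too. Coordinates are 0-based,
    end-exclusive.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(sequence):
            if sequence[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop.
                for j in range(i + 3, len(sequence) - 2, 3):
                    if sequence[j:j + 3] in STOP_CODONS:
                        if (j + 3 - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, sequence[i:j + 3]))
                        i = j  # resume scanning after this ORF's stop codon
                        break
            i += 3
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

In the toy sequence the only ORF is ATG-AAA-TAG, found in the third reading frame.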

The Transformation: From Bench to Terminal

Let me give you a concrete example of how my thinking changed.

Wet-Lab Approach (ATP8B1 Project):

Design primers using Primer3 software (manual, ~2 hours)
Optimize PCR through trial and error (1-2 days of bench work)
Run gel electrophoresis and manually photograph bands (~30 minutes per gel)
Score genotypes by visual inspection under UV light
Calculate frequencies in Excel spreadsheet
Analyze 104 samples over 3-4 months

Result: Determined that one specific SNP (rs766737165) is monomorphic in our population.
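The frequency calculation I did in Excel fits in a few lines of Python. This is a minimal sketch, assuming genotypes are recorded as two-letter strings such as "TT" or "TA"; the function names are my own, and the only data used is the result stated above (all 104 samples scored as TT).

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of diploid genotype strings like 'TT' or 'TA'."""
    allele_counts = Counter()
    for genotype in genotypes:
        allele_counts.update(genotype)  # each genotype contributes two alleles
    total = sum(allele_counts.values())
    return {allele: count / total for allele, count in allele_counts.items()}


def genotype_frequencies(genotypes):
    """Return {genotype: frequency} for the observed genotype strings."""
    counts = Counter(genotypes)
    return {genotype: count / len(genotypes) for genotype, count in counts.items()}


# The thesis result: all 104 samples scored as wild-type TT.
samples = ["TT"] * 104
print(genotype_frequencies(samples))
print(allele_frequencies(samples))
```

Both functions assume a non-empty sample list; a monomorphic locus like this one simply reports a frequency of 1.0 for the single genotype and the single allele.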

Computational Approach (SARS-CoV-2 Project):

Download raw sequencing data from NCBI SRA (millions of reads, minutes)
Assemble genome using SPAdes algorithm (automated, ~1 hour compute time)
Assess quality with QUAST (generates comprehensive statistics automatically)
Annotate genes using Prokka (identifies all 11 ORFs in minutes)
Build phylogenetic tree with 100 sequences using FastTree (minutes to hours)
Date evolutionary emergence using molecular clock methods

Result: Complete genomic characterization, evolutionary timeline, drug target identification, and phylogenetic context—all from publicly available data.
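I ran these steps through Galaxy, but the same workflow can be sketched as command lines. The Python below only builds and prints the commands (a dry run) so nothing has to be installed to read it. Treat it as an assumption-laden outline: the file names (`reads_1.fastq`, `aligned_genomes.fasta`, and so on) are hypothetical, the options shown are only the basic ones each tool documents, and they should be checked against the versions you install.

```python
import shlex
import subprocess


def build_steps(reads_1, reads_2, genomes_alignment, out_prefix="sample"):
    """Return a list of (argv, stdout_path) pairs; stdout_path is None unless output is redirected."""
    contigs = "spades_out/contigs.fasta"  # where SPAdes writes assembled contigs
    return [
        (["spades.py", "-1", reads_1, "-2", reads_2, "-o", "spades_out"], None),
        (["quast.py", contigs, "-o", "quast_out"], None),
        (["prokka", "--kingdom", "Viruses", "--outdir", "prokka_out",
          "--prefix", out_prefix, contigs], None),
        # FastTree writes the Newick tree to standard output.
        (["FastTree", "-nt", "-gtr", genomes_alignment], "tree.nwk"),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, stdout_path in steps:
        suffix = f" > {stdout_path}" if stdout_path else ""
        print(shlex.join(argv) + suffix)
        if dry_run:
            continue
        if stdout_path:
            with open(stdout_path, "w") as handle:
                subprocess.run(argv, stdout=handle, check=True)
        else:
            subprocess.run(argv, check=True)


steps = build_steps("reads_1.fastq", "reads_2.fastq", "aligned_genomes.fasta")
run_steps(steps)  # dry run: prints the commands without executing anything
```

Keeping the commands as data makes the pipeline easy to review before running it, which is the same habit as writing out a PCR protocol before touching a pipette.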

The contrast was staggering. What took me months of painstaking lab work for a single gene variant could be accomplished computationally in hours for an entire genome. But here's the crucial insight: neither approach invalidates the other. Wet-lab work generates the data; computational work makes sense of it at scale. The future belongs to scientists who can bridge both worlds.

Five Things I Wish I'd Known Before Starting

1. You Don't Need to Be a Programmer (Yet)

My biggest fear was that I'd need years of computer science background. I didn't. The specialization started with web-based platforms like Galaxy that require zero coding. I learned to use tools like SPAdes and BLAST through graphical interfaces before ever touching command-line tools.

Lesson: Start with GUI-based tools. Understanding what tools do is more important than how they're coded.

2. Biology Knowledge Is Your Superpower

In one assignment, I had to design diagnostic PCR primers for SARS-CoV-2. My wet-lab experience with primer design (Tm calculations, avoiding hairpins, checking specificity) made this trivial. Classmates without lab background struggled with concepts I took for granted.

Lesson: Your wet-lab skills aren't obsolete—they're contextual knowledge that makes you a better bioinformatician.
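The quick checks I used to do by hand for a primer can be sketched in Python. This is a rough illustration, assuming the simple Wallace rule for melting temperature (2 °C per A/T, 4 °C per G/C), which is only an approximation for short oligos; dedicated tools such as Primer3 use more accurate thermodynamic models. The 20-mer below is made up and is not a real SARS-CoV-2 primer.

```python
def gc_content(primer):
    """Return the fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Estimate melting temperature in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def reverse_complement(sequence):
    """Return the reverse complement, useful for designing the reverse primer."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(sequence.upper()))


primer = "ATGCGTACGTTAGCCTAGGA"  # made-up 20-mer, not a real SARS-CoV-2 primer
print(gc_content(primer), wallace_tm(primer), reverse_complement(primer))
```

Hairpin and specificity checks are not covered here; those are better left to Primer3 and a BLAST search.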

3. Failed PCRs Taught Me Troubleshooting

Remember those failed PCR reactions I mentioned? Every failed assembly, alignment error, or crashed program in bioinformatics requires the same systematic troubleshooting: check your inputs, verify your parameters, read the error messages, try different conditions. The mindset is identical.

Lesson: Lab skills transfer. Patience, systematic thinking, and troubleshooting are universal.

4. The Data Tells Stories Wet Labs Can't

When I built my first phylogenetic tree of SARS-CoV-2 sequences and dated the pandemic's emergence to October-November 2019 using molecular clock analysis, I experienced something profound: I was looking at evolutionary history. No amount of PCR work could have revealed that timeline. Computational biology doesn't replace wet-lab work—it answers different questions.

Lesson: Each approach has unique strengths. Choose your methods based on your biological questions.
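The intuition behind molecular clock dating can be shown with a root-to-tip regression, a simpler technique than the least-squares dating LSD2 performs. The sketch below assumes a strict clock: genetic distance from the root grows linearly with sampling date, so the date where the fitted line hits zero divergence estimates when the lineage emerged. The numbers are synthetic, generated from a rate of 0.001 and an origin of 2019.8 purely to exercise the code; they are not my results or real measurements.

```python
def root_to_tip_regression(dates, distances):
    """Fit distance = rate * (date - origin) by ordinary least squares.

    Returns (rate, origin), where origin is the date at which the fitted
    line crosses zero divergence.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    origin = -intercept / rate
    return rate, origin


# Synthetic toy values (decimal years, substitutions per site), not real data.
dates = [2020.0, 2020.2, 2020.4, 2020.6]
distances = [0.0002, 0.0004, 0.0006, 0.0008]
rate, origin = root_to_tip_regression(dates, distances)
print(f"rate ~ {rate:.4f} substitutions/site/year, origin ~ {origin:.2f}")
```

Real data never falls exactly on a line, which is why dedicated dating tools also report confidence intervals around the estimated date.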

5. The Learning Curve Is Steep but Conquerable

Here's what I learned in 5 months:
- Genome assembly (SPAdes, QUAST)
- Functional annotation (Prokka, BLAST)
- Sequence alignment (MUSCLE, MAFFT)
- Phylogenetics (FastTree, LSD2, molecular clock dating)
- Pathway analysis (BioCyc, metabolic networks)
- Basic Python and R for data handling

Could I have learned this alongside my wet-lab thesis? Absolutely. Would I recommend doing both simultaneously? Only if you enjoy sleep deprivation.

Lesson: The skills build on each other. Master one tool before adding the next.

The Moment It Clicked

There was a specific moment when everything came together. I was analyzing the spike protein of SARS-CoV-2 using multiple sequence alignment to compare it with related coronaviruses. The alignment showed that SARS-CoV-2 had 6 key amino acid substitutions in the receptor-binding domain compared to SARS-CoV-1. These weren't random—they were precisely the residues that contact the human ACE2 receptor.
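Once two sequences are aligned, listing the substitutions between them is a short loop. Here is a minimal sketch, assuming the two rows come from an existing alignment (for example, MUSCLE or MAFFT output) and use "-" for gaps. The function name `substitutions` is my own, and the peptide fragments are made up for illustration; they are not real spike sequences.

```python
def substitutions(aligned_ref, aligned_query):
    """List (ref_position, ref_residue, query_residue) for mismatched aligned columns.

    Positions are 1-based and count residues of the reference, skipping gaps ('-').
    Columns where either sequence has a gap are not reported as substitutions.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    differences = []
    ref_position = 0
    for ref_residue, query_residue in zip(aligned_ref, aligned_query):
        if ref_residue != "-":
            ref_position += 1
        if "-" in (ref_residue, query_residue):
            continue  # insertion or deletion, not a substitution
        if ref_residue != query_residue:
            differences.append((ref_position, ref_residue, query_residue))
    return differences


# Made-up peptide fragments for illustration, not real spike sequences.
print(substitutions("YLNT-GA", "FLQTSGA"))
```

Numbering by the reference sequence keeps positions comparable with published residue numbers even when the query contains insertions.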

Suddenly, I understood why this virus was so transmissible. I could see evolution optimizing the spike protein for human cell entry. And I realized something profound: I was doing the same kind of detective work I did in the wet lab, but at a completely different scale. Instead of genotyping 104 samples for one variant, I was comparing entire viral genomes across evolutionary time.

That's when the wet-lab researcher and the computational biologist merged into one person.

Comparing Workflows: Then and Now

Let me show you how my thinking has evolved by contrasting my two projects:

ATP8B1 Gallstone Study (Wet-Lab)

Question: Is this specific SNP associated with disease?
Sample Size: 104 patients (75 cases, 29 controls)
Time Investment: 4-5 months of daily lab work
Consumables Cost: ~$500-1000 (DNA extraction kits, PCR reagents, gels)
Skills Required: DNA extraction, PCR, gel electrophoresis, primer design
Output: Genotypic frequencies, allele distribution
Conclusion: This variant is monomorphic (not informative)

Limitations I Encountered:
- Could only test one variant at a time
- Sample collection was time-consuming and required ethical approval
- Failed PCR reactions meant wasted samples and reagents
- Limited to variants I specifically designed primers for
- No way to discover new variants

SARS-CoV-2 Genomic Analysis (Computational)

Question: How did this virus emerge, evolve, and what are its drug targets?
Sample Size: 100 complete genomes (publicly available)
Time Investment: 5-course specialization over 3 months
Consumables Cost: $0 (all data and tools are free/open-source)
Skills Required: Genome assembly, annotation, phylogenetics, pathway analysis
Output: Complete genome, 11 genes annotated, evolutionary tree, drug targets
Conclusions: Multiple biological insights simultaneously

Advantages:
- Access to millions of sequences worldwide (GISAID, NCBI)
- Can analyze entire genomes, not just targeted regions
- Discover novel variants through assembly
- Trace evolutionary history across time and geography
- Identify functional impacts of mutations
- Reproducible: anyone can re-run my analysis

The Best of Both Worlds

Here's what I've learned: the most powerful researchers are bilingual. They speak both the language of the bench and the language of code.

My ATP8B1 project taught me:
- How to handle biological samples with precision
- The importance of experimental controls
- That negative results (monomorphic locus) are still results
- How genetic variants actually work in practice

My SARS-CoV-2 project taught me:
- How to leverage publicly available data at scale
- That computational predictions need experimental validation
- The power of integrating multiple analysis types
- How bioinformatics accelerates discovery during public health emergencies

The future isn't wet-lab OR computational; it's wet-lab AND computational. The most impactful research I can imagine would be:
1. Identify drug target computationally (like guanylate kinase in SARS-CoV-2)
2. Validate experimentally through enzyme assays
3. Design inhibitors using structural modeling
4. Test in cell culture or animal models

That workflow requires both skillsets.

Lessons for Biology Students Considering Bioinformatics

If you're where I was a year ago—comfortable in the lab but curious about computational biology—here's my honest advice:

Start With Your Strengths

Don't abandon your wet-lab skills. Use them as your foundation. When I learned genome assembly, I thought about it in terms of PCR products and Sanger sequencing I already understood. Make the new knowledge connect to the old.

Choose Projects With Real Data

Learning abstract algorithms is tedious. Learning by assembling the actual SARS-CoV-2 genome that caused a pandemic? That's motivating. Find courses or projects that use real biological datasets you care about.

Don't Wait Until You're "Ready"

I delayed starting bioinformatics because I thought I needed to learn Python first. Wrong. The best way to learn programming is to need it for a biological question. Learn tools as you need them, not in advance.

Embrace Imposter Syndrome

You'll feel like you don't belong—wet-lab people will say you're not a real biologist, computer science people will say you're not a real programmer. Ignore both. Bioinformatics is its own discipline, and your unique perspective is valuable.

Build a Portfolio

I created a GitHub repository for my SARS-CoV-2 project. It showcases skills better than any certificate. When applying to graduate programs or jobs, I can show exactly what I've done, not just claim it.

The Challenges I'm Still Facing

Let me be honest about what's still difficult:

1. Command-Line Intimidation
I'm comfortable with Galaxy and web tools, but raw command-line bioinformatics still feels daunting. Running tools through a terminal requires a different mental model than clicking buttons.

2. Statistical Depth
Understanding p-values from a t-test is different from understanding false discovery rates in GWAS or bootstrap values in phylogenetics. The statistics got deeper fast.

3. Programming Proficiency
I can write basic Python scripts to parse FASTA files, but I'm not building machine learning models. That's okay—you don't need to be a software engineer to do valuable bioinformatics.

4. Keeping Up With Tools
New tools emerge constantly. AlphaFold2, ChatGPT for protein design, long-read sequencing platforms—the field moves fast. Wet-lab protocols are more stable.

5. Missing the Tangible
Sometimes I miss seeing DNA bands appear on a gel. There's something satisfying about physical results that digital output doesn't quite replicate. But then I build a phylogenetic tree of a million sequences and that feeling fades.
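For a sense of scale, the kind of basic FASTA-parsing script mentioned under Programming Proficiency can be as short as the sketch below. It is a plain-Python illustration with a toy two-record input; for real work a maintained library such as Biopython is usually the safer choice. The function name `parse_fasta` is my own.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)  # emit the previous record
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)  # emit the final record


example = """>seq1 toy record
ATGC
GGTA
>seq2
TTAA
""".splitlines()

for name, sequence in parse_fasta(example):
    print(name, len(sequence))
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly instead of the toy list.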

What's Next for Me

This journey isn't over—it's just beginning. Here's where I'm headed:

Short-term (Next 6 months):
- Master command-line Linux/Unix
- Learn R for statistical genomics
- Complete projects combining wet-lab and computational approaches
- Write blogs teaching others what I've learned

Medium-term (1-2 years):
- Master's in Bioinformatics/Computational Biology (applying Fall 2026)
- Specialize in viral genomics or cancer genomics
- Publish research that integrates both methodologies
- Contribute to open-source bioinformatics tools

Long-term vision:
- Lead a research group that bridges experimental and computational biology
- Mentor students making similar transitions
- Apply AI/ML to biological problems I deeply understand
- Maybe even develop new tools that make bioinformatics more accessible to wet-lab biologists

The Drug Target Discovery That Changed My Perspective

Let me tell you about the moment that crystallized why this transition matters.

During the pathway analysis module of my SARS-CoV-2 project, I used BioCyc databases to explore viral metabolic dependencies. The question was elegant: what does the virus need from the host cell that we could block without harming the host?

Through computational network analysis, I identified guanylate kinase as a high-priority drug target. Here's why that's fascinating:

The Biology (what I understood from wet-lab):
- SARS-CoV-2 is an RNA virus (~30,000 nucleotides)
- It needs massive amounts of GTP to synthesize new viral genomes
- Each infected cell produces thousands of viral copies

The Computation (what I learned):
- Metabolic pathway analysis suggests that host cells can recycle guanine nucleotides released by RNA degradation
- Viral replication locks GTP into new genomes that leave the cell, so that pool is not recycled during infection
- Blocking guanylate kinase is predicted to starve the virus while sparing host cells
- This is the "holy grail" of selective therapeutics

The Integration (why both matter):
- Computational prediction: guanylate kinase is a target
- Experimental validation needed: enzyme assays, knockout studies, inhibitor screening
- Clinical application: drug development, trials, patient treatment

This finding (published in Bioinformatics by Renz et al., 2020) was made computationally and still requires wet-lab validation. Neither approach alone gets you to a therapy. Both together might save lives.

That's when I understood: the future of biology isn't choosing between lab bench and laptop. It's knowing when to use each.

My Advice to Supervisors and Educators

If you're mentoring biology students, please consider this:

1. Integrate bioinformatics early
Don't wait until graduate school. Even basic sequence alignment and database searching should be taught alongside PCR and gel electrophoresis.

2. Use real data
Abstract exercises bore students. Let them BLAST their PCR products, visualize their sequences, analyze real experimental results computationally.

3. Normalize the transition
Don't treat computational skills as optional "extra" skills. They're core competencies, like pipetting.

4. Provide resources
Point students to free or low-cost courses (Coursera, Rosalind, DataCamp), make institutional licenses available (MATLAB, GraphPad), and connect them with mentors among computational researchers.

5. Reward interdisciplinary projects
The best undergraduate theses would combine wet-lab data generation with computational analysis. Why should students choose one or the other?

Final Thoughts: The Best Time to Start

If you're reading this and thinking "I should learn bioinformatics," you're right. But you're also wrong about one thing: you don't need to wait until you're "ready."

You're ready when you have a biological question that requires computational answers.

For me, that question was: How can we analyze genomic variation beyond single SNPs? Your question might be different. Maybe you want to:
- Analyze RNA-seq data from your experiment
- Build a phylogenetic tree of sequences you generated
- Predict protein structures from gene sequences
- Identify disease-associated variants in exome data
- Classify microbiome communities from 16S data

Whatever your question, the tools exist. The data exists. The courses exist (many free). What's missing is just your decision to start.

Resources That Helped Me

Since people always ask, here are the specific resources I found most valuable:

Courses:
- UC San Diego Applied Bioinformatics Specialization (Coursera), my main course
- Bioinformatics Algorithms by Compeau & Pevzner (book + Coursera)
- Rosalind.info (problem-solving practice)

Tools I Learned:
- SPAdes (genome assembly)
- QUAST (quality assessment)
- Prokka (genome annotation)
- BLAST (sequence similarity)
- MUSCLE/MAFFT (multiple sequence alignment)
- FastTree (phylogenetic inference)
- LSD2 (molecular clock dating)
- BioCyc (pathway analysis)
- Galaxy (workflow platform)

Communities:
- Biostars (Q&A forum)
- r/bioinformatics (Reddit)
- Twitter/X #bioinformatics (amazing community)

Mindset Shifts:
- Biology first, code second
- Tools are means, not ends
- Negative results teach as much as positive ones
- Open science makes everything faster
- Your unique perspective (wet-lab + computational) is valuable

One Year From Now

Where will I be in one year? Hopefully:
- Admitted to a top bioinformatics Master's program
- Publishing my SARS-CoV-2 analysis as a methods paper
- Fluent in Python and R
- Working on a project that generates wet-lab data I analyze computationally
- Mentoring other students making this transition
- Contributing to open-source bioinformatics tools

But more importantly, I'll be a scientist who can ask bigger questions because I'm not limited by methodology. When I encounter a biological mystery, I won't think "Is this a wet-lab question or a computational question?" I'll think "Which tools best answer this question?" and have both at my disposal.

You're Not Leaving the Lab—You're Expanding It

The title of this blog says "From Wet Lab to Code," but that's not quite right. It should be "From Wet Lab to Code AND Lab."

I haven't abandoned my pipettes. They're sitting in my drawer next to my laptop. Sometimes the answer to a biological question is a PCR reaction. Sometimes it's a Python script. Often, it's both.

My ATP8B1 thesis taught me that wet-lab work generates data. My SARS-CoV-2 project taught me that computational work extracts meaning from data at scale. Together, they taught me that modern biology requires both languages.

I'm not a wet-lab biologist who learned coding.
I'm not a computational biologist who did some lab work.
I'm a biologist who uses whatever tools answer the question.

And so can you.

Update: Where I Am Now (March 2026)

Since starting this journey:
- ✅ Completed full Applied Bioinformatics Specialization
- ✅ Built complete SARS-CoV-2 analysis pipeline (29,903 bp genome assembled)
- ✅ Identified 11 ORFs and designed diagnostic PCR primers
- ✅ Conducted phylogenetic analysis with molecular clock dating
- ✅ Identified drug target through metabolic pathway analysis
- ✅ Created GitHub portfolio showcasing all analyses
- ✅ Currently applying to Bioinformatics Master's programs
- 🔄 Learning advanced Python for genomics
- 🔄 Exploring machine learning applications in biology
- 🔄 Writing tutorials to help others make this transition

The journey from pipette to Python isn't a straight line—it's a bridge you build while walking across it. Every wet-lab skill you have is a foundation stone. Every computational skill you gain is a spanning cable. Together, they let you cross distances neither alone could reach.

Start building your bridge today.

Syed Muhammad Ali Shirazi is a recent BS Biosciences graduate from COMSATS University Islamabad, Pakistan, specializing in the integration of wet-lab molecular biology and computational genomics. His research spans gallstone disease genetics, viral genomics, and bioinformatics pipeline development. He's currently preparing applications for Master's programs in Bioinformatics and Computational Biology.

Connect: GitHub | LinkedIn | Email

Have questions about transitioning from wet-lab to bioinformatics? Drop a comment below or reach out—I'm always happy to help fellow biology students navigate this journey!