Zahera (Fathima) Khatoon
Austin, TX
MS Bioinformatics Candidate · Northeastern University
Summary
MS Bioinformatics candidate with hands-on experience in NGS analysis (FASTQ → VCF), ML for biological data, structural modeling, and reproducible pipelines on HPC (SLURM). Comfortable implementing core algorithms (Smith–Waterman, Needleman–Wunsch, UPGMA, ORF detection) from scratch in Python.
Education
Northeastern University — M.S. in Bioinformatics · Boston, MA · Remote
Sep 2025 – May 2027 (Expected)
Coursework: BINF 6200 (Bioinformatics Programming), BINF 6400 (Genomics & Computational Biology).
Osmania University — Postgraduate Diploma in Bioinformatics · Hyderabad, India
2021 – 2022
Sequence analysis, structural biology (homology modeling, docking), computational genomics.
Osmania University — B.Sc. in Microbiology, Biotechnology & Chemistry · Hyderabad, India
2005 – 2008
Experience
Medical Laboratory Technician — Sunflower Memory Care · Cedar Park, TX
Apr 2024 – May 2024
- Performed routine clinical lab procedures: specimen collection, processing, and analysis.
- Maintained HIPAA-compliant documentation; coordinated with staff for accurate test reporting.
Academic Projects
Cell Segmentation ML Pipeline — BioHack 2026 · Northeastern
Feb 2026
Deep-learning microscopy segmentation (Dice ≈ 0.87); reduced manual annotation time by ~60%.
Stack: Python, PyTorch, OpenCV
UPGMA Phylogenetics + Classifier — BINF 6400 · Northeastern
Mar 2026
Implemented UPGMA from scratch; trained a classifier on sequence features for taxonomic grouping.
Stack: Python, NumPy, Biopython, scikit-learn
NGS QC on Explorer HPC — BINF 6400 · Northeastern
Feb 2026
SLURM-scheduled QC workflow; retained >95% of reads as high quality after adapter and base-quality trimming.
Stack: Bash, SLURM, FastQC, MultiQC, Trimmomatic
Genome Assembly + ORF Detection — BINF 6400 · Northeastern
Feb 2026
Assembled short reads and wrote a custom ORF detector; validated against reference annotations.
Stack: Python, SPAdes, Biopython
CCDS Python Package — BINF 6200 · Northeastern
Sep – Dec 2025
Reusable CCDS analysis package with 96% test coverage and a 10/10 pylint score; CI via GitHub Actions.
Stack: Python, pytest, pylint, GitHub Actions
Pairwise Alignment + PSSM — BINF 6200 · Northeastern
Sep – Dec 2025
Implemented Smith–Waterman, Needleman–Wunsch, and PSSM scoring from scratch.
Stack: Python, NumPy
GLT6D1 Homology Modeling — Thesis · Osmania University
2024 – 2025
Homology model of GLT6D1 with >92% of residues in favored Ramachandran regions; top docking pose ≈ −8.2 kcal/mol.
Stack: MODELLER, PyMOL, AutoDock Vina
Technical Skills
Languages: Python · R · Bash · SQL · C · C++
Bioinformatics: Biopython · BLAST · Clustal Omega · MODELLER · PyMOL · AutoDock Vina · FastQC · MultiQC · Trimmomatic · BWA · SAMtools · GATK · SPAdes · IGV
NGS & Genomics: FASTQ → VCF · Read QC & trimming · Variant calling · Genome assembly · ORF detection · Phylogenetics (UPGMA) · PSSM · Pairwise / MSA
ML & Engineering: PyTorch · scikit-learn · NumPy · Pandas · OpenCV · Git / GitHub · GitHub Actions · pytest · pylint · SLURM / HPC · Linux
Platforms & Formats: Explorer HPC · Jupyter · VS Code · FASTA · FASTQ · SAM/BAM · VCF · PDB · GFF/GTF