Benjamin Harvey, DSc

  • Senior Research Associate


DSc, Bowie State University, 2015
MS, Bowie State University, 2011
BS, Mississippi Valley State University, 2008


I graduated from Mississippi Valley State University (MVSU) in 2008 with a B.S. in Pre-Medicine & Computer Science. I received a Master of Science in Computer Science from Bowie State University (BSU) in 2011 and a Doctor of Science in Computer Science from BSU in 2015. I’m currently a Sr. Research Associate at Johns Hopkins University, supporting Nilanjan as a data scientist. I previously served as a lead Data Scientist and Solutions Architect with Databricks, where I supported data science solutions engineering for federal customers and developed genomics pipelines for the company’s Public Sector and Health and Life Sciences (HLS) services organization. I was also a Research Professor at George Washington University (GWU), where I taught Data Science and Big Data Analytics courses in the joint Data Analytics graduate program of the Department of Engineering Management and Systems Engineering and the Department of Computer Science.

I joined the National Security Agency (NSA) in 2009 and worked there for nearly a decade; my final position was Chief of Operations Data Science. I was hired into the Cryptologic Computer Science Development Program (CDP), graduated from it in 2012, and was the first African American to be accepted into and to finish the program. In 2008, I was a research fellow at the Harvard-Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology (HST) in the Bioinformatics and Integrative Genomics (BIG) program. I was also a Post-Graduate Research Fellow with i2b2, National Center for Biomedical Computing, Brigham and Women’s Hospital, and Children’s Hospital Boston Informatics Program (CHIP), with an academic appointment at Harvard Medical School.

Honors and Awards

2019 NSF National I-Corps, Principal Investigator, AI Augmentation and Integration

2017 Intelligence Community and Counter Intelligence Security Professional Award

2016 Office of the Director of National Intelligence (ODNI) Award for Human Capital

2015 Bowie State University Dissertation of the Year Award

2015 Bowie State University Computer Science Chair’s Award Dissertation of the Year

2011 NSA/CSS Cryptologic Computer Science Development Program

2010 National Institutes of Health Research Fellowship, Clinical Center

2009 Brigham and Women’s Hospital, Harvard Medical School (i2b2)

2009 National Institutes of Health Clinical Center Research Fellow

2009 Harvard Medical School-Brigham and Women’s Hospital Post-Baccalaureate

2008 Harvard-MIT Health Sciences and Technology (HST) Internship Program

2008 National Association of Mathematicians Summer Research Grant, ECSU

2007 Ronald McNair Post-Baccalaureate Program Grant, University of Tennessee

Dr. Benjamin Harvey is currently a Sr. Research Associate at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. He has a B.S. in Computer Science from Mississippi Valley State University and an M.S. and a D.Sc. in Computer Science from Bowie State University. As a data scientist, Dr. Harvey specializes in helping researchers and universities prepare, process, and analyze genomic data by implementing scalable systems, including Apache Spark, and tools such as Hail, GATK4, ADAM, SparkSeq, and VariantSpark. He has used these tools to provide high-level APIs that simplify implementing algorithms for analyzing large genomic datasets, including GATK pipelines hosted at scale in the cloud on AWS and Azure. His work has enabled researchers and universities to process data 15x faster with workflows optimized to run in parallel, and to launch and scale pipelines with a few clicks. Dr. Harvey has also developed capabilities that let researchers interactively explore and classify data with prepackaged genomic analytics (e.g., joint variant calling, GWAS, and eQTL analysis) and machine learning. He has developed capabilities to analyze hundreds of thousands of genomes while minimizing costs with autoscaling on AWS and Azure, including seamlessly connecting processed genomic data with downstream analytics for faster results. He has also developed solutions that enable researchers, computational biologists, and bioinformaticians to iterate in real time in collaborative Databricks workspaces. These solutions support exploring data efficiently in familiar languages (SQL, R, Python, Java, and Scala) and standardizing genomic workflows across teams to improve reproducibility.

  • B. Harvey and S. Y. Ji, "Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis," IEEE Journal of Biomedical and Health Informatics, vol. PP, no. 99, pp. 1-1, November 2015.
  • B. Harvey and S. Ji, "Cloud-Scale Genomic Signal Processing Classification Analysis for Gene Expression Microarray Data," Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE, pp. 7152-7155, 26-30 August 2014.
  • K. Mivule, B. Harvey, C. Cobb, and H. El Sayed, "A Review of CUDA, MapReduce, and Pthreads Parallel Computing Models," IJISET - International Journal of Innovative Science, Engineering & Technology, vol. 1, issue 8, pp. 208-217, October 2014.