Advanced Data Science I

1st term
3 credits
Academic Year:
2021 - 2022
Instruction Method:
Asynchronous Online with Some Synchronous Online
Auditors Allowed:
Grading Restriction:
Letter Grade or Pass/Fail
Course Instructor:
Roger Peng

The course is designed for students in the Johns Hopkins Biostatistics master's and PhD programs and assumes significant background in statistics. Specifically, it assumes that you know the basics of statistics through generalized linear models, know how to fit and interpret models, know the basics of R and Python, and can use version control with GitHub.


Teaches how to organize the components of a data analysis – statistics, data manipulation, and visualization. Teaches how to produce a complete data analysis to answer a targeted scientific question. Focuses on synthesis, communication, ethics, and interpretation of data analytic products.

Learning Objectives:

Upon successfully completing this course, students will be able to:

  1. Critique a data analysis and distinguish good analyses from bad ones
  2. Produce a complete data analysis
  3. Produce the components of a data analytic paper
  4. Produce the components of a statistical methods paper
  5. Produce the components of a data analytic presentation for technical and non-technical audiences
  6. Identify key issues in data analytic relationships

Methods of Assessment:

This course is evaluated as follows:

  • 100% Homework

Enrollment Restriction:

Enrollment is restricted to second-year Biostatistics PhD and master's students.

Instructor Consent:

Consent required for all students

For consent, contact:

Special Comments:

Please note: This is the virtual section of a course that is also offered onsite. Students will need to commit to the modality for which they register.