Analyzing Population-Based HIV Impact Assessment (PHIA) Data

Monday, June 1, 2020 - Tuesday, June 30, 2020  

Course Description

This course will explore how researchers focusing on control of the HIV epidemic can use Population-Based HIV Impact Assessment (PHIA) data to advance their research. The PHIA surveys have been conducted in 13 countries, and the data are now becoming available to the wider research community. The course will teach students how to access and analyze the complex PHIA datasets using the widely available software STATA. The first module will provide an introduction to the global HIV epidemic, the aims and protocol of the PHIA surveys, and then the PHIA datasets themselves. Each subsequent module will advance students' skills in using the PHIA datasets to interrogate increasingly complex research questions, moving from fairly standard cross-tabulations to complex multi-variable analyses. Students will be provided with a sample dataset from one of the African PHIA surveys. They will learn how to merge the datasets and how to use the appropriate weighting structure to address the series of research questions embedded in the sequence of modules. Each module will contain video presentations of the techniques, sample do-files, and a series of hands-on analytic exercises that will be the basis for completion of each module and the course. The course will also provide a resource library and articles demonstrating a diverse array of research using the techniques taught. At the end of the course, students will be able to produce logistic or Poisson regression models using the PHIA datasets and the appropriate weighting structures.

Course Objectives

The primary objective of this course is to provide individuals with the skills needed to conduct data analyses of the Population-Based HIV Impact Assessment (PHIA) data using STATA.


By the end of the course, students will:


  • Learn how the Population-Based HIV Impact Assessment (PHIA) surveys provide extensive data to support a broad array of HIV/AIDS research and programmatic questions
  • Learn how to access and analyze the PHIA datasets
  • Develop/refine analysis plans for complex research questions using the PHIA datasets
  • Become proficient in bi-variate and multi-variable analyses of PHIA data using STATA
  • Be able to conduct analyses using appropriate PHIA weights
  • Learn how to format analyses for manuscripts and research reports


Basic statistics or biostatistics is a prerequisite, as is experience using STATA. Familiarity with HIV/AIDS and HIV control programs will be helpful. Applicants based in LMICs, particularly in countries that have already conducted a PHIA survey, will be given preference.

Technical Requirements

To follow along in guided analyses and complete course exercises, individuals will need access to STATA and familiarity with the basic STATA command structure. STATA will need to be downloaded and installed. A STATA license is required and is not included in the course. STATA may be available through your university or department. The cost of STATA varies depending on your country of residence and the type of license that you purchase. For PHIA analyses, at least STATA/SE is recommended. STATA can be downloaded from the vendor's website. There are a number of introductory STATA tutorials available for free online. Individuals lacking experience in STATA programming must familiarize themselves with STATA prior to beginning this course.

Course Reading List

Scientific articles are listed within each of the modules. These recommended readings will provide background for the module's topics and/or provide a clear example of analyzing population-based HIV impact assessment data for epidemiologic research. Background materials on STATA and sample do-files will also be made available to students.


Sally Findley, PhD

Professor Emerita in Population and Family Health

Mailman School of Public Health, Columbia University

In my thirty years as a professor at Mailman School of Public Health, I have taught multiple methodological courses for MPH students, including Demographic Methods and Applied Public Health Research Methods for LMIC settings, and I have supervised theses and dissertations applying complex data analysis methods to different African household survey datasets. I have developed an effective teaching and training style that emphasizes participatory activities and adult learning methods. From 2015 to 2019, I was the Senior Technical Advisor for Capacity Building for the ICAP Population-Based HIV Impact Assessment (PHIA) Surveys in 13 countries. The three-workshop capacity-building sequence led to significant gains in data analysis competencies among the 200 participants. These workshops have now been adapted as distance learning modules so they can be offered to any HIV/AIDS or sexual and reproductive health researchers wanting to conduct investigations with these complex but exceptionally rich PHIA datasets.

Course Fee

Registration before April 1, 2020: $900.00
After April 1, 2020: $1,000.00


Online Course Format

This is a month-long digital course, equivalent to approximately 20 hours of classroom instruction. Lectures and course material will be presented online in roughly weekly segments. The flexible format will include video or audio recordings of lecture material, file sharing and topical discussion, self-assessment exercises, and access to the instructor for feedback during the course. The course uses the learning management software Canvas; participants will receive an e-mail inviting them to join on the first day of the course. Any additional information about technical requirements and access to the course will be shared in the weeks before the course begins.

