Biostatistics Resource List for CFAR Investigators

- There are no free programs
- According to school rules, unofficial auditing is NOT allowed
- Auditing is possible only when it is built into the course's grading options
- Cost per credit for the 2017-2018 academic year was $1,303.00. Harvard Affiliates do not receive a discount on tuition. (More about Non-degree programs for Harvard Affiliates)
- There is a tuition assistance program (TAP). TAP recipients have to pay 10% of tuition. (More about TAP)
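
As a rough illustration of the figures above, the sketch below computes tuition for one course with and without TAP. The 5-credit course size and the `tuition` helper are assumptions made for illustration only. The per-credit rate and the 10% share are the 2017-2018 figures quoted above; they may not reflect current rates or any additional fees.

```python
from decimal import Decimal

# Figures quoted in the list above (2017-2018 academic year).
COST_PER_CREDIT = Decimal("1303.00")
TAP_SHARE = Decimal("0.10")  # TAP recipients pay 10% of tuition


def tuition(credits, with_tap=False):
    """Return tuition in dollars for a given number of credits.

    Hypothetical helper: ignores fees and any other eligibility rules.
    """
    full = COST_PER_CREDIT * Decimal(str(credits))
    amount = full * TAP_SHARE if with_tap else full
    return amount.quantize(Decimal("0.01"))


# Assumed example: a 5-credit course.
print(tuition(5))                 # 6515.00
print(tuition(5, with_tap=True))  # 651.50
```

`Decimal` is used instead of floating point so that dollar amounts round exactly to cents.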

- BST 201: Introduction to Statistical Methods
- BST 210: Applied Regression Analysis
- BST 212: Survey Research Methods in Community Health
- BST 213: Applied Regression for Clinical Research
- BST 222: Basics of Statistical Inference
- BST 223: Applied Survival Analysis
- BST 226: Applied Longitudinal Analysis
- BST 227: Introduction to Statistical Genetics
- BST 228: Applied Bayesian Analysis
- BST 232: Methods I
- BST 263: Applied Machine Learning
- BST 267: Introduction to Social and Biological Networks

- This is a Biostatistics Continuing Education program by Harvard Catalyst.
- The target audience is Harvard Catalyst statisticians and other quantitative researchers.
- It is also open and free to CFAR investigators.
- Although not all seminars are necessarily related to HIV/AIDS research, some talks may be relevant and useful for CFAR investigators.
- Past seminars can be seen here.

- This is a monthly seminar series by the Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute.
- The aim of this seminar series is to provide participants with useful examples and a framework on which to build their own projects.
- It is also open to CFAR investigators.
- The seminar is held in the CLSB Building, 11th floor, room 11081.
- Although not all topics are necessarily related to HIV/AIDS research, some talks may be relevant and useful for CFAR investigators.
- Past and upcoming talks can be seen here.
- Some previous seminars that may be relevant to CFAR investigators are listed below.
  - How many mice? What on earth do I do with their data? (by Donna Neuberg)
  - Decision Curve Analysis (by Giovanni Parmigiani)
  - Next Generation Sequencing Pipelines (by Naim Rashid)
  - Propensity Score (by Cory Zigler)

- This is an online book that is free to access.
- It is at an introductory level; no prior statistical or mathematical knowledge is required.
- Each section has an illustrative example and several exercises with answers.
- No software package is needed to work through the examples and exercises.
- Topics covered (chapter list):
  - Data display and summary
  - Mean and standard deviation
  - Populations and samples
  - Statements of probability and confidence intervals
  - Differences between means: type I and type II errors and power
  - Differences between percentages and paired alternatives
  - The t tests
  - The chi-squared tests
  - Exact probability test
  - Rank score tests
  - Correlation and regression
  - Survival analysis
  - Study design and choosing a statistical test

- This is an online textbook that is free to access.
- It is at an introductory level; no prior statistical or mathematical knowledge is required.
- The text is written in R Markdown, so the R code and its results are displayed together, which helps readers learn how to carry out the analyses in R.
- No prior knowledge of R is needed to read the contents.
- The examples are not necessarily related to clinical or laboratory research.
- Topics covered (chapter list):
  - 2 group comparisons and resampling
  - 2 group comparisons using t-tests
  - a closer look at assumptions
  - non-parametric 2 group comparisons
  - more than 2 groups
  - linear combinations and multiple comparisons
  - linear regression
  - even more linear regression
  - multiple regression
  - inference for multiple regression
  - model checking and refinement
  - variable selection
  - two-way ANOVA

- StataCorp regularly posts short video tutorials that illustrate how to implement various statistical analysis methods in Stata.
- The tutorials are free to view and do not require a Stata license.
- Each video is a few minutes long and uses a data example, although the examples are not necessarily related to medical research. The videos focus on how to implement a particular statistical method in Stata; the theory and details of the method are not covered.
- This resource is useful for those who use Stata or plan to use it.

- This is free to access.
- The website contains pages that illustrate the application of statistical analysis techniques using several statistical packages (i.e., Stata, SAS, SPSS, Mplus, and R). Each page has a sample dataset and analysis, with an explanation of the output. The references on each page are also useful.
- The pages are designed to introduce only the essence of each technique. No prior math/stat knowledge is required, and the pages contain no equations.
- The examples are not necessarily related to medical research; they are often hypothetical and serve only as illustrations.
- Topics covered on these pages are:
  - Robust Regression
  - Logistic Regression
  - Exact Logistic Regression
  - Multinomial Logistic Regression
  - Ordinal Logistic Regression
  - Probit Regression
  - Poisson Regression
  - Negative Binomial Regression
  - Zero-inflated Poisson Regression
  - Zero-inflated Negative Binomial Regression
  - Zero-truncated Poisson
  - Zero-truncated Negative Binomial
  - Tobit Regression
  - Truncated Regression
  - Interval Regression
  - One-way MANOVA
  - Discriminant Function Analysis
  - Canonical Correlation Analysis
  - Multivariate Multiple Regression
  - Generalized Linear Mixed Models
  - Mixed Effects Logistic Regression
  - Latent Class Analysis
  - Power analysis: Single-sample t-test
  - Power analysis: Paired-sample t-test
  - Power analysis: Independent-sample t-test
  - Power analysis: Two Independent Proportions
  - Power analysis: One-way ANOVA
  - Power analysis: Multiple Regression
  - Power analysis: Accuracy in Parameter Estimation

- BMJ has a specific section for articles that discuss research methods. This section includes research reporting guidelines as well.
- The target audience of those articles is the readers of BMJ (mostly doctors). No prior math/stat knowledge is required to fully understand the materials.
- The articles are accessible via the Harvard library (or a subscription).
- Various topics are covered. The full list can be seen here.

- The source is accessible via the Harvard library. Most of the articles in this series are open access, so they can be read without going through the Harvard library.
- This series of articles can be identified by searching "Best (but oft-forgotten)" on the journal's website.
- The articles are written so that non-statisticians can fully understand them. The details about this series can be seen here.
- As of July 2018, the series includes the following papers:
  - mediation analysis
  - intention-to-treat, treatment adherence, and missing participant outcome data in the nutrition literature
  - propensity score methods in clinical nutrition research
  - the design, analysis, and interpretation of Mendelian randomization studies
  - expressing and interpreting associations and effect sizes in clinical outcome assessments
  - sensitivity analyses in randomized controlled trials
  - testing for treatment effects in randomized trials by separate analyses of changes from baseline in each group is a misleading approach
  - checking assumptions concerning regression residuals
  - the multiple problems of multiplicity -- whether and how to correct for many statistical tests
  - designing, analyzing, and reporting cluster randomized controlled trials

- The source is accessible via the Harvard library.
- This series of articles can be identified by searching "Guide to Statistics and Methods" in the search box on the JAMA Network website.
- The articles are written so that non-statisticians can fully understand them. The details about this series can be seen here.
- Some of the selected articles relevant to the CFAR investigators are as follows:
  - Odds Ratio -- Current Best Practice and Use
  - Logistic Regression Diagnostics: Understanding How Well a Model Predicts Outcomes
  - Logistic Regression: Relating Patient Characteristics to Outcomes
  - Time-to-Event Analysis
  - Analyzing Repeated Measurements Using Mixed Models
  - Missing Data: How to Best Account for What Is Not Known
  - Multiple Imputation: A Flexible Tool for Handling Missing Data
  - The Propensity Score
  - Evaluating Discrimination of Risk Prediction Models: The C Statistic
  - Multiple Comparison Procedures
  - Sample Size Calculation for a Hypothesis Test
  - Thoughtful Methods to Increase Evidence Levels and Analyze Nonparametric Data