STAT548 PhD Qualifying Course Papers

Published: 6 August 2024
Last Updated: 6 August 2024

If you are interested in a qualifying paper with me, please email me to schedule a one-on-one meeting. The subject line of your email should start with "[Qualifying Paper]" to ensure I don't accidentally miss it. Come to the meeting prepared to discuss:

your background,
your long-term research interests (it's okay if these are not yet well-defined),
the specific paper you are interested in and the reasons for your interest,
your planned submission date for your report (typically 4-6 weeks after we meet), and
initial ideas for a mini-project based on the paper.

To ensure a productive meeting, please spend some time reviewing the paper before we meet.

What to Expect From Me as a Supervisor

The qualifying papers I've listed below are representative of my current research interests: Bayesian optimization and neural network uncertainty quantification. More specifically, I'm interested in heuristic and approximate notions of uncertainty from machine learning models, and how they can inform reliable and optimal downstream decisions within the contexts of experimental design and scientific discovery.

Timeline and Deliverables

The final report for your qualifying paper will comprise two parts: (1) an extended review demonstrating your comprehension and critical evaluation of the paper, and (2) a mini-research project. Part 1 (the extended review) should take 2-3 weeks, and Part 2 (the project) should take 3-4 weeks.

Part 1: The Extended Review

The extended review should be divided into two sections:

Technical/methodological summary (roughly 3 pages)
This section should exhibit your grasp of the paper's technical content. Expect to review some related literature, as you should be able to contextualize the paper within the broader subfield. Address the following prompts in your summary:
- Theory papers: Provide a concise overview of primary technical outcomes and major proof techniques. Explain assumptions and their rationale. Relate to existing results and potential relaxations. Highlight innovative proof techniques.
- Methodological/applied papers: Explain the proposed methodology and its theoretical properties. Discuss computational complexity. Identify crucial aspects and potential bottlenecks. Mention alternative methods that could also be applied to the given problem.
Mini-project proposal (roughly 1-2 pages)
This section should demonstrate your ability to think creatively about research. Brainstorm a generalization, extension, or novel application of the paper's content. Dreaming too big is better than dreaming too small: aim for a project with potential for publication or inclusion in your thesis. (I'll help you scope whatever you come up with into a 3-4 week mini-project.)
Your proposal should (1) describe the area of opportunity, (2) propose a method/approach, (3) identify expected technical challenges/bottlenecks, and (4) predict potential impact.

Part 2: The (Mini) Project

After completing a draft of your extended review, we'll meet one-on-one to define a 3-4 week project based on your proposal. You will turn in a 4+ page report along with associated code and data. The content of the project will depend on the style of the paper:

Theory papers: If you choose a paper that is purely theoretical in nature, I will expect a predominantly mathematical project (extending the paper's theorems to a novel setting, applying the paper's proof techniques to a different problem, etc.).
Methodological/applied papers: Expect a mix of theory and coding, as well as getting your hands dirty with some real-world data. (If you want to use a language other than Python, you will need a really convincing argument!)

Workflow expectations: My research approach tends to be highly iterative, and I anticipate the same for our mini-projects. Initial project ideas will likely require modification to be fruitful. Be prepared to pivot or adjust your project, perhaps more than once.

In my opinion, a good researcher knows when to "fail fast." Most research ideas don't work, so figure out the fastest way to evaluate whether your ideas are likely to be dead ends. Design a minimal experiment/derivation for quick evaluation. If results seem promising in a week or two, continue pursuing the idea. Otherwise, adapt or pivot.

I expect you to check in with me at least once (ideally more) over the course of your project. Share (1) early results indicating your approach's viability and (2) your plan to pivot or adapt based on those results. Slack communication is preferred, but I'm always happy to meet in person if you want to bounce ideas off of each other.

Formatting: Submit the report as a GitHub repository, using the template at https://github.com/ben-br/qp-template/. The template includes a LaTeX style file that should be used for the report. (Detailed instructions for usage can be found in the repository’s README file.) Ensure that the experimental results are reproducible. Write reusable, documented, and well-commented Python code, and publish the code in a GitHub repo that I have access to. I should be able to easily install and run your experiments.
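As a rough illustration of one way to make an experiment reproducible, here is a minimal sketch using only the Python standard library. The function name run_experiment and the toy estimation task are hypothetical placeholders and are not part of the template; the point is only that every source of randomness is driven by an explicit seed that is recorded alongside the results.

```python
import json
import random
import statistics


def run_experiment(seed: int, n_samples: int = 1000) -> dict:
    """Toy experiment (hypothetical): estimate the mean of a standard normal.

    All randomness comes from a local generator seeded by `seed`, so calling
    this function twice with the same arguments gives the same result.
    """
    rng = random.Random(seed)  # local RNG; does not touch global random state
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return {
        "seed": seed,  # record the seed next to the result it produced
        "n_samples": n_samples,
        "mean_estimate": statistics.fmean(draws),
    }


if __name__ == "__main__":
    # Run a few seeds and print the results as JSON so they can be saved and compared.
    results = [run_experiment(seed) for seed in range(3)]
    print(json.dumps(results, indent=2))
```

Real projects will likely also need to seed any numerical libraries they use and pin package versions, but the same idea applies: a reader should be able to rerun a single command and recover the numbers in the report.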

What I am Looking For

Official Assessment

As outlined in the assessment form, your evaluation will be based on: (1) your overall comprehension, (2) your ability to "go beyond," and (3) your work habits/reporting/communication skills. The extended review should showcase your understanding, while the mini-research project should demonstrate your creative thinking and ability to "go beyond."

Unofficial Assessment

The qualifying paper also gives me the chance to gauge your research potential and our compatibility in a mentor-mentee dynamic. I will not judge you based on how good your project results are. Rather, I will evaluate you on the following:

shared research interests,
strong technical proficiency (or an ability to quickly acquire new skills),
capacity to "fail fast" (as described in the project details),
independence and proactivity,
effective communication of when assistance is needed,
receptiveness to feedback, and
awareness of the societal and ethical implications of machine learning research.

Qualifying Papers (updated 6 August 2024)

If a paper is crossed out, then it is no longer available.

Discovering Many Diverse Solutions with Bayesian Optimization
N. Maus, K. Wu, D. Eriksson, J.R. Gardner
AISTATS, 2023
Topics: Bayesian optimization, molecule generation
Style: Applied, methodological

Bayesian Optimization of Function Networks with Partial Evaluations
P. Buathong, J. Wan, R. Astudillo, S. Daulton, M. Balandat, P.I. Frazier
ICML, 2024
Topics: Bayesian optimization, molecule generation
Style: Methodological, applied

The Behavior and Convergence of Local Bayesian Optimization
K. Wu, K. Kim, R. Garnett, J.R. Gardner
NeurIPS, 2023
Topics: Bayesian optimization
Style: Theoretical

Active Statistical Inference
T. Zrnic, E.J. Candès
ICML, 2024
Topics: Uncertainty quantification, neural networks, conformal prediction
Style: Methodological, theoretical

Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
F. Falck, Z. Wang, C. Holmes
ICML, 2024
Topics: Uncertainty quantification, LLMs, Bayesian inference
Style: Theoretical

Useful Resources/Advice

Many of these links have been shared by other faculty members as well:

Nancy Heckman's page on technical writing,
Harry Joe's page on mathematical writing and typesetting in LaTeX,
Trevor Campbell's talk on "How to Explain Things,"
Knuth, Larrabee, and Roberts on mathematical writing,
"Getting Started with Git": Chapters 1 and 2, and
my talk on "How to Be a Git Wizard" (if git still scares you after reading the above resource).

Many parts of this document were derived/adapted/copied from Ben, Trevor, Marie, and Daniel. Thanks all!