Staying on Top of the Literature¶

"Four to six weeks in the lab can save you an hour in the library."
— G.C. Quarderer

The Challenge¶

The number of publications grows exponentially, making it impossible to read everything. However, staying current with relevant research remains crucial for effective scientific work.

Discovery Methods¶

Google Scholar alerts on key topics, authors, and "related papers"
Journal browsing of major venues (browse the tables of contents for Nature, Science, PNAS)
Social media (especially Twitter/X for ML) as a quality filter
Journal clubs and literature seminars
Paper recommendation systems: I use Scholar Inbox, which sends me a daily email with papers similar to the ones I read recently

Recommended Workflow¶

1. Collection System¶

Use Zotero as your central repository
Create an "Inbox" folder for quick captures
Install browser extensions for seamless paper collection
Consider tools like PaperMemory for finding forgotten papers

2. Processing Routine¶

Read papers directly in Zotero (desktop or mobile)
Take highlights in the Zotero app
I use my note-taking app (Logseq) in a second window to take notes

3. Team Sharing¶

Always share interesting papers with the team using the Zulip channel
Include a brief summary explaining relevance or key insights
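
As a rough illustration of the sharing habit described here, the sketch below formats a paper link together with its summary and refuses to build a message when the summary is missing. The channel name `literature`, the topic name, the `PaperShare` class, and both helper functions are hypothetical and not part of our setup. The optional sending step assumes the official `zulip` Python package and a `~/.zuliprc` credentials file.

```python
import os
from dataclasses import dataclass


@dataclass
class PaperShare:
    """A paper plus the short note explaining why it is worth reading."""
    title: str
    url: str
    summary: str  # one or two sentences on relevance or key insights


def build_zulip_message(paper: PaperShare,
                        stream: str = "literature",  # hypothetical channel name
                        topic: str = "interesting papers") -> dict:
    """Build a Zulip message request; refuse to share without a summary."""
    if not paper.summary.strip():
        raise ValueError("Add a brief summary before sharing a paper.")
    # Zulip renders Markdown, so the title becomes a bold link.
    content = f"**[{paper.title}]({paper.url})**\n\n{paper.summary.strip()}"
    return {"type": "stream", "to": stream, "topic": topic, "content": content}


def send_to_zulip(message: dict, config_file: str = "~/.zuliprc") -> dict:
    """Send the message with the `zulip` package (assumed to be installed)."""
    import zulip  # imported lazily so the formatting code works without it
    client = zulip.Client(config_file=os.path.expanduser(config_file))
    return client.send_message(message)
```

A typical call would be `build_zulip_message(PaperShare("Example title", "https://example.org/paper", "Relevant because ..."))`, with the result passed to `send_to_zulip`. Keeping formatting separate from sending makes it easy to check the message before it reaches the team.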

Core Responsibilities¶

Your responsibility: Stay current with literature in your research area. While others may provide initial pointers, maintaining awareness is ultimately your job.

Best practice: Read broadly, including papers that simply spark your curiosity—unexpected connections often drive innovation.

Essential Reading List¶

From our group¶

Graph Neural Networks¶

JT-VAE - Junction tree variational autoencoders for molecules
Jumping Knowledge Networks - Advanced GNN architectures

ML Theory & Ethics¶

Understanding Deep Learning Requires Rethinking Generalization - Fundamental insights on generalization
The Lottery Ticket Hypothesis - Sparse neural network training
On the Dangers of Stochastic Parrots - Critical perspective on large language models
AI Ethics Course - Comprehensive ethics framework

Software Development¶

The Pragmatic Programmer - Essential software development principles

Basic Research Tools¶

Essential Skills:

The Turing Way - Comprehensive guide to reproducible research
The Missing Semester - Practical computing skills every researcher needs
What Every CS Major Should Know - Foundational computer science concepts

Books¶

General Reading¶

Greatness Cannot Be Planned - On open-endedness and serendipity in research
Chop Wood Carry Water - Philosophy and mindset for sustained work
Systems Science - Understanding complex systems

Software Development¶

The Pragmatic Programmer - Essential development principles

Media & Ongoing Learning¶

Podcasts¶

Dwarkesh Podcast - In-depth interviews with researchers and thought leaders

YouTube Channels¶

Yannic Kilcher - ML paper reviews and "ML News" updates
