Staying on Top of the Literature¶
"Four to six weeks in the lab can save you an hour in the library."
— G.C. Quarderer
The Challenge¶
The number of publications grows exponentially, making it impossible to read everything. However, staying current with relevant research remains crucial for effective scientific work.
Discovery Methods¶
- Google Scholar alerts on key topics, authors, and "related papers"
- Journal browsing of major venues (browse the tables of contents of Nature, Science, PNAS)
- Social media (especially Twitter/X for ML) as a quality filter
- Journal clubs and literature seminars
- Paper recommendation systems: I use Scholar Inbox, which sends me a daily email with papers similar to the ones I have read recently
Recommended Workflow¶
1. Collection System¶
- Use Zotero as your central repository
- Create an "Inbox" folder for quick captures
- Install browser extensions for seamless paper collection
- Consider tools like PaperMemory for finding forgotten papers
2. Processing Routine¶
- Read papers directly in Zotero (desktop or mobile)
- Take highlights in the Zotero app
- Take notes in a second window alongside the paper (I use Logseq as my note-taking app)
3. Team Sharing¶
- Always share interesting papers with the team using the Zulip channel
- Include a brief summary explaining relevance or key insights
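The sharing step above can be sketched as a small helper that refuses to produce a bare link without a summary. This is only an illustration, not the group's tooling: the `PaperShare` class, both function names, and the channel and topic names are hypothetical, and the payload keys (`type`, `to`, `topic`, `content`) are assumed to follow the shape of Zulip's send-message request. Nothing is actually sent here.

```python
from dataclasses import dataclass


@dataclass
class PaperShare:
    """One paper to share with the team (hypothetical structure)."""
    title: str
    url: str
    summary: str  # one or two sentences on relevance or key insights


def format_share_message(paper: PaperShare) -> str:
    """Render a paper as Markdown: a bold linked title followed by the summary."""
    summary = paper.summary.strip()
    if not summary:
        # The workflow asks for a brief summary, so a bare link is rejected.
        raise ValueError("Add a brief summary explaining relevance or key insights.")
    return f"**[{paper.title}]({paper.url})**\n\n{summary}"


def build_zulip_payload(paper: PaperShare, channel: str, topic: str) -> dict:
    """Build a message payload; the key names are an assumption about Zulip's API."""
    return {
        "type": "stream",
        "to": channel,
        "topic": topic,
        "content": format_share_message(paper),
    }


if __name__ == "__main__":
    paper = PaperShare(
        title="Attention Is All You Need",
        url="https://arxiv.org/abs/1706.03762",
        summary="Introduces the transformer architecture; background for our LLM work.",
    )
    # "literature" and "new papers" are placeholder channel and topic names.
    print(build_zulip_payload(paper, "literature", "new papers")["content"])
```

Keeping the formatting separate from the posting step means the same message text can be pasted by hand or passed to whatever client the team uses.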
Core Responsibilities¶
Your responsibility: Stay current with literature in your research area. While others may provide initial pointers, maintaining awareness is ultimately your job.
Best practice: Read broadly, including papers that simply spark your curiosity—unexpected connections often drive innovation.
Essential Reading List¶
From our group¶
Core ML/AI Foundations¶
- Attention Is All You Need - The transformer architecture (start with the Annotated Transformer)
- Neural Message Passing for Quantum Chemistry - Graph neural networks for molecules
- Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules - VAEs for chemical design
- Meta-Learning in Neural Networks: A Survey - Comprehensive meta-learning overview
- Interpretable Machine Learning - Essential reference on interpretability
Chemical/Materials Informatics¶
- BigSMILES: A Structurally-Based Line Notation for Describing Macromolecules - Polymer representation standard
- Polymer Genome: A Data-Powered Polymer Informatics Platform - Major polymer informatics effort
- AI-driven design of catalysts and materials for ring opening polymerization - Domain-specific language approach
Transformer Deep Dive¶
- The Illustrated Transformer - Visual explanation of transformers
- Leveraging LLMs for Predictive Chemistry - LLMs applied to chemistry
- LIFT: Language-Interfaced Fine-Tuning - Fine-tuning for non-language tasks
- Scaling Laws for Autoregressive Generative Modeling - Understanding model scaling
- The Power of Scale for Parameter-Efficient Prompt Tuning - Efficient fine-tuning methods
Graph Neural Networks¶
- JT-VAE - Junction tree variational autoencoders for molecules
- Jumping Knowledge Networks - Advanced GNN architectures
ML Theory & Ethics¶
- Understanding Deep Learning Requires Rethinking Generalization - Fundamental insights on generalization
- The Lottery Ticket Hypothesis - Sparse neural network training
- On the Dangers of Stochastic Parrots - Critical perspective on large language models
- AI Ethics Course - Comprehensive ethics framework
Software Development¶
- The Pragmatic Programmer - Essential software development principles
Basic Research Tools¶
Essential Skills:
- The Turing Way - Comprehensive guide to reproducible research
- The Missing Semester - Practical computing skills every researcher needs
- What Every CS Major Should Know - Foundational computer science concepts
Books¶
General Reading¶
- Greatness Cannot Be Planned - On open-endedness and serendipity in research
- Chop Wood Carry Water - Philosophy and mindset for sustained work
- Systems Science - Understanding complex systems
Software Development¶
- The Pragmatic Programmer - Essential development principles
Media & Ongoing Learning¶
Podcasts¶
- Dwarkesh Podcast - In-depth interviews with researchers and thought leaders
YouTube Channels¶
- Yannic Kilcher - ML paper reviews and "ML News" updates