👑 Chemical Safety

| Rank | Model | Fraction Correct |
|------|-------|------------------|
| 1 | GPT-4 | 0.436 |
| 2 | Claude-3.5 (Sonnet) | 0.41 |
| 3 | GPT-4o | 0.386 |
| 4 | Claude-3 (Opus) | 0.342 |
| 5 | Llama-3-70B-Instruct | 0.261 |
| 6 | Command-R+ | 0.259 |
| 7 | Phi-3-Medium-4k-Instruct | 0.239 |
| 8 | Claude-2 | 0.207 |
| 9 | Claude-2-Zero-T | 0.201 |
| 10 | Gemini-Pro | 0.174 |
| 11 | Mixtral-8x7b-Instruct | 0.171 |
| 12 | Llama-3-8B-Instruct | 0.161 |
| 13 | GPT-3.5 Turbo | 0.152 |
| 14 | Gemma-7b-Instruct | 0.101 |
| 15 | Galactica-120b | 0.087 |

Leaderboard Plot

The following plot shows the leaderboard, ranking the models by the fraction of questions they answered correctly. This fraction is the number of correct answers divided by the total number of answers. Models are sorted in descending order of fraction correct.
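As a rough illustration of the metric described above, the following Python sketch computes the fraction correct for each model and sorts the results in descending order. The function names (`fraction_correct`, `build_leaderboard`), the model names, and the counts are all hypothetical and chosen only for this example; they are not taken from the benchmark's code or data.

```python
from typing import Dict, List, Tuple


def fraction_correct(n_correct: int, n_total: int) -> float:
    """Return the number of correct answers divided by the total number of answers."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_correct / n_total


def build_leaderboard(
    counts: Dict[str, Tuple[int, int]]
) -> List[Tuple[int, str, float]]:
    """Rank models by fraction correct, highest first.

    `counts` maps a model name to (n_correct, n_total).
    Returns a list of (rank, model, fraction_correct) tuples.
    """
    scores = {
        model: fraction_correct(n_correct, n_total)
        for model, (n_correct, n_total) in counts.items()
    }
    # Sort in descending order of the fraction correct.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, model, score)
        for rank, (model, score) in enumerate(ranked, start=1)
    ]


# Hypothetical counts, for illustration only (not benchmark data).
example_counts = {
    "model_a": (44, 100),
    "model_b": (17, 100),
    "model_c": (39, 100),
}

leaderboard = build_leaderboard(example_counts)
for rank, model, score in leaderboard:
    print(f"{rank} {model} {score:.3f}")
```

Ties keep their input order here because Python's `sorted` is stable; a real leaderboard may apply a different tie-breaking rule.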