Stage 9-1 Level View
Secondary 2 Benchmark Scores
Per-level benchmark view grouped by generation model and subject. Scores are derived from the Stage 9-0 evaluator reports without changing the underlying scoring algorithm.
Showing all Secondary 2 subjects.
LLM Summary
Average scores grouped by the model that generated the Secondary 2 content.
| Generation Model | Artifacts | Overall | Missing Images | Language | Syllabus | Answers | Notation | Timing |
|---|---|---|---|---|---|---|---|---|
| Claude Sonnet 4 | 42 | 8.8 | 24 | 9.4 | 8.9 | 8.9 | 9.7 | 8.3 |
| Legacy generator | 30 | 9.0 | 16 | 9.7 | 9.0 | 9.1 | 9.3 | 8.2 |
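
As a rough consistency check, each model's Overall value in the table above should approximately equal the artifact-weighted mean of its per-type Overall values in the LLM by Content Type table. The sketch below is not the dashboard's aggregation code; the function name `weighted_overall` is hypothetical, and because its inputs are already rounded to one decimal place, the result can differ slightly from an average taken over the raw scores.

```python
def weighted_overall(groups):
    """Artifact-weighted mean of (artifact_count, average_score) pairs."""
    total = sum(count for count, _ in groups)
    return sum(count * score for count, score in groups) / total

# (artifacts, overall) per content type, copied from the
# LLM by Content Type table in this report.
claude = [(5, 9.5), (37, 8.7)]   # Cheatsheet, Quiz
legacy = [(5, 9.9), (25, 8.8)]   # Parents Guide, Quiz

print(round(weighted_overall(claude), 1))
print(round(weighted_overall(legacy), 1))
```

Both results round to the Overall values reported in the LLM Summary (8.8 and 9.0), and the artifact counts add up to 42 and 30.
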
Subject Summary
Average scores grouped by subject inside Secondary 2.
| Subject | Artifacts | Overall | Missing Images | Language | Syllabus | Answers | Notation | Timing |
|---|---|---|---|---|---|---|---|---|
| Chinese | 4 | 7.9 | 1 | 7.4 | 7.8 | 8.0 | 10.0 | 8.1 |
| English | 12 | 9.0 | 2 | 9.5 | 9.0 | 9.0 | 10.0 | 9.1 |
| Geography | 12 | 9.3 | 11 | 9.6 | 10.0 | 9.0 | 10.0 | 8.8 |
| History | 12 | 8.6 | 11 | 9.5 | 9.1 | 8.8 | 10.0 | 6.0 |
| Malay | 4 | 8.9 | 1 | 9.4 | 9.1 | 8.8 | 10.0 | 8.9 |
| Mathematics | 14 | 8.6 | 5 | 9.9 | 7.6 | 9.4 | 8.7 | 8.2 |
| Science | 10 | 9.4 | 8 | 9.8 | 9.9 | 9.2 | 9.4 | 9.4 |
| Tamil | 4 | 7.9 | 1 | 9.0 | 8.1 | 8.9 | 10.0 | 7.2 |
LLM by Content Type
Model scores split by quizzes, papers, cheatsheets, and parent guides.
| Generation Model | Type | Artifacts | Overall | Missing Images | Language | Syllabus | Answers | Notation | Timing |
|---|---|---|---|---|---|---|---|---|---|
| Claude Sonnet 4 | Cheatsheet | 5 | 9.5 | 4 | 9.5 | 9.8 | - | 9.7 | - |
| Claude Sonnet 4 | Quiz | 37 | 8.7 | 20 | 9.3 | 8.8 | 8.9 | 9.7 | 8.3 |
| Legacy generator | Parents Guide | 5 | 9.9 | 0 | 9.8 | 9.8 | - | 10.0 | - |
| Legacy generator | Quiz | 25 | 8.8 | 16 | 9.6 | 8.8 | 9.1 | 9.2 | 8.2 |
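
The grouping behind this table can be sketched as a plain group-and-average over the Detailed Benchmark Rows. This is an illustration under assumptions, not the Stage 9-0 scoring algorithm: the helper `group_means` is hypothetical, it averages the displayed (already rounded) Overall values, and only the five cheatsheets and five parent guides are included to keep the example short.

```python
from collections import defaultdict

# (model, type, overall) copied from the Detailed Benchmark Rows.
rows = [
    ("Claude Sonnet 4", "Cheatsheet", 9.4),   # English
    ("Claude Sonnet 4", "Cheatsheet", 9.6),   # Geography
    ("Claude Sonnet 4", "Cheatsheet", 9.5),   # History
    ("Claude Sonnet 4", "Cheatsheet", 9.3),   # Mathematics
    ("Claude Sonnet 4", "Cheatsheet", 9.5),   # Science
    ("Legacy generator", "Parents Guide", 10.0),  # English
    ("Legacy generator", "Parents Guide", 10.0),  # Geography
    ("Legacy generator", "Parents Guide", 9.8),   # History
    ("Legacy generator", "Parents Guide", 9.5),   # Mathematics
    ("Legacy generator", "Parents Guide", 10.0),  # Science
]

def group_means(rows):
    """Mean Overall per (model, type), rounded to one decimal place."""
    buckets = defaultdict(list)
    for model, kind, overall in rows:
        buckets[(model, kind)].append(overall)
    return {key: round(sum(vals) / len(vals), 1) for key, vals in buckets.items()}

means = group_means(rows)
print(means)
```

For these two groups the sketch reproduces the Overall values in the table above: 9.5 for Claude Sonnet 4 cheatsheets and 9.9 for Legacy generator parent guides.
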
Needs Review: Scores Below 8.0
Artifacts with overall benchmark scores below 8.0 for the current level view.
| Overall | Model | Subject | Type | Stage | Topic / Paper | Language | Syllabus | Template | Answers | Notation | Timing | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4.9 | Claude Sonnet 4 | Chinese | Quiz | 5-1 | oral-listening | 2.0 | 3.0 | 4.0 | 5.0 | - | 8.0 | Critical failure: The quiz is entirely in English for a Chinese language subject. While the structure mimics a listening paper, the content lacks Chinese characters and Chinese text, making it useless for a Chinese Oral-Listening assessment. The difficulty is too low because it is essentially an English comprehension quiz. |
| 6.8 | Legacy generator | Mathematics | Quiz | 3-0 | calculus | 10.0 | 0.0 | 7.0 | 9.0 | 8.0 | 8.0 | The artifact fails syllabus adherence completely; calculus (limits, derivatives, product rule) is not in the Sec 2 G3 syllabus. While the content is high-quality for Additional Mathematics or Pre-U, it is inappropriate for the requested level. Notation is generally good but lacks full LaTeX formatting for all mathematical expressions. |
| 6.9 | Claude Sonnet 4 | Tamil | Quiz | 5-1 | comprehension | 9.0 | 7.0 | 6.0 | 8.0 | - | 6.0 | The content is significantly below Secondary 2 level; the passages are extremely short and the questions are too literal/simple. Exam format is incorrect: total marks (100) and duration (60m) do not align with the O-Level Paper 2 structure (70 marks, 1h 30m). Section D is redundant as it repeats Section C's logic. Marks per question are unnaturally uniform. |
| 6.9 | Legacy generator | History | Quiz | 3-0 | ancient-civilisations | 9.0 | 3.0 | 7.0 | 8.0 | 10.0 | 5.0 | Major syllabus misalignment: The Singapore Sec 2 History syllabus focuses on Singapore and Southeast Asian history (Temasek, British rule, Independence), whereas this quiz covers global Ancient Civilisations (Egypt, Mesopotamia, Greece) which is not part of the local curriculum. The question volume is also excessive for a single quiz, making it unlikely to be completed in a standard timeframe. Marks assigned are clear, but content is irrelevant to the specific syllabus provided. |
| 7.1 | Claude Sonnet 4 | Tamil | Quiz | 5-1 | composition | 9.0 | 7.5 | 6.0 | 8.5 | - | 5.0 | The artifact is a hybrid of a quiz and a worksheet rather than a formal exam paper. While the Tamil language quality is high, the structure deviates significantly from the O-Level Paper 1 format (which should be 1 Email + 1 Composition, not a multi-section quiz). The total marks (100) and duration (90 mins) do not align with the actual syllabus weighting for Paper 1. Difficulty is uneven: Section B/D are quite easy/pedagogical, while Section C is standard. The error correction in Section A3 is slightly weak as the errors are very subtle for a 5-mark question. |
| 7.3 | Claude Sonnet 4 | History | Quiz | 5-1 | conflict-international-relations | 9.5 | 9.0 | 8.5 | 8.5 | - | 2.0 | The quiz is extremely overloaded. 20 structured response questions plus a heavy source-based section is impossible to complete in 60 minutes. The depth of questions (e.g., 16-mark source evaluation) is more aligned with Upper Secondary/O-Level than Sec 2. Source C is a text placeholder for a missing image. |
| 7.4 | Claude Sonnet 4 | Mathematics | Quiz | 5-1 | calculus | 10.0 | 0.0 | 8.0 | 10.0 | 9.0 | 10.0 | Major syllabus violation: Calculus (differentiation and integration) is not part of the Secondary 2 G3 Mathematics syllabus in Singapore; it is an Upper Secondary Additional Mathematics topic. The content is far too advanced for the level. |
| 7.4 | Claude Sonnet 4 | Mathematics | Quiz | 5-1 | graphs-coordinate-geometry | 10.0 | 6.0 | 8.0 | 8.0 | 9.0 | 5.0 | Major syllabus misalignment: Section C introduces quadratic functions, differentiation (calculus), and vertex forms which are not in the Sec 2 G3 syllabus. Question 17 requires calculus. Question 20 has a mathematical error in the question setup. The difficulty is uneven and exceeds the level. |
| 7.7 | Legacy generator | Mathematics | Quiz | 3-0 | graphs-coordinate-geometry | 10.0 | 6.0 | 8.0 | 10.0 | 9.0 | 4.0 | Major syllabus misalignment: The quiz includes quadratic vertices, parabolas, completing the square, circles, and transformations, which are Secondary 3/4 topics, not Secondary 2. The question count (20) is too high for a 45-minute window given the complexity. Missing diagrams for questions 11 and 15. |
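
The review list reads as a strict threshold filter on the Overall score. A minimal sketch, assuming the comparison is `overall < 8.0` on the displayed values and that rows are sorted by ascending score (the names `REVIEW_THRESHOLD` and `needs_review` are hypothetical), using the six Claude Sonnet 4 Mathematics quizzes from the Detailed Benchmark Rows:

```python
REVIEW_THRESHOLD = 8.0  # assumed cut-off, taken from the section heading

def needs_review(artifacts, threshold=REVIEW_THRESHOLD):
    """Return (topic, overall) pairs strictly below the threshold, lowest first."""
    flagged = [a for a in artifacts if a[1] < threshold]
    return sorted(flagged, key=lambda a: a[1])

# (topic, overall) copied from the Detailed Benchmark Rows.
claude_math_quizzes = [
    ("algebra-functions", 9.8),
    ("calculus", 7.4),
    ("geometry-trigonometry", 8.0),
    ("graphs-coordinate-geometry", 7.4),
    ("numbers-ratio-proportion", 9.0),
    ("statistics-probability", 9.6),
]

flagged = needs_review(claude_math_quizzes)
print(flagged)
```

With a strict comparison, geometry-trigonometry (exactly 8.0) is not flagged, which is consistent with its absence from the table above; calculus and graphs-coordinate-geometry are.
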
Content Type Summary
Average scores grouped by content type.
Detailed Benchmark Rows
Topics, quiz variants, paper versions, cheatsheets, and parent guides listed individually.
| Model | Type | Stage | Subject | Topic / Paper | Overall | Missing Images | Language | Syllabus | Template | Clean | Step Answers | Notation | Paper Format | Difficulty Score | Time Fit | 3-Point Summary | Parent Guide | Difficulty Rating | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Claude Sonnet 4 | Cheatsheet | 2-7 | English | cheatsheet | 9.4 | No | 9.5 | 10.0 | - | 10.0 | - | - | - | 9.0 | - | 8.5 | - | appropriate | Excellent syllabus alignment covering all O-Level/Secondary 2 components. Content is well-structured with clear, actionable advice. While it uses bullet points rather than strict 'three-point summaries', the categorization is highly effective for a cheatsheet. |
| Claude Sonnet 4 | Cheatsheet | 2-7 | Geography | cheatsheet | 9.6 | Yes | 9.5 | 10.0 | - | 10.0 | - | 10.0 | - | 9.0 | - | 9.0 | - | appropriate | Excellent syllabus coverage including Singapore-specific context (Four National Taps, HDB). The cheatsheet uses effective bulleted summaries rather than generic text. Note: As a cheatsheet, it lacks the diagrams/maps essential for Geography (e.g., Hydrological cycle, rainforest layers) which should be noted for the user. |
| Claude Sonnet 4 | Cheatsheet | 2-7 | History | cheatsheet | 9.5 | Yes | 9.5 | 10.0 | - | 10.0 | - | - | - | 9.0 | - | 9.0 | - | appropriate | Excellent syllabus coverage including the post-1965 era which extends the provided context. Content is well-structured with concise bullet points. While it lacks diagrams/maps which are vital for History, the text-based summaries are high quality. The 'Source-Based Question Skills' section is a strong addition for this level. |
| Claude Sonnet 4 | Cheatsheet | 2-7 | Mathematics | cheatsheet | 9.3 | Yes | 9.5 | 9.0 | - | 10.0 | - | 9.0 | - | 9.0 | - | 9.5 | - | appropriate | High quality cheatsheet. Content aligns well with Sec 2 G3 syllabus, covering algebra, geometry, and mensuration. Uses effective three-point summaries for most topics. Notation is clean, though some geometric diagrams (e.g., cones, spheres, triangles) are missing which are essential for a math cheatsheet. |
| Claude Sonnet 4 | Cheatsheet | 2-7 | Science | cheatsheet | 9.5 | Yes | 9.5 | 10.0 | - | 10.0 | - | 10.0 | - | 9.0 | - | 8.5 | - | appropriate | High quality cheatsheet. Excellent syllabus coverage including G3 specific requirements like reverse osmosis, KE/GPE formulas, and electrical power. Uses clear bullet points, though some sections could benefit from more structured three-point summaries rather than long lists. Notation is correct. Missing diagrams for light rays and circuits which are essential for this level. |
| Claude Sonnet 4 | Quiz | 5-1 | Chinese | composition | 9.1 | No | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.0 | 9.0 | 8.5 | - | - | appropriate | High quality quiz. Language and content are well-aligned with Sec 2 expectations. Section A correctly mimics O-Level language use formats. The exam format is slightly off in total marks (100 vs 60 for Paper 1) and duration (60 mins vs 120 mins), but acceptable for a topical quiz. Answer key is excellent with clear marking schemes. |
| Claude Sonnet 4 | Quiz | 5-1 | Chinese | comprehension | 8.7 | No | 9.0 | 9.5 | 8.5 | 10.0 | 9.0 | 10.0 | 7.0 | 8.0 | 7.0 | - | - | appropriate | Content aligns well with Sec 2 syllabus (Cloze, Word Replacement, Practical, Narrative). However, the exam format is slightly off: total marks (100) and duration (60 mins) do not match the O-Level Paper 2 standard (70 marks, 1h 30min). The marks assigned to individual comprehension questions (12 marks each) are unusually high for a single question in a standard paper. |
| Claude Sonnet 4 | Quiz | 5-1 | Chinese | language-use | 9.1 | No | 9.0 | 9.5 | 8.5 | 10.0 | 9.0 | 10.0 | 8.0 | 8.5 | 9.0 | - | - | appropriate | High quality quiz. Section B (Word Replacement) effectively targets colloquialisms vs formal language, which is key for Sec 2. Section C uses a realistic advertisement format. Marks and duration are well-defined, though the total marks (40) differ from the standard Paper 2 (70), which is acceptable for a topical quiz. Answer key provides good explanations. |
| Claude Sonnet 4 | Quiz | 5-1 | Chinese | oral-listening | 4.9 | Yes | 2.0 | 3.0 | 4.0 | 10.0 | 5.0 | - | 5.0 | 2.0 | 8.0 | - | - | too easy | Critical failure: The quiz is entirely in English for a Chinese language subject. While the structure mimics a listening paper, the content lacks Chinese characters and Chinese text, making it useless for a Chinese Oral-Listening assessment. The difficulty is too low because it is essentially an English comprehension quiz. |
| Claude Sonnet 4 | Quiz | 5-1 | English | argument-evaluation | 9.2 | No | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.5 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Language and difficulty are well-calibrated for Sec 2. Question types align well with O-Level Paper 2 skills (literal, inferential, evaluative). Answer key provides excellent marking schemes. Note: The answer key was truncated at the end. |
| Claude Sonnet 4 | Quiz | 5-1 | English | composition-situational-writing | 8.9 | No | 9.5 | 9.0 | 7.0 | 10.0 | 9.0 | 10.0 | 8.0 | 8.5 | 9.0 | - | - | appropriate | The quiz is well-structured for Sec 2 level. However, it deviates from the O-Level/Syllabus format for Situational Writing, which typically requires a visual text (e.g., an advertisement or poster) to be analyzed. This quiz focuses more on general format and tone drills rather than the specific task of extracting information from a visual stimulus. The answer key is excellent and provides clear marking rubrics. |
| Claude Sonnet 4 | Quiz | 5-1 | English | comprehension | 9.0 | Yes | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | - | 8.0 | 9.0 | 9.0 | - | - | appropriate | Language and difficulty are well-calibrated for Sec 2. The visual text section relies on a text description rather than an actual image, which is a major drawback for a comprehension quiz. The answer key is high quality with clear marking schemes. Exam format is mostly correct but lacks the formal structure of a full O-Level Paper 2 (e.g., specific section headers like Section A/B/C as defined in the syllabus). |
| Claude Sonnet 4 | Quiz | 5-1 | English | language-use | 8.9 | No | 9.5 | 9.0 | 8.5 | 9.0 | 9.5 | - | 8.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Language and question types (inference, imagery, personification) align well with Sec 2 English standards. Answer key is excellent with detailed marking notes. Minor deduction for the truncated end of the answer key and lack of specific marks/minutes for a formal exam paper structure, though it works well as a classroom quiz. |
| Claude Sonnet 4 | Quiz | 5-1 | English | summary | 8.8 | No | 9.5 | 9.0 | 8.5 | 9.0 | 8.5 | - | 8.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Content is well-aligned with Sec 2 English summary and comprehension skills. Answer key provides excellent marking schemes and sample answers. Note: The answer key is truncated at the end. The exam format is slightly simplified compared to actual O-Level Paper 2 but suitable for a topical quiz. |
| Claude Sonnet 4 | Quiz | 5-1 | Geography | fieldwork | 9.1 | Yes | 9.5 | 10.0 | 8.5 | 9.0 | 9.0 | 10.0 | 8.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz that aligns well with the Sec 2 Geography fieldwork syllabus. The question types (Data Interpretation, Structured Response) match the exam context. Major issue is the presence of placeholder text for Figure 1 and Figure 2 which makes the quiz unusable without the actual images. The answer key is excellent, providing clear marking schemes and sample answers. |
| Claude Sonnet 4 | Quiz | 5-1 | Geography | human-geography | 9.4 | Yes | 10.0 | 10.0 | 9.0 | 10.0 | 9.0 | 10.0 | 9.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Content aligns perfectly with Sec 2 Human Geography syllabus (housing, transport, urbanization). Question types (Short Answer, Structured, Extended) mirror Singapore exam formats. Note: Images are represented by text descriptions rather than actual graphics, which is expected for LLM output but requires manual insertion. Answer key is excellent with clear marking schemes. |
| Claude Sonnet 4 | Quiz | 5-1 | Geography | map-graph-data-skills | 9.4 | Yes | 9.5 | 10.0 | 9.0 | 10.0 | 9.0 | 10.0 | 9.5 | 9.0 | 9.0 | - | - | appropriate | High quality quiz that aligns perfectly with the Sec 2 Geography syllabus (Map, Data, Fieldwork). The content is logically structured into sections. Major issue: The quiz is entirely dependent on visual stimuli (Figure 1, Figure 2, Photograph A, Figure 4) which are completely missing from the artifact, making it impossible to actually answer the questions. The answer key is well-structured with marking notes. |
| Claude Sonnet 4 | Quiz | 5-1 | Geography | physical-geography | 9.3 | Yes | 9.5 | 10.0 | 8.5 | 10.0 | 9.0 | 10.0 | 9.0 | 8.5 | 9.0 | - | - | appropriate | The quiz aligns well with the Sec 2 Physical Geography syllabus (Water and Tropical Ecosystems). However, it is heavily dependent on figures (Figure 1 to Figure 6) which are missing, making the quiz impossible to complete as written. The marking scheme is high quality with clear breakdown of marks. |
| Claude Sonnet 4 | Quiz | 5-1 | Geography | resources-sustainability | 9.1 | Yes | 9.5 | 10.0 | 8.5 | 9.0 | 9.0 | 10.0 | 8.0 | 9.0 | 8.5 | - | - | appropriate | High quality quiz aligned well with Sec 2 Geography syllabus (Water, Rainforests, Mangroves). Includes necessary skills like grid references and population density. Marks assigned per question. Major issue: Images are represented by text descriptions rather than actual graphics, making data interpretation questions (e.g., Fig 1, Fig 2) impossible to solve as a real exam. The total marks (60) and duration (60 mins) are standard, though the question count is quite high for the time allotted. |
| Claude Sonnet 4 | Quiz | 5-1 | History | ancient-civilisations | 8.3 | Yes | 9.5 | 8.5 | 8.0 | 10.0 | 9.0 | - | 8.5 | 7.0 | 6.0 | - | - | uneven | The quiz is well-structured for History skills (Source-Based Questions), but the total marks (100) and question count (20) are extremely high for a 60-minute Secondary 2 paper. The difficulty is uneven; Section A is manageable, but Section D contains too many high-mark questions for the timeframe. The answer key is truncated. |
| Claude Sonnet 4 | Quiz | 5-1 | History | conflict-international-relations | 7.3 | Yes | 9.5 | 9.0 | 8.5 | 9.0 | 8.5 | - | 8.0 | 4.0 | 2.0 | - | - | too hard | The quiz is extremely overloaded. 20 structured response questions plus a heavy source-based section is impossible to complete in 60 minutes. The depth of questions (e.g., 16-mark source evaluation) is more aligned with Upper Secondary/O-Level than Sec 2. Source C is a text placeholder for a missing image. |
| Claude Sonnet 4 | Quiz | 5-1 | History | essay-explanation | 8.9 | Yes | 9.5 | 10.0 | 9.0 | 10.0 | 8.5 | - | 9.0 | 8.5 | 7.0 | - | - | appropriate | High quality content that aligns well with the Sec 2 History syllabus. Source-based questions follow standard Singapore exam formats. However, the quiz is extremely long (20 questions for 60 minutes) and likely exceeds the realistic timeframe for a student to complete thoroughly. The answer key is truncated. |
| Claude Sonnet 4 | Quiz | 5-1 | History | singapore-southeast-asia | 8.9 | Yes | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.0 | 8.5 | 7.5 | - | - | appropriate | High quality content. Source B is described but the actual image is missing. The quiz is quite long (20 questions for 60 mins) which might be tight for Sec 2 students given the depth of source-based and essay questions. Answer key is truncated. |
| Claude Sonnet 4 | Quiz | 5-1 | History | source-based-skills | 9.1 | Yes | 9.5 | 10.0 | 9.0 | 10.0 | 9.0 | - | 8.5 | 9.0 | 8.0 | - | - | appropriate | High quality content. The quiz accurately reflects Sec 2 History syllabus (Japanese Occupation, Merger, HDB). Major issue: Source B is described in text but the actual image is missing, which is critical for a visual source-based question. The marking scheme is excellent and provides clear descriptors. The total marks (100) and duration (60 mins) might be slightly ambitious for the depth of analysis required in Section B and C, but generally appropriate. |
| Claude Sonnet 4 | Quiz | 5-1 | Malay | composition | 8.9 | No | 9.5 | 9.0 | 7.0 | 10.0 | 9.0 | 10.0 | 8.0 | 8.5 | 9.0 | - | - | appropriate | High quality content. The quiz covers functional and essay writing well. However, it does not strictly follow the O-Level Paper 1 format which usually separates Situational and Continuous Writing into distinct, high-weightage tasks rather than a mixed-format quiz. The marks distribution is slightly unconventional for a formal exam but suitable for a classroom quiz. |
| Claude Sonnet 4 | Quiz | 5-1 | Malay | comprehension | 9.1 | No | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.0 | 8.5 | 9.0 | - | - | appropriate | High quality quiz. Language is appropriate for Sec 2. Adheres well to the O-Level Malay Paper 2 structure (Reading, Language Use, Summary). Marks and instructions are clear. One minor error in Section B Q17: the intended proverb is 'Siapa makan lada, dia terasa pedas' or similar; 'Sepah yang menang jadi arang' looks like a malformed version of 'Siapa kalah jadi arang' or 'Siapa menang jadi arang' and is not a standard proverb. Answer key provides good marking schemes. |
| Claude Sonnet 4 | Quiz | 5-1 | Malay | language-use | 8.7 | No | 9.0 | 9.5 | 7.0 | 10.0 | 8.0 | 10.0 | 7.5 | 8.5 | 9.0 | - | - | appropriate | The quiz aligns well with the Secondary 2 Malay syllabus, covering grammar, word classes, and proverbs. The marking scheme is detailed. However, the total marks (100) and question distribution do not strictly follow the O-Level Paper 2 structure (which is 90 marks and divided into specific sections). The instruction to use 'Bahasa Baku' is good, but the sample answer for question 19 uses Malaysian-style informal slang ('shopping mall', 'wayang') which might deviate from Singaporean Malay context, though acceptable for general language use. |
| Claude Sonnet 4 | Quiz | 5-1 | Malay | oral-listening | 9.1 | Yes | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.0 | 9.0 | 8.5 | - | - | appropriate | The quiz aligns well with Secondary 2 Malay listening objectives. However, as a listening quiz, the actual audio scripts are missing, making it impossible to verify content accuracy. The marking scheme is clear and provides specific guidance for partial marks. The exam format is mostly correct, though it lacks the specific 'Paper' designation used in O-Level formats. |
| Claude Sonnet 4 | Quiz | 5-1 | Mathematics | algebra-functions | 9.8 | No | 10.0 | 10.0 | 9.0 | 10.0 | 10.0 | 10.0 | 9.5 | 9.5 | 10.0 | - | - | appropriate | High quality quiz. Excellent coverage of Sec 2 Algebra and Functions including proportionality and quadratics. Answer key includes useful marking schemes (M/A marks). Format is very close to standard Singapore school papers. |
| Claude Sonnet 4 | Quiz | 5-1 | Mathematics | calculus | 7.4 | No | 10.0 | 0.0 | 8.0 | 10.0 | 10.0 | 9.0 | 9.0 | 1.0 | 10.0 | - | - | too hard | Major syllabus violation: Calculus (differentiation and integration) is not part of the Secondary 2 G3 Mathematics syllabus in Singapore; it is an Upper Secondary Additional Mathematics topic. The content is far too advanced for the level. |
| Claude Sonnet 4 | Quiz | 5-1 | Mathematics | geometry-trigonometry | 8.0 | Yes | 10.0 | 7.0 | 8.0 | 9.0 | 9.0 | 6.0 | 9.0 | 5.0 | 9.0 | - | - | uneven | Major issue: The quiz includes Sine Rule and Cosine Rule (Questions 18, 20), which are typically Secondary 3/4 topics in the Singapore syllabus, making it too hard for Sec 2. Many geometry questions require diagrams to be solvable or clear, but no images are provided. Notation is inconsistent, using plain text instead of proper LaTeX for square roots and trigonometry. The answer key is generally good but contains a self-correction in Q3 and is truncated at the end. |
| Claude Sonnet 4 | Quiz | 5-1 | Mathematics | graphs-coordinate-geometry | 7.4 | Yes | 10.0 | 6.0 | 8.0 | 9.0 | 8.0 | 9.0 | 9.0 | 3.0 | 5.0 | - | - | too hard | Major syllabus misalignment: Section C introduces quadratic functions, differentiation (calculus), and vertex forms which are not in the Sec 2 G3 syllabus. Question 17 requires calculus. Question 20 has a mathematical error in the question setup. The difficulty is uneven and exceeds the level. |
| Claude Sonnet 4 | Quiz | 5-1 | Mathematics | numbers-ratio-proportion | 9.0 | No | 10.0 | 9.5 | 8.5 | 10.0 | 9.0 | 10.0 | 9.0 | 7.0 | 8.0 | - | - | uneven | The quiz is generally high quality but has uneven difficulty. Section A is very basic, while Section C contains high-level problems (e.g., question 20) that result in irrational numbers, which might be unexpected for a standard Sec 2 quiz. Question 20 also contains a calculation error in the answer key's logic regarding integer solutions. The marks assigned to Section A are slightly high for the simplicity of the questions. |
| Claude Sonnet 4 | Quiz | 5-1 | Mathematics | statistics-probability | 9.6 | No | 10.0 | 10.0 | 9.0 | 10.0 | 9.5 | 10.0 | 9.5 | 8.5 | 10.0 | - | - | appropriate | High quality quiz. Follows Singapore secondary math structure well. Section A is quite easy, but Section C provides good rigor. Note: Question 15 has a calculation error in the answer key (midpoints sum to 763, but 763/34 is 22.4, not 21.8). Marking scheme is detailed with M/A/B marks. |
| Claude Sonnet 4 | Quiz | 5-1 | Science | chemistry-materials | 9.3 | Yes | 10.0 | 10.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Content aligns well with G3 Chemistry and Materials syllabus. Major issue: several questions (7, 8b, 14) rely on diagrams that are only provided as text descriptions/placeholders, making the quiz unusable without actual images. Exam format is mostly correct but lacks a formal 'Total Marks' summary at the start and specific section-wise mark totals in the header. |
| Claude Sonnet 4 | Quiz | 5-1 | Science | life-sciences | 9.2 | Yes | 10.0 | 10.0 | 8.0 | 10.0 | 9.0 | 10.0 | 8.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Content aligns well with Sec 2 Life Sciences syllabus. Major issue: several questions (12, 15) rely on diagrams that are only described via text placeholders, making the quiz unusable without the actual images. Marks assigned per question are clear, though Section A marks are slightly inconsistent with standard MCQ weighting (usually 1 mark each). Answer key is excellent and provides necessary scientific reasoning. |
| Claude Sonnet 4 | Quiz | 5-1 | Science | physical-sciences | 9.1 | Yes | 9.5 | 10.0 | 8.5 | 9.0 | 9.5 | 8.0 | 9.0 | 9.0 | 9.5 | - | - | appropriate | High quality quiz. Content aligns perfectly with G3 Physical Sciences syllabus (Energy, Forces, Light, Electricity). Answer key provides excellent mark allocation. Note: Uses plain text for math instead of LaTeX, and contains placeholder text for diagrams which must be replaced with actual images. |
| Claude Sonnet 4 | Quiz | 5-1 | Science | scientific-inquiry | 9.4 | Yes | 10.0 | 10.0 | 9.0 | 10.0 | 9.0 | 10.0 | 9.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Content aligns perfectly with G3 Scientific Inquiry syllabus. Note: Question 7 refers to 'Figure 1' which is missing, and Question 9 refers to 'groups as shown below' but uses a table instead of a diagram. Answer key is excellent with clear marking schemes. |
| Claude Sonnet 4 | Quiz | 5-1 | Tamil | composition | 7.1 | No | 9.0 | 7.5 | 6.0 | 10.0 | 8.5 | - | 5.0 | 6.0 | 5.0 | - | - | uneven | The artifact is a hybrid of a quiz and a worksheet rather than a formal exam paper. While the Tamil language quality is high, the structure deviates significantly from the O-Level Paper 1 format (which should be 1 Email + 1 Composition, not a multi-section quiz). The total marks (100) and duration (90 mins) do not align with the actual syllabus weighting for Paper 1. Difficulty is uneven: Section B/D are quite easy/pedagogical, while Section C is standard. The error correction in Section A3 is slightly weak as the errors are very subtle for a 5-mark question. |
| Claude Sonnet 4 | Quiz | 5-1 | Tamil | comprehension | 6.9 | No | 9.0 | 7.0 | 6.0 | 10.0 | 8.0 | - | 5.0 | 4.0 | 6.0 | - | - | too easy | The content is significantly below Secondary 2 level; the passages are extremely short and the questions are too literal/simple. Exam format is incorrect: total marks (100) and duration (60m) do not align with the O-Level Paper 2 structure (70 marks, 1h 30m). Section D is redundant as it repeats Section C's logic. Marks per question are unnaturally uniform. |
| Claude Sonnet 4 | Quiz | 5-1 | Tamil | language-use | 9.2 | No | 9.0 | 9.5 | 8.5 | 10.0 | 10.0 | 10.0 | 8.0 | 8.5 | 9.0 | - | - | appropriate | The quiz aligns well with the O-Level Tamil syllabus, specifically targeting idioms, compound words, and error correction. The answer key is excellent, providing clear linguistic explanations. The exam format is mostly correct, though it lacks the specific weighting/component breakdown seen in official papers. Difficulty is appropriate for Secondary 2. |
| Claude Sonnet 4 | Quiz | 5-1 | Tamil | oral-listening | 8.5 | Yes | 9.0 | 8.5 | 7.0 | 10.0 | 9.0 | - | 8.0 | 7.5 | 9.0 | - | - | appropriate | The quiz structure aligns well with the Listening Comprehension component of the syllabus. However, it is fundamentally broken as an artifact because it refers to 7 audio clips that do not exist; without the actual audio scripts or files, the quiz is unusable. The question types (MCQ and open-ended) are appropriate for Sec 2. The answer key provides good explanations and marking guidelines. |
| Legacy generator | Parents Guide | 2-9 | English | parents-guide | 10.0 | No | 10.0 | 10.0 | - | 10.0 | - | - | - | 10.0 | - | - | 10.0 | appropriate | Excellent parent guide. Highly accurate to the O-Level 1184 syllabus and Sec 2 transition. Provides practical, actionable advice for parents without being overwhelming. |
| Legacy generator | Parents Guide | 2-9 | Geography | parents-guide | 10.0 | No | 10.0 | 10.0 | - | 10.0 | - | - | - | 10.0 | - | - | 10.0 | appropriate | Excellent parent guide. Highly aligned with the Sec 2 Geography syllabus, specifically covering water, rainforests/mangroves, housing, and transport. Provides practical, actionable advice for parents and accurately reflects the weightage of map/data skills in Singapore assessments. |
| Legacy generator | Parents Guide | 2-9 | History | parents-guide | 9.8 | No | 9.5 | 10.0 | - | 10.0 | - | - | - | 9.5 | - | - | 10.0 | appropriate | Excellent parent guide. It accurately reflects the Sec 2 History syllabus, specifically the shift from memorization to source-based analysis. The breakdown of units and the 'Common Struggles' section are highly practical for parents. No markdown errors or broken symbols found. |
| Legacy generator | Parents Guide | 2-9 | Mathematics | parents-guide | 9.5 | No | 9.5 | 9.0 | - | 10.0 | - | 10.0 | - | 9.0 | - | - | 9.5 | appropriate | High quality guide. Language is professional and empathetic to parents. Content accurately reflects the Sec 2 G3 syllabus, specifically mentioning quadratic functions, trigonometry, and algebraic fractions. No broken markdown or notation issues found. |
| Legacy generator | Parents Guide | 2-9 | Science | parents-guide | 10.0 | No | 10.0 | 10.0 | - | 10.0 | - | 10.0 | - | 10.0 | - | - | 10.0 | appropriate | Excellent parent guide. It accurately reflects the G3 syllabus, specifically mentioning required mathematical formulas (KE, GPE, Power, Energy) and specific topics like reverse osmosis. The breakdown of question types and assessment objectives is highly relevant to the Singapore context. |
| Legacy generator | Quiz | 3-0 | English | argument-evaluation | 9.2 | No | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 9.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Content aligns well with Sec 2 Argument & Evaluation skills. Answer key provides clear marking schemes and sample responses. Format is professional and follows standard exam structures. |
| Legacy generator | Quiz | 3-0 | English | composition-situational-writing | 8.1 | Yes | 9.0 | 7.0 | 5.0 | 10.0 | 9.0 | 10.0 | 8.0 | 6.0 | 9.0 | - | - | too easy | The quiz focuses on micro-skills (writing single sentences/lines) rather than the actual situational writing task required by the syllabus (writing a full 250-350 word text based on a visual). It lacks the essential visual stimulus/text required for Section B of Paper 1. While good for drill practice, it does not simulate the exam format accurately. |
| Legacy generator | Quiz | 3-0 | English | comprehension | 9.2 | No | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.5 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Language and question types (literal, inferential, evaluative, summary) align well with Sec 2/O-Level standards. Answer key includes helpful marking schemes. Minor deduction for lack of visual text in Section B which is common in Paper 2, though the poster text serves a similar purpose. |
| Legacy generator | Quiz | 3-0 | English | language-use | 9.2 | No | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 8.5 | 9.0 | 9.5 | - | - | appropriate | High quality quiz. Content aligns well with Sec 2 English Language Use and Comprehension skills. Question types (literal, inferential, vocabulary in context) match O-Level style. Answer key is excellent, providing clear marking points and alternative answers. Exam format is mostly correct, though lacks specific marks/minutes for a formal paper header, but suitable for a quiz. Timing is realistic for the number of questions. |
| Legacy generator | Quiz | 3-0 | English | summary | 8.7 | No | 9.5 | 8.5 | 7.0 | 10.0 | 9.0 | 10.0 | 7.5 | 8.0 | 9.0 | - | - | appropriate | Language and difficulty are well-calibrated for Sec 2. The summary question (Q18) is slightly under-weighted at 3 marks for a 20-mark section; typically, a summary task in O-Level style is a standalone high-value component. Exam format lacks specific paper/component headers found in actual papers, but instructions and marks are clear. |
| Legacy generator | Quiz | 3-0 | Geography | fieldwork | 9.3 | Yes | 9.5 | 10.0 | 9.0 | 8.5 | 9.5 | 10.0 | 9.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz. Content aligns perfectly with Sec 2 Fieldwork syllabus. Major issue: Figure 1 and Figure 2 are missing actual images, replaced by text descriptions. The answer key is excellent with clear mark schemes, though the final answer for Q20 is truncated. |
| Legacy generator | Quiz | 3-0 | Geography | human-geography | 9.4 | Yes | 9.5 | 10.0 | 9.0 | 10.0 | 9.0 | - | 9.5 | 9.0 | 9.0 | - | - | appropriate | High quality quiz that aligns well with Sec 2 Human Geography syllabus. The structure follows exam patterns (Map skills, Urban, Population). Major issue: The quiz is entirely dependent on figures (Figure 1 to Figure 8) which are missing, making it impossible to answer. Answer key is excellent and provides clear marking schemes. |
| Legacy generator | Quiz | 3-0 | Geography | map-graph-data-skills | 9.1 | Yes | 9.5 | 10.0 | 9.0 | 8.0 | 9.0 | 10.0 | 9.0 | 9.0 | 8.5 | - | - | appropriate | High quality quiz that aligns well with Sec 2 Geography syllabus (Map skills, Urbanization, Fieldwork). Major issue: The artifact is entirely dependent on external visual stimuli (topographic maps, population graphs, photographs) which are not provided in the text, making the quiz impossible to attempt in its current state. The answer key is excellent and provides clear marking schemes. |
| Legacy generator | Quiz | 3-0 | Geography | physical-geography | 9.3 | Yes | 9.5 | 10.0 | 8.5 | 10.0 | 9.0 | 10.0 | 9.0 | 9.0 | 9.0 | - | - | appropriate | High quality quiz that aligns well with the Sec 2 Geography syllabus. The content covers water systems, ecosystems, and climate effectively. Major issue: The quiz relies heavily on figures (Figure 1 to Figure 8) and photographs (A to C) which are entirely missing from the artifact, making the questions impossible to answer. The answer key is excellent and provides clear marking schemes. |
| Legacy generator | Quiz | 3-0 | Geography | resources-sustainability | 9.2 | Yes | 9.5 | 10.0 | 9.0 | 10.0 | 8.5 | 10.0 | 9.0 | 8.5 | 8.0 | - | - | appropriate | High quality quiz that aligns well with the Sec 2 Geography syllabus (Water, Rainforests, Mangroves). Major issue: The quiz relies heavily on figures (Figure 1, 2, 3, 4) and a map extract which are completely missing from the artifact, making it impossible to answer. The answer key is well-structured but assumes data that isn't present. Difficulty is appropriate for G2/G3 levels. |
| Legacy generator | Quiz | 3-0 | History | ancient-civilisations | 6.9 | Yes | 9.0 | 3.0 | 7.0 | 10.0 | 8.0 | 10.0 | 6.0 | 4.0 | 5.0 | - | - | uneven | Major syllabus misalignment: The Singapore Sec 2 History syllabus focuses on Singapore and Southeast Asian history (Temasek, British rule, Independence), whereas this quiz covers global Ancient Civilisations (Egypt, Mesopotamia, Greece) which is not part of the local curriculum. The question volume is also excessive for a single quiz, making it unlikely to be completed in a standard timeframe. Marks assigned are clear, but content is irrelevant to the specific syllabus provided. |
| Legacy generator | Quiz | 3-0 | History | conflict-international-relations | 8.2 | Yes | 9.5 | 10.0 | 9.0 | 8.0 | 9.0 | - | 8.5 | 7.0 | 5.0 | - | - | uneven | The quiz is truncated at the end. While the content is syllabus-accurate, the difficulty is uneven: Section A is high-quality source-based inquiry, but Section D contains too many high-level essay questions for a 60-minute Sec 2 quiz. The total marks (100) and question count (20) are unrealistic for a 60-minute timeframe at this level. |
| Legacy generator | Quiz | 3-0 | History | essay-explanation | 9.1 | Yes | 9.5 | 10.0 | 9.0 | 10.0 | 9.0 | - | 9.5 | 8.5 | 7.0 | - | - | appropriate | High quality content. Syllabus alignment is excellent. Major issue: The quiz is far too long for a 60-minute timeframe; 20 questions including 8 intensive source-based questions is unrealistic. Also, Source B and D rely on visual descriptions rather than actual images, which is a placeholder issue. |
| Legacy generator | Quiz | 3-0 | History | singapore-southeast-asia | 8.2 | Yes | 9.5 | 10.0 | 8.5 | 10.0 | 9.0 | - | 9.0 | 6.0 | 4.0 | - | - | uneven | The quiz is extremely long for a 60-minute paper; 20 structured response questions plus source-based questions is unrealistic for Sec 2. Content is highly syllabus-aligned. Source B is described but the image is missing. Answer key is high quality but truncated. |
| Legacy generator | Quiz | 3-0 | History | source-based-skills | 9.1 | Yes | 9.5 | 10.0 | 9.0 | 8.0 | 9.0 | - | 9.5 | 9.0 | 8.5 | - | - | appropriate | High quality quiz that aligns well with Sec 2 History syllabus and source-based skill requirements. Major issue: Source B is described as a political cartoon but the image is missing, making the question impossible to answer visually. The answer key is truncated at the end. Timing is slightly tight for 20 high-level source-based questions in 60 minutes. |
| Legacy generator | Quiz | 3-0 | Mathematics | algebra-functions | 8.7 | No | 10.0 | 9.5 | 8.0 | 10.0 | 9.0 | 10.0 | 8.5 | 7.0 | 6.0 | - | - | uneven | The quiz contains a high volume of questions (20) for a 45-minute duration, making it unrealistic for most students. Difficulty is uneven; it mixes very basic Sec 1 level algebra with more complex Sec 2 topics like inverse proportion and simultaneous equations. The answer key for Q17 contains internal self-correction notes which should be cleaned up. |
| Legacy generator | Quiz | 3-0 | Mathematics | calculus | 6.8 | No | 10.0 | 0.0 | 7.0 | 10.0 | 9.0 | 8.0 | 9.0 | 0.0 | 8.0 | - | - | too hard | The artifact fails syllabus adherence completely; calculus (limits, derivatives, product rule) is not in the Sec 2 G3 syllabus. While the content is high-quality for Additional Mathematics or Pre-U, it is inappropriate for the requested level. Notation is generally good but lacks full LaTeX formatting for all mathematical expressions. |
| Legacy generator | Quiz | 3-0 | Mathematics | geometry-trigonometry | 9.2 | Yes | 10.0 | 10.0 | 8.0 | 10.0 | 9.5 | 7.0 | 9.0 | 9.0 | 10.0 | - | - | appropriate | The quiz is well-structured and follows the Sec 2 Geometry/Trigonometry syllabus accurately. However, several questions (e.g., Q18, Q19) explicitly refer to 'the diagram below' or 'the diagram', but no images or placeholders are provided, making them impossible to solve as intended. LaTeX usage is inconsistent; some mathematical expressions use plain text instead of proper LaTeX formatting. |
| Legacy generator | Quiz | 3-0 | Mathematics | graphs-coordinate-geometry | 7.7 | Yes | 10.0 | 6.0 | 8.0 | 10.0 | 10.0 | 9.0 | 9.0 | 3.0 | 4.0 | - | - | too hard | Major syllabus misalignment: The quiz includes quadratic vertices, parabolas, completing the square, circles, and transformations, which are Secondary 3/4 topics, not Secondary 2. The question count (20) is too high for a 45-minute window given the complexity. Missing diagrams for questions 11 and 15. |
| Legacy generator | Quiz | 3-0 | Mathematics | numbers-ratio-proportion | 9.6 | No | 10.0 | 10.0 | 9.0 | 10.0 | 10.0 | 10.0 | 9.0 | 9.5 | 9.0 | - | - | appropriate | High quality quiz. Content aligns perfectly with Sec 2 G3 syllabus for ratio and proportion. Includes varied question types from basic to complex applications. Answer key provides excellent step-by-step working and mark schemes. Minor note: Section D questions are quite dense for a 45-minute window, but generally manageable. |
| Legacy generator | Quiz | 3-0 | Mathematics | statistics-probability | 8.9 | No | 10.0 | 10.0 | 8.0 | 10.0 | 10.0 | 5.0 | 9.0 | 8.5 | 10.0 | - | - | appropriate | Content is syllabus-accurate and well-structured. The answer key provides excellent step-by-step working. Major issue: mathematical expressions (e.g., fractions and probabilities) are written in plain text rather than LaTeX, which is the standard formatting for Singapore math papers. Difficulty is well-balanced for Sec 2. |
| Legacy generator | Quiz | 3-0 | Science | chemistry-materials | 9.3 | Yes | 10.0 | 10.0 | 9.0 | 10.0 | 9.0 | 8.0 | 9.0 | 9.0 | 10.0 | - | - | appropriate | High quality quiz. Adheres well to G3 syllabus. Major issue: Question 14 requires diagrams for particle arrangements which are missing. Notation for chemical equations is plain text rather than LaTeX. Answer key is excellent with clear marking schemes. |
| Legacy generator | Quiz | 3-0 | Science | life-sciences | 8.9 | Yes | 9.5 | 9.0 | 8.5 | 10.0 | 9.0 | 10.0 | 9.0 | 6.0 | 9.0 | - | - | too easy | The quiz is significantly too easy for Secondary 2 G3 level; most questions are basic recall/definition rather than application or data interpretation. A major diagram placeholder is present in Section C. Answer key is high quality with clear marking schemes. |
| Legacy generator | Quiz | 3-0 | Science | physical-sciences | 9.4 | Yes | 10.0 | 10.0 | 9.0 | 10.0 | 10.0 | 8.0 | 9.0 | 9.0 | 10.0 | - | - | appropriate | High quality quiz. Adheres well to G3 syllabus for Sec 2. Includes necessary calculations for energy, work, and electricity. Missing diagrams for pendulum and circuit are noted via placeholders. Notation is mostly good, though some chemical formulas could use more consistent LaTeX formatting. |
| Legacy generator | Quiz | 3-0 | Science | scientific-inquiry | 9.6 | No | 10.0 | 10.0 | 9.0 | 10.0 | 9.0 | 10.0 | 9.0 | 9.0 | 10.0 | - | - | appropriate | High quality quiz. Content aligns perfectly with G3 Scientific Inquiry syllabus (variables, errors, density, reaction rates). Question structure and marking scheme are professional. Minor note: Question 11 answer key is slightly ambiguous regarding the comparison logic, but the content is solid. |
Criteria
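Each criterion is scored out of 10.0, with 10.0 as the best fit. A dash (-) in the detailed rows marks a criterion that was not scored for that artifact. Missing images are tracked as a yes/no flag.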
Language suitability
Syllabus adherence
Past-paper template adherence
No weird artefacts/symbols
Step-by-step answers
LaTeX/notation format
Exam paper format
Difficulty appropriateness
Doable within timeframe
Cheatsheet 3-point summaries
Parent guide syllabus fit
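The detailed rows above can be checked against these criteria with a short script. The sketch below is a minimal Python illustration, not the Stage 9-0 scoring algorithm: it assumes the eleven score columns of each row follow the order of the criteria listed here, that a dash means "not scored", and that the overall value is the unweighted mean of the scored criteria. That last assumption matches the rows shown in this view but is not confirmed by the source. The names `parse_row`, `mean_of_scored`, and `needs_review` are hypothetical helpers introduced for this example.

```python
from statistics import mean

# Criteria in the order listed under "Criteria"; assumed to match the
# order of the eleven score columns in each detailed row.
CRITERIA = [
    "Language suitability",
    "Syllabus adherence",
    "Past-paper template adherence",
    "No weird artefacts/symbols",
    "Step-by-step answers",
    "LaTeX/notation format",
    "Exam paper format",
    "Difficulty appropriateness",
    "Doable within timeframe",
    "Cheatsheet 3-point summaries",
    "Parent guide syllabus fit",
]

REVIEW_THRESHOLD = 8.0  # from the "Needs Review: Scores Below 8.0" heading


def parse_row(row: str) -> dict:
    """Split one pipe-delimited detail row into named fields."""
    # Limit the split so the free-text comment stays in one piece.
    cells = [c.strip() for c in row.strip().strip("|").split("|", 19)]
    scores = {
        name: (None if raw == "-" else float(raw))
        for name, raw in zip(CRITERIA, cells[7:18])
    }
    return {
        "model": cells[0],
        "type": cells[1],
        "tag": cells[2],  # e.g. "3-0"; its meaning is not stated in the view
        "subject": cells[3],
        "topic": cells[4],
        "overall": float(cells[5]),
        "missing_images": cells[6] == "Yes",
        "scores": scores,
        "difficulty": cells[18],
        "comment": cells[19],
    }


def mean_of_scored(scores: dict) -> float:
    """Unweighted mean over criteria that carry a score (dashes skipped)."""
    return mean(v for v in scores.values() if v is not None)


def needs_review(record: dict) -> bool:
    """True when the reported overall score is below the review threshold."""
    return record["overall"] < REVIEW_THRESHOLD


if __name__ == "__main__":
    example = (
        "| Legacy generator | Quiz | 3-0 | History | ancient-civilisations "
        "| 6.9 | Yes | 9.0 | 3.0 | 7.0 | 10.0 | 8.0 | 10.0 | 6.0 | 4.0 | 5.0 "
        "| - | - | uneven | Major syllabus misalignment. |"
    )
    record = parse_row(example)
    print(record["topic"], round(mean_of_scored(record["scores"]), 1),
          needs_review(record))
```

Comparing `mean_of_scored` with the reported Overall column is a quick way to spot rows where a score was mistyped or a criterion was dropped.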