
Primary 5 Science Semestral Assessment 2 (End of Year) Paper 5

Free exam-derived Primary 5 Science Semestral Assessment 2 (End of Year) Paper 5 practice paper, generated by NVIDIA Nemotron 3 Ultra 550B A55B Free, with questions and answers for Singapore students.

These static practice materials are generated from the site's syllabus and paper-generation workflow, with source and model context shown so students and parents can evaluate the material before use.

Primary 5 Science | From Real Exams | Generated by NVIDIA Nemotron 3 Ultra 550B A55B Free | Updated 2026-06-07

Questions

<!-- TuitionGoWhere generation metadata: stage=3-1; model=nvidia/nemotron-3-ultra-550b-a55b:free; model_label=NVIDIA Nemotron 3 Ultra 550B A55B Free; generated=2026-06-05; Sources: Stage 2-1 real exam-derived templates and Stage 2-2 exam-enriched syllabus. -->

Stage 3: Comprehensive Assessment

Instructions: Answer all questions. Questions 1 to 7 are worth 10 points each; Questions 8 and 9 are worth 15 points each. Total: 100 points. Time limit: 90 minutes.


Section A: Multiple Choice (30 points)

Question 1 (10 points)

Which of the following best describes the primary purpose of a confusion matrix in machine learning evaluation?

A) To visualize the training loss over epochs
B) To summarize the performance of a classification model by showing true positives, false positives, true negatives, and false negatives
C) To calculate the mean squared error of a regression model
D) To determine the optimal learning rate for gradient descent

Question 2 (10 points)

In the context of cross-validation, what is the main advantage of k-fold cross-validation over a simple train-test split?

A) It reduces the computational cost of training
B) It provides a more robust estimate of model performance by using multiple train-test splits
C) It eliminates the need for a validation set
D) It guarantees the model will not overfit

Question 3 (10 points)

Which regularization technique adds a penalty proportional to the sum of the absolute values of the coefficients?

A) Ridge Regression (L2)
B) Lasso Regression (L1)
C) Elastic Net
D) Dropout


Section B: Short Answer (40 points)

Question 4 (10 points)

Explain the bias-variance tradeoff in your own words. Provide an example of a high-bias model and a high-variance model.

Question 5 (10 points)

Describe the difference between batch gradient descent, stochastic gradient descent (SGD), and mini-batch gradient descent. What are the pros and cons of each?

Question 6 (10 points)

What is feature scaling, and why is it important for algorithms like k-Nearest Neighbors (k-NN) and Support Vector Machines (SVM)? Name two common feature scaling techniques.

Question 7 (10 points)

Define precision, recall, and F1-score. When would you prioritize precision over recall? Give a real-world example.


Section C: Practical Application (30 points)

Question 8 (15 points)

You are building a model to detect fraudulent credit card transactions. The dataset is highly imbalanced (99.8% legitimate, 0.2% fraud).

a) Why is accuracy a misleading metric here?
b) Which evaluation metrics would you use instead? Justify your choices.
c) List three techniques to handle class imbalance during training.

Question 9 (15 points)

Given the following confusion matrix for a binary classification model:

|                 | Predicted Positive | Predicted Negative |
|-----------------|--------------------|--------------------|
| Actual Positive | 85                 | 15                 |
| Actual Negative | 10                 | 90                 |

Calculate:

  • Accuracy
  • Precision
  • Recall (Sensitivity)
  • Specificity
  • F1-Score

Show all work.


End of Exam

Answers


Stage 3: Comprehensive Assessment - Answer Key


Section A: Multiple Choice

Question 1

Answer: B
A confusion matrix summarizes classification model performance by displaying true positives, false positives, true negatives, and false negatives.

Question 2

Answer: B
k-fold cross-validation provides a more robust performance estimate by averaging results across k different train-test splits, reducing variance in the evaluation.
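The splitting step can be sketched in plain Python. The helper name `k_fold_indices` and the toy sizes are illustrative assumptions, not part of the original answer; in practice a library routine such as scikit-learn's `KFold` would normally do this job.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds.

    Hypothetical helper for illustration; folds are contiguous and unshuffled.
    """
    indices = list(range(n_samples))
    # Spread any remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_number}: test={test}, train size={len(train)}")
```

Every sample lands in the test set exactly once, and the final score is the average of the k per-fold scores.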

Question 3

Answer: B
Lasso Regression (L1) adds a penalty proportional to the sum of the absolute values of the coefficients, which can drive some coefficients to exactly zero and so acts as a form of feature selection.
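One way to see why L1 produces exact zeros is the special case of orthonormal features, where each penalized coefficient is a simple function of the unpenalized least-squares coefficient. The sketch below assumes that special case, with the squared-error loss scaled by 1/2 and the ridge penalty scaled by λ/2; the function names and coefficient values are made up for illustration.

```python
def lasso_shrink(coefficient, penalty):
    """Soft-thresholding: move toward zero by `penalty`, stopping at zero."""
    magnitude = abs(coefficient) - penalty
    if magnitude <= 0:
        return 0.0  # small coefficients are set exactly to zero
    return magnitude if coefficient > 0 else -magnitude


def ridge_shrink(coefficient, penalty):
    """Proportional shrinkage: scales the coefficient but never reaches zero."""
    return coefficient / (1.0 + penalty)


ols_coefficients = [3.0, -0.4, 0.2, -2.5]  # made-up values for illustration
penalty = 0.5

lasso = [lasso_shrink(c, penalty) for c in ols_coefficients]
ridge = [ridge_shrink(c, penalty) for c in ols_coefficients]
print("lasso:", lasso)
print("ridge:", ridge)
```

Under these assumptions the two small coefficients become exactly zero with L1, while L2 only scales them down.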


Section B: Short Answer

Question 4: Bias-Variance Tradeoff

Bias is error from overly simplistic assumptions (underfitting). Variance is error from sensitivity to training data fluctuations (overfitting). The tradeoff: reducing one often increases the other.

  • High-bias example: Linear regression on highly non-linear data (e.g., fitting a line to a parabola).
  • High-variance example: Deep decision tree with no depth limit on noisy data (memorizes training set).
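The two failure modes can be reproduced on synthetic data. This is a minimal sketch under stated assumptions: the data are noisy samples from a parabola, the high-bias model is a least-squares line, and a 1-nearest-neighbor predictor stands in for the unpruned decision tree because it memorizes the training set in the same way. All names are illustrative.

```python
import random

random.seed(0)


def make_data(xs):
    """Noisy samples from a parabola y = x^2 (synthetic, for illustration)."""
    return [(x, x * x + random.gauss(0, 1.0)) for x in xs]


def fit_line(points):
    """Ordinary least-squares line; returns a prediction function."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return lambda x: mean_y + slope * (x - mean_x)


def fit_one_nearest_neighbor(points):
    """Memorizes the training set; predicts the y of the closest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(model, points):
    """Mean squared error of `model` on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


train = make_data([i * 0.5 - 5 for i in range(21)])     # x = -5.0, -4.5, ..., 5.0
test = make_data([i * 0.5 - 4.75 for i in range(20)])   # x values between the training points

line = fit_line(train)
memorizer = fit_one_nearest_neighbor(train)

for name, model in [("line (high bias)", line), ("1-NN (high variance)", memorizer)]:
    print(f"{name}: train MSE={mse(model, train):.2f}, test MSE={mse(model, test):.2f}")
```

The line has a large error on both sets (underfitting), while the memorizer has zero training error but a nonzero test error (overfitting).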

Question 5: Gradient Descent Variants

| Method | Description | Pros | Cons |
|--------|-------------|------|------|
| Batch GD | Uses entire dataset per update | Stable convergence; exact gradient | Slow for large datasets; memory intensive |
| SGD | Uses one sample per update | Fast updates; noise can help escape shallow local minima; supports online learning | Noisy path; may not converge exactly; typically needs learning rate decay |
| Mini-batch GD | Uses small batch (e.g., 32–512) | Balanced speed/stability; vectorized; GPU-friendly | Batch size is an extra hyperparameter to tune |
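The three variants differ only in how many samples feed each update, which a single loop with a `batch_size` parameter can show. This is a sketch on synthetic, noise-free data; the function name, learning rate, and epoch count are assumptions chosen for the demo.

```python
import random


def gradient_descent(data, batch_size, learning_rate=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by minimizing mean squared error.

    batch_size == len(data) gives batch GD, 1 gives SGD,
    and anything in between gives mini-batch GD.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over this batch only.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Synthetic, noise-free data on the line y = 3x + 1 (an assumption for the demo).
points = [(x / 10, 3 * (x / 10) + 1) for x in range(-10, 11)]

for label, size in [("batch", len(points)), ("mini-batch", 4), ("SGD", 1)]:
    w, b = gradient_descent(points, batch_size=size)
    print(f"{label:>10}: w={w:.3f}, b={b:.3f}")
```

With 21 samples, one epoch performs 1 update for batch GD, 6 for mini-batch GD with a batch size of 4, and 21 for SGD. Because the data here are noise-free, all three settle near w = 3 and b = 1; on noisy data the SGD path would fluctuate more.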

Question 6: Feature Scaling

Feature scaling puts features on comparable ranges so that no single feature dominates because of its units or scale.

Importance for k-NN & SVM: k-NN classifies by distance between points (typically Euclidean), and an SVM's margin is likewise measured in feature space, so both are sensitive to feature magnitudes. An unscaled feature with a large range distorts the distances and the margin.

Two techniques:

  1. Standardization (Z-score): \( x' = \frac{x - \mu}{\sigma} \), giving zero mean and unit variance.
  2. Min-Max Normalization: \( x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \), which scales values to [0, 1].
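Both formulas can be applied by hand with the standard library. The sketch below uses the population standard deviation for σ and a made-up income feature; both helpers assume the feature is not constant, since the denominator would otherwise be zero.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std dev."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


incomes = [20_000, 35_000, 50_000, 65_000, 80_000]  # made-up feature values

z_scores = standardize(incomes)
scaled = min_max_scale(incomes)
print("z-scores:", [round(z, 3) for z in z_scores])
print("min-max: ", scaled)
```

In a real pipeline the mean, standard deviation, minimum, and maximum should be computed on the training data only and then reused to transform the test data.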

Question 7: Precision, Recall, F1-Score

  • Precision = TP / (TP + FP) — of predicted positives, how many are actually positive?
  • Recall (Sensitivity) = TP / (TP + FN) — of actual positives, how many were correctly predicted?
  • F1-Score = 2 × (Precision × Recall) / (Precision + Recall) — harmonic mean.

Prioritize precision over recall when false positives are costly.

Example: Spam detection — better to let some spam through (low recall) than to mark important emails as spam (high precision required).
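The trade-off in the spam example can be seen by moving the decision threshold on a classifier's scores. The scores and labels below are invented for illustration, and `precision_recall` is a hypothetical helper rather than a library function.

```python
def precision_recall(scored_emails, threshold):
    """Return (precision, recall) when scores >= threshold are flagged as spam."""
    tp = sum(1 for score, is_spam in scored_emails if score >= threshold and is_spam)
    fp = sum(1 for score, is_spam in scored_emails if score >= threshold and not is_spam)
    fn = sum(1 for score, is_spam in scored_emails if score < threshold and is_spam)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# (spam score from some classifier, whether the email really is spam); made-up data
emails = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, True),
    (0.40, False), (0.30, True), (0.20, False), (0.10, False), (0.05, False),
]

for threshold in (0.25, 0.50, 0.85):
    precision, recall = precision_recall(emails, threshold)
    print(f"threshold={threshold:.2f}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.85 lifts precision from about 0.71 to 1.00 while recall falls from 1.00 to 0.40, which is the trade a spam filter accepts to avoid flagging important emails.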


Section C: Practical Application

Question 8: Fraud Detection (Imbalanced Data)

a) Why accuracy is misleading:
A model predicting "legitimate" for all transactions achieves 99.8% accuracy but detects zero fraud. Accuracy ignores class distribution.
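A quick check of this claim, assuming a synthetic set of 10,000 transactions with the same 99.8% / 0.2% split as the question:

```python
# 10,000 synthetic transactions with the 99.8% / 0.2% split from the question.
labels = [1] * 20 + [0] * 9_980        # 1 = fraud, 0 = legitimate
predictions = [0] * len(labels)        # baseline: always predict "legitimate"

correct = sum(1 for y, p in zip(labels, predictions) if y == p)
caught_fraud = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)

accuracy = correct / len(labels)
recall = caught_fraud / sum(labels)

print(f"accuracy = {accuracy:.1%}")
print(f"recall   = {recall:.1%}")
```

The baseline scores 99.8% accuracy with 0% recall on the fraud class.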

b) Better metrics:

  • Precision — minimize false alarms (legitimate transactions flagged as fraud).
  • Recall — catch as many fraud cases as possible.
  • F1-Score — balance precision/recall.
  • AUC-ROC / AUC-PR: evaluate ranking performance across thresholds; AUC-PR is usually preferred under severe imbalance.

c) Techniques for class imbalance (any three of the following):

  1. Resampling: Oversample minority (SMOTE) or undersample majority.
  2. Class weights: Assign higher loss weight to minority class (e.g., class_weight='balanced' in sklearn).
  3. Anomaly detection: Treat fraud as outlier detection (Isolation Forest, One-Class SVM).
  4. Threshold tuning: Lower decision threshold to increase recall.
  5. Ensemble methods: balanced random forests or EasyEnsemble (both available in the imbalanced-learn library).
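As an illustration of technique 2, the sketch below computes class weights by hand using the heuristic that scikit-learn documents for `class_weight='balanced'`, namely n_samples / (n_classes × class count). The helper name and the synthetic labels are assumptions for the demo.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


labels = [0] * 9_980 + [1] * 20   # same 99.8% / 0.2% split as the question
weights = balanced_class_weights(labels)
print(weights)
```

Each fraud example is weighted 250, against roughly 0.5 for a legitimate one, so both classes contribute the same total weight to the loss.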

Question 9: Confusion Matrix Calculations

Given:

  • TP = 85
  • FN = 15
  • FP = 10
  • TN = 90
  • Total = 200

Accuracy = (TP + TN) / Total = (85 + 90) / 200 = 175 / 200 = 0.875 (87.5%)

Precision = TP / (TP + FP) = 85 / (85 + 10) = 85 / 95 ≈ 0.8947 (89.47%)

Recall (Sensitivity) = TP / (TP + FN) = 85 / (85 + 15) = 85 / 100 = 0.85 (85%)

Specificity = TN / (TN + FP) = 90 / (90 + 10) = 90 / 100 = 0.90 (90%)

F1-Score = 2 × (Precision × Recall) / (Precision + Recall)
= 2 × (0.8947 × 0.85) / (0.8947 + 0.85)
= 2 × 0.7605 / 1.7447
≈ 0.8718 (87.18%)
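The arithmetic above can be checked with a few lines of Python; the variable names are illustrative.

```python
tp, fn, fp, tn = 85, 15, 10, 90

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

for name, value in [
    ("Accuracy", accuracy),
    ("Precision", precision),
    ("Recall", recall),
    ("Specificity", specificity),
    ("F1-Score", f1),
]:
    print(f"{name:<12} {value:.4f}")
```

The F1-Score can also be written directly in terms of counts as 2TP / (2TP + FP + FN) = 170 / 195, which gives the same value without intermediate rounding.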


End of Answer Key