Here are all the actual test exam dumps for IT exams. Most people prepare for the actual exams with our test dumps to pass their exams. So it's critical to choose and actual test pdf to succeed.

Exam NCA-GENL Topic 2 Question 54 Discussion

Actual exam question for NVIDIA's NCA-GENL exam
Question #: 54
Topic #: 2
When designing an experiment to compare the performance of two LLMs on a question-answering task, which statistical test is most appropriate to determine if the difference in their accuracy is significant, assuming the data follows a normal distribution?

Suggested Answer: B Vote an answer

The paired t-test is the most appropriate statistical test to compare the performance (e.g., accuracy) of two large language models (LLMs) on the same question-answering dataset, assuming the data follows a normal distribution. This test evaluates whether the mean difference in paired observations (e.g., accuracy on each question) is statistically significant. NVIDIA's documentation on model evaluation in NeMo suggests using paired statistical tests for comparing model performance on identical datasets to account for correlated errors.
Option A (Chi-squared test) is for categorical data, not continuous metrics like accuracy. Option C (Mann- Whitney U test) is non-parametric and used for non-normal data. Option D (ANOVA) is for comparing more than two groups, not two models.
References:
NVIDIA NeMo Documentation: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/model_finetuning.html

by Douglas at Mar 30, 2026, 02:57 AM

Comments

Chosen Answer:
This is a voting comment (?) , you can switch to a simple comment.
Switch to a voting comment New
Nick name: Submit Cancel
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.