Abstract: Assessment is a crucial part of education. Traditional marking is a source of inconsistency and unconscious bias, and it places a high cognitive load on assessors. One approach to addressing these issues is comparative judgement (CJ). In CJ, the assessor is presented with a pair of items and asked to select the better one. After a series of comparisons, a rank order is derived from the results using a ranking model such as the Bradley-Terry model (BTM). While CJ is considered a reliable method for marking, there are concerns around transparency, and the ideal number of pairwise comparisons needed to generate a reliable estimate of the rank order is not known. Additionally, there have been attempts to devise methods that select the next pair to compare in an informative manner, but some existing methods are known to introduce their own bias into the results, inflating the reliability metric used. As a result, random selection is usually deployed. We propose a novel Bayesian approach to CJ (BCJ) for determining the ranks of the compared items, alongside a new way of selecting the pairs to present to the marker(s) using active learning (AL), addressing the key shortcomings of traditional CJ. Furthermore, we demonstrate how the entire approach can provide transparency by giving the user insight into how it makes its decisions, while also being more efficient. Results from our experiments confirm that the proposed BCJ combined with an entropy-driven AL pair-selection method is superior to other alternatives. We also find that the more comparisons are made, the more accurate BCJ becomes, which solves the problem in the current method of the model deteriorating if too many comparisons are performed. As our approach can generate the complete predicted rank distribution for an item, we also show how this can be used to devise a predicted grade, guided by the assessor.
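
To make the idea of entropy-driven pair selection concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes, purely for illustration, that each pair of items carries an independent Beta posterior over the probability that the first item is preferred, starting from a uniform Beta(1, 1) prior, and that the next pair shown to the assessor is the one whose posterior has the highest differential entropy. All function names (`beta_entropy`, `new_posteriors`, `record_comparison`, `select_next_pair`, `win_probability`) are hypothetical.

```python
import math
from itertools import combinations


def harmonic(n):
    """Return H_n = 1 + 1/2 + ... + 1/n, with H_0 = 0."""
    return sum(1.0 / k for k in range(1, n + 1))


def beta_entropy(a, b):
    """Differential entropy of Beta(a, b) for positive integers a and b.

    For integer n, digamma(n) = H_{n-1} - gamma. The gamma terms cancel in
    the entropy formula, so harmonic numbers are enough here.
    """
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (log_beta
            - (a - 1) * harmonic(a - 1)
            - (b - 1) * harmonic(b - 1)
            + (a + b - 2) * harmonic(a + b - 1))


def new_posteriors(n_items):
    """Assumed prior: Beta(1, 1) on P(i beats j) for every pair i < j."""
    return {pair: [1, 1] for pair in combinations(range(n_items), 2)}


def record_comparison(posteriors, winner, loser):
    """Update the Beta counts of the compared pair with one judgement."""
    i, j = min(winner, loser), max(winner, loser)
    posteriors[(i, j)][0 if winner == i else 1] += 1


def select_next_pair(posteriors):
    """Pick the pair whose preference posterior is most uncertain."""
    return max(posteriors, key=lambda pair: beta_entropy(*posteriors[pair]))


def win_probability(posteriors, i, j):
    """Posterior mean of P(i beats j)."""
    lo, hi = min(i, j), max(i, j)
    a, b = posteriors[(lo, hi)]
    p = a / (a + b)
    return p if i == lo else 1.0 - p
```

Under these assumptions, a pair that has never been compared keeps the uniform posterior, which has the largest entropy, so unseen pairs are presented first. Among pairs with the same number of judgements, those with split outcomes remain more uncertain than those with consistent outcomes and are revisited sooner. Because every choice is a function of visible per-pair counts, the selection can be explained to the assessor, which is in the spirit of the transparency goal described above.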

Active Bayesian Assessment for Black-Box Classifiers

Black-Box Batch Active Learning for Regression

Post-train Black-box Defense via Bayesian Boundary Correction

A Bayesian Active Learning Approach to Comparative Judgement

Approximate Bayesian Computation via Classification

Deep Bayesian Active Learning, A Brief Survey on Recent Advances

Uncertainty-aware Evaluation of Machine Learning Performance in binary Classification Tasks

Active Statistical Inference

A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks

Uncertainty Assessment-Based Active Learning for Reliable Fire Detection Systems

Re-Benchmarking Pool-Based Active Learning for Binary Classification

Online Performance Estimation with Unlabeled Data: A Bayesian Application of the Hui-Walter Paradigm

BSM loss: A superior way in modeling aleatory uncertainty of fine_grained classification

Active Learning of Bayesian Linear Models with High-Dimensional Binary Features by Parameter Confidence-Region Estimation

A Bayesian approach for the analysis of error rate studies in forensic science

Active Testing: Sample-Efficient Model Evaluation

BayesNetCNN: incorporating uncertainty in neural networks for image-based classification tasks

Label-Driven Learning Framework: Towards More Accurate Bayesian Network Classifiers Through Discrimination of High-Confidence Labels

Extracting Explanations, Justification, and Uncertainty from Black-Box Deep Neural Networks

Bayesian statistics guided label refurbishment mechanism: Mitigating label noise in medical image classification

Balancing Fairness and Accuracy in Data-Restricted Binary Classification