Sampling variability, reliability, validity and uncertainty for forensic text comparison: From likelihood ratio to Bayes factor
In the likelihood ratio (LR) framework, the task of the forensic scientist is to provide an estimate of the weight of evidence to the court. When the evidential weight is quantified as an LR, an astute lawyer may ask whether the LR value would be similar if a different set of samples from the relevant population had been used to estimate it. This is the issue of sampling variability, and it concerns the reliability (synonymous with precision), as opposed to the validity (synonymous with accuracy), of the source-identification system. It further raises concerns about uncertainty in the expert's opinion.
There are two main approaches to this issue. One is to measure the (un)reliability and report it; several metrics have been devised for this purpose. The other, based on Bayesian logic, is to incorporate the degree of unreliability (or uncertainty) into the calculation itself, in which case the output is a Bayes factor (BF), not an LR. In this study, authors are randomly sampled from a large database to quantify the fluctuation in system performance as a function of the number of authors included in the database. System performance based on BFs is then compared with system performance based on LRs in the light of the degree of (un)reliability, and the implications arising from the experimental results are discussed. Word unigrams are used to model each document. Scores are calculated with a system built on a Dirichlet-Multinomial model, and they are subsequently converted to LRs via logistic regression and to BFs via a Bayesian model. System performance is assessed with respect to the log-LR cost (Cllr). The derived LRs and BFs are charted graphically.
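To illustrate two of the steps named above (score-to-LR calibration via logistic regression and assessment with the log-LR cost), the following Python sketch uses synthetic scores and scikit-learn. It is not the author's system: the score distributions, sample sizes, and use of sklearn's LogisticRegression are assumptions made purely for illustration.

# Minimal sketch (synthetic data, not the actual forensic system) of
# logistic-regression calibration of comparison scores to log LRs,
# followed by evaluation with the log-LR cost (Cllr).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical comparison scores: same-author pairs tend to score higher.
same_author_scores = rng.normal(loc=2.0, scale=1.0, size=200)
diff_author_scores = rng.normal(loc=-1.0, scale=1.0, size=200)

scores = np.concatenate([same_author_scores, diff_author_scores]).reshape(-1, 1)
labels = np.concatenate([np.ones(200), np.zeros(200)])  # 1 = same author

# Logistic regression gives log posterior odds as a linear function of the
# score; subtracting the log prior odds of the training data yields a log LR.
clf = LogisticRegression().fit(scores, labels)
log_prior_odds = np.log(labels.mean() / (1 - labels.mean()))
log_lr = (clf.coef_[0, 0] * scores.ravel() + clf.intercept_[0]) - log_prior_odds

lr_same = np.exp(log_lr[labels == 1])
lr_diff = np.exp(log_lr[labels == 0])

# Log-LR cost (Cllr): penalises misleading LRs in proportion to their strength.
cllr = 0.5 * (np.mean(np.log2(1 + 1 / lr_same)) + np.mean(np.log2(1 + lr_diff)))
print(f"Cllr = {cllr:.3f}")

A well-calibrated system yields a Cllr well below 1; repeating such an experiment with different random samples of authors is one way to visualise the sampling variability discussed in the abstract.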
Speaker
Shunichi Ishihara
Associate Professor in the Department of Linguistics, ANU School of Culture, History & Language