Bayesian modelling and analysis of ranked data

Johnson, Stephen Richard

Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/4504

Title:	Bayesian modelling and analysis of ranked data
Authors:	Johnson, Stephen Richard
Issue Date:	2019
Publisher:	Newcastle University
Abstract:	Ranked data are central to many applications in science and social science and arise when rankers (individuals) use some criterion to order a set of entities. Such rankings are therefore equivalent to permutations of the elements of a set. The majority of models for ranked data rely on a strong assumption of homogeneity, such as all rankers sharing the same view on preferences of the entities. The aim of this thesis is to develop a richer class of models which can reveal any plausible subgroup structure within the data both for rankers and entities. We begin by looking at the Plackett–Luce model, an extension of the Bradley–Terry model for paired comparisons. First, this model is extended to cater for when rankers do not report a full ranking of all entities. For example, they might only report their top five ranked entities after seeing some or all entities. Another issue is that most work in this area assumes that all rankers are equally informed about the entities they are ranking. Often this assumption will be questionable and so we develop a model which allows rankers to have differing reliability. This model, the Weighted Plackett–Luce model, allows for such heterogeneity through a novel two-component mixture model defined by augmenting the Plackett–Luce model. The idea that rankers may be heterogeneous in their beliefs about entities is not new. However, there might be groups of rankers with each group sharing the same view about entities. Generally, the number of such groups will not be known and so we investigate the possibility of such group structure by using a Dirichlet process mixture of Weighted Plackett–Luce models. It can also be useful to assess whether some entities are exchangeable, that is, whether there is also entity clustering within each ranker group, an issue that has received little attention in the literature. We extend the model further to explore both ranker and entity clustering by adapting the Nested Dirichlet process.
The resulting model is a Weighted Adapted Nested Dirichlet (WAND) process mixture of Plackett–Luce models. Posterior inference is conducted via a simple and efficient Gibbs sampling scheme. The richness of information in the posterior distribution allows for inference about many aspects of the clustering structure both between ranker groups and between entity groups (within ranker groups), in contrast to many other (Bayesian) analyses. The methodology is illustrated using several simulation studies and real data examples. Finally, we relax the assumption of a known ranking process underpinning these models by looking at the recently developed Extended Plackett–Luce model. This model allows inference for the order in which a homogeneous set of rankers assign entities to ranks. Analysis of this model is challenging but we have found that using Metropolis-coupled Markov chain Monte Carlo (MC³) methods can provide adequate mixing over the high-dimensional space of all possible permutations when the number of entities is not small.
Description:	PhD Thesis
URI:	http://theses.ncl.ac.uk/jspui/handle/10443/4504
Appears in Collections:	School of Mathematics and Statistics

Files in This Item:

File	Description	Size	Format
Johnson SR 2019.pdf	Thesis	4.24 MB	Adobe PDF
dspacelicence.pdf	Licence	43.82 kB	Adobe PDF
