The session Ranking will be held on tuesday, 2019-09-17, from 14:00 to 16:00, at room 1.011. The session chair is Eyke Hüllermeier.


15:00 - 15:20
A Reduction of Label Ranking to Multiclass Classification (385)
Klaus Brinker (Hamm-Lippstadt University of Applied Sciences), Eyke Hüllermeier (Paderborn University)

Label ranking considers the problem of learning a mapping from instances to strict total orders over a predefined set of labels. In this paper, we present a framework for label ranking using a decomposition into a set of multiclass problems. Conceptually, our approach can be seen as a generalization of pairwise preference learning. In contrast to the latter, it allows for controlling the granularity of the decomposition, varying between binary preferences and complete rankings as extreme cases. It is specifically motivated by limitations of pairwise learning with regard to the minimization of certain loss functions. We discuss theoretical properties of the proposed method in terms of accuracy, error correction, and computational complexity. Experimental results are promising and indicate that improvements upon the special case of pairwise preference decomposition are indeed possible.

14:20 - 14:40
Learning to Calibrate and Rerank Multi-label Predictions (391)
Cheng Li (Northeastern University), Virgil Pavlu (Northeastern University), Javed Aslam (Northeastern University), Bingyu Wang (Northeastern University), Kechen Qin (Northeastern University)

A multi-label classifier assigns a set of labels to each data object. A natural requirement in many end-use applications is that the classifier also provides a well-calibrated confidence (probability) to indicate the likelihood of the predicted set being correct; for example, an application may automate high-confidence predictions while manually verifying low-confidence predictions. The simplest multi-label classifier, called Binary Relevance (BR), applies one binary classifier to each label independently and takes the product of the individual label probabilities as the overall label-set probability (confidence). Despite its many known drawbacks, such as generating suboptimal predictions and poorly calibrated confidence scores, BR is widely used in practice due to its speed and simplicity. We seek in this work to improve both BR's confidence estimation and prediction through a post calibration and reranking procedure. We take the BR predicted set of labels and its product score as features, extract more features from the prediction itself to capture label constraints, and apply Gradient Boosted Trees (GB) as a calibrator to map these features into a calibrated confidence score. GB not only produces well-calibrated scores (aligned with accuracy and sharp), but also models label interactions, correcting a critical flaw in BR. We further show that reranking label sets by the new calibrated confidence makes accurate set predictions on par with state-of-the-art multi-label classifiers - yet calibrated, simpler, and faster.

Reproducible Research
14:00 - 14:20
Pairwise Learning to Rank by Neural Networks Revisited: Reconstruction, Theoretical Analysis and Practical Performance (400)
Marius Köppel (Johannes Gutenberg-Universität Mainz), Alexander Segner (Johannes Gutenberg-Universität Mainz), Martin Wagener (Johannes Gutenberg-Universität Mainz), Lukas Pensel (Johannes Gutenberg-Universität Mainz), Andreas Karwath (University of Birmingham), Stefan Kramer (Johannes Gutenberg-Universität Mainz)

We present a pairwise learning to rank approach based on a neural net, called DirectRanker, that generalizes the RankNet architecture. We show mathematically that our model is reflexive, antisymmetric, and transitive allowing for simplified training and improved performance. Experimental results on the LETOR MSLR-WEB10K, MQ2007 and MQ2008 datasets show that our model outperforms numerous state-of-the-art methods, while being inherently simpler in structure and using a pairwise approach only.

Reproducible Research
15:20 - 15:40
Sequential Learning over Implicit Feedback for Robust Large-Scale Recommender Systems (820)
Aleksandra Burashnikova (Skolkovo Institute of Science and Technology), Yury Maximov (Theoretical Division T-5 and CNLS, Los Alamos National Laboratory), Massih-Reza Amini (Université Grenoble Alpes)

In this paper, we propose a theoretically founded sequentialstrategy for training large-scale Recommender Systems (RS) over implicitfeedback mainly in the form of clicks. The proposed approach consistsin minimizing pairwise ranking loss over blocks of consecutive itemsconstituted by a sequence of non-clicked items followed by a clicked onefor each user. Parameter updates are discarded if for a given user thenumber of sequential blocks is below or above some given thresholdsestimated over the distribution of the number of blocks in the training set.This is to prevent from updating the parameters for an abnormally highnumber of clicks over some targeted items, mainly due to bots; or very fewuser interactions. Both scenarios affect the decision of RS and imply a shiftover the distribution of items that are shown to the users. We provide aproof of convergence of the algorithm to the minimizer of the ranking loss,in the case where the latter is convex. Furthermore, experimental resultson five large-scale collections demonstrate the efficiency of the proposedalgorithm concerning the state-of-the-art approaches, both regardingdifferent ranking measures and computation time.

Reproducible Research
14:40 - 15:00
A Ranking Model Motivated by Nonnegative Matrix Factorization with Applications to Tennis Tournaments (44)
Rui Xia (National University of Singapore), Vincent Y. F. Tan (National University of Singapore), Louis Filstroff (IRIT, Université de Toulouse), Cédric Févotte (IRIT, Université de Toulouse)

We propose a novel ranking model that combines the Bradley-Terry-Luce probability model with a nonnegative matrix factorization framework to model and uncover the presence of latent variables that influence the performance of top tennis players. We derive an efficient, provably convergent, and numerically stable majorization-minimization-based algorithm to maximize the likelihood of datasets under the proposed statistical model. The model is tested on datasets involving the outcomes of matches between 20 top male and female tennis players over 14 major tournaments for men (including the Grand Slams and the ATP Masters 1000) and 16 major tournaments for women over the past 10 years. Our model automatically infers that the surface of the court (e.g., clay or hard court) is a key determinant of the performances of male players, but less so for females. Top players on various surfaces over this longitudinal period are also identified in an objective manner.

Reproducible Research
15:40 - 16:00
Rankboost+: An Improvement to Rankboost (J04)
Harold Connamacher (Case Western Reserve University), Nikil Pancha (Case Western Reserve University), Rui Liu (Case Western Reserve University), Soumya Ray (Case Western Reserve University)

Parallel Sessions