Dimensionality Reduction & Feature Selection

The session Dimensionality Reduction & Feature Selection will be held on Thursday, 2019-09-19, from 16:20 to 18:00, in room 1.011. The session chair is Jefrey Lijffijt.


17:00 - 17:20
Joint Multi-Source Reduction (102)
Lei Zhang (Institute of Information Engineering, Chinese Academy of Sciences), Shupeng Wang (Institute of Information Engineering, Chinese Academy of Sciences), Xin Jin (National Computer Network Emergency Response Technical Team/Coordination Center of China), Siyu Jia (Institute of Information Engineering, Chinese Academy of Sciences)

The redundant sources problem in multi-source learning arises in various real-world applications such as multimedia analysis, information retrieval, and medical diagnosis, in which the heterogeneous representations from different sources always have three-way redundancies. More seriously, the redundancies cost a lot of storage space, cause high computational time, and degrade the performance of the learner. This paper is an attempt to jointly reduce redundant sources. Specifically, a novel Heterogeneous Manifold Smoothness Learning (HMSL) model is proposed to linearly map multi-source data to a low-dimensional feature-isomorphic space, in which the information-correlated representations are close along the manifold while the semantic-complementary instances are close in Euclidean distance. Furthermore, to eliminate three-way redundancies, we present a new Correlation-based Multi-source Redundancy Reduction (CMRR) method with ℓ2,1-norm equation and generalized elementary transformation constraints to reduce redundant sources in the learned feature-isomorphic space. Comprehensive empirical investigations are presented that confirm the promise of our proposed framework.

16:20 - 16:40
Interpretable Discriminative Dimensionality Reduction and Feature Selection on the Manifold (210)
Babak Hosseini (Bielefeld University), Barbara Hammer (Bielefeld University)

Dimensionality reduction (DR) on the manifold includes effective methods which project the data from an implicit relational space onto a vectorial space. Regardless of the achievements in this area, these algorithms suffer from the lack of interpretation of the projection dimensions. Therefore, it is often difficult to explain the physical meaning behind the embedding dimensions. In this research, we propose the interpretable kernel DR algorithm (I-KDR) as a new algorithm which maps the data from the feature space to a lower dimensional space where the classes are more condensed with less overlapping. Besides, the algorithm creates the dimensions upon local contributions of the data samples, which makes it easier to interpret them by class labels. Additionally, we efficiently fuse the DR with feature selection task to select the most relevant features of the original space to the discriminative objective. Based on the empirical evidence, I-KDR provides better interpretations for embedding dimensions as well as higher discriminative performance in the embedded space compared to the state-of-the-art and popular DR algorithms.

Reproducible Research
16:40 - 17:00
On the Stability of Feature Selection in the Presence of Feature Correlations (211)
Konstantinos Sechidis (University of Manchester), Konstantinos Papangelou (University of Manchester), Sarah Nogueira (Criteo, Paris), James Weatherall (Advanced Analytics Centre, Global Medicines Development, AstraZeneca, Cambridge), Gavin Brown (University of Manchester)

Feature selection is central to modern data science. The 'stability' of a feature selection algorithm refers to the sensitivity of its choices to small changes in training data. This is, in effect, the robustness of the chosen features. This paper considers the estimation of stability when we expect strong pairwise correlations, otherwise known as feature redundancy. We demonstrate that existing measures are inappropriate here, as they systematically underestimate the true stability, giving an overly pessimistic view of a feature set. We propose a new statistical measure which overcomes this issue, and generalises previous work.
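
As a rough illustration of the effect described in this abstract, the sketch below scores stability as the mean pairwise Jaccard similarity between the feature subsets chosen on different resamples. This is a common baseline chosen here for illustration, not the measure proposed in the paper. The function names, the example subsets, and the `group` mapping of correlated features are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of feature indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_stability(subsets):
    """Average Jaccard similarity over all pairs of selected subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections from three resampled training sets.
# Assume features 0 and 1 are near-duplicates, so choosing either
# one carries the same information.
runs = [{0, 2}, {1, 2}, {0, 2}]
score = mean_pairwise_stability(runs)

# Hypothetical mapping that collapses the correlated pair into one group
# before comparing, so swapping 0 for 1 no longer counts as a change.
group = {0: 0, 1: 0, 2: 2}
grouped_runs = [{group[f] for f in run} for run in runs]
grouped_score = mean_pairwise_stability(grouped_runs)

print(f"index-based stability: {score:.3f}")
print(f"group-aware stability: {grouped_score:.3f}")
```

The index-based score treats a swap between the two correlated features as a disagreement, so it comes out lower than the group-aware score, which is one simple way to see why a measure that ignores correlations can look overly pessimistic.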

Reproducible Research
17:20 - 17:40
Efficient Feature Selection Using Shrinkage Estimators (J31)
Konstantinos Sechidis, Laura Azzimonti, Adam Pocock, Giorgio Corani, James Weatherall, Gavin Brown
