I am a Laplace Junior Chair in Machine Learning at École Normale Supérieure, Paris, in the group of Stéphane Mallat, Giulio Biroli and Gabriele Peyré. I was previously a postdoctoral fellow at ETH Zurich and finished my PhD in Machine Learning in 2021 at the University of Edinburgh, supervised by Profs Tim Hospedales and Iain Murray.
Research Interest: My core interest is to mathematically understand the mechanisms behind successful machine learning methods, e.g. neural networks, similarly to how physics describes the laws of nature. Recent machine learning models produce incredible results, but what is actually learned is often a mystery, with unexplained failure cases and unknown risks in downstream applications. Scientific curiosity aside, by unpicking these models and the properties of the data they leverage, my research aims at better-performing, more interpretable and more reliable algorithms; and potentially a more fundamental understanding of the natural phenomena behind the data itself.
Current topics: I consider how machine learning models exploit aspects of the data distribution from a probabilistic, often latent variable, perspective. Recent projects include: explaining how VAEs disentangle independent factors of the data, identifying the mathematical model behind recent self-supervised learning, and deriving a probabilistic interpretation of softmax classification.
My PhD (Towards a Theoretical Understanding of Word and Relation Representation) investigated neural representations of discrete objects and their relationships, focusing on word embeddings learned from large text corpora (e.g. word2vec or GloVe); and entity embeddings learned from large collections of facts (“subject-relation-object”) in a knowledge graph. My main result shows how word embeddings can seemingly be added and subtracted, e.g. queen ≈ king - man + woman, which received Best Paper (honourable mention) at ICML 2019.
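The analogy arithmetic above can be illustrated with a minimal sketch: given a handful of word vectors, the query king - man + woman is answered by nearest-neighbour search under cosine similarity. The toy 2-D vectors below are hand-picked assumptions purely for illustration; real word2vec or GloVe embeddings are learned from large corpora.

```python
import numpy as np

# Hand-picked toy 2-D embeddings (illustrative assumptions, not learned vectors).
emb = {
    "king":  np.array([1.0, 1.0]),
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([0.0, 0.0]),
    "queen": np.array([0.0, 1.0]),
}

def nearest(vec, exclude):
    """Return the word whose embedding is closest to `vec` by cosine
    similarity, excluding the query words themselves (standard practice
    when evaluating analogies)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    return max((w for w in emb if w not in exclude),
               key=lambda w: cos(emb[w], vec))

target = emb["king"] - emb["man"] + emb["woman"]
print(nearest(target, exclude={"king", "man", "woman"}))  # -> queen
```

Why such linear offsets emerge in learned embeddings at all is precisely the question the thesis addresses; the sketch only shows how the query is evaluated once embeddings exist.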
During an internship at Samsung AI Centre, Cambridge, I worked at the intersection of representation learning and logical reasoning.
Background: I moved to Artificial Intelligence/Machine Learning research after a number of years in Project Finance. I have a BSc in Mathematics & Chemistry (University of Southampton), an MSc in Mathematics and the Foundations of Computer Science (MFoCS, University of Oxford) and MScs in Artificial Intelligence and Data Science (University of Edinburgh).
Awards & Invited Talks: Aside from the Best Paper (hon. mention), I have been awarded a research grant from the Hasler Foundation and have given several invited talks, including at the Harvard Center of Mathematical Sciences and Applications and AstraZeneca.
Publications
Unpicking Data at the Seams: VAEs, Disentanglement and Independent Components
[paper]
C Allen;
under review, 2024
A Probabilistic Model behind Self-Supervised Learning
[paper]
A Bizeul, B Schölkopf, C Allen;
TMLR, 2024
Variational Classification: A Probabilistic Generalization of the Softmax Classifier
[paper]
S Dhuliawala, M Sachan, C Allen;
TMLR, 2023
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
[paper]
Đ Miladinović, K Shridhar, K Jain, M Paulus, JM Buhmann, C Allen;
NeurIPS, 2022
Adapters for Enhanced Modelling of Multilingual Knowledge and Text
[paper]
Y Hou, W Jiao, M Liu, C Allen, Z Tu, M Sachan;
EMNLP, 2022
Interpreting Knowledge Graph Relation Representation from Word Embeddings
[paper]
C Allen*, I Balažević*, T Hospedales;
ICLR, 2021
Multi-scale Attributed Embedding of Networks
[paper] [github]
B Rozemberczki, C Allen, R Sarkar;
Journal of Complex Networks, 2021
A Probabilistic Model for Discriminative & Neuro-Symbolic Semi-Supervised Learning
[paper]
C Allen, I Balažević, T Hospedales;
2020
What the Vec? Towards Probabilistically Grounded Embeddings
[paper]
C Allen, I Balažević, T Hospedales;
NeurIPS, 2019
Multi-relational Poincaré Graph Embeddings
[paper] [github]
I Balažević, C Allen, T Hospedales;
NeurIPS, 2019
Analogies Explained: Towards Understanding Word Embeddings
[paper]
[blog post]
[slides]
C Allen, T Hospedales;
ICML, 2019 (Best Paper, honorable mention)
TuckER: Tensor Factorization for Knowledge Graph Completion
[paper] [github]
I Balažević, C Allen, T Hospedales;
EMNLP, 2019 (oral)
Hypernetwork Knowledge Graph Embeddings
[paper] [github]
I Balažević, C Allen, T Hospedales;
ICANN, 2019 (oral)
Posts
Disentangling Disentanglement: how VAEs learn Independent Components
'Analogies Explained' ... Explained