My research focuses on the area of Topic Modeling and Natural Language Processing. My interests span multiple areas across NLP and machine learning, including Sentiment Analysis, Hate Speech, Word Representation, and Deep Learning.

Here is a selection of projects I've been working on.

Constrained Relational Topic Models

Topic models can automatically extract topics from collections of documents. If two documents are related to each other (e.g. by a citation), they are more likely to talk about similar topics. I designed and developed a semi-supervised topic model that jointly models the relationships in a network of documents and prior knowledge about those documents, expressed in the form of constraints. This work has been accepted by an international journal! My very first paper! Check it out: [Code] [Paper]
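To make the intuition about related documents concrete, here is a minimal sketch of a link function in the spirit of the classic relational topic model: the probability of a link between two documents grows with the overlap of their topic proportions. This is an illustration only, not the model from the paper. The function name, the weights `eta`, and the offset `nu` are assumptions, and the constraint component is not covered.

```python
import math


def link_probability(theta_a, theta_b, eta, nu=0.0):
    """Probability that two documents are linked, given their topic proportions.

    theta_a, theta_b: topic proportions of the two documents (same length).
    eta: per-topic weights (hypothetical values, normally learned).
    nu: offset term (hypothetical value, normally learned).
    """
    # Weighted overlap of the two topic distributions, topic by topic.
    score = sum(e * a * b for e, a, b in zip(eta, theta_a, theta_b)) + nu
    # The logistic function squashes the score into a probability.
    return 1.0 / (1.0 + math.exp(-score))


# Toy example: documents A and B share a dominant topic, C does not.
theta_a = [0.9, 0.1]
theta_b = [0.8, 0.2]
theta_c = [0.1, 0.9]
eta = [5.0, 5.0]

p_ab = link_probability(theta_a, theta_b, eta, nu=-2.0)
p_ac = link_probability(theta_a, theta_c, eta, nu=-2.0)
```

With positive weights, the pair with similar topic proportions (A and B) receives a higher link probability than the dissimilar pair (A and C).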

Contextualized Neural Topic Models

Recently, neural topic models have become available. Concurrently, BERT-based representations have advanced the state of the art of neural models. In collaboration with the MilaNLP lab @ Bocconi University, we combined pre-trained BERT representations with neural topic models. We found that BERT sentence embeddings produce more meaningful and coherent topics than bag-of-words approaches, in both standard LDA and existing neural topic models. This also allowed us to address the problem of zero-shot cross-lingual topic modeling.
Check out the preprints: [arXiv:2004.03974] [arXiv:2004.07737]
And here's the Python library, which has reached over 10k downloads! [contextualized-topic-models]
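As a rough illustration of what "combining" the two representations can look like, the sketch below concatenates a bag-of-words vector with a sentence embedding to form the input of a topic model's encoder. It does not use the library's API. The `toy_sentence_embedding` function is a hypothetical stand-in for a pre-trained sentence encoder, and all names here are assumptions made for the example.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word appears in the document."""
    counts = Counter(tokens)
    return [float(counts[word]) for word in vocabulary]


def toy_sentence_embedding(text, dim=4):
    """Hypothetical stand-in for a pre-trained sentence encoder.

    A real setup would call a BERT-style model here; this toy version
    only produces a deterministic fixed-size vector from the characters.
    """
    vector = [0.0] * dim
    for position, char in enumerate(text):
        vector[position % dim] += ord(char) / 1000.0
    return vector


def combined_input(text, vocabulary, dim=4):
    """Concatenate the bag-of-words vector with the sentence embedding."""
    tokens = text.lower().split()
    return bag_of_words(tokens, vocabulary) + toy_sentence_embedding(text, dim)


vocabulary = ["topic", "model", "bert"]
features = combined_input("Topic model topic", vocabulary)
```

In a zero-shot cross-lingual setting, one plausible variant (again an assumption of this sketch) feeds only the embedding part to the encoder, so that a multilingual encoder can handle documents in languages whose words are not in the training vocabulary.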

Hyperparameter Optimization for Topic Models

Hyperparameters that regulate the random variables of topic models can have a strong impact on the overall performance of a model. I am empirically studying the impact of the hyperparameters of a subset of topic models. I'm investigating this problem by applying Bayesian optimization techniques to identify the best hyperparameter configuration for a topic model used in a classification task.
I aim to extend this work and study the impact of hyperparameters on a wider set of topic models. I am currently planning to implement a comparative evaluation framework in which we can compare optimized topic models and investigate multiple performance metrics.
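The overall shape of such a search can be sketched as a loop that proposes a configuration, scores it, and keeps the best one. To stay self-contained, the sketch below uses plain random search as a stand-in for Bayesian optimization, which would instead choose each new configuration based on the scores seen so far. The `toy_objective` function is hypothetical: in a real experiment it would train a topic model with the given hyperparameters and return the classification score.

```python
import random


def search_hyperparameters(objective, space, n_trials=30, seed=0):
    """Random search over a box-shaped hyperparameter space.

    objective: callable mapping a configuration dict to a score (higher is better).
    space: dict mapping each hyperparameter name to a (low, high) range.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Propose a configuration by sampling each hyperparameter uniformly.
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    """Hypothetical score that peaks at alpha=0.1, beta=0.01 (not real results)."""
    return -((config["alpha"] - 0.1) ** 2 + (config["beta"] - 0.01) ** 2)


# Example ranges for two Dirichlet-style priors (assumed names and bounds).
space = {"alpha": (0.0, 1.0), "beta": (0.0, 1.0)}
best_config, best_score = search_hyperparameters(toy_objective, space, n_trials=50)
```

Swapping the sampling step for a model-based proposal is what turns this loop into Bayesian optimization; the evaluation and bookkeeping stay the same.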

Topic Models in Relational Environments

Multiple types of relational information can be incorporated into topic models. I am studying how document-level relationships and word-level relationships affect the results of a topic model. For the word-level relationships, I am considering relationships between the named entities identified in a document. The idea is that two named entities are more likely to share the same topic than two single tokens, which may be ambiguous when considered as isolated words. [Code]
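A small sketch of why named entities help: once a multi-word entity is treated as a single token, it is no longer confused with its component words. The entity list below is written by hand for the example, whereas in practice it would come from a named-entity recognizer. The function name and the underscore-joining convention are assumptions of this sketch.

```python
def merge_entities(tokens, entities):
    """Replace each occurrence of a multi-word entity with one joined token."""
    # Compare in lowercase, and try longer entities first so they take priority.
    patterns = sorted(
        (tuple(entity.lower().split()) for entity in entities if entity.split()),
        key=len,
        reverse=True,
    )
    merged, i = [], 0
    while i < len(tokens):
        for pattern in patterns:
            window = tokens[i:i + len(pattern)]
            if tuple(token.lower() for token in window) == pattern:
                # Keep the original casing, joined into a single token.
                merged.append("_".join(window))
                i += len(pattern)
                break
        else:
            # No entity starts here: keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


tokens = "Big Apple fans eat an apple in New York".split()
merged = merge_entities(tokens, ["Big Apple", "New York"])
```

After merging, the entity "Big Apple" and the fruit "apple" are distinct tokens, so a topic model can assign them to different topics.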