Pranav A

I am Pranav and I am currently working as a PhD researcher at the University of Hamburg.

My email is cs.pranav.a (at) gmail.com.

My full name is Pranav Agrawal, but I prefer to be called 'Pranav A'. For academic publications, I use the name 'A Pranav'. I use either he/him or they/them pronouns.

GitHub Link  /  LinkedIn Link  /  CV (Link to PDF)

Research Interests and Publications

Broadly, my interests are in NLP (Natural Language Processing) and AI ethics.

Within NLP, my particular focus revolves around tokenization learning and multilinguality. Currently, my work centers on developing subword models in multilingual tasks, with a specific emphasis on addressing the following issues:

  • How can we develop tokenization methods that effectively tackle the challenges of sparsity and skewness?
  • How do different language families influence one another within multilingual datasets, and how can we leverage this knowledge to create improved tokenization methods tailored for multilingual models?

Regarding AI ethics, my main areas of interest include conference inclusivity, queerness in AI, and data sovereignty. Presently, my research efforts are concentrated on the following topics:

  • How can community-based approaches enhance inclusivity at conferences?
  • In cases where control over digital identity, as manifested in datasets, models, and algorithms, has been taken away from queer individuals, what tools can be devised to counteract this?
  • Why do deadnames occur in citations of publications, and what tools can be developed to prevent the circulation of deadnames and outdated information?

Below are some of the research papers that I have worked on. My work has been published at top-tier conferences, and one paper received a Best Paper Award at FAccT 2023.

Paper Title: Comparing Static and Contextual Distributional Semantic Models on Intrinsic Tasks: An Evaluation on Mandarin Chinese Datasets
Authors: A Pranav, Yan Cong, Emmanuele Chersoni, Yu-Yin Hsu, Alessandro Lenci
Conference: LREC, 2024
Anthology

Empirical comparisons of character-based models against word-based models on common Chinese semantic benchmarks.

Paper Title: Queer In AI: A Case Study in Community-Led Participatory AI
Authors: Organizers of QueerInAI (several awesome authors including me!)
Conference: FAccT, 2023
Best Paper Award
arXiv

A case study of Queer in AI as a community-led participatory design effort, contributing lessons on decentralization, building community aid, empowering marginalized groups, and critiquing poor participatory practices.

Paper Title: How to Make Virtual Conferences Queer-Friendly: A Guide
Authors: Organizers of QueerInAI, A Pranav, MaryLena Bleile, Arjun Subramonian, Luca Soldaini, Danica J. Sutherland, Sabine Weber and Pan Xu
Conference: Widening NLP, 2021
Link to Queer in AI page

Queer in AI provides a tutorial for diversity & inclusion organizers on making virtual conferences more queer-friendly, drawing on the community's experiences with marginalization.

Paper Title: 2kenize: Tying Subword Sequences for Chinese Script Conversion
Authors: Pranav A, Isabelle Augenstein
Conference: ACL, 2020
GitHub Link  /  SIGTYP Newsletter Link  /  arXiv Link  /  Video and Slides Link

The paper contributes a contextual subword segmentation method, along with benchmark datasets, that outperforms previous Chinese character conversion approaches by 6 points in accuracy, especially for code-mixing and named entities.

Paper Title: Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection
Authors: Pranav A
Conference: ACL Student Research Workshop, 2018
GitHub Link  /  ACL Anthology Link

The paper contributes information retrieval ranking functions with heuristics like positional tokenization and graphical error modelling to the problem of cognate detection.


Volunteer Service and D&I Advocacy

Over the past few years, I have devoted a significant amount of time to volunteering for queer and D&I advocacy, particularly within NLP conferences.

I served as a senior D&I chair for NAACL 2021 and 2022. In this role, my responsibilities included establishing grassroots organizations, implementing mentoring programs, promoting trans-inclusive publishing, and improving financial accessibility.

As an initiator of the Queer in AI initiative within the NLP community, I have organized workshops (NAACL 2021 and NAACL 2022) and social events at each NLP conference. At Queer in AI, my research focuses on enhancing queer inclusivity in conferences, and I also mentor several volunteers in the field of queer activism.