Deeparghya Dutta Barua
Hi, I'm Deeparghya Dutta Barua, a PhD student at the School of Computing and Augmented Intelligence (SCAI) at Arizona State University. My primary interests lie in multimodal comprehension and natural language processing.
I completed my undergraduate degree in Computer Science and Engineering at the University of Dhaka, and I enjoy building tools that solve niche but practical problems.
Email / GitHub / Google Scholar / LinkedIn
ChitroJera: A Regionally Relevant Visual Question Answering Dataset for Bangla
Deeparghya Dutta Barua, Md Sakib Ul Rahman Sourove, Md Fahim, Fabiha Haider, Fariha Tanjim Shifat, Md Tasmim Rahman Adib, Anam Borhan Uddin, Md Farhan Ishmam, Md Farhad Alam
ECML PKDD 2025
paper / scholar / code
Introduces ChitroJera, a regionally relevant Bangla VQA dataset with 15k samples, advancing vision-language tasks for low-resource languages.
Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards
Jahir Sadik Monon, Deeparghya Dutta Barua, Md Mosaddek Khan
AAMAS 2025 (Extended Abstract)
arxiv / scholar / code
Proposes CoHet, a GNN-based intrinsic motivation algorithm for decentralized heterogeneous multi-agent reinforcement learning, achieving state-of-the-art performance in cooperative scenarios.
BanTH: A Multi-label Hate Speech Detection Dataset for Transliterated Bangla
Fabiha Haider, Fariha Tanjim Shifat, Md Farhan Ishmam, Deeparghya Dutta Barua, Md Sakib Ul Rahman Sourove, Md Fahim, Md Farhad Alam
NAACL 2025 (Findings)
paper / scholar / code
Presents BanTH, the first multi-label hate speech dataset for transliterated Bangla with 37.3k samples and state-of-the-art detection methods.
BanglaTLit: A Benchmark Dataset for Back-Transliteration of Romanized Bangla
Md Fahim, Fariha Tanjim Shifat, Fabiha Haider, Deeparghya Dutta Barua, Md Sakib Ul Rahman Sourove, Md Farhan Ishmam, Md Farhad Alam
EMNLP 2024 (Findings)
paper / scholar / code
Introduces BanglaTLit, a large-scale dataset and pre-training corpus for Bangla transliteration, enabling automated back-transliteration of romanized text.
Penta ML at EXIST 2024: Tagging Sexism in Online Multimodal Content With Attention-enhanced Modal Context
Deeparghya Dutta Barua, Md Sakib Ul Rahman Sourove, Fabiha Haider, Fariha Tanjim Shifat, Md Farhan Ishmam, Md Fahim, Farhad Alam Bhuiyan
CLEF EXIST 2024
paper / scholar / code
Presents an attention-based multimodal approach for sexism detection in online content, achieving state-of-the-art performance in the EXIST 2024 challenge.
Penta-nlp at EXIST 2024 Task 1–3: Sexism Identification, Source Intention, Sexism Categorization In Tweets
Fariha Tanjim Shifat, Fabiha Haider, Md Sakib Ul Rahman Sourove, Deeparghya Dutta Barua, Md Farhan Ishmam, Md Fahim, Farhad Alam Bhuiyan
CLEF EXIST 2024
paper / scholar / code
Develops NLP-based automated sexism detection in tweets using transformer models and attention mechanisms for content moderation.
Projects
A collection of my personal and professional projects
Rongali
2024
website
A Bangla transliteration annotation tool. A Svelte web application for annotating transliterated Bangla text with its Bangla script form using the Google Transliteration API. It offers various shortcuts to expedite the process and includes improved handling of acronyms, certain vowel sounds, and numerals. It can also be used to write Bangla text from scratch.
cambia
2022
website / code
A compact disc ripper log checking utility. A command-line utility and a web UI to analyse log files generated by various compact disc ripping programs. It is designed to make media archival metadata more accessible.
Salvare
2022
code
An Android app to manage online links. An application for managing online links in a single place, along with their descriptions and thumbnails. Uses Google OAuth for authentication.
Apellai
2021
code
A Subsonic client for Android. A music streaming application for Android that interfaces with Subsonic servers. Follows the MVVM architecture.