publications
Publications by category in reverse chronological order. For the most up-to-date list, check Google Scholar.
2023
- Pretrained Language Models as Visual Planners for Human Assistance. Dhruvesh Patel, Hamid Eghbalzadeh, Nitin Kamra, Michael Louis Iuzzolino, Unnat Jain, and 1 more author. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
2022
- Word2Box: Capturing Set-Theoretic Semantics of Words using Box Embeddings. Shib Sankar Dasgupta, Michael Boratko, Siddhartha Mishra, Shriya Atmakuri, Dhruvesh Patel, and 2 more authors. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (To Appear), 2022.
Learning representations of words in a continuous space is perhaps the most fundamental task in NLP, a prerequisite for nearly all modern machine-learning techniques. Often the objective is to capture distributional similarity via vector dot product; however, this is just one of the relations between word meanings we may wish to capture. If we consider words as (soft) equivalence classes based on similarity, it is natural to expect the ability to perform set-theoretic operations (intersection, union, difference) on these representations. This is particularly relevant for words which are homographs: for example, "tongue" ∩ "body" should be similar to "mouth", while "tongue" ∩ "language" should be similar to "dialect". Box embeddings are a novel region-based representation which provides the capability to perform these set-theoretic operations. In this work, we provide a fuzzy-set interpretation of box embeddings, and train box embeddings with a CBOW objective where contexts are represented using intersection. We demonstrate improved performance on various word similarity tasks, particularly on less common words, and perform a quantitative and qualitative analysis exploring the additional unique expressivity provided by Word2Box.
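As a concrete picture of the set-theoretic operations the abstract describes, here is a minimal NumPy sketch of hard box intersection and volume. The two toy boxes and their coordinates are illustrative assumptions, not values from the paper (which also trains smoothed, learnable boxes rather than this hard version):

```python
import numpy as np

def intersect(box_a, box_b):
    """Intersection of two axis-aligned boxes, each given as (min, max) corner arrays."""
    return np.maximum(box_a[0], box_b[0]), np.minimum(box_a[1], box_b[1])

def volume(box):
    """Product of side lengths; zero if the box is empty (min > max anywhere)."""
    lo, hi = box
    return float(np.prod(np.clip(hi - lo, 0.0, None)))

# Two toy 2-d word boxes that overlap: think "tongue" and "body".
tongue = (np.array([0.0, 0.0]), np.array([4.0, 4.0]))
body = (np.array([3.0, 0.0]), np.array([6.0, 2.0]))

overlap = intersect(tongue, body)
print(volume(overlap))                 # 2.0, the region shared by the two boxes
print(volume(overlap) / volume(body))  # ~0.33, a volume-based similarity score
```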
- Modeling Label Space Interactions in Multi-label Classification using Box Embeddings. In International Conference on Learning Representations, 2022.
Multi-label classification is a challenging structured prediction task in which a set of output class labels is predicted for each input. Real-world datasets often have natural or latent taxonomic relationships between labels, making it desirable for models to employ label representations capable of capturing such taxonomies. Most existing multi-label classification methods do not do so, resulting in label predictions that are inconsistent with the taxonomic constraints and thus fail to accurately represent the fundamentals of the problem setting. In this work, we introduce the multi-label box model (MBM), a multi-label classification method that combines the encoding power of neural networks with the inductive bias and probabilistic semantics of box embeddings (Vilnis et al., 2018). Box embeddings can be understood as trainable Venn diagrams based on hyper-rectangles. By representing labels with boxes rather than vectors, MBM is able to capture taxonomic relations among labels. Furthermore, since box embeddings allow these relations to be learned by stochastic gradient descent from data, and to be read as calibrated conditional probabilities, our model is endowed with a high degree of interpretability. This interpretability also facilitates the injection of partial information about label-label relationships into model training, to further improve its consistency. We provide theoretical grounding for our method and show experimentally the model's ability to learn the true latent taxonomic structure from data. Through extensive empirical evaluations on both small and large-scale multi-label classification datasets, we show that MBM can significantly improve taxonomic consistency while preserving or surpassing state-of-the-art predictive performance. (A sketch of the volume-based probability reading of boxes follows the BibTeX entry below.)
@inproceedings{patel2022modeling,
  title = {Modeling Label Space Interactions in Multi-label Classification using Box Embeddings},
  author = {Patel, Dhruvesh and Dangati, Pavitra and Lee, Jay-Yoon and Boratko, Michael and McCallum, Andrew},
  booktitle = {International Conference on Learning Representations},
  year = {2022},
  url = {https://openreview.net/forum?id=tyTH9kOxcvh},
}
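The calibrated conditional probabilities mentioned in the abstract come from the volume semantics of boxes; a sketch of the usual reading, in my notation rather than the paper's:

```latex
% Each label y is represented by a box \mathrm{Box}(y) \subseteq \mathbb{R}^n,
% and conditional probabilities are ratios of (intersection) volumes:
P(y_2 \mid y_1) \;=\;
  \frac{\mathrm{Vol}\big(\mathrm{Box}(y_1) \cap \mathrm{Box}(y_2)\big)}
       {\mathrm{Vol}\big(\mathrm{Box}(y_1)\big)}
% If \mathrm{Box}(y_1) \subseteq \mathrm{Box}(y_2) (child box inside parent box),
% the ratio is exactly 1: the learned geometry itself encodes "y_1 implies y_2".
```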
- Structured Energy Network As a Loss. Jay Yoon Lee, Dhruvesh Patel, Purujit Goyal, Wenlong Zhao, Zhiyang Xu, and 1 more author. In Advances in Neural Information Processing Systems, 2022.
Belanger & McCallum (2016) and Gygli et al. (2017) have shown that an energy network can capture arbitrary dependencies amongst the output variables in structured prediction; however, their reliance on gradient-based inference (GBI) makes the inference slow and unstable. In this work, we propose Structured Energy As Loss (SEAL) to take advantage of the expressivity of energy networks without incurring the high inference cost. This is a novel learning framework that uses an energy network as a trainable loss function (loss-net) to train a separate neural network (task-net), which then performs inference through a single forward pass. We establish SEAL as a general framework wherein various learning strategies, such as margin-based, regression, and noise-contrastive objectives, can be employed to learn the parameters of the loss-net. Through extensive evaluation on multi-label classification, semantic role labeling, and image segmentation, we demonstrate that SEAL provides various useful design choices, is faster at inference than GBI, and leads to significant performance gains over the baselines.
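A minimal PyTorch-style sketch of the loss-net/task-net idea described above. The network shapes, the toy multi-label data, and the alternating margin-based updates are my illustrative assumptions, not the paper's exact algorithm:

```python
import torch
import torch.nn as nn

# task-net maps inputs to (here, multi-label) outputs in one forward pass;
# loss-net is an energy network E(x, y) that scores input-output pairs.
task_net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
loss_net = nn.Sequential(nn.Linear(16 + 8, 32), nn.ReLU(), nn.Linear(32, 1))
opt_task = torch.optim.Adam(task_net.parameters(), lr=1e-3)
opt_loss = torch.optim.Adam(loss_net.parameters(), lr=1e-3)

def energy(x, y):
    return loss_net(torch.cat([x, y], dim=-1)).squeeze(-1)

x, y_true = torch.randn(64, 16), torch.randint(0, 2, (64, 8)).float()  # toy batch
for _ in range(100):
    # (1) margin-style update of the loss-net: gold structures should receive
    # lower energy than the task-net's current predictions.
    y_pred = torch.sigmoid(task_net(x)).detach()
    hinge = energy(x, y_true) - energy(x, y_pred) + 1.0
    opt_loss.zero_grad(); hinge.clamp(min=0).mean().backward(); opt_loss.step()

    # (2) the loss-net now acts as the loss: train the task-net to emit
    # low-energy outputs. Inference afterwards is a single forward pass.
    loss_t = energy(x, torch.sigmoid(task_net(x))).mean()
    opt_task.zero_grad(); loss_t.backward(); opt_task.step()
```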
- Event-Event Relation Extraction using Probabilistic Box Embedding. EunJeong Hwang, Jay-Yoon Lee, Tianyi Yang, Dhruvesh Patel, Dongxu Zhang, and 1 more author. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (To Appear), 2022.
To understand a story with multiple events, it is important to capture the proper relations across these events. However, existing event relation extraction (ERE) frameworks regard this as a multi-class classification task and do not guarantee any coherence between different relation types, such as anti-symmetry. If a phone line "died" after a "storm", then it is obvious that the "storm" happened before the phone "died". Current ERE frameworks do not guarantee this coherence and instead enforce it via a constraint loss function (Wang et al., 2020). In this work, we propose to modify the underlying ERE model to guarantee coherence by representing each event with a box representation (BERE), without applying explicit constraints. In our experiments, BERE also shows stronger conjunctive constraint satisfaction while performing on par with or better than previous models with constraint injection in terms of F1.
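The coherence-by-construction argument is easy to see with plain containment: proper box containment is anti-symmetric, so any relation read off containment cannot violate anti-symmetry. A toy illustration of this geometric fact only (the paper's actual model is probabilistic and learned, not this hard-coded reading):

```python
def contains(outer, inner):
    """True iff box `inner` lies inside box `outer`; boxes are ((mins), (maxes))."""
    (olo, ohi), (ilo, ihi) = outer, inner
    return all(o <= i for o, i in zip(olo, ilo)) and all(i <= o for i, o in zip(ihi, ohi))

died = ((0.0, 0.0), (4.0, 4.0))    # toy event boxes
storm = ((1.0, 1.0), (2.0, 2.0))

def before(a, b):                  # toy reading: "a before b" iff b's box contains a's
    return contains(b, a)

assert before(storm, died) and not before(died, storm)  # anti-symmetry by geometry
```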
2021
- (Preprint) Word2Box: Learning Word Representation Using Box Embeddings. ArXiv, 2021.
- Box Embeddings: An open-source library for representation learning using geometric structures. *Tejas Chheda, *Purujit Goyal, *Trang Tran, Dhruvesh Patel, Michael Boratko, and 2 more authors. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Nov 2021.
A fundamental component of the success of modern representation learning is the ease of performing various vector operations. Recently, objects with more geometric structure (e.g., distributions, complex or hyperbolic vectors, or regions such as cones, disks, or boxes) have been explored for their alternative inductive biases and additional representational capacity. In this work, we introduce Box Embeddings, a Python library that enables researchers to easily apply and extend probabilistic box embeddings. Fundamental geometric operations on boxes are implemented in a numerically stable way, as are modern approaches to training boxes which mitigate gradient sparsity. The library is fully open source, and compatible with both PyTorch and TensorFlow, which allows existing neural network layers to be replaced with or transformed into boxes easily. In this work, we present the implementation details of the fundamental components of the library, and the concepts required to use box representations alongside existing neural network architectures.
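The library's own API should be taken from its documentation; the generic PyTorch sketch below only illustrates the kind of smoothed ("soft") volume that mitigates gradient sparsity when two boxes become disjoint. Function names and the temperature value are my illustrative choices:

```python
import torch
import torch.nn.functional as F

def hard_intersection(lo_a, hi_a, lo_b, hi_b):
    """Intersection of axis-aligned boxes given by min/max corner tensors."""
    return torch.maximum(lo_a, lo_b), torch.minimum(hi_a, hi_b)

def soft_volume(lo, hi, temp=1.0):
    """Softplus-smoothed volume: unlike prod(clamp(hi - lo, min=0)), it keeps a
    non-zero gradient even when the box is 'empty' (hi < lo), which is what
    mitigates gradient sparsity between disjoint boxes."""
    return torch.prod(temp * F.softplus((hi - lo) / temp), dim=-1)

# Two disjoint 1-d boxes: [0, 1] and [2, 3].
lo, hi = hard_intersection(torch.tensor([0.0]), torch.tensor([1.0]),
                           torch.tensor([2.0]), torch.tensor([3.0]))
print(soft_volume(lo, hi))  # small but positive, so training signal survives
```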
- Looking Beyond Sentence-Level Natural Language Inference for Question Answering and Text Summarization. Anshuman Mishra, Dhruvesh Patel, Aparna Vijayakumar, Xiang Lorraine Li, Pavan Kapanipathi, and 1 more author. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jun 2021.
Natural Language Inference (NLI) has garnered significant attention in recent years; however, the promise of applying NLI breakthroughs to other downstream NLP tasks has remained unfulfilled. In this work, we use the multiple-choice reading comprehension (MCRC) and checking factual correctness of textual summarization (CFCS) tasks to investigate potential reasons for this. Our findings show that: (1) the relatively short length of premises in traditional NLI datasets is the primary challenge prohibiting usage in downstream applications (which do better with longer contexts); (2) this challenge can be addressed by automatically converting resource-rich reading comprehension datasets into longer-premise NLI datasets; and (3) models trained on the converted, longer-premise datasets outperform those trained on short-premise traditional NLI datasets on downstream tasks, primarily due to the difference in premise lengths. (A toy sketch of this dataset conversion follows the BibTeX entry below.)
@inproceedings{mishra-etal-2021-looking,
  title = {Looking Beyond Sentence-Level Natural Language Inference for Question Answering and Text Summarization},
  author = {Mishra, Anshuman and Patel, Dhruvesh and Vijayakumar, Aparna and Li, Xiang Lorraine and Kapanipathi, Pavan and Talamadupula, Kartik},
  booktitle = {Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  month = jun,
  year = {2021},
  address = {Online},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2021.naacl-main.104},
  doi = {10.18653/v1/2021.naacl-main.104},
  pages = {1322--1336},
}
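The conversion recipe in point (2) of the abstract can be pictured with a toy transformation like the one below. The field names, label strings, and the naive question+option join are illustrative assumptions; the released datasets form declarative hypotheses more carefully:

```python
def mcrc_to_nli(passage, question, options, answer_idx):
    """Convert one multiple-choice reading-comprehension item into NLI examples:
    the long passage becomes the premise, and each question/option pair becomes
    a hypothesis labeled as entailed or not entailed."""
    examples = []
    for i, option in enumerate(options):
        hypothesis = f"{question.rstrip('?').strip()}: {option}"  # naive placeholder join
        label = "entailment" if i == answer_idx else "not_entailment"
        examples.append({"premise": passage, "hypothesis": hypothesis, "label": label})
    return examples

nli_examples = mcrc_to_nli(
    passage="...a long reading-comprehension passage about the author's studies...",
    question="What did the author study?",
    options=["history", "physics"],
    answer_idx=1,
)
print(nli_examples[1]["label"])  # entailment
```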
2020
- Looking Beyond Sentence-Level Natural Language Inference for Downstream Tasks. *Anshuman Mishra, *Dhruvesh Patel, *Aparna Vijayakumar, Xiang Li, Pavan Kapanipathi, and 1 more author. ArXiv, Dec 2020.
In recent years, the Natural Language Inference (NLI) task has garnered significant attention, with new datasets and models achieving near human-level performance on it. However, the full promise of NLI – particularly that it learns knowledge that should be generalizable to other downstream NLP tasks – has not been realized. In this paper, we study this unfulfilled promise from the lens of two downstream tasks: question answering (QA), and text summarization. We conjecture that a key difference between the NLI datasets and these downstream tasks concerns the length of the premise; and that creating new long premise NLI datasets out of existing QA datasets is a promising avenue for training a truly generalizable NLI model. We validate our conjecture by showing competitive results on the task of QA and obtaining the best reported results on the task of Checking Factual Correctness of Summaries.
- Reading Comprehension as Natural Language Inference: A Semantic Analysis. *Anshuman Mishra, *Dhruvesh Patel, *Aparna Vijayakumar, Xiang Li, Pavan Kapanipathi, and 1 more author. In *SEM 2020 Workshop at COLING, Dec 2020.
In the recent past, Natural Language Inference (NLI) has gained significant attention, particularly given its promise for downstream NLP tasks. However, its true impact is limited and has not been well studied. Therefore, in this paper, we explore the utility of NLI for one of the most prominent downstream tasks, viz. Question Answering (QA). We transform one of the largest available MRC datasets (RACE) into an NLI form, and compare the performance of a state-of-the-art model (RoBERTa) on both these forms. We propose new characterizations of questions, and evaluate the performance of QA and NLI models on these categories. We highlight clear categories for which the model performs better when the data is presented in a coherent entailment form, and in a structured question-answer concatenation form, respectively.
- Weakly Supervised Medication Regimen Extraction from Medical Conversations. Dhruvesh Patel, Sandeep Konam, and Sai Prabhakar. In Proceedings of the 3rd Clinical Natural Language Processing Workshop, Nov 2020.
Automated Medication Regimen (MR) extraction from medical conversations can not only improve recall and help patients follow through with their care plan, but also reduce the documentation burden for doctors. In this paper, we focus on extracting spans for frequency, route, and change, corresponding to medications discussed in the conversation. We first describe a unique dataset of annotated doctor-patient conversations and then present a weakly supervised model architecture that can perform span extraction using noisy classification data. The model utilizes an attention bottleneck inside a classification model to perform the extraction. We experiment with several variants of attention scoring and projection functions and propose a novel transformer-based attention scoring function (TAScore). The proposed combination of TAScore and Fusedmax projection achieves a 10-point increase in Longest Common Substring F1 compared to the baseline of additive scoring plus softmax projection. (A toy sketch of the attention-bottleneck idea follows the BibTeX entry below.)
@inproceedings{patel-etal-2020-weakly,
  title = {Weakly Supervised Medication Regimen Extraction from Medical Conversations},
  author = {Patel, Dhruvesh and Konam, Sandeep and Prabhakar, Sai},
  booktitle = {Proceedings of the 3rd Clinical Natural Language Processing Workshop},
  month = nov,
  year = {2020},
  address = {Online},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2020.clinicalnlp-1.20},
  doi = {10.18653/v1/2020.clinicalnlp-1.20},
  pages = {178--193},
}
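A minimal sketch of the attention-bottleneck shape described in the abstract: a classifier that must pool token states through attention, so at extraction time the attention weights themselves mark the supporting span. It uses plain additive-style scoring and softmax; the paper's TAScore and Fusedmax are replacements for those two pieces, and every name and threshold here is illustrative:

```python
import torch
import torch.nn as nn

class AttentionBottleneckClassifier(nn.Module):
    """Classify through an attention bottleneck: the pooled vector used for
    classification is a weighted sum of token states, so the attention weights
    double as (weak) span-extraction scores."""
    def __init__(self, hidden=64, n_classes=4):
        super().__init__()
        self.score = nn.Linear(hidden, 1)        # additive-style scorer (TAScore's slot)
        self.classify = nn.Linear(hidden, n_classes)

    def forward(self, token_states):             # token_states: (batch, seq, hidden)
        attn = torch.softmax(self.score(token_states).squeeze(-1), dim=-1)  # Fusedmax's slot
        pooled = torch.einsum("bs,bsh->bh", attn, token_states)
        return self.classify(pooled), attn

model = AttentionBottleneckClassifier()
logits, attn = model(torch.randn(2, 10, 64))
span_mask = attn > (1.0 / attn.size(-1))  # toy span read-out: above-uniform attention
```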
- Representing Joint Hierarchies with Box Embeddings. In Automated Knowledge Base Construction (AKBC), Nov 2020.
Learning representations for hierarchical and multi-relational knowledge has emerged as an active area of research. Box Embeddings [Vilnis et al., 2018; Li et al., 2019] represent concepts with hyperrectangles in n-dimensional space and are shown to be capable of modeling tree-like structures efficiently by training on a large subset of the transitive closure of the WordNet hypernym graph. In this work, we evaluate the capability of box embeddings to learn the transitive closure of a tree-like hierarchical relation graph from far fewer of its edges. Box embeddings are not restricted to tree-like structures, however, and we demonstrate this by modeling the WordNet meronym graph, where nodes may have multiple parents. We further propose a method for modeling multiple relations jointly in a single embedding space using box embeddings. In all cases, our proposed method outperforms or is on par with all other embedding methods.
@inproceedings{dhruveshbox2020,
  title = {Representing Joint Hierarchies with Box Embeddings},
  author = {Patel, *Dhruvesh and Dasgupta, *Shib Sankar and Boratko, Michael and Li, Xiang and Vilnis, Luke and McCallum, Andrew},
  booktitle = {Automated Knowledge Base Construction (AKBC)},
  year = {2020},
  url = {https://openreview.net/forum?id=J246NSqR_l},
  video = {https://youtu.be/yqP8wjMocAs},
}
2017
- Computing the Safe Working Zone of a 3-RRS Parallel Manipulator. Dhruvesh Patel, Rohit Kalla, Halil Tetik, Gökhan Kiper, and Sandipan Bandyopadhyay. In New Trends in Mechanism and Machine Science, Nov 2017.
Determination of the safe working zone (SWZ) of a parallel manipulator is a one-time computational task with several permanent benefits. As this subspace of the workspace of the manipulator is free of both loss- and gain-type singularities, link interference, and physical joint limits, the manipulator can move freely in this space. Moreover, if the natural choice of a convex-shaped SWZ is adhered to, then point-to-point path planning inside the SWZ always has a trivial solution, namely, the segment joining the two points, which is guaranteed to be inside the workspace. In this paper, the SWZ of the 3-RRS manipulator at the İzmir Institute of Technology has been computed. Starting with the geometry of the manipulator, the loop-closure constraint equations are derived. The singularity conditions are obtained from the singularity of certain Jacobian matrices associated with the constraint functions. Interference between the links is detected by first encapsulating the links in rectangular parallelepipeds, which are then discretized into triangles and subjected to collision tests between the relevant pairs of triangles. Using these theoretical developments, the SWZ is computed, and the numerical results are depicted graphically.
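For reference, the loss-/gain-type conditions mentioned above are typically read off the two constraint Jacobians in the Gosselin-Angeles style; a sketch in my notation, under the assumption that the paper follows this standard formulation:

```latex
% Loop-closure constraints \eta(\boldsymbol{\theta}, \boldsymbol{x}) = \boldsymbol{0}
% link the actuated joint variables \boldsymbol{\theta} to the task-space pose
% \boldsymbol{x}. Differentiating with respect to time gives
J_{\theta}\,\dot{\boldsymbol{\theta}} + J_{x}\,\dot{\boldsymbol{x}} = \boldsymbol{0},
\qquad
J_{\theta} = \frac{\partial \eta}{\partial \boldsymbol{\theta}}, \quad
J_{x} = \frac{\partial \eta}{\partial \boldsymbol{x}}.
% Loss-type singularity: \det(J_{\theta}) = 0  (a degree of freedom is lost);
% gain-type singularity: \det(J_{x}) = 0       (an uncontrolled DoF is gained).
% The SWZ is carved out of the workspace so that neither determinant vanishes
% inside it, while the link-interference and joint-limit conditions also hold.
```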