Multi-Perspective Narratives (MPNs) are ubiquitous and useful for cross-verifying information across alternative accounts of the same events; by providing a concise overall picture of the current situation, they facilitate more informed decisions. Despite great progress in natural language processing (NLP), computers are still far from being able to analyze multi-perspective narratives accurately; addressing this limitation is the focus of this project.
In this ongoing project, we are developing a novel human-AI collaborative framework called CAMPeN (“Collaborative Analytics of Multi-Perspective Narratives”). Given multiple alternative narratives as input, the AI first extracts, in a zero-shot fashion, a set of candidate clauses for each of the Overlap, Unique, and Conflict categories. Next, the human verifies the clauses that the AI labeled with low confidence. Finally, the machine braids the high-confidence/verified clauses into the final Overlap-Unique-Conflict style summary, which is presented to the user. The proposed framework offers two major benefits: 1) it enables domain experts outside machine learning/NLP (e.g., a military general) to quickly surface and verify interesting hypotheses from multiple alternative narratives/descriptions without worrying about the underlying computational techniques, thereby democratizing AI; and 2) it can quickly verify facts and claims about real-world events by analyzing alternative narratives and braiding them into a single narrative with a higher degree of Information Assurance.
This project adopts both zero-shot and reinforcement learning approaches for extracting overlapping, unique, and conflicting information from alternative narratives; the reinforcement learning models can be trained in a self-supervised fashion without requiring a large collection of labeled data, so the proposed framework needs minimal human supervision in comparison to existing Multi-Document Summarization techniques. Additionally, the project borrows intuitions and insights from classical set theory, applying the properties of set operators to design novel reward/loss functions that enable effective training of the reinforcement-learning-based extraction networks.
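To make the set-theoretic intuition concrete, the sketch below (an illustrative toy, not the project's actual implementation) partitions clauses from two alternative narratives into Overlap/Unique/Conflict buckets using plain set operations. In practice the comparison would be over clause *meanings* (e.g., via entailment models) rather than exact strings, and the `contradicts` predicate here is a naive stand-in for a learned contradiction detector.

```python
def classify_clauses(narrative_a, narrative_b, contradicts=None):
    """Partition clauses w.r.t. the Overlap-Unique-Conflict criteria.

    narrative_a / narrative_b: sets of (normalized) clause strings.
    contradicts: optional predicate flagging semantically conflicting
                 pairs; here a naive placeholder stands in for a model.
    """
    overlap = narrative_a & narrative_b    # set intersection
    unique_a = narrative_a - narrative_b   # set differences
    unique_b = narrative_b - narrative_a
    conflicts = set()
    if contradicts:
        # Clauses unique to each side may actually contradict each other;
        # move such pairs from the Unique buckets into the Conflict bucket.
        for a in sorted(unique_a):
            for b in sorted(unique_b):
                if contradicts(a, b):
                    conflicts.add((a, b))
                    unique_a.discard(a)
                    unique_b.discard(b)
    return {"overlap": overlap, "unique_a": unique_a,
            "unique_b": unique_b, "conflict": conflicts}

# Toy example with a naive negation-based contradiction check.
naive_contradicts = lambda a, b: b == "not " + a or a == "not " + b
result = classify_clauses(
    {"troops moved north", "bridge destroyed"},
    {"troops moved north", "not bridge destroyed"},
    contradicts=naive_contradicts,
)
```

Running this toy yields one overlapping clause and one conflicting pair; the same set-operator structure is what the reward/loss functions mentioned above would exploit during training.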
The proposed work includes the following objectives.
In open-domain dialog systems, it is often uncertain how the end user would expect a new conversation to be grounded and structured. Therefore, the ideal solution must engage in a pre-conversation with the user about their expectations and preferred knowledge base for grounding purposes before the actual conversation happens. In other words, a “Conversation about Conversation”, i.e., a “Meta-Conversation” should happen with the user beforehand.
Based on this idea, we are currently developing an Artificial Intelligence-based Conversational Framework to create dialog-based interactive laboratory experiences for middle school science students and teachers in the context of simulation-based science experiments. A key component of the framework is an intelligent conversational agent (SimPal) that actively learns from teachers through a “Meta-Conversation” to solicit their instructional goals associated with simulation experiments and store them using a computational representation. In other words, the school teacher actively teaches the machine/agent what the instructional goals are for a particular scientific experiment in plain natural language. The agent then uses this representation to facilitate and customize an interactive knowledge-grounded conversation (powered by state-of-the-art Large Language Models) with students as they run experiments to enhance their learning experience. Unlike existing intelligent tutoring systems and pedagogical conversational agents, SimPal can work with any off-the-shelf third-party simulations, a unique feature of this project enabled by our proposed Meta-Conversation technique.
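A hypothetical sketch of the “computational representation” a Meta-Conversation could produce is shown below. The structure and field names are illustrative assumptions on our part, since the project's actual schema is not specified here; the point is only that teacher-stated goals, expressed in plain natural language, can be stored in a machine-usable form.

```python
from dataclasses import dataclass, field

@dataclass
class InstructionalGoal:
    concept: str            # e.g., "conservation of energy"
    variables: list         # simulation variables the goal targets
    success_criterion: str  # what the teacher wants students to observe

@dataclass
class ExperimentPlan:
    simulation_name: str    # any off-the-shelf third-party simulation
    goals: list = field(default_factory=list)

    def add_goal(self, goal):
        self.goals.append(goal)

# Example: a goal a teacher might state during a Meta-Conversation
# about a (hypothetical) pendulum simulation.
plan = ExperimentPlan("pendulum-sim")
plan.add_goal(InstructionalGoal(
    concept="period vs. length",
    variables=["length", "period"],
    success_criterion="student predicts period grows with length",
))
```

An agent like SimPal could then condition its student-facing, knowledge-grounded conversation on the stored goals rather than on any simulation-specific code, which is what allows it to work with third-party simulations.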
A big challenge in democratizing AI is that no single AI model can be pretrained to be skilled in every task a user may want to perform; the ability to learn new skills in an ad hoc fashion is therefore essential. To address this challenge, we are developing a Conversational Data Science solution that is capable of acquiring new predictive skills on the fly through intuitive, natural conversations with the user. Our ACM Computing Surveys 2022 paper highlights the core technical challenges that need to be addressed.
One big hurdle in materializing Conversational Data Science is ensuring that the conversation is grounded in a unique data set the user may provide on the fly from an unseen domain. To address this challenge, we recently proposed a prompting taxonomy called TeLER for designing effective prompts for Large Language Models (LLMs), which we use to build a natural language interface between humans and AutoML tools (e.g., Scikit-Learn) and, in turn, to facilitate acquiring new predictive skills via self-directed learning. Our experimental results (available on arXiv) demonstrate the effectiveness of the proposed TeLER-taxonomy-based prompting technique for knowledge grounding.
We designed the architecture of our “Conversational Data Science” framework with four dialogue states: Data Visualization, Task Formulation, Prediction Engineering, and Result Summary and Recommendation. Each state marks a unique conversation phase, impacting the overall user-system interaction. Multiple LLM instances, serving as “micro-agents”, ensure a cohesive conversation flow, granting us granular control over the conversation’s progression. In summary, we designed and developed an end-to-end system that demonstrates the viability of “Conversational Data Science” by evaluating the potency of taxonomy-based prompting of LLMs in solving such an ill-defined complex task.
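The four-state flow can be sketched as a simple controller that routes each turn to the micro-agent responsible for the current state. The state names come from the description above, but the control logic and the micro-agents' behavior are illustrative assumptions, not the deployed system (whose micro-agents are prompted LLM instances).

```python
STATES = ["Data Visualization", "Task Formulation",
          "Prediction Engineering", "Result Summary and Recommendation"]

class MicroAgent:
    """One agent responsible for a single dialogue state."""
    def __init__(self, state):
        self.state = state

    def respond(self, user_msg):
        # Placeholder for an LLM call prompted for this state's sub-goal.
        return f"[{self.state}] handling: {user_msg}"

class ConversationController:
    """Routes each turn to the current state's micro-agent and advances
    the state once that state's sub-goal is marked complete."""
    def __init__(self):
        self.agents = [MicroAgent(s) for s in STATES]
        self.idx = 0

    def turn(self, user_msg, sub_goal_done=False):
        reply = self.agents[self.idx].respond(user_msg)
        if sub_goal_done and self.idx < len(STATES) - 1:
            self.idx += 1  # hand off to the next micro-agent
        return reply

ctl = ConversationController()
first = ctl.turn("show me the data", sub_goal_done=True)
second = ctl.turn("predict customer churn")
```

Keeping one micro-agent per state is what gives the granular control mentioned above: each agent's prompt only has to cover its own sub-goal, and the controller owns the overall progression.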
This project focuses on developing a novel metric for quantifying author influence in the context of narrative braiding, a research area that remains largely unexplored. By definition, a braided narrative is created from the contributions of multiple authors. For this thrust, we also assume there is a separate entity, the editorial board, that polishes and edits the raw contributions of individual authors and is in charge of creating the final braided narrative. Under these assumptions, we consider three scenarios for quantifying author influence: 1) single author contribution, single braided narrative; 2) multiple authors, single braided narrative; and 3) multiple authors, multiple braided narratives. A further premise is that influential authors become more trustworthy over time and can serve as reliable sources in the future.
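One plausible instantiation of the first scenario (an illustrative assumption, not the project's actual metric, which is still under development) measures a single author's influence as the fraction of braided-narrative sentences that closely match a sentence the author contributed, using Jaccard word overlap as a crude similarity proxy.

```python
def jaccard(s1, s2):
    """Word-level Jaccard similarity between two sentences."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / len(w1 | w2) if w1 | w2 else 0.0

def author_influence(author_sents, braided_sents, threshold=0.5):
    """Share of braided sentences attributable to the author."""
    if not braided_sents:
        return 0.0
    matched = sum(
        1 for b in braided_sents
        if any(jaccard(a, b) >= threshold for a in author_sents)
    )
    return matched / len(braided_sents)

score = author_influence(
    ["the dam failed at dawn", "residents were evacuated"],
    ["the dam failed at dawn", "officials confirmed the failure"],
)
# score = 0.5: one of the two braided sentences traces to the author
```

Under this framing, scenarios 2 and 3 would normalize each author's share against the other contributors and aggregate across multiple braided narratives, which is also where the trustworthiness-over-time idea would enter.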
Previous studies have shown that popular Natural Language Generation (NLG) and Information Retrieval (IR) evaluation metrics (e.g., nDCG, ROUGE, MAP) are not robust and often do not correlate with the utility perceived by humans. In this project, our main goal is to investigate how to make NLP/IR evaluation metrics more utility-centric.
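To make the critique concrete, the snippet below computes standard nDCG (the existing metric under scrutiny, not our proposed alternative): because it aggregates graded relevance with a fixed log discount, two rankings can receive similar scores even when users would find one far more useful in practice.

```python
import math

def dcg(relevances):
    """Discounted Cumulative Gain with the standard log2 discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    """Normalize DCG by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A perfectly ordered ranking scores 1.0; a fully reversed one still
# scores well above 0 despite burying every relevant result.
perfect = ndcg([3, 2, 1, 0])
reversed_rank = ndcg([0, 1, 2, 3])
```

The gap between such scores and the utility a user actually experiences is precisely what a utility-centric metric would need to close.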