
When a prompt contradicts itself, can large models find it? Shanghai Jiao Tong University's latest research explains

Qubit 2024/08/14 13:24
Submitted by Dequan Wang's research group at Shanghai Jiao Tong University
Qubit | Official account QbitAI

Professor Dequan Wang's research group at Shanghai Jiao Tong University posed just this question in its latest study.

Imagine a scene: a kindergarten child holds up a picture of a tiger and asks you, "This kitten is cute, is it a female cat?" How would you answer?

You probably would not answer "yes" or "no" directly, but would start by pointing out the contradiction in the question: the picture shows a tiger, not a cat.


However, there has been little systematic research on how large models respond in the same situation.

Bear in mind that an AI model that cannot detect instruction conflicts will generate results for questions that should not have answers, and whichever side of the conflict it follows, the output can lead to potential disasters for AI safety and superalignment.

In this latest study, the team proposed a multimodal benchmark, the Self-Contradictory Instructions set, and designed AutoCreate, an innovative framework for automatic dataset creation.

The team found that multimodal large models are sorely lacking at detecting contradictory user instructions, and therefore proposed Cognitive Awakening Prompting (CAP), which injects cognitive ability from the outside world to improve contradiction detection.

The paper will be presented at the 18th European Conference on Computer Vision (ECCV) in October this year.


Can large models detect conflicting instructions?

Multimodal large models have recently made great progress in both research and application. They can process a wide range of data types, including text and images, and exhibit capabilities similar to human cognition.

The team attributes the success of these models to extensive research and development work, which teaches them to follow human instructions closely, even somewhat "obediently".

In addition, these models handle long contexts particularly well. Multimodal large models such as Claude 3 and Gemini 1.5 Pro have demonstrated this: the Claude 3 series offers a 200K-token context window, while Gemini 1.5 Pro ships with a standard 128K-token window and reached 1M tokens during its private preview.

These advances make multimodal large models excellent at handling complex tasks and sustaining long interactions with humans.

However, as multimodal interaction deepens and context lengths grow, self-contradiction within user instructions has become an increasingly prominent problem.

As shown in the figure below, users (e.g., children or language beginners) are often unaware of potential multimodal conflicts when using these models.

[Figure: users such as children or language beginners issuing instructions that conflict with the accompanying image, without noticing]

At the same time, as conversations stretch over more rounds and the context window expands, users struggle to remember every detail, which leads to contradictions among their instructions.

In addition, as the number of modalities increases, conflicts can also arise across modalities. When a model lacks the self-awareness and discernment to spot such contradictions, its performance suffers.

To address these challenges, the research team proposes a multimodal benchmark, Self-Contradictory Instructions (SCI), to evaluate the ability of multimodal large models to detect conflicting instructions.

SCI contains 20,000 conflicting instructions across 8 tasks, evenly distributed over two paradigms: language-language and vision-language.

In the upper part of the diagram, the language-language paradigm deals with conflicts between context and instruction, such as designed rule conflicts, object attribute conflicts, exclusive instructions, and forbidden words.

[Figure: the SCI benchmark's eight tasks, with the language-language paradigm in the upper half and the vision-language paradigm in the lower half]

In the lower part of the diagram, the vision-language paradigm covers multimodal conflicts, such as OCR text-recognition conflicts, figure conflicts, geometric conflicts, and semantic conflicts. Of the eight tasks, only semantic conflict involves an external dataset (ImageNet).
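As a quick reference, the benchmark's layout can be captured in a small data structure. The following is a minimal sketch in Python, with task names paraphrased from this article rather than copied from the paper:

```python
# Minimal sketch of the SCI benchmark layout described above.
# Task names are paraphrased from this article, not the paper's exact labels.
SCI_TASKS = {
    "language-language": [    # conflicts between context and instruction
        "rule conflict",
        "object attribute conflict",
        "exclusive instruction",
        "forbidden word",
    ],
    "vision-language": [      # conflicts across modalities
        "OCR conflict",
        "figure conflict",
        "geometric conflict",
        "semantic conflict",  # the only task built on an external dataset (ImageNet)
    ],
}

# 20,000 instructions are spread evenly over these 8 tasks.
assert sum(len(tasks) for tasks in SCI_TASKS.values()) == 8
```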

As a specific example, to construct a semantic conflict the researchers first generate text corresponding to an image, and then replace the key semantic information in the text with a new semantic that is similar but different.

In the example below, the image contains an ostrich, and for the semantic "ostrich" the authors generate the question "Does the picture depict the ostrich's size?"

Subsequently, the key semantic of the question, "ostrich", is replaced with "kiwi". In this way, a pair of contradictory multimodal instructions is constructed.

[Figure: a photo of an ostrich paired with the contradictory question about a "kiwi"]
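To make the construction concrete, here is a rough sketch of the swap step. The function and field names are hypothetical illustrations, and the real pipeline generates the text with large language models rather than a fixed template:

```python
# Hypothetical sketch of building one semantic-conflict sample:
# take a question grounded in the image, then swap its key semantic
# for a similar-but-different one so question and image contradict.

def build_semantic_conflict(image_path: str, key_semantic: str, replacement: str) -> dict:
    """Pair an image with a question whose key semantic no longer matches it."""
    grounded_question = f"Does the picture depict the {key_semantic}'s size?"
    conflicting_question = grounded_question.replace(key_semantic, replacement)
    return {
        "image": image_path,                  # e.g. an ImageNet photo of an ostrich
        "instruction": conflicting_question,  # now asks about a kiwi the image lacks
        "label": "conflict",                  # ground truth: the pair contradicts
    }

sample = build_semantic_conflict("ostrich.jpg", "ostrich", "kiwi")
print(sample["instruction"])  # Does the picture depict the kiwi's size?
```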

For the construction of SCI, the authors designed AutoCreate, an innovative framework for automatic dataset creation.

It leverages programs and large language models to build a multimodal loop that automates dataset creation.

AutoCreate starts from a small amount of task-related seed data and maintains a seed pool. In each cycle, AutoCreate runs two branches: a language branch (left) and a vision branch (right), each consisting of generators and decorators.

[Figure: the AutoCreate framework, with a language branch (left) and a vision branch (right) cycling around a seed pool]

Finally, a cleaner excludes data that does not meet the criteria. After passing a quality check by human experts, the data is fed back into the seed pool for the next round.
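Put together, one AutoCreate cycle can be sketched roughly as below. Every callable here is a hypothetical stand-in for the programs and large language models the framework actually drives:

```python
# Rough sketch of a single AutoCreate cycle as described above.

def autocreate_cycle(seed_pool, generate, decorate, clean, human_check):
    """One round: generate from seeds, decorate, clean, verify, refill the pool."""
    candidates = []
    for branch in ("language", "vision"):           # the cycle's two branches
        drafts = generate(seed_pool, branch)        # generator: draft samples from seeds
        candidates += decorate(drafts, branch)      # decorator: enrich and diversify them
    kept = [d for d in candidates if clean(d)]      # cleaner: drop off-spec data
    verified = [d for d in kept if human_check(d)]  # expert quality check
    seed_pool.extend(verified)                      # feed back as next round's seeds
    return seed_pool
```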

AutoCreate greatly improved the speed and breadth of SCI dataset creation.

How to improve the ability to detect contradictions?

Using the SCI dataset, the researchers comprehensively evaluated how large models perform when handling contradictory instructions.

The experimental results show that current large models consistently fall short when faced with contradictory instructions.

They can process information and knowledge, but lack the ability to assess whether an instruction is reasonable, which the research team calls "cognition".

This deficiency stems from a lack of self-awareness and an inability to recognize inconsistencies in instructions.

Therefore, the researchers proposed a simple plug-in prompting method called Cognitive Awakening Prompting (CAP).

By adding a simple prompt to the input, CAP injects cognitive ability from the outside world, improving the large model's contradiction detection with essentially no negative side effects.
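Since CAP amounts to a prompt added to the input, a minimal sketch looks like this. The wording of the injected sentence is an assumption for illustration, not the paper's exact prompt:

```python
# Minimal sketch of Cognitive Awakening Prompting: prefix a short
# "wake-up" instruction so the model checks for conflicts first.
# The prompt text is an illustrative assumption, not the paper's wording.

CAP_PROMPT = ("Before answering, check whether the instruction conflicts with "
              "the context or the image; if it does, point the conflict out.")

def with_cap(user_input: str) -> str:
    """Inject external 'cognition' by simply prepending the prompt."""
    return f"{CAP_PROMPT}\n\n{user_input}"

print(with_cap("This kitten is cute, is it a female cat?"))
```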

This finding suggests that current multimodal large models need more self-awareness and cognitive ability to handle complex instruction conflicts well.


For more details, interested readers can refer to the original paper.

About the Authors

The first author of the paper is Jin Gao, a doctoral student at Shanghai Jiao Tong University.

His research interests include computer vision, multimodal large models, and AI-enabled life sciences.


The corresponding author of the paper is Dequan Wang, a tenure-track assistant professor and doctoral advisor at Shanghai Jiao Tong University. He received his bachelor's degree from Fudan University and his Ph.D. from the University of California, Berkeley, under the supervision of Professor Trevor Darrell.

His research has been published at top international conferences such as CVPR, ICCV, ECCV, ICLR, ICML, ICRA, and IROS; his papers have been cited more than 10,000 times over the past five years, and his h-index is 20.
