
Question Answering System

A question answering system (QA system) is an information system that can accept questions and answer them in natural language; in other words, it is a system with a natural-language interface.

Contents

  • 1 Classification
  • 2 Architecture
  • 3 Workflow
  • 4 Problems
  • 5 Directions for the development of question-answering systems
  • 6 Quality assessment of question-answering systems
  • 7 See also
  • 8 Notes
  • 9 Literature
  • 10 Links

Classification

Question answering systems can be divided into:

  • Closed-domain (highly specialized) QA systems, which work in specific areas such as medicine or car maintenance.
  • Open-domain (general) QA systems, which work with information from all fields of knowledge, which also makes it possible to search in related fields.

Architecture

The first QA systems [1] were developed in the 1960s and were natural-language shells for expert systems focused on specific domains. Modern systems are designed to search for answers to questions in supplied documents, using natural language processing (NLP) technologies.

Modern QA systems usually include a special module, a question classifier, which determines the type of question and, accordingly, the type of the expected answer. After this analysis, the system applies increasingly complex and subtle NLP methods to the supplied documents, discarding irrelevant information at each step. The coarsest method, document retrieval, uses an information-retrieval system to select parts of the text that potentially contain an answer. A filter then selects phrases matching the expected answer type (for example, for a question beginning with "Who ...", the filter returns text fragments containing people's names). Finally, an answer-selection module finds the correct answer among these phrases.
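The stages just described — question classification, coarse retrieval, type-based filtering, and answer selection — can be sketched as a toy pipeline. The document collection, answer-type patterns, and stop words below are illustrative assumptions, not part of any real system:

```python
import re

# Toy document collection (an assumption for illustration).
DOCUMENTS = [
    "Alexander Fleming discovered penicillin in 1928.",
    "Penicillin is an antibiotic derived from Penicillium moulds.",
    "Marie Curie discovered radium and polonium.",
]

def classify_question(question):
    """Question classifier: map the question word to an expected answer type."""
    q = question.lower()
    if q.startswith("who"):
        return "PERSON"
    if q.startswith("when"):
        return "DATE"
    return "OTHER"

def retrieve(question, documents):
    """Coarse retrieval: keep documents sharing content words with the question."""
    q_words = set(re.findall(r"\w+", question.lower())) - {"who", "when", "what", "did"}
    return [d for d in documents if q_words & set(re.findall(r"\w+", d.lower()))]

def filter_candidates(answer_type, documents):
    """Filter: extract fragments matching the expected answer type."""
    patterns = {
        "PERSON": r"[A-Z][a-z]+ [A-Z][a-z]+",   # naive 'Firstname Lastname'
        "DATE": r"\b\d{4}\b",                   # naive four-digit year
    }
    pattern = patterns.get(answer_type, r".+")
    candidates = []
    for doc in documents:
        candidates.extend(re.findall(pattern, doc))
    return candidates

def answer(question):
    """Answer selection: here, simply the first surviving candidate."""
    docs = retrieve(question, DOCUMENTS)
    candidates = filter_candidates(classify_question(question), docs)
    return candidates[0] if candidates else None

print(answer("Who discovered penicillin?"))       # → Alexander Fleming
print(answer("When was penicillin discovered?"))  # → 1928
```

A real system would replace each regex with statistical or neural NLP components, but the division of labor between the stages is the same.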

Workflow

The performance of a question answering system depends on the effectiveness of its text-analysis methods and on the quality of the underlying text collection: if the collection contains no answers to the questions, the QA system will find little. The larger the collection, the better, but only if it contains the necessary information. Large repositories (such as the Internet) contain a great deal of redundant information [2]. This has the following consequences:

  1. Because the same information is presented in many different forms, coverage is higher, and the QA system is more likely to find an answer.
  2. Correct information is often repeated, so errors in answer retrieval can be reduced by favoring frequently occurring candidates.
  3. The accuracy of answer retrieval depends substantially on the accuracy of the information in the repositories, as well as on the effectiveness of the methods used to analyze information and generate answers.
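Point 2 above — exploiting repetition in a redundant collection — can be illustrated with a simple voting scheme: candidate answers extracted from many documents are counted, and the most frequent one wins. The candidate list below is invented for illustration:

```python
from collections import Counter

def select_by_redundancy(candidates):
    """Pick the candidate answer repeated most often across documents.

    In a large, redundant collection the correct answer tends to recur,
    so frequency acts as a cheap confidence score."""
    if not candidates:
        return None, 0
    best, freq = Counter(candidates).most_common(1)[0]
    return best, freq

# Hypothetical candidates extracted from different web pages for
# the question "When was Ivan the Terrible born?"
extracted = ["1530", "1530", "1533", "1530", "1528"]
best_answer, votes = select_by_redundancy(extracted)
print(best_answer, votes)  # → 1530 3
```

Occasional extraction errors ("1533", "1528") are outvoted by the repeated correct value, which is exactly the effect described in point 2.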

Problems

In 2002, a group of researchers published a research roadmap for question answering systems [3]. It proposed considering the following issues:

Question types
Different questions require different methods for finding answers. It is therefore necessary to compile or improve taxonomies of possible question types.
Question processing
The same information can be requested in different ways. Effective methods are needed for understanding and processing the semantics (meaning) of a sentence. The program should recognize questions that are equivalent in meaning regardless of the style, wording, syntactic structure, and idioms used. Ideally, a QA system would also split complex questions into several simpler ones and correctly interpret context-dependent phrases, possibly clarifying them with the user in a dialogue.
Context
Questions are asked in a specific context. Context can refine a query, resolve ambiguity, or track the user's train of thought across a series of questions.
Knowledge sources for the QA system
Before answering a question, it is worth establishing which text collections are available. Whatever text-processing methods are used, the right answer cannot be found if it is not in the collections.
Answer extraction
The correct implementation of this procedure depends on a huge number of factors: the complexity of the question, its type, the context, the quality of the available texts, the search method, and so on. Text-processing methods must therefore be studied with care, and this problem deserves special attention.
Answer formulation
The answer should be as natural as possible. In some cases, simply extracting it from the text is sufficient. For example, if the question asks for a name (a person's name, the name of a device or an illness), a quantity (an exchange rate, a length, a size), or a date ("When was Ivan the Terrible born?"), a direct answer is enough. Sometimes, however, complex queries arise, and special algorithms are needed to merge answers from different documents.
Real-time question answering
Systems are needed that can find answers in repositories within a few seconds, regardless of the complexity and ambiguity of the question or the size of the document collection.
Multilingual queries
Development of systems for working and searching in other languages (including automatic translation).
Interactivity
The information offered by a QA system as an answer is often incomplete, perhaps because the system determined the question type incorrectly or "understood" the question incorrectly. In this case, the user may want not only to reformulate the request but also to communicate with the program through dialogue.
Reasoning (inference) mechanisms
Some users would like answers that go beyond the available texts. For this, the QA system needs knowledge common to most domains (see general-purpose ontologies in computer science), as well as means of automatically inferring new knowledge.
QA user profiles
Information about the user, such as areas of interest, manner of speaking and reasoning, and default assumptions, could significantly improve system performance.
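The "Question processing" requirement above — recognizing questions that are equivalent in meaning despite different wording — can be hinted at with a toy normalization step: lowercase the question, drop function words, map variants to canonical forms, and compare the remaining content words. The stop-word list and synonym table are illustrative assumptions; real systems use far richer semantic analysis:

```python
import re

# Illustrative function words to ignore when comparing questions.
STOP_WORDS = {"the", "a", "an", "is", "was", "of", "who", "what", "which", "did"}
# Tiny synonym table mapping variants to a canonical form (an assumption).
SYNONYMS = {"invented": "discovered", "found": "discovered"}

def normalize(question):
    """Reduce a question to a set of canonical content words."""
    words = re.findall(r"\w+", question.lower())
    return frozenset(SYNONYMS.get(w, w) for w in words if w not in STOP_WORDS)

def equivalent(q1, q2):
    """Treat two questions as paraphrases if their content-word sets match."""
    return normalize(q1) == normalize(q2)

print(equivalent("Who discovered penicillin?",
                 "Who invented penicillin?"))        # → True
print(equivalent("Who discovered penicillin?",
                 "When was penicillin discovered?")) # → False
```

Note that the second pair differs in question word ("when" survives normalization), so the two are correctly kept apart even though their content words overlap.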

Directions for the development of question-answering systems

Since the first prototypes of question answering systems appeared, their scope has expanded significantly [4]. For example, they are used to answer questions about time, geolocation, and definitions of concepts, as well as bibliographic questions, multilingual questions, and questions involving multimedia (images, audio, and video). Related areas are also being studied, such as interactive QA systems (which ask clarifying questions to refine the original one), the reuse of answers and the representation of knowledge, the use of logical inference over available information to obtain answers, prediction of what questions may be asked, and sentiment analysis.

Quality assessment of question-answering systems

Question answering systems are regularly evaluated within the following projects: TREC [5], CLEF [6], NTCIR [7], and ROMIP [8].

See also

  • Virtual digital assistant: Siri (WolframAlpha)
  • Nigma, an intelligent search engine
  • IBM Watson

Notes

  1. ↑ Hirschman, L. & Gaizauskas, R. (2001). Natural Language Question Answering: The View from Here. Natural Language Engineering, 7(4): 275–300. Cambridge University Press.
  2. ↑ Lin, J. (2002). The Web as a Resource for Question Answering: Perspectives and Challenges. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002).
  3. ↑ Burger, J., Cardie, C., Chaudhri, V., Gaizauskas, R., Harabagiu, S., Israel, D., Jacquemin, C., Lin, C.-Y., Maiorano, S., Miller, G., Moldovan, D., Ogden, B., Prager, J., Riloff, E., Singhal, A., Shrihari, R., Strzalkowski, T., Voorhees, E., Weishedel, R. Issues, Tasks and Program Structures to Roadmap Research in Question Answering (QA).
  4. ↑ Maybury, M. T., editor. (2004). New Directions in Question Answering. AAAI/MIT Press.
  5. ↑ TREC competition
  6. ↑ CLEF evaluation campaign
  7. ↑ NTCIR project
  8. ↑ ROMIP

Literature

  • Dragomir R. Radev, John Prager, and Valerie Samn. Ranking suspected answers to natural language questions using predictive annotation. In Proceedings of the 6th Conference on Applied Natural Language Processing, Seattle, WA, May 2000.
  • Hovy, E., Gerber, L., Hermjakob, U., Junk, M. & Lin, C. (2000). Question Answering in Webclopedia. In: 9th Text Retrieval Conference.
  • Huettner, A. (2000). Question Answering. In: 5th Search Engine Meeting.
  • John Prager, Eric Brown, Anni Coden, and Dragomir Radev. Question-answering by predictive annotation. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Athens, Greece, July 2000.
  • Katz, B., Felshin, S. & Lin, J. (2002). The START Multimedia Information System: Current Technology and Future Directions. In: International Workshop on Multimedia Information Systems.
  • Wong, W. (2005). Practical Approach to Knowledge-based Question Answering with Natural Language Understanding and Advanced Reasoning. Master's thesis, National Technical University College of Malaysia.

Links

QA systems and demos
  • START, one of the first Internet-based question answering systems, on the MIT website.
  • The AskNet Search question answering system at asknet.ru (originally Stocona Search).
  • The BrainBoost question answering system at Answers.com (originally BrainBoost.com).
  • A QA system built into the Ask.com search engine.
  • OpenEphyra, an open-source question answering system.
  • AnswerBus.
  • AskEd!, a multilingual QA system (English, Japanese, Chinese, Russian, and Swedish; the Japanese, Chinese, Russian, and Swedish links have been unavailable since 13-05-2013).
  • True Knowledge's Evi project.
  • Ephyra.
  • Robochat, a Russian-language question answering system.
Specialized QA Systems
  • EAGLi: MEDLINE question answering engine .
Source: https://ru.wikipedia.org/w/index.php?title=Question_and_answer_system&oldid=102004527
