Originally posted on The Horizons Tracker.
The Turing Test famously aims to test the abilities of artificial intelligence by tasking humans with uncovering when they’re talking to a person and when they’re talking to a machine. It tests whether AI can understand human language well enough to hold natural-seeming conversations.
As anyone who has tried to have a conversation with an AI-powered chatbot or virtual assistant can attest, there remains some way to go before technology can master this most human of abilities. New research¹ from the University of Maryland aims to help AI progress by identifying some 1,200 questions that, while pretty easy for humans to answer, have traditionally stumped the best technology available today.
“Most question-answering computer systems don’t explain why they answer the way they do, but our work helps us see what computers actually understand,” the researchers explain. “In addition, we have produced a dataset to test on computers that will reveal if a computer language system is actually reading and doing the same sorts of processing that humans are able to do.”
The researchers explain that many of the Q&A systems in operation today rely on either humans or computers to generate the questions that are designed to train the systems. The problem with this approach is that it’s not easy to understand why the computers are struggling to answer the questions correctly. The researchers believe that by better understanding what stumps the machines, we can better design datasets to train them.
The team developed a system that was capable of showing its thought processes as it attempted to answer each question, which they believe would not only give insight into the processes the computer was going through but also, if deployed in a live environment, allow the human questioner to modify their line of enquiry.
This partnership between human and machine yielded 1,213 questions that defeated the computer on its own yet could readily be answered by people.
“For three or four years, people have been aware that computer question-answering systems are very brittle and can be fooled very easily,” the authors explain. “But this is the first paper we are aware of that actually uses a machine to help humans break the model itself.”
The team believe that the questions will serve as a valuable benchmark to inform work in natural language processing, while also acting as a training dataset, especially as they uncovered six distinct phenomena that baffled AI-based systems.
These failures fell into two broad camps: linguistic ones, such as paraphrasing or unexpected context, and failures of reasoning, such as triangulating the various elements of a question or needing multiple steps to reach a conclusion.
“Humans are able to generalize more and to see deeper connections,” Boyd-Graber explains. “They don’t have the limitless memory of computers, but they still have an advantage in being able to see the forest for the trees. Cataloguing the problems computers have helps us understand the issues we need to address, so that we can actually get computers to begin to see the forest through the trees and answer questions in the way humans do.”
Suffice it to say, there is a little way to go before this kind of scenario emerges, but the research is an interesting indication of the progress being made in enabling machines to better navigate the nuances of human language.
Article source: The Questions That Reveal How Machines Think.
1. Wallace, E., Rodriguez, P., Feng, S., Yamada, I., & Boyd-Graber, J. (2018). Trick Me If You Can: Human-in-the-loop Generation of Adversarial Examples for Question Answering. arXiv preprint arXiv:1809.02701.