Science writers report scientific news to the public, sometimes taking on a more investigative and critical role. Before writing, they must work out how to explain the material so that readers without a scientific background can understand it.
Now, scientists at MIT and elsewhere have developed a neural network that can help science writers, at least to some extent. The AI can read scientific articles and produce a plain-English summary in a sentence or two.
Beyond summarization, the approach could also be applied to machine translation and speech recognition. The scientists were originally demonstrating a way to use artificial intelligence to tackle certain thorny problems in physics, but they realized the same approach could address other difficult computational problems, including natural language processing, in ways that might outperform existing neural network systems.
Soljačić said: "We have been doing various kinds of work in AI for a few years now. We use AI to help with our research, basically to do physics better. And as we got to know AI better, we would notice that every once in a while there is an opportunity to add to the field of AI something we know from physics – a certain mathematical construct or a certain law in physics. We noticed that if you use that, it could actually help with this or that particular AI algorithm."
"This approach can be useful in several specific kinds of tasks, but not in all. We can't say this is useful for all of AI, but there are instances where we can use an insight from physics to improve on a given AI algorithm."
Neural networks generally have difficulty correlating pieces of information from a long string of data, as is required in interpreting a research paper. Various tricks have been used to improve this capability, including techniques known as long short-term memory (LSTM) and gated recurrent units (GRU), but these still fall well short of what is needed for real natural language processing, the researchers say.
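To make the gating idea behind GRUs concrete, here is a minimal, illustrative sketch of a single GRU step in NumPy (biases omitted, toy random weights). The function name and dimensions are my own for illustration; this is a standard textbook GRU update, not code from the researchers:

```python
import numpy as np

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One step of a minimal GRU cell (illustrative; biases omitted).

    The gates decide how much of the past state to keep, which is
    how GRUs try to carry information across long sequences.
    """
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # blend old and new state

# Toy dimensions: 3-dim input, 4-dim hidden state, random weights.
rng = np.random.default_rng(0)
d_in, d_h = 3, 4
params = [rng.standard_normal((d_h, d_in)) if i % 2 == 0
          else rng.standard_normal((d_h, d_h)) for i in range(6)]
h = np.zeros(d_h)
for _ in range(10):                            # run over a 10-step sequence
    h = gru_step(rng.standard_normal(d_in), h, *params)
print(h.shape)                                 # (4,)
```

The update gate z mixes old and new state, so gradients can flow through the `(1 - z) * h` path; in practice this still degrades over very long sequences, which is the limitation the article describes.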
The team came up with an alternative system, which, instead of relying on matrix multiplication like most conventional neural networks, is based on rotating vectors in a multidimensional space. The key concept is something they call a rotational unit of memory (RUM).
Essentially, the system represents each word in the text by a vector in multidimensional space – a line of a certain length pointing in a particular direction. Each subsequent word rotates this vector in some direction, in a theoretical space that can ultimately have thousands of dimensions. At the end of the process, the final vector or set of vectors is translated back into its corresponding string of words.
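The geometric idea can be sketched with a small NumPy example: a "memory" vector on the unit sphere is rotated a little for each new input, and rotations preserve the vector's norm, so information is not squashed away as it would be by repeated gating. This is only an illustration of the norm-preserving rotation idea, not the authors' RUM implementation; the rotation construction below is my own:

```python
import numpy as np

def rotation_matrix(a, b):
    """Rotation taking unit vector a toward vector b in their shared
    plane, leaving the orthogonal complement fixed (illustrative)."""
    a = a / np.linalg.norm(a)
    u = b - (a @ b) * a                 # component of b orthogonal to a
    u = u / np.linalg.norm(u)
    theta = np.arccos(np.clip(a @ b / np.linalg.norm(b), -1.0, 1.0))
    I = np.eye(len(a))
    return (I + (np.cos(theta) - 1) * (np.outer(a, a) + np.outer(u, u))
              + np.sin(theta) * (np.outer(u, a) - np.outer(a, u)))

rng = np.random.default_rng(1)
h = rng.standard_normal(8)
h /= np.linalg.norm(h)                  # hidden "memory" on the unit sphere
for _ in range(5):                      # each "word" rotates the memory
    target = rng.standard_normal(8)
    h = rotation_matrix(h, target) @ h
print(round(float(np.linalg.norm(h)), 6))   # norm preserved: prints 1.0
```

Because each update is a rotation (an orthogonal transformation), the state's length stays exactly 1 no matter how many words are processed, which hints at why such units can carry information over longer stretches of text.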
Nakov said: "RUM helps neural networks do two things very well. It helps them remember better, and it enables them to recall information more accurately."
Marin Soljačić, an MIT professor of physics, said: "After developing the RUM system to help with certain tough physics problems, such as the behavior of light in complex engineered materials, we realized one of the places where we thought this approach could be useful would be natural language processing."
Tatalović, who was at the time exploring AI in science journalism as his Knight fellowship project, noted that such a tool would be useful for his work as an editor trying to decide which papers to write about.
"And so, we tried a few natural language processing tasks. One we tried was summarizing articles, and that seems to be working quite well."
The RUM-based system was then expanded so it could "read" entire research papers, not just their abstracts, and produce a summary of their content. The researchers even tried the system on their own research paper describing these findings – the paper that this article is attempting to summarize.
Here is the new neural network's own summary: "Researchers have developed a new representation process on the rotational unit of RUM, a recurrent memory that can be used to solve a broad spectrum of the neural revolution in natural language processing."
It may not be elegant prose, but it at least hits the key points of the information.
Çağlar Gülçehre, a research scientist at DeepMind Technologies, a UK-based AI company that was not involved in this work, says that this research addresses a major problem in neural networks: relating pieces of information that are widely separated in time or space.
He said, "This problem has been a very fundamental issue in AI because of the need to reason over long time delays in sequence-prediction tasks. Although I do not think this paper completely solves the problem, it shows promising results on long-term dependency tasks such as question answering, text summarization, and associative retrieval."
Gülçehre adds, "Since the experiments performed and the model proposed in this paper have been released as open source on GitHub, many researchers will be interested in trying it on their own tasks. To be more specific, the approach proposed in this paper could have a very large impact on the fields of natural language processing and reinforcement learning, where long-term dependencies are very important."
The research was supported by the Army Research Office, the National Science Foundation, the MIT-SenseTime Alliance on Artificial Intelligence, and the Semiconductor Research Corporation. The team also had help from Science Daily, whose articles were used to train some of the AI models in this study.
The work is described in a paper in the journal Transactions of the Association for Computational Linguistics.