This tool helps defend AI models against adversarial attacks

The number of potential machine learning applications has grown tremendously in recent years as artificial intelligence models have become more powerful. Machine learning is already part of daily life, from recommendation algorithms to self-driving cars, and it is being applied in new ways in fields such as research and finance. Most promising of all, machine learning models may one day revolutionize health care, and could even help us tackle impossibly complex problems such as climate change mitigation.

But despite their huge potential, machine learning models come with no guarantees, and they can make mistakes, sometimes with serious consequences. These unintended effects are all the more worrisome as image recognition algorithms are increasingly used to assess people's biometric data. At the same time, it is becoming clear that these same models can be easily fooled by edited photos. Unfortunately, the "black box" nature of AI makes it difficult to determine why models make the decisions, and the mistakes, that they do, which highlights the importance of making models more robust.

A team of researchers from Kyushu University in Japan is doing just that, by developing a new way to assess how neural networks handle unfamiliar elements during image recognition tasks. Dubbed Raw Zero-Shot, the technique may be one tool to help researchers identify the features that lead AI models to make these mistakes, and ultimately discover how to create more robust AI models.

“There are a range of real-world applications of neural networks for image recognition, including self-driving cars and diagnostic tools in healthcare,” explained study lead author Danilo Vasconcellos Vargas in a statement. “However, no matter how well an AI is trained, it can fail with even a slight change in the image.”

AI models used for image recognition are typically trained on a large number of images. While some of the bigger models can be very powerful given their size, recent work has shown that editing an input image by even a single pixel can cause a system to fail. These intentionally altered images are called adversarial images, and they can be used as part of a coordinated attack on AI-powered systems.
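To make the single-pixel idea concrete, here is a minimal toy sketch. It uses a hypothetical linear classifier over a flattened 8×8 "image", not the actual models from the study, and a deliberately crude attack that only needs to change one pixel to flip the prediction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an image classifier: a linear model over a flattened
# 8x8 "image". Purely illustrative, not the systems studied in the paper.
image = rng.random((8, 8))
w = rng.standard_normal(64)  # classifier weights

def predict(img):
    """Binary toy classifier: positive score -> "cat", else "dog"."""
    return "cat" if float(img.ravel() @ w) > 0 else "dog"

# Single-pixel attack: change only the pixel with the largest-magnitude
# weight, by just enough to flip the sign of the score. Real attacks
# additionally keep the perturbed pixel inside the valid pixel range.
score = float(image.ravel() @ w)
idx = int(np.argmax(np.abs(w)))
adversarial = image.copy()
adversarial.flat[idx] -= 1.1 * score / w[idx]  # overshoot by 10% to flip

print(predict(image), "->", predict(adversarial))
```

The point of the sketch is that the perturbation can be tiny and targeted: only one of the 64 pixels differs between the two inputs, yet the classifier's output changes.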

“Adversarial samples are noise-perturbed samples on which neural networks can fail in tasks such as image classification,” the team explained. “Since their discovery a few years ago, the quality and variety of adversarial samples have grown. These adversarial samples can be generated by a particular class of algorithms known as adversarial attacks.”

To investigate the root cause of these failures, the team focused on twelve of the most popular image recognition systems, testing them to see how they would react when shown images that were not part of their original training data sets. The team hypothesized that there would be correlations in the models' subsequent predictions: the AI models would be wrong, but they would be wrong in the same way.
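The idea behind this kind of test can be sketched in a few lines. The following is a hypothetical simulation, not the study's actual protocol: several toy classifiers that learned similar features are shown samples from a class none of them was trained on, and we measure how often pairs of models output the same (necessarily wrong) label:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sketch of the Raw Zero-Shot idea: show several classifiers
# samples from a class outside their label set, then check whether their
# wrong predictions agree more often than chance.
n_models, n_classes, dim, n_samples = 3, 10, 64, 200

# Models share a common weight component, standing in for architectures
# that learn similar features; independent noise keeps them distinct.
base = rng.standard_normal((n_classes, dim))
models = [base + 0.3 * rng.standard_normal((n_classes, dim))
          for _ in range(n_models)]

# Feature vectors for an "unfamiliar" class outside the label set.
unseen = rng.standard_normal((n_samples, dim))

# Each model still has to output one of its known labels (argmax score).
preds = np.array([(unseen @ W.T).argmax(axis=1) for W in models])

# Pairwise agreement: how often two models pick the same wrong label.
# Chance level here would be roughly 1 / n_classes = 0.1.
agreements = [float((preds[i] == preds[j]).mean())
              for i in range(n_models) for j in range(i + 1, n_models)]
print(agreements)
```

In this toy setup the agreement rates come out well above the 0.1 chance level, which is the kind of correlated-error signature the researchers were looking for in real networks.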

The team’s test results showed that, when faced with these unfamiliar images, the AI models were indeed consistently wrong in the same way. The team hypothesizes that the linear nature of some artificial neural networks is one of the main reasons they fail in similar ways, with other work suggesting that such models learn spurious structures that are easier to learn than the structures they were expected to learn.

“If we understand what the AI was doing and what it learned when processing unknown images, we can use that same understanding to analyze why AIs fail when confronted with images with single-pixel changes or subtle modifications,” Vargas explained. “Using knowledge gained from solving one problem by applying it to a different but related problem is known as transferability.”

In the course of their work, the researchers found that an artificial intelligence model called Capsule Networks (CapsNet) exhibited the greatest transferability of all the neural networks tested, while another model called LeNet came in second.

Ultimately, the team argues that AI development should focus not only on accuracy, but also on increasing the robustness and flexibility of models. According to the team, tools like Raw Zero-Shot can help experts determine why problems occur in their models, so that future systems can be designed to be less vulnerable to adversarial attacks.

“Most of these adversarial attacks can also be converted into real-world attacks, which presents a significant problem, as well as a security risk, for current neural network applications,” the team noted. “Although there are many variants of defenses against these adversarial attacks, no learning algorithm or procedure is known to defend [against these attacks] consistently. This shows that a more in-depth understanding of adversarial algorithms is needed to craft consistent and robust defenses.”

