The Physics arXiv Blog

Adversarial images are pictures that contain carefully crafted patterns designed to fool computer vision systems. These patterns cause otherwise powerful face or object recognition systems to misidentify things or faces that they would normally recognize.

This type of deliberate deception has important implications, since malicious users can exploit it to bypass security systems.

It also raises interesting questions about other types of computational intelligence, such as text-to-image systems, in which a user enters a word or phrase and a specially trained neural network uses it to create a photorealistic image. Are these systems also vulnerable to adversarial attack, and if so, how?

Today we have an answer thanks to the work of Raphaël Millière, an artificial intelligence researcher at Columbia University in New York. Millière has discovered a way to trick text-to-image generators using made-up words designed to elicit specific responses.

Adverse effects

The work again raises security concerns. “Adversarial attacks can be deliberately and maliciously deployed to trick neural networks into misclassifying inputs or generating problematic outputs, which can have adverse real-life consequences,” says Millière.

In recent months, text-to-image systems have advanced to the point where users can enter a phrase, such as “an astronaut riding a horse”, and receive a surprisingly realistic image in response. These systems aren’t perfect, but they’re impressive nonetheless.

Nonsense words can trick people into imagining certain scenes. A famous example is Lewis Carroll’s poem Jabberwocky: “’Twas brillig, and the slithy toves / Did gyre and gimble in the wabe…” For most people, reading it conjures up fantastical images.

Millière wondered whether text-to-image systems might be similarly vulnerable. He uses a technique called “macaronic prompting” to create nonsense words by combining parts of real words from different languages. The word “cliff”, for example, is “Klippe” in German, “scogliera” in Italian, “falaise” in French and “acantilado” in Spanish. Millière took parts of these words to create the nonsense term “falaiscoglieklippantilado”.
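As a concrete illustration, here is a minimal Python sketch of that splicing step, assuming the fragment boundaries implied by the article’s example (the paper’s actual procedure for choosing cut points may differ):

```python
# Toy illustration of macaronic prompting: splice fragments of translations
# of the same English word ("cliff") into one nonce word. The cut points
# below are an assumption chosen to reproduce the article's example, not
# the paper's exact procedure.

fragments = [
    "falai",     # from French  "falaise"
    "scoglie",   # from Italian "scogliera"
    "klipp",     # from German  "Klippe"
    "antilado",  # from Spanish "acantilado"
]

nonce_word = "".join(fragments)
print(nonce_word)  # falaiscoglieklippantilado

# The nonce word can then be dropped into an otherwise ordinary prompt:
prompt = f"a photograph of a {nonce_word}"
print(prompt)
```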

To his surprise, feeding that word into the DALL-E 2 text-to-image generator produced a set of cliff images. He created other words in the same way, with comparable results: “insectafetti” for bugs, “farpapmaripterling” for a butterfly, “coniglapkaninc” for a rabbit, and so on. In each case, the generator produced realistic images of the corresponding English word.

Millière even created sentences out of these made-up words. For example, the sentence “Eidelucertlagarzard eats maripofarterling” produced images of a lizard swallowing a butterfly. “Preliminary experiments suggest that hybridized nonce strings can be methodically crafted to generate images of virtually any object as needed, and even combined together to generate more complex scenes,” he says.

A farpapmaripterling landing on a feuerpompbomber, as imagined by the DALL-E 2 text-to-image generator (Source: https://arxiv.org/abs/2208.04135)

Millière thinks this is possible because text-to-image generators are trained on a huge variety of images, some of which must have been labeled in foreign languages. This allows made-up words to encode information that the machine can understand.
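One plausible ingredient is subword tokenization: the model’s text encoder breaks an unfamiliar string into smaller pieces, some of which coincide with pieces of real words seen during training. The snippet below uses OpenAI’s tiktoken library purely as a stand-in tokenizer to show this kind of splitting; DALL-E 2’s text encoder uses its own tokenizer, so the actual pieces will differ:

```python
# Show how a byte-pair-encoding tokenizer splits a made-up word into
# familiar subword pieces. tiktoken's cl100k_base vocabulary is only a
# stand-in here; DALL-E 2's text encoder tokenizes differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
nonce_word = "falaiscoglieklippantilado"

token_ids = enc.encode(nonce_word)
pieces = [enc.decode([tid]) for tid in token_ids]

print(pieces)  # several short fragments rather than one unknown token
```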

The ability to fool text-to-image generators raises a number of concerns. Millière points out that technology companies take great care to prevent illegal use of their technology.

“An obvious concern with this method is bypassing content filters based on blacklisted prompts,” Millière says. “In principle, macaronic prompting can provide an easy and seemingly reliable way to bypass such filters in order to generate harmful, offensive, illegal or otherwise sensitive content, including violent, hateful, racist, sexist or pornographic images, and perhaps images infringing intellectual property rights or depicting real people.”

Unwanted images?

He suggests that one way to prevent the creation of unwanted images is to remove all examples of them from the data sets used to train the AI system. Another option is to check every image the system creates by feeding it into an image-to-text system before making it public, and to filter out any that produce unwanted text descriptions.
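As a rough sketch of that second option, the round-trip check might look like the following, where generate_image and caption_image are placeholders for a text-to-image model and an image-to-text (captioning) model, and the blocklist is purely illustrative:

```python
# Sketch of a round-trip content filter: caption each generated image with
# an image-to-text model and reject it if the caption matches a blocklist.
# generate_image, caption_image and BLOCKED_TERMS are placeholders, not a
# real API.

BLOCKED_TERMS = {"violence", "weapon", "nudity"}  # illustrative only


def is_safe(image, caption_image) -> bool:
    """Return True unless the image's caption mentions a blocked term."""
    caption = caption_image(image).lower()
    return not any(term in caption for term in BLOCKED_TERMS)


def safe_generate(prompt, generate_image, caption_image):
    """Generate an image and release it only if the round-trip check passes."""
    image = generate_image(prompt)          # text-to-image model
    if not is_safe(image, caption_image):   # image-to-text round trip
        raise ValueError("image rejected by content filter")
    return image
```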

At the moment, public access to text-to-image generators is limited. Of the three most advanced, Google has developed two, Parti and Imagen, and is not making them available to the public because of various biases it has found in their inputs and outputs.

The third system, DALL-E 2, was developed by OpenAI and is available to a limited number of researchers, journalists and others. This is the one Millière used.

One way or another, these systems, or others like them, are bound to become more widely used, so understanding their limitations and weaknesses is important to inform public debate. A key question for technology companies, and more broadly for society, is how these systems should be used and regulated. Such a debate is urgently needed.


Reference: Adversarial Attacks on Image Generation With Made-Up Words: arxiv.org/abs/2208.04135
