Adversarial images are pictures that contain carefully crafted patterns designed to fool computer vision systems. The patterns cause otherwise powerful face or object recognition systems to misidentify objects or faces they would normally recognize.
This sort of deliberate trickery has important implications since malicious users might use it to bypass security systems.
It also raises interesting questions about other forms of computational intelligence, such as text-to-image systems. Users enter a word or phrase and a specially trained neural network uses it to conjure up a photorealistic image. But are these systems also vulnerable to adversarial attack and, if so, how?
Today we get an answer thanks to the work of Raphaël Millière, an artificial intelligence researcher at Columbia University in New York. Millière has discovered a way to trick text-to-image generators using made-up words designed to trigger specific responses.
The work again raises security issues. Millière says that adversarial attacks could be intentionally and maliciously deployed to trick neural networks into misclassifying inputs or generating problematic outputs, which might have real-life adverse consequences.
In recent years, text-to-image systems have advanced to the point that users can enter a phrase, such as “an astronaut riding a horse,” and get a surprisingly realistic image in response. These systems aren’t perfect, but they are impressive nonetheless.
Nonsense words can trick humans into imagining certain scenes. A famous example is the Lewis Carroll poem Jabberwocky: “’Twas brillig, and the slithy toves / Did gyre and gimble in the wabe.” For many people, reading it conjures up fantastical images.
Millière wondered whether text-to-image systems could be similarly vulnerable. He used a technique called macaronic prompting to generate nonsense words by combining parts of real words from different languages. For example, the word cliff is Klippe in German, scogliera in Italian, falaise in French and acantilado in Spanish. Millière took parts of these words to create the nonsense term falaiscoglieklippantilado.
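The construction can be illustrated with a short Python sketch. The slice positions below are an assumption, chosen by hand so that the pieces reproduce the cliff example; the function name `macaronic` and the fragment list are hypothetical and are not taken from the paper, which may select fragments differently.

```python
def macaronic(fragments):
    """Join hand-picked slices of words into a single nonsense string.

    Each fragment is (word, start, end), using ordinary Python slice
    indices; end=None means "to the end of the word".
    """
    return "".join(word.lower()[start:end] for word, start, end in fragments)


# Slices chosen by hand (an assumption) to match the article's example.
cliff_fragments = [
    ("falaise", 0, 6),        # French  -> "falais"
    ("scogliera", 1, 7),      # Italian -> "coglie"
    ("Klippe", 0, 5),         # German  -> "klipp"
    ("acantilado", 2, None),  # Spanish -> "antilado"
]

print(macaronic(cliff_fragments))  # falaiscoglieklippantilado
```

Choosing which parts of each word to keep is the creative step; the sketch only shows the concatenation.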
To his surprise, entering this word into the DALL-E 2 text-to-image generator produced images of cliffs. He created other words in the same way, with similar results: insekafetti for bugs, farpapmaripterling for butterfly, coniglapkaninc for rabbit and so on. In each case, the generator produced realistic images matching the English word.
Millière even produced sentences from the made-up words. For instance, the sentence “An eidelucertlagarzard eating a maripofarterling” produced images of a lizard devouring a butterfly. He says the preliminary experiments suggest that hybridized nonce strings could be methodically crafted to generate images of almost any subject as needed, and even combined to generate more complex scenes.
“A farpapmaripterling lands on a feuerpompbomber,” as imagined by the text-to-image generator DALL-E 2 (Source: https://arxiv.org/abs/2208.04135)
Millière thinks this is possible because text-to-image generators are trained on a wide variety of pictures, some of which must have been labelled in foreign languages. This allows the made-up words to encode information that the system can understand.
The ability to fool text-to-image generators raises several concerns. Millière points out that technology companies take great care to prevent illicit use of their technologies.
Millière says a clear concern with this technique is the circumvention of content filters based on blacklisted prompts. In principle, macaronic prompting could offer an easy and seemingly reliable way to bypass such filters in order to generate harmful, offensive, illegal, or otherwise sensitive content, including violent, hateful, racist, sexist, or pornographic images, and perhaps images that infringe on intellectual property or depict real individuals.
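To see why a prompt blocklist is easy to sidestep, consider a minimal Python sketch. The filter below is hypothetical and deliberately simple; it is not the filter used by DALL-E 2 or any other real system, and the harmless word "cliff" stands in for a sensitive term.

```python
# Hypothetical blocklist: "cliff" stands in for a sensitive term.
BLOCKED_TERMS = {"cliff"}


def prompt_is_blocked(prompt, blocked=BLOCKED_TERMS):
    """Return True if any word in the prompt is on the blocklist."""
    tokens = prompt.lower().split()
    # Strip common trailing punctuation before comparing each token.
    return any(token.strip(".,!?") in blocked for token in tokens)


print(prompt_is_blocked("a photo of a cliff"))                      # True
print(prompt_is_blocked("a photo of a falaiscoglieklippantilado"))  # False
```

The made-up word matches nothing on the list, so the prompt passes the filter even though the generator may still interpret it as a cliff.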
He suggests that one way of preventing the creation of unwanted imagery is to remove any examples of it from the data sets used to train the AI system. Another option is to check all the images the system generates by feeding them into an image-to-text system before making them public, and to filter out any that produce unwanted text descriptions.
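The second option, filtering on the output side, might look roughly like the following Python sketch. Everything here is an assumption made for illustration: `caption_fn` stands in for a real image-to-text model, and the lookup table of captions replaces actual images so the example can run on its own.

```python
def filter_by_caption(images, caption_fn, blocked_terms):
    """Keep only the images whose generated caption mentions no blocked term."""
    kept = []
    for image in images:
        caption = caption_fn(image).lower()
        if not any(term in caption for term in blocked_terms):
            kept.append(image)
    return kept


# Stand-in captioner (hypothetical): a lookup table, not a real model.
fake_captions = {
    "img_001": "a lizard eating a butterfly",
    "img_002": "a rabbit in a field",
}

safe = filter_by_caption(fake_captions.keys(), fake_captions.get, {"lizard"})
print(safe)  # ['img_002']
```

Because the check runs on what the image shows and not on the prompt text, it does not depend on how the prompt was spelled, although it is only as reliable as the captioning model.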
For the moment, opportunities to interact with text-to-image generators are limited. Of the three most advanced systems, Google has developed two, Parti and Imagen, and isn’t making them available to the public because of various biases it has discovered in their inputs and outputs.
The third system, DALL-E 2, was developed by OpenAI and is available to a limited number of researchers, journalists and others. This is the one Millière used.
One way or another, these systems, or others like them, are bound to become more widely used, so understanding their limitations and weaknesses is important for informing public debate. A key question for technology companies, and more broadly for society, is how these systems should be used and regulated. Such debate is urgently needed.
Ref: Adversarial Attacks on Image Generation With Made-Up Words: arxiv.org/abs/2208.04135