Microsoft Creates New Bot That Can Draw Based on Text Instructions
Microsoft researchers have created a bot that draws images from text instructions.
While text-to-image generation isn’t a new technology, Microsoft’s efforts have produced a “nearly three-fold boost in image quality compared to the previous state-of-the-art technique”.
The AI technology is “programmed to pay close attention to individual words when generating images from caption-like text descriptions”.
Researchers have nicknamed this technology the “drawing bot”, and say it can generate images of everything from commonplace objects to the absurd. Every image contains details that are absent from the user’s description, which the researchers take as a sign that the bot has a form of artificial imagination.
“If you go to Bing and you search for a bird, you get a bird picture. But here, the pictures are created by the computer, pixel by pixel, from scratch,” said Xiaodong He, principal researcher and research manager in one of Microsoft’s research labs.
“These birds may not exist in the real world — they are just an aspect of our computer’s imagination of birds.”
The new technology combines two areas of artificial intelligence: natural-language processing and computer vision. The project began with a bot that could generate text captions from photos, followed by one that could answer human-generated questions about images.
The drawing bot works in the opposite direction, generating an image from a text description, and it has to fill in whatever the description leaves out. It draws these missing details from what it has learned from previous drawings and photos, a demonstration of machine learning at work.
Microsoft built the drawing bot on a technology known as a Generative Adversarial Network (GAN).
“The network consists of two machine learning models, one that generates images from text descriptions and another, known as a discriminator, that uses text descriptions to judge the authenticity of generated images,” Microsoft states on its blog.
“The generator attempts to get fake pictures past the discriminator; the discriminator never wants to be fooled. Working together, the discriminator pushes the generator toward perfection.”
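The generator-versus-discriminator contest described above can be illustrated with a toy sketch. This is not Microsoft's model: the "images" here are single numbers, the two captions and their target values are invented for illustration, and the names `Generator`, `Discriminator`, `train_step` and `train` are hypothetical. It only shows the shape of the training loop, in which the discriminator is rewarded for telling real from fake given the caption, and the generator is rewarded for fooling it.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


# Hypothetical toy data: each caption maps to a 1-D "image" value.
REAL_MEAN = {"small bird": -2.0, "large bird": 2.0}


def sample_real(caption, rng):
    # A "real image" for this caption: a noisy value around its mean.
    return rng.gauss(REAL_MEAN[caption], 0.5)


class Generator:
    """Turns a caption plus random noise into a fake 'image'."""

    def __init__(self):
        self.offset = {c: 0.0 for c in REAL_MEAN}

    def generate(self, caption, noise):
        return self.offset[caption] + 0.5 * noise


class Discriminator:
    """Estimates the probability that an 'image' is real, given its caption."""

    def __init__(self):
        self.w = {c: 0.0 for c in REAL_MEAN}
        self.b = {c: 0.0 for c in REAL_MEAN}

    def prob_real(self, x, caption):
        return sigmoid(self.w[caption] * x + self.b[caption])


def train_step(gen, disc, caption, rng, lr=0.05):
    real = sample_real(caption, rng)
    fake = gen.generate(caption, rng.gauss(0.0, 1.0))

    # Discriminator update: gradient ascent on log D(real) + log(1 - D(fake)).
    p_real = disc.prob_real(real, caption)
    p_fake = disc.prob_real(fake, caption)
    disc.w[caption] += lr * ((1.0 - p_real) * real - p_fake * fake)
    disc.b[caption] += lr * ((1.0 - p_real) - p_fake)

    # Generator update: gradient ascent on log D(fake), i.e. try to fool
    # the freshly updated discriminator.
    p_fake = disc.prob_real(fake, caption)
    gen.offset[caption] += lr * (1.0 - p_fake) * disc.w[caption]


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    gen, disc = Generator(), Discriminator()
    captions = list(REAL_MEAN)
    for _ in range(steps):
        train_step(gen, disc, rng.choice(captions), rng)
    return gen, disc


if __name__ == "__main__":
    gen, disc = train()
    for caption in REAL_MEAN:
        print(caption, "generator offset:", round(gen.offset[caption], 2))
```

Each step nudges the generator toward outputs the discriminator currently rates as real for that caption. Training this simple can oscillate rather than settle, so the printed offsets should be read as a rough illustration of the dynamic, not as a result.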
In the future, Microsoft’s drawing bot could be used as a tool by painters, interior designers, and perhaps even filmmakers, who could use the technology to generate animated films based on screenplays.