tech2 News Staff, Nov 11, 2016 15:47:36 IST
Facebook is investing in long-term research into various AI applications. One research area is enabling computers to understand images and automatically generate captions for them. Facebook is training AI to classify, understand and explain the elements in a photo. In the last few years, computers have gone from automatically drawing blocks around objects in a scene to working out who or what the objects in an image are, and what they are doing.
Facebook's system can predict future movements, describe the positions of people, and describe the image in almost human terms. Wireframe overlays on the image allow the AI to figure out the poses of the people. A comparison of the poses of the people in the image allows the machine to better describe the action. Best of all, Facebook can do all of this in real time. Here is a video of Facebook testing the pose identification. The system tracks the body through various situations and poses, and does not get thrown off no matter how hard it is pushed.
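As a rough illustration of what comparing poses can mean in practice, here is a minimal sketch. It is not Facebook's method: it assumes each pose is a list of (x, y) keypoints such as wireframe joints, and the function names `normalize_pose` and `pose_distance` are hypothetical. Each pose is centred and scaled so that the person's position and size in the frame do not matter, and the score is the average gap between matching keypoints.

```python
import math


def normalize_pose(keypoints):
    """Centre the keypoints on their mean and scale them to unit RMS size."""
    n = len(keypoints)
    cx = sum(x for x, _ in keypoints) / n
    cy = sum(y for _, y in keypoints) / n
    centered = [(x - cx, y - cy) for x, y in keypoints]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    if scale == 0:
        # All keypoints coincide, so there is nothing to scale.
        return centered
    return [(x / scale, y / scale) for x, y in centered]


def pose_distance(pose_a, pose_b):
    """Average distance between matching keypoints of two normalized poses."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of keypoints")
    a = normalize_pose(pose_a)
    b = normalize_pose(pose_b)
    total = sum(math.hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)
```

A score near zero means two people are holding roughly the same pose, even if one of them is further from the camera. A larger score means the poses differ.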
The system is not foolproof though. There are cases when Facebook's AI does not come up with an appropriate caption. Machines "see" images very differently from humans, and something that is trivially easy for humans may be very difficult for machines. Facebook engineers are studying how humans learn, and are integrating these lessons into artificial intelligences to make them more human-like. This is an example of a caption being accurate, next to a caption that is just wrong.
The wrong caption arises because machines lack a contextual understanding of the world. To help machines get better contextual understanding, Facebook engineers feed in models that mimic conceptual understanding by humans. These are datasets where facts are correlated to concepts. However, not all the data in the world is available as models that can be fed into artificial intelligences, which is why Facebook is teaching the machines to harvest the data from unstructured sources, such as Wikipedia articles. Progress in the field has accelerated in recent times, and more rapid progress can be expected going forward.
Image captioning is an important exercise for tech companies training their artificial intelligences. Microsoft launched CaptionBot, which provides captions for images. The service is available on the web, and as a bot on Facebook Messenger. Using the bot is a hit-and-miss affair, but the captions attempt to describe the photo rather than just identify and state what is contained within. At the Made By Google event in October, CEO Sundar Pichai demonstrated TensorFlow captioning images. The captions by machines are approaching the captions provided by humans.
The image captioning technology used by Google is open source, and available on GitHub as a model for TensorFlow. Google began experiments in automatically captioning images as far back as 2014. One of the applications of image captioning is making photo and video content available to visually challenged audiences. While surfing the web, a digital assistant can talk to the user and describe the images.