Skip to main content

AI software can identify objects in photos and videos at near-human levels

A new AI software program developed by researchers at Google and Stanford University can recognise objects in photos and videos at near-human levels of understanding.

ai software program google stanford university object recognition technology images videos human level understanding

It was only recently that computer systems became smart enough to identify unknown objects in photographs. Even then, it has generally been limited to individual objects. Now, two separate teams of researchers at Google and Stanford University have created software able to describe entire scenes. This could lead to much better and more intelligent algorithms in the future.
Stanford's work, entitled "Deep Visual-Semantic Alignments for Generating Image Descriptions", explains how specific details found in photographs and videos can be translated into written text. Google's version of the technology, in a study titled "Show and Tell: A Neural Image Caption Generator", produced similar results.
While each team used a slightly different approach, they both combined deep convolutional neural networks with recurrent neural networks that excel at text analysis and natural language processing. The programs were able to "learn" from each new interaction, with algorithms enabling the system to improve its accuracy by scanning scene after scene, looking for patterns, and then using the accumulation of previously described scenes to extrapolate what is being depicted in the next unknown image.

ai image recognition

"The system can analyse an unknown image and explain it in words and phrases that make sense," says Fei-Fei Li, a professor of computer science and director of the Stanford Artificial Intelligence Lab. "This is an important milestone. It's the first time we've had a computer vision system that could tell a basic story about an unknown image by identifying discrete objects and also putting them into some context."
These latest algorithms are being trained on a visual dictionary – the ImageNet project – with a database of more than 14 million objects. Each object is described by a mathematical term, or vector, that enables the machine to recognise the shape the next time it is encountered. Those mathematical definitions are linked to the words humans would use to describe the objects.
“I was amazed that even with the small amount of training data that we were able to do so well,” said Oriol Vinyals, a Google computer scientist who worked with members of the Google Brain project. “The field is just starting, and we will see a lot of increases.”
In the near term, computer vision systems that can discern the story in a picture will enable people to search photo or video archives and find highly specific images. Eventually, these advances will lead to robotic systems able to navigate unknown situations. Driverless cars would also be made safer. However, it also raises the prospect of even greater levels of government surveillance.

 frisbee 
"A group of young people playing a game of Frisbee."
 

 frisbee 
"A person riding a motorcycle on a dirt road."
 

 frisbee 
"A pizza sitting on top of a pan on top of a stove."
 

Comments

Popular posts from this blog

How ad-free subscriptions could solve Facebook

At the core of Facebook’s “well-being” problem is that its business is directly coupled with total time spent on its apps. The more hours you pass on the social network, the more ads you see and click, the more money it earns. That puts its plan to make using Facebook healthier at odds with its finances, restricting how far it’s willing to go to protect us from the harms of over use. The advertising-supported model comes with some big benefits, though. Facebook CEO Mark Zuckerberg has repeatedly said that “We will always keep Facebook a free service for everyone.” Ads lets Facebook remain free for those who don’t want to pay, and more importantly, for those around the world who couldn’t afford to. Ads pay for Facebook to keep the lights on, research and develop new technologies, and profit handsomely in a way that attracts top talent and further investment. More affluent users with more buying power in markets like the US, UK, and Canada command higher ad prices, effectively...

The EHang 184 Is A Human-Sized Drone Taking Off At CES

We’ve seen some pretty cool stuff on day 1 of CES 2016, but probably nothing more eye-catching than the EHang 184, a human-sized drone built by the Chinese UAV company  EHang . Yes you heard right — a giant autonomous drone that fits a human. It’s basically what you would expect to see if someone shrunk you down to the size of a LEGO and stuck you next to a DJI Inspire. Except no one was shrunk, and the giant flying machine was sitting smack in the middle of the CES drone section. EHang, which was founded in 2014 and has raised about $50M in venture fundingto date, was pretty gung-ho about telling everyone at CES that the 184 was the future of personal transport. And for the most part, people were too in awe to question them. But the reality is that the company probably was using the 184 as more of a marketing tool for their standard-sized drones like the  Ghost . Not that we’re saying that the 184 will never be a real thing, just that it probably isn’t co...

Facebook ‘Class Action’ Privacy Lawsuit Moves To Austrian Supreme Court

A privacy lawsuit filed against Facebook last year by Viennese lawyer and data privacy activist Max Schrems has moved up to Austria’s Supreme Court which will rule on whether the suit can be treated as a class action. When Schrems kicked off the suit, back in July 2014, he invited adult non-commercial Facebook users located anywhere outside the U.S. and Canada to join the suit for free — and tens of thousands of people quickly took up the invitation. The legal action focuses on multiple areas where the plaintiffs argue Facebook has been violating EU data protection laws, such as the absence of effective consent to many types of data use; the tracking of Internet users through external websites; and the monitoring and analysis of users via big data systems. Facebook’s participation in the NSA’s PRISM surveillance program is also part of the complaint. In July the case suffered a setback when an Austrian regional co...

Best Web Design Company in Pondicherry

#Technology    has two faces. We all feel it, but sometimes can’t find words to describe it.  #Ebooks    are the best example to show the 0-1 nature of emotions the  #technology  evokes. #itwhere    provide a  #Best     #solutions    to  #Growyourbusiness    feel free to drop a  #Mail    info@itwheretech.co.in www.itwheretech.co.in