AI software can identify objects in photos and videos at near-human levels

New AI software developed independently by researchers at Google and at Stanford University can recognise objects in photos and videos at near-human levels of understanding.

Only recently have computer systems become capable of identifying unfamiliar objects in photographs, and even then recognition has generally been limited to individual objects. Now, two separate teams of researchers, at Google and at Stanford University, have created software able to describe entire scenes. This could lead to much more capable algorithms in the future.
Stanford's work, entitled "Deep Visual-Semantic Alignments for Generating Image Descriptions", explains how specific details found in photographs and videos can be translated into written text. Google's version of the technology, in a study titled "Show and Tell: A Neural Image Caption Generator", produced similar results.
While each team used a slightly different approach, both combined deep convolutional neural networks, which analyse the image, with recurrent neural networks, which excel at text analysis and natural language processing. The programs "learn" from each new example: they improve their accuracy by scanning scene after scene, looking for patterns, and then using the accumulation of previously described scenes to extrapolate what is depicted in the next unknown image.
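To make the pairing of the two network types more concrete, here is a minimal sketch in Python of how a convolutional encoder can hand an image summary to a recurrent decoder that emits one word at a time. It is not either team's code. The tiny vocabulary, the parameter names (`W_ih`, `W_hh`, `W_xh`, `W_hy`), the plain Elman-style recurrent step and the random, untrained weights are all assumptions chosen for illustration, so the words it produces are meaningless. Only the flow of data from image to caption is the point.

```python
import math
import random

# Hypothetical toy vocabulary; real systems use many thousands of words.
VOCAB = ["<start>", "<end>", "a", "person", "riding", "motorcycle"]


def conv2d_relu(image, kernel):
    """Slide the kernel over the image (valid positions only), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def encode(image, kernels):
    """Convolutional encoder: one global-max-pooled feature per kernel."""
    return [max(max(r) for r in conv2d_relu(image, k)) for k in kernels]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(h, word_id, W_hh, W_xh):
    """Elman-style update: new state from the old state and the previous word."""
    rec = matvec(W_hh, h)
    return [math.tanh(r + W_xh[k][word_id]) for k, r in enumerate(rec)]


def generate_caption(image, params, max_len=8):
    """Greedy decoding: image features set the initial state, then pick the
    highest-scoring word at each step until <end> or max_len is reached."""
    features = encode(image, params["kernels"])
    h = [math.tanh(x) for x in matvec(params["W_ih"], features)]
    word = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        h = rnn_step(h, word, params["W_hh"], params["W_xh"])
        scores = matvec(params["W_hy"], h)
        # Skip index 0 so the decoder can never emit <start> again.
        word = max(range(1, len(VOCAB)), key=scores.__getitem__)
        if VOCAB[word] == "<end>":
            break
        caption.append(VOCAB[word])
    return caption


def random_params(hidden=4, n_kernels=3, seed=0):
    """Untrained, seeded random weights (an assumption, for illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    return {
        "kernels": [mat(2, 2) for _ in range(n_kernels)],
        "W_ih": mat(hidden, n_kernels),
        "W_hh": mat(hidden, hidden),
        "W_xh": mat(hidden, len(VOCAB)),
        "W_hy": mat(len(VOCAB), hidden),
    }
```

Calling `generate_caption(image, random_params())` on any small grid of pixel values returns a short list of vocabulary words. In a trained system the weights would instead be fitted to many image and caption pairs, which is the "learning" described above.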

"The system can analyse an unknown image and explain it in words and phrases that make sense," says Fei-Fei Li, a professor of computer science and director of the Stanford Artificial Intelligence Lab. "This is an important milestone. It's the first time we've had a computer vision system that could tell a basic story about an unknown image by identifying discrete objects and also putting them into some context."
These latest algorithms are trained on a visual dictionary, the ImageNet project, a database of more than 14 million labelled images. Each object is described by a mathematical term, or vector, that enables the machine to recognise the shape the next time it is encountered. Those mathematical definitions are linked to the words humans would use to describe the objects.
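The idea of linking vectors to words can be sketched in a few lines of Python. The three-dimensional vectors and the three-word dictionary below are invented for illustration, and cosine similarity is just one common way to compare vectors, not necessarily what either team used. Real systems learn vectors with hundreds or thousands of dimensions from the training images.

```python
import math

# Hypothetical 3-D vectors standing in for learned object descriptions.
DICTIONARY = {
    "frisbee": [0.9, 0.1, 0.0],
    "motorcycle": [0.1, 0.9, 0.2],
    "pizza": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_word(vec, dictionary=DICTIONARY):
    """Return the word whose stored vector is most similar to vec."""
    return max(dictionary, key=lambda w: cosine(vec, dictionary[w]))
```

A vector computed from a new photo, say `[0.8, 0.2, 0.1]`, lies closest to the stored "frisbee" vector, so `nearest_word` returns that word. This is the sense in which the machine "recognises the shape the next time it is encountered".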
"I was amazed that even with the small amount of training data that we were able to do so well," says Oriol Vinyals, a Google computer scientist who worked with members of the Google Brain project. "The field is just starting, and we will see a lot of increases."
In the near term, computer vision systems that can discern the story in a picture will enable people to search photo or video archives and find highly specific images. Eventually, these advances could lead to robotic systems able to navigate unknown situations, and could make driverless cars safer. However, the technology also raises the prospect of even greater levels of government surveillance.

"A group of young people playing a game of Frisbee."

"A person riding a motorcycle on a dirt road."

"A pizza sitting on top of a pan on top of a stove."
