Skip to main content

AI software can identify objects in photos and videos at near-human levels

A new AI software program developed by researchers at Google and Stanford University can recognise objects in photos and videos at near-human levels of understanding.

ai software program google stanford university object recognition technology images videos human level understanding

It was only recently that computer systems became smart enough to identify unknown objects in photographs. Even then, it has generally been limited to individual objects. Now, two separate teams of researchers at Google and Stanford University have created software able to describe entire scenes. This could lead to much better and more intelligent algorithms in the future.
Stanford's work, entitled "Deep Visual-Semantic Alignments for Generating Image Descriptions", explains how specific details found in photographs and videos can be translated into written text. Google's version of the technology, in a study titled "Show and Tell: A Neural Image Caption Generator", produced similar results.
While each team used a slightly different approach, they both combined deep convolutional neural networks with recurrent neural networks that excel at text analysis and natural language processing. The programs were able to "learn" from each new interaction, with algorithms enabling the system to improve its accuracy by scanning scene after scene, looking for patterns, and then using the accumulation of previously described scenes to extrapolate what is being depicted in the next unknown image.

ai image recognition

"The system can analyse an unknown image and explain it in words and phrases that make sense," says Fei-Fei Li, a professor of computer science and director of the Stanford Artificial Intelligence Lab. "This is an important milestone. It's the first time we've had a computer vision system that could tell a basic story about an unknown image by identifying discrete objects and also putting them into some context."
These latest algorithms are being trained on a visual dictionary – the ImageNet project – with a database of more than 14 million objects. Each object is described by a mathematical term, or vector, that enables the machine to recognise the shape the next time it is encountered. Those mathematical definitions are linked to the words humans would use to describe the objects.
“I was amazed that even with the small amount of training data that we were able to do so well,” said Oriol Vinyals, a Google computer scientist who worked with members of the Google Brain project. “The field is just starting, and we will see a lot of increases.”
In the near term, computer vision systems that can discern the story in a picture will enable people to search photo or video archives and find highly specific images. Eventually, these advances will lead to robotic systems able to navigate unknown situations. Driverless cars would also be made safer. However, it also raises the prospect of even greater levels of government surveillance.

 frisbee 
"A group of young people playing a game of Frisbee."
 

 frisbee 
"A person riding a motorcycle on a dirt road."
 

 frisbee 
"A pizza sitting on top of a pan on top of a stove."
 

Comments

Popular posts from this blog

Best Web Design Company in Pondicherry

#Technology    has two faces. We all feel it, but sometimes can’t find words to describe it.  #Ebooks    are the best example to show the 0-1 nature of emotions the  #technology  evokes. #itwhere    provide a  #Best     #solutions    to  #Growyourbusiness    feel free to drop a  #Mail    info@itwheretech.co.in www.itwheretech.co.in 

Phoenix OS is (another) Android-as-a-desktop

Google Android may have been developed as a smartphone operating system (and later ported to tablets, TVs, watches, and other platforms), but over the past few years we’ve seen a number of attempts to turn it into a desktop operating system. One of the most successful has been  Remix OS , which gives Android a taskbar, start menu, and an excellent window management system. The Remix OS team has also generated a lot of buzz over the past year, and this week the operating system gained a lot of new alpha testers thanks to a  downloadable version of Remix OS  that you can run on many recent desktop or notebook computers. But Remix OS isn’t the only game in town.  Phoenix OS  is another Android-as-desktop operating system, and while it’s still pretty rough around the edges, there are a few features that could make it a better option for some testers. Some background I first discovered Phoenix OS from  a post in the Remix OS Google Group , although I’ve also found mentions of th

So, when will your device actually get Android Oreo?

Google officially just took the wraps off of Android Oreo, but there are still some questions left to be answered — most notably, precisely when each device will be getting the latest version of the mobile operating system. Due to Android’s openness and a variety of different factors on the manufacturing side, it’s not an easy question to answer, but we’ll break it down best we can. First the good news: If your device was enrolled in the Android Beta Program, you’ll be getting your hands on the final version of the software “soon,” according to Google. Exactly what that means remains to be seen, but rest assured that you’ll be one of of the first people outside of Google to take advantage of picture-in-picture, notification dots and the like. No big surprise, Google handsets will be the first non-beta phones to get the update. The Pixel, Nexus 5X and 6P are at the top of the list, alongside Pixel C tablet and ASUS’s Nexus Player set-top box, which will be receiving the upgrade i

How ad-free subscriptions could solve Facebook

At the core of Facebook’s “well-being” problem is that its business is directly coupled with total time spent on its apps. The more hours you pass on the social network, the more ads you see and click, the more money it earns. That puts its plan to make using Facebook healthier at odds with its finances, restricting how far it’s willing to go to protect us from the harms of over use. The advertising-supported model comes with some big benefits, though. Facebook CEO Mark Zuckerberg has repeatedly said that “We will always keep Facebook a free service for everyone.” Ads lets Facebook remain free for those who don’t want to pay, and more importantly, for those around the world who couldn’t afford to. Ads pay for Facebook to keep the lights on, research and develop new technologies, and profit handsomely in a way that attracts top talent and further investment. More affluent users with more buying power in markets like the US, UK, and Canada command higher ad prices, effectively

Engineering against all odds, or how NYC’s subway will get wireless in the tunnels

Never ask a wireless engineer working on the NYC subway system “What can go wrong?” Flooding, ice, brake dust, and power outages relentlessly attack the network components. Rats — many, many rats — can eat power and fiber optic cables and bring down the whole system. Humans are no different, as their curiosity or malice strikes a blow against wireless hardware (literally and metaphorically). Serverless software deployment to the cloud, this is not. New York City officially got wireless service in every underground subway station a little more than a year ago, and I was curious what work went into the buildout of this system as well as how it will expand in the future. That curiosity is part of a series of articles I’ve written on an observed pattern known as cost disease, the massively inflating costs of basic human services like health care, housing, infrastructure, and education. The United States spends trillions of dollars on each of these fields, massively outspending sim