Microsoft has announced plans to release public beta versions of new Project Oxford tools that will help developers take advantage of recent advances in machine learning and artificial intelligence, including one tool that can recognize emotion.
Project Oxford, which was introduced in May, lets developers create smarter apps that can do things like recognize faces and interpret natural language, even if the developers are not experts in those fields. It includes four main components: face recognition, speech processing, visual tools, and the Language Understanding Intelligent Service (LUIS).
The new tools are designed for developers who don’t necessarily have machine learning or artificial intelligence expertise but want to include capabilities like speech, vision and language understanding in their apps.
The emotion tool can be used to create systems that recognize eight core emotional states – anger, contempt, fear, disgust, happiness, neutral, sadness or surprise – based on universal facial expressions that reflect those feelings. The tool is available to developers as a public beta.
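As a rough illustration of how an app might act on such a result, the sketch below assumes the tool returns one confidence score per emotional state for a detected face. The score dictionary, the function names and the reply table are all hypothetical and are not taken from the Project Oxford API; the example only shows picking the strongest of the eight states and, as a messaging app might, offering a different option for each.

```python
# Hypothetical sketch: the score format and the suggestion table below are
# assumptions for illustration, not the actual Project Oxford response schema.

# The eight emotional states named in the announcement.
EMOTIONS = (
    "anger", "contempt", "fear", "disgust",
    "happiness", "neutral", "sadness", "surprise",
)

# Made-up options a messaging app could offer for some emotions.
SUGGESTIONS = {
    "happiness": "Send a celebration sticker",
    "sadness": "Send a supportive message",
    "surprise": "Send a 'wow' reaction",
}


def dominant_emotion(scores):
    """Return the emotion with the highest score among the eight states."""
    missing = [name for name in EMOTIONS if name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    # Ties resolve to whichever emotion comes first in EMOTIONS.
    return max(EMOTIONS, key=lambda name: scores[name])


def suggest_option(scores):
    """Map the dominant emotion to an option, with a generic fallback."""
    return SUGGESTIONS.get(dominant_emotion(scores), "Send a plain reply")


if __name__ == "__main__":
    # Example scores for one face; the values are invented.
    example = {
        "anger": 0.01, "contempt": 0.00, "fear": 0.00, "disgust": 0.00,
        "happiness": 0.92, "neutral": 0.05, "sadness": 0.01, "surprise": 0.01,
    }
    print(dominant_emotion(example))
    print(suggest_option(example))
```

A real integration would replace the example dictionary with whatever the service actually returns for each detected face.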
"Developers might want to use these tools to create systems that marketers can use to gauge people’s reaction to a store display, movie or food. Or, they might find them valuable for creating a consumer tool, such as a messaging app, that offers up different options based on what emotion it recognizes in a photo," Ryan Galgon, a senior program manager within Microsoft’s Technology and Research group, said.
The facial recognition technology that is part of Microsoft Project Oxford can also be used in plenty of other ways, such as grouping collections of photos based on the faces of the people who appear in them.
The spell check tool, which developers can add to their mobile- or cloud-based apps and other products, recognizes slang words as well as brand names, common name errors and difficult-to-spot errors such as confusing “four” and “for.” It’s available as a public beta.
The video tool lets customers analyze and automatically edit videos by doing things like tracking faces, detecting motion and stabilizing shaky video. It’s based on some of the same technology found in Microsoft Hyperlapse. It will be available in beta by the end of the year.
The speaker recognition tool can be used to recognize who is speaking based on learning the particulars of an individual’s voice. A developer could use it as a security measure since a person’s voice, like a fingerprint, is unique. It will be available as a public beta by the end of the year.
The Custom Recognition Intelligent Services (CRIS) tool makes it easier for people to customize speech recognition for challenging environments, such as a noisy public space. It also could be used to help an app better understand people who have traditionally had trouble with voice recognition, such as non-native speakers or those with disabilities. It will be available as an invite-only beta by the end of the year.
In addition to the new tools, Microsoft Project Oxford’s existing face detection tool will be updated to include facial hair and smile prediction, and it also has improved visual age estimation and gender identification.
Updated Date: Nov 12, 2015 12:44:37 IST