Of course, computers can’t think, reason, or rationalize in quite the same way humans do, but researchers at Carnegie Mellon University are using computer vision and machine learning to push computers’ capabilities closer to that kind of understanding.
NEIL, the Never Ending Image Learner, isn’t primarily tasked with crunching hard data like numbers, which computers have handled since they were first built. Instead, it goes a step further, translating the visual world into useful information: identifying colors and lighting, classifying materials, recognizing distinct objects, and more. That information is then used to make general observations, associations, and connections, much as the human mind does at an early age.
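The association step can be pictured as simple co-occurrence mining: once a recognizer has tagged what appears in each image, counting which concepts keep showing up together yields candidate common-sense connections. This is only an illustrative sketch with made-up detections, not NEIL's actual pipeline.

```python
from collections import Counter
from itertools import combinations

def mine_associations(scenes, min_count=2):
    """Count how often pairs of detected concepts appear together
    across scenes, keeping pairs seen at least `min_count` times."""
    pair_counts = Counter()
    for concepts in scenes:
        for a, b in combinations(sorted(set(concepts)), 2):
            pair_counts[(a, b)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}

# Hypothetical per-image detections (objects, materials, settings).
scenes = [
    {"car", "wheel", "road"},
    {"car", "wheel", "garage"},
    {"bicycle", "wheel", "road"},
]
associations = mine_associations(scenes)
# e.g. ("car", "wheel") recurs, suggesting "cars have wheels"
```

From three toy scenes the miner surfaces that cars and wheels, and roads and wheels, tend to co-occur, which is the flavor of observation the paragraph describes.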
While computers aren’t capable of processing this information with an emotional response (a critical component that separates them from humans), there are countless tasks NEIL can accomplish today or in the near future that will help transform the way we live. Think about it: how might computer vision and machine learning change the way you live, work, and interact with your environment?
Most intriguing is the idea that NFL teams have not embraced these technologies as readily as their pro sports peers, because the complexity of the game defies decoupling teamwork into discrete actions that can be readily attributed to individual players.
Think of offensive linemen working together to protect quarterbacks and running backs, or linebackers who don’t rack up many tackles but make it difficult for the offense to execute.
For this reason, many NFL coaches prefer to assess players based on how they look on film.
Over time, however, this resistance to analytics is likely to fade as machine-learning applications built on computer vision prove their value and become easier to use.
In fact, it could simply be a matter of identifying subtler metrics to extract and analyze, metrics that previously evaded human detection.
For example, during the 2012 playoffs, the Wall Street Journal’s John Letzing reported on how MLB used motion-analysis software from Sportvision Inc. to quantify an outfielder’s ability to get a jump on fielding fly balls.
Given football’s rich data complexity, it’s hard to imagine coaches not eating up algorithm-powered, in-game forecasts that take player stats, weather, and game scenarios into account and identify the variables most likely to influence what happens next.
Or team management not angling for competitive advantage at the lowest possible cost by pinpointing overlooked, game-deciding metrics that don’t correlate with salary, such as fourth-down conversions, just as on-base percentage was the undervalued statistic in Moneyball.
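That Moneyball-style screen boils down to a correlation comparison: flag metrics that track winning far better than they track payroll. The sketch below uses invented team numbers and an arbitrary threshold purely for illustration.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def undervalued(metrics, wins, salaries, threshold=0.3):
    """Return metrics whose correlation with wins exceeds their
    correlation with payroll by more than `threshold`."""
    picks = []
    for name, values in metrics.items():
        if pearson(values, wins) - pearson(values, salaries) > threshold:
            picks.append(name)
    return picks

# Invented five-team data: fourth-down rate tracks wins but not
# payroll, while passing yards mostly track payroll.
metrics = {
    "fourth_down_pct": [40, 50, 55, 60, 70],
    "passing_yards": [3600, 3000, 3800, 3100, 3300],
}
wins = [4, 6, 8, 10, 12]
salaries = [120, 90, 130, 100, 110]  # payroll in $M, made up
cheap_wins = undervalued(metrics, wins, salaries)
```

With this toy data the screen keeps only the fourth-down metric, the kind of overlooked stat the paragraph has in mind.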
In other instances, humans can request that the algorithms be recalibrated so video-tracking models ignore what coaches consider noise and incorporate additional factors they view as pivotal.
Part of football’s very appeal is its complexity and the many interdependencies that make it tick. And so it’s a natural fit for the video scrutiny and data mining that computer vision and machine learning make possible.
Are you involved in an activity where many individuals come together to form a whole greater than the sum of its parts?
How could analysis of its finer points of interaction unlock hidden value in your business?
Writing for the New York Times Science Desk, John Markoff reports on how computer vision and machine learning will create a next-generation Internet where search engines find images and videos with the same degree of relevance they now achieve with text.
And the need is crushing: in the next 60 seconds, users will upload 72 hours of video to YouTube.
Today, unless images and videos are labeled, search engines have no way to match them against your query. Even then, labels can be unreliable (e.g., “junk” versus the objects that comprise it).
To give search engines something akin to human sight, Stanford’s Dr. Fei-Fei Li has teamed up with fellow computer scientists at Princeton to develop ImageNet, the world’s biggest image database.
Given the enormity of the task and a limited budget, Dr. Li turned to Mechanical Turk, Amazon.com’s crowdsourcing system, where humans label photos for a small payment per task. The database now holds over 14 million images in more than 21,000 categories, thanks to the efforts of nearly 30,000 participants a year.
As the database of labeled images grows, machine learning algorithms enable software to recognize similar, unlabeled images. Over time, accuracy rates improve dramatically.
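One simple way to picture that recognition step is nearest-neighbor matching: represent each image as a feature vector and give an unlabeled image the label of its most similar labeled neighbor. The vectors and labels below are invented, and real systems use far richer features and models than this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_label(query, examples):
    """Borrow the label of the most similar labeled example."""
    best_label, best_sim = None, -1.0
    for vec, label in examples:
        sim = cosine(query, vec)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label

# Hypothetical labeled feature vectors from the database.
examples = [
    ([1.0, 0.0, 0.2], "car"),
    ([0.1, 1.0, 0.0], "cat"),
]
guess = nearest_label([0.9, 0.1, 0.1], examples)  # resembles the car vector
```

As the pool of labeled examples grows, a query vector is more likely to land near a correctly labeled neighbor, which is one intuition behind the accuracy gains the article describes.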
Surprisingly, when tested on a large collection of labeled images by Google computer scientists Andrew Ng and Jeff Dean, the system nearly doubled the accuracy of previous neural network algorithms designed to model human thought processes.
To further improve speed and accuracy, images are classified against WordNet, a hierarchical database of English words. With skillful programming to make educated choices about how to search the hierarchy, the database continues to rise to this growing challenge.
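The payoff of a word hierarchy is that classification can proceed coarse-to-fine instead of scanning every category flat. The toy tree and scores below are stand-ins, not WordNet's real structure or ImageNet's actual search procedure.

```python
# Toy hypernym tree, a miniature stand-in for WordNet's hierarchy.
TREE = {
    "entity": ["animal", "vehicle"],
    "animal": ["dog", "cat"],
    "vehicle": ["car", "bicycle"],
}

def classify(scores, root="entity"):
    """Descend the hierarchy greedily, at each level following the
    child the recognizer scores highest, until reaching a leaf."""
    node = root
    while node in TREE:
        node = max(TREE[node], key=lambda child: scores.get(child, 0.0))
    return node

# Hypothetical recognizer confidences for one image.
scores = {"vehicle": 0.9, "animal": 0.1, "car": 0.8, "bicycle": 0.2}
label = classify(scores)  # walks entity -> vehicle -> car
```

Each step prunes entire subtrees, so the search touches a handful of nodes rather than all 21,000-plus categories, the kind of educated choice about traversing the hierarchy that the paragraph mentions.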