Introduction to Computer Vision

What is it and what's it good for?
Computer vision, in a nutshell, is the attempt to make a computer see and interpret the contents of an image. Normally, a computer treats a picture as nothing more than a collection of bits. It attaches no meaning to a picture of an apple and cannot distinguish it from a picture of an airplane, except to note that the bits differ. The goal of computer vision is to give the computer some way of at least interpreting what is in a picture.

Computer vision is a relatively new and important area of study. Its recent growth has been driven by the availability of cheap, high-speed processors, which allow complex analysis algorithms to run in just seconds. The recent success of neural networks, along with the availability of processors specialized for them, has also contributed to progress in the field. Vision and image analysis are very complex, but neural networks are well suited to this type of analysis and actually perform very well.

The development of the computer vision field is important because it has many applications. One application is in easy-to-use hardware and software. With the widespread use of computers, many people are forced to use a computer for the first time in their lives. Some of these people type very slowly, yet don't use a computer often enough to become proficient typists. For them, a pen and writing tablet can make occasional computer use much simpler and friendlier. Another application is in the rapid growth of Personal Digital Assistants. Many of these already use a form of computer vision to let you take notes with a pen and then convert your writing to text, an ability that is a direct result of computer vision research. A third example is the management of huge image databases, possibly containing thousands or millions of images. Computer vision research offers basic techniques for organizing and searching these collections. Right now, you can search for "all images that contain a lot of red"; current research may soon offer you the ability to search for "all images that contain a horse".
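The "a lot of red" search can be sketched in a few lines: score each image by the fraction of its pixels that are predominantly red, then rank. The function names, the margin threshold, and the toy pixel data below are illustrative assumptions, not any real system's API; a real system would load pixel data from image files.

```python
# Minimal sketch of color-based image search: rank images by the
# fraction of pixels whose red channel clearly dominates green and blue.
# Images are represented here as lists of (r, g, b) tuples.

def red_fraction(pixels, margin=30):
    """Fraction of pixels whose red channel exceeds green and blue by `margin`."""
    if not pixels:
        return 0.0
    red = sum(1 for (r, g, b) in pixels if r > g + margin and r > b + margin)
    return red / len(pixels)

def search_reddest(images):
    """Return image names sorted by red dominance, reddest first."""
    return sorted(images, key=lambda name: red_fraction(images[name]), reverse=True)

# Two toy "images": one mostly red, one entirely blue.
images = {
    "apple.png": [(200, 30, 40)] * 90 + [(20, 160, 30)] * 10,
    "sky.png": [(40, 80, 210)] * 100,
}
print(search_reddest(images))  # → ['apple.png', 'sky.png']
```

Searching for "a horse", by contrast, requires recognizing shape and context rather than counting pixel colors, which is why it remains a research problem.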

What does it require?
The development and research of computer vision requires a combination of many disciplines. This isn't meant to be an all-inclusive list, but here are some of the required disciplines and techniques.

First, computer vision requires standard programming techniques. This is fairly self-explanatory: computer vision is, after all, provided by specialized hardware and software.

Another requirement is Artificial Intelligence techniques. Although computer vision is an AI field of its own, it could not exist without a variety of other areas of AI research. Probably the most important AI technique is the use of neural networks. Neural networks allow software to more closely model the processes that go on inside the human brain. Since visual perception is a cognitive function, neural networks are an ideal implementation tool for making computers think like humans. Another AI technique that is often used is fuzzy logic. Is that color green or blue? Is maroon considered to be a shade of red? These questions often have no crisp answer, but they are important to the development of computer vision.
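The green-or-blue question above is exactly the kind fuzzy logic handles: instead of a hard boundary between color names, each name gets a membership function, so a teal hue can be partly green and partly blue at the same time. The hue ranges and function names below are assumptions chosen for this sketch, not standard values:

```python
# Minimal sketch of fuzzy color classification. Hue is given in degrees
# (0-360); each color name has a triangular membership function that
# rises to 1 at its peak hue and falls off gradually, with overlaps.

def triangular(x, left, peak, right):
    """Triangular membership: 0 outside [left, right], 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def color_memberships(hue):
    """Degree to which a hue belongs to each named color (assumed ranges)."""
    return {
        # Red wraps around 0/360, so it gets two triangles.
        "red": max(triangular(hue, -80, 0, 80), triangular(hue, 280, 360, 440)),
        "green": triangular(hue, 60, 140, 220),
        "blue": triangular(hue, 140, 220, 300),
    }

# A teal hue (180 degrees) is genuinely ambiguous: half green, half blue.
print(color_memberships(180))  # → {'red': 0.0, 'green': 0.5, 'blue': 0.5}
```

A vision system can then reason with these graded degrees (e.g. "mostly green") instead of forcing every pixel into exactly one category.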

Computer vision also requires image processing techniques. To be successful, computers need ways of eliminating the unnecessary visual information (sometimes called noise) that an image often contains. Furthermore, they need ways of detecting features such as shape outlines or color gradients. Without these techniques, computer vision would be, at best, nearly impossible.
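The two operations just mentioned, suppressing noise and finding outlines, can be shown on a tiny grayscale grid. This is an illustrative toy under simplified assumptions (a mean filter for smoothing and plain pixel differences for the gradient); real systems use optimized libraries, but the idea is the same:

```python
# Minimal sketch of two image-processing steps: smoothing to suppress
# noise, then a simple gradient to find edges. The "image" is a small
# grid of grayscale values (0-255) stored as a list of rows.

def box_blur(img):
    """3x3 mean filter: each interior pixel becomes its neighborhood average."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1)
                            for dx in (-1, 0, 1)) // 9
    return out

def edge_strength(img):
    """Approximate gradient magnitude from horizontal/vertical differences."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            out[y][x] = abs(gx) + abs(gy)
    return out

# Dark left half, bright right half: the edge detector responds
# most strongly along the boundary between the two regions.
img = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = edge_strength(box_blur(img))
print(edges[2])  # → [0, 85, 170, 170, 85, 0]
```

The blur spreads the hard boundary across a few pixels (which is also what tames single-pixel noise), and the gradient then peaks exactly where the dark and bright regions meet.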

Finally, computer vision requires an understanding of human biology and physiology. We need to understand how the eye sees details, and how the brain interprets them. What draws a person's eye to a certain feature in an image? If the predominant shape in an image is an apple, then the computer should have some way to determine that the apple is the most important aspect, the one that most humans will focus on first.


NEXT TOPIC - IMAGE PROCESSING BASICS
