Beginning CV Augmented Reality iOS Book Details
Title: CV Augmented Reality iOS
Author: Ahmed Fathi Bekhit
No. of pages: 169
Format: PDF, EPUB
Introduction to Computer Vision
This chapter will focus on what computer vision is, why we need it, the evolution of the technology, its different applications, and how it is used in Augmented Reality.
What Is Computer Vision?
Vision is the ability to analyze and interpret scenes and objects of interest. Human vision has been studied for hundreds of years to understand how the visual process works. The human visual process is one of the most complex processes to understand. In fact, to this day, vision scientists have not yet found a complete answer to how the visual process works.
However, what vision scientists have discovered about the early stages of the human visual process inspired computer scientists to develop what we know today as Computer Vision. Vision researchers describe the visual process as beginning with the eyes processing signals of light and converting them into scenes and images for the brain’s visual cortex to analyze and interpret. A breakthrough in vision research in the 1950s showed that the visual process begins by detecting the simple structures and edges of an image, which helps build up a more detailed interpretation as the visual information becomes more complex.
This breakthrough inspired computer scientists to develop the preprocessing Computer Vision algorithms we use today to initiate every computer vision task. The human brain computes significantly more slowly than a typical computer today, yet it performs vision tasks much faster and significantly better than any computer. Hence, researchers’ inspiration for developing Computer Vision algorithms has always been the evolution of vision in nature.
Computer Vision is the field of studying and developing technology that enables computers to process, analyze, and interpret digital images. Today, Computer Vision applications can be found in several fields, such as industrial robotics, medical imaging, surveillance, and many more. All these applications share one principal mission: processing, analyzing, and interpreting the contents of digital images to perform a task relevant to an industry’s needs. Such a task is referred to as a vision task in the rest of this book. A vision task is any task that requires processing, analyzing, or interpreting the contents of digital images and videos. For reference, a video is a sequence of digital images, also referred to as frames, typically shown at 30–60 frames per second.
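To make the frame arithmetic concrete, here is a minimal Python sketch that counts how many digital images a vision task would have to handle for a clip of a given length. The function name `frame_count` and the 10-second clip are assumptions chosen for illustration; they do not come from the book.

```python
def frame_count(frames_per_second, duration_seconds):
    """Return the number of frames in a clip (hypothetical helper).

    A video is treated as a plain sequence of digital images, so the
    total is simply the frame rate multiplied by the duration.
    """
    return frames_per_second * duration_seconds


# An assumed 10-second clip at the two ends of the typical 30-60 range.
low = frame_count(30, 10)
high = frame_count(60, 10)
print(low, high)
```

Even a short clip therefore amounts to hundreds of images, which is one reason processing speed matters for vision tasks that run on video.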
Computers display digital images very often. When a digital image is given to a computer as an input, the computer reads it as a two-dimensional array of pixels, which can also be described as a two-dimensional matrix. An image matrix consists of M columns and N rows. The size of an image in pixels is the product of its columns and rows (M × N), where M is the width and N is the height of the image. A pixel position is identified by its x and y coordinates (x, y) in the matrix.
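The matrix view of an image can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is grayscale, each pixel is a single integer, and the helper names `make_image` and `image_size` are hypothetical rather than part of any library or of the book.

```python
def make_image(width, height, fill=0):
    """Build a grayscale image as a list of `height` rows (N),
    each holding `width` pixel values (M). Hypothetical helper."""
    return [[fill for _ in range(width)] for _ in range(height)]


def image_size(image):
    """Return (M, N, M * N): width, height, and total pixel count."""
    height = len(image)                      # N rows
    width = len(image[0]) if height else 0   # M columns
    return width, height, width * height


image = make_image(width=4, height=3)
m, n, total = image_size(image)
print(m, n, total)
```

Storing the image as a list of rows means the number of rows gives the height N and the length of any row gives the width M.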
The coordinate system in computer graphics and digital images is slightly different from a typical Cartesian coordinate system: the point of origin (0, 0) in a digital image is at the top-left corner of the image. Therefore, x increases from left to right, and y increases from top to bottom (see Figure 1-1).
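The top-left origin can be illustrated with a small Python sketch. It assumes the image is stored as a list of rows, so a pixel at (x, y) is read as row y, column x. The helper names `get_pixel` and `to_cartesian_y` are hypothetical and used only for this example.

```python
def get_pixel(image, x, y):
    """Read the pixel at column x, row y (hypothetical helper).

    The origin (0, 0) is the top-left corner, so the row index y
    comes first when indexing a list of rows.
    """
    return image[y][x]


def to_cartesian_y(y, height):
    """Convert an image row index (top-down) to a bottom-up y value."""
    return height - 1 - y


# A tiny image: 3 columns wide (M = 3), 2 rows high (N = 2).
image = [
    [10, 20, 30],   # y = 0 (top row)
    [40, 50, 60],   # y = 1 (bottom row)
]

top_left = get_pixel(image, 0, 0)
bottom_right = get_pixel(image, 2, 1)
print(top_left, bottom_right)
```

The `to_cartesian_y` helper shows the only adjustment needed when moving between the two conventions: the x axis is unchanged, while the y axis is flipped.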