This page is a part of CVprimer.com, a wiki devoted to computer vision. It focuses on low-level computer vision, digital image analysis, and applications. It is designed as an online textbook, but the exposition is informal. It is geared towards software developers, especially beginners, and CS students. The wiki contains mathematics, algorithms, code examples, source code, compiled software, and some discussion. If you have any questions or suggestions, please contact me directly.

Computer Vision Primer:About



This wiki was started in August 2007. The initial material came from a book draft of mine. This text grew from my efforts to implement some algorithms of homology theory in order to apply them in image analysis. One title I tried was Context Independent Image Analysis, but it was too broad. Since this is an introduction, I ended up with Computer Vision Primer. Mathematically, it is about Computational Topology and Geometry.

The wiki provides methods related to the following image analysis tasks: partition and simplification of images, image enhancement, motion tracking, scientific image analysis, image recognition, matching, and search. The approach is simple, robust, versatile, easy to customize, and context independent. Unlike most of the current methods, it is designed to be consistent with the mathematical theory of image analysis, i.e., algebraic topology.

Algebraic topology is rarely taught. The reason is that such a course would have to rely on point-set topology and modern algebra as prerequisites. It would also include challenging proofs. Our goal here is to bring algebraic topology to the audience that needs it in the appropriate form.

The wiki is developed with a programmer in mind. Prerequisites are minimal.

First, some calculus. Very little is required because integer (or binary) arithmetic is applied almost exclusively. I do refer to sums as integrals and I do mention Green’s Theorem.
Second, a fair amount of linear algebra. At later stages, we will freely use vectors and matrices of arbitrary dimensions. Finite-dimensional vector spaces, subspaces, linear operators and their matrix representations also appear. Without a good familiarity with quotient spaces, it will be hard to understand homology. [Update: there is a way to do homology without linear algebra.] Elementary probability will be mentioned in some application sections. Beyond basic calculus and elementary linear algebra, the wiki is self-contained.
Third, a basic knowledge of C++ or another computer language is desirable. However, the (open source) code is written as a mere illustration of the algorithms. As a result, simplicity and clarity are chosen over efficiency. The code is so elementary that it can be easily followed by a person who understands the mathematics involved. On the other hand, the simplicity of the code allows an experienced reader to implement the algorithms in any other language.
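
To give a flavor of what "sums as integrals" and Green's Theorem can look like in integer arithmetic, here is a minimal sketch. It is not taken from the wiki's code; it is written in Python for brevity, and the function name and the pixel representation (a collection of (column, row) pairs, each pixel being a unit square) are assumptions made for this illustration. The area of a region is recovered from its boundary alone, as the discrete analogue of the integral of x dy around the boundary.

```python
def area_by_boundary_sum(pixels):
    """Area of a pixel region computed from its boundary edges only.

    Discrete form of Green's Theorem: area = closed integral of x dy.
    Each pixel (col, row) is the unit square [col, col+1] x [row, row+1].
    Horizontal edges have dy = 0 and contribute nothing.
    """
    region = set(pixels)
    total = 0
    for (col, row) in region:
        # Right edge lies on x = col + 1 and is traversed with dy = +1.
        # It is a boundary edge only if the right neighbor is outside.
        if (col + 1, row) not in region:
            total += col + 1
        # Left edge lies on x = col and is traversed with dy = -1.
        if (col - 1, row) not in region:
            total -= col
    return total


# Hypothetical test regions: an L-shape and a 3x3 square with a hole.
l_shape = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
ring = [(c, r) for c in range(3) for r in range(3) if (c, r) != (1, 1)]

print(area_by_boundary_sum(l_shape), len(l_shape))
print(area_by_boundary_sum(ring), len(ring))
```

In both cases the boundary sum agrees with the pixel count, because the contributions of edges shared by two pixels of the region cancel. Only integer additions and subtractions are used.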

The main parts of the method described here have been implemented as a computer program called Pixcavator. The program, as well as its SDK, is available for free download.

We develop only very basic algorithms. You won’t find any advanced image analysis techniques here. On the other hand, there is hardly anything in this wiki that you can find elsewhere in the current image analysis literature.

There have been a few attempts to address topological issues in imaging.

Digital Geometry: Geometric Methods for Digital Image Analysis by Klette and Rosenfeld, an encyclopedia of mathematical methods in imaging, devotes a whole page to homology theory!
Volumetric Image Analysis by Lohmann has some basics, mostly Betti numbers.
Topology for Computing by Zomorodian is a research monograph that provides useful algorithms for computation of homology but does not address digital image analysis.
Computational Homology by Kaczynski, Mischaikow, and Mrozek has a very well written introduction to homology at the beginning of the book as well as many algorithms for cubical homology. However, only a seasoned mathematician can work his way through the notation and the proofs in the rest of the book. Essentially, the book is halfway between a graduate textbook and a research monograph. My goal here is something more user-friendly, like an online textbook...

In part, the wiki is organized according to Donald Knuth's “literate programming” idea [1]:

Instead of writing code containing documentation, the literate programmer writes documentation containing code.

Best of luck!

Peter Saveliev

P.S. The development of the wiki will be discussed in our blog: Computer Vision for Dummies.

P.P.S. Because of spam, editing by users has been disabled.