2 \section{Image Processing}
5 Note: The chapter describes functions for image processing and
6 analysis. Most of the functions work with 2d arrays of pixels, which are referred
7 to as "images". However, they do not have to be of type
8 \cross{IplImage}, they can be of type \cross{CvMat} or type \cross{CvMatND} as well.
10 \subsection{Gradients, Edges and Corners}
12 \cvfunc{Sobel}\label{Sobel}
14 Calculates the first, second, third or mixed image derivatives using an extended Sobel operator.
27 int aperture\_size=3 );
29 }{CPP}{Sobel(src,dst,xorder,yorder,aperture\_size = 3)-> None}
31 \cvarg{src}{Source image of type CvArr*}
32 \cvarg{dst}{Destination image}
33 \cvarg{xorder}{Order of the derivative x}
34 \cvarg{yorder}{Order of the derivative y}
35 \cvarg{aperture\_size}{Size of the extended Sobel kernel, must be 1, 3, 5 or 7}
38 In all cases except 1, an $\texttt{aperture\_size} \times
39 \texttt{aperture\_size}$ separable kernel will be used to calculate the
40 derivative. For $\texttt{aperture\_size} = 1$ $ 3 \times 1$ or $ 1 \times 3$
41 a kernel is used (Gaussian smoothing is not done). There is also the special
42 value \texttt{CV\_SCHARR} (-1) that corresponds to a $3\times3$ Scharr
43 filter that may give more accurate results than a $3\times3$ Sobel. Scharr
52 for the x-derivative or transposed for the y-derivative.
54 The function \texttt{cvSobel} calculates the image derivative by convolving the image with the appropriate kernel:
57 \texttt{dst}(x,y) = \frac{d^{xorder+yorder} \texttt{src}}{dx^{xorder} \cdot dy^{yorder}}
60 The Sobel operators combine Gaussian smoothing and differentiation
61 so the result is more or less resistant to the noise. Most often,
62 the function is called with (\texttt{xorder} = 1, \texttt{yorder} = 0,
63 \texttt{aperture\_size} = 3) or (\texttt{xorder} = 0, \texttt{yorder} = 1,
64 \texttt{aperture\_size} = 3) to calculate the first x- or y- image
65 derivative. The first case corresponds to a kernel of:
73 and the second one corresponds to a kernel of:
86 depending on the image origin (\texttt{origin} field of
87 \texttt{IplImage} structure). No scaling is done, so the destination image
88 usually has larger numbers (in absolute values) than the source image does. To
89 avoid overflow, the function requires a 16-bit destination image if the
90 source image is 8-bit. The result can be converted back to 8-bit using the
91 \cross{ConvertScale} or the \cross{ConvertScaleAbs} function. Besides 8-bit images
92 the function can process 32-bit floating-point images. Both the source and the
93 destination must be single-channel images of equal size or equal ROI size.
95 \cvfunc{Laplace}\label{Laplace}
97 Calculates the Laplacian of an image.
106 int aperture\_size=3 );
108 }{CPP}{Laplace(src,dst,aperture\_size=3)-> None}
110 \cvarg{src}{Source image}
111 \cvarg{dst}{Destination image}
112 \cvarg{aperture\_size}{Aperture size (it has the same meaning as \cross{Sobel})}
115 The function \texttt{cvLaplace} calculates the Laplacian of the source image by adding up the second x and y derivatives calculated using the Sobel operator:
118 \texttt{dst}(x,y) = \frac{d^2 \texttt{src}}{dx^2} + \frac{d^2 \texttt{src}}{dy^2}
121 Setting \texttt{aperture\_size} = 1 gives the fastest variant that is equal to convolving the image with the following kernel:
123 \[ \vecthreethree {0}{1}{0}{1}{-4}{1}{0}{1}{0} \]
125 Similar to the \cross{Sobel} function, no scaling is done and the same combinations of input and output formats are supported.
127 \cvfunc{Canny}\label{Canny}
128 Implements the Canny algorithm for edge detection.
131 void cvCanny( const CvArr* image,
139 int aperture\_size=3 );
141 }{CPP}{Canny(image,edges,threshold1,threshold2,aperture\_size=3)-> None}
143 \cvarg{image}{Single-channel input image}
144 \cvarg{edges}{Single-channel image to store the edges found by the function}
145 \cvarg{threshold1}{The first threshold}
146 \cvarg{threshold2}{The second threshold}
147 \cvarg{aperture\_size}{Aperture parameter for the Sobel operator (see \cross{Sobel})}
150 The function \texttt{cvCanny} finds the edges on the input image \texttt{image} and marks them in the output image \texttt{edges} using the Canny algorithm. The smallest value between \texttt{threshold1} and \texttt{threshold2} is used for edge linking, the largest value is used to find the initial segments of strong edges.
152 \cvfunc{PreCornerDetect}\label{PreCornerDetect}
153 Calculates the feature map for corner detection.
156 void cvPreCornerDetect(
162 int aperture\_size=3 );
164 }{CPP}{PreCornerDetect(image,corners,aperture\_size=3)-> None}
166 \cvarg{image}{Input image}
167 \cvarg{corners}{Image to store the corner candidates}
168 \cvarg{aperture\_size}{Aperture parameter for the Sobel operator (see \cross{Sobel})}
171 The function \texttt{cvPreCornerDetect} calculates the function
174 D_x^2 D_{yy} + D_y^2 D_{xx} - 2 D_x D_y D_{xy}
177 where $D_?$ denotes one of the first image derivatives and $D_{??}$ denotes a second image derivative.
179 The corners can be found as local maximums of the function below:
182 // assume that the image is floating-point
183 IplImage* corners = cvCloneImage(image);
184 IplImage* dilated_corners = cvCloneImage(image);
185 IplImage* corner_mask = cvCreateImage( cvGetSize(image), 8, 1 );
186 cvPreCornerDetect( image, corners, 3 );
187 cvDilate( corners, dilated_corners, 0, 1 );
188 cvSubS( corners, dilated_corners, corners );
189 cvCmpS( corners, 0, corner_mask, CV_CMP_GE );
190 cvReleaseImage( &corners );
191 cvReleaseImage( &dilated_corners );
194 \cvfunc{CornerEigenValsAndVecs}\label{CornerEigenValsAndVecs}
195 Calculates eigenvalues and eigenvectors of image blocks for corner detection.
198 void cvCornerEigenValsAndVecs( \par const CvArr* image,\par CvArr* eigenvv,\par int block\_size,\par int aperture\_size=3 );
200 }{CPP}{CornerEigenValsAndVecs(image,eigenvv,block\_size,aperture\_size=3)-> None}
203 \cvarg{image}{Input image}
204 \cvarg{eigenvv}{Image to store the results. It must be 6 times wider than the input image}
205 \cvarg{block\_size}{Neighborhood size (see discussion)}
206 \cvarg{aperture\_size}{Aperture parameter for the Sobel operator (see \cross{Sobel})}
209 For every pixel, the function \texttt{cvCornerEigenValsAndVecs} considers a \texttt{block\_size} $\times$ \texttt{block\_size} neigborhood S(p). It calcualtes the covariation matrix of derivatives over the neigborhood as:
213 \sum_{S(p)}(dI/dx)^2 & \sum_{S(p)}(dI/dx \cdot dI/dy)^2 \\
214 \sum_{S(p)}(dI/dx \cdot dI/dy)^2 & \sum_{S(p)}(dI/dy)^2
218 After that it finds eigenvectors and eigenvalues of the matrix and stores them into destination image in form
219 $(\lambda_1, \lambda_2, x_1, y_1, x_2, y_2)$ where
221 \item[$\lambda_1, \lambda_2$]are the eigenvalues of $M$; not sorted
222 \item[$x_1, y_1$]are the eigenvectors corresponding to $\lambda_1$
223 \item[$x_2, y_2$]are the eigenvectors corresponding to $\lambda_2$
226 \cvfunc{CornerMinEigenVal}\label{CornerMinEigenVal}
227 Calculates the minimal eigenvalue of gradient matrices for corner detection.
230 void cvCornerMinEigenVal(
238 int aperture\_size=3 );
240 }{CPP}{CornerMinEigenVal(image,eigenval,block\_size,aperture\_size=3)-> None}
242 \cvarg{image}{Input image}
243 \cvarg{eigenval}{Image to store the minimal eigenvalues. Should have the same size as \texttt{image}}
244 \cvarg{block\_size}{Neighborhood size (see the discussion of \cross{CornerEigenValsAndVecs})}
245 \cvarg{aperture\_size}{Aperture parameter for the Sobel operator (see \cross{Sobel}).}
246 % format. In the case of floating-point input format this parameter is the number of the fixed float filter used for differencing
249 The function \texttt{cvCornerMinEigenVal} is similar to \cross{CornerEigenValsAndVecs} but it calculates and stores only the minimal eigen value of derivative covariation matrix for every pixel, i.e. $min(\lambda_1, \lambda_2)$ in terms of the previous function.
251 \cvfunc{CornerHarris}\label{CornerHarris}
252 Harris edge detector.
259 CvArr* harris\_responce,
263 int aperture\_size=3,
267 }{CPP}{CornerHarris(image,harris\_dst,block\_size,aperture\_size=3,k=0.04)-> None}
270 \cvarg{image}{Input image}
271 \cvarg{harris\_responce}{Image to store the Harris detector responses. Should have the same size as \texttt{image}}
272 \cvarg{block\_size}{Neighborhood size (see the discussion of \cross{CornerEigenValsAndVecs})}
273 \cvarg{aperture\_size}{Aperture parameter for the Sobel operator (see \cross{Sobel}).}
274 % format. In the case of floating-point input format this parameter is the number of the fixed float filter used for differencing
275 \cvarg{k}{Harris detector free parameter. See the formula below}
278 The function \texttt{cvCornerHarris} runs the Harris edge detector on the image. Similarly to \cross{CornerMinEigenVal} and \cross{CornerEigenValsAndVecs}, for each pixel it calculates a $2\times2$ gradient covariation matrix $M$ over a $\texttt{block\_size} \times \texttt{block\_size}$ neighborhood. Then, it stores
281 det(M) - k \, trace(M)^2
284 to the destination image. Corners in the image can be found as the local maxima of the destination image.
286 \cvfunc{FindCornerSubPix}\label{FindCornerSubPix}
287 Refines the corner locations.
290 void cvFindCornerSubPix(
294 CvPoint2D32f* corners,
302 CvTermCriteria criteria );
303 }{CPP}{FindCornerSubPix(image,win,zero\_zone,criteria)-> corners}
306 \cvarg{image}{Input image}
307 \cvarg{corners}{Initial coordinates of the input corners; refined coordinates on output}
308 \cvarg{count}{Number of corners}
309 \cvarg{win}{Half of the side length of the search window. For example, if \texttt{win} =(5,5), then a 5*2+1 $\times$ 5*2+1 = 11 $\times$ 11 search window would be used}
310 \cvarg{zero\_zone}{Half of the size of the dead region in the middle of the search zone over which the summation in the formula below is not done. It is used sometimes to avoid possible singularities of the autocorrelation matrix. The value of (-1,-1) indicates that there is no such size}
311 \cvarg{criteria}{Criteria for termination of the iterative process of corner refinement. That is, the process of corner position refinement stops either after a certain number of iterations or when a required accuracy is achieved. The \texttt{criteria} may specify either of or both the maximum number of iteration and the required accuracy}
314 The function \texttt{cvFindCornerSubPix} iterates to find the sub-pixel accurate location of corners, or radial saddle points, as shown in on the picture below.
316 \includegraphics[width=1.0\textwidth]{pics/cornersubpix.png}
318 Sub-pixel accurate corner locator is based on the observation that every vector from the center $q$ to a point $p$ located within a neighborhood of $q$ is orthogonal to the image gradient at $p$ subject to image and measurement noise. Consider the expression:
321 \epsilon_i = {DI_{p_i}}^T \cdot (q - p_i)
324 where ${DI_{p_i}}$ is the image gradient at the one of the points $p_i$ in a neighborhood of $q$. The value of $q$ is to be found such that $\epsilon_i$ is minimized. A system of equations may be set up with $\epsilon_i$ set to zero:
327 \sum_i(DI_{p_i} \cdot {DI_{p_i}}^T) - \sum_i(DI_{p_i} \cdot {DI_{p_i}}^T \cdot p_i)
330 where the gradients are summed within a neighborhood ("search window") of $q$. Calling the first gradient term $G$ and the second gradient term $b$ gives:
336 The algorithm sets the center of the neighborhood window at this new center $q$ and then iterates until the center keeps within a set threshold.
338 \cvfunc{GoodFeaturesToTrack}\label{GoodFeaturesToTrack}
339 Determines strong corners on an image.
342 void cvGoodFeaturesToTrack(
346 CvArr* eig\_image, CvArr* temp\_image
348 CvPoint2D32f* corners
352 double quality\_level
356 const CvArr* mask=NULL
364 }{CPP}{GoodFeaturesToTrack(image,eig\_image,temp\_image,quality\_level,min\_distance,mask=NULL,block\_size=3,use\_harris=0,k=0.04)-> corners}
367 \cvarg{image}{The source 8-bit or floating-point 32-bit, single-channel image}
368 \cvarg{eig\_image}{Temporary floating-point 32-bit image, the same size as \texttt{image}}
369 \cvarg{temp\_image}{Another temporary image, the same size and format as \texttt{eig\_image}}
370 \cvarg{corners}{Output parameter; detected corners}
371 \cvarg{corner\_count}{Output parameter; number of detected corners}
372 \cvarg{quality\_level}{Multiplier for the max/min eigenvalue; specifies the minimal accepted quality of image corners}
373 \cvarg{min\_distance}{Limit, specifying the minimum possible distance between the returned corners; Euclidian distance is used}
374 \cvarg{mask}{Region of interest. The function selects points either in the specified region or in the whole image if the mask is NULL}
375 \cvarg{block\_size}{Size of the averaging block, passed to the underlying \cross{CornerMinEigenVal} or \cross{CornerHarris} used by the function}
376 \cvarg{use\_harris}{If nonzero, Harris operator (\cross{CornerHarris}) is used instead of default \cross{CornerMinEigenVal}}
377 \cvarg{k}{Free parameter of Harris detector; used only if ($\texttt{use\_harris} != 0$)}
380 The function \texttt{cvGoodFeaturesToTrack} finds corners with big
381 eigenvalues in the image. The function first calculates the minimal
382 eigenvalue for every source image pixel using the \cross{CornerMinEigenVal}
383 function and stores them in \texttt{eig\_image}. Then it performs
384 non-maxima suppression (only local maxima in $3\times 3$ neighborhood
385 remain). The next step is rejecting the corners with the minimal
387 $\texttt{quality\_level} \cdot max(\texttt{eig\_image}(x,y))$
389 Finally, the function ensures that all the corners found are distanced
390 enough from one another by considering the corners (the strongest
391 corners are considered first) and checking that the distance between
392 the newly considered feature and the features considered earlier
393 is larger than \texttt{min\_distance}. So, the function removes the
394 features than are too close to the stronger features.
396 \cvfunc{ExtractSURF}\label{ExtractSURF}
398 Extracts Speeded Up Robust Features from an image.
401 void cvExtractSURF( \par const CvArr* image,\par const CvArr* mask,\par CvSeq** keypoints,\par CvSeq** descriptors,\par CvMemStorage* storage,\par CvSURFParams params );
402 }{CPP}{ExtractSURF(image,mask,storage,params)-> keypoints,descriptors}
405 \cvarg{image}{The input 8-bit grayscale image}
406 \cvarg{mask}{The optional input 8-bit mask. The features are only found in the areas that contain more than 50\% of non-zero mask pixels}
407 \cvarg{keypoints}{The output parameter; double pointer to the sequence of keypoints. The sequence of CvSURFPoint structures is as follows:}
409 typedef struct CvSURFPoint
411 CvPoint2D32f pt; // position of the feature within the image
412 int laplacian; // -1, 0 or +1. sign of the laplacian at the point.
413 // can be used to speedup feature comparison
414 // (normally features with laplacians of different
415 // signs can not match)
416 int size; // size of the feature
417 float dir; // orientation of the feature: 0..360 degrees
418 float hessian; // value of the hessian (can be used to
419 // approximately estimate the feature strengths;
420 // see also params.hessianThreshold)
424 \cvarg{descriptors}{The optional output parameter; double pointer to the sequence of descriptors. Depending on the params.extended value, each element of the sequence will be either a 64-element or a 128-element floating-point (\texttt{CV\_32F}) vector. If the parameter is NULL, the descriptors are not computed}
425 \cvarg{storage}{Memory storage where keypoints and descriptors will be stored}
426 \cvarg{params}{Various algorithm parameters put to the structure CvSURFParams:}
428 typedef struct CvSURFParams
430 int extended; // 0 means basic descriptors (64 elements each),
431 // 1 means extended descriptors (128 elements each)
432 double hessianThreshold; // only features with keypoint.hessian
433 // larger than that are extracted.
434 // good default value is ~300-500 (can depend on the
435 // average local contrast and sharpness of the image).
436 // user can further filter out some features based on
437 // their hessian values and other characteristics.
438 int nOctaves; // the number of octaves to be used for extraction.
439 // With each next octave the feature size is doubled
441 int nOctaveLayers; // The number of layers within each octave
446 CvSURFParams cvSURFParams(double hessianThreshold, int extended=0);
447 // returns default parameters
451 The function cvExtractSURF finds robust features in the image, as
454 . For each feature it returns its location, size,
455 orientation and optionally the descriptor, basic or extended. The function
456 can be used for object tracking and localization, image stitching etc. See the
457 \texttt{find\_obj.cpp} demo in OpenCV samples directory.
459 \cvfunc{GetStarKeypoints}\label{GetStarKeypoints}
461 Retrieves keypoints using the StarDetector algorithm.
464 CvSeq* cvGetStarKeypoints( \par const CvArr* image,\par CvMemStorage* storage,\par CvStarDetectorParams params=cvStarDetectorParams() );
465 }{CPP}{GetStarKeypoints(image,storage,params)-> keypoints}
468 \cvarg{image}{The input 8-bit grayscale image}
469 \cvarg{storage}{Memory storage where the keypoints will be stored}
470 \cvarg{params}{Various algorithm parameters given to the structure CvStarDetectorParams:}
472 typedef struct CvStarDetectorParams
474 int maxSize; // maximal size of the features detected. The following
475 // values of the parameter are supported:
476 // 4, 6, 8, 11, 12, 16, 22, 23, 32, 45, 46, 64, 90, 128
477 int responseThreshold; // threshold for the approximatd laplacian,
478 // used to eliminate weak features
479 int lineThresholdProjected; // another threshold for laplacian to
481 int lineThresholdBinarized; // another threshold for the feature
482 // scale to eliminate edges
483 int suppressNonmaxSize; // linear size of a pixel neighborhood
484 // for non-maxima suppression
486 CvStarDetectorParams;
490 The function GetStarKeypoints extracts keypoints that are local
491 scale-space extremas. The scale-space is constructed by computing
492 approximate values of laplacians with different sigma's at each
493 pixel. Instead of using pyramids, a popular approach to save computing
494 time, all of the laplacians are computed at each pixel of the original
495 high-resolution image. But each approximate laplacian value is computed
496 in O(1) time regardless of the sigma, thanks to the use of integral
497 images. The algorithm is based on the paper
500 of a square, hexagon or octagon it uses an 8-end star shape, hence the name,
501 consisting of overlapping upright and tilted squares.
503 Each computed feature is represented by the following structure:
506 typedef struct CvStarKeypoint
508 CvPoint pt; // coordinates of the feature
509 int size; // feature size, see CvStarDetectorParams::maxSize
510 float response; // the approximated laplacian value at that point.
514 inline CvStarKeypoint cvStarKeypoint(CvPoint pt, int size, float response);
517 Below is the small usage sample:
523 int main(int argc, char** argv)
525 const char* filename = argc > 1 ? argv[1] : "lena.jpg";
526 IplImage* img = cvLoadImage( filename, 0 ), *cimg;
527 CvMemStorage* storage = cvCreateMemStorage(0);
528 CvSeq* keypoints = 0;
533 cvNamedWindow( "image", 1 );
534 cvShowImage( "image", img );
535 cvNamedWindow( "features", 1 );
536 cimg = cvCreateImage( cvGetSize(img), 8, 3 );
537 cvCvtColor( img, cimg, CV_GRAY2BGR );
539 keypoints = cvGetStarKeypoints( img, storage, cvStarDetectorParams(45) );
541 for( i = 0; i < (keypoints ? keypoints->total : 0); i++ )
543 CvStarKeypoint kpt = *(CvStarKeypoint*)cvGetSeqElem(keypoints, i);
545 cvCircle( cimg, kpt.pt, r, CV_RGB(0,255,0));
546 cvLine( cimg, cvPoint(kpt.pt.x + r, kpt.pt.y + r),
547 cvPoint(kpt.pt.x - r, kpt.pt.y - r), CV_RGB(0,255,0));
548 cvLine( cimg, cvPoint(kpt.pt.x - r, kpt.pt.y + r),
549 cvPoint(kpt.pt.x + r, kpt.pt.y - r), CV_RGB(0,255,0));
551 cvShowImage( "features", cimg );
556 \subsection{Sampling, Interpolation and Geometrical Transforms}
559 \cvfunc{SampleLine}\label{SampleLine}
560 Reads the raster line to the buffer.
573 int connectivity=8 );
578 \cvarg{image}{Image to sample the line from}
579 \cvarg{pt1}{Starting line point}
580 \cvarg{pt2}{Ending line point}
581 \cvarg{buffer}{Buffer to store the line points; must have enough size to store
582 $max( |\texttt{pt2.x} - \texttt{pt1.x}|+1, |\texttt{pt2.y} - \texttt{pt1.y}|+1 )$
583 points in the case of an 8-connected line and
584 $ (|\texttt{pt2.x}-\texttt{pt1.x}|+|\texttt{pt2.y}-\texttt{pt1.y}|+1) $
585 in the case of a 4-connected line}
586 \cvarg{connectivity}{The line connectivity, 4 or 8}
589 The function \texttt{cvSampleLine} implements a particular application of line iterators. The function reads all of the image points lying on the line between \texttt{pt1} and \texttt{pt2}, including the end points, and stores them into the buffer.
592 \cvfunc{GetRectSubPix}\label{GetRectSubPix}
594 Retrieves the pixel rectangle from an image with sub-pixel accuracy.
597 void cvGetRectSubPix(
603 CvPoint2D32f center );
604 }{CPP}{GetRectSubPix(src,dst,center)-> None}
607 \cvarg{src}{Source image}
608 \cvarg{dst}{Extracted rectangle}
609 \cvarg{center}{Floating point coordinates of the extracted rectangle center within the source image. The center must be inside the image}
612 The function \texttt{cvGetRectSubPix} extracts pixels from \texttt{src}:
615 dst(x, y) = src(x + \texttt{center.x} - (width(\texttt{dst})-1)*0.5, y + \texttt{center.y} - (height(\texttt{dst} )-1)*0.5)
618 where the values of the pixels at non-integer coordinates are retrieved
619 using bilinear interpolation. Every channel of multiple-channel
620 images is processed independently. While the rectangle center
621 must be inside the image, parts of the rectangle may be
622 outside. In this case, the replication border mode is used to get
623 pixel values beyond the image boundaries.
625 \cvfunc{GetQuadrangleSubPix}\label{GetQuadrangleSubPix}
627 Retrieves the pixel quadrangle from an image with sub-pixel accuracy.
630 void cvGetQuadrangleSubPix(
636 const CvMat* map\_matrix );
638 }{CPP}{GetQuadrangleSubPix(src,dst,map\_matrix)-> None}
641 \cvarg{src}{Source image}
642 \cvarg{dst}{Extracted quadrangle}
643 \cvarg{map\_matrix}{The transformation $2 \times 3$ matrix $[A|b]$ (see the discussion)}
646 The function \texttt{cvGetQuadrangleSubPix} extracts pixels from \texttt{src} at sub-pixel accuracy and stores them to \texttt{dst} as follows:
649 dst(x, y)= src( A_{11} x' + A_{12} y' + b_1, A_{21} x' + A_{22} y' + b_2)
655 x'=x-\frac{(width(dst)-1)}{2},
656 y'=y-\frac{(height(dst)-1)}{2}
662 \texttt{map\_matrix} = \begin{bmatrix}
663 A_{11} & A_{12} & b_1\\
664 A_{21} & A_{22} & b_2
668 The values of pixels at non-integer coordinates are retrieved using bilinear interpolation. When the function needs pixels outside of the image, it uses replication border mode to reconstruct the values. Every channel of multiple-channel images is processed independently.
671 \cvfunc{Resize}\label{Resize}
681 int interpolation=CV\_INTER\_LINEAR );
683 }{CPP}{Resize(src,dst,interpolation=CV\_INTER\_LINEAR)-> None}
686 \cvarg{src}{Source image}
687 \cvarg{dst}{Destination image}
688 \cvarg{interpolation}{Interpolation method:
690 \cvarg{CV\_INTER\_NN}{nearest-neigbor interpolation}
691 \cvarg{CV\_INTER\_LINEAR}{bilinear interpolation (used by default)}
692 \cvarg{CV\_INTER\_AREA}{resampling using pixel area relation. It is the preferred method for image decimation that gives moire-free results. In terms of zooming it is similar to the \texttt{CV\_INTER\_NN} method}
693 \cvarg{CV\_INTER\_CUBIC}{bicubic interpolation}
697 The function \texttt{cvResize} resizes an image \texttt{src} so that it fits exactly into \texttt{dst}. If ROI is set, the function considers the ROI as supported.
699 \cvfunc{WarpAffine}\label{WarpAffine}
701 Applies an affine transformation to an image.
710 const CvMat* map\_matrix,
712 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,
714 CvScalar fillval=cvScalarAll(0) );
716 }{CPP}{WarpAffline(src,dst,map\_matrix,flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,fillval=cvScalarAll(0))-> None}
719 \cvarg{src}{Source image}
720 \cvarg{dst}{Destination image}
721 \cvarg{map\_matrix}{$2\times 3$ transformation matrix}
722 \cvarg{flags}{A combination of interpolation methods and the following optional flags:
724 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fills all of the destination image pixels; if some of them correspond to outliers in the source image, they are set to \texttt{fillval}}
725 \cvarg{CV\_WARP\_INVERSE\_MAP}{indicates that \texttt{matrix} is inversely
726 transformed from the destination image to the source and, thus, can be used
727 directly for pixel interpolation. Otherwise, the function finds
728 the inverse transform from \texttt{map\_matrix}}}
730 \cvarg{fillval}{A value used to fill outliers}
733 The function \texttt{cvWarpAffine} transforms the source image using the specified matrix:
736 dst(x',y') = src(x,y)
746 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
750 \end{bmatrix} & \mbox{if CV\_WARP\_INVERSE\_MAP is not set}\\
754 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
758 \end{bmatrix}& \mbox{otherwise}
762 The function is similar to \cross{GetQuadrangleSubPix} but they are not exactly the same. \cross{WarpAffine} requires input and output image have the same data type, has larger overhead (so it is not quite suitable for small images) and can leave part of destination image unchanged. While \cross{GetQuadrangleSubPix} may extract quadrangles from 8-bit images into floating-point buffer, has smaller overhead and always changes the whole destination image content.
764 To transform a sparse set of points, use the \cross{Transform} function from cxcore.
766 \cvfunc{GetAffineTransform}\label{GetAffineTransform}
768 Calculates the affine transform from 3 corresponding points.
771 CvMat* cvGetAffineTransform(
773 const CvPoint2D32f* src,
775 const CvPoint2D32f* dst,
777 CvMat* map\_matrix );
778 }{CPP}{GetAffineTransform(src,dst,map\_matrix)-> None}
781 \cvarg{src}{ Coordinates of 3 triangle vertices in the source image}
782 \cvarg{dst}{ Coordinates of the 3 corresponding triangle vertices in the destination image}
783 \cvarg{map\_matrix}{ Pointer to the destination $2 \times 3$ matrix}
786 The function cvGetAffineTransform calculates the matrix of an affine transform such that:
811 \cvfunc{2DRotationMatrix}\label{2DRotationMatrix}
813 Calculates the affine matrix of 2d rotation.
816 CvMat* cv2DRotationMatrix(
824 CvMat* map\_matrix );
825 }{CPP}{2DRotationMatrix(center,angle,scale,map\_matrix)-> None}
828 \cvarg{center}{Center of the rotation in the source image}
829 \cvarg{angle}{The rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner)}
830 \cvarg{scale}{Isotropic scale factor}
831 \cvarg{map\_matrix}{Pointer to the destination $2\times 3$ matrix}
834 The function \texttt{cv2DRotationMatrix} calculates the following matrix:
838 a & \beta & (1-a) \cdot \texttt{center.x} - \beta \cdot \texttt{center.y} \\
839 \beta - 1 & a & \beta \cdot \texttt{center.x} - (1-a) \cdot \texttt{center.y}
846 a = \texttt{scale} \cdot cos(\texttt{angle}), \beta = \texttt{scale} \cdot sin(\texttt{angle})
849 The transformation maps the rotation center to itself. If this is not the purpose, the shift should be adjusted.
851 \cvfunc{WarpPerspective}\label{WarpPerspective}
853 Applies a perspective transformation to an image.
856 void cvWarpPerspective(
862 const CvMat* map\_matrix,
864 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,
866 CvScalar fillval=cvScalarAll(0) );
868 }{CPP}{WarpPerspective(src,dst,map\_matrix,flags=CV\_I
869 NNER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,fillval=cvScalarAll(0
873 \cvarg{src}{Source image}
874 \cvarg{dst}{Destination image}
875 \cvarg{map\_matrix}{$3\times 3$ transformation matrix}
876 \cvarg{flags}{A combination of interpolation methods and the following optional flags:
878 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fills all of the destination image pixels; if some of them correspond to outliers in the source image, they are set to \texttt{fillval}}
879 \cvarg{CV\_WARP\_INVERSE\_MAP}{indicates that \texttt{matrix} is inversely transformed from the destination image to the source and, thus, can be used directly for pixel interpolation. Otherwise, the function finds the inverse transform from \texttt{map\_matrix}}
881 \cvarg{fillval}{A value used to fill outliers}
884 The function \texttt{cvWarpPerspective} transforms the source image using the specified matrix:
891 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
895 \end{bmatrix} & \mbox{if CV\_WARP\_INVERSE\_MAP is not set}\\
899 \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix}
903 \end{bmatrix}& \mbox{otherwise}
907 For a sparse set of points use the \cross{PerspectiveTransform} function from CxCore.
910 \cvfunc{GetPerspectiveTransform}\label{GetPerspectiveTransform}
912 Calculates the perspective transform from 4 corresponding points.
915 CvMat* cvGetPerspectiveTransform(
917 const CvPoint2D32f* src,
919 const CvPoint2D32f* dst,
921 CvMat* map\_matrix );
922 }{CPP}{GetPerspectiveTransform(src,dst,map\_matrix)-> None}
925 \cvarg{src}{Coordinates of 4 quadrangle vertices in the source image}
926 \cvarg{dst}{Coordinates of the 4 corresponding quadrangle vertices in the destination image}
927 \cvarg{map\_matrix}{Pointer to the destination $3\times 3$ matrix}
930 The function \texttt{cvGetPerspectiveTransform} calculates a matrix of perspective transforms such that:
955 \cvfunc{Remap}\label{Remap}
957 Applies a generic geometrical transformation to the image.
970 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,
972 CvScalar fillval=cvScalarAll(0) );
974 }{CPP}{Remap(src,dst,mapx,mapy,flags=CV\_INNER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS,fillval=cvScalarAll0))-> None}
977 \cvarg{src}{Source image}
978 \cvarg{dst}{Destination image}
979 \cvarg{mapx}{The map of x-coordinates (32fC1 image)}
980 \cvarg{mapy}{The map of y-coordinates (32fC1 image)}
981 \cvarg{flags}{A combination of interpolation method and the following optional flag(s):
983 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fills all of the destination image pixels. If some of them correspond to outliers in the source image, they are set to \texttt{fillval}}
985 \cvarg{fillval}{A value used to fill outliers}
988 The function \texttt{cvRemap} transforms the source image using the specified map:
991 \texttt{dst}(x,y) = \texttt{src}(\texttt{mapx}(x,y),\texttt{mapy}(x,y))
994 Similar to other geometrical transformations, some interpolation method (specified by user) is used to extract pixels with non-integer coordinates.
996 \cvfunc{LogPolar}\label{LogPolar}
998 Remaps an image to log-polar space.
1007 CvPoint2D32f center,
1011 int flags=CV\_INTER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS );
1013 }{CPP}{LogPolar(src,dst,center,M,flags=CV\_INNER\_LINEAR+CV\_WARP\_FILL\_OUTLIERS)-> None}
1016 \cvarg{src}{Source image}
1017 \cvarg{dst}{Destination image}
1018 \cvarg{center}{The transformation center; where the output precision is maximal}
1019 \cvarg{M}{Magnitude scale parameter. See below}
1020 \cvarg{flags}{A combination of interpolation methods and the following optional flags:
1022 \cvarg{CV\_WARP\_FILL\_OUTLIERS}{fills all of the destination image pixels. If some of them correspond to outliers in the source image, they are set to zero}
1023 \cvarg{CV\_WARP\_INVERSE\_MAP}{See below}
1027 The function \texttt{cvLogPolar} transforms the source image using the following transformation:
1029 Forward transformation (\texttt{CV\_WARP\_INVERSE\_MAP} is not set):
1032 dst(\phi,\rho) = src(x,y)
1035 Inverse transformation (\texttt{CV\_WARP\_INVERSE\_MAP} is set):
1038 dst(x,y) = src(\phi,\rho)
1044 \rho = M \cdot \log{\sqrt{x^2 + y^2}},
1048 The function emulates the human "foveal" vision and can be used for fast scale and rotation-invariant template matching, for object tracking and so forth.
1051 \cvfunc{Example: Log-polar transformation}
1054 #include <highgui.h>
1056 int main(int argc, char** argv)
1060 if( argc == 2 && (src=cvLoadImage(argv[1],1) != 0 )
1062 IplImage* dst = cvCreateImage( cvSize(256,256), 8, 3 );
1063 IplImage* src2 = cvCreateImage( cvGetSize(src), 8, 3 );
1064 cvLogPolar( src, dst, cvPoint2D32f(src->width/2,src->height/2), 40,
1065 CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS );
1066 cvLogPolar( dst, src2, cvPoint2D32f(src->width/2,src->height/2), 40,
1067 CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS+CV_WARP_INVERSE_MAP );
1068 cvNamedWindow( "log-polar", 1 );
1069 cvShowImage( "log-polar", dst );
1070 cvNamedWindow( "inverse log-polar", 1 );
1071 cvShowImage( "inverse log-polar", src2 );
1078 And this is what the program displays when \texttt{opencv/samples/c/fruits.jpg} is passed to it
1080 \includegraphics[width=0.4\textwidth]{pics/logpolar.jpg}
1081 \includegraphics[width=0.4\textwidth]{pics/inv_logpolar.jpg}
1083 \subsection{Morphological Operations}
1085 \cvfunc{CreateStructuringElementEx}\label{CreateStructuringElementEx}
1087 Creates a structuring element.
1090 IplConvKernel* cvCreateStructuringElementEx(
1103 }{CPP}{CreateStructuringElementEx(cols,rows,anchor\_x,anchor\_y,shape,values={NULL,0})-> kernel}
1106 \cvarg{cols}{Number of columns in the structuring element}
1107 \cvarg{rows}{Number of rows in the structuring element}
1108 \cvarg{anchor\_x}{Relative horizontal offset of the anchor point}
1109 \cvarg{anchor\_y}{Relative vertical offset of the anchor point}
1110 \cvarg{shape}{Shape of the structuring element; may have the following values:
1112 \cvarg{CV\_SHAPE\_RECT}{a rectangular element}
1113 \cvarg{CV\_SHAPE\_CROSS}{a cross-shaped element}
1114 \cvarg{CV\_SHAPE\_ELLIPSE}{an elliptic element}
1115 \cvarg{CV\_SHAPE\_CUSTOM}{a user-defined element. In this case the parameter \texttt{values} specifies the mask, that is, which neighbors of the pixel must be considered}
1117 \cvarg{values}{Pointer to the structuring element data, a plane array, representing row-by-row scanning of the element matrix. Non-zero values indicate points that belong to the element. If the pointer is \texttt{NULL}, then all values are considered non-zero, that is, the element is of a rectangular shape. This parameter is considered only if the shape is \texttt{CV\_SHAPE\_CUSTOM} }
1120 The function CreateStructuringElementEx allocates and fills the structure \texttt{IplConvKernel}, which can be used as a structuring element in the morphological operations.
1123 \cvfunc{ReleaseStructuringElement}\label{ReleaseStructuringElement}
1125 Deletes a structuring element.
1128 void cvReleaseStructuringElement( IplConvKernel** element );
1132 \cvarg{element}{Pointer to the deleted structuring element}
1135 The function \texttt{cvReleaseStructuringElement} releases the structure \texttt{IplConvKernel} that is no longer needed. If \texttt{*element} is \texttt{NULL}, the function has no effect.
1139 \cvfunc{Erode}\label{Erode}
1141 Erodes an image by using a specific structuring element.
1150 IplConvKernel* element=NULL,
1153 }{CPP}{Erode(src,dst,element=NULL,itertions=1)-> None}
1156 \cvarg{src}{Source image}
1157 \cvarg{dst}{Destination image}
1158 \cvarg{element}{Structuring element used for erosion. If it is \texttt{NULL}, a $3\times 3$ rectangular structuring element is used}
1159 \cvarg{iterations}{Number of times erosion is applied}
1162 The function \texttt{cvErode} erodes the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the minimum is taken:
1165 \min_{(x',y') \, in \, \texttt{element}}src(x+x',y+y')
1168 The function supports the in-place mode. Erosion can be applied several (\texttt{iterations}) times. For color images, each channel is processed independently.
1170 \cvfunc{Dilate}\label{Dilate}
1172 Dilates an image by using a specific structuring element.
1181 IplConvKernel* element=NULL,
1184 }{CPP}{Dilate(src,dst,element=NULL,iterations=1)-> None}
1187 \cvarg{src}{Source image}
1188 \cvarg{dst}{Destination image}
1189 \cvarg{element}{Structuring element used for dilation. If it is \texttt{NULL}, a $3\times 3$ rectangular structuring element is used}
1190 \cvarg{iterations}{Number of times dilation is applied}
1193 The function \texttt{cvDilate} dilates the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the maximum is taken:
1196 \max_{(x',y') \, in \, \texttt{element}}src(x+x',y+y')
1199 The function supports the in-place mode. Dilation can be applied several (\texttt{iterations}) times. For color images, each channel is processed independently.
1201 \cvfunc{MorphologyEx}\label{MorphologyEx}
1203 Performs advanced morphological transformations.
1206 void cvMorphologyEx(
1214 IplConvKernel* element,
1219 }{CPP}{MorphologyEx(src,dst,temp,element,operation,iterations=1)-> None}
1222 \cvarg{src}{Source image}
1223 \cvarg{dst}{Destination image}
1224 \cvarg{temp}{Temporary image, required in some cases}
1225 \cvarg{element}{Structuring element}
1226 \cvarg{operation}{Type of morphological operation, one of the following:}
1228 \cvarg{CV\_MOP\_OPEN}{opening}
1229 \cvarg{CV\_MOP\_CLOSE}{closing}
1230 \cvarg{CV\_MOP\_GRADIENT}{morphological gradient}
1231 \cvarg{CV\_MOP\_TOPHAT}{"top hat"}
1232 \cvarg{CV\_MOP\_BLACKHAT}{"black hat"}
1234 \cvarg{iterations}{Number of times erosion and dilation are applied}
1237 The function \texttt{cvMorphologyEx} can perform advanced morphological transformations using erosion and dilation as basic operations.
1242 dst=open(src,element)=dilate(erode(src,element),element)
1248 dst=close(src,element)=erode(dilate(src,element),element)
1251 Morphological gradient:
1254 dst=morph\_grad(src,element)=dilate(src,element)-erode(src,element)
1260 dst=tophat(src,element)=src-open(src,element)
1266 dst=blackhat(src,element)=close(src,element)-src
1269 The temporary image \texttt{temp} is required for a morphological gradient and, in the case of in-place operation, for "top hat" and "black hat".
1271 \subsection{Filters and Color Conversion}
1273 \cvfunc{Smooth}\label{Smooth}
1275 Smooths the image in one of several ways.
1284 int smoothtype=CV\_GAUSSIAN,
1290 double param3=0, double param4=0 );
1291 }{CPP}{Smooth(src,dst,smoothtype=CV\_GAUSSIAN,param1=3,param2=0,param3=0,param4=0)-> None}
1294 \cvarg{src}{The source image}
1295 \cvarg{dst}{The destination image}
1296 \cvarg{smoothtype}{Type of the smoothing:
1298 \cvarg{CV\_BLUR\_NO\_SCALE (simple blur with no scaling)}{summation over a pixel $\texttt{param1}\times \texttt{param2}$ neighborhood. If the neighborhood size varies, one can precompute the integral image with the \cross{Integral} function}
1299 \cvarg{CV\_BLUR (simple blur)}{summation over a pixel $\texttt{param1}\times \texttt{param2}$ neighborhood with subsequent scaling by $1/(\texttt{param1} \cdot \texttt{param2})$}
1300 \cvarg{CV\_GAUSSIAN (gaussian blur)}{convolving image with a $\texttt{param1}\times \texttt{param2}$ Gaussian kernel}
1301 \cvarg{CV\_MEDIAN (median blur)}{finds the median of a $\texttt{param1}\times \texttt{param1}$ neighborhood (i.e. the neighborhood is square)}
1302 \cvarg{CV\_BILATERAL (bilateral filter)}{applying bilateral $\texttt{param1}\times \texttt{param2}$ filtering with color sigma=\texttt{param3} and space sigma=\texttt{param4}. \texttt{param1} and \texttt{param2} must be equal (square). Information about bilateral filtering can be found at
1303 \url{http://www.dai.ed.ac.uk/CVonline/LOCAL\_COPIES/MANDUCHI1/Bilateral\_Filtering.html}
1306 \cvarg{param1}{The first parameter of the smoothing operation}
1307 \cvarg{param2}{The second parameter of the smoothing operation. In the case of simple scaled/non-scaled and Gaussian blur if \texttt{param2} is zero, it is set to \texttt{param1}}
1308 \cvarg{param3}{In the case of a Gaussian parameter this parameter may specify Gaussian $\sigma$ (standard deviation). If it is zero, it is calculated from the kernel size:
1310 \sigma = 0.3 (n/2 - 1) + 0.8 \quad \text{where} \quad n=
1312 \mbox{\texttt{param1} for horizontal kernel}\\
1313 \mbox{\texttt{param2} for vertical kernel}
1317 Using standard sigma for small kernels ($3\times 3$ to $7\times 7$) gives better speed. If \texttt{param3} is not zero, while \texttt{param1} and \texttt{param2} are zeros, the kernel size is calculated from the sigma (to provide accurate enough operation).}
1320 The function \texttt{cvSmooth} smooths an image using one of several methods. Every of the methods has some features and restrictions listed below
1322 Blur with no scaling works with single-channel images only and supports accumulation of 8-bit to 16-bit format (similar to \cross{Sobel} and \cross{Laplace}) and 32-bit floating point to 32-bit floating-point format.
1324 Simple blur and Gaussian blur support 1- or 3-channel, 8-bit and 32-bit floating point images. These two methods can process images in-place.
1326 Median and bilateral filters work with 1- or 3-channel 8-bit images and can not process images in-place.
1328 \cvfunc{Filter2D}\label{Filter2D}
1330 Convolves an image with the kernel.
1339 const CvMat* kernel,
1341 CvPoint anchor=cvPoint(-1,-1));
1342 }{CPP}{Filter2D(src,dst,kernel,anchor=(-1,-1))-> None}
1345 \cvarg{src}{The source image}
1346 \cvarg{dst}{The destination image}
1347 \cvarg{kernel}{Convolution kernel, a single-channel floating point matrix. If you want to apply different kernels to different channels, split the image into separate color planes using \cross{Split} and process them individually}
1348 \cvarg{anchor}{The anchor of the kernel that indicates the relative position of a filtered point within the kernel. The anchor shoud lie within the kernel. The special default value (-1,-1) means that it is at the kernel center}
1351 The function \texttt{cvFilter2D} applies an arbitrary linear filter to the image. In-place operation is supported. When the aperture is partially outside the image, the function interpolates outlier pixel values from the nearest pixels that are inside the image.
1353 \cvfunc{CopyMakeBorder}\label{CopyMakeBorder}
1355 Copies an image and makes a border around it.
1358 void cvCopyMakeBorder(
1368 CvScalar value=cvScalarAll(0) );
1369 }{CPP}{CopyMakeBorder(src,dst,offset,bordrtype,value=cvScalarAll(0))-> None}
1372 \cvarg{src}{The source image}
1373 \cvarg{dst}{The destination image}
1374 \cvarg{offset}{Coordinates of the top-left corner (or bottom-left in the case of images with bottom-left origin) of the destination image rectangle where the source image (or its ROI) is copied. Size of the rectanlge matches the source image size/ROI size}
1375 \cvarg{bordertype}{Type of the border to create around the copied source image rectangle; types inlude:
1377 \cvarg{IPL\_BORDER\_CONSTANT}{border is filled with the fixed value, passed as last parameter of the function.}
1378 \cvarg{IPL\_BORDER\_REPLICATE}{the pixels from the top and bottom rows, the left-most and right-most columns are replicated to fill the border.}
1380 (The other two border types from IPL, \texttt{IPL\_BORDER\_REFLECT} and \texttt{IPL\_BORDER\_WRAP}, are currently unsupported)}
1381 \cvarg{value}{Value of the border pixels if \texttt{bordertype} is \texttt{IPL\_BORDER\_CONSTANT}}
1384 The function \texttt{cvCopyMakeBorder} copies the source 2D array into the interior of the destination array and makes a border of the specified type around the copied area. The function is useful when one needs to emulate border type that is different from the one embedded into a specific algorithm implementation. For example, morphological functions, as well as most of other filtering functions in OpenCV, internally use replication border type, while the user may need a zero border or a border, filled with 1's or 255's.
1386 \cvfunc{Integral}\label{Integral}
1388 Calculates the integral of an image.
1399 CvArr* tilted\_sum=NULL );
1400 }{CPP}{Integral(image,sum,sqsum=NULL,tilted\_sum=NULL)-> None}
1403 \cvarg{image}{The source image, $W\times H$, 8-bit or floating-point (32f or 64f)}
1404 \cvarg{sum}{The integral image, $(W+1)\times (H+1)$, 32-bit integer or double precision floating-point (64f)}
1405 \cvarg{sqsum}{The integral image for squared pixel values, $(W+1)\times (H+1)$, double precision floating-point (64f)}
1406 \cvarg{tilted\_sum}{The integral for the image rotated by 45 degrees, $(W+1)\times (H+1)$, the same data type as \texttt{sum}}
1409 The function \texttt{cvIntegral} calculates one or more integral images for the source image as following:
1412 \texttt{sum}(X,Y) = \sum_{x<X,y<Y} \texttt{image}(x,y)
1416 \texttt{sqsum}(X,Y) = \sum_{x<X,y<Y} \texttt{image}(x,y)^2
1420 \texttt{tilted\_sum}(X,Y) = \sum_{y<Y,abs(x-X)<y} \texttt{image}(x,y)
1423 Using these integral images, one may calculate sum, mean and standard deviation over a specific up-right or rotated rectangular region of the image in a constant time, for example:
1426 \sum_{x_1<=x<x_2, \, y_1<=y<y_2} = \texttt{sum}(x_2,y_2)-\texttt{sum}(x_1,y_2)-\texttt{sum}(x_2,y_1)+\texttt{sum}(x_1,x_1)
1429 It makes possible to do a fast blurring or fast block correlation with variable window size, for example. In the case of multi-channel images, sums for each channel are accumulated independently.
1431 \cvfunc{CvtColor}\label{CvtColor}
1433 Converts an image from one color space to another.
1443 }{CPP}{CvtColor(src,dst,code)-> None}
1446 \cvarg{src}{The source 8-bit (8u), 16-bit (16u) or single-precision floating-point (32f) image}
1447 \cvarg{dst}{The destination image of the same data type as the source. The number of channels may be different}
1448 \cvarg{code}{Color conversion operation that can be specifed using \texttt{CV\_ \textit{src\_color\_space} 2 \textit{dst\_color\_space}} constants (see below)}
1451 The function \texttt{cvCvtColor} converts the input image from one color
1452 space to another. The function ignores the \texttt{colorModel} and
1453 \texttt{channelSeq} fields of the \texttt{IplImage} header, so the
1454 source image color space should be specified correctly (including
1455 order of the channels in the case of RGB space. For example, BGR means 24-bit
1456 format with $B_0, G_0, R_0, B_1, G_1, R_1, ...$ layout
1457 whereas RGB means 24-format with $R_0, G_0, B_0, R_1, G_1, B_1, ...$
1460 The conventional range for R,G,B channel values is:
1463 \item 0 to 255 for 8-bit images
1464 \item 0 to 65535 for 16-bit images and
1465 \item 0 to 1 for floating-point images.
1468 Of course, in the case of linear transformations the range can be
1469 specific, but in order to get correct results in the case of non-linear
1470 transformations, the input image should be scaled.
1472 The function can do the following transformations:
1475 \item Transformations within RGB space like adding/removing the alpha channel, reversing the channel order, conversion to/from 16-bit RGB color (R5:G6:B5 or R5:G5:B5), as well as conversion to/from grayscale using:
1477 \text{RGB[A] to Gray:} Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B
1481 \text{Gray to RGB[A]:} R \leftarrow Y, G \leftarrow Y, B \leftarrow Y, A \leftarrow 0
1484 The conversion from a RGB image to gray is done with:
1486 cvCvtColor(src ,bwsrc, CV_RGB2GRAY)
1489 \item RGB $\leftrightarrow$ CIE XYZ.Rec 709 with D65 white point (\texttt{CV\_BGR2XYZ, CV\_RGB2XYZ, CV\_XYZ2BGR, CV\_XYZ2RGB}):
1498 0.412453 & 0.357580 & 0.180423\\
1499 0.212671 & 0.715160 & 0.072169\\
1500 0.019334 & 0.119193 & 0.950227
1517 3.240479 & -1.53715 & -0.498535\\
1518 -0.969256 & 1.875991 & 0.041556\\
1519 0.055648 & -0.204043 & 1.057311
1528 $X$, $Y$ and $Z$ cover the whole value range (in the case of floating-point images $Z$ may exceed 1).
1530 \item RGB $\leftrightarrow$ YCrCb JPEG (a.k.a. YCC) (\texttt{CV\_BGR2YCrCb, CV\_RGB2YCrCb, CV\_YCrCb2BGR, CV\_YCrCb2RGB})
1531 \[ Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B \]
1532 \[ Cr \leftarrow (R-Y) \cdot 0.713 + delta \]
1533 \[ Cb \leftarrow (B-Y) \cdot 0.564 + delta \]
1534 \[ R \leftarrow Y + 1.403 \cdot (Cr - delta) \]
1535 \[ G \leftarrow Y - 0.344 \cdot (Cr - delta) - 0.714 \cdot (Cb - delta) \]
1536 \[ B \leftarrow Y + 1.773 \cdot (Cb - delta) \]
1541 128 & \mbox{for 8-bit images}\\
1542 32768 & \mbox{for 16-bit images}\\
1543 0.5 & \mbox{for floating-point images}
1546 Y, Cr and Cb cover the whole value range.
1548 \item RGB $\leftrightarrow$ HSV (\texttt{CV\_BGR2HSV, CV\_RGB2HSV, CV\_HSV2BGR, CV\_HSV2RGB})
1549 in the case of 8-bit and 16-bit images
1550 R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range
1551 \[ V \leftarrow max(R,G,B) \]
1553 \[ S \leftarrow \fork{\frac{V-min(R,G,B)}{V}}{if $V \neq 0$}{0}{otherwise} \]
1554 \[ H \leftarrow \forkthree
1555 {{60(G - B)}/{S}}{if $V=R$}
1556 {{120+60(B - R)}/{S}}{if $V=G$}
1557 {{240+60(R - G)}/{S}}{if $V=B$} \]
1558 if $H<0$ then $H \leftarrow H+360$
1560 On output $0 \leq V \leq 1$, $0 \leq S \leq 1$, $0 \leq H \leq 360$.
1562 The values are then converted to the destination data type:
1565 \[ V \leftarrow 255 V, S \leftarrow 255 S, H \leftarrow H/2 \text{(to fit to 0 to 255)} \]
1566 \item[16-bit images (currently not supported)]
1567 \[ V <- 65535 V, S <- 65535 S, H <- H \]
1568 \item[32-bit images]
1569 H, S, V are left as is
1572 \item RGB $\leftrightarrow$ HLS (\texttt{CV\_BGR2HLS, CV\_RGB2HLS, CV\_HLS2BGR, CV\_HLS2RGB}).
1573 in the case of 8-bit and 16-bit images
1574 R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range.
1575 \[ V_{max} \leftarrow {max}(R,G,B) \]
1576 \[ V_{min} \leftarrow {min}(R,G,B) \]
1577 \[ L \leftarrow \frac{V_{max} - V_{min}}{2} \]
1578 \[ S \leftarrow \fork
1579 {\frac{V_{max} - V_{min}}{V_{max} + V_{min}}}{if $L < 0.5$}
1580 {\frac{V_{max} - V_{min}}{2 - (V_{max} + V_{min})}}{if $L \ge 0.5$} \]
1581 \[ H \leftarrow \forkthree
1582 {{60(G - B)}/{S}}{if $V_{max}=R$}
1583 {{120+60(B - R)}/{S}}{if $V_{max}=G$}
1584 {{240+60(R - G)}/{S}}{if $V_{max}=B$} \]
1585 if $H<0$ then $H \leftarrow H+360$
1586 On output $0 \leq V \leq 1$, $0 \leq S \leq 1$, $0 \leq H \leq 360$.
1588 The values are then converted to the destination data type:
1591 \[ V \leftarrow 255 V, S \leftarrow 255 S, H \leftarrow H/2 \text{(to fit to 0 to 255)} \]
1592 \item[16-bit images (currently not supported)]
1593 \[ V <- 65535 V, S <- 65535 S, H <- H \]
1594 \item[32-bit images]
1595 H, S, V are left as is
1598 \item RGB $\leftrightarrow$ CIE L*a*b* (\texttt{CV\_BGR2Lab, CV\_RGB2Lab, CV\_Lab2BGR, CV\_Lab2RGB})
1599 in the case of 8-bit and 16-bit images
1600 R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range
1601 \[ \vecthree{X}{Y}{Z} \leftarrow \vecthreethree
1602 {0.412453}{0.357580}{0.180423}
1603 {0.212671}{0.715160}{0.072169}
1604 {0.019334}{0.119193}{0.950227}
1606 \vecthree{R}{G}{B} \]
1607 \[ X \leftarrow X/X_n, \text{where} X_n = 0.950456 \]
1608 \[ Z \leftarrow Z/Z_n, \text{where} Z_n = 1.088754 \]
1609 \[ L \leftarrow \fork
1610 {116*Y^{1/3}-16}{for $Y>0.008856$}
1611 {903.3*Y}{for $Y \le 0.008856$} \]
1612 \[ a \leftarrow 500 (f(X)-f(Y)) + delta \]
1613 \[ b \leftarrow 200 (f(Y)-f(Z)) + delta \]
1616 {t^{1/3}}{for $t>0.008856$}
1617 {7.787 t+16/116}{for $t<=0.008856$} \]
1619 \[ delta = \fork{128}{for 8-bit images}{0}{for floating-point images} \]
1620 On output $0 \leq L \leq 100$, $-127 \leq a \leq 127$, $-127 \leq b \leq 127$
1622 The values are then converted to the destination data type:
1625 \[L \leftarrow L*255/100, a \leftarrow a + 128, b \leftarrow b + 128\]
1626 \item[16-bit images] currently not supported
1627 \item[32-bit images]
1628 L, a, b are left as is
1631 \item RGB $\leftrightarrow$ CIE L*u*v* (\texttt{CV\_BGR2Luv, CV\_RGB2Luv, CV\_Luv2BGR, CV\_Luv2RGB})
1632 in the case of 8-bit and 16-bit images
1633 R, G and B are converted to floating-point format and scaled to fit 0 to 1 range
1634 \[ \vecthree{X}{Y}{Z} \leftarrow \vecthreethree
1635 {0.412453}{0.357580}{0.180423}
1636 {0.212671}{0.715160}{0.072169}
1637 {0.019334}{0.119193}{0.950227}
1639 \vecthree{R}{G}{B} \]
1640 \[ L \leftarrow \fork
1641 {116 Y^{1/3}}{for $Y>0.008856$}
1642 {903.3 Y}{for $Y<=0.008856$} \]
1643 \[ u' \leftarrow 4*X/(X + 15*Y + 3 Z) \]
1644 \[ v' \leftarrow 9*Y/(X + 15*Y + 3 Z) \]
1645 \[ u \leftarrow 13*L*(u' - u_n) \quad \text{where} \quad u_n=0.19793943 \]
1646 \[ v \leftarrow 13*L*(v' - v_n) \quad \text{where} \quad v_n=0.46831096 \]
1647 On output $0 \leq L \leq 100$, $-134 \leq u \leq 220$, $-140 \leq v \leq 122$.
1649 The values are then converted to the destination data type:
1652 \[L \leftarrow 255/100 L, u \leftarrow 255/354 (u + 134), v \leftarrow 255/256 (v + 140) \]
1653 \item[16-bit images] currently not supported
1654 \item[32-bit images] L, u, v are left as is
1657 The above formulas for converting RGB to/from various color spaces have been taken from multiple sources on Web, primarily from
1659 at the Charles Poynton site.
1661 \item Bayer $\rightarrow$ RGB (\texttt{CV\_BayerBG2BGR, CV\_BayerGB2BGR, CV\_BayerRG2BGR, CV\_BayerGR2BGR, CV\_BayerBG2RGB, CV\_BayerGB2RGB, CV\_BayerRG2RGB, CV\_BayerGR2RGB}) The Bayer pattern is widely used in CCD and CMOS cameras. It allows one to get color pictures from a single plane where R,G and B pixels (sensors of a particular component) are interleaved like this:
1663 \newcommand{\Rcell}{\color{red}R}
1664 \newcommand{\Gcell}{\color{green}G}
1665 \newcommand{\Bcell}{\color{blue}B}
1669 \definecolor{BackGray}{rgb}{0.8,0.8,0.8}
1670 \begin{array}{ c c c c c }
1671 \Rcell&\Gcell&\Rcell&\Gcell&\Rcell\\
1672 \Gcell&\colorbox{BackGray}{\Bcell}&\colorbox{BackGray}{\Gcell}&\Bcell&\Gcell\\
1673 \Rcell&\Gcell&\Rcell&\Gcell&\Rcell\\
1674 \Gcell&\Bcell&\Gcell&\Bcell&\Gcell\\
1675 \Rcell&\Gcell&\Rcell&\Gcell&\Rcell
1679 The output RGB components of a pixel are interpolated from 1, 2 or
1680 4 neighbors of the pixel having the same color. There are several
1681 modifications of the above pattern that can be achieved by shifting
1682 the pattern one pixel left and/or one pixel up. The two letters
1684 in the conversion constants
1685 \texttt{CV\_Bayer} $ C_1 C_2 $ \texttt{2BGR}
1687 \texttt{CV\_Bayer} $ C_1 C_2 $ \texttt{2RGB}
1688 indicate the particular pattern
1689 type - these are components from the second row, second and third
1690 columns, respectively. For example, the above pattern has very
1694 \cvfunc{Threshold}\label{Threshold}
1696 Applies a fixed-level threshold to array elements.
1709 int threshold\_type );
1710 }{CPP}{Threshold(src,dst,threshld,max\_value,threshold\_type)-> None}
1713 \cvarg{src}{Source array (single-channel, 8-bit of 32-bit floating point)}
1714 \cvarg{dst}{Destination array; must be either the same type as \texttt{src} or 8-bit}
1715 \cvarg{threshold}{Threshold value}
1716 \cvarg{max\_value}{Maximum value to use with \texttt{CV\_THRESH\_BINARY} and \texttt{CV\_THRESH\_BINARY\_INV} thresholding types}
1717 \cvarg{threshold\_type}{Thresholding type (see the discussion)}
1720 The function \texttt{cvThreshold} applies fixed-level thresholding
1721 to a single-channel array. The function is typically used to get a
1722 bi-level (binary) image out of a grayscale image (\cross{CmpS} could
1723 be also used for this purpose) or for removing a noise, i.e. filtering
1724 out pixels with too small or too large values. There are several
1725 types of thresholding that the function supports that are determined by
1726 \texttt{threshold\_type}:
1729 \cvarg{CV\_THRESH\_BINARY}{\[ \texttt{dst}(x,y) = \fork{\texttt{max\_value}}{if $\texttt{src}(x,y) > \texttt{threshold}$}{0}{otherwise} \]}
1730 \cvarg{CV\_THRESH\_BINARY\_INV}{\[ \texttt{dst}(x,y) = \fork{0}{if $\texttt{src}(x,y) > \texttt{threshold}$}{\texttt{max\_value}}{otherwise} \]}
1731 \cvarg{CV\_THRESH\_TRUNC}{\[ \texttt{dst}(x,y) = \fork{\texttt{threshold}}{if $\texttt{src}(x,y) > \texttt{threshold}$}{\texttt{src}(x,y)}{otherwise} \]}
1732 \cvarg{CV\_THRESH\_TOZERO}{\[ \texttt{dst}(x,y) = \fork{\texttt{src}(x,y)}{if $\texttt{src}(x,y) > \texttt{threshold}$}{0}{otherwise} \]}
1733 \cvarg{CV\_THRESH\_TOZERO\_INV}{\[ \texttt{dst}(x,y) = \fork{0}{if $\texttt{src}(x,y) > \texttt{threshold}$}{\texttt{src}(x,y)}{otherwise} \]}
1736 Also, the special value \texttt{CV\_THRESH\_OTSU} may be combined with
1737 one of the above values. In this case the function determines the optimal threshold
1738 value using Otsu's algorithm and uses it instead of the specified \texttt{thresh}.
1739 The function returns the computed threshold value.
1740 Currently, Otsu's method is implemented only for 8-bit images.
1742 \includegraphics[width=0.5\textwidth]{pics/threshold.png}
1744 \cvfunc{AdaptiveThreshold}\label{AdaptiveThreshold}
1746 Applies an adaptive threshold to an array.
1749 void cvAdaptiveThreshold(
1751 const CvArr* src,\par CvArr* dst,\par double max\_value,\par
1752 int adaptive\_method=CV\_ADAPTIVE\_THRESH\_MEAN\_C,\par
1753 int threshold\_type=CV\_THRESH\_BINARY,\par
1754 int block\_size=3,\par double param1=5 );
1756 }{CPP}{AdaptiveTheshold(src,dst,max\_value, adaptive\_method=CV\_ADAPTIVE\_THRESH\_MEAN\_C, threshold\_type=CV\_THRESH\_BINARY,block\_size=3,param1=5)-> None}
1759 \cvarg{src}{Source image}
1760 \cvarg{dst}{Destination image}
1761 \cvarg{max\_value}{Maximum value that is used with \texttt{CV\_THRESH\_BINARY} and \texttt{CV\_THRESH\_BINARY\_INV}}
1762 \cvarg{adaptive\_method}{Adaptive thresholding algorithm to use: \texttt{CV\_ADAPTIVE\_THRESH\_MEAN\_C} or \texttt{CV\_ADAPTIVE\_THRESH\_GAUSSIAN\_C} (see the discussion)}
1763 \cvarg{threshold\_type}{Thresholding type; must be one of
1765 \cvarg{CV\_THRESH\_BINARY}{xxx}
1766 \cvarg{CV\_THRESH\_BINARY\_INV}{xxx}
1768 \cvarg{block\_size}{The size of a pixel neighborhood that is used to calculate a threshold value for the pixel: 3, 5, 7, and so on}
1769 \cvarg{param1}{The method-dependent parameter. For the methods \texttt{CV\_ADAPTIVE\_THRESH\_MEAN\_C} and \texttt{CV\_ADAPTIVE\_THRESH\_GAUSSIAN\_C} it is a constant subtracted from the mean or weighted mean (see the discussion), though it may be negative}
1772 The function \texttt{cvAdaptiveThreshold} transforms a grayscale image to a binary image according to the formulas:
1775 \cvarg{CV\_THRESH\_BINARY}{\[ dst(x,y) = \fork{\texttt{max\_value}}{if $src(x,y) > T(x,y)$}{0}{otherwise} \]}
1776 \cvarg{CV\_THRESH\_BINARY\_INV}{\[ dst(x,y) = \fork{0}{if $src(x,y) > T(x,y)$}{\texttt{max\_value}}{otherwise} \]}
1779 where $T(x,y)$ is a threshold calculated individually for each pixel.
1781 For the method \texttt{CV\_ADAPTIVE\_THRESH\_MEAN\_C} it is the mean of a $\texttt{block\_size} \times \texttt{block\_size}$ pixel neighborhood, minus \texttt{param1}.
1783 For the method \texttt{CV\_ADAPTIVE\_THRESH\_GAUSSIAN\_C} it is the weighted sum (gaussian) of a $\texttt{block\_size} \times \texttt{block\_size}$ pixel neighborhood, minus \texttt{param1}.
1785 \subsection{Pyramids and the Applications}
1787 \cvfunc{PyrDown}\label{PyrDown}
1789 Downsamples an image.
1792 void cvPyrDown(\par const CvArr* src,\par CvArr* dst,\par int filter=CV\_GAUSSIAN\_5x5 );
1793 }{CPP}{PyrDown(src,dst,filter=CV\_GAUSSIAN\_5X5)-> None}
1796 \cvarg{src}{The source image}
1797 \cvarg{dst}{The destination image, should have a half as large width and height than the source}
1798 \cvarg{filter}{Type of the filter used for convolution; only \texttt{CV\_GAUSSIAN\_5x5} is currently supported}
1801 The function \texttt{cvPyrDown} performs the downsampling step of the Gaussian pyramid decomposition. First it convolves the source image with the specified filter and then downsamples the image by rejecting even rows and columns.
1803 \cvfunc{PyrUp}\label{PyrUp}
1808 void cvPyrUp(\par const CvArr* src,\par CvArr* dst,\par int filter=CV\_GAUSSIAN\_5x5 );
1809 }{CPP}{PyrUp(src,dst,filter=CV\_GAUSSIAN\_5X5)-> None}
1812 \cvarg{src}{The source image}
1813 \cvarg{dst}{The destination image, should have twice as large width and height than the source}
1814 \cvarg{filter}{Type of the filter used for convolution; only \texttt{CV\_GAUSSIAN\_5x5} is currently supported}
1817 The function \texttt{cvPyrUp} performs the up-sampling step of the Gaussian pyramid decomposition. First it upsamples the source image by injecting even zero rows and columns and then convolves the result with the specified filter multiplied by 4 for interpolation. So the destination image is four times larger than the source image.
1819 \cvfunc{PyrSegmentation}\label{PyrSegmentation}
1821 Implements image segmentation by pyramids.
1824 void cvPyrSegmentation(\par IplImage* src,\par IplImage* dst,\par
1825 CvMemStorage* storage,\par CvSeq** comp,\par
1826 int level,\par double threshold1,\par double threshold2 );
1827 }{CPP}{PyrSegmentation(src,dst,storage,level,threshold1,threshold2)-> comp}
1830 \cvarg{src}{The source image}
1831 \cvarg{dst}{The destination image}
1832 \cvarg{storage}{Storage; stores the resulting sequence of connected components}
1833 \cvarg{comp}{Pointer to the output sequence of the segmented components}
1834 \cvarg{level}{Maximum level of the pyramid for the segmentation}
1835 \cvarg{threshold1}{Error threshold for establishing the links}
1836 \cvarg{threshold2}{Error threshold for the segments clustering}
1839 The function \texttt{cvPyrSegmentation} implements image segmentation by pyramids. The pyramid builds up to the level \texttt{level}. The links between any pixel \texttt{a} on level \texttt{i} and its candidate father pixel \texttt{b} on the adjacent level are established if
1840 $p(c(a),c(b))<threshold1$.
1841 After the connected components are defined, they are joined into several clusters.
1842 Any two segments A and B belong to the same cluster, if $p(c(A),c(B))<threshold2$.
1843 If the input image has only one channel, then $p(c^1,c^2)=|c^1-c^2|$.
1844 If the input image has three channels (red, green and blue), then
1846 p(c^1,c^2) = 0.30 (c^1_r - c^2_r) +
1847 0.59 (c^1_g - c^2_g) +
1848 0.11 (c^1_b - c^2_b).
1851 There may be more than one connected component per a cluster. The images \texttt{src} and \texttt{dst} should be 8-bit single-channel or 3-channel images or equal size.
1853 \subsection{Connected Components and Contour Retrieval}
1855 \cvstruct{CvConnectedComp}\label{CvConnectedComp}
1857 Connected component.
1860 typedef struct CvConnectedComp
1862 double area; /* area of the segmented component */
1863 CvScalar value; /* average color of the connected component */
1864 CvRect rect; /* ROI of the segmented component */
1865 CvSeq* contour; /* optional component boundary
1866 (the contour might have child contours corresponding to the holes) */
1871 \cvfunc{FloodFill}\label{FloodFill}
1873 Fills a connected component with the given color.
1876 void cvFloodFill(\par CvArr* image,\par CvPoint seed\_point,\par CvScalar new\_val,\par
1877 CvScalar lo\_diff=cvScalarAll(0),\par CvScalar up\_diff=cvScalarAll(0),\par
1878 CvConnectedComp* comp=NULL,\par int flags=4,\par CvArr* mask=NULL );
1880 }{CPP}{FloodFill(image,seed\_point,new\_val,lo\_diff=cvScalarAll(0),up\_diff=cvScalarAll(0),flags=4,mask=NULL)-> comp}
1883 #define CV_FLOODFILL_FIXED_RANGE (1 << 16)
1884 #define CV_FLOODFILL_MASK_ONLY (1 << 17)
1888 \cvarg{image}{Input 1- or 3-channel, 8-bit or floating-point image. It is modified by the function unless the \texttt{CV\_FLOODFILL\_MASK\_ONLY} flag is set (see below)}
1889 \cvarg{seed\_point}{The starting point}
1890 \cvarg{new\_val}{New value of the repainted domain pixels}
1891 \cvarg{lo\_diff}{Maximal lower brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component. In the case of 8-bit color images it is a packed value}
1892 \cvarg{up\_diff}{Maximal upper brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component. In the case of 8-bit color images it is a packed value}
1893 \cvarg{comp}{Pointer to the structure that the function fills with the information about the repainted domain}
1894 \cvarg{flags}{The operation flags. Lower bits contain connectivity value, 4 (by default) or 8, used within the function. Connectivity determines which neighbors of a pixel are considered. Upper bits can be 0 or a combination of the following flags:
1896 \cvarg{CV\_FLOODFILL\_FIXED\_RANGE}{if set, the difference between the current pixel and seed pixel is considered, otherwise the difference between neighbor pixels is considered (the range is floating)}
1897 \cvarg{CV\_FLOODFILL\_MASK\_ONLY}{if set, the function does not fill the image (\texttt{new\_val} is ignored), but fills the mask (that must be non-NULL in this case)}
1899 \cvarg{mask}{Operation mask, should be a single-channel 8-bit image, 2 pixels wider and 2 pixels taller than \texttt{image}. If not NULL, the function uses and updates the mask, so the user takes responsibility of initializing the \texttt{mask} content. Floodfilling can't go across non-zero pixels in the mask, for example, an edge detector output can be used as a mask to stop filling at edges. It is possible to use the same mask in multiple calls to the function to make sure the filled area do not overlap. \textbf{Note}: because the mask is larger than the filled image, a pixel in \texttt{mask} that corresponds to $(x,y)$ pixel in \texttt{image} will have coordinates $(x+1,y+1)$ }
1902 The function \texttt{cvFloodFill} fills a connected component starting from the seed point with the specified color. The connectivity is determined by the closeness of pixel values. The pixel at $(x,y)$ is considered to belong to the repainted domain if:
1906 \item[grayscale image, floating range] \[
1907 src(x',y')-\texttt{lo\_diff} <= src(x,y) <= src(x',y')+\texttt{up\_diff} \]
1909 \item[grayscale image, fixed range] \[
1910 src(seed.x,seed.y)-\texttt{lo\_diff}<=src(x,y)<=src(seed.x,seed.y)+\texttt{up\_diff} \]
1912 \item[color image, floating range]
1913 \[ src(x',y')_r-\texttt{lo\_diff}_r<=src(x,y)_r<=src(x',y')_r+\texttt{up\_diff}_r \]
1914 \[ src(x',y')_g-\texttt{lo\_diff}_g<=src(x,y)_g<=src(x',y')_g+\texttt{up\_diff}_g \]
1915 \[ src(x',y')_b-\texttt{lo\_diff}_b<=src(x,y)_b<=src(x',y')_b+\texttt{up\_diff}_b \]
1917 \item[color image, fixed range]
1918 \[ src(seed.x,seed.y)_r-\texttt{lo\_diff}_r<=src(x,y)_r<=src(seed.x,seed.y)_r+\texttt{up\_diff}_r \]
1919 \[ src(seed.x,seed.y)_g-\texttt{lo\_diff}_g<=src(x,y)_g<=src(seed.x,seed.y)_g+\texttt{up\_diff}_g \]
1920 \[ src(seed.x,seed.y)_b-\texttt{lo\_diff}_b<=src(x,y)_b<=src(seed.x,seed.y)_b+\texttt{up\_diff}_b \]
1923 where $src(x',y')$ is the value of one of pixel neighbors. That is, to be added to the connected component, a pixel's color/brightness should be close enough to the:
1925 \item color/brightness of one of its neighbors that are already referred to the connected component in the case of floating range
1926 \item color/brightness of the seed point in the case of fixed range.
1929 \cvfunc{FindContours}\label{FindContours}
1931 Finds the contours in a binary image.
1934 int cvFindContours(\par CvArr* image,\par CvMemStorage* storage,\par CvSeq** first\_contour,\par
1935 int header\_size=sizeof(CvContour),\par int mode=CV\_RETR\_LIST,\par
1936 int method=CV\_CHAIN\_APPROX\_SIMPLE,\par CvPoint offset=cvPoint(0,0) );
1937 }{CPP}{FindContours(image, storage, mode=CV\_RETR\_LIST, method=CV\_CHAIN\_APPROX\_SIMPLE, offset=(0,0)) -> cvseq}
1940 \cvarg{image}{The source, an 8-bit single channel image. Non-zero pixels are treated as 1's, zero pixels remain 0's - the image is treated as \texttt{binary}. To get such a binary image from grayscale, one may use \cross{Threshold}, \cross{AdaptiveThreshold} or \cross{Canny}. The function modifies the source image's content}
1941 \cvarg{storage}{Container of the retrieved contours}
1942 \cvarg{first\_contour}{Output parameter, will contain the pointer to the first outer contour}
1943 \cvarg{header\_size}{Size of the sequence header, $\ge \texttt{sizeof(CvChain)}$ if $\texttt{method} =\texttt{CV\_CHAIN\_CODE}$,
1944 and $\ge \texttt{sizeof(CvContour)}$ otherwise}
1945 \cvarg{mode}{Retrieval mode
1947 \cvarg{CV\_RETR\_EXTERNAL}{retrives only the extreme outer contours}
1948 \cvarg{CV\_RETR\_LIST}{retrieves all of the contours and puts them in the list}
1949 \cvarg{CV\_RETR\_CCOMP}{retrieves all of the contours and organizes them into a two-level hierarchy: on the top level are the external boundaries of the components, on the second level are the boundaries of the holes}
1950 \cvarg{CV\_RETR\_TREE}{retrieves all of the contours and reconstructs the full hierarchy of nested contours}
1952 \cvarg{method}{Approximation method (for all the modes, except \texttt{CV\_LINK\_RUNS}, which uses built-in approximation)
1954 \cvarg{CV\_CHAIN\_CODE}{outputs contours in the Freeman chain code. All other methods output polygons (sequences of vertices)}
1955 \cvarg{CV\_CHAIN\_APPROX\_NONE}{translates all of the points from the chain code into points}
1956 \cvarg{CV\_CHAIN\_APPROX\_SIMPLE}{compresses horizontal, vertical, and diagonal segments and leaves only their end points}
1957 \cvarg{CV\_CHAIN\_APPROX\_TC89\_L1,CV\_CHAIN\_APPROX\_TC89\_KCOS}{applies one of the flavors of the Teh-Chin chain approximation algorithm.}
1958 \cvarg{CV\_LINK\_RUNS}{uses a completely different contour retrieval algorithm by linking horizontal segments of 1's. Only the \texttt{CV\_RETR\_LIST} retrieval mode can be used with this method.}
1960 \cvarg{offset}{Offset, by which every contour point is shifted. This is useful if the contours are extracted from the image ROI and then they should be analyzed in the whole image context}
1963 The function \texttt{cvFindContours} retrieves contours from the
1964 binary image and returns the number of retrieved contours. The
1965 pointer \texttt{first\_contour} is filled by the function. It will
1966 contain a pointer to the first outermost contour or \texttt{NULL} if no
1967 contours are detected (if the image is completely black). Other
1968 contours may be reached from \texttt{first\_contour} using the
1969 \texttt{h\_next} and \texttt{v\_next} links. The sample in the
1970 \cross{DrawContours} discussion shows how to use contours for
1971 connected component detection. Contours can be also used for shape
1972 analysis and object recognition - see \texttt{squares.c} in the OpenCV
1976 \cvfunc{StartFindContours}\label{StartFindContours}
1978 Initializes the contour scanning process.
1981 CvContourScanner cvStartFindContours(\par CvArr* image,\par CvMemStorage* storage,\par
1982 int header\_size=sizeof(CvContour),\par
1983 int mode=CV\_RETR\_LIST,\par
1984 int method=CV\_CHAIN\_APPROX\_SIMPLE,\par
1985 CvPoint offset=cvPoint(0,\par0) );
1989 \cvarg{image}{The 8-bit, single channel, binary source image}
1990 \cvarg{storage}{Container of the retrieved contours}
1991 \cvarg{header\_size}{Size of the sequence header, $>=sizeof(CvChain)$ if \texttt{method} =CV\_CHAIN\_CODE, and $>=sizeof(CvContour)$ otherwise}
1992 \cvarg{mode}{Retrieval mode; see \cross{FindContours}}
1993 \cvarg{method}{Approximation method. It has the same meaning in \cross{FindContours}, but \texttt{CV\_LINK\_RUNS} can not be used here}
1994 \cvarg{offset}{ROI offset; see \cross{FindContours}}
1997 The function \texttt{cvStartFindContours} initializes and returns a pointer to the contour scanner. The scanner is used in \cross{FindNextContour} to retrieve the rest of the contours.
1999 \cvfunc{FindNextContour}\label{FindNextContour}
2001 Finds the next contour in the image.
2004 CvSeq* cvFindNextContour( \par CvContourScanner scanner );
2008 \cvarg{scanner}{Contour scanner initialized by \cross{StartFindContours} }
2011 The function \texttt{cvFindNextContour} locates and retrieves the next contour in the image and returns a pointer to it. The function returns NULL if there are no more contours.
2013 \cvfunc{SubstituteContour}\label{SubstituteContour}
2015 Replaces a retrieved contour.
2018 void cvSubstituteContour( \par CvContourScanner scanner, \par CvSeq* new\_contour );
2022 \cvarg{scanner}{Contour scanner initialized by \cross{StartFindContours} }
2023 \cvarg{new\_contour}{Substituting contour}
2026 The function \texttt{cvSubstituteContour} replaces the retrieved
2027 contour, that was returned from the preceding call of
2028 \cross{FindNextContour} and stored inside the contour scanner
2029 state, with the user-specified contour. The contour is inserted
2030 into the resulting structure, list, two-level hierarchy, or tree,
2031 depending on the retrieval mode. If the parameter \texttt{new\_contour}
2032 is \texttt{NULL}, the retrieved contour is not included in the
2033 resulting structure, nor are any of its children that might be added
2034 to this structure later.
2036 \cvfunc{EndFindContours}\label{EndFindContours}
2038 Finishes the scanning process.
2041 CvSeq* cvEndFindContours( \par CvContourScanner* scanner );
2045 \cvarg{scanner}{Pointer to the contour scanner}
2048 The function \texttt{cvEndFindContours} finishes the scanning process and returns a pointer to the first contour on the highest level.
2051 \cvfunc{PyrMeanShiftFiltering}
2053 Does meanshift image segmentation
2057 void cvPyrMeanShiftFiltering( \par const CvArr* src, \par CvArr* dst,
2058 \par double sp, \par double sr, \par int max\_level=1,
2059 \par CvTermCriteria termcrit=\par cvTermCriteria(CV\_TERMCRIT\_ITER+CV\_TERMCRIT\_EPS,5,1));
2061 }{CPP}{PyrMeanShiftFiltering(src,dst,sp,sr,max\_level=1,
2062 termcrit=\par (CV\_TERMCRIT\_ITER+CV\_TERMCRIT\_EPS,5,1))-> None}
2065 \cvarg{src}{The source 8-bit, 3-channel image.}
2066 \cvarg{dst}{The destination image of the same format and the same size as the source.}
2067 \cvarg{sp}{The spatial window radius.}
2068 \cvarg{sr}{The color window radius.}
2069 \cvarg{max\_level}{Maximum level of the pyramid for the segmentation.}
2070 \cvarg{termcrit}{Termination criteria: when to stop meanshift iterations.}
2073 The function \texttt{cvPyrMeanShiftFiltering} implements the filtering
2074 stage of meanshift segmentation, that is, the output of the function is
2075 the filtered "posterized" image with color gradients and fine-grain
2076 texture flattened. At every pixel $(X,Y)$ of the input image (or
2077 down-sized input image, see below) the function executes meanshift
2078 iterations, that is, the pixel $(X,Y)$ neighborhood in the joint
2079 space-color hyperspace is considered:
2082 (x,y): X-\texttt{sp} \le x \le X+\texttt{sp} , Y-\texttt{sp} \le y \le Y+\texttt{sp} , ||(R,G,B)-(r,g,b)|| \le \texttt{sr}
2085 where \texttt{(R,G,B)} and \texttt{(r,g,b)} are the vectors of color components at \texttt{(X,Y)} and \texttt{(x,y)}, respectively (though, the algorithm does not depend on the color space used, so any 3-component color space can be used instead). Over the neighborhood the average spatial value \texttt{(X',Y')} and average color vector \texttt{(R',G',B')} are found and they act as the neighborhood center on the next iteration:
2087 $(X,Y)~(X',Y'), (R,G,B)~(R',G',B').$
2089 After the iterations over, the color components of the initial pixel (that is, the pixel from where the iterations started) are set to the final value (average color at the last iteration):
2091 $I(X,Y) <- (R*,G*,B*)$
2093 Then $\texttt{max\_level}>0$ , the gaussian pyramid of
2094 $\texttt{max\_level}+1$ levels is built, and the above procedure is run
2095 on the smallest layer. After that, the results are propagated to the
2096 larger layer and the iterations are run again only on those pixels where
2097 the layer colors differ much ( $>\texttt{sr}$ ) from the lower-resolution
2098 layer, that is, the boundaries of the color regions are clarified. Note,
2099 that the results will be actually different from the ones obtained by
2100 running the meanshift procedure on the whole original image (i.e. when
2101 $\texttt{max\_level}==0$ ).
2105 Does watershed segmentation.
2109 void cvWatershed( const CvArr* image, CvArr* markers );
2112 }{CPP}{Watershed(image,markers)-> None}
2115 \cvarg{image}{The input 8-bit 3-channel image.}
2116 \cvarg{markers}{The input/output 32-bit single-channel image (map) of markers.}
2119 The function \texttt{cvWatershed} implements one of the variants
2120 of watershed, non-parametric marker-based segmentation algorithm,
2121 described in \href{Meyer92}{[Meyer92]} Before passing the image to the
2122 function, user has to outline roughly the desired regions in the image
2123 \texttt{markers} with positive ($>0$) indices, i.e. every region is
2124 represented as one or more connected components with the pixel values
2125 1, 2, 3 etc. Those components will be "seeds" of the future image
2126 regions. All the other pixels in \texttt{markers}, which relation to the
2127 outlined regions is not known and should be defined by the algorithm,
2128 should be set to 0's. On the output of the function, each pixel in
2129 markers is set to one of values of the "seed" components, or to -1 at
2130 boundaries between the regions.
2132 Note, that it is not necessary that every two neighbor connected
2133 components are separated by a watershed boundary (-1's pixels), for
2134 example, in case when such tangent components exist in the initial
2135 marker image. Visual demonstration and usage example of the function
2136 can be found in OpenCV samples directory; see \texttt{watershed.cpp} demo.
2138 \subsection{Image and Contour moments}
2140 \cvfunc{Moments}\label{Moments}
2142 Calculates all of the moments up to the third order of a polygon or rasterized shape.
2145 void cvMoments( \par const CvArr* arr,\par CvMoments* moments,\par int binary=0 );
2146 }{CPP}{Moments(arr) -> cvmoments}
2149 \cvarg{arr}{Image (1-channel or 3-channel with COI set) or polygon (CvSeq of points or a vector of points)}
2150 \cvarg{moments}{Pointer to returned moment's state structure}
2151 \cvarg{binary}{(For images only) If the flag is non-zero, all of the zero pixel values are treated as zeroes, and all of the others are treated as 1's}
2154 The function \texttt{cvMoments} calculates spatial and central moments up to the third order and writes them to \texttt{moments}. The moments may then be used then to calculate the gravity center of the shape, its area, main axises and various shape characeteristics including 7 Hu invariants.
2156 \cvfunc{GetSpatialMoment}\label{GetSpatialMoment}
2158 Retrieves the spatial moment from the moment state structure.
2161 double cvGetSpatialMoment( \par CvMoments* moments, \par int x\_order, \par int y\_order );
2162 }{CPP}{GetSpatialMoment(cvmoments, x\_order, y\_order) -> double}
2165 \cvarg{moments}{The moment state, calculated by \cross{Moments}}
2166 \cvarg{x\_order}{x order of the retrieved moment, $\texttt{x\_order} >= 0$}
2167 \cvarg{y\_order}{y order of the retrieved moment, $\texttt{y\_order} >= 0$ and $\texttt{x\_order} + \texttt{y\_order} <= 3$}
2170 The function \texttt{cvGetSpatialMoment} retrieves the spatial moment, which in the case of image moments is defined as:
2173 M_{x\_order, \, y\_order} = \sum_{x,y} (I(x,y) \cdot x^{x\_order} \cdot y^{y\_order})
2176 where $I(x,y)$ is the intensity of the pixel $(x, y)$.
2178 \cvfunc{GetCentralMoment}\label{GetCentralMoment}
2180 Retrieves the central moment from the moment state structure.
2183 double cvGetCentralMoment( \par CvMoments* moments,\par int x\_order,\par int y\_order );
2184 }{CPP}{GetCentralMoment(cvmoments, x\_order, y\_order) -> double}
2187 \cvarg{moments}{Pointer to the moment state structure}
2188 \cvarg{x\_order}{x order of the retrieved moment, $\texttt{x\_order} >= 0$}
2189 \cvarg{y\_order}{y order of the retrieved moment, $\texttt{y\_order} >= 0$ and $\texttt{x\_order} + \texttt{y\_order} <= 3$}
2192 The function \texttt{cvGetCentralMoment} retrieves the central moment, which in the case of image moments is defined as:
2195 \mu_{x\_order, \, y\_order} = \sum_{x,y} (I(x,y) \cdot (x-x_c)^{x\_order} \cdot (y-y_c)^{y\_order})
2198 where $x_c,y_c$ are the coordinates of the gravity center:
2201 x_c=\frac{M_{10}}{M_{00}}, y_c=\frac{M_{01}}{M_{00}}
2204 \cvfunc{GetNormalizedCentralMoment}\label{GetNormalizedCentralMoment}
2206 Retrieves the normalized central moment from the moment state structure.
2209 double cvGetNormalizedCentralMoment( \par CvMoments* moments,\par int x\_order,\par int y\_order );
2210 }{CPP}{GetNormalizedCentralMoment(cvmoments, x\_order, y\_order) -> double}
2213 \cvarg{moments}{Pointer to the moment state structure}
2214 \cvarg{x\_order}{x order of the retrieved moment, $\texttt{x\_order} >= 0$}
2215 \cvarg{y\_order}{y order of the retrieved moment, $\texttt{y\_order} >= 0$ and $\texttt{x\_order} + \texttt{y\_order} <= 3$}
2218 The function \texttt{cvGetNormalizedCentralMoment} retrieves the normalized central moment:
2221 \eta_{x\_order, \, y\_order} = \frac{\mu_{x\_order, \, y\_order}}{M_{00}^{(y\_order+x\_order)/2+1}}
2224 \cvfunc{GetHuMoments}\label{GetHuMoments}
2226 Calculates the seven Hu invariants.
2229 void cvGetHuMoments( CvMoments* moments, CvHuMoments* hu\_moments );
2230 }{CPP}{GetHuMoments(cvmoments) -> (h1, h2, h3, h4, h5, h5, h7)}
2233 \cvarg{moments}{Pointer to the moment state structure}
2234 \cvC{\cvarg{hu\_moments}{Pointer to the Hu moments structure}}
2237 The function \texttt{cvGetHuMoments} calculates the seven Hu invariants that are defined as:
2240 h1=\eta_{20}+\eta_{02}\\
2241 h2=(\eta_{20}-\eta_{02})^{2}+4\eta_{11}^{2}\\
2242 h3=(\eta_{30}-3\eta_{12})^{2}+ (3\eta_{21}-\eta_{03})^{2}\\
2243 h4=(\eta_{30}+\eta_{12})^{2}+ (\eta_{21}+\eta_{03})^{2}\\
2244 h5=(\eta_{30}-3\eta_{12})(\eta_{30}+\eta_{12})[(\eta_{30}+\eta_{12})^{2}-3(\eta_{21}+\eta_{03})^{2}]+(3\eta_{21}-\eta_{03})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]\\
2245 h6=(\eta_{20}-\eta_{02})[(\eta_{30}+\eta_{12})^{2}- (\eta_{21}+\eta_{03})^{2}]+4\eta_{11}(\eta_{30}+\eta_{12})(\eta_{21}+\eta_{03})\\
2246 h7=(3\eta_{21}-\eta_{03})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]-(\eta_{30}-3\eta_{12})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]\\
2250 where $\eta_{i,j}$ are the normalized central moments of the $2^{nd}$ and $3^{rd}$ orders.
2251 These values are proved to be invariants to the image scale, rotation, and reflection except the seventh one, whose sign is changed by reflection.
2253 \subsection{Special Image Transforms}
2255 \cvfunc{HoughLines2}\label{HoughLines2}
2257 Finds lines in a binary image using a Hough transform.
2260 CvSeq* cvHoughLines2( \par CvArr* image,\par void* line\_storage,\par int method,\par double rho,\par double theta,\par int threshold,\par double param1=0,\par double param2=0 );
2261 }{CPP}{HoughLines2(image,storage,method,rho,theta,threshold,param1=0,parma2=0)-> lines}
2264 \cvarg{image}{The 8-bit, single-channel, binary source image. In the case of a probabilistic method, the image is modified by the function}
2265 \cvarg{line\_storage}{The storage for the lines that are detected. It can
2266 be a memory storage (in this case a sequence of lines is created in
2267 the storage and returned by the function) or single row/single column
2268 matrix (CvMat*) of a particular type (see below) to which the lines'
2269 parameters are written. The matrix header is modified by the function
2270 so its \texttt{cols} or \texttt{rows} will contain the number of lines
2271 detected. If \texttt{line\_storage} is a matrix and the actual number
2272 of lines exceeds the matrix size, the maximum possible number of lines
2273 is returned (in the case of standard hough transform the lines are sorted
2274 by the accumulator value)}
2275 \cvarg{method}{The Hough transform variant, one of the following:
2277 \cvarg{CV\_HOUGH\_STANDARD}{classical or standard Hough transform. Every line is represented by two floating-point numbers $(\rho, \theta)$, where $\rho$ is a distance between (0,0) point and the line, and $\theta$ is the angle between x-axis and the normal to the line. Thus, the matrix must be (the created sequence will be) of \texttt{CV\_32FC2} type}
2278 \cvarg{CV\_HOUGH\_PROBABILISTIC}{probabilistic Hough transform (more efficient in case if picture contains a few long linear segments). It returns line segments rather than the whole line. Each segment is represented by starting and ending points, and the matrix must be (the created sequence will be) of \texttt{CV\_32SC4} type}
2279 \cvarg{CV\_HOUGH\_MULTI\_SCALE}{multi-scale variant of the classical Hough transform. The lines are encoded the same way as \texttt{CV\_HOUGH\_STANDARD}}
2281 \cvarg{rho}{Distance resolution in pixel-related units}
2282 \cvarg{theta}{Angle resolution measured in radians}
2283 \cvarg{threshold}{Threshold parameter. A line is returned by the function if the corresponding accumulator value is greater than \texttt{threshold}}
2284 \cvarg{param1}{The first method-dependent parameter:
2286 \item For the classical Hough transform it is not used (0).
2287 \item For the probabilistic Hough transform it is the minimum line length.
2288 \item For the multi-scale Hough transform it is the divisor for the distance resolution $\rho$. (The coarse distance resolution will be $\rho$ and the accurate resolution will be $(\rho / \texttt{param1})$).
2290 \cvarg{param2}{The second method-dependent parameter:
2292 \item For the classical Hough transform it is not used (0).
2293 \item For the probabilistic Hough transform it is the maximum gap between line segments lying on the same line to treat them as a single line segment (i.e. to join them).
2294 \item For the multi-scale Hough transform it is the divisor for the angle resolution $\theta$. (The coarse angle resolution will be $\theta$ and the accurate resolution will be $(\theta / \texttt{param2})$).
2298 The function \texttt{cvHoughLines2} implements a few variants of the Hough transform for line detection.
2300 % ===== Example. Detecting lines with Hough transform. =====
2302 /* This is a standalone program. Pass an image name as a first parameter
2303 of the program. Switch between standard and probabilistic Hough transform
2304 by changing "#if 1" to "#if 0" and back */
2306 #include <highgui.h>
2309 int main(int argc, char** argv)
2312 if( argc == 2 && (src=cvLoadImage(argv[1], 0))!= 0)
2314 IplImage* dst = cvCreateImage( cvGetSize(src), 8, 1 );
2315 IplImage* color_dst = cvCreateImage( cvGetSize(src), 8, 3 );
2316 CvMemStorage* storage = cvCreateMemStorage(0);
2319 cvCanny( src, dst, 50, 200, 3 );
2320 cvCvtColor( dst, color_dst, CV_GRAY2BGR );
2322 lines = cvHoughLines2( dst,
2331 for( i = 0; i < MIN(lines->total,100); i++ )
2333 float* line = (float*)cvGetSeqElem(lines,i);
2334 float rho = line[0];
2335 float theta = line[1];
2337 double a = cos(theta), b = sin(theta);
2338 double x0 = a*rho, y0 = b*rho;
2339 pt1.x = cvRound(x0 + 1000*(-b));
2340 pt1.y = cvRound(y0 + 1000*(a));
2341 pt2.x = cvRound(x0 - 1000*(-b));
2342 pt2.y = cvRound(y0 - 1000*(a));
2343 cvLine( color_dst, pt1, pt2, CV_RGB(255,0,0), 3, 8 );
2346 lines = cvHoughLines2( dst,
2348 CV_HOUGH_PROBABILISTIC,
2354 for( i = 0; i < lines->total; i++ )
2356 CvPoint* line = (CvPoint*)cvGetSeqElem(lines,i);
2357 cvLine( color_dst, line[0], line[1], CV_RGB(255,0,0), 3, 8 );
2360 cvNamedWindow( "Source", 1 );
2361 cvShowImage( "Source", src );
2363 cvNamedWindow( "Hough", 1 );
2364 cvShowImage( "Hough", color_dst );
2371 This is the sample picture the function parameters have been tuned for:
2373 \includegraphics[width=0.5\textwidth]{pics/building.jpg}
2375 And this is the output of the above program in the case of probabilistic Hough transform (\texttt{\#if 0} case):
2377 \includegraphics[width=0.5\textwidth]{pics/houghp.png}
2379 \cvfunc{HoughCircles}\label{HoughCircles}
2381 Finds circles in a grayscale image using a Hough transform.
2384 CvSeq* cvHoughCircles( \par CvArr* image,\par void* circle\_storage,\par int method,\par double dp,\par double min\_dist,\par double param1=100,\par double param2=100,\par int min\_radius=0,\par int max\_radius=0 );
2385 }{CPP}{HoughCircles(image,circle\_storage,method,dp,min\_dist,param1=100,param2=100,min\_radius=0,max\_radius=0)-> None}
2388 \cvarg{image}{The 8-bit, single-channel, grayscale source image}
2389 \cvarg{circle\_storage}{The storage for the circles detected. It can be a memory storage (in this case a sequence of circles is created in the storage and returned by the function) or single row/single column matrix (CvMat*) of type \texttt{CV\_32FC3}, to which the circles' parameters are written. The matrix header is modified by the function so its \texttt{cols} or \texttt{rows} will contain the number of lines detected. If \texttt{circle\_storage} is a matrix and the actual number of lines exceeds the matrix size, the maximum possible number of circles is returned. Every circle is encoded as 3 floating-point numbers: center coordinates (x,y) and the radius}
2390 \cvarg{method}{Currently, the only implemented method is \texttt{CV\_HOUGH\_GRADIENT}, which is basically 21HT, described in Yuen03.}
2391 \cvarg{dp}{Resolution of the accumulator used to detect the centers of the circles. For example, if it is 1, the accumulator will have the same resolution as the input image, if it is 2 - accumulator will have half as big width and height, etc}
2392 \cvarg{min\_dist}{Minimum distance between the centers of the detected circles. If the parameter is too small, multiple neighbor circles may be falsely detected in addition to a true one. If it is too large, some circles may be missed}
2393 \cvarg{param1}{The first method-specific parameter. in the case of \texttt{CV\_HOUGH\_GRADIENT} it is the higher threshold of the two passed to \cross{Canny} edge detector (the lower one will be twice smaller)}
2394 \cvarg{param2}{The second method-specific parameter. in the case of \texttt{CV\_HOUGH\_GRADIENT} it is the accumulator threshold at the center detection stage. theThe smaller it is, the more false circles may be detected. Circles, corresponding to the larger accumulator values, will be returned first}
2395 \cvarg{min\_radius}{Minimum circle radius}
2396 \cvarg{max\_radius}{Maximum circle radius}
2399 The function \texttt{cvHoughCircles} finds circles in a grayscale image using some modification of Hough transform.
2401 % ===== Example. Detecting circles with Hough transform. =====
2404 #include <highgui.h>
2407 int main(int argc, char** argv)
2410 if( argc == 2 && (img=cvLoadImage(argv[1], 1))!= 0)
2412 IplImage* gray = cvCreateImage( cvGetSize(img), 8, 1 );
2413 CvMemStorage* storage = cvCreateMemStorage(0);
2414 cvCvtColor( img, gray, CV_BGR2GRAY );
2415 // smooth it, otherwise a lot of false circles may be detected
2416 cvSmooth( gray, gray, CV_GAUSSIAN, 9, 9 );
2417 CvSeq* circles = cvHoughCircles( gray,
2425 for( i = 0; i < circles->total; i++ )
2427 float* p = (float*)cvGetSeqElem( circles, i );
2429 cvPoint(cvRound(p[0]),cvRound(p[1])),
2434 cvPoint(cvRound(p[0]),cvRound(p[1])),
2439 cvNamedWindow( "circles", 1 );
2440 cvShowImage( "circles", img );
2446 \cvfunc{DistTransform}\label{DistTransform}
2448 Calculates the distance to the closest zero pixel for all non-zero pixels of the source image.
2451 void cvDistTransform( \par const CvArr* src,\par CvArr* dst,\par int distance\_type=CV\_DIST\_L2,\par int mask\_size=3,\par const float* mask=NULL,\par CvArr* labels=NULL );
2452 }{CPP}{DistTransform(src,dst,distance\_type=CV\_DIST\_L2,mask\_size=3,mask={NULL,0},labels=NULL)-> None}
2455 \cvarg{src}{8-bit, single-channel (binary) source image}
2456 \cvarg{dst}{Output image with calculated distances (32-bit floating-point, single-channel)}
2457 \cvarg{distance\_type}{Type of distance; can be \texttt{CV\_DIST\_L1, CV\_DIST\_L2, CV\_DIST\_C} or \texttt{CV\_DIST\_USER}}
2458 \cvarg{mask\_size}{Size of the distance transform mask; can be 3 or 5. in the case of \texttt{CV\_DIST\_L1} or \texttt{CV\_DIST\_C} the parameter is forced to 3, because a $3\times 3$ mask gives the same result as a $5\times 5 $ yet it is faster}
2459 \cvarg{mask}{User-defined mask in the case of a user-defined distance, it consists of 2 numbers (horizontal/vertical shift cost, diagonal shift cost) in the case ofa $3\times 3$ mask and 3 numbers (horizontal/vertical shift cost, diagonal shift cost, knight's move cost) in the case of a $5\times 5$ mask}
2460 \cvarg{labels}{The optional output 2d array of integer type labels, the same size as \texttt{src} and \texttt{dst}}
2463 The function \texttt{cvDistTransform} calculates the approximated
2464 distance from every binary image pixel to the nearest zero pixel.
2465 For zero pixels the function sets the zero distance, for others it
2466 finds the shortest path consisting of basic shifts: horizontal,
2467 vertical, diagonal or knight's move (the latest is available for a
2468 $5\times 5$ mask). The overall distance is calculated as a sum of these
2469 basic distances. Because the distance function should be symmetric,
2470 all of the horizontal and vertical shifts must have the same cost (that
2471 is denoted as \texttt{a}), all the diagonal shifts must have the
2472 same cost (denoted \texttt{b}), and all knight's moves must have
2473 the same cost (denoted \texttt{c}). For \texttt{CV\_DIST\_C} and
2474 \texttt{CV\_DIST\_L1} types the distance is calculated precisely,
2475 whereas for \texttt{CV\_DIST\_L2} (Euclidian distance) the distance
2476 can be calculated only with some relative error (a $5\times 5$ mask
2477 gives more accurate results), OpenCV uses the values suggested in
2481 \begin{tabular}{| c | c | c |}
2483 \texttt{CV\_DIST\_C} & $(3\times 3)$ & a = 1, b = 1\\ \hline
2484 \texttt{CV\_DIST\_L1} & $(3\times 3)$ & a = 1, b = 2\\ \hline
2485 \texttt{CV\_DIST\_L2} & $(3\times 3)$ & a=0.955, b=1.3693\\ \hline
2486 \texttt{CV\_DIST\_L2} & $(5\times 5)$ & a=1, b=1.4, c=2.1969\\ \hline
2489 And below are samples of the distance field (black (0) pixel is in the middle of white square) in the case of a user-defined distance:
2491 User-defined $3 \times 3$ mask (a=1, b=1.5)
2493 \begin{tabular}{| c | c | c | c | c | c | c |}
2495 4.5 & 4 & 3.5 & 3 & 3.5 & 4 & 4.5\\ \hline
2496 4 & 3 & 2.5 & 2 & 2.5 & 3 & 4\\ \hline
2497 3.5 & 2.5 & 1.5 & 1 & 1.5 & 2.5 & 3.5\\ \hline
2498 3 & 2 & 1 & & 1 & 2 & 3\\ \hline
2499 3.5 & 2.5 & 1.5 & 1 & 1.5 & 2.5 & 3.5\\ \hline
2500 4 & 3 & 2.5 & 2 & 2.5 & 3 & 4\\ \hline
2501 4.5 & 4 & 3.5 & 3 & 3.5 & 4 & 4.5\\ \hline
2504 User-defined $5 \times 5$ mask (a=1, b=1.5, c=2)
2506 \begin{tabular}{| c | c | c | c | c | c | c |}
2508 4.5 & 3.5 & 3 & 3 & 3 & 3.5 & 4.5\\ \hline
2509 3.5 & 3 & 2 & 2 & 2 & 3 & 3.5\\ \hline
2510 3 & 2 & 1.5 & 1 & 1.5 & 2 & 3\\ \hline
2511 3 & 2 & 1 & & 1 & 2 & 3\\ \hline
2512 3 & 2 & 1.5 & 1 & 1.5 & 2 & 3\\ \hline
2513 3.5 & 3 & 2 & 2 & 2 & 3 & 3.5\\ \hline
2514 4 & 3.5 & 3 & 3 & 3 & 3.5 & 4\\ \hline
2518 Typically, for a fast, coarse distance estimation \texttt{CV\_DIST\_L2},
2519 a $3\times 3$ mask is used, and for a more accurate distance estimation
2520 \texttt{CV\_DIST\_L2}, a $5\times 5$ mask is used.
2522 When the output parameter \texttt{labels} is not \texttt{NULL}, for
2523 every non-zero pixel the function also finds the nearest connected
2524 component consisting of zero pixels. The connected components
2525 themselves are found as contours in the beginning of the function.
2527 In this mode the processing time is still O(N), where N is the number of
2528 pixels. Thus, the function provides a very fast way to compute approximate
2529 Voronoi diagram for the binary image.
2533 Inpaints the selected region in the image.
2537 void cvInpaint( \par const CvArr* src, \par const CvArr* mask, \par CvArr* dst,
2538 \par double inpaintRadius, \par int flags);
2540 }{CPP}{Inpaint(src,mask,dst,inpaintRadius,flags) -> None}
2543 \cvarg{src}{The input 8-bit 1-channel or 3-channel image.}
2544 \cvarg{mask}{The inpainting mask, 8-bit 1-channel image. Non-zero pixels indicate the area that needs to be inpainted.}
2545 \cvarg{dst}{The output image of the same format and the same size as input.}
2546 \cvarg{inpaintRadius}{The radius of circlular neighborhood of each point inpainted that is considered by the algorithm.}
2547 \cvarg{flags}{The inpainting method, one of the following:
2549 \cvarg{CV\_INPAINT\_NS}{Navier-Stokes based method.}
2550 \cvarg{CV\_INPAINT\_TELEA}{The method by Alexandru Telea \href{\#Telea04}{[Telea04]}}
2554 The function \texttt{cvInpaint} reconstructs the selected image area from the pixel near the area boundary. The function may be used to remove dust and scratches from a scanned photo, or to remove undesirable objects from still images or video.
2557 \subsection{Histograms}
2559 \cvstruct{CvHistogram}\label{CvHistogram}
2561 Multi-dimensional histogram.
2564 typedef struct CvHistogram
2568 float thresh[CV_MAX_DIM][2]; /* for uniform histograms */
2569 float** thresh2; /* for non-uniform histograms */
2570 CvMatND mat; /* embedded matrix header for array histograms */
2575 \cvfunc{CreateHist}\label{CreateHist}
2577 Creates a histogram.
2580 CvHistogram* cvCreateHist(\par int dims,\par int* sizes,\par int type,\par float** ranges=NULL,\par int uniform=1 );
2581 }{CPP}{CreateHist(dims, type, ranges, uniform = 1) -> hist}
2584 \cvC{\cvarg{dims}{Number of histogram dimensions}
2585 \cvarg{sizes}{Array of the histogram dimension sizes}}
2586 \cvPy{\cvarg{dims}{for an N-dimensional histogram, list of length N giving the size of each dimension}}
2587 \cvarg{type}{Histogram representation format: \texttt{CV\_HIST\_ARRAY} means that the histogram data is represented as a multi-dimensional dense array CvMatND; \texttt{CV\_HIST\_SPARSE} means that histogram data is represented as a multi-dimensional sparse array CvSparseMat}
2588 \cvarg{ranges}{Array of ranges for the histogram bins. Its meaning depends on the \texttt{uniform} parameter value. The ranges are used for when the histogram is calculated or backprojected to determine which histogram bin corresponds to which value/tuple of values from the input image(s)}
2589 \cvarg{uniform}{Uniformity flag; if not 0, the histogram has evenly
2590 spaced bins and for every $0<=i<cDims$ \texttt{ranges[i]}
2591 is an array of two numbers: lower and upper boundaries for the i-th
2592 histogram dimension.
2593 The whole range [lower,upper] is then split
2594 into \texttt{dims[i]} equal parts to determine the \texttt{i-th} input
2595 tuple value ranges for every histogram bin. And if \texttt{uniform=0},
2596 then \texttt{i-th} element of \texttt{ranges} array contains
2597 \texttt{dims[i]+1} elements:
2598 $\texttt{lower}_0, \texttt{upper}_0,
2599 \texttt{lower}_1, \texttt{upper}_1 = \texttt{lower}_2,
2601 \texttt{upper}_{dims[i]-1} $
2603 $\texttt{lower}_j$ and $\texttt{upper}_j$
2605 boundaries of \texttt{i-th} input tuple value for \texttt{j-th}
2606 bin, respectively. In either case, the input values that are beyond
2607 the specified range for a histogram bin are not counted by
2608 \cross{CalcHist} and filled with 0 by \cross{CalcBackProject}}
2611 The function \texttt{CreateHist} creates a histogram of the specified
2612 size and returns a pointer to the created histogram. If the array
2613 \texttt{ranges} is 0, the histogram bin ranges must be specified later
2614 via the function \cross{SetHistBinRanges}. Though \cross{CalcHist}
2615 and \cross{CalcBackProject} may process 8-bit images without setting
2616 bin ranges, they assume thy are equally spaced in 0 to 255 bins.
2619 \cvfunc{SetHistBinRanges}\label{SetHistBinRanges}
2621 Sets the bounds of the histogram bins.
2624 void cvSetHistBinRanges( \par CvHistogram* hist,\par float** ranges,\par int uniform=1 );
2628 \cvarg{hist}{Histogram}
2629 \cvarg{ranges}{Array of bin ranges arrays, see \cross{CreateHist}}
2630 \cvarg{uniform}{Uniformity flag, see \cross{CreateHist}}
2633 The function \texttt{cvSetHistBinRanges} is a stand-alone function for setting bin ranges in the histogram. For a more detailed description of the parameters \texttt{ranges} and \texttt{uniform} see the \cross{CalcHist} function, that can initialize the ranges as well. Ranges for the histogram bins must be set before the histogram is calculated or the backproject of the histogram is calculated.
2637 \cvfunc{ReleaseHist}\label{ReleaseHist}
2639 Releases the histogram.
2642 void cvReleaseHist( CvHistogram** hist );
2646 \cvarg{hist}{Double pointer to the released histogram}
2649 The function \texttt{cvReleaseHist} releases the histogram (header and the data). The pointer to the histogram is cleared by the function. If \texttt{*hist} pointer is already \texttt{NULL}, the function does nothing.
2652 \cvfunc{ClearHist}\label{ClearHist}
2654 Clears the histogram.
2657 void cvClearHist( CvHistogram* hist );
2658 }{CPP}{ClearHist(hist)-> None}
2661 \cvarg{hist}{Histogram}
2664 The function \texttt{ClearHist} sets all of the histogram bins to 0 in the case of a dense histogram and removes all histogram bins in the case of a sparse array.
2667 \cvfunc{MakeHistHeaderForArray}\label{MakeHistHeaderForArray}
2669 Makes a histogram out of an array.
2672 CvHistogram* cvMakeHistHeaderForArray( \par int dims,\par int* sizes,\par CvHistogram* hist,\par float* data,\par float** ranges=NULL,\par int uniform=1 );
2676 \cvarg{dims}{Number of histogram dimensions}
2677 \cvarg{sizes}{Array of the histogram dimension sizes}
2678 \cvarg{hist}{The histogram header initialized by the function}
2679 \cvarg{data}{Array that will be used to store histogram bins}
2680 \cvarg{ranges}{Histogram bin ranges, see \cross{CreateHist}}
2681 \cvarg{uniform}{Uniformity flag, see \cross{CreateHist}}
2684 The function \texttt{cvMakeHistHeaderForArray} initializes the histogram, whose header and bins are allocated by th user. \cross{ReleaseHist} does not need to be called afterwards. Only dense histograms can be initialized this way. The function returns \texttt{hist}.
2688 \cvfunc{QueryHistValue\_1D} \cvexp{float cvQueryHistValue\_1D( CvHistogram *hist, int idx0)}{}{}
2689 \cvfunc{QueryHistValue\_2D} \cvexp{float cvQueryHistValue\_2D( CvHistogram *hist, int idx0, int idx1)}{}{}
2690 \cvfunc{QueryHistValue\_3D} \cvexp{float cvQueryHistValue\_3D( CvHistogram *hist, int idx0, int idx1, int idx2)}{}{}
2691 \cvfunc{QueryHistValue\_nD} Queries the value of the histogram bin. \cvexp{float cvQueryHistValue\_nD( CvHistogram *hist, int idx[])}{}{}
2693 \cvfunc{QueryHistValue\_nD}\label{QueryHistValue_nD}
2695 Queries the value of the histogram bin.
2698 \#define cvQueryHistValue\_1D( hist, idx0 ) \
2699 cvGetReal1D( (hist)->bins, (idx0) )
2700 \#define cvQueryHistValue\_2D( hist, idx0, idx1 ) \
2701 cvGetReal2D( (hist)->bins, (idx0), (idx1) )
2702 \#define cvQueryHistValue\_3D( hist, idx0, idx1, idx2 ) \
2703 cvGetReal3D( (hist)->bins, (idx0), (idx1), (idx2) )
2704 \#define cvQueryHistValue\_nD( hist, idx ) \
2705 cvGetRealND( (hist)->bins, (idx) )
2710 \cvarg{hist}{Histogram}
2711 \cvarg{idx0, idx1, idx2, idx3}{Indices of the bin}
2712 \cvarg{idx}{Array of indices}
2715 The macros \texttt{QueryHistValue\_nD} return the value of the specified bin of the 1D, 2D, 3D or N-D histogram. In the case of a sparse histogram the function returns 0, if the bin is not present in the histogram no new bin is created.
2718 \cvfunc{GetHistValue\_1D} \cvexp{float* cvGetHistValue\_1D( CvHistogram *hist, int idx0);}{}{}
2719 \cvfunc{GetHistValue\_2D} \cvexp{float* cvGetHistValue\_2D( CvHistogram *hist, int idx0, int idx1);}{}{}
2720 \cvfunc{GetHistValue\_3D} \cvexp{float* cvGetHistValue\_3D( CvHistogram *hist, int idx0, int idx1, int idx2);}{}{}
2721 \cvfunc{GetHistValue\_nD} Returns a pointer to the histogram bin. \cvexp{float* cvGetHistValue\_nD( CvHistogram *hist, int idx[]);}{}{}
2723 \cvfunc{GetHistValue\_nD}\label{GetHistValue_nD}
2725 Returns a pointer to the histogram bin.
2728 \#define cvGetHistValue\_1D( hist, idx0 ) \
2729 ((float*)(cvPtr1D( (hist)->bins, (idx0), 0 ))
2730 \#define cvGetHistValue\_2D( hist, idx0, idx1 ) \
2731 ((float*)(cvPtr2D( (hist)->bins, (idx0), (idx1), 0 )))
2732 \#define cvGetHistValue\_3D( hist, idx0, idx1, idx2 ) \
2733 ((float*)(cvPtr3D( (hist)->bins, (idx0), (idx1), (idx2), 0 )))
2734 \#define cvGetHistValue\_nD( hist, idx ) \
2735 ((float*)(cvPtrND( (hist)->bins, (idx), 0 )))
2740 \cvarg{hist}{Histogram}
2741 \cvarg{idx0, idx1, idx2, idx3}{Indices of the bin}
2742 \cvarg{idx}{Array of indices}
2745 The macros \texttt{GetHistValue} return a pointer to the specified bin of the 1D, 2D, 3D or N-D histogram. In the case of a sparse histogram the function creates a new bin and sets it to 0, unless it exists already.
2748 \cvfunc{GetMinMaxHistValue}\label{GetMinMaxHistValue}
2750 Finds the minimum and maximum histogram bins.
2753 void cvGetMinMaxHistValue( \par const CvHistogram* hist,\par float* min\_value,\par float* max\_value,\par int* min\_idx=NULL,\par int* max\_idx=NULL );
2755 }{CPP}{GetMinMaxHistValue(hist)-> min\_val,max\_val,min\_loc,max\_loc}
2758 \cvarg{hist}{Histogram}
2759 \cvarg{min\_value}{Pointer to the minimum value of the histogram}
2760 \cvarg{max\_value}{Pointer to the maximum value of the histogram}
2761 \cvarg{min\_idx}{Pointer to the array of coordinates for the minimum}
2762 \cvarg{max\_idx}{Pointer to the array of coordinates for the maximum}
2765 The function \texttt{cvGetMinMaxHistValue} finds the minimum and
2766 maximum histogram bins and their positions. All of output arguments are
2767 optional. Among several extremas with the same value the ones with the
2768 minimum index (in lexicographical order) are returned. In the case of several maximums
2769 or minimums, the earliest in lexicographical order (extrema locations)
2772 \cvfunc{NormalizeHist}\label{NormalizeHist}
2774 Normalizes the histogram.
2777 void cvNormalizeHist( CvHistogram* hist, double factor );
2778 }{CPP}{NormalizeHist(hist,factor)-> None}
2781 \cvarg{hist}{Pointer to the histogram}
2782 \cvarg{factor}{Normalization factor}
2785 The function \texttt{cvNormalizeHist} normalizes the histogram bins by scaling them, such that the sum of the bins becomes equal to \texttt{factor}.
2787 \cvfunc{ThreshHist}\label{ThreshHist}
2789 Thresholds the histogram.
2792 void cvThreshHist( CvHistogram* hist, double threshold );
2793 }{CPP}{ThreshHist(hist,threshold)-> None}
2796 \cvarg{hist}{Pointer to the histogram}
2797 \cvarg{threshold}{Threshold level}
2800 The function \texttt{cvThreshHist} clears histogram bins that are below the specified threshold.
2802 \cvfunc{CompareHist}\label{CompareHist}
2804 Compares two dense histograms.
2807 double cvCompareHist( \par const CvHistogram* hist1,\par const CvHistogram* hist2,\par int method );
2808 }{CPP}{CompareHist(hist1,hist2,method)-> None}
2811 \cvarg{hist1}{The first dense histogram}
2812 \cvarg{hist2}{The second dense histogram}
2813 \cvarg{method}{Comparison method, one of the following:
2815 \cvarg{CV\_COMP\_CORREL}{Correlation}
2816 \cvarg{CV\_COMP\_CHISQR}{Chi-Square}
2817 \cvarg{CV\_COMP\_INTERSECT}{Intersection}
2818 \cvarg{CV\_COMP\_BHATTACHARYYA}{Bhattacharyya distance}
2822 The function \texttt{CompareHist} compares two dense histograms using the specified method ($H_1$ denotes the first histogram, $H_2$ the second):
2825 \item[Correlation (method=CV\_COMP\_CORREL)]
2828 {\sum_I (H'_1(I) \cdot H'_2(I))}
2829 {\sqrt{\sum_I(H'_1(I)^2) \cdot \sum_I(H'_2(I)^2)}}
2833 H'_k(I) = \frac{H_k(I) - 1}{N \cdot \sum_J H_k(J)}
2835 where N is the number of histogram bins.
2837 \item[Chi-Square (method=CV\_COMP\_CHISQR)]
2838 \[ d(H_1,H_2) = \sum_I \frac{(H_1(I)-H_2(I))^2}{H_1(I)+H_2(I)} \]
2840 \item[Intersection (method=CV\_COMP\_INTERSECT)]
2841 \[ d(H_1,H_2) = \sum_I \min (H_1(I), H_2(I)) \]
2843 \item[Bhattacharyya distance (method=CV\_COMP\_BHATTACHARYYA)]
2844 \[ d(H_1,H_2) = \sqrt{1 - \sum_I \frac{\sqrt{H_1(I) \cdot H_2(I)}}{ \sqrt { \sum_I H_1(I) \cdot \sum_I H_2(I) }}} \]
2848 The function returns $d(H_1, H_2)$.
2850 Note: the method \texttt{CV\_COMP\_BHATTACHARYYA} only works with normalized histograms.
2852 To compare a sparse histogram or more general sparse configurations of weighted points, consider using the \cross{CalcEMD2} function.
2855 \cvfunc{CopyHist}\label{CopyHist}
2860 void cvCopyHist( const CvHistogram* src, CvHistogram** dst );
2864 \cvarg{src}{Source histogram}
2865 \cvarg{dst}{Pointer to destination histogram}
2868 The function \texttt{CopyHist} makes a copy of the histogram. If the
2869 second histogram pointer \texttt{*dst} is NULL, a new histogram of the
2870 same size as \texttt{src} is created. Otherwise, both histograms must
2871 have equal types and sizes. Then the function copies the source histogram's
2872 bin values to the destination histogram and sets the same bin value ranges
2877 \cvfunc{CalcHist}\label{CalcHist}
2879 Calculates the histogram of image(s).
2882 void cvCalcHist( \par IplImage** image,\par CvHistogram* hist,\par int accumulate=0,\par const CvArr* mask=NULL );
2883 }{CPP}{CalcHist(image,hist,ccumulate=0,mask=NULL)-> None}
2886 \cvarg{image}{Source images (though you may pass CvMat** as well)}
2887 \cvarg{hist}{Pointer to the histogram}
2888 \cvarg{accumulate}{Accumulation flag. If it is set, the histogram is not cleared in the beginning. This feature allows user to compute a single histogram from several images, or to update the histogram online}
2889 \cvarg{mask}{The operation mask, determines what pixels of the source images are counted}
2892 The function \texttt{CalcHist} calculates the histogram of one or more
2893 single-channel images. The elements of a tuple that is used to increment
2894 a histogram bin are taken at the same location from the corresponding
2897 % ===== Sample. Calculating and displaying 2D Hue-Saturation histogram of a color image =====
2900 #include <highgui.h>
2902 int main( int argc, char** argv )
2905 if( argc == 2 && (src=cvLoadImage(argv[1], 1))!= 0)
2907 IplImage* h_plane = cvCreateImage( cvGetSize(src), 8, 1 );
2908 IplImage* s_plane = cvCreateImage( cvGetSize(src), 8, 1 );
2909 IplImage* v_plane = cvCreateImage( cvGetSize(src), 8, 1 );
2910 IplImage* planes[] = { h_plane, s_plane };
2911 IplImage* hsv = cvCreateImage( cvGetSize(src), 8, 3 );
2912 int h_bins = 30, s_bins = 32;
2913 int hist_size[] = {h_bins, s_bins};
2914 /* hue varies from 0 (~0 deg red) to 180 (~360 deg red again) */
2915 float h_ranges[] = { 0, 180 };
2916 /* saturation varies from 0 (black-gray-white) to
2917 255 (pure spectrum color) */
2918 float s_ranges[] = { 0, 255 };
2919 float* ranges[] = { h_ranges, s_ranges };
2921 IplImage* hist_img =
2922 cvCreateImage( cvSize(h_bins*scale,s_bins*scale), 8, 3 );
2924 float max_value = 0;
2927 cvCvtColor( src, hsv, CV_BGR2HSV );
2928 cvCvtPixToPlane( hsv, h_plane, s_plane, v_plane, 0 );
2929 hist = cvCreateHist( 2, hist_size, CV_HIST_ARRAY, ranges, 1 );
2930 cvCalcHist( planes, hist, 0, 0 );
2931 cvGetMinMaxHistValue( hist, 0, &max_value, 0, 0 );
2934 for( h = 0; h < h_bins; h++ )
2936 for( s = 0; s < s_bins; s++ )
2938 float bin_val = cvQueryHistValue_2D( hist, h, s );
2939 int intensity = cvRound(bin_val*255/max_value);
2940 cvRectangle( hist_img, cvPoint( h*scale, s*scale ),
2941 cvPoint( (h+1)*scale - 1, (s+1)*scale - 1),
2942 CV_RGB(intensity,intensity,intensity),
2947 cvNamedWindow( "Source", 1 );
2948 cvShowImage( "Source", src );
2950 cvNamedWindow( "H-S Histogram", 1 );
2951 cvShowImage( "H-S Histogram", hist_img );
2958 \cvfunc{CalcBackProject}\label{CalcBackProject}
2960 Calculates the back projection.
2963 void cvCalcBackProject( \par IplImage** image,\par CvArr* back\_project,\par const CvHistogram* hist );
2964 }{CPP}{CalcBackProject(image,back\_project,hist)-> None}
2967 \cvarg{image}{Source images (though you may pass CvMat** as well)}
2968 \cvarg{back\_project}{Destination back projection image of the same type as the source images}
2969 \cvarg{hist}{Histogram}
2972 The function \texttt{cvCalcBackProject} calculates the back project of the histogram. For each tuple of pixels at the same position of all input single-channel images the function puts the value of the histogram bin, corresponding to the tuple in the destination image. In terms of statistics, the value of each output image pixel is the probability of the observed tuple given the distribution (histogram). For example, to find a red object in the picture, one may do the following:
2975 \item Calculate a hue histogram for the red object assuming the image contains only this object. The histogram is likely to have a strong maximum, corresponding to red color.
2976 \item Calculate back projection of a hue plane of input image where the object is searched, using the histogram. Threshold the image.
2977 \item Find connected components in the resulting picture and choose the right component using some additional criteria, for example, the largest connected component.
2980 That is the approximate algorithm of Camshift color object tracker, except for the 3rd step, instead of which CAMSHIFT algorithm is used to locate the object on the back projection given the previous object position.
2982 \cvfunc{CalcBackProjectPatch}\label{CalcBackProjectPatch}
2984 Locates a template within an image by using a histogram comparison.
2987 void cvCalcBackProjectPatch( \par IplImage** image,\par CvArr* dst,\par CvSize patch\_size,\par CvHistogram* hist,\par int method,\par float factor );
2988 }{CPP}{CalcBackProjectPatch(images,dst,patch\_size,hist,method,factor)-> None}
2991 \cvarg{image}{Source images (though, you may pass CvMat** as well)}
2992 \cvarg{dst}{Destination image}
2993 \cvarg{patch\_size}{Size of the patch slid though the source image}
2994 \cvarg{hist}{Histogram}
2995 \cvarg{method}{Compasion method, passed to \cross{CompareHist} (see description of that function)}
2996 \cvarg{factor}{Normalization factor for histograms, will affect the normalization scale of the destination image, pass 1 if unsure}
2999 The function \texttt{cvCalcBackProjectPatch} calculates the back projection by comparing histograms of the source image patches with the given histogram. Taking measurement results from some image at each location over ROI creates an array \texttt{image}. These results might be one or more of hue, \texttt{x} derivative, \texttt{y} derivative, Laplacian filter, oriented Gabor filter, etc. Each measurement output is collected into its own separate image. The \texttt{image} image array is a collection of these measurement images. A multi-dimensional histogram \texttt{hist} is constructed by sampling from the \texttt{image} image array. The final histogram is normalized. The \texttt{hist} histogram has as many dimensions as the number of elements in \texttt{image} array.
3001 Each new image is measured and then converted into an \texttt{image} image array over a chosen ROI. Histograms are taken from this \texttt{image} image in an area covered by a "patch" with an anchor at center as shown in the picture below. The histogram is normalized using the parameter \texttt{norm\_factor} so that it may be compared with \texttt{hist}. The calculated histogram is compared to the model histogram; \texttt{hist} uses The function \texttt{cvCompareHist} with the comparison method=\texttt{method}). The resulting output is placed at the location corresponding to the patch anchor in the probability image \texttt{dst}. This process is repeated as the patch is slid over the ROI. Iterative histogram update by subtracting trailing pixels covered by the patch and adding newly covered pixels to the histogram can save a lot of operations, though it is not implemented yet.
3003 \cvfunc{Back Project Calculation by Patches}
3005 \includegraphics[width=0.5\textwidth]{pics/backprojectpatch.png}
3007 \cvfunc{CalcProbDensity}\label{CalcProbDensity}
3009 Divides one histogram by another.
3012 void cvCalcProbDensity( \par const CvHistogram* hist1,\par const CvHistogram* hist2,\par CvHistogram* dst\_hist,\par double scale=255 );
3013 }{CPP}{CalcProbDensity(hist1,hist2,dst\_hst,scale=255)-> None}
3016 \cvarg{hist1}{first histogram (the divisor)}
3017 \cvarg{hist2}{second histogram}
3018 \cvarg{dst\_hist}{destination histogram}
3019 \cvarg{scale}{scale factor for the destination histogram}
3022 The function \texttt{CalcProbDensity} calculates the object probability density from the two histograms as:
3025 \texttt{dist\_hist}(I)=
3027 {0}{if $\texttt{hist1}(I)=0$}
3028 {\texttt{scale}}{if $\texttt{hist1}(I) \ne 0$ and $\texttt{hist2}(I) > \texttt{hist1}(I)$}
3029 {\frac{\texttt{hist2}(I) \cdot \texttt{scale}}{\texttt{hist1}(I)}}{if $\texttt{hist1}(I) \ne 0$ and $\texttt{hist2}(I) \le \texttt{hist1}(I)$}
3032 So the destination histogram bins are within less than \texttt{scale}.
3034 \cvfunc{EqualizeHist}\label{EqualizeHist}
3036 Equalizes the histogram of a grayscale image.
3039 void cvEqualizeHist( const CvArr* src, CvArr* dst );
3040 }{CPP}{EqualizeHist(src,dst)-> None}
3043 \cvarg{src}{The 8-bit, single-channel, source image}
3044 \cvarg{dst}{The output image of the same size and the same data type as \texttt{src}}
3047 The function \texttt{cvEqualizeHist} equalizes the histogram of the input image using the following algorithm:
3050 \item calculate the histogram $H$ for src.
3051 \item normalize the histogram so that the sum of histogram bins is 255.
3052 \item compute the integral of the histogram:
3054 H'_i = \sum_{0 \le j \le i} H(j)
3056 \item transform the image using $H'$ as a look-up table: $\texttt{dst}(x,y) = H'(\texttt{src}(x,y))$
3059 The algorithm normalizes the brightness and increases the contrast of the image.
3061 \subsection{Matching}
3063 \cvfunc{MatchTemplate}\label{MatchTemplate}
3065 Compares a template against overlapped image regions.
3068 void cvMatchTemplate( \par const CvArr* image,\par const CvArr* templ,\par CvArr* result,\par int method );
3069 }{CPP}{MatchTemplate(image,templ,result,method)-> None}
3072 \cvarg{image}{Image where the search is running; should be 8-bit or 32-bit floating-point}
3073 \cvarg{templ}{Searched template; must be not greater than the source image and the same data type as the image}
3074 \cvarg{result}{A map of comparison results; single-channel 32-bit floating-point.
3075 If \texttt{image} is $W \times H$ and
3076 \texttt{templ} is $w \times h$ then \texttt{result} must be $(W-w+1) \times (H-h+1)$}
3077 \cvarg{method}{Specifies the way the template must be compared with the image regions (see below)}
3080 The function \texttt{cvMatchTemplate} is similar to
3081 \cross{CalcBackProjectPatch}. It slides through \texttt{image}, compares the
3082 overlapped patches of size $w \times h$ against \texttt{templ}
3083 using the specified method and stores the comparison results to
3084 \texttt{result}. Here are the formulas for the different comparison
3085 methods one may use ($I$ denotes \texttt{image}, $T$ \texttt{template},
3086 $R$ \texttt{result}). The summation is done over template and/or the
3087 image patch: $x' = 0...w-1, y' = 0...h-1$
3089 % \texttt{x'=0..w-1, y'=0..h-1}):
3092 \item[method=CV\_TM\_SQDIFF]
3093 \[ R(x,y)=\sum_{x',y'} (T(x',y')-I(x+x',y+y'))^2 \]
3095 \item[method=CV\_TM\_SQDIFF\_NORMED]
3097 {\sum_{x',y'} (T(x',y')-I(x+x',y+y'))^2}
3098 {\sqrt{\sum_{x',y'}T(x',y')^2 \cdot \sum_{x',y'} I(x+x',y+y')^2}}
3101 \item[method=CV\_TM\_CCORR]
3102 \[ R(x,y)=\sum_{x',y'} (T(x',y') \cdot I(x+x',y+y')) \]
3104 \item[method=CV\_TM\_CCORR\_NORMED]
3106 {\sum_{x',y'} (T(x',y') \cdot I'(x+x',y+y'))}
3107 {\sqrt{\sum_{x',y'}T(x',y')^2 \cdot \sum_{x',y'} I(x+x',y+y')^2}}
3110 \item[method=CV\_TM\_CCOEFF]
3111 \[ R(x,y)=\sum_{x',y'} (T'(x',y') \cdot I(x+x',y+y')) \]
3116 T'(x',y')=T(x',y') - 1/(w \cdot h) \cdot \sum_{x'',y''} T(x'',y'')\\
3117 I'(x+x',y+y')=I(x+x',y+y') - 1/(w \cdot h) \cdot \sum_{x'',y''} I(x+x'',y+y'')
3121 \item[method=CV\_TM\_CCOEFF\_NORMED]
3123 { \sum_{x',y'} (T'(x',y') \cdot I'(x+x',y+y')) }
3124 { \sqrt{\sum_{x',y'}T'(x',y')^2 \cdot \sum_{x',y'} I'(x+x',y+y')^2} }
3128 After the function finishes the comparison, the best matches can be found as global minimums (\texttt{CV\_TM\_SQDIFF}) or maximums (\texttt{CV\_TM\_CCORR} and \texttt{CV\_TM\_CCOEFF}) using the \cross{MinMaxLoc} function. In the case of a color image, template summation in the numerator and each sum in the denominator is done over all of the channels (and separate mean values are used for each channel).
3130 \cvfunc{MatchShapes}\label{MatchShapes}
3132 Compares two shapes.
3135 double cvMatchShapes( \par const void* object1,\par const void* object2,\par int method,\par double parameter=0 );
3136 }{CPP}{MatchShapes(object1,object2,method,parameter=0)-> None}
3139 \cvarg{object1}{First contour or grayscale image}
3140 \cvarg{object2}{Second contour or grayscale image}
3141 \cvarg{method}{Comparison method;
3142 \texttt{CV\_CONTOUR\_MATCH\_I1},
3143 \texttt{CV\_CONTOURS\_MATCH\_I2}
3145 \texttt{CV\_CONTOURS\_MATCH\_I3}}
3146 \cvarg{parameter}{Method-specific parameter (is not used now)}
3149 The function \texttt{cvMatchShapes} compares two shapes. The 3 implemented methods all use Hu moments (see \cross{GetHuMoments}) ($A$ is \texttt{object1}, $B$ is \texttt{object2}):
3152 \item[method=CV\_CONTOUR\_MATCH\_I1]
3153 \[ I_1(A,B) = \sum_{i=1...7} \left| \frac{1}{m^A_i} - \frac{1}{m^B_i} \right| \]
3155 \item[method=CV\_CONTOUR\_MATCH\_I2]
3156 \[ I_2(A,B) = \sum_{i=1...7} \left| m^A_i - m^B_i \right| \]
3158 \item[method=CV\_CONTOUR\_MATCH\_I3]
3159 \[ I_3(A,B) = \sum_{i=1...7} \frac{ \left| m^A_i - m^B_i \right| }{ \left| m^A_i \right| } \]
3166 m^A_i = sign(h^A_i) \cdot \log{h^A_i}
3167 m^B_i = sign(h^B_i) \cdot \log{h^B_i}
3171 and $h^A_i, h^B_i$ are the Hu moments of $A$ and $B$ respectively.
3173 \cvfunc{CalcEMD2}\label{CalcEMD2}
3175 Computes the "minimal work" distance between two weighted point configurations.
3178 float cvCalcEMD2( \par const CvArr* signature1,\par const CvArr* signature2,\par int distance\_type,\par CvDistanceFunction distance\_func=NULL,\par const CvArr* cost\_matrix=NULL,\par CvArr* flow=NULL,\par float* lower\_bound=NULL,\par void* userdata=NULL );
3179 }{CPP}{CalcEMD2(signature1, signature2, distance\_type, distance\_func = None, cost\_matrix=None, flow=None, lower\_bound=None, userdata = None) -> float}
3182 typedef float (*CvDistanceFunction)(const float* f1, const float* f2, void* userdata);
3186 \cvarg{signature1}{First signature, a $\texttt{size1}\times \texttt{dims}+1$ floating-point matrix. Each row stores the point weight followed by the point coordinates. The matrix is allowed to have a single column (weights only) if the user-defined cost matrix is used}
3187 \cvarg{signature2}{Second signature of the same format as \texttt{signature1}, though the number of rows may be different. The total weights may be different, in this case an extra "dummy" point is added to either \texttt{signature1} or \texttt{signature2}}
3188 \cvarg{distance\_type}{Metrics used; \texttt{CV\_DIST\_L1, CV\_DIST\_L2}, and \texttt{CV\_DIST\_C} stand for one of the standard metrics; \texttt{CV\_DIST\_USER} means that a user-defined function \texttt{distance\_func} or pre-calculated \texttt{cost\_matrix} is used}
3189 \cvarg{distance\_func}{The user-defined distance function. It takes coordinates of two points and returns the distance between the points}
3190 \cvarg{cost\_matrix}{The user-defined $\texttt{size1}\times \texttt{size2}$ cost matrix. At least one of \texttt{cost\_matrix} and \texttt{distance\_func} must be NULL. Also, if a cost matrix is used, lower boundary (see below) can not be calculated, because it needs a metric function}
3191 \cvarg{flow}{The resultant $\texttt{size1} \times \texttt{size2}$ flow matrix: $\texttt{flow}_{i,j}$ is a flow from $i$ th point of \texttt{signature1} to $j$ th point of \texttt{signature2}}
3192 \cvarg{lower\_bound}{Optional input/output parameter: lower boundary of distance between the two signatures that is a distance between mass centers. The lower boundary may not be calculated if the user-defined cost matrix is used, the total weights of point configurations are not equal, or if the signatures consist of weights only (i.e. the signature matrices have a single column). The user \textbf{must} initialize \texttt{*lower\_bound}. If the calculated distance between mass centers is greater or equal to \texttt{*lower\_bound} (it means that the signatures are far enough) the function does not calculate EMD. In any case \texttt{*lower\_bound} is set to the calculated distance between mass centers on return. Thus, if user wants to calculate both distance between mass centers and EMD, \texttt{*lower\_bound} should be set to 0}
3193 \cvarg{userdata}{Pointer to optional data that is passed into the user-defined distance function}
3196 The function \texttt{cvCalcEMD2} computes the earth mover distance and/or
3197 a lower boundary of the distance between the two weighted point
3198 configurations. One of the applications described in \cross{RubnerSept98} is
3199 multi-dimensional histogram comparison for image retrieval. EMD is a a
3200 transportation problem that is solved using some modification of a simplex
3201 algorithm, thus the complexity is exponential in the worst case, though, on average
3202 it is much faster. In the case of a real metric the lower boundary
3203 can be calculated even faster (using linear-time algorithm) and it can
3204 be used to determine roughly whether the two signatures are far enough
3205 so that they cannot relate to the same object.
3207 \section{Structural Analysis}
3209 \subsection{Contour Processing Functions}
3211 \cvfunc{ApproxChains}\label{ApproxChains}
3213 Approximates Freeman chain(s) with a polygonal curve.
3216 CvSeq* cvApproxChains( \par CvSeq* src\_seq,\par CvMemStorage* storage,\par int method=CV\_CHAIN\_APPROX\_SIMPLE,\par double parameter=0,\par int minimal\_perimeter=0,\par int recursive=0 );
3217 }{CPP}{ApproxChains(src\_seq,storage,method=CV\_CHAIN\_APPROX\_SIMPLE,parameter=0,minimal\_perimiter=0,recursive=0)-> chains}
3220 \cvarg{src\_seq}{Pointer to the chain that can refer to other chains}
3221 \cvarg{storage}{Storage location for the resulting polylines}
3222 \cvarg{method}{Approximation method (see the description of the function \cross{FindContours})}
3223 \cvarg{parameter}{Method parameter (not used now)}
3224 \cvarg{minimal\_perimeter}{Approximates only those contours whose perimeters are not less than \texttt{minimal\_perimeter}. Other chains are removed from the resulting structure}
3225 \cvarg{recursive}{If not 0, the function approximates all chains that access can be obtained to from \texttt{src\_seq} by using the \texttt{h\_next} or \texttt{v\_next links}. If 0, the single chain is approximated}
3228 This is a stand-alone approximation routine. The function \texttt{cvApproxChains} works exactly in the same way as \cross{FindContours} with the corresponding approximation flag. The function returns pointer to the first resultant contour. Other approximated contours, if any, can be accessed via the \texttt{v\_next} or \texttt{h\_next} fields of the returned structure.
3231 \cvfunc{StartReadChainPoints}\label{StartReadChainPoints}
3233 Initializes the chain reader.
3236 void cvStartReadChainPoints( CvChain* chain, CvChainPtReader* reader );
3239 The function \texttt{cvStartReadChainPoints} initializes a special reader.
3241 \cvfunc{ReadChainPoint}\label{ReadChainPoint}
3243 Gets the next chain point.
3246 CvPoint cvReadChainPoint( CvChainPtReader* reader );
3250 \cvarg{reader}{Chain reader state}
3253 The function \texttt{cvReadChainPoint} returns the current chain point and updates the reader position.
3257 \cvfunc{ApproxPoly}\label{ApproxPoly}
3259 Approximates polygonal curve(s) with the specified precision.
3262 CvSeq* cvApproxPoly( \par const void* src\_seq,\par int header\_size,\par CvMemStorage* storage,\par int method,\par double parameter,\par int parameter2=0 );
3264 ApproxPoly(src\_seq, storage, method, parameter=0, parameter2=0)
3268 \cvarg{src\_seq}{Sequence of an array of points}
3269 \cvarg{header\_size}{Header size of the approximated curve[s]}
3270 \cvarg{storage}{Container for the approximated contours. If it is NULL, the input sequences' storage is used}
3271 \cvarg{method}{Approximation method; only \texttt{CV\_POLY\_APPROX\_DP} is supported, that corresponds to the Douglas-Peucker algorithm}
3272 \cvarg{parameter}{Method-specific parameter; in the case of \texttt{CV\_POLY\_APPROX\_DP} it is a desired approximation accuracy}
3273 \cvarg{parameter2}{If case if \texttt{src\_seq} is a sequence, the parameter determines whether the single sequence should be approximated or all sequences on the same level or below \texttt{src\_seq} (see \cross{FindContours} for description of hierarchical contour structures). If \texttt{src\_seq} is an array CvMat* of points, the parameter specifies whether the curve is closed (\texttt{parameter2}!=0) or not (\texttt{parameter2} =0)}
3276 The function \texttt{cvApproxPoly} approximates one or more curves and
3277 returns the approximation result[s]. In the case of multiple curves,
3278 the resultant tree will have the same structure as the input one (1:1
3281 \cvfunc{BoundingRect}\label{BoundingRect}
3283 Calculates the up-right bounding rectangle of a point set.
3286 CvRect cvBoundingRect( CvArr* points, int update=0 );
3287 }{CPP}{BoundingRect(points,update=0)-> CvRect}
3290 \cvarg{points}{2D point set, either a sequence or vector (\texttt{CvMat}) of points}
3291 \cvarg{update}{The update flag. See below.}
3294 The function \texttt{cvBoundingRect} returns the up-right bounding rectangle for a 2d point set.
3295 Here is the list of possible combination of the flag values and type of \texttt{points}:
3297 \begin{tabular}{|c|c|p{3in}|}
3299 update & points & action \\ \hline
3300 0 & \texttt{CvContour\*} & the bounding rectangle is not calculated, but it is taken from \texttt{rect} field of the contour header.\\ \hline
3301 1 & \texttt{CvContour\*} & the bounding rectangle is calculated and written to \texttt{rect} field of the contour header.\\ \hline
3302 0 & \texttt{CvSeq\*} or \texttt{CvMat\*} & the bounding rectangle is calculated and returned.\\ \hline
3303 1 & \texttt{CvSeq\*} or \texttt{CvMat\*} & runtime error is raised.\\ \hline
3306 \cvfunc{ContourArea}\label{ContourArea}
3308 Calculates the area of a whole contour or a contour section.
3311 double cvContourArea( \par const CvArr* contour, \par CvSlice slice=CV\_WHOLE\_SEQ );
3312 }{CPP}{ContourAres(contour,slice=CV\_WHOLE\_SEQ)-> double}
3315 \cvarg{contour}{Contour (sequence or array of vertices)}
3316 \cvarg{slice}{Starting and ending points of the contour section of interest, by default, the area of the whole contour is calculated}
3319 The function \texttt{cvContourArea} calculates the area of a whole contour
3320 or a contour section. In the latter case the total area bounded by the
3321 contour arc and the chord connecting the 2 selected points is calculated
3322 as shown on the picture below:
3324 \includegraphics[width=0.5\textwidth]{pics/contoursecarea.png}
3326 Orientation of the contour affects the area sign, thus the function may return a \emph{negative} result. Use the \texttt{fabs()} function from C runtime to get the absolute value of the area.
3328 \cvfunc{ArcLength}\label{ArcLength}
3330 Calculates the contour perimeter or the curve length.
3333 double cvArcLength( \par const void* curve,\par CvSlice slice=CV\_WHOLE\_SEQ,\par int is\_closed=-1 );
3334 }{CPP}{ArcLength(curve,slice=CV\_WHOLE\_SEQ,is\_closed=-1)-> double}
3337 \cvarg{curve}{Sequence or array of the curve points}
3338 \cvarg{slice}{Starting and ending points of the curve, by default, the whole curve length is calculated}
3339 \cvarg{is\_closed}{Indicates whether the curve is closed or not. There are 3 cases:
3341 \item $\texttt{is\_closed} =0$ the curve is assumed to be unclosed.
3342 \item $\texttt{is\_closed}>0$ the curve is assumed to be closed.
3343 \item $\texttt{is\_closed}<0$ if curve is sequence, the flag \texttt{CV\_SEQ\_FLAG\_CLOSED} of \texttt{((CvSeq*)curve)->flags} is checked to determine if the curve is closed or not, otherwise (curve is represented by array (CvMat*) of points) it is assumed to be unclosed.
3347 The function \texttt{cvArcLength} calculates the length or curve as the sum of lengths of segments between subsequent points
3349 \cvfunc{CreateContourTree}\label{CreateContourTree}
3351 Creates a hierarchical representation of a contour.
3354 CvContourTree* cvCreateContourTree( /par const CvSeq* contour,\par CvMemStorage* storage,\par double threshold );
3355 }{CPP}{CreateCountourTree(contour,storage,threshold)-> contour\_tree}
3358 \cvarg{contour}{Input contour}
3359 \cvarg{storage}{Container for output tree}
3360 \cvarg{threshold}{Approximation accuracy}
3363 The function \texttt{cvCreateContourTree} creates a binary tree representation for the input \texttt{contour} and returns the pointer to its root. If the parameter \texttt{threshold} is less than or equal to 0, the function creates a full binary tree representation. If the threshold is greater than 0, the function creates a representation with the precision \texttt{threshold}: if the vertices with the interceptive area of its base line are less than \texttt{threshold}, the tree should not be built any further. The function returns the created tree.
3365 \cvfunc{ContourFromContourTree}\label{ContourFromContourTree}
3367 Restores a contour from the tree.
3370 CvSeq* cvContourFromContourTree( \par const CvContourTree* tree,\par CvMemStorage* storage,\par CvTermCriteria criteria );
3371 }{CPP}{ContourFromContourTree(tree,storage,criteria)-> contour}
3374 \cvarg{tree}{Contour tree}
3375 \cvarg{storage}{Container for the reconstructed contour}
3376 \cvarg{criteria}{Criteria, where to stop reconstruction}
3379 The function \texttt{cvContourFromContourTree} restores the contour from its binary tree representation. The parameter \texttt{criteria} determines the accuracy and/or the number of tree levels used for reconstruction, so it is possible to build an approximated contour. The function returns the reconstructed contour.
3381 \cvfunc{MatchContourTrees}\label{MatchContourTrees}
3383 Compares two contours using their tree representations.
3386 double cvMatchContourTrees( \par const CvContourTree* tree1,\par const CvContourTree* tree2,\par int method,\par double threshold );
3387 }{CPP}{MatchContourTrees(tree1,tree2,method,threshold)-> double}
3390 \cvarg{tree1}{First contour tree}
3391 \cvarg{tree2}{Second contour tree}
3392 \cvarg{method}{Similarity measure, only \texttt{CV\_CONTOUR\_TREES\_MATCH\_I1} is supported}
3393 \cvarg{threshold}{Similarity threshold}
3396 The function \texttt{cvMatchContourTrees} calculates the value of the matching measure for two contour trees. The similarity measure is calculated level by level from the binary tree roots. If at a certain level the difference between contours becomes less than \texttt{threshold}, the reconstruction process is interrupted and the current difference is returned.
3398 \subsection{Computational Geometry}
3400 \cvfunc{MaxRect}\label{MaxRect}
3402 Finds the bounding rectangle for two given rectangles.
3405 CvRect cvMaxRect( \par const CvRect* rect1,\par const CvRect* rect2 );
3406 }{CPP}{MaxRect(rect1,rect2)-> CvRect}
3409 \cvarg{rect1}{First rectangle}
3410 \cvarg{rect2}{Second rectangle}
3413 The function \texttt{cvMaxRect} finds the minimum area rectangle that contains both input rectangles.
3415 \includegraphics[width=0.5\textwidth]{pics/maxrect.png}
3418 \cvstruct{CvBox2D}\label{CvBox2D}
3423 typedef struct CvBox2D
3425 CvPoint2D32f center; /* center of the box */
3426 CvSize2D32f size; /* box width and length */
3427 float angle; /* angle between the horizontal axis
3428 and the first side (i.e. length) in degrees */
3435 \cvfunc{PointSeqFromMat}\label{PointSeqFromMat}
3437 Initializes a point sequence header from a point vector.
3440 CvSeq* cvPointSeqFromMat( \par int seq\_kind,\par const CvArr* mat,\par CvContour* contour\_header,\par CvSeqBlock* block );
3444 \cvarg{seq\_kind}{Type of the point sequence: point set (0), a curve (\texttt{CV\_SEQ\_KIND\_CURVE}), closed curve (\texttt{CV\_SEQ\_KIND\_CURVE+CV\_SEQ\_FLAG\_CLOSED}) etc.}
3445 \cvarg{mat}{Input matrix. It should be a continuous, 1-dimensional vector of points, that is, it should have type \texttt{CV\_32SC2} or \texttt{CV\_32FC2}}
3446 \cvarg{contour\_header}{Contour header, initialized by the function}
3447 \cvarg{block}{Sequence block header, initialized by the function}
3450 The function \texttt{cvPointSeqFromMat} initializes a sequence
3451 header to create a "virtual" sequence in which elements reside in
3452 the specified matrix. No data is copied. The initialized sequence
3453 header may be passed to any function that takes a point sequence
3454 on input. No extra elements can be added to the sequence,
3455 but some may be removed. The function is a specialized variant of
3456 \cross{MakeSeqHeaderForArray} and uses
3457 the latter internally. It returns a pointer to the initialized contour
3458 header. Note that the bounding rectangle (field \texttt{rect} of
3459 \texttt{CvContour} strucuture) is not initialized by the function. If
3460 you need one, use \cross{BoundingRect}.
3462 Here is a simple usage example.
3467 CvMat* vector = cvCreateMat( 1, 3, CV_32SC2 );
3469 CV_MAT_ELEM( *vector, CvPoint, 0, 0 ) = cvPoint(100,100);
3470 CV_MAT_ELEM( *vector, CvPoint, 0, 1 ) = cvPoint(100,200);
3471 CV_MAT_ELEM( *vector, CvPoint, 0, 2 ) = cvPoint(200,100);
3473 IplImage* img = cvCreateImage( cvSize(300,300), 8, 3 );
3476 cvDrawContours( img,
3477 cvPointSeqFromMat(CV_SEQ_KIND_CURVE+CV_SEQ_FLAG_CLOSED,
3483 0, 3, 8, cvPoint(0,0));
3487 \cvfunc{BoxPoints}\label{BoxPoints}
3489 Finds the box vertices.
3492 void cvBoxPoints( \par CvBox2D box,\par CvPoint2D32f pt[4] );
3493 }{CPP}{BoxPoints(box)-> points}
3497 \cvarg{pt}{Array of vertices}
3500 The function \texttt{cvBoxPoints} calculates the vertices of the input 2d box. Here is the function code:
3503 void cvBoxPoints( CvBox2D box, CvPoint2D32f pt[4] )
3505 float a = (float)cos(box.angle)*0.5f;
3506 float b = (float)sin(box.angle)*0.5f;
3508 pt[0].x = box.center.x - a*box.size.height - b*box.size.width;
3509 pt[0].y = box.center.y + b*box.size.height - a*box.size.width;
3510 pt[1].x = box.center.x + a*box.size.height - b*box.size.width;
3511 pt[1].y = box.center.y - b*box.size.height - a*box.size.width;
3512 pt[2].x = 2*box.center.x - pt[0].x;
3513 pt[2].y = 2*box.center.y - pt[0].y;
3514 pt[3].x = 2*box.center.x - pt[1].x;
3515 pt[3].y = 2*box.center.y - pt[1].y;
3519 \cvfunc{FitEllipse}\label{FitEllipse}
3521 Fits an ellipse around a set of 2D points.
3524 CvBox2D cvFitEllipse2( \par const CvArr* points );
3525 }{CPP}{FitEllipse2(points)-> Box2D}
3528 \cvarg{points}{Sequence or array of points}
3531 The function \texttt{cvFitEllipse} calculates the ellipse that fits best
3532 (in least-squares sense) around a set of 2D points. The meaning of the
3533 returned structure fields is similar to those in \cross{Ellipse} except
3534 that \texttt{size} stores the full lengths of the ellipse axises,
3537 \cvfunc{FitLine}\label{FitLine}
3539 Fits a line to a 2D or 3D point set.
3542 void cvFitLine( \par const CvArr* points,\par int dist\_type,\par double param,\par double reps,\par double aeps,\par float* line );
3543 }{CPP}{FitLine(points, dist\_type, param, reps, aeps) -> line}
3546 \cvarg{points}{Sequence or array of 2D or 3D points with 32-bit integer or floating-point coordinates}
3547 \cvarg{dist\_type}{The distance used for fitting (see the discussion)}
3548 \cvarg{param}{Numerical parameter (\texttt{C}) for some types of distances, if 0 then some optimal value is chosen}
3549 \cvarg{reps, aeps}{Sufficient accuracy for the radius (distance between the coordinate origin and the line) and angle, respectively; 0.01 would be a good default value for both.}
3550 \cvarg{line}{The output line parameters. In the case of a 2d fitting,
3551 it is \cvC{an array} \cvPy{a tuple} of 4 floats \texttt{(vx, vy,
3552 x0, y0)} where \texttt{(vx, vy)} is a normalized vector collinear to the
3553 line and \texttt{(x0, y0)} is some point on the line. in the case of a
3554 3D fitting it is \cvC{an array} \cvPy{a tuple} of 6 floats \texttt{(vx, vy, vz, x0, y0, z0)}
3555 where \texttt{(vx, vy, vz)} is a normalized vector collinear to the line
3556 and \texttt{(x0, y0, z0)} is some point on the line}
3559 The function \texttt{cvFitLine} fits a line to a 2D or 3D point set by minimizing $\sum_i \rho(r_i)$ where $r_i$ is the distance between the $i$ th point and the line and $\rho(r)$ is a distance function, one of:
3563 \item[dist\_type=CV\_DIST\_L2]
3564 \[ \rho(r) = r^2/2 \quad \text{(the simplest and the fastest least-squares method)} \]
3566 \item[dist\_type=CV\_DIST\_L1]
3569 \item[dist\_type=CV\_DIST\_L12]
3570 \[ \rho(r) = 2 \cdot (\sqrt{1 + \frac{r^2}{2}} - 1) \]
3572 \item[dist\_type=CV\_DIST\_FAIR]
3573 \[ \rho\left(r\right) = C^2 \cdot \left( \frac{r}{C} - \log{\left(1 + \frac{r}{C}\right)}\right) \quad \text{where} \quad C=1.3998 \]
3575 \item[dist\_type=CV\_DIST\_WELSCH]
3576 \[ \rho\left(r\right) = \frac{C^2}{2} \cdot \left( 1 - \exp{\left(-\left(\frac{r}{C}\right)^2\right)}\right) \quad \text{where} \quad C=2.9846 \]
3578 \item[dist\_type=CV\_DIST\_HUBER]
3581 {C \cdot (r-C/2)}{otherwise} \quad \text{where} \quad C=1.345
3585 \cvfunc{ConvexHull2}\label{ConvexHull2}
3587 Finds the convex hull of a point set.
3590 CvSeq* cvConvexHull2( \par const CvArr* input,\par void* hull\_storage=NULL,\par int orientation=CV\_CLOCKWISE,\par int return\_points=0 );
3591 }{CPP}{ConvexHull2(points,storage,orientaton=CV\_CLOCKWISE,return\_points=0)-> convex\_hull}
3594 \cvarg{points}{Sequence or array of 2D points with 32-bit integer or floating-point coordinates}
3595 \cvarg{hull\_storage}{The destination array (CvMat*) or memory storage (CvMemStorage*) that will store the convex hull. If it is an array, it should be 1d and have the same number of elements as the input array/sequence. On output the header is modified as to truncate the array down to the hull size. If \texttt{hull\_storage} is NULL then the convex hull will be stored in the same storage as the input sequence}
3596 \cvarg{orientation}{Desired orientation of convex hull: \texttt{CV\_CLOCKWISE} or \texttt{CV\_COUNTER\_CLOCKWISE}}
3597 \cvarg{return\_points}{If non-zero, the points themselves will be stored in the hull instead of indices if \texttt{hull\_storage} is an array, or pointers if \texttt{hull\_storage} is memory storage}
3600 The function \texttt{cvConvexHull2} finds the convex hull of a 2D point set using Sklansky's algorithm. If \texttt{hull\_storage} is memory storage, the function creates a sequence containing the hull points or pointers to them, depending on \texttt{return\_points} value and returns the sequence on output. If \texttt{hull\_storage} is a CvMat, the function returns NULL.
3602 % ===== Example. Building convex hull for a sequence or array of points =====
3605 #include "highgui.h"
3608 #define ARRAY 0 /* switch between array/sequence method by replacing 0<=>1 */
3610 void main( int argc, char** argv )
3612 IplImage* img = cvCreateImage( cvSize( 500, 500 ), 8, 3 );
3613 cvNamedWindow( "hull", 1 );
3616 CvMemStorage* storage = cvCreateMemStorage();
3621 int i, count = rand()%100 + 1, hullcount;
3624 CvSeq* ptseq = cvCreateSeq( CV_SEQ_KIND_GENERIC|CV_32SC2,
3630 for( i = 0; i < count; i++ )
3632 pt0.x = rand() % (img->width/2) + img->width/4;
3633 pt0.y = rand() % (img->height/2) + img->height/4;
3634 cvSeqPush( ptseq, &pt0 );
3636 hull = cvConvexHull2( ptseq, 0, CV_CLOCKWISE, 0 );
3637 hullcount = hull->total;
3639 CvPoint* points = (CvPoint*)malloc( count * sizeof(points[0]));
3640 int* hull = (int*)malloc( count * sizeof(hull[0]));
3641 CvMat point_mat = cvMat( 1, count, CV_32SC2, points );
3642 CvMat hull_mat = cvMat( 1, count, CV_32SC1, hull );
3644 for( i = 0; i < count; i++ )
3646 pt0.x = rand() % (img->width/2) + img->width/4;
3647 pt0.y = rand() % (img->height/2) + img->height/4;
3650 cvConvexHull2( &point_mat, &hull_mat, CV_CLOCKWISE, 0 );
3651 hullcount = hull_mat.cols;
3654 for( i = 0; i < count; i++ )
3657 pt0 = *CV_GET_SEQ_ELEM( CvPoint, ptseq, i );
3661 cvCircle( img, pt0, 2, CV_RGB( 255, 0, 0 ), CV_FILLED );
3665 pt0 = **CV_GET_SEQ_ELEM( CvPoint*, hull, hullcount - 1 );
3667 pt0 = points[hull[hullcount-1]];
3670 for( i = 0; i < hullcount; i++ )
3673 CvPoint pt = **CV_GET_SEQ_ELEM( CvPoint*, hull, i );
3675 CvPoint pt = points[hull[i]];
3677 cvLine( img, pt0, pt, CV_RGB( 0, 255, 0 ));
3681 cvShowImage( "hull", img );
3683 int key = cvWaitKey(0);
3684 if( key == 27 ) // 'ESC'
3688 cvClearMemStorage( storage );
3697 \cvfunc{CheckContourConvexity}\label{CheckContourConvexity}
3699 Tests contour convexity.
3702 int cvCheckContourConvexity( const CvArr* contour );
3703 }{CPP}{CheckContourConvexity(contour)-> int}
3706 \cvarg{contour}{Tested contour (sequence or array of points)}
3709 The function \texttt{cvCheckContourConvexity} tests whether the input contour is convex or not. The contour must be simple, without self-intersections.
3711 \cvstruct{CvConvexityDefect}\label{CvConvexityDefect}
3713 Structure describing a single contour convexity defect.
3716 typedef struct CvConvexityDefect
3718 CvPoint* start; /* point of the contour where the defect begins */
3719 CvPoint* end; /* point of the contour where the defect ends */
3720 CvPoint* depth_point; /* the farthest from the convex hull point within the defect */
3721 float depth; /* distance between the farthest point and the convex hull */
3722 } CvConvexityDefect;
3725 % ===== Picture. Convexity defects of hand contour. =====
3726 \includegraphics[width=0.5\textwidth]{pics/defects.png}
3728 \cvfunc{ConvexityDefects}\label{ConvexityDefects}
3730 Finds the convexity defects of a contour.
3733 CvSeq* cvConvexityDefects( \par const CvArr* contour,\par const CvArr* convexhull,\par CvMemStorage* storage=NULL );
3734 }{CPP}{ConvexityDefects(contour,convexhull,storage)-> convexity\_defects}
3737 \cvarg{contour}{Input contour}
3738 \cvarg{convexhull}{Convex hull obtained using \cross{ConvexHull2} that should contain pointers or indices to the contour points, not the hull points themselves (the \texttt{return\_points} parameter in \cross{ConvexHull2} should be 0)}
3739 \cvarg{storage}{Container for the output sequence of convexity defects. If it is NULL, the contour or hull (in that order) storage is used}
3742 The function \texttt{ConvexityDefects} finds all convexity defects of the input contour and returns a sequence of the CvConvexityDefect structures.
3744 \cvfunc{PointPolygonTest}\label{PointPolygonTest}
3746 Point in contour test.
3749 double cvPointPolygonTest( \par const CvArr* contour,\par CvPoint2D32f pt,\par int measure\_dist );
3750 }{CPP}{PointPolygonTest(contour,pt,measure\_dist)-> double}
3753 \cvarg{contour}{Input contour}
3754 \cvarg{pt}{The point tested against the contour}
3755 \cvarg{measure\_dist}{If it is non-zero, the function estimates the distance from the point to the nearest contour edge}
3758 The function \texttt{cvPointPolygonTest} determines whether the
3759 point is inside a contour, outside, or lies on an edge (or coinsides
3760 with a vertex). It returns positive, negative or zero value,
3761 correspondingly. When $\texttt{measure\_dist} =0$, the return value
3762 is +1, -1 and 0, respectively. When $\texttt{measure\_dist} \ne 0$,
3763 it is a signed distance between the point and the nearest contour
3766 Here is the sample output of the function, where each image pixel is tested against the contour.
3768 \includegraphics[width=0.5\textwidth]{pics/pointpolygon.png}
3770 \cvfunc{MinAreaRect2}\label{MinAreaRect2}
3772 Finds the circumscribed rectangle of minimal area for a given 2D point set.
3775 CvBox2D cvMinAreaRect2( \par const CvArr* points,\par CvMemStorage* storage=NULL );
3776 }{CPP}{MinAreaRect2(points,storage)-> CvBox2D}
3779 \cvarg{points}{Sequence or array of points}
3780 \cvarg{storage}{Optional temporary memory storage}
3783 The function \texttt{cvMinAreaRect2} finds a circumscribed rectangle of the minimal area for a 2D point set by building a convex hull for the set and applying the rotating calipers technique to the hull.
3785 \cvfunc{Picture. Minimal-area bounding rectangle for contour}
3787 \includegraphics[width=0.5\textwidth]{pics/minareabox.png}
3789 \cvfunc{MinEnclosingCircle}\label{MinEnclosingCircle}
3791 Finds the circumscribed circle of minimal area for a given 2D point set.
3794 int cvMinEnclosingCircle( \par const CvArr* points,\par CvPoint2D32f* center,\par float* radius );
3795 }{CPP}{MinEnclosingCircle(points)-> int,center,radius}
3798 \cvarg{points}{Sequence or array of 2D points}
3799 \cvarg{center}{Output parameter; the center of the enclosing circle}
3800 \cvarg{radius}{Output parameter; the radius of the enclosing circle}
3803 The function \texttt{cvMinEnclosingCircle} finds the minimal circumscribed
3804 circle for a 2D point set using an iterative algorithm. It returns nonzero
3805 if the resultant circle contains all the input points and zero otherwise
3806 (i.e. the algorithm failed).
3808 \cvfunc{CalcPGH}\label{CalcPGH}
3810 Calculates a pair-wise geometrical histogram for a contour.
3813 void cvCalcPGH( const CvSeq* contour, CvHistogram* hist );
3814 }{CPP}{CalcPGH(contour,hist)-> None}
3817 \cvarg{contour}{Input contour. Currently, only integer point coordinates are allowed}
3818 \cvarg{hist}{Calculated histogram; must be two-dimensional}
3821 The function \texttt{cvCalcPGH} calculates a
3822 2D pair-wise geometrical histogram (PGH), described in
3824 for the contour. The algorithm considers every pair of contour
3825 edges. The angle between the edges and the minimum/maximum distances
3826 are determined for every pair. To do this each of the edges in turn
3827 is taken as the base, while the function loops through all the other
3828 edges. When the base edge and any other edge are considered, the minimum
3829 and maximum distances from the points on the non-base edge and line of
3830 the base edge are selected. The angle between the edges defines the row
3831 of the histogram in which all the bins that correspond to the distance
3832 between the calculated minimum and maximum distances are incremented
3833 (that is, the histogram is transposed relatively to the \cross{Iivarninen97}
3834 definition). The histogram can be used for contour matching.
3836 \subsection{Planar Subdivisions}
3838 \cvstruct{CvSubdiv2D}\label{CvSubdiv2D}
3843 #define CV_SUBDIV2D_FIELDS() \
3846 int is_geometry_valid; \
3847 CvSubdiv2DEdge recent_edge; \
3848 CvPoint2D32f topleft; \
3849 CvPoint2D32f bottomright;
3851 typedef struct CvSubdiv2D
3853 CV_SUBDIV2D_FIELDS()
3858 Planar subdivision is the subdivision of a plane into a set of
3859 non-overlapped regions (facets) that cover the whole plane. The above
3860 structure describes a subdivision built on a 2d point set, where the points
3861 are linked together and form a planar graph, which, together with a few
3862 edges connecting the exterior subdivision points (namely, convex hull points)
3863 with infinity, subdivides a plane into facets by its edges.
3865 For every subdivision there exists a dual subdivision in which facets and
3866 points (subdivision vertices) swap their roles, that is, a facet is
3867 treated as a vertex (called a virtual point below) of the dual subdivision and
3868 the original subdivision vertices become facets. On the picture below
3869 original subdivision is marked with solid lines and dual subdivision
3872 \includegraphics[width=0.5\textwidth]{pics/subdiv.png}
3874 OpenCV subdivides a plane into triangles using Delaunay's
3875 algorithm. Subdivision is built iteratively starting from a dummy
3876 triangle that includes all the subdivision points for sure. In this
3877 case the dual subdivision is a Voronoi diagram of the input 2d point set. The
3878 subdivisions can be used for the 3d piece-wise transformation of a plane,
3879 morphing, fast location of points on the plane, building special graphs
3880 (such as NNG,RNG) and so forth.
3882 \cvstruct{CvQuadEdge2D}\label{CvQuadEdge2D}
3884 Quad-edge of planar subdivision.
3887 /* one of edges within quad-edge, lower 2 bits is index (0..3)
3888 and upper bits are quad-edge pointer */
3889 typedef long CvSubdiv2DEdge;
3891 /* quad-edge structure fields */
3892 #define CV_QUADEDGE2D_FIELDS() \
3894 struct CvSubdiv2DPoint* pt[4]; \
3895 CvSubdiv2DEdge next[4];
3897 typedef struct CvQuadEdge2D
3899 CV_QUADEDGE2D_FIELDS()
3905 Quad-edge is a basic element of subdivision containing four edges (e, eRot, reversed e and reversed eRot):
3907 \includegraphics[width=0.5\textwidth]{pics/quadedge.png}
3909 \cvstruct{CvSubdiv2DPoint}\label{CvSubdiv2DPoint}
3911 Point of original or dual subdivision.
3914 #define CV_SUBDIV2D_POINT_FIELDS()\
3916 CvSubdiv2DEdge first; \
3919 #define CV_SUBDIV2D_VIRTUAL_POINT_FLAG (1 << 30)
3921 typedef struct CvSubdiv2DPoint
3923 CV_SUBDIV2D_POINT_FIELDS()
3928 \cvfunc{Subdiv2DGetEdge}\label{Subdiv2DGetEdge}
3930 Returns one of the edges related to the given edge.
3933 CvSubdiv2DEdge cvSubdiv2DGetEdge( CvSubdiv2DEdge edge, CvNextEdgeType type );
3936 }{CPP}{Subdiv2DGetEdge(edge,type)-> CvSubdiv2DEdge}
3938 #define cvSubdiv2DNextEdge( edge ) cvSubdiv2DGetEdge( edge, CV_NEXT_AROUND_ORG )
3942 \cvarg{edge}{Subdivision edge (not a quad-edge)}
3943 \cvarg{type}{Specifies which of the related edges to return, one of the following:}
3945 \cvarg{CV\_NEXT\_AROUND\_ORG}{next around the edge origin (\texttt{eOnext} on the picture above if \texttt{e} is the input edge)}
3946 \cvarg{CV\_NEXT\_AROUND\_DST}{next around the edge vertex (\texttt{eDnext})}
3947 \cvarg{CV\_PREV\_AROUND\_ORG}{previous around the edge origin (reversed \texttt{eRnext})}
3948 \cvarg{CV\_PREV\_AROUND\_DST}{previous around the edge destination (reversed \texttt{eLnext})}
3949 \cvarg{CV\_NEXT\_AROUND\_LEFT}{next around the left facet (\texttt{eLnext})}
3950 \cvarg{CV\_NEXT\_AROUND\_RIGHT}{next around the right facet (\texttt{eRnext})}
3951 \cvarg{CV\_PREV\_AROUND\_LEFT}{previous around the left facet (reversed \texttt{eOnext})}
3952 \cvarg{CV\_PREV\_AROUND\_RIGHT}{previous around the right facet (reversed \texttt{eDnext})}
3956 The function \texttt{cvSubdiv2DGetEdge} returns one of the edges related to the input edge.
3958 \cvfunc{Subdiv2DRotateEdge}\label{Subdiv2DRotateEdge}
3960 Returns another edge of the same quad-edge.
3963 CvSubdiv2DEdge cvSubdiv2DRotateEdge( \par CvSubdiv2DEdge edge,\par int rotate );
3964 }{CPP}{Subdiv2DRotateEdge(edge,rotate)-> CvSubdiv2DEdge}
3967 \cvarg{edge}{Subdivision edge (not a quad-edge)}
3968 \cvarg{type}{Specifies which of the edges of the same quad-edge as the input one to return, one of the following:
3970 \cvarg{0}{the input edge (\texttt{e} on the picture above if \texttt{e} is the input edge)}
3971 \cvarg{1}{the rotated edge (\texttt{eRot})}
3972 \cvarg{2}{the reversed edge (reversed \texttt{e} (in green))}
3973 \cvarg{3}{the reversed rotated edge (reversed \texttt{eRot} (in green))}
3977 The function \texttt{cvSubdiv2DRotateEdge} returns one of the edges of the same quad-edge as the input edge.
3979 \cvfunc{Subdiv2DEdgeOrg}\label{Subdiv2DEdgeOrg}
3981 Returns the edge origin.
3984 CvSubdiv2DPoint* cvSubdiv2DEdgeOrg( \par CvSubdiv2DEdge edge );
3985 }{CPP}{Subdiv2DEdgeOrg(edge)-> point}
3988 \cvarg{edge}{Subdivision edge (not a quad-edge)}
3991 The function \texttt{cvSubdiv2DEdgeOrg} returns the edge
3992 origin. The returned pointer may be NULL if the edge is from dual
3993 subdivision and the virtual point coordinates are not calculated
3994 yet. The virtual points can be calculated using the function
3995 \cross{CalcSubdivVoronoi2D}.
3997 \cvfunc{Subdiv2DEdgeDst}\label{Subdiv2DEdgeDst}
3999 Returns the edge destination.
4002 CvSubdiv2DPoint* cvSubdiv2DEdgeDst( \par CvSubdiv2DEdge edge );
4003 }{CPP}{Subdiv2DEdgeDist(edge)-> point}
4006 \cvarg{edge}{Subdivision edge (not a quad-edge)}
4009 The function \texttt{cvSubdiv2DEdgeDst} returns the edge destination. The
4010 returned pointer may be NULL if the edge is from dual subdivision and
4011 the virtual point coordinates are not calculated yet. The virtual points
4012 can be calculated using the function \cross{CalcSubdivVoronoi2D}.
4014 \cvfunc{CreateSubdivDelaunay2D}\label{CreateSubdivDelaunay2D}
4016 Creates an empty Delaunay triangulation.
4019 CvSubdiv2D* cvCreateSubdivDelaunay2D( \par CvRect rect,\par CvMemStorage* storage );
4020 }{CPP}{CreateSubdivDelaunay2D(rect,storage)-> delaunay\_triangulation}
4023 \cvarg{rect}{Rectangle that includes all of the 2d points that are to be added to the subdivision}
4024 \cvarg{storage}{Container for subdivision}
4027 The function \texttt{CreateSubdivDelaunay2D} creates an empty Delaunay
4028 subdivision, where 2d points can be added using the function
4029 \cross{SubdivDelaunay2DInsert}. All of the points to be added must be within
4030 the specified rectangle, otherwise a runtime error will be raised.
4032 Note that the triangulation is a single large triangle that covers the given rectangle. Hence the three vertices of this triangle are outside the rectangle \texttt{rect}.
4034 \cvfunc{SubdivDelaunay2DInsert}\label{SubdivDelaunay2DInsert}
4036 Inserts a single point into a Delaunay triangulation.
4039 CvSubdiv2DPoint* cvSubdivDelaunay2DInsert( \par CvSubdiv2D* subdiv,\par CvPoint2D32f pt);
4040 }{CPP}{SubdivDelaunay2DInsert(subdiv,pt)-> point}
4043 \cvarg{subdiv}{Delaunay subdivision created by the function \cross{CreateSubdivDelaunay2D}}
4044 \cvarg{pt}{Inserted point}
4047 The function \texttt{cvSubdivDelaunay2DInsert} inserts a single point into a subdivision and modifies the subdivision topology appropriately. If a point with the same coordinates exists already, no new point is added. The function returns a pointer to the allocated point. No virtual point coordinates are calculated at this stage.
4049 \cvfunc{Subdiv2DLocate}\label{Subdiv2DLocate}
4051 Returns the location of a point within a Delaunay triangulation.
4054 CvSubdiv2DPointLocation cvSubdiv2DLocate( \par CvSubdiv2D* subdiv,\par CvPoint2D32f pt,\par CvSubdiv2DEdge* edge,\par CvSubdiv2DPoint** vertex=NULL );
4055 }{CPP}{Subdiv2DLocate(subdiv, pt) -> (loc, where)}
4058 \cvarg{subdiv}{Delaunay or another subdivision}
4059 \cvarg{pt}{The point to locate}
4060 \cvC{\cvarg{edge}{The output edge the point falls onto or right to}}
4061 \cvC{\cvarg{vertex}{Optional output vertex double pointer the input point coinsides with}}
4062 \cvPy{\cvarg{loc}{The location of the point within the triangulation}}
4063 \cvPy{\cvarg{where}{The edge or vertex. See below.}}
4066 The function \texttt{cvSubdiv2DLocate} locates the input point within the subdivision. There are 5 cases:
4070 \item The point falls into some facet. The function returns \texttt{CV\_PTLOC\_INSIDE} and \texttt{*edge} will contain one of edges of the facet.
4071 \item The point falls onto the edge. The function returns \texttt{CV\_PTLOC\_ON\_EDGE} and \texttt{*edge} will contain this edge.
4072 \item The point coincides with one of the subdivision vertices. The function returns \texttt{CV\_PTLOC\_VERTEX} and \texttt{*vertex} will contain a pointer to the vertex.
4073 \item The point is outside the subdivsion reference rectangle. The function returns \texttt{CV\_PTLOC\_OUTSIDE\_RECT} and no pointers are filled.
4074 \item One of input arguments is invalid. A runtime error is raised or, if silent or "parent" error processing mode is selected, \\texttt{CV\_PTLOC\_ERROR} is returnd.
4080 \item The point falls into some facet. \texttt{loc} is \texttt{CV\_PTLOC\_INSIDE} and \texttt{where} is one of edges of the facet.
4081 \item The point falls onto the edge. \texttt{loc} is \texttt{CV\_PTLOC\_ON\_EDGE} and \texttt{where} is the edge.
4082 \item The point coincides with one of the subdivision vertices. \texttt{loc} is \texttt{CV\_PTLOC\_VERTEX} and \texttt{where} is the vertex.
4083 \item The point is outside the subdivsion reference rectangle. \texttt{loc} is \texttt{CV\_PTLOC\_OUTSIDE\_RECT} and \texttt{where} is None.
4084 \item One of input arguments is invalid. The function raises an exception.
4088 \cvfunc{FindNearestPoint2D}\label{FindNearestPoint2D}
4090 Finds the closest subdivision vertex to the given point.
4093 CvSubdiv2DPoint* cvFindNearestPoint2D( \par CvSubdiv2D* subdiv,\par CvPoint2D32f pt );
4094 }{CPP}{FindNearestPoint2D(subdiv,pt)-> point}
4097 \cvarg{subdiv}{Delaunay or another subdivision}
4098 \cvarg{pt}{Input point}
4101 The function \texttt{cvFindNearestPoint2D} is another function that
4102 locates the input point within the subdivision. It finds the subdivision vertex that
4103 is the closest to the input point. It is not necessarily one of vertices
4104 of the facet containing the input point, though the facet (located using
4105 \cross{Subdiv2DLocate}) is used as a starting
4106 point. The function returns a pointer to the found subdivision vertex.
4108 \cvfunc{CalcSubdivVoronoi2D}\label{CalcSubdivVoronoi2D}
4110 Calculates the coordinates of Voronoi diagram cells.
4113 void cvCalcSubdivVoronoi2D( \par CvSubdiv2D* subdiv );
4114 }{CPP}{CalcSubdivVoronoi2D(subdiv)-> None}
4117 \cvarg{subdiv}{Delaunay subdivision, in which all the points are already added}
4120 The function \texttt{cvCalcSubdivVoronoi2D} calculates the coordinates
4121 of virtual points. All virtual points corresponding to some vertex of the
4122 original subdivision form (when connected together) a boundary of the Voronoi
4125 \cvfunc{ClearSubdivVoronoi2D}\label{ClearSubdivVoronoi2D}
4127 Removes all virtual points.
4130 void cvClearSubdivVoronoi2D( CvSubdiv2D* subdiv );
4131 }{CPP}{ClearSubdivVoronoi2D(subdiv)-> None}
4134 \cvarg{subdiv}{Delaunay subdivision}
4137 The function \texttt{ClearSubdivVoronoi2D} removes all of the virtual points. It
4138 is called internally in \cross{CalcSubdivVoronoi2D} if the subdivision
4139 was modified after previous call to the function.
4141 There are a few other lower-level functions that work with planar
4142 subdivisions, see cv.h and the sources. A demo script delaunay.c that
4143 builds a Delaunay triangulation and a Voronoi diagram of a random 2d point
4144 set can be found at opencv/samples/c.
4146 \section{Motion Analysis and Object Tracking Reference}
4148 \subsection{Accumulation of Background Statistics}
4150 \cvfunc{Acc}\label{Acc}
4152 Adds a frame to an accumulator.
4155 void cvAcc( \par const CvArr* image,\par CvArr* sum,\par const CvArr* mask=NULL );
4156 }{CPP}{Acc(image,smu,mask=NULL)-> None}
4159 \cvarg{image}{Input image, 1- or 3-channel, 8-bit or 32-bit floating point. (each channel of multi-channel image is processed independently)}
4160 \cvarg{sum}{Accumulator with the same number of channels as input image, 32-bit or 64-bit floating-point}
4161 \cvarg{mask}{Optional operation mask}
4164 The function \texttt{cvAcc} adds the whole image \texttt{image} or its selected region to the accumulator \texttt{sum}:
4166 \[ \texttt{sum}(x,y) \leftarrow \texttt{sum}(x,y) + \texttt{image}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
4168 \cvfunc{SquareAcc}\label{SquareAcc}
4170 Adds the square of the source image to the accumulator.
4173 void cvSquareAcc( \par const CvArr* image,\par CvArr* sqsum,\par const CvArr* mask=NULL );
4174 }{CPP}{SquareAcc(image,sqsum,mask=NULL)-> None}
4177 \cvarg{image}{Input image, 1- or 3-channel, 8-bit or 32-bit floating point (each channel of multi-channel image is processed independently)}
4178 \cvarg{sqsum}{Accumulator with the same number of channels as input image, 32-bit or 64-bit floating-point}
4179 \cvarg{mask}{Optional operation mask}
4182 The function \texttt{cvSquareAcc} adds the input image \texttt{image} or its selected region, raised to power 2, to the accumulator \texttt{sqsum}:
4184 \[ \texttt{sqsum}(x,y) \leftarrow \texttt{sqsum}(x,y) + \texttt{image}(x,y)^2 \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
4186 \cvfunc{MultiplyAcc}\label{MultiplyAcc}
4188 Adds the product of two input images to the accumulator.
4191 void cvMultiplyAcc( \par const CvArr* image1,\par const CvArr* image2,\par CvArr* acc,\par const CvArr* mask=NULL );
4192 }{CPP}{MulitplyAcc(image1,image2,acc,mask=NULL)-> None}
4195 \cvarg{image1}{First input image, 1- or 3-channel, 8-bit or 32-bit floating point (each channel of multi-channel image is processed independently)}
4196 \cvarg{image2}{Second input image, the same format as the first one}
4197 \cvarg{acc}{Accumulator with the same number of channels as input images, 32-bit or 64-bit floating-point}
4198 \cvarg{mask}{Optional operation mask}
4201 The function \texttt{cvMultiplyAcc} adds the product of 2 images or their selected regions to the accumulator \texttt{acc}:
4203 \[ \texttt{acc}(x,y) \leftarrow \texttt{acc}(x,y) + \texttt{image1}(x,y) \cdot \texttt{image2}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
4205 \cvfunc{RunningAvg}\label{RunningAvg}
4207 Updates the running average.
4210 void cvRunningAvg( \par const CvArr* image,\par CvArr* acc,\par double alpha,\par const CvArr* mask=NULL );
4211 }{CPP}{RunningAvg(image,acc,alpha,mask=NULL)-> None}
4214 \cvarg{image}{Input image, 1- or 3-channel, 8-bit or 32-bit floating point (each channel of multi-channel image is processed independently)}
4215 \cvarg{acc}{Accumulator with the same number of channels as input image, 32-bit or 64-bit floating-point}
4216 \cvarg{alpha}{Weight of input image}
4217 \cvarg{mask}{Optional operation mask}
4220 The function \texttt{cvRunningAvg} calculates the weighted sum of the input image
4221 \texttt{image} and the accumulator \texttt{acc} so that \texttt{acc}
4222 becomes a running average of frame sequence:
4224 \[ \texttt{acc}(x,y) \leftarrow (1-\alpha) \cdot \texttt{acc}(x,y) + \alpha \cdot \texttt{image}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
4226 where $\alpha$ (\texttt{alpha}) regulates the update speed (how fast the accumulator forgets about previous frames).
4228 \subsection{Motion Templates}
4230 \cvfunc{UpdateMotionHistory}\label{UpdateMotionHistory}
4232 Updates the motion history image by a moving silhouette.
4235 void cvUpdateMotionHistory( \par const CvArr* silhouette,\par CvArr* mhi,\par double timestamp,\par double duration );
4236 }{CPP}{UpdateMotionHistory(silhouette,mhi,timestamp,duration)-> None}
4239 \cvarg{silhouette}{Silhouette mask that has non-zero pixels where the motion occurs}
4240 \cvarg{mhi}{Motion history image, that is updated by the function (single-channel, 32-bit floating-point)}
4241 \cvarg{timestamp}{Current time in milliseconds or other units}
4242 \cvarg{duration}{Maximal duration of the motion track in the same units as \texttt{timestamp}}
4245 The function \texttt{cvUpdateMotionHistory} updates the motion history image as following:
4248 \texttt{mhi}(x,y)=\forkthree
4249 {\texttt{timestamp}}{if $\texttt{silhouette}(x,y) \ne 0$}
4250 {0}{if $\texttt{silhouette}(x,y) = 0$ and $\texttt{mhi} < (\texttt{timestamp} - \texttt{duration})$}
4251 {\texttt{mhi}(x,y)}{otherwise}
4253 That is, MHI pixels where motion occurs are set to the current timestamp, while the pixels where motion happened far ago are cleared.
4255 \cvfunc{CalcMotionGradient}\label{CalcMotionGradient}
4257 Calculates the gradient orientation of a motion history image.
4260 void cvCalcMotionGradient( \par const CvArr* mhi,\par CvArr* mask,\par CvArr* orientation,\par double delta1,\par double delta2,\par int aperture\_size=3 );
4261 }{CPP}{CalcMotionGradient(mhi,mask,orientation,delta1,delta2,aperture\_size=3)-> None}
4264 \cvarg{mhi}{Motion history image}
4265 \cvarg{mask}{Mask image; marks pixels where the motion gradient data is correct; output parameter}
4266 \cvarg{orientation}{Motion gradient orientation image; contains angles from 0 to ~360 degrees }
4267 \cvarg{delta1, delta2}{The function finds the minimum ($m(x,y)$) and maximum ($M(x,y)$) mhi values over each pixel $(x,y)$ neighborhood and assumes the gradient is valid only if
4269 \min(\texttt{delta1} , \texttt{delta2} ) \le M(x,y)-m(x,y) \le \max(\texttt{delta1} ,\texttt{delta2} ).
4271 \cvarg{aperture\_size}{Aperture size of derivative operators used by the function: CV\_SCHARR, 1, 3, 5 or 7 (see \cross{Sobel})}
4274 The function \texttt{cvCalcMotionGradient} calculates the derivatives $Dx$ and $Dy$ of \texttt{mhi} and then calculates gradient orientation as:
4277 \texttt{orientation}(x,y)=\arctan{\frac{Dy(x,y)}{Dx(x,y)}}
4280 where both $Dx(x,y)$ and $Dy(x,y)$ signs are taken into account (as in the \cross{CartToPolar} function). After that \texttt{mask} is filled to indicate where the orientation is valid (see the \texttt{delta1} and \texttt{delta2} description).
4282 \cvfunc{CalcGlobalOrientation}\label{CalcGlobalOrientation}
4284 Calculates the global motion orientation of some selected region.
4287 double cvCalcGlobalOrientation( \par const CvArr* orientation,\par const CvArr* mask,\par const CvArr* mhi,\par double timestamp,\par double duration );
4288 }{CPP}{CalcGlobalOrientation(orientation,mask,mhi,timestamp,duration)-> None}
4291 \cvarg{orientation}{Motion gradient orientation image; calculated by the function \cross{CalcMotionGradient}}
4292 \cvarg{mask}{Mask image. It may be a conjunction of a valid gradient mask, obtained with \cross{CalcMotionGradient} and the mask of the region, whose direction needs to be calculated}
4293 \cvarg{mhi}{Motion history image}
4294 \cvarg{timestamp}{Current time in milliseconds or other units, it is better to store time passed to \cross{UpdateMotionHistory} before and reuse it here, because running \cross{UpdateMotionHistory} and \cross{CalcMotionGradient} on large images may take some time}
4295 \cvarg{duration}{Maximal duration of motion track in milliseconds, the same as \cross{UpdateMotionHistory}}
4298 The function \texttt{cvCalcGlobalOrientation} calculates the general
4299 motion direction in the selected region and returns the angle between
4300 0 degrees and 360 degrees . At first the function builds the orientation histogram
4301 and finds the basic orientation as a coordinate of the histogram
4302 maximum. After that the function calculates the shift relative to the
4303 basic orientation as a weighted sum of all of the orientation vectors: the more
4304 recent the motion, the greater the weight. The resultant angle is
4305 a circular sum of the basic orientation and the shift.
4307 \cvfunc{SegmentMotion}\label{SegmentMotion}
4309 Segments a whole motion into separate moving parts.
4312 CvSeq* cvSegmentMotion( \par const CvArr* mhi,\par CvArr* seg\_mask,\par CvMemStorage* storage,\par double timestamp,\par double seg\_thresh );
4313 }{CPP}{SegmentMotion(mhi,seg\_mask,storage,timestamp,seg\_thresh)-> None}
4316 \cvarg{mhi}{Motion history image}
4317 \cvarg{seg\_mask}{Image where the mask found should be stored, single-channel, 32-bit floating-point}
4318 \cvarg{storage}{Memory storage that will contain a sequence of motion connected components}
4319 \cvarg{timestamp}{Current time in milliseconds or other units}
4320 \cvarg{seg\_thresh}{Segmentation threshold; recommended to be equal to the interval between motion history "steps" or greater}
4323 The function \texttt{cvSegmentMotion} finds all of the motion segments and
4324 marks them in \texttt{seg\_mask} with individual values (1,2,...). It
4325 also returns a sequence of \cross{CvConnectedComp}
4326 structures, one for each motion component. After that the
4327 motion direction for every component can be calculated with
4328 \cross{CalcGlobalOrientation} using the extracted mask of the particular
4329 component \cross{Cmp}.
4331 \subsection{Object Tracking}
4333 \cvfunc{MeanShift}\label{MeanShift}
4335 Finds the object center on back projection.
4338 int cvMeanShift( \par const CvArr* prob\_image,\par CvRect window,\par CvTermCriteria criteria,\par CvConnectedComp* comp );
4339 }{CPP}{MeanShift(prob\_image,window,criteria)-> comp}
4342 \cvarg{prob\_image}{Back projection of the object histogram (see \cross{CalcBackProject})}
4343 \cvarg{window}{Initial search window}
4344 \cvarg{criteria}{Criteria applied to determine when the window search should be finished}
4345 \cvarg{comp}{Resultant structure that contains the converged search window coordinates (\texttt{comp->rect} field) and the sum of all of the pixels inside the window (\texttt{comp->area} field)}
4348 The function \texttt{cvMeanShift} iterates to find the object center
4349 given its back projection and initial position of search window. The
4350 iterations are made until the search window center moves by less than
4351 the given value and/or until the function has done the maximum number
4352 of iterations. The function returns the number of iterations made.
4354 \cvfunc{CamShift}\label{CamShift}
4356 Finds the object center, size, and orientation.
4359 int cvCamShift( \par const CvArr* prob\_image,\par CvRect window,\par CvTermCriteria criteria,\par CvConnectedComp* comp,\par CvBox2D* box=NULL );
4360 }{CPP}{CamShift(prob\_image,window,criteria,box=NULL)-> comp}
4363 \cvarg{prob\_image}{Back projection of object histogram (see \cross{CalcBackProject})}
4364 \cvarg{window}{Initial search window}
4365 \cvarg{criteria}{Criteria applied to determine when the window search should be finished}
4366 \cvarg{comp}{Resultant structure that contains the converged search window coordinates (\texttt{comp->rect} field) and the sum of all of the pixels inside the window (\texttt{comp->area} field)}
4367 \cvarg{box}{Circumscribed box for the object. If not \texttt{NULL}, it contains object size and orientation}
4370 The function \texttt{cvCamShift} implements the CAMSHIFT object tracking algrorithm
4372 First, it finds an object center using \cross{MeanShift} and, after that, calculates the object size and orientation. The function returns number of iterations made within \cross{MeanShift}.
4374 The \cross{CvCamShiftTracker} class declared in cv.hpp implements the color object tracker that uses the function.
4376 \cvfunc{SnakeImage}\label{SnakeImage}
4378 Changes the contour position to minimize its energy.
4381 void cvSnakeImage( \par const IplImage* image,\par CvPoint* points,\par int length,\par float* alpha,\par float* beta,\par float* gamma,\par int coeff\_usage,\par CvSize win,\par CvTermCriteria criteria,\par int calc\_gradient=1 );
4382 }{CPP}{SnakeImage(image,points,alpha,beta,gamma,coeff\_usage,win,criteria,calc\_gradient=1)-> None}
4385 \cvarg{image}{The source image or external energy field}
4386 \cvarg{points}{Contour points (snake)}
4387 \cvarg{length}{Number of points in the contour}
4388 \cvarg{alpha}{Weight[s] of continuity energy, single float or array of \texttt{length} floats, one for each contour point}
4389 \cvarg{beta}{Weight[s] of curvature energy, similar to \texttt{alpha}}
4390 \cvarg{gamma}{Weight[s] of image energy, similar to \texttt{alpha}}
4391 \cvarg{coeff\_usage}{Different uses of the previous three parameters:
4393 \cvarg{CV\_VALUE}{indicates that each of \texttt{alpha, beta, gamma} is a pointer to a single value to be used for all points;}
4394 \cvarg{CV\_ARRAY}{indicates that each of \texttt{alpha, beta, gamma} is a pointer to an array of coefficients different for all the points of the snake. All the arrays must have the size equal to the contour size.}
4396 \cvarg{win}{Size of neighborhood of every point used to search the minimum, both \texttt{win.width} and \texttt{win.height} must be odd}
4397 \cvarg{criteria}{Termination criteria}
4398 \cvarg{calc\_gradient}{Gradient flag; if not 0, the function calculates the gradient magnitude for every image pixel and consideres it as the energy field, otherwise the input image itself is considered}
4401 The function \texttt{cvSnakeImage} updates the snake in order to minimize its
4402 total energy that is a sum of internal energy that depends on the contour
4403 shape (the smoother contour is, the smaller internal energy is) and
4404 external energy that depends on the energy field and reaches minimum at
4405 the local energy extremums that correspond to the image edges in the case
4406 of using an image gradient.
4408 The parameter \texttt{criteria.epsilon} is used to define the minimal
4409 number of points that must be moved during any iteration to keep the
4410 iteration process running.
4412 If at some iteration the number of moved points is less
4413 than \texttt{criteria.epsilon} or the function performed
4414 \texttt{criteria.max\_iter} iterations, the function terminates.
4416 \subsection{Optical Flow}
4418 \cvfunc{CalcOpticalFlowHS}\label{CalcOpticalFlowHS}
4420 Calculates the optical flow for two images.
4423 void cvCalcOpticalFlowHS( \par const CvArr* prev,\par const CvArr* curr,\par int use\_previous,\par CvArr* velx,\par CvArr* vely,\par double lambda,\par CvTermCriteria criteria );
4424 }{CPP}{CalcOpticalFlowHS(prev,curr,use\_previous,velx,vely,lambda,criteria)-> None}
4427 \cvarg{prev}{First image, 8-bit, single-channel}
4428 \cvarg{curr}{Second image, 8-bit, single-channel}
4429 \cvarg{use\_previous}{Uses the previous (input) velocity field}
4430 \cvarg{velx}{Horizontal component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4431 \cvarg{vely}{Vertical component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4432 \cvarg{lambda}{Lagrangian multiplier}
4433 \cvarg{criteria}{Criteria of termination of velocity computing}
4436 The function \texttt{cvCalcOpticalFlowHS} computes the flow for every pixel of the first input image using the Horn and Schunck algorithm
4439 \cvfunc{CalcOpticalFlowLK}\label{CalcOpticalFlowLK}
4441 Calculates the optical flow for two images.
4444 void cvCalcOpticalFlowLK( \par const CvArr* prev,\par const CvArr* curr,\par CvSize win\_size,\par CvArr* velx,\par CvArr* vely );
4445 }{CPP}{CalcOpticalFlowLK(prev,curr,win\_size,velx,vely)-> None}
4448 \cvarg{prev}{First image, 8-bit, single-channel}
4449 \cvarg{curr}{Second image, 8-bit, single-channel}
4450 \cvarg{win\_size}{Size of the averaging window used for grouping pixels}
4451 \cvarg{velx}{Horizontal component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4452 \cvarg{vely}{Vertical component of the optical flow of the same size as input images, 32-bit floating-point, single-channel}
4455 The function \texttt{cvCalcOpticalFlowLK} computes the flow for every pixel of the first input image using the Lucas and Kanade algorithm
4458 \cvfunc{CalcOpticalFlowBM}\label{CalcOpticalFlowBM}
4460 Calculates the optical flow for two images by using the block matching method.
4463 void cvCalcOpticalFlowBM( \par const CvArr* prev,\par const CvArr* curr,\par CvSize block\_size,\par CvSize shift\_size,\par CvSize max\_range,\par int use\_previous,\par CvArr* velx,\par CvArr* vely );
4464 }{CPP}{CalcOpticalFlowBM(prev,curr,block\_size,shift\_size,max\_range,use\_previous,velx,vely)-> None}
4467 \cvarg{prev}{First image, 8-bit, single-channel}
4468 \cvarg{curr}{Second image, 8-bit, single-channel}
4469 \cvarg{block\_size}{Size of basic blocks that are compared}
4470 \cvarg{shift\_size}{Block coordinate increments}
4471 \cvarg{max\_range}{Size of the scanned neighborhood in pixels around the block}
4472 \cvarg{use\_previous}{Uses the previous (input) velocity field}
4473 \cvarg{velx}{Horizontal component of the optical flow of
4475 \left\lfloor \frac{\texttt{prev->width} - \texttt{block\_size.width}}{\texttt{shiftSize.width}} \right\rfloor
4477 \left\lfloor \frac{\texttt{prev->height} - \texttt{block\_size.height}}{\texttt{shiftSize.height}} \right\rfloor
4479 size, 32-bit floating-point, single-channel}
4480 \cvarg{vely}{Vertical component of the optical flow of the same size \texttt{velx}, 32-bit floating-point, single-channel}
4483 The function \texttt{CalcOpticalFlowBM} calculates the optical
4484 flow for overlapped blocks $\texttt{block\_size.width} \times
4485 \texttt{block\_size.height}$ pixels each, thus the velocity
4486 fields are smaller than the original images. For every block
4487 in \texttt{prev} the functions tries to find a similar block in
4488 \texttt{curr} in some neighborhood of the original block or shifted by
4489 (velx(x0,y0),vely(x0,y0)) block as has been calculated by previous
4490 function call (if \texttt{use\_previous=1})
4492 \cvfunc{CalcOpticalFlowPyrLK}\label{CalcOpticalFlowPyrLK}
4494 Calculates the optical flow for a sparse feature set using the iterative Lucas-Kanade method with pyramids.
4497 void cvCalcOpticalFlowPyrLK( \par const CvArr* prev,\par const CvArr* curr,\par CvArr* prev\_pyr,\par CvArr* curr\_pyr,\par const CvPoint2D32f* prev\_features,\par CvPoint2D32f* curr\_features,\par int count,\par CvSize win\_size,\par int level,\par char* status,\par float* track\_error,\par CvTermCriteria criteria,\par int flags );
4499 CalcOpticalFlowPyrLK( prev, curr, prev\_pyr, curr\_pyr, prev\_features, CvSize win\_size, int level, criteria, flags, guesses = None) -> (curr\_features, status, track\_error)
4503 \cvarg{prev}{First frame, at time \texttt{t}}
4504 \cvarg{curr}{Second frame, at time \texttt{t + dt} }
4505 \cvarg{prev\_pyr}{Buffer for the pyramid for the first frame. If the pointer is not \texttt{NULL} , the buffer must have a sufficient size to store the pyramid from level \texttt{1} to level \texttt{level} ; the total size of \texttt{(image\_width+8)*image\_height/3} bytes is sufficient}
4506 \cvarg{curr\_pyr}{Similar to \texttt{prev\_pyr}, used for the second frame}
4507 \cvarg{prev\_features}{Array of points for which the flow needs to be found}
4508 \cvarg{curr\_features}{Array of 2D points containing the calculated new positions of the input features in the second image}
4509 \cvarg{count}{Number of feature points}
4510 \cvarg{win\_size}{Size of the search window of each pyramid level}
4511 \cvarg{level}{Maximal pyramid level number. If \texttt{0} , pyramids are not used (single level), if \texttt{1} , two levels are used, etc}
4512 \cvarg{status}{Array. Every element of the array is set to \texttt{1} if the flow for the corresponding feature has been found, \texttt{0} otherwise}
4513 \cvarg{error}{Array of double numbers containing the difference between patches around the original and moved points. Optional parameter; can be \texttt{NULL }}
4514 \cvarg{criteria}{Specifies when the iteration process of finding the flow for each point on each pyramid level should be stopped}
4515 \cvarg{flags}{Miscellaneous flags:
4517 \cvarg{CV\_LKFLOW\_PYR\_A\_READY}{pyramid for the first frame is precalculated before the call}
4518 \cvarg{CV\_LKFLOW\_PYR\_B\_READY}{ pyramid for the second frame is precalculated before the call}
4519 \cvC{\cvarg{CV\_LKFLOW\_INITIAL\_GUESSES}{array B contains initial coordinates of features before the function call}}
4521 \cvPy{\cvarg{guesses}{optional array of estimated coordinates of features in second frame, with same length as \texttt{prev_features}}}
4524 The function \texttt{cvCalcOpticalFlowPyrLK} implements the sparse iterative version of the Lucas-Kanade optical flow in pyramids
4526 . It calculates the coordinates of the feature points on the current video
4527 frame given their coordinates on the previous frame. The function finds
4528 the coordinates with sub-pixel accuracy.
4530 Both parameters \texttt{prev\_pyr} and \texttt{curr\_pyr} comply with the
4531 following rules: if the image pointer is 0, the function allocates the
4532 buffer internally, calculates the pyramid, and releases the buffer after
4533 processing. Otherwise, the function calculates the pyramid and stores
4534 it in the buffer unless the flag \texttt{CV\_LKFLOW\_PYR\_A[B]\_READY}
4535 is set. The image should be large enough to fit the Gaussian pyramid
4536 data. After the function call both pyramids are calculated and the
4537 readiness flag for the corresponding image can be set in the next call
4538 (i.e., typically, for all the image pairs except the very first one
4539 \texttt{CV\_LKFLOW\_PYR\_A\_READY} is set).
4542 \subsection{Estimators}
4544 \cvfunc{CvKalman}\label{CvKalman}
4546 Kalman filter state.
4548 'changequote(KLAK,KLOK)
4550 typedef struct CvKalman
4552 int MP; /* number of measurement vector dimensions */
4553 int DP; /* number of state vector dimensions */
4554 int CP; /* number of control vector dimensions */
4556 /* backward compatibility fields */
4558 float* PosterState; /* =state_pre->data.fl */
4559 float* PriorState; /* =state_post->data.fl */
4560 float* DynamMatr; /* =transition_matrix->data.fl */
4561 float* MeasurementMatr; /* =measurement_matrix->data.fl */
4562 float* MNCovariance; /* =measurement_noise_cov->data.fl */
4563 float* PNCovariance; /* =process_noise_cov->data.fl */
4564 float* KalmGainMatr; /* =gain->data.fl */
4565 float* PriorErrorCovariance;/* =error_cov_pre->data.fl */
4566 float* PosterErrorCovariance;/* =error_cov_post->data.fl */
4567 float* Temp1; /* temp1->data.fl */
4568 float* Temp2; /* temp2->data.fl */
4571 CvMat* state_pre; /* predicted state (x'(k)):
4572 x(k)=A*x(k-1)+B*u(k) */
4573 CvMat* state_post; /* corrected state (x(k)):
4574 x(k)=x'(k)+K(k)*(z(k)-H*x'(k)) */
4575 CvMat* transition_matrix; /* state transition matrix (A) */
4576 CvMat* control_matrix; /* control matrix (B)
4577 (it is not used if there is no control)*/
4578 CvMat* measurement_matrix; /* measurement matrix (H) */
4579 CvMat* process_noise_cov; /* process noise covariance matrix (Q) */
4580 CvMat* measurement_noise_cov; /* measurement noise covariance matrix (R) */
4581 CvMat* error_cov_pre; /* priori error estimate covariance matrix (P'(k)):
4582 P'(k)=A*P(k-1)*At + Q*/
4583 CvMat* gain; /* Kalman gain matrix (K(k)):
4584 K(k)=P'(k)*Ht*inv(H*P'(k)*Ht+R)*/
4585 CvMat* error_cov_post; /* posteriori error estimate covariance matrix (P(k)):
4586 P(k)=(I-K(k)*H)*P'(k) */
4587 CvMat* temp1; /* temporary matrices */
4597 The structure \texttt{CvKalman} is used to keep the Kalman filter
4598 state. It is created by the \cross{CreateKalman} function, updated
4599 by the \cross{KalmanPredict} and \cross{KalmanCorrect} functions
4600 and released by the \cross{ReleaseKalman} function. Normally, the
4601 structure is used for the standard Kalman filter (notation and the
4602 formulas below are borrowed from the excellent Kalman tutorial
4606 x_k=A \cdot x_{k-1}+B \cdot u_k+w_k
4614 x_k (x{k-1}) & \text{state of the system at the moment $k$ $(k-1)$ }\\
4615 z_k & \text{measurement of the system state at the moment $k$}\\
4616 u_k & \text{external control applied at the moment $k$}
4620 $w_k$ and $v_k$ are normally-distributed process and measurement noise, respectively:
4631 $Q$ process noise covariance matrix, constant or variable,
4633 $R$ measurement noise covariance matrix, constant or variable
4635 In the case of the standard Kalman filter, all of the matrices: A, B, H, Q and R are initialized once after the \cross{CvKalman} structure is allocated via \cross{CreateKalman}. However, the same structure and the same functions may be used to simulate the extended Kalman filter by linearizing the extended Kalman filter equation in the current system state neighborhood, in this case A, B, H (and, probably, Q and R) should be updated on every step.
4637 \cvfunc{CreateKalman}\label{CreateKalman}
4639 Allocates the Kalman filter structure.
4642 CvKalman* cvCreateKalman( \par int dynam\_params,\par int measure\_params,\par int control\_params=0 );
4646 \cvarg{dynam\_params}{dimensionality of the state vector}
4647 \cvarg{measure\_params}{dimensionality of the measurement vector}
4648 \cvarg{control\_params}{dimensionality of the control vector}
4651 The function \texttt{cvCreateKalman} allocates \cross{CvKalman} and all its matrices and initializes them somehow.
4653 \cvfunc{ReleaseKalman}\label{ReleaseKalman}
4655 Deallocates the Kalman filter structure.
4658 void cvReleaseKalman( \par CvKalman** kalman );
4662 \cvarg{kalman}{double pointer to the Kalman filter structure}
4665 The function \texttt{cvReleaseKalman} releases the structure \cross{CvKalman} and all of the underlying matrices.
4667 \cvfunc{KalmanPredict}\label{KalmanPredict}
4669 Estimates the subsequent model state.
4672 const CvMat* cvKalmanPredict( \par CvKalman* kalman, \par const CvMat* control=NULL );
4675 #define cvKalmanUpdateByTime cvKalmanPredict
4679 \cvarg{kalman}{Kalman filter state}
4680 \cvarg{control}{Control vector $u_k$, should be NULL iff there is no external control (\texttt{control\_params} =0)}
4683 The function \texttt{cvKalmanPredict} estimates the subsequent stochastic model state by its current state and stores it at \texttt{kalman->state\_pre}:
4687 x'_k=A \cdot x_k+B \cdot u_k\\
4688 P'_k=A \cdot P_{k-1}+A^T + Q
4694 \begin{tabular}{l p{5 in}}
4695 $x'_k$ & is predicted state \texttt{kalman->state\_pre},\\
4696 $x_{k-1}$ & is corrected state on the previous step \texttt{kalman->state\_post}
4697 (should be initialized somehow in the beginning, zero vector by default),\\
4698 $u_k$ & is external control (\texttt{control} parameter),\\
4699 $P'_k$ & is priori error covariance matrix \texttt{kalman->error\_cov\_pre}\\
4700 $P_{k-1}$ & is posteriori error covariance matrix on the previous step \texttt{kalman->error\_cov\_post}
4701 (should be initialized somehow in the beginning, identity matrix by default),
4704 The function returns the estimated state.
4706 \cvfunc{KalmanCorrect}\label{KalmanCorrect}
4708 Adjusts the model state.
4711 const CvMat* cvKalmanCorrect( CvKalman* kalman, const CvMat* measurement );
4715 #define cvKalmanUpdateByMeasurement cvKalmanCorrect
4719 \cvarg{kalman}{Pointer to the structure to be updated}
4720 \cvarg{measurement}{Pointer to the structure CvMat containing the measurement vector}
4723 The function \texttt{cvKalmanCorrect} adjusts the stochastic model state on the basis of the given measurement of the model state:
4727 K_k=P'_k \cdot H^T \cdot (H \cdot P'_k \cdot H^T+R)^{-1}\\
4728 x_k=x'_k+K_k \cdot (z_k-H \cdot x'_k)\\
4729 P_k=(I-K_k \cdot H) \cdot P'_k
4735 \begin{tabular}{l p{4 in}}
4736 $z_k$ & given measurement (\texttt{mesurement} parameter)\\
4737 $K_k$ & Kalman "gain" matrix.
4740 The function stores the adjusted state at \texttt{kalman->state\_post} and returns it on output.
4742 \cvfunc{Example. Using Kalman filter to track a rotating point}
4745 #include "highgui.h"
4748 int main(int argc, char** argv)
4751 const float A[] = { 1, 1, 0, 1 };
4753 IplImage* img = cvCreateImage( cvSize(500,500), 8, 3 );
4754 CvKalman* kalman = cvCreateKalman( 2, 1, 0 );
4755 /* state is (phi, delta_phi) - angle and angle increment */
4756 CvMat* state = cvCreateMat( 2, 1, CV_32FC1 );
4757 CvMat* process_noise = cvCreateMat( 2, 1, CV_32FC1 );
4758 /* only phi (angle) is measured */
4759 CvMat* measurement = cvCreateMat( 1, 1, CV_32FC1 );
4763 cvRandInit( &rng, 0, 1, -1, CV_RAND_UNI );
4765 cvZero( measurement );
4766 cvNamedWindow( "Kalman", 1 );
4770 cvRandSetRange( &rng, 0, 0.1, 0 );
4771 rng.disttype = CV_RAND_NORMAL;
4773 cvRand( &rng, state );
4775 memcpy( kalman->transition_matrix->data.fl, A, sizeof(A));
4776 cvSetIdentity( kalman->measurement_matrix, cvRealScalar(1) );
4777 cvSetIdentity( kalman->process_noise_cov, cvRealScalar(1e-5) );
4778 cvSetIdentity( kalman->measurement_noise_cov, cvRealScalar(1e-1) );
4779 cvSetIdentity( kalman->error_cov_post, cvRealScalar(1));
4780 /* choose random initial state */
4781 cvRand( &rng, kalman->state_post );
4783 rng.disttype = CV_RAND_NORMAL;
4787 #define calc_point(angle) \
4788 cvPoint( cvRound(img->width/2 + img->width/3*cos(angle)), \
4789 cvRound(img->height/2 - img->width/3*sin(angle)))
4791 float state_angle = state->data.fl[0];
4792 CvPoint state_pt = calc_point(state_angle);
4794 /* predict point position */
4795 const CvMat* prediction = cvKalmanPredict( kalman, 0 );
4796 float predict_angle = prediction->data.fl[0];
4797 CvPoint predict_pt = calc_point(predict_angle);
4798 float measurement_angle;
4799 CvPoint measurement_pt;
4801 cvRandSetRange( &rng,
4803 sqrt(kalman->measurement_noise_cov->data.fl[0]),
4805 cvRand( &rng, measurement );
4807 /* generate measurement */
4808 cvMatMulAdd( kalman->measurement_matrix, state, measurement, measurement );
4810 measurement_angle = measurement->data.fl[0];
4811 measurement_pt = calc_point(measurement_angle);
4814 #define draw_cross( center, color, d ) \
4815 cvLine( img, cvPoint( center.x - d, center.y - d ), \
4816 cvPoint( center.x + d, center.y + d ), \
4818 cvLine( img, cvPoint( center.x + d, center.y - d ), \
4819 cvPoint( center.x - d, center.y + d ), \
4823 draw_cross( state_pt, CV_RGB(255,255,255), 3 );
4824 draw_cross( measurement_pt, CV_RGB(255,0,0), 3 );
4825 draw_cross( predict_pt, CV_RGB(0,255,0), 3 );
4826 cvLine( img, state_pt, predict_pt, CV_RGB(255,255,0), 3, 0 );
4828 /* adjust Kalman filter state */
4829 cvKalmanCorrect( kalman, measurement );
4831 cvRandSetRange( &rng,
4833 sqrt(kalman->process_noise_cov->data.fl[0]),
4835 cvRand( &rng, process_noise );
4836 cvMatMulAdd( kalman->transition_matrix,
4841 cvShowImage( "Kalman", img );
4842 code = cvWaitKey( 100 );
4844 if( code > 0 ) /* break current simulation by pressing a key */
4847 if( code == 27 ) /* exit by ESCAPE */
4855 \cvfunc{CvConDensation}\label{CvConDensation}
4857 ConDenstation state.
4860 typedef struct CvConDensation
4862 int MP; //Dimension of measurement vector
4863 int DP; // Dimension of state vector
4864 float* DynamMatr; // Matrix of the linear Dynamics system
4865 float* State; // Vector of State
4866 int SamplesNum; // Number of the Samples
4867 float** flSamples; // array of the Sample Vectors
4868 float** flNewSamples; // temporary array of the Sample Vectors
4869 float* flConfidence; // Confidence for each Sample
4870 float* flCumulative; // Cumulative confidence
4871 float* Temp; // Temporary vector
4872 float* RandomSample; // RandomVector to update sample set
4873 CvRandState* RandS; // Array of structures to generate random vectors
4877 The structure \texttt{CvConDensation} stores the CONditional DENSity propagATION tracker state. The information about the algorithm can be found at \url{http://www.dai.ed.ac.uk/CVonline/LOCAL\_COPIES/ISARD1/condensation.html}.
4879 \cvfunc{CreateConDensation}\label{CreateConDensation}
4881 Allocates the ConDensation filter structure.
4884 CvConDensation* cvCreateConDensation( \par int dynam\_params,\par int measure\_params,\par int sample\_count );
4888 \cvarg{dynam\_params}{Dimension of the state vector}
4889 \cvarg{measure\_params}{Dimension of the measurement vector}
4890 \cvarg{sample\_count}{Number of samples}
4893 The function \texttt{cvCreateConDensation} creates a \texttt{CvConDensation} structure and returns a pointer to the structure.
4895 \cvfunc{ReleaseConDensation}\label{ReleaseConDensation}
4897 Deallocates the ConDensation filter structure.
4900 void cvReleaseConDensation( CvConDensation** condens );
4904 \cvarg{condens}{Pointer to the pointer to the structure to be released}
4907 The function \texttt{cvReleaseConDensation} releases the structure \cross{CvConDensation}) and frees all memory previously allocated for the structure.
4909 \cvfunc{ConDensInitSampleSet}\label{ConDensInitSampleSet}
4911 Initializes the sample set for the ConDensation algorithm.
4914 void cvConDensInitSampleSet( CvConDensation* condens, \par CvMat* lower\_bound, \par CvMat* upper\_bound );
4918 \cvarg{condens}{Pointer to a structure to be initialized}
4919 \cvarg{lower\_bound}{Vector of the lower boundary for each dimension}
4920 \cvarg{upper\_bound}{Vector of the upper boundary for each dimension}
4923 The function \texttt{cvConDensInitSampleSet} fills the samples arrays in the structure \cross{CvConDensation} with values within the specified ranges.
4925 \cvfunc{ConDensUpdateByTime}\label{ConDensUpdateByTime}
4927 Estimates the subsequent model state.
4930 void cvConDensUpdateByTime( \par CvConDensation* condens );
4934 \cvarg{condens}{Pointer to the structure to be updated}
4937 The function \texttt{cvConDensUpdateByTime} estimates the subsequent stochastic model state from its current state.
4940 \section{Pattern Recognition}
4942 \subsection{Object Detection}
4944 The object detector described below has been initially proposed by Paul Viola
4946 and improved by Rainer Lienhart
4948 . First, a classifier (namely a \emph{cascade of boosted classifiers working with haar-like features}) is trained with a few hundred sample views of a particular object (i.e., a face or a car), called positive examples, that are scaled to the same size (say, 20x20), and negative examples - arbitrary images of the same size.
4950 After a classifier is trained, it can be applied to a region of interest
4951 (of the same size as used during the training) in an input image. The
4952 classifier outputs a "1" if the region is likely to show the object
4953 (i.e., face/car), and "0" otherwise. To search for the object in the
4954 whole image one can move the search window across the image and check
4955 every location using the classifier. The classifier is designed so that
4956 it can be easily "resized" in order to be able to find the objects of
4957 interest at different sizes, which is more efficient than resizing the
4958 image itself. So, to find an object of an unknown size in the image the
4959 scan procedure should be done several times at different scales.
4961 The word "cascade" in the classifier name means that the resultant
4962 classifier consists of several simpler classifiers (\emph{stages}) that
4963 are applied subsequently to a region of interest until at some stage the
4964 candidate is rejected or all the stages are passed. The word "boosted"
4965 means that the classifiers at every stage of the cascade are complex
4966 themselves and they are built out of basic classifiers using one of four
4967 different \texttt{boosting} techniques (weighted voting). Currently
4968 Discrete Adaboost, Real Adaboost, Gentle Adaboost and Logitboost are
4969 supported. The basic classifiers are decision-tree classifiers with at
4970 least 2 leaves. Haar-like features are the input to the basic classifers,
4971 and are calculated as described below. The current algorithm uses the
4972 following Haar-like features:
4974 \includegraphics[width=0.5\textwidth]{pics/haarfeatures.png}
4976 The feature used in a particular classifier is specified by its shape (1a, 2b etc.), position within the region of interest and the scale (this scale is not the same as the scale used at the detection stage, though these two scales are multiplied). For example, in the case of the third line feature (2c) the response is calculated as the difference between the sum of image pixels under the rectangle covering the whole feature (including the two white stripes and the black stripe in the middle) and the sum of the image pixels under the black stripe multiplied by 3 in order to compensate for the differences in the size of areas. The sums of pixel values over a rectangular regions are calculated rapidly using integral images (see below and the \cross{Integral} description).
4979 A simple demonstration of face detection, which draws a rectangle around each detected face:
4983 hc = cv.Load("haarcascade_frontalface_default.xml")
4984 img = cv.LoadImage("faces.jpg", 0)
4985 faces = cv.HaarDetectObjects(img, hc, cv.CreateMemStorage())
4986 for (x,y,w,h),n in faces:
4987 cv.Rectangle(img, (x,y), (x+w,y+h), 255)
4988 cv.SaveImage("faces_detected.jpg", img)
4995 To see the object detector at work, have a look at the HaarFaceDetect demo.
4997 The following reference is for the detection part only. There
4998 is a separate application called \texttt{haartraining} that can
4999 train a cascade of boosted classifiers from a set of samples. See
5000 \texttt{opencv/apps/haartraining} for details.
5002 \cvfunc{CvHaarFeature, CvHaarClassifier, CvHaarStageClassifier, CvHaarClassifierCascade}
5003 \label{CvHaarFeature}
5004 \label{CvHaarClassifier}
5005 \label{CvHaarStageClassifier}
5006 \label{CvHaarClassifierCascade}
5008 Boosted Haar classifier structures.
5011 #define CV_HAAR_FEATURE_MAX 3
5013 /* a haar feature consists of 2-3 rectangles with appropriate weights */
5014 typedef struct CvHaarFeature
5016 int tilted; /* 0 means up-right feature, 1 means 45--rotated feature */
5018 /* 2-3 rectangles with weights of opposite signs and
5019 with absolute values inversely proportional to the areas of the
5020 rectangles. If rect[2].weight !=0, then
5021 the feature consists of 3 rectangles, otherwise it consists of 2 */
5026 } rect[CV_HAAR_FEATURE_MAX];
5030 /* a single tree classifier (stump in the simplest case) that returns the
5031 response for the feature at the particular image location (i.e. pixel
5032 sum over subrectangles of the window) and gives out a value depending
5034 typedef struct CvHaarClassifier
5036 int count; /* number of nodes in the decision tree */
5038 /* these are "parallel" arrays. Every index \texttt{i}
5039 corresponds to a node of the decision tree (root has 0-th index).
5041 left[i] - index of the left child (or negated index if the
5042 left child is a leaf)
5043 right[i] - index of the right child (or negated index if the
5044 right child is a leaf)
5045 threshold[i] - branch threshold. if feature responce is <= threshold,
5046 left branch is chosen, otherwise right branch is chosen.
5047 alpha[i] - output value correponding to the leaf. */
5048 CvHaarFeature* haar_feature;
5056 /* a boosted battery of classifiers(=stage classifier):
5057 the stage classifier returns 1
5058 if the sum of the classifiers responses
5059 is greater than \texttt{threshold} and 0 otherwise */
5060 typedef struct CvHaarStageClassifier
5062 int count; /* number of classifiers in the battery */
5063 float threshold; /* threshold for the boosted classifier */
5064 CvHaarClassifier* classifier; /* array of classifiers */
5066 /* these fields are used for organizing trees of stage classifiers,
5067 rather than just stright cascades */
5072 CvHaarStageClassifier;
5074 typedef struct CvHidHaarClassifierCascade CvHidHaarClassifierCascade;
5076 /* cascade or tree of stage classifiers */
5077 typedef struct CvHaarClassifierCascade
5079 int flags; /* signature */
5080 int count; /* number of stages */
5081 CvSize orig_window_size; /* original object size (the cascade is
5084 /* these two parameters are set by cvSetImagesForHaarClassifierCascade */
5085 CvSize real_window_size; /* current object size */
5086 double scale; /* current scale */
5087 CvHaarStageClassifier* stage_classifier; /* array of stage classifiers */
5088 CvHidHaarClassifierCascade* hid_cascade; /* hidden optimized
5089 representation of the
5091 cvSetImagesForHaarClassifierCascade */
5093 CvHaarClassifierCascade;
5096 All the structures are used for representing a cascaded of boosted Haar classifiers. The cascade has the following hierarchical structure:
5113 The whole hierarchy can be constructed manually or loaded from a file or an embedded base using the function \cross{LoadHaarClassifierCascade}.
5116 \cvfunc{LoadHaarClassifierCascade}\label{LoadHaarClassifierCascade}
5118 Loads a trained cascade classifier from a file or the classifier database embedded in OpenCV.
5121 CvHaarClassifierCascade* cvLoadHaarClassifierCascade( \par const char* directory,\par CvSize orig\_window\_size );
5125 \cvarg{directory}{Name of the directory containing the description of a trained cascade classifier}
5126 \cvarg{orig\_window\_size}{Original size of the objects the cascade has been trained on. Note that it is not stored in the cascade and therefore must be specified separately}
5129 The function \texttt{cvLoadHaarClassifierCascade} loads a trained cascade
5130 of haar classifiers from a file or the classifier database embedded in
5131 OpenCV. The base can be trained using the \texttt{haartraining} application
5132 (see opencv/apps/haartraining for details).
5134 \textbf{The function is obsolete}. Nowadays object detection classifiers are stored in XML or YAML files, rather than in directories. To load a cascade from a file, use the \cross{Load} function.
5136 \cvfunc{ReleaseHaarClassifierCascade}\label{ReleaseHaarClassifierCascade}
5138 Releases the haar classifier cascade.
5141 void cvReleaseHaarClassifierCascade( \par CvHaarClassifierCascade** cascade );
5145 \cvarg{cascade}{Double pointer to the released cascade. The pointer is cleared by the function}
5148 The function \texttt{cvReleaseHaarClassifierCascade} deallocates the cascade that has been created manually or loaded using \cross{LoadHaarClassifierCascade} or \cross{Load}.
5150 \cvfunc{HaarDetectObjects}\label{HaarDetectObjects}
5152 Detects objects in the image.
5155 typedef struct CvAvgComp
5157 CvRect rect; /* bounding rectangle for the object (average rectangle of a group) */
5158 int neighbors; /* number of neighbor rectangles in the group */
5166 CvSeq* cvHaarDetectObjects( \par const CvArr* image,\par CvHaarClassifierCascade* cascade,\par CvMemStorage* storage,\par double scale\_factor=1.1,\par int min\_neighbors=3,\par int flags=0,\par CvSize min\_size=cvSize(0,\par0) );
5167 }{CPP}{HaarDetectObjects(image,cascade,storage,scale\_factor=1.1,min\_neighbors=3,flags=0,min\_size=(0,0))-> detected\_objects}
5170 \cvarg{image}{Image to detect objects in}
5171 \cvarg{cascade}{Haar classifier cascade in internal representation}
5172 \cvarg{storage}{Memory storage to store the resultant sequence of the object candidate rectangles}
5173 \cvarg{scale\_factor}{The factor by which the search window is scaled between the subsequent scans, 1.1 means increasing window by 10\% }
5174 \cvarg{min\_neighbors}{Minimum number (minus 1) of neighbor rectangles that makes up an object. All the groups of a smaller number of rectangles than \texttt{min\_neighbors}-1 are rejected. If \texttt{min\_neighbors} is 0, the function does not any grouping at all and returns all the detected candidate rectangles, which may be useful if the user wants to apply a customized grouping procedure}
5175 \cvarg{flags}{Mode of operation. Currently the only flag that may be specified is \texttt{CV\_HAAR\_DO\_CANNY\_PRUNING}. If it is set, the function uses Canny edge detector to reject some image regions that contain too few or too much edges and thus can not contain the searched object. The particular threshold values are tuned for face detection and in this case the pruning speeds up the processing}
5176 \cvarg{min\_size}{Minimum window size. By default, it is set to the size of samples the classifier has been trained on ($\sim 20\times 20$ for face detection)}
5179 The function \texttt{cvHaarDetectObjects} finds rectangular regions in the given image that are likely to contain objects the cascade has been trained for and returns those regions as a sequence of rectangles. The function scans the image several times at different scales (see \cross{SetImagesForHaarClassifierCascade}). Each time it considers overlapping regions in the image and applies the classifiers to the regions using \cross{RunHaarClassifierCascade}. It may also apply some heuristics to reduce number of analyzed regions, such as Canny prunning. After it has proceeded and collected the candidate rectangles (regions that passed the classifier cascade), it groups them and returns a sequence of average rectangles for each large enough group. The default parameters (\texttt{scale\_factor} =1.1, \texttt{min\_neighbors} =3, \texttt{flags} =0) are tuned for accurate yet slow object detection. For a faster operation on real video images the settings are: \texttt{scale\_factor} =1.2, \texttt{min\_neighbors} =2, \texttt{flags} =\texttt{CV\_HAAR\_DO\_CANNY\_PRUNING}, \texttt{min\_size} =\textit{minimum possible face size} (for example, $\sim$ 1/4 to 1/16 of the image area in the case of video conferencing).
5182 % ===== Example. Using cascade of Haar classifiers to find objects (e.g. faces). =====
5185 #include "highgui.h"
5187 CvHaarClassifierCascade* load_object_detector( const char* cascade_path )
5189 return (CvHaarClassifierCascade*)cvLoad( cascade_path );
5192 void detect_and_draw_objects( IplImage* image,
5193 CvHaarClassifierCascade* cascade,
5196 IplImage* small_image = image;
5197 CvMemStorage* storage = cvCreateMemStorage(0);
5201 /* if the flag is specified, down-scale the input image to get a
5202 performance boost w/o loosing quality (perhaps) */
5205 small_image = cvCreateImage( cvSize(image->width/2,image->height/2), IPL_DEPTH_8U, 3 );
5206 cvPyrDown( image, small_image, CV_GAUSSIAN_5x5 );
5210 /* use the fastest variant */
5211 faces = cvHaarDetectObjects( small_image, cascade, storage, 1.2, 2, CV_HAAR_DO_CANNY_PRUNING );
5213 /* draw all the rectangles */
5214 for( i = 0; i < faces->total; i++ )
5216 /* extract the rectanlges only */
5217 CvRect face_rect = *(CvRect*)cvGetSeqElem( faces, i, 0 );
5218 cvRectangle( image, cvPoint(face_rect.x*scale,face_rect.y*scale),
5219 cvPoint((face_rect.x+face_rect.width)*scale,
5220 (face_rect.y+face_rect.height)*scale),
5221 CV_RGB(255,0,0), 3 );
5224 if( small_image != image )
5225 cvReleaseImage( &small_image );
5226 cvReleaseMemStorage( &storage );
5229 /* takes image filename and cascade path from the command line */
5230 int main( int argc, char** argv )
5233 if( argc==3 && (image = cvLoadImage( argv[1], 1 )) != 0 )
5235 CvHaarClassifierCascade* cascade = load_object_detector(argv[2]);
5236 detect_and_draw_objects( image, cascade, 1 );
5237 cvNamedWindow( "test", 0 );
5238 cvShowImage( "test", image );
5240 cvReleaseHaarClassifierCascade( &cascade );
5241 cvReleaseImage( &image );
5248 \cvfunc{SetImagesForHaarClassifierCascade}\label{SetImagesForHaarClassifierCascade}
5250 Assigns images to the hidden cascade.
5253 void cvSetImagesForHaarClassifierCascade( \par CvHaarClassifierCascade* cascade,\par const CvArr* sum,\par const CvArr* sqsum,\par const CvArr* tilted\_sum,\par double scale );
5257 \cvarg{cascade}{Hidden Haar classifier cascade, created by \cross{CreateHidHaarClassifierCascade}}
5258 \cvarg{sum}{Integral (sum) single-channel image of 32-bit integer format. This image as well as the two subsequent images are used for fast feature evaluation and brightness/contrast normalization. They all can be retrieved from input 8-bit or floating point single-channel image using the function \cross{Integral}}
5259 \cvarg{sqsum}{Square sum single-channel image of 64-bit floating-point format}
5260 \cvarg{tilted\_sum}{Tilted sum single-channel image of 32-bit integer format}
5261 \cvarg{scale}{Window scale for the cascade. If \texttt{scale} =1, the original window size is used (objects of that size are searched) - the same size as specified in \cross{LoadHaarClassifierCascade} (24x24 in the case of \texttt{default\_face\_cascade}), if \texttt{scale} =2, a two times larger window is used (48x48 in the case of default face cascade). While this will speed-up search about four times, faces smaller than 48x48 cannot be detected}
5264 The function \texttt{cvSetImagesForHaarClassifierCascade} assigns images and/or window scale to the hidden classifier cascade. If image pointers are NULL, the previously set images are used further (i.e. NULLs mean "do not change images"). Scale parameter has no such a "protection" value, but the previous value can be retrieved by the \cross{GetHaarClassifierCascadeScale} function and reused again. The function is used to prepare cascade for detecting object of the particular size in the particular image. The function is called internally by \cross{HaarDetectObjects}, but it can be called by the user if they are using the lower-level function \cross{RunHaarClassifierCascade}.
5266 \cvfunc{RunHaarClassifierCascade}\label{RunHaarClassifierCascade}
5268 Runs a cascade of boosted classifiers at the given image location.
5271 int cvRunHaarClassifierCascade( \par CvHaarClassifierCascade* cascade,\par CvPoint pt,\par int start\_stage=0 );
5275 \cvarg{cascade}{Haar classifier cascade}
5276 \cvarg{pt}{Top-left corner of the analyzed region. Size of the region is a original window size scaled by the currenly set scale. The current window size may be retrieved using the \cross{GetHaarClassifierCascadeWindowSize} function}
5277 \cvarg{start\_stage}{Initial zero-based index of the cascade stage to start from. The function assumes that all the previous stages are passed. This feature is used internally by \cross{HaarDetectObjects} for better processor cache utilization}
5280 The function \texttt{cvRunHaarHaarClassifierCascade} runs the Haar classifier
5281 cascade at a single image location. Before using this function the
5282 integral images and the appropriate scale (window size) should be set
5283 using \cross{SetImagesForHaarClassifierCascade}. The function returns
5284 a positive value if the analyzed rectangle passed all the classifier stages
5285 (it is a candidate) and a zero or negative value otherwise.
5289 \section{Camera Calibration and 3D Reconstruction}
5291 \subsection{Pinhole Camera Model, Distortion}
5293 The functions in this section use the so-called pinhole camera model. That
5294 is, a scene view is formed by projecting 3D points into the image plane
5295 using a perspective transformation.
5298 s \quad m' = A [R|t] M'
5304 s \vecthree{u}{v}{1} = \vecthreethree
5309 r_{11} & r_{12} & r{13} & t_1 \\
5310 r_{21} & r_{22} & r{23} & t_2 \\
5311 r_{31} & r_{32} & r{33} & t_3
5313 \begin{bmatrix}X\\Y\\Z\\1 \end{bmatrix}
5316 Where $(X, Y, Z)$ are the coordinates of a 3D point in the world
5317 coordinate space, $(u, v)$ are the coordinates of the projection point
5318 in pixels. $A$ is called a camera matrix, or a matrix of
5319 intrinsic parameters. $(cx, cy)$ is a principal point (that is
5320 usually at the image center), and $fx, fy$ are the focal lengths
5321 expressed in pixel-related units. Thus, if an image from camera is
5322 scaled by some factor, all of these parameters should
5323 be scaled (multiplied/divided, respectively) by the same factor. The
5324 matrix of intrinsic parameters does not depend on the scene viewed and,
5325 once estimated, can be re-used (as long as the focal length is fixed (in
5326 case of zoom lens)). The joint rotation-translation matrix $[R|t]$
5327 is called a matrix of extrinsic parameters. It is used to describe the
5328 camera motion around a static scene, or vice versa, rigid motion of an
5329 object in front of still camera. That is, $[R|t]$ translates
5330 coordinates of a point $(X, Y, Z)$ to some coordinate system,
5331 fixed with respect to the camera. The transformation above is equivalent
5332 to the following (when $z \ne 0$):
5336 \vecthree{x}{y}{z} = R \vecthree{X}{Y}{Z} + t\\
5344 Real lenses usually have some distortion, mostly
5345 radial distorion and slight tangential distortion. So, the above model
5350 \vecthree{x}{y}{z} = R \vecthree{X}{Y}{Z} + t\\
5353 x'' = x' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x' y' + p_2(r^2 + 2 x'^2) \\
5354 y'' = y' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' \\
5355 \text{where} \quad r^2 = x'^2 + y'^2 \\
5361 $k_1$, $k_2$, $k_3$ are radial distortion coefficients, $p_1$, $p_2$ are tangential distortion coefficients.
5362 Higher-order coefficients are not considered in OpenCV.
5363 The distortion coefficients also do not depend on the scene viewed, thus they are intrinsic camera parameters.
5364 \emph{And they remain the same regardless of the captured image resolution.}
5365 That is, if, for example, a camera has been calibrated on images of $320
5366 \times 240$ resolution, absolutely the same distortion coefficients can
5367 be used for images of $640 \times 480$ resolution from the same camera (while $fx$,
5368 $fy$, $cx$ and $cy$ need to be scaled appropriately).
5370 The functions below use the above model to
5373 \item Project 3D points to the image plane given intrinsic and extrinsic parameters
5374 \item Compute extrinsic parameters given intrinsic parameters, a few 3D points and their projections.
5375 \item Estimate intrinsic and extrinsic camera parameters from several views of a known calibration pattern (i.e. every view is described by several 3D-2D point correspodences).
5378 \subsection{Camera Calibration}
5380 \cvfunc{ProjectPoints2}\label{ProjectPoints2}
5382 Projects 3D points on to an image plane.
5385 void cvProjectPoints2( \par const CvMat* object\_points,\par const CvMat* rotation\_vector,\par const CvMat* translation\_vector,\par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs,\par CvMat* image\_points,\par CvMat* dpdrot=NULL,\par CvMat* dpdt=NULL,\par CvMat* dpdf=NULL,\par CvMat* dpdc=NULL,\par CvMat* dpddist=NULL );
5386 }{CPP}{ProjectPoints2(object\_points,rotation\_vector,translation\_vector,intrinsic\_matrix,distortion\_coeffs, image\_points,dpdrot=NULL,dpdt=NULL,dpdf=NULL,dpdc=NULL,dpddist=NULL)-> None}
5389 \cvarg{object\_points}{The array of object points, 3xN or Nx3, where N is the number of points in the view}
5390 \cvarg{rotation\_vector}{The rotation vector, 1x3 or 3x1}
5391 \cvarg{translation\_vector}{The translation vector, 1x3 or 3x1}
5392 \cvarg{intrinsic\_matrix}{The camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5393 \cvarg{distortion\_coeffs}{The vector of distortion coefficients, 4x1 or 1x4 $k_1, k_2, k_3, k_4$. If it is \texttt{NULL}, all of the distortion coefficients are considered 0's}
5394 \cvarg{image\_points}{The output array of image points, 2xN or Nx2, where N is the total number of points in the view}
5395 \cvarg{dpdrot}{Optional Nx3 matrix of derivatives of image points with respect to components of the rotation vector}
5396 \cvarg{dpdt}{Optional Nx3 matrix of derivatives of image points with respect to components of the translation vector}
5397 \cvarg{dpdf}{Optional Nx2 matrix of derivatives of image points with respect to $fx$ and $fy$}
5398 \cvarg{dpdc}{Optional Nx2 matrix of derivatives of image points with respect to $cx$ and $cy$}
5399 \cvarg{dpddist}{Optional Nx4 matrix of derivatives of image points with respect to distortion coefficients}
5402 The function \texttt{cvProjectPoints2} computes projections of 3D
5403 points to the image plane given intrinsic and extrinsic camera
5404 parameters. Optionally, the function computes jacobians - matrices
5405 of partial derivatives of image points as functions of all the
5406 input parameters with respect to the particular parameters, intrinsic and/or
5407 extrinsic. The jacobians are used during the global optimization
5408 in \cross{CalibrateCamera2} and
5409 \cross{FindExtrinsicCameraParams2}. The
5410 function itself is also used to compute back-projection error for with
5411 current intrinsic and extrinsic parameters.
5413 Note, that with intrinsic and/or extrinsic parameters set to special
5414 values, the function can be used to compute just an extrinsic transformation
5415 or just an intrinsic transformation (i.e. distortion of a sparse set
5418 \cvfunc{FindHomography}\label{FindHomography}
5420 Finds the perspective transformation between two planes.
5423 void cvFindHomography( \par const CvMat* src\_points,\par const CvMat* dst\_points,\par CvMat* homography \par
5424 int method=0, \par double ransacReprojThreshold=0, \par CvMat* mask=NULL);
5425 }{CPP}{FindHomography(src\_points,dst\_points)-> homography}
5428 \cvarg{src\_points}{Point coordinates in the original plane, 2xN, Nx2, 3xN or Nx3 array (the latter two are for representation in homogenious coordinates), where N is the number of points}
5429 \cvarg{dst\_points}{Point coordinates in the destination plane, 2xN, Nx2, 3xN or Nx3 array (the latter two are for representation in homogenious coordinates)}
5430 \cvarg{homography}{Output 3x3 homography matrix}
5431 \cvarg{method}{ The method used to computed homography matrix; one of the following:
5433 \cvarg{0}{regular method using all the point pairs}
5434 \cvarg{CV\_RANSAC}{RANSAC-based robust method}
5435 \cvarg{CV\_LMEDS}{Least-Median robust method}
5437 \cvarg{ransacReprojThreshold}{The maximum allowed reprojection error to treat a point pair as an inlier. The parameter is only used in RANSAC-based homography estimation. E.g. if \texttt{dst\_points} coordinates are measured in pixels with pixel-accurate precision, it makes sense to set this parameter somewhere in the range 1 to 3. }
5438 \cvarg{mask}{The optional output mask set by a robust method (\texttt{CV\_RANSAC} or \texttt{CV\_LMEDS}).}
5441 The function \texttt{cvFindHomography} finds the perspective transformation $H$ between the source and the destination planes:
5444 s_i \vecthree{x'_i}{y'_i}{1} \sim H \vecthree{x_i}{y_i}{1}
5447 So that the back-projection error is minimized:
5451 \left( x'_i-\frac{h_{11} x_i + h_{12} y_i + h_{13}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2+
5452 \left( y'_i-\frac{h_{21} x_i + h_{22} y_i + h_{23}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2
5455 If the parameter method is set to the default value 0, the function
5456 uses all the point pairs and estimates the best suitable homography
5457 matrix. However, if not all of the point pairs ($src\_points_i$,
5458 $dst\_points_i$) fit the rigid perspective transformation (i.e. there
5459 can be outliers), it is still possible to estimate the correct
5460 transformation using one of the robust methods available. Both
5461 methods, \texttt{CV\_RANSAC} and \texttt{CV\_LMEDS}, try many different random subsets
5462 of the corresponding point pairs (of 5 pairs each), estimate
5463 the homography matrix using this subset and a simple least-square
5464 algorithm and then compute the quality/goodness of the computed homography
5465 (which is the number of inliers for RANSAC or the median reprojection
5466 error for LMeDs). The best subset is then used to produce the initial
5467 estimate of the homography matrix and the mask of inliers/outliers.
5469 Regardless of the method, robust or not, the computed homography
5470 matrix is refined further (using inliers only in the case of a robust
5471 method) with the Levenberg-Marquardt method in order to reduce the
5472 reprojection error even more.
5474 The method \texttt{CV\_RANSAC} can handle practically any ratio of outliers,
5475 but it needs the threshold to distinguish inliers from outliers.
5476 The method \texttt{CV\_LMEDS} does not need any threshold, but it works
5477 correctly only when there are more than 50\% of inliers. Finally,
5478 if you are sure in the computed features and there can be only some
5479 small noise, but no outliers, the default method could be the best
5482 The function is used to find initial intrinsic and extrinsic matrices.
5483 Homography matrix is determined up to a scale, thus it is normalized
5484 to make $h_{33} =1$.
5486 \cvfunc{CalibrateCamera2}\label{CalibrateCamera2}
5488 Finds the intrinsic and extrinsic camera parameters using a calibration pattern.
5491 void cvCalibrateCamera2( \par const CvMat* object\_points,\par const CvMat* image\_points,\par const CvMat* point\_counts,\par CvSize image\_size,\par CvMat* intrinsic\_matrix,\par CvMat* distortion\_coeffs,\par CvMat* rotation\_vectors=NULL,\par CvMat* translation\_vectors=NULL,\par int flags=0 );
5492 }{CPP}{CalibrateCamera2(object\_points,image\_points,point\_counts,image\_size,intrinsic\_matrix,distortion\_coeffs,rotation\_vectors,translation\_vectors,flags=0)-> None}
5495 \cvarg{object\_points}{The joint matrix of object points, 3xN or Nx3, where N is the total number of points in all views}
5496 \cvarg{image\_points}{The joint matrix of corresponding image points, 2xN or Nx2, where N is the total number of points in all views}
5497 \cvarg{point\_counts}{Vector containing the number of points in each particular view, 1xM or Mx1, where M is the number of points in a scene}
5498 \cvarg{image\_size}{Size of the image, used only to initialize the intrinsic camera matrix}
5499 \cvarg{intrinsic\_matrix}{The output camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $. If \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS} and/or \texttt{CV\_CALIB\_FIX\_ASPECT\_RATION} are specified, some or all of \texttt{fx, fy, cx, cy} must be initialized}
5500 \cvarg{distortion\_coeffs}{The output 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$}
5501 \cvarg{rotation\_vectors}{The output 3xM or Mx3 array of rotation vectors (compact representation of rotation matrices, \cross{Rodrigues2})}
5502 \cvarg{translation\_vectors}{The output 3xM or Mx3 array of translation vectors}
5503 \cvarg{flags}{Different flags, may be 0 or combination of the following values:
5505 \cvarg{CV\_CALIB\_USE\_INTRINSIC\_GUESS}{\texttt{intrinsic\_matrix} contains the valid initial values of \texttt{fx, fy, cx, cy} that are optimized further. Otherwise, \texttt{(cx, cy)} is initially set to the image center (\texttt{image\_size} is used here), and focal distances are computed in some least-squares fashion. Note, that if intrinsic parameters are known, there is no need to use this function. Use \cross{FindExtrinsicCameraParams2} instead.}
5506 \cvarg{CV\_CALIB\_FIX\_PRINCIPAL\_POINT}{The principal point is not changed during the global optimization, it stays at the center and at the other location specified (when \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS} is set as well)}
5507 \cvarg{CV\_CALIB\_FIX\_ASPECT\_RATIO}{The optimization procedure considers only one of \texttt{fx} and \texttt{fy} as independent variables and keeps the aspect ratio \texttt{fx/fy} the same as it was set initially in \texttt{intrinsic\_matrix}. In this case the actual initial values of \texttt{(fx, fy)} are either taken from the matrix (when \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS} is set) or estimated somehow (in the latter case \texttt{fx, fy} may be set to arbitrary values, only their ratio is used).}
5508 \cvarg{CV\_CALIB\_ZERO\_TANGENT\_DIST}{Tangential distortion coefficients are set to zeros and do not change during the optimization.}}
5512 The function \texttt{cvCalibrateCamera2} estimates the intrinsic camera
5513 parameters and extrinsic parameters for each of the views. The
5514 coordinates of 3D object points and their correspondent 2D projections
5515 in each view must be specified. That may be achieved by using an
5516 object with known geometry and easily detectable feature points.
5517 Such an object is called a calibration rig or calibration pattern,
5518 and OpenCV has built-in support for a chessboard as a calibration
5519 rig (see \cross{FindChessboardCornerGuesses}). Currently, initialization
5520 of intrinsic parameters (when \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS}
5521 is not set) is only implemented for planar calibration rigs
5522 (z-coordinates of object points must be all 0's or all 1's). 3D
5523 rigs can still be used as long as initial \texttt{intrinsic\_matrix}
5524 is provided. After the initial values of intrinsic and extrinsic
5525 parameters are computed, they are optimized to minimize the total
5526 back-projection error - the sum of squared differences between the
5527 actual coordinates of image points and the ones computed using
5528 \cross{ProjectPoints2}.
5530 Note: if you're using a non-square (=non-NxN) grid and
5531 \cross{FindChessboardCorners} for calibration, and cvCalibrateCamera2 returns
5532 bad values (i.e. zero distortion coefficients, an image center of
5533 (w/2-0.5,h/2-0.5), and / or large differences between $fx$ and $fy$ (ratios of
5534 10:1 or more)), then you've probaby used pattern\_size=cvSize(rows,cols),
5535 but should use pattern\_size=cvSize(cols,rows) in \cross{FindChessboardCorners}.
5537 \cvfunc{FindExtrinsicCameraParams2}\label{FindExtrinsicCameraParams2}
5539 Finds the extrinsic camera parameters for a particular view.
5542 void cvFindExtrinsicCameraParams2( \par const CvMat* object\_points,\par const CvMat* image\_points,\par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs,\par CvMat* rotation\_vector,\par CvMat* translation\_vector );
5543 }{CPP}{FindExtrinsicCameraParams2(object\_points,image\_points,intrinsic\_matrix,distortion\_coeffs,rotation\_vector,translation\_vector)-> None}
5546 \cvarg{object\_points}{The array of object points, 3xN or Nx3, where N is the number of points in the view}
5547 \cvarg{image\_points}{The array of corresponding image points, 2xN or Nx2, where N is the number of points in the view}
5548 \cvarg{intrinsic\_matrix}{The input camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5549 \cvarg{distortion\_coeffs}{The input 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$. If it is NULL, all of the distortion coefficients are set to 0}
5550 \cvarg{rotation\_vector}{The output 3x1 or 1x3 rotation vector (compact representation of a rotation matrix, \cross{Rodrigues2}}
5551 \cvarg{translation\_vector}{The output 3x1 or 1x3 translation vector}
5554 The function \texttt{cvFindExtrinsicCameraParams2} estimates the extrinsic camera parameters using known intrinsic parameters and extrinsic parameters for each view. The coordinates of 3D object points and their correspondent 2D projections must be specified. This function also minimizes back-projection error.
5556 \cvfunc{StereoCalibrate}
5558 Calibrates stereo camera.
5562 void cvStereoCalibrate( \par const CvMat* object\_points, \par const CvMat* image\_points1,
5563 \par const CvMat* image\_points2, \par const CvMat* point\_counts,
5564 \par CvMat* camera\_matrix1, \par CvMat* dist\_coeffs1,
5565 \par CvMat* camera\_matrix2, \par CvMat* dist\_coeffs2,
5566 \par CvSize image\_size, \par CvMat* R, \par CvMat* T,
5567 \par CvMat* E=0, \par CvMat* F=0,
5568 \par CvTermCriteria term\_crit=cvTermCriteria(
5569 \par CV\_TERMCRIT\_ITER+CV\_TERMCRIT\_EPS,30,1e-6),
5570 \par int flags=CV\_CALIB\_FIX\_INTRINSIC );
5572 }{CPP}{StereoCalibrate(\par object\_points,\par image\_points1,\par image\_points2,\par point\_counts,\par camera\_matrix1,\par dist\_coeffs1,\par camera\_matrix2,\par dist\_coeffs2,\par image\_size,\par R,\par T,\par E=NULL,\par F=NULL,\par term\_crit=cvTermCriteria(CV\_TERMCRIT\_ITER+CV\_TERMCRIT\_EPS,30,1e-6),\par flags=CV\_CALIB\_FIX\_INTRINSIC)-> None}
5575 \cvarg{object\_points}{The joint matrix of object points, 3xN or Nx3, where N is the total number of points in all views.}
5576 \cvarg{image\_points1}{The joint matrix of corresponding image points in the views from the 1st camera, 2xN or Nx2, where N is the total number of points in all views.}
5577 \cvarg{image\_points2}{The joint matrix of corresponding image points in the views from the 2nd camera, 2xN or Nx2, where N is the total number of points in all views.}
5578 \cvarg{point\_counts}{Vector containing numbers of points in each view, 1xM or Mx1, where M is the number of views.}
5579 \cvarg{camera\_matrix1, camera\_matrix2}{The input/output camera matrices [${fx}_k 0 {cx}_k; 0 {fy}_k {cy}_k; 0 0 1$]. If \texttt{CV\_CALIB\_USE\_INTRINSIC\_GUESS} or \texttt{CV\_CALIB\_FIX\_ASPECT\_RATIO} are specified, some or all of the elements of the matrices must be initialized.}
5580 \cvarg{dist\_coeffs1, dist\_coeffs2}{The input/output vectors of distortion coefficients for each camera, \href{\#Pinhole Camera Model, Distortion}{4x1, 1x4, 5x1 or 1x5.}}
5581 \cvarg{image\_size}{Size of the image, used only to initialize intrinsic camera matrix.}
5582 \cvarg{R}{The rotation matrix between the 1st and the 2nd cameras' coordinate systems.}
5583 \cvarg{T}{The translation vector between the cameras' coordinate systems.}
5584 \cvarg{E}{The optional output essential matrix.}
5585 \cvarg{F}{The optional output fundamental matrix.}
5586 \cvarg{term\_crit}{Termination criteria for the iterative optimiziation algorithm.}
5587 \cvarg{flags}{Different flags, may be 0 or combination of the following values:
5589 \cvarg{CV\_CALIB\_FIX\_INTRINSIC}{If it is set, \texttt{camera\_matrix1,2}, as well as \texttt{dist\_coeffs1,2} are fixed, so that only extrinsic parameters are optimized.}
5590 \cvarg{CV\_CALIB\_USE\_INTRINSIC\_GUESS}{The flag allows the function to optimize some or all of the intrinsic parameters, depending on the other flags, but the initial values are provided by the user.}
5591 \cvarg{CV\_CALIB\_FIX\_PRINCIPAL\_POINT}{The principal points are fixed during the optimization.}
5592 \cvarg{CV\_CALIB\_FIX\_FOCAL\_LENGTH}{${fx}_k$ and ${fy}_k$ are fixed.}
5593 \cvarg{CV\_CALIB\_FIX\_ASPECT\_RATIO}{${fy}_k$ is optimized, but the ratio ${fx}_k/{fy}_k$ is fixed.}
5594 \cvarg{CV\_CALIB\_SAME\_FOCAL\_LENGTH}{Enforces ${fx}_0={fx}_1$ and ${fy}_0={fy}_1$. \texttt{CV\_CALIB\_ZERO\_TANGENT\_DIST} - Tangential distortion coefficients for each camera are set to zeros and fixed there.}
5595 \cvarg{CV\_CALIB\_FIX\_K1}{The 0-th distortion coefficients (k1) are fixed.}
5596 \cvarg{CV\_CALIB\_FIX\_K2}{The 1-st distortion coefficients (k2) are fixed.}
5597 \cvarg{CV\_CALIB\_FIX\_K3}{The 4-th distortion coefficients (k3) are fixed.}
5601 The function \texttt{cvStereoCalibrate} estimates transformation between the 2 cameras making a stereo pair. If we have a stereo camera, where the relative position and orientatation of the 2 cameras is fixed, and if we computed poses of an object relative to the fist camera and to the second camera, (R1, T1) and (R2, T2), respectively (that can be done with \cross{cvFindExtrinsicCameraParams2}), obviously, those poses will relate to each other, i.e. given ($R_1$, $T_1$) it should be possible to compute ($R_2$, $T_2$) - we only need to know the position and orientation of the 2nd camera relative to the 1st camera. That's what the described function does. It computes ($R$, $T$) such that:
5608 Optionally, it computes the essential matrix E:
5619 where $T_i$ are components of the translation vector $T$: $T=[T_0, T_1, T_2]^T$. And also the function can compute the fundamental matrix F:
5621 $F = inv(camera\_matrix2)^T*E*inv(camera\_matrix1)$
5623 Besides the stereo-related information, the function can also perform full calibration of each of the 2 cameras. However, because of the high dimensionality of the parameter space and noise in the input data the function can diverge from the correct solution. Thus, if intrinsic parameters can be estimated with high accuracy for each of the cameras individually (e.g. using \cross{cvCalibrateCamera2}), it is recommended to do so and then pass \texttt{CV\_CALIB\_FIX\_INTRINSIC} flag to the function along with the computed intrinsic parameters. Otherwise, if all the parameters are estimated at once, it makes sense to restrict some parameters, e.g. pass \texttt{CV\_CALIB\_SAME\_FOCAL\_LENGTH} and \texttt{CV\_CALIB\_ZERO\_TANGENT\_DIST} flags, which are usually reasonable assumptions.
5625 \cvfunc{StereoRectify}
5627 Computes rectification transform for stereo camera.
5631 void cvStereoRectify( \par const CvMat* camera\_matrix1, \par const CvMat* camera\_matrix2,
5632 \par const CvMat* dist\_coeffs1, \par const CvMat* dist\_coeffs2,
5633 \par CvSize image\_size, \par const CvMat* R, \par const CvMat* T,
5634 \par CvMat* R1, \par CvMat* R2, \par CvMat* P1, \par CvMat* P2,
5635 \par CvMat* Q=0, \par int flags=CV\_CALIB\_ZERO\_DISPARITY );
5637 }{CPP}{StereoRectify(\par camera\_matrix1,\par camera\_matrix2,\par dist\_coeffs1,\par dist\_coeffs2,\par image\_size,\par R,\par T,\par R1,\par R2,\par P1,\par P2,\par Q=NULL,\par flags=CV\_CALIB\_ZERO\_DISPARITY)-> None}
5640 \cvarg{camera\_matrix1, camera\_matrix2}{The camera matrices [${fx}_k$ 0 ${cx}_k$; 0 ${fy}_k$ ${cy}_k$; 0 0 1].}
5641 \cvarg{dist\_coeffs1, dist\_coeffs2}{The vectors of distortion coefficients for each camera, \href{\#Pinhole Camera Model, Distortion}{4x1, 1x4, 5x1 or 1x5.}}
5642 \cvarg{image\_size}{Size of the image used for stereo calibration.}
5643 \cvarg{R}{The rotation matrix between the 1st and the 2nd cameras' coordinate systems.}
5644 \cvarg{T}{The translation vector between the cameras' coordinate systems.}
5645 \cvarg{R1, R2}{3x3 Rectification transforms (rotation matrices) for the first and the second cameras, respectively.}
5646 \cvarg{P1, P2}{3x4 Projection matrices in the new (rectified) coordinate systems.}
5647 \cvarg{Q}{The optional output disparity-to-depth mapping matrix, 4x4, see \cross{cvReprojectImageTo3D}.}
5648 \cvarg{flags}{The operation flags; may be 0 or \texttt{CV\_CALIB\_ZERO\_DISPARITY}. If the flag is set, the function makes the principal points of each camera have the same pixel coordinates in the rectified views. And if the flag is not set, the function can shift one of the image in horizontal or vertical direction (depending on the orientation of epipolar lines) in order to maximise the useful image area. }
5651 The function \texttt{cvStereoRectify} computes the rotation matrices for each camera that (virtually) make both camera image planes the same plane. Consequently, that makes all the epipolar lines parallel and thus simplifies the dense stereo correspondence problem. On input the function takes the matrices computed by \cross{cvStereoCalibrate} and on output it gives 2 rotation matrices and also 2 projection matrices in the new coordinates. The function is normally called after \cross{cvStereoCalibrate} that computes both camera matrices, the distortion coefficients, R and T. The 2 cases are distinguished by the function:
5654 \item{Horizontal stereo, when 1st and 2nd camera views are shifted relative to each other mainly along the x axis (with possible small vertical shift). Then in the rectified images the corresponding epipolar lines in left and right cameras will be horizontal and have the same y-coordinate. P1 and P2 will look as:
5667 f & 0 & cx2 & Tx*f\\
5674 where $T_x$ is horizontal shift between the cameras and cx1=cx2 if \texttt{CV\_CALIB\_ZERO\_DISPARITY} is set.}
5675 \item{Vertical stereo, when 1st and 2nd camera views are shifted relative to each other mainly in vertical direction (and probably a bit in the horizontal direction too). Then the epipolar lines in the rectified images will be vertical and have the same x coordinate. P2 and P2 will look as:
5689 0 & f & cy2 & Ty*f\\
5695 where $T_y$ is vertical shift between the cameras and cy1=cy2 if \texttt{CV\_CALIB\_ZERO\_DISPARITY} is set.}
5698 As you can see, the first 3 columns of P1 and P2 will effectively be the new "rectified" camera matrices.
5700 \cvfunc{StereoRectifyUncalibrated}
5702 Computes rectification transform for uncalibrated stereo camera.
5706 void cvStereoRectifyUncalibrated( \par const CvMat* points1, \par const CvMat* points2,
5707 \par const CvMat* F, \par CvSize image\_size,
5708 \par CvMat* H1, \par CvMat* H2,
5709 \par double threshold=5 );
5711 }{CPP}{StereoRectifyUncalibrated(points1,points2,F,image\_size,H1,H2,threshold=5)-> None}
5714 \cvarg{points1, points2}{The 2 arrays of corresponding 2D points.}
5715 \cvarg{F}{Fundamental matrix. It can be computed using the same set of point pairs points1 and points2 using \cross{cvFindFundamentalMat}.}
5716 \cvarg{image\_size}{Size of the image.}
5717 \cvarg{H1, H2}{The rectification homography matrices for the first and for the second images.}
5718 \cvarg{threshold}{Optional threshold used to filter out the outliers. If the parameter is greater than zero, then all the point pairs that do not comply the epipolar geometry well enough (that is, the points for which $fabs(points2[i]^T*F*points1[i])>threshold$) are rejected prior to computing the homographies. }
5721 The function \texttt{cvStereoRectifyUncalibrated} computes the rectification transformations without knowing intrinsic parameters of the cameras and their relative position in space, hence the suffix "Uncalibrated". Another related difference from \cross{cvStereoRectify} is that the function outputs not the rectification transformations in the object (3D) space, but the planar perspective transformations, encoded by the homography matrices H1 and H2. The function implements the following algorithm \href{\#Hartly99}{[Hartley99]}.
5723 Note that while the algorithm does not need to know the intrinsic parameters of the cameras, it heavily depends on the epipolar geometry. Therefore, if the camera lenses have significant distortion, it would better be corrected before computing the fundamental matrix and calling this function. For example, distortion coefficients can be estimated for each head of stereo camera separately by using \cross{cvCalibrateCamera2} and then the images can be corrected using \cross{cvUndistort2}.
5725 \cvfunc{Rodrigues2}\label{Rodrigues2}
5727 Converts a rotation matrix to a rotation vector or vice versa.
5730 int cvRodrigues2( \par const CvMat* src,\par CvMat* dst,\par CvMat* jacobian=0 );
5731 }{CPP}{Rodrigues2(src,dst,jacobian=0)-> None}
5734 \cvarg{src}{The input rotation vector (3x1 or 1x3) or rotation matrix (3x3)}
5735 \cvarg{dst}{The output rotation matrix (3x3) or rotation vector (3x1 or 1x3), respectively}
5736 \cvarg{jacobian}{Optional output Jacobian matrix, 3x9 or 9x3 - partial derivatives of the output array components with respect to the input array components}
5739 The function \texttt{cvRodrigues2} converts a rotation vector to a rotation matrix or vice versa. A rotation vector is a compact representation of rotation matrix. Direction of the rotation vector is the rotation axis and the length of the vector is the rotation angle around the axis. The rotation matrix $R$, corresponding to the rotation vector $r$, is computed as following:
5743 \theta \leftarrow norm(r)\\
5744 r \leftarrow r/\theta\\
5745 R = \cos{\theta} I + (1-\cos{\theta}) r r^T + \sin{\theta}
5753 Inverse transformation can also be done easily as
5765 A rotation vector is a convenient representation of a rotation matrix
5766 as a matrix with only 3 degrees of freedom. The representation is
5767 used in the global optimization procedures inside
5768 \cross{FindExtrinsicCameraParams2}
5769 and \cross{CalibrateCamera2}.
5771 \cvfunc{Undistort2}\label{Undistort2}
5773 Transforms an image to compensate for lens distortion.
5776 void cvUndistort2( \par const CvArr* src,\par CvArr* dst,\par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs );
5777 }{CPP}{Undistort2(src,dst,intrinsic\_matrix,distortion\_coeffs)-> None}
5780 \cvarg{src}{The input (distorted) image}
5781 \cvarg{dst}{The output (corrected) image}
5782 \cvarg{intrinsic\_matrix}{The camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5783 \cvarg{distortion\_coeffs}{The 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$.}
5786 The function \texttt{cvUndistort2} transforms the image to compensate
5787 radial and tangential lens distortion. The camera matrix and
5788 distortion parameters can be determined using
5789 \cross{CalibrateCamera2}. For every
5790 pixel in the output image the function computes the coordinates of the
5791 corresponding location in the input image using the formulas in the
5792 section beginning. Then, the pixel value is computed using bilinear
5793 interpolation. If the resolution of images is different from what
5794 was used at the calibration stage, $fx, fy, cx$ and $cy$
5795 need to be adjusted appropriately, while the distortion coefficients
5798 \cvfunc{InitUndistortMap}\label{InitUndistortMap}
5800 Computes an undistortion map.
5803 void cvInitUndistortMap( \par const CvMat* intrinsic\_matrix,\par const CvMat* distortion\_coeffs,\par CvArr* mapx,\par CvArr* mapy );
5804 }{CPP}{InitUndistortMap(camera\_matrix,distortion\_coeffs,mapx,mapy)-> None}
5807 \cvarg{intrinsic\_matrix}{The output camera matrix $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1} $}
5808 \cvarg{distortion\_coeffs}{The output 4x1 or 1x4 vector of distortion coefficients $k_1, k_2, k_3, k_4$.}
5809 \cvarg{mapx}{The output array of x-coordinates of the map}
5810 \cvarg{mapy}{The output array of y-coordinates of the map}
5813 The function \texttt{cvInitUndistortMap} pre-computes the undistortion map - coordinates of the corresponding pixel in the distorted image for every pixel in the corrected image. Then, the map (together with input and output images) can be passed to the \cross{Remap} function.
5815 \cvfunc{InitUndistortRectifyMap}
5817 Computes the undistortion and rectification transformation map of a head of a stereo camera.
5821 void cvInitUndistortRectifyMap( \par const CvMat* camera\_matrix,
5822 \par const CvMat* dist\_coeffs,
5823 \par const CvMat* R,
5824 \par const CvMat* new\_camera\_matrix,
5825 \par CvArr* mapx, /par CvArr* mapy );
5827 }{CPP}{InitUndistortRectifyMap(camera\_matrix,dist\_coeffs,R,new\_camera\_matrix,mapx,mapy)-> None}
5830 \cvarg{camera\_matrix}{The camera matrix $A=[fx 0 cx; 0 fy cy; 0 0 1]$}
5831 \cvarg{dist\_coeffs}{The vector of distortion coefficients, \cross{4x1, 1x4, 5x1 or 1x5}}
5832 \cvarg{R}{The rectification transformation in object space (3x3 matrix). R1 or R2, computed by \cross{StereoRectify} can be passed here. If the parameter is NULL, the identity matrix is used}
5833 \cvarg{new\_camera\_matrix}{The new camera matrix $A'=[fx' 0 cx'; 0 fy' cy'; 0 0 1]$}
5834 \cvarg{mapx}{The output array of x-coordinates of the map}
5835 \cvarg{mapy}{The output array of y-coordinates of the map}
5838 The function \texttt{InitUndistortRectifyMap} is an extended version of \cross{InitUndistortMap}. That is, in addition to the correction of lens distortion, the function can also apply arbitrary perspective transformation R and finally it can scale and shift the image according to the new camera matrix. That is, in pseudo code the transformation can be represented as:
5841 // (u,v) is the input point,
5842 // camera_matrix=[fx 0 cx; 0 fy cy; 0 0 1]
5843 // new_camera_matrix=[fx' 0 cx'; 0 fy' cy'; 0 0 1]
5846 [X,Y,W]T = R-1*[x y 1]T
5848 x" = x'*(1 + k1r2 + k2r4 + k3r6) + 2*p1x'*y' + p2(r2+2*x'2)
5849 y" = y'*(1 + k1r2 + k2r4 + k3r6) + p1(r2+2*y'2) + 2*p2*x'*y'
5850 mapx(u,v) = x"*fx + cx
5851 mapy(u,v) = y"*fy + cy
5854 Note that the code above does the reverse transformation from the target image (i.e. the ideal one, after undistortion and rectification) to the original "raw" image straight from the camera. That's for bilinear interpolation purposes and in order to fill the whole destination image w/o gaps using \cross{Remap}.\
5856 Normally, this function is called [twice, once for each head of stereo camera] after \cross{StereoRectify}. But it is also possible to compute the rectification transformations directly from the fundamental matrix, e.g. by using \cross{StereoRectifyUncalibrated}. Such functions work with pixels and produce homographies as rectification transformations, not rotation matrices R in 3D space. In this case, the R can be computed from the homography matrix \texttt{H} as
5859 R = inv(camera_matrix)*H*camera_matrix
5862 \cvfunc{UndistortPoints}
5864 Computes the ideal point coordinates from the observed point coordinates.
5868 void cvUndistortPoints( \par const CvMat* src, \par CvMat* dst,
5869 \par const CvMat* camera\_matrix,
5870 \par const CvMat* dist\_coeffs,
5871 \par const CvMat* R=NULL,
5872 \par const CvMat* P=NULL);
5874 }{CPP}{UndistortPoints(src,dst,camera\_matrix,dist\_coeffs,R=NULL,P=NULL)-> None}
5877 \cvarg{src}{The observed point coordinates}
5878 \cvarg{dst}{The ideal point coordinates, after undistortion and reverse perspective transformation}
5879 \cvarg{camera\_matrix}{The camera matrix $A=[fx 0 cx; 0 fy cy; 0 0 1]$}
5880 \cvarg{dist\_coeffs}{he vector of distortion coefficients, \cross{4x1, 1x4, 5x1 or 1x5}}
5881 \cvarg{R}{The rectification transformation in object space (3x3 matrix). \texttt{R1} or \texttt{R2}, computed by \cross{StereoRectify} can be passed here. If the parameter is NULL, the identity matrix is used}
5882 \cvarg{P}{The new camera matrix (3x3) or the new projection matrix (3x4). \texttt{P1} or \texttt{P2}, computed by \cross{StereoRectify} can be passed here. If the parameter is NULL, the identity matrix is used}
5885 The function \texttt{UndistortPoints} is similar to \cross{InitUndistortRectifyMap} and is opposite to it at the same time. The functions are similar in that they both are used to correct lens distortion and to perform the optional perspective (rectification) transformation. They are opposite because the function \cross{InitUndistortRectifyMap} does actually perform the reverse transformation in order to initialize the maps properly, while this function does the forward transformation. That is, in pseudo-code it can be expressed as:
5888 // (u,v) is the input point, (u', v') is the output point
5889 // camera_matrix=[fx 0 cx; 0 fy cy; 0 0 1]
5890 // P=[fx' 0 cx' tx; 0 fy' cy' ty; 0 0 1 tz]
5893 (x',y') = undistort(x",y",dist_coeffs)
5894 [X,Y,W]T = R*[x' y' 1]T
5900 where undistort() is approximate iterative algorithm that estimates the normalized original point coordinates out of the normalized distorted point coordinates ("normalized" means that the coordinates do not depend on the camera matrix).
5902 The function can be used as for stereo cameras, as well as for individual cameras when R=NULL.
5904 \cvfunc{FindChessboardCorners}\label{FindChessboardCorners}
5906 Finds the positions of the internal corners of the chessboard.
5909 int cvFindChessboardCorners( \par const void* image,\par CvSize pattern\_size,\par CvPoint2D32f* corners,\par int* corner\_count=NULL,\par int flags=CV\_CALIB\_CB\_ADAPTIVE\_THRESH );
5910 }{CPP}{FindChessboardCorners(image, pattern\_size, flags=CV\_CALIB\_CB\_ADAPTIVE\_THRESH) -> corners}
5913 \cvarg{image}{Source chessboard view; it must be an 8-bit grayscale or color image}
5914 \cvarg{pattern\_size}{The number of inner corners per chessboard row and column}
5915 ( pattern\_size = cvSize(points\_per\_row,points\_per\_colum) = cvSize(columns,rows) )
5916 \cvarg{corners}{The output array of corners detected}
5917 \cvC{\cvarg{corner\_count}{The output corner counter. If it is not NULL, it stores the number of corners found}}
5918 \cvarg{flags}{Various operation flags, can be 0 or a combination of the following values:
5920 \cvarg{CV\_CALIB\_CB\_ADAPTIVE\_THRESH}{use adaptive thresholding to convert the image to black and white, rather than a fixed threshold level (computed from the average image brightness).}
5921 \cvarg{CV\_CALIB\_CB\_NORMALIZE\_IMAGE}{normalize the image using \cross{NormalizeHist} before applying fixed or adaptive thresholding.}
5922 \cvarg{CV\_CALIB\_CB\_FILTER\_QUADS}{use additional criteria (like contour area, perimeter, square-like shape) to filter out false quads that are extracted at the contour retrieval stage.}
5926 The function \texttt{cvFindChessboardCorners} attempts to determine
5927 whether the input image is a view of the chessboard pattern and
5928 locate the internal chessboard corners. The function returns a non-zero
5929 value if all of the corners have been found and they have been placed
5930 in a certain order (row by row, left to right in every row),
5931 otherwise, if the function fails to find all the corners or reorder
5932 them, it returns 0. For example, a regular chessboard has 8 x 8
5933 squares and 7 x 7 internal corners, that is, points, where the black
5934 squares touch each other. The coordinates detected are approximate,
5935 and to determine their position more accurately, the user may use
5936 the function \cross{FindCornerSubPix}.
5938 \cvfunc{DrawChessBoardCorners}\label{DrawChessBoardCorners}
5940 Renders the detected chessboard corners.
5943 void cvDrawChessboardCorners( \par CvArr* image,\par CvSize pattern\_size,\par CvPoint2D32f* corners,\par int count,\par int pattern\_was\_found );
5944 }{CPP}{DrawChessboardCorners(image,pattern\_size,corners,pattern\_was\_found)-> None}
5947 \cvarg{image}{The destination image; it must be an 8-bit color image}
5948 \cvarg{pattern\_size}{The number of inner corners per chessboard row and column. ( pattern\_size = cvSize(points\_per\_row,points\_per\_colum) = cvSize(columns,rows) )}
5949 \cvarg{corners}{The array of corners detected}
5950 \cvarg{count}{The number of corners}
5951 \cvarg{pattern\_was\_found}{Indicates whether the complete board was found $(\ne 0)$ or not $(=0)$. One may just pass the return value \cross{FindChessboardCorners} here}
5954 The function \texttt{cvDrawChessboardCorners} draws the individual chessboard corners detected as red circles if the board was not found $(\texttt{pattern\_was\_found} =0)$ or as colored corners connected with lines if the board was found $(\texttt{pattern\_was\_found} \ne 0)$.
5957 \cvfunc{RQDecomp3x3}\label{RQDecomp3x3}
5959 Computes the `RQ' decomposition of 3x3 matrices.
5962 void cvRQDecomp3x3( \par const CvMat *matrixM,\par CvMat *matrixR,\par CvMat *matrixQ,\par CvMat *matrixQx=NULL,\par CvMat *matrixQy=NULL,\par CvMat *matrixQz=NULL,\par CvPoint3D64f *eulerAngles=NULL);
5963 }{CPP}{RQDecomp3x3(matrixM, matrixR, matrixQ, matrixQx = None, matrixQy = None, matrixQz = None) -> eulerAngles}
5966 \cvarg{matrixM}{The 3x3 input matrix M}
5967 \cvarg{matrixR}{The output 3x3 upper-triangular matrix R}
5968 \cvarg{matrixQ}{The output 3x3 orthogonal matrix Q}
5969 \cvarg{matrixQx}{Optional 3x3 rotation matrix around x-axis}
5970 \cvarg{matrixQy}{Optional 3x3 rotation matrix around y-axis}
5971 \cvarg{matrixQz}{Optional 3x3 rotation matrix around z-axis}
5972 \cvarg{eulerAngles}{Optional 3 points containing the three Euler angles of rotation}
5975 The function \texttt{cvRQDecomp3x3} computes a RQ decomposition using the given rotations. This function is used in \cross{DecomposeProjectionMatrix} to decompose the left 3x3 submatrix of a projection matrix into a calibration and a rotation matrix.
5977 It optionally returns three rotation matrices, one for each axis, and the three Euler angles that could be used in OpenGL.
5980 \cvfunc{DecomposeProjectionMatrix}\label{DecomposeProjectionMatrix}
5982 Computes the `RQ' decomposition of 3x3 matrices.
5985 void cvDecomposeProjectionMatrix( \par const CvMat *projMatr,\par CvMat *calibMatr,\par CvMat *rotMatr,\par CvMat *posVect,\par CvMat *rotMatrX=NULL,\par CvMat *rotMatrY=NULL,\par CvMat *rotMatrZ=NULL,\par CvPoint3D64f *eulerAngles=NULL);
5986 }{CPP}{DecomposeProjectionMatrix(projMatr, calibMatr, rotMatr, posVect, rotMatrX = None, rotMatrY = None, rotMatrZ = None) -> eulerAngles}
5989 \cvarg{projMatr}{The 3x4 input projection matrix P}
5990 \cvarg{calibMatr}{The output 3x3 internal calibration matrix K}
5991 \cvarg{rotMatr}{The output 3x3 external rotation matrix R}
5992 \cvarg{posVect}{The output 4x1 external homogenious position vector C}
5993 \cvarg{rotMatrX}{Optional 3x3 rotation matrix around x-axis}
5994 \cvarg{rotMatrY}{Optional 3x3 rotation matrix around y-axis}
5995 \cvarg{rotMatrZ}{Optional 3x3 rotation matrix around z-axis}
5996 \cvarg{eulerAngles}{Optional 3 points containing the three Euler angles of rotation}
5999 The function \texttt{cvDecomposeProjectionMatrix} computes a decomposition of a projection matrix into a calibration and a rotation matrix and the position of the camera.
6001 It optionally returns three rotation matrices, one for each axis, and the three Euler angles that could be used in OpenGL.
6004 \subsection{Pose Estimation}
6007 \cvfunc{CreatePOSITObject}\label{CreatePOSITObject}
6009 Initializes a structure containing object information.
6012 CvPOSITObject* cvCreatePOSITObject( \par CvPoint3D32f* points,\par int point\_count );
6013 }{CPP}{CreatePOSITObject(points)-> POSITObject}
6016 \cvarg{points}{Pointer to the points of the 3D object model}
6017 \cvarg{point\_count}{Number of object points}
6020 The function \texttt{cvCreatePOSITObject} allocates memory for the object structure and computes the object inverse matrix.
6022 The preprocessed object data is stored in the structure \cross{CvPOSITObject}, internal for OpenCV, which means that the user cannot directly access the structure data. The user may only create this structure and pass its pointer to the function.
6024 An object is defined as a set of points given in a coordinate system. The function \cross{POSIT} computes a vector that begins at a camera-related coordinate system center and ends at the \texttt{points[0]} of the object.
6026 Once the work with a given object is finished, the function \cross{ReleasePOSITObject} must be called to free memory.
6028 \cvfunc{POSIT}\label{POSIT}
6030 Implements the POSIT algorithm.
6033 void cvPOSIT( \par CvPOSITObject* posit\_object,\par CvPoint2D32f* image\_points,\par double focal\_length,\par CvTermCriteria criteria,\par CvMatr32f rotation\_matrix,\par CvVect32f translation\_vector );
6034 }{CPP}{POSIT(posit\_object,image\_points,focal\_length,criteria)-> rotation\_matrix,translation\_vector}
6037 \cvarg{posit\_object}{Pointer to the object structure}
6038 \cvarg{image\_points}{Pointer to the object points projections on the 2D image plane}
6039 \cvarg{focal\_length}{Focal length of the camera used}
6040 \cvarg{criteria}{Termination criteria of the iterative POSIT algorithm}
6041 \cvarg{rotation\_matrix}{Matrix of rotations}
6042 \cvarg{translation\_vector}{Translation vector}
6045 The function \texttt{cvPOSIT} implements the POSIT algorithm. Image coordinates are given in a camera-related coordinate system. The focal length may be retrieved using the camera calibration functions. At every iteration of the algorithm a new perspective projection of the estimated pose is computed.
6047 Difference norm between two projections is the maximal distance between corresponding points. The parameter \texttt{criteria.epsilon} serves to stop the algorithm if the difference is small.
6050 \cvfunc{ReleasePOSITObject}\label{ReleasePOSITObject}
6052 Deallocates a 3D object structure.
6055 void cvReleasePOSITObject( \par CvPOSITObject** posit\_object );
6059 \cvarg{posit\_object}{Double pointer to \texttt{CvPOSIT} structure}
6062 The function \texttt{cvReleasePOSITObject} releases memory previously allocated by the function \cross{CreatePOSITObject}.
6066 \cvfunc{CalcImageHomography}\label{CalcImageHomography}
6068 Calculates the homography matrix for an oblong planar object (e.g. arm).
6071 void cvCalcImageHomography( \par float* line,\par CvPoint3D32f* center,\par float* intrinsic,\par float* homography );
6072 }{CPP}{CalcImageHomography(line,points)-> intrinsic,homography}
6075 \cvarg{line}{the main object axis direction (vector (dx,dy,dz))}
6076 \cvarg{center}{object center ((cx,cy,cz))}
6077 \cvarg{intrinsic}{intrinsic camera parameters (3x3 matrix)}
6078 \cvarg{homography}{output homography matrix (3x3)}
6081 The function \texttt{cvCalcImageHomography} calculates the homography
6082 matrix for the initial image transformation from image plane to the
6083 plane, defined by a 3D oblong object line (See \_\_Figure 6-10\_\_
6084 in the OpenCV Guide 3D Reconstruction Chapter).
6087 \subsection{Epipolar Geometry}
6089 \cvfunc{FindFundamentalMat}\label{FindFundamentalMat}
6091 Calculates the fundamental matrix from the corresponding points in two images.
6094 int cvFindFundamentalMat( \par const CvMat* points1,\par const CvMat* points2,\par CvMat* fundamental\_matrix,\par int method=CV\_FM\_RANSAC,\par double param1=1.,\par double param2=0.99,\par CvMat* status=NULL);
6095 }{CPP}{FindFundamentalMat(points1, points2, fundamental\_matrix, method=CV\_FM\_RANSAC, param1=1., double param2=0.99, status = None) -> None}
6098 \cvarg{points1}{Array of the first image points of \texttt{2xN, Nx2, 3xN} or \texttt{Nx3} size (where \texttt{N} is number of points). Multi-channel \texttt{1xN} or \texttt{Nx1} array is also acceptable. The point coordinates should be floating-point (single or double precision)}
6099 \cvarg{points2}{Array of the second image points of the same size and format as \texttt{points1}}
6100 \cvarg{fundamental\_matrix}{The output fundamental matrix or matrices. The size should be 3x3 or 9x3 (7-point method may return up to 3 matrices)}
6101 \cvarg{method}{Method for computing the fundamental matrix
6103 \cvarg{CV\_FM\_7POINT}{for a 7-point algorithm. $N = 7$}
6104 \cvarg{CV\_FM\_8POINT}{for an 8-point algorithm. $N \ge 8$}
6105 \cvarg{CV\_FM\_RANSAC}{for the RANSAC algorithm. $N \ge 8$}
6106 \cvarg{CV\_FM\_LMEDS}{for the LMedS algorithm. $N \ge 8$}
6108 \cvarg{param1}{The parameter is used for RANSAC or LMedS methods only. It is the maximum distance from point to epipolar line in pixels, beyond which the point is considered an outlier and is not used for computing the final fundamental matrix. Usually it is set to 0.5 or 1.0}
6109 \cvarg{param2}{The parameter is used for RANSAC or LMedS methods only. It denotes the desirable level of confidence that the matrix is correct}
6110 \cvarg{status}{The optional output array of N elements, every element of which is set to 0 for outliers and to 1 for the other points. The array is computed only in RANSAC and LMedS methods. For other methods it is set to 1}
6113 The epipolar geometry is described by the following equation:
6117 where $F$ is fundamental matrix, $p_1$ and $p_2$ are corresponding points in the first and the second images, respectively.
6119 The function \texttt{cvFindFundamentalMat} calculates the fundamental
6120 matrix using one of four methods listed above and returns the number
6121 of fundamental matrices found (1 or 3) and 0, if no matrix is found.
6123 The calculated fundamental matrix may be passed further to
6124 \texttt{cvComputeCorrespondEpilines} that finds the epipolar lines
6125 corresponding to the specified points.
6127 \cvfunc{Example. Estimation of fundamental matrix using RANSAC algorithm}
6129 int point_count = 100;
6133 CvMat* fundamental_matrix;
6135 points1 = cvCreateMat(1,point_count,CV_32FC2);
6136 points2 = cvCreateMat(1,point_count,CV_32FC2);
6137 status = cvCreateMat(1,point_count,CV_8UC1);
6139 /* Fill the points here ... */
6140 for( i = 0; i < point_count; i++ )
6142 points1->data.fl[i*2] = <x,,1,i,,>;
6143 points1->data.fl[i*2+1] = <y,,1,i,,>;
6144 points2->data.fl[i*2] = <x,,2,i,,>;
6145 points2->data.fl[i*2+1] = <y,,2,i,,>;
6148 fundamental_matrix = cvCreateMat(3,3,CV_32FC1);
6149 int fm_count = cvFindFundamentalMat( points1,points2,fundamental_matrix,
6150 CV_FM_RANSAC,1.0,0.99,status );
6153 \cvfunc{ComputeCorrespondEpilines}\label{ComputeCorrespondEpilines}
6155 For points in one image of a stereo pair, computes the corresponding epilines in the other image.
6158 void cvComputeCorrespondEpilines( \par const CvMat* points,\par int which\_image,\par const CvMat* fundamental\_matrix,\par CvMat* correspondent\_lines);
6159 }{CPP}{ComputeCorrespondEpilines(points, which\_image, fundamental\_matrix, correspondent\_lines) -> None}
6162 \cvarg{points}{The input points. \texttt{2xN, Nx2, 3xN} or \texttt{Nx3} array (where \texttt{N} number of points). Multi-channel \texttt{1xN} or \texttt{Nx1} array is also acceptable}
6163 \cvarg{which\_image}{Index of the image (1 or 2) that contains the \texttt{points}}
6164 \cvarg{fundamental\_matrix}{Fundamental matrix}
6165 \cvarg{correspondent\_lines}{Computed epilines, a \texttt{3xN} or \texttt{Nx3} array}
6168 For every point in one of the two images of a stereo-pair the function
6169 \texttt{ComputeCorrespondEpilines} finds the equation of a line that
6170 contains the corresponding point (i.e. projection of the same 3D
6171 point) in the other image. Each line is encoded by a vector of 3
6172 elements $l = \vecthree{a}{b}{c}$ so that:
6174 \[ l^T \vecthree{x}{y}{1} = 0 \]
6176 \[ a x + b y + c = 0 \]
6178 From the fundamental matrix definition (see \cross{FindFundamentalMatrix}
6179 discussion), line $l_1$ for a point $p_1$ in the first image
6180 $(\texttt{which\_image} =1)$ can be computed as:
6184 and the line $l_1$ for a point $p_2$ in the second image $(\texttt{which\_image} =1)$ can be computed as:
6188 Line coefficients are defined up to a scale. They are normalized $(a^2+b^2=1)$ are stored into \texttt{correspondent\_lines}.
6190 \cvfunc{ConvertPointsHomogenious}\label{ConvertPointsHomogenious}
6192 Convert points to/from homogenious coordinates.
6195 void cvConvertPointsHomogenious( \par const CvMat* src,\par CvMat* dst );
6199 \cvarg{src}{The input point array, \texttt{2xN, Nx2, 3xN, Nx3, 4xN or Nx4 (where \texttt{N} is the number of points)}. Multi-channel \texttt{1xN} or \texttt{Nx1} array is also acceptable}
6200 \cvarg{dst}{The output point array, must contain the same number of points as the input; The dimensionality must be the same, 1 less or 1 more than the input, and also within 2 to 4}
6203 The function \texttt{cvConvertPointsHomogenious} converts 2D or 3D points from/to homogenious coordinates, or simply copies or transposes the array. If the input array dimensionality is larger than the output, each coordinate is divided by the last coordinate:
6207 (x,y[,z],w) -> (x',y'[,z'])\\
6211 z' = z/w \quad \text{(if output is 3D)}
6215 If the output array dimensionality is larger, an extra 1 is appended to each point. Otherwise, the input array is simply copied (with optional tranposition) to the output.
6217 \textbf{Note} because the function accepts a large variety of array layouts, it may report an error when input/output array dimensionality is ambiguous. It is always safe to use the function with number of points $\texttt{N} \ge 5$, or to use multi-channel \texttt{Nx1} or \texttt{1xN} arrays.
6219 \cvfunc{CvStereoBMState}
6221 The structure for block matching stereo correspondence algorithm.
6224 typedef struct CvStereoBMState
6226 //pre filters (normalize input images):
6227 int preFilterType; // 0 for now
6228 int preFilterSize; // ~5x5..21x21
6229 int preFilterCap; // up to ~31
6230 //correspondence using Sum of Absolute Difference (SAD):
6231 int SADWindowSize; // Could be 5x5..21x21
6232 int minDisparity; // minimum disparity (=0)
6233 int numberOfDisparities; // maximum disparity - minimum disparity
6234 //post filters (knock out bad matches):
6235 int textureThreshold; // areas with no texture are ignored
6236 float uniquenessRatio;// filter out pixels if there are other close matches
6237 // with different disparity
6238 int speckleWindowSize;// Disparity variation window (not used)
6239 int speckleRange; // Acceptable range of variation in window (not used)
6240 // internal buffers, do not modify (!)
6241 CvMat* preFilteredImg0;
6242 CvMat* preFilteredImg1;
6243 CvMat* slidingSumBuf;
6248 The block matching stereo correspondence algorithm, by Kurt Konolige, is very fast one-pass stereo matching algorithm that uses sliding sums of absolute differences between pixels in the left image and the pixels in the right image, shifted by some varying amount of pixels (from \texttt{minDisparity} to \texttt{minDisparity+numberOfDisparities}). On a pair of images WxH the algorithm computes disparity in \texttt{O(W*H*numberOfDisparities)} time. In order to improve quality and reability of the disparity map, the algorithm includes pre-filtering and post-filtering procedures.
6250 Note that the algorithm searches for the corresponding blocks in x direction only. It means that the supplied stereo pair should be rectified. Vertical stereo layout is not directly supported, but in such a case the images could be transposed by user.
6252 \cvfunc{CreateStereoBMState}
6254 Creates block matching stereo correspondence structure.
6257 #define CV_STEREO_BM_BASIC 0
6258 #define CV_STEREO_BM_FISH_EYE 1
6259 #define CV_STEREO_BM_NARROW 2
6264 CvStereoBMState* cvCreateStereoBMState( int preset=CV\_STEREO\_BM\_BASIC,
6265 int numberOfDisparities=0 );
6267 }{CPP}{CreateStereoBMState(preset=CV\_STEREO\_BM\_BASIC,numberOfDisparities=0)-> StereoBMState}
6270 \cvarg{preset}{ID of one of the pre-defined parameter sets. Any of the parameters can be overridden after creating the structure.}
6271 \cvarg{numberOfDisparities}{The number of disparities. If the parameter is 0, it is taken from the preset, otherwise the supplied value overrides the one from preset.}
6274 The function \texttt{cvCreateStereoBMState} creates the stereo correspondence structure and initializes it. It is possible to override any of the parameters at any time between the calls to \cross{cvFindStereoCorrespondenceBM}.
6276 \cvfunc{ReleaseStereoBMState}
6278 Releases block matching stereo correspondence structure.
6282 void cvReleaseStereoBMState( CvStereoBMState** state );
6284 }{CPP}{ReleaseStereoBMState(state)-> None}
6287 \cvarg{state}{Double pointer to the released structure.}
6290 The function \texttt{cvReleaseStereoBMState} releases the stereo correspondence structure and all the associated internal buffers.
6292 \cvfunc{FindStereoCorrespondenceBM}
6294 Computes the disparity map using block matching algorithm.
6298 void cvFindStereoCorrespondenceBM( \par const CvArr* left, \par const CvArr* right,
6299 \par CvArr* disparity, \par CvStereoBMState* state );
6301 }{CPP}{FindStereoCorrespondenceBM(left,right,disparity,state)-> None}
6304 \cvarg{left}{The left single-channel, 8-bit image.}
6305 \cvarg{right}{The right image of the same size and the same type.}
6306 \cvarg{disparity}{The output single-channel 16-bit signed disparity map of the same size as input images. Its elements will be the computed disparities, multiplied by 16 and rounded to integers.}
6307 \cvarg{state}{Stereo correspondence structure.}
6310 The function cvFindStereoCorrespondenceBM computes disparity map for the input rectified stereo pair.
6312 \cvfunc{CvStereoGCState}
6314 The structure for graph cuts-based stereo correspondence algorithm
6317 typedef struct CvStereoGCState
6319 int Ithreshold; // threshold for piece-wise linear data cost function (5 by default)
6320 int interactionRadius; // radius for smoothness cost function (1 by default; means Potts model)
6321 float K, lambda, lambda1, lambda2; // parameters for the cost function
6322 // (usually computed adaptively from the input data)
6323 int occlusionCost; // 10000 by default
6324 int minDisparity; // 0 by default; see CvStereoBMState
6325 int numberOfDisparities; // defined by user; see CvStereoBMState
6326 int maxIters; // number of iterations; defined by user.
6341 The graph cuts stereo correspondence algorithm, described in \href{\#Kolmogrov03}{[Kolmogorov03]} (as \textbf{KZ1}), is non-realtime stereo correpsondence algorithm that usually gives very accurate depth map with well-defined object boundaries. The algorithm represents stereo problem as a sequence of binary optimization problems, each of those is solved using maximum graph flow algorithm. The state structure above should not be allocated and initialized manually; instead, use \cross{cvCreateStereoGCState} and then override necessary parameters if needed.
6343 \cvfunc{CreateStereoGCState}
6345 Creates the state of graph cut-based stereo correspondence algorithm.
6349 CvStereoGCState* cvCreateStereoGCState( int numberOfDisparities,
6352 }{CPP}{CreateStereoGCState(numberOfDispaities,maxIters)-> StereoGCState}
6355 \cvarg{numberOfDisparities}{The number of disparities. The disparity search range will be $\texttt{state->minDisparity} \le disparity < \texttt{state->minDisparity} + \texttt{state->numberOfDisparities}$}
6356 \cvarg{maxIters}{Maximum number of iterations. On each iteration all possible (or reasonable) alpha-expansions are tried. The algorithm may terminate earlier if it could not find an alpha-expansion that decreases the overall cost function value. See \href{\#Kolmogorov03}{[Kolmogorov03]} for details. }
6359 The function \texttt{cvCreateStereoGCState} creates the stereo correspondence structure and initializes it. It is possible to override any of the parameters at any time between the calls to \cross{cvFindStereoCorrespondenceGC}.
6361 \cvfunc{ReleaseStereoGCState}
6363 Releases the state structure of the graph cut-based stereo correspondence algorithm.
6367 void cvReleaseStereoGCState( CvStereoGCState** state );
6369 }{CPP}{ReleaseStereoGCState(state)-> None}
6372 \cvarg{state}{Double pointer to the released structure.}
6375 The function \texttt{cvReleaseStereoGCState} releases the stereo correspondence structure and all the associated internal buffers.
6378 \cvfunc{FindStereoCorrespondenceGC}
6380 Computes the disparity map using graph cut-based algorithm.
6384 void cvFindStereoCorrespondenceGC( \par const CvArr* left, \par const CvArr* right,
6385 \par CvArr* dispLeft, \par CvArr* dispRight,
6386 \par CvStereoGCState* state,
6387 \par int useDisparityGuess = CV\_DEFAULT(0) );
6389 }{CPP}{FindStereoCorrespondenceGC(\par left,\par right,\par dispLeft,\par dispRight,\par state,\par useDisparityGuess=CV\_DEFAULT(0))-> None}
6392 \cvarg{left}{The left single-channel, 8-bit image.}
6393 \cvarg{right}{The right image of the same size and the same type.}
6394 \cvarg{dispLeft}{The optional output single-channel 16-bit signed left disparity map of the same size as input images.}
6395 \cvarg{dispRight}{The optional output single-channel 16-bit signed right disparity map of the same size as input images.}
6396 \cvarg{state}{Stereo correspondence structure.}
6397 \cvarg{useDisparityGuess}{If the parameter is not zero, the algorithm will start with pre-defined disparity maps. Both dispLeft and dispRight should be valid disparity maps. Otherwise, the function starts with blank disparity maps (all pixels are marked as occlusions).}
6400 The function \texttt{cvFindStereoCorrespondenceGC} computes disparity maps for the input rectified stereo pair. Note that the left disparity image will contain values in the following range:
6403 -\texttt{state->numberOfDisparities}-\texttt{state->minDisparity}
6404 < dispLeft(x,y) \le -\texttt{state->minDisparity},
6409 dispLeft(x,y) == \texttt{CV\_STEREO\_GC\_OCCLUSION}
6412 and for the right disparity image the following will be true:
6415 \texttt{state->minDisparity} \le dispRight(x,y)
6416 < \texttt{state->minDisparity} + \texttt{state->numberOfDisparities}
6422 dispRight(x,y) == \texttt{CV\_STEREO\_GC\_OCCLUSION}
6425 that is, the range for the left disparity image will be inversed,
6426 and the pixels for which no good match has been found, will be marked
6429 Here is how the function can be called:
6432 // image_left and image_right are the input 8-bit single-channel images
6433 // from the left and the right cameras, respectively
6434 CvSize size = cvGetSize(image_left);
6435 CvMat* disparity_left = cvCreateMat( size.height, size.width, CV_16S );
6436 CvMat* disparity_right = cvCreateMat( size.height, size.width, CV_16S );
6437 CvStereoGCState* state = cvCreateStereoGCState( 16, 2 );
6438 cvFindStereoCorrespondenceGC( image_left, image_right,
6439 disparity_left, disparity_right, state, 0 );
6440 cvReleaseStereoGCState( &state );
6441 // now process the computed disparity images as you want ...
6444 and this is the output left disparity image computed from the well-known Tsukuba stereo pair and multiplied by -16 (because the values in the left disparity images are usually negative):
6447 CvMat* disparity_left_visual = cvCreateMat( size.height, size.width, CV_8U );
6448 cvConvertScale( disparity_left, disparity_left_visual, -16 );
6449 cvSave( "disparity.png", disparity_left_visual );
6452 \includegraphics{pics/disparity.png}
6454 \cvfunc{ReprojectImageTo3D}
6456 Reprojects disparity image to 3D space.
6460 void cvReprojectImageTo3D( const CvArr* disparity,
6461 CvArr* \_3dImage, const CvMat* Q );
6463 }{CPP}{ReprojectImageTo3D(disparity,\_3dImage,Q)-> None}
6466 \cvarg{disparity}{Disparity map.}
6467 \cvarg{\_3dImage}{3-channel, 16-bit integer or 32-bit floating-point image - the output map of 3D points.}
6468 \cvarg{Q}{The reprojection 4x4 matrix.}
6471 The function \texttt{cvReprojectImageTo3D} transforms 1-channel disparity map to 3-channel image, a 3D surface. That is, for each pixel \texttt{(x,y)} and the corresponding disparity d=disparity\texttt{(x,y)} it computes:
6473 $[X Y Z W]^T = Q*[x y d 1]^T
6475 \_3dImage(x,y) = (X/W, Y/W, Z/W)$
6477 The matrix Q can be arbitrary, e.g. the one, computed by \cross{cvStereoRectify}. To reproject a sparse set of points {(x,y,d),...} to 3D space, use \cross{cvPerspectiveTransform}.
6479 \section{Bibliography}
6481 This bibliography provides a list of publications that were might be useful to the OpenCV users. This list is not complete; it serves only as a starting point.
6483 1. '''[Borgefors86]''' Gunilla Borgefors, "Distance Transformations in Digital Images". Computer Vision, Graphics and Image Processing 34, 344-371 (1986).
6484 1. '''[Bouguet00]''' Jean-Yves Bouguet. Pyramidal Implementation of the Lucas Kanade Feature Tracker.<<BR>> The paper is included into OpenCV distribution ([[attachment:algo\_tracking.pdf]])
6485 1. '''[Bradski98]''' G.R. Bradski. Computer vision face tracking as a component of a perceptual user interface. In Workshop on Applications of Computer Vision, pages 214?219, Princeton, NJ, Oct. 1998.<<BR>> Updated version can be found at http://www.intel.com/technology/itj/q21998/articles/art\_2.htm.<<BR>> Also, it is included into OpenCV distribution ([[attachment:camshift.pdf]])
6486 1. '''[Bradski00]''' G. Bradski and J. Davis. Motion Segmentation and Pose Recognition with Motion History Gradients. IEEE WACV'00, 2000.
6487 1. '''[Burt81]''' P. J. Burt, T. H. Hong, A. Rosenfeld. Segmentation and Estimation of Image Region Properties Through Cooperative Hierarchical Computation. IEEE Tran. On SMC, Vol. 11, N.12, 1981, pp. 802-809.
6488 1. '''[Canny86]''' J. Canny. A Computational Approach to Edge Detection, IEEE Trans. on Pattern Analysis and Machine Intelligence, 8(6), pp. 679-698 (1986).
6489 1. '''[Davis97]''' J. Davis and Bobick. The Representation and Recognition of Action Using Temporal Templates. MIT Media Lab Technical Report 402, 1997.
6490 1. '''[DeMenthon92]''' Daniel F. DeMenthon and Larry S. Davis. Model-Based Object Pose in 25 Lines of Code. In Proceedings of ECCV '92, pp. 335-343, 1992.
6491 1. '''[Fitzgibbon95]''' Andrew W. Fitzgibbon, R.B.Fisher. A Buyer?s Guide to Conic Fitting. Proc.5th British Machine Vision Conference, Birmingham, pp. 513-522, 1995.
6492 1. '''[Ford98]''' Adrian Ford, Alan Roberts. Colour Space Conversions. http://www.poynton.com/PDFs/coloureq.pdf
6493 1. '''[Horn81]''' Berthold K.P. Horn and Brian G. Schunck. Determining Optical Flow. Artificial Intelligence, 17, pp. 185-203, 1981.
6494 1. '''[Hu62]''' M. Hu. Visual Pattern Recognition by Moment Invariants, IRE Transactions on Information Theory, 8:2, pp. 179-187, 1962.
6495 1. '''[Iivarinen97]''' Jukka Iivarinen, Markus Peura, Jaakko Srel, and Ari Visa. Comparison of Combined Shape Descriptors for Irregular Objects, 8th British Machine Vision Conference, BMVC'97.<<BR>>http://www.cis.hut.fi/research/IA/paper/publications/bmvc97/bmvc97.html
6496 1. '''[Jahne97]''' B. Jahne. Digital Image Processing. Springer, New York, 1997.
6497 1. '''[Lucas81]''' Lucas, B., and Kanade, T. An Iterative Image Registration Technique with an Application to Stereo Vision, Proc. of 7th International Joint Conference on Artificial Intelligence (IJCAI), pp. 674-679.
6498 1. '''[Kass88]''' M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active Contour Models, International Journal of Computer Vision, pp. 321-331, 1988.
6499 1. '''[Lienhart02]''' Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. IEEE ICIP 2002, Vol. 1, pp. 900-903, Sep. 2002.<<BR>> This paper, as well as the extended technical report, can be retrieved at http://www.lienhart.de/Publications/publications.html
6500 1. '''[Matas98]''' J.Matas, C.Galambos, J.Kittler. Progressive Probabilistic Hough Transform. British Machine Vision Conference, 1998.
6501 1. '''[Rosenfeld73]''' A. Rosenfeld and E. Johnston. Angle Detection on Digital Curves. IEEE Trans. Computers, 22:875-878, 1973.
6502 1. '''[RubnerJan98]''' Y. Rubner. C. Tomasi, L.J. Guibas. Metrics for Distributions with Applications to Image Databases. Proceedings of the 1998 IEEE International Conference on Computer Vision, Bombay, India, January 1998, pp. 59-66.
6503 1. '''[RubnerSept98]''' Y. Rubner. C. Tomasi, L.J. Guibas. The Earth Mover?s Distance as a Metric for Image Retrieval. Technical Report STAN-CS-TN-98-86, Department of Computer Science, Stanford University, September 1998.
6504 1. '''[RubnerOct98]''' Y. Rubner. C. Tomasi. Texture Metrics. Proceeding of the IEEE International Conference on Systems, Man, and Cybernetics, San-Diego, CA, October 1998, pp. 4601-4607. http://robotics.stanford.edu/~rubner/publications.html
6505 1. '''[Serra82]''' J. Serra. Image Analysis and Mathematical Morphology. Academic Press, 1982.
6506 1. '''[Schiele00]''' Bernt Schiele and James L. Crowley. Recognition without Correspondence Using Multidimensional Receptive Field Histograms. In International Journal of Computer Vision 36 (1), pp. 31-50, January 2000.
6507 1. '''[Suzuki85]''' S. Suzuki, K. Abe. Topological Structural Analysis of Digital Binary Images by Border Following. CVGIP, v.30, n.1. 1985, pp. 32-46.
6508 1. '''[Teh89]''' C.H. Teh, R.T. Chin. On the Detection of Dominant Points on Digital Curves. - IEEE Tr. PAMI, 1989, v.11, No.8, p. 859-872.
6509 1. '''[Trucco98]''' Emanuele Trucco, Alessandro Verri. Introductory Techniques for 3-D Computer Vision. Prentice Hall, Inc., 1998.
6510 1. '''[Viola01]''' Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR, 2001.<<BR>> The paper is available online at http://www.ai.mit.edu/people/viola/
6511 1. '''[Welch95]''' Greg Welch, Gary Bishop. An Introduction To the Kalman Filter. Technical Report TR95-041, University of North Carolina at Chapel Hill, 1995.<<BR>> Online version is available at http://www.cs.unc.edu/~welch/kalman/kalmanIntro.html
6512 1. '''[Williams92]''' D. J. Williams and M. Shah. A Fast Algorithm for Active Contours and Curvature Estimation. CVGIP: Image Understanding, Vol. 55, No. 1, pp. 14-26, Jan., 1992. http://www.cs.ucf.edu/~vision/papers/shah/92/WIS92A.pdf.
6513 1. '''[Yuen03]''' H.K. Yuen, J. Princen, J. Illingworth and J. Kittler. Comparative study of Hough Transform methods for circle finding.<<BR>>http://www.sciencedirect.com/science/article/B6V09-48TCV4N-5Y/2/91f551d124777f7a4cf7b18325235673
6514 1. '''[Yuille89]''' A.Y.Yuille, D.S.Cohen, and P.W.Hallinan. Feature Extraction from Faces Using Deformable Templates in CVPR, pp. 104-109, 1989.
6515 1. '''[Zhang96]''' Z. Zhang. Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting, Image and Vision Computing Journal, 1996.
6516 1. '''[Zhang99]''' Z. Zhang. Flexible Camera Calibration By Viewing a Plane From Unknown Orientations. International Conference on Computer Vision (ICCV'99), Corfu, Greece, pages 666-673, September 1999.
6517 1. '''[Zhang00]''' Z. Zhang. A Flexible New Technique for Camera Calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330-1334, 2000.