UCSD CSE 252C - Aligning Sequences and Actions



Aligning Sequences and Actions by Maximizing Space-Time Correlations
Yaron Ukrainitz and Michal Irani
Presented by Deborah Goshorn
CSE 252C – Dr. Serge Belongie

Outline
• Introduction
• Image alignment first!
• Algorithm
• Experimental results
• More research…

Introduction
Sequence alignment arises in a wide range of scenarios. The method aligns sequence pairs with:
* stationary or jointly moving cameras
* the same or different photometric properties
* with or without moving objects
The algorithm works directly on intensity information:
* without segmenting the foreground
* without a priori finding corresponding features across the sequences

Main goal: extend previous research on image alignment to space-time alignment.
[Figure: images f and g over (x, y), extended to sequences f and g over (x, y, t)]

Previously… Multi-Sensor Image Alignment (Michal Irani & P. Anandan)
- Identifies an image representation for multi-sensor alignment that does not rely on sparse image features (e.g., edge, contour, or point features)
- Presents a new alignment technique that applies global estimation to any choice of local similarity measure

Problems
1. What is the relationship between the brightness values of pixels in an image from sensor 1 and in an image of a different modality from sensor 2?
2. Contrast reversal may occur between the two images in some places but not in others.
[Figure: corresponding EO and IR images]
3. Visual features may be present in one image but not in the other (mutually exclusive features).
4. Multiple brightness values in one image may map to a single brightness value in the other image, and vice versa.

In other words, the two images are usually not globally correlated! So one wants to find:
1. A good image representation that brings out the common information between the two images and suppresses the non-common information.
2. Given that representation, the right similarity measure for matching.

Alignment Algorithm: 1. Find an image representation. 2. Find a similarity measure.
Two classes of prior methods:
1. Methods that use an invariant image representation, e.g., edge maps, oriented edge vector fields, contour features, feature points.
   - Information loss due to the thresholding steps
   - Only a sparse set of highly significant features survives
   - The threshold choice is very data- and sensor-dependent
2. Methods that use an invariant similarity measure to register the multi-sensor images, e.g., mutual information, or the proposed method.

Previous Work: Alignment by Maximization of Mutual Information – Viola & Wells (1997)
• Also intensity-based, not feature-based
• Efficient, because it uses stochastic approximation (noisy derivatives in a gradient-descent algorithm)
• Claims mutual information is more robust than traditional correlation

Why not the mutual information method? The authors claim it:
1. Assumes the two images have a global statistical correlation (an assumption that is violated here)
2. Does not extend to coarse-to-fine estimation (which is often used to fix large misalignments), since the statistical correlation between raw multi-sensor images decreases as spatial resolution decreases
[Figure: EO and IR images at coarse resolution]

How they handled the problems:
1. Assume only a local correlation between the images, not a global one
2. The method is invariant to contrast reversal
3. The method provides orientational sensitivity
4. The method is suitable for coarse-to-fine processing
5. The method rejects outliers (i.e., mutually exclusive visual features)

At low resolution levels, we must still capture small (high-resolution) temporal changes!
[Figure: EO and IR image pair]
…Apply directional derivative filters, then square the result (to handle contrast reversal).
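The squared directional-derivative representation described above can be sketched in a few lines. This is a minimal illustration, not the authors' filters: it uses simple central-difference gradients from NumPy, and `directional_energy` is a hypothetical helper name chosen for this example.

```python
import numpy as np

def directional_energy(img):
    """Illustrative sketch: apply directional derivative filters (here,
    simple central differences) and square the responses, so that a
    dark-to-bright edge and a bright-to-dark edge produce the same
    response -- i.e., the representation is invariant to contrast
    reversal."""
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)   # derivatives along rows (y) and columns (x)
    return gx ** 2, gy ** 2     # squared directional derivatives
```

Negating (contrast-reversing) the input leaves both energy maps unchanged, which is exactly the invariance the slides motivate.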
Image Representation
Since the representation supports coarse-to-fine processing, large misalignments can be fixed by constructing a Gaussian pyramid!
[Figure: EO and IR Gaussian pyramids]

Global Alignment
Estimate the parametric transformation globally:
- useful because of the plurality of outliers across sensors, and hence the unreliability of local matches
- global estimation is applied directly to the local correlation functions (local mutual information could have been used here instead)

Normalized Correlation as a Local Similarity Measure
- Invariant to local changes in mean and contrast
- Locally, within small image patches that contain corresponding image features, the statistical correlation is high
- Normalized correlation is a linear approximation to the statistical correlation of two signals in a small window – cheaper to compute!

Extend image alignment to space-time alignment!
[Figure: images f and g over (x, y), extended to sequences f and g over (x, y, t)]

Definitions
Sequence alignment: "finding the spatial and temporal coordinate transformation that brings one sequence into alignment with the other, both in space and time" – alignment, not recognition.
Action alignment: recover the space-time alignment transformations between sequences in which the same action is performed at different times, in different places, or by different people (possibly with different sensors or at different speeds).
Multi-sensor alignment: the sequences are simultaneous recordings of the same scene, recorded using different sensor modalities.

Introduction
• Aligns sequences of the same action, either:
* performed at different times/places/by different people (at different speeds), or
* of the same scene but from multiple cameras (different modalities)
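The normalized-correlation measure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; `normalized_correlation` is a hypothetical helper name for this example.

```python
import numpy as np

def normalized_correlation(a, b, eps=1e-10):
    """Normalized correlation of two equally sized patches.
    Subtracting each patch's mean and dividing by the norms makes the
    score invariant to local changes in mean (brightness offset) and
    contrast (gain), as the slides describe."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / (denom + eps)
```

The score lies in [-1, 1]: a gain-and-offset change of a patch still scores ~1, while a contrast-reversed patch scores ~-1.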
Observations
1. Temporal changes are captured in the space-time volumes created by the two sequences, not in individual frames → sequence-to-sequence alignment is "better" than image-to-image alignment.

Problem Formulation
f, g: the sequences (or filtered versions of them) to align.
p: the spatio-temporal parametric transformation vector that maximizes…
M(f, g): …a global similarity measure between f and g, after aligning f and g.
(x, y, t): one space-time point in a sequence.

The spatio-temporal displacement vector is

  u(x, y, t; p) = [u1(x, y; p), u2(x, y; p), u3(t; p)]^T, where
  u1(x, y; p) = p1·x + p2·y + p3
  u2(x, y; p) = p4·x + p5·y + p6
  u3(t; p)    = p7·t + p8

• a 1-D affine transformation in time
• a 2-D affine transformation in space (acceptable when the scene is planar, i.e., distant, or the two cameras are close to each other)

Alignment Algorithm
1. Build a space-time Gaussian pyramid for each sequence.
2. Find an initial guess p0.
3. Apply maximization iterations at the current pyramid level until convergence:
   - use the current parameter estimate p0 from the last iteration to find the update delta
   - update the estimate: p0* = p0 + delta
   - test for convergence: if M(p0*) – M(p0) < eps, the level has converged; otherwise iterate again
4. Move to the next pyramid level and repeat step 3.

Similarity Measure M(·)
Outliers!
[Figure: EO and IR images with mutually exclusive features]
- For corresponding space-time blocks that contain mutually exclusive image features, the normalized correlation function is not concave; such small space-time blocks are counted as outliers.
- To measure concavity, take the determinant of the Hessian.

Similarity Measure M(·)
- Globally,
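The spatio-temporal displacement defined in the problem formulation above can be sketched directly. The function name and the packing of the eight parameters into a flat vector are assumptions made for this illustration.

```python
import numpy as np

def spacetime_displacement(x, y, t, p):
    """Sketch of the displacement u(x, y, t; p) from the problem
    formulation: a 2-D affine transformation in space and a 1-D affine
    transformation in time, with p = (p1, ..., p8) packed as an
    8-vector (an assumed convention for this example)."""
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    u1 = p1 * x + p2 * y + p3   # horizontal displacement
    u2 = p4 * x + p5 * y + p6   # vertical displacement
    u3 = p7 * t + p8            # temporal displacement (time scale + shift)
    return np.array([u1, u2, u3])
```

With p = 0 the displacement vanishes everywhere (the identity alignment); a pure space-time shift uses only p3, p6, and p8.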

