UCSD ECE 271A - The Cheetah Problem

The Cheetah problem
Nuno Vasconcelos
ECE 271A

Cheetah
• statistical learning only makes sense when you try it on data
• we will test what we learn on an image processing problem
– given the cheetah image, can we teach a computer to segment it into object (foreground) and background?
– the question will be answered with different techniques, typically one problem per week
• first problem this week
– brief introduction to image representation (features) and other pre-processing steps

Image representation
• we will use the discrete cosine transform (DCT)
– think of it as a Fourier transform, but real
– maps an array of pixels (image block) into an array of frequency coefficients
– for block x(i,j)

\[
T(k_1,k_2) = 4 \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} x(i,j)
\cos\left[\frac{\pi k_1}{2N}(2i+1)\right]
\cos\left[\frac{\pi k_2}{2N}(2j+1)\right]
\]

– each coefficient is a projection onto a basis function
– basis functions are 2D sinusoids of different frequencies
– T(k1,k2) captures image information on the frequency band

\[
\left[\frac{\pi k_1}{2N},\ \frac{\pi k_1}{2N}+\frac{\pi}{2N}\right]
\times
\left[\frac{\pi k_2}{2N},\ \frac{\pi k_2}{2N}+\frac{\pi}{2N}\right]
\]

In a picture
• we will use blocks of 8 x 8 pixels
• the DCT basis functions are
[Figure: the DCT basis functions for 8 x 8 blocks]
• 1st function is constant, 1st coefficient is (up to scale) the block mean, not very interesting (depends on illumination etc.)
• there is a MATLAB function, dct2(.), that computes the DCT coefficients

In a picture
• coefficients have a natural order by frequency
• it is called the zig-zag pattern
• allows us to transform the 2D array of coefficients into a vector
• this vector has 64 features, i.e. is a point in a 64D space
• we will make available a file with this zig-zag pattern

Image representation
[Figure: image, 8x8 blocks, 8x8 DCT (discrete cosine transform), bag of DCT vectors in R^64, one vector per block]

Features
• 64D is a lot, we will see later in the course how to pick good features
• for now we will use a single feature: X = location of the coefficient of 2nd largest magnitude
• e.g. for vector (100, 12, -32, -53, 14) we have X = 4
• rationale: 1st coefficient is always the largest, but not very informative; 2nd largest gives the dominant frequency band
• note that X is now a scalar feature, so we can estimate all class-conditional densities (CCDs) with histograms

Classifier
• training:
– break training images into 8x8 blocks
– for each block
  • compute DCT
  • order coefficients with zig-zag scan
  • pick position of 2nd largest magnitude as the feature value
– note: we will give you this!
– the collection of all such positions is the training set
– from the training set estimate P_X|Y(x|cheetah), P_X|Y(x|background) using histograms, and P_Y(cheetah), P_Y(background) using common sense

Classifier
• classification:
– break the image to be segmented into 8x8 blocks
– for each block
  • compute DCT
  • order coefficients with zig-zag scan
  • pick position of 2nd largest magnitude as the feature value
– use the Bayes decision rule (BDR) to find class Y for each block
– create a binary mask with 1's for foreground blocks and 0's for background blocks
• note: you'll have to implement all of this on your own

Remarks
• this is a realistic problem
• the solution WILL NOT BE PERFECT
• there is no unique right answer
• by looking at the resulting segmentation mask, you will know if the results are "decent"
– holes and noise are OK
– but it should look somewhat like this
[Figure: example segmentation mask]

Most common problems
• "my segmentation mask is very blocky"
– during classification, use a sliding window that moves by one pixel at each step
– this will give you a binary value per pixel (e.g. assign it to the central pixel in the block, or the top left corner) for the segmentation mask
• "I get complete garbage"
– make sure to always work with doubles in the range [0,1] (this is how the training data was created)
– after you read the image do
  • im2double(image)
  • or double(image)/255

Most common problems
• "my probability of error is too high"
– make sure to use the same histogram binning in all histograms
– MATLAB lets you do this easily
• "how do I read an image in MATLAB?"
– you should be able to figure out the answers to these types of questions on your own
– MATLAB's help, tutorials, etc.
• other questions, email the TA, but please be gentle on
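
The feature extraction and classification steps described in the slides can be sketched as follows. This is a minimal illustration in plain Python, not the course's MATLAB solution, and it rests on several assumptions: all function names are hypothetical; the DCT is coded directly from the slide formula, so its scaling differs from MATLAB's dct2 and feature values need not match the provided training data; the zig-zag order is the standard JPEG traversal, which may differ from the course's zig-zag file; and the prior is passed in as a plain number.

```python
import math

N = 8  # block size used in the slides


def dct2(block):
    """2D DCT of a square block, coded from the slide formula (unnormalized)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for k1 in range(n):
        for k2 in range(n):
            s = 0.0
            for i in range(n):
                ci = math.cos(math.pi * k1 * (2 * i + 1) / (2 * n))
                for j in range(n):
                    cj = math.cos(math.pi * k2 * (2 * j + 1) / (2 * n))
                    s += block[i][j] * ci * cj
            out[k1][k2] = 4.0 * s
    return out


def zigzag_order(n=N):
    """(row, col) pairs in JPEG-style zig-zag order (assumed, see lead-in)."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    # Cells are grouped by anti-diagonal; the direction alternates per diagonal.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else p[1]))


def second_largest_position(vec):
    """1-based position of the entry with the 2nd largest magnitude."""
    order = sorted(range(len(vec)), key=lambda k: abs(vec[k]), reverse=True)
    return order[1] + 1


def block_feature(block):
    """Feature X for one block: DCT, zig-zag scan, 2nd largest magnitude."""
    coeffs = dct2(block)
    vec = [coeffs[i][j] for i, j in zigzag_order(len(block))]
    return second_largest_position(vec)


def histogram(features, n_bins=N * N):
    """Normalized histogram over positions 1..n_bins (same bins for all classes)."""
    counts = [0] * n_bins
    for x in features:
        counts[x - 1] += 1
    return [c / len(features) for c in counts]


def bdr_is_cheetah(x, p_x_cheetah, p_x_background, prior_cheetah):
    """Bayes decision rule: pick the class with the larger P(x|class) * P(class)."""
    return (p_x_cheetah[x - 1] * prior_cheetah
            > p_x_background[x - 1] * (1.0 - prior_cheetah))
```

For example, second_largest_position([100, 12, -32, -53, 14]) gives 4, matching the example in the Features slide. To avoid a blocky mask, block_feature would be applied to every 8x8 window as the window slides one pixel at a time, with the decision written to one pixel of the mask per window.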

