Asp Forum - image matching algorithms

Daniel Fetchinson

3/10/2008 6:33:00 AM

Hi all,

There are a number of free tools for image matching but it's not very
easy to decipher the actual algorithm from the code that includes db
management, GUI, etc, etc. I have my own image database and GUI so all
I need is the actual algorithm preferably in pseudo code and not in
the form of a research paper (from which I also found a lot but since
I'm not that much interested in the actual science of image
recognition this seems like an over kill).

My understanding of image matching is that it works by first
calculating N real numbers for an image and defining a metric for
pairs of N-tuples. Images are similar if their distance (defined by
the metric) is small.

The various free tools differ by their chosen optimization paths and
their degree of specialization. My preference would be,

1. Doesn't really matter how long it takes to compute the N numbers per image
2. Lookups should be fast, consequently N should not be too large (I guess)
3. It should be a generic algorithm working on generic images (everyday photos)
4. PIL should be enough for the implementation

So if anyone knows of a good resource that is close to being pseudo
code I would be very grateful!

Cheers,
Daniel

11 Answers

Aaron Brady

3/10/2008 6:54:00 AM

On Mar 10, 1:32 am, "Daniel Fetchinson" <fetchin...@googlemail.com>
wrote:
> Hi all,
>
> There are a number of free tools for image matching but it's not very
> easy to decipher the actual algorithm from the code that includes db
> management, GUI, etc, etc. I have my own image database and GUI so all
> I need is the actual algorithm preferably in pseudo code and not in
> the form of a research paper (from which I also found a lot but since
> I'm not that much interested in the actual science of image
> recognition this seems like an over kill).
>
> My understanding of image matching is that it works by first
> calculating N real numbers for an image and defining a metric for
> pairs of N-tuples. Images are similar if their distance (defined by
> the metric) is small.
>
> The various free tools differ by their chosen optimization paths and
> their degree of specialization. My preference would be,
>
> 1. Doesn't really matter how long it takes to compute the N numbers per image
> 2. Lookups should be fast, consequently N should not be too large (I guess)
> 3. It should be a generic algorithm working on generic images (everyday photos)
> 4. PIL should be enough for the implementation

http://www.idi.ntnu.no/~blake/gb...
"High level features carry information about an image in an abstracted
or propositional form"

It says it constructs a graph about the image's features. Here's the
graph:

Graph components Notes

[A[@id=crot3.77;ext:sqr:aa(1659):mm(19815,148,0,0): <- Leading node
cg(62,86):cr(255,153,153):pl(-204,574,792,10353)]] with
attributes

[[5][@rep= 1 ]] <- Relation
and strength

[B[@id=crot3.77;ext:sqr:aa(199):mm(17759,244,1,0): <- Trailing
node
cg(98,77):cr(153,153,255):pl(966,2,258,-79198)]]$ with
attributes

It doesn't say what corner cases it leaves. "seem to provide" and
"seems to be extremely flexible". I like this feature:

- the equation of the best fitting plane Ax+By+Cz+D=0 to the range
image data masked by the current region;

Where does that get you?

Yu-Xi Lim

3/10/2008 9:16:00 PM

Daniel Fetchinson wrote:
> There are a number of free tools for image matching but it's not very
> easy to decipher the actual algorithm from the code that includes db
> management, GUI, etc, etc. I have my own image database and GUI so all
> I need is the actual algorithm preferably in pseudo code and not in
> the form of a research paper (from which I also found a lot but since
> I'm not that much interested in the actual science of image
> recognition this seems like an over kill).

I'd recommend SIFT. There's quite a bit of information on SIFT. In most
cases, they don't cover the background science too much, but are still
heavy on the math. Pseudo code is hard to come by since it will take
many lines of pseudo code just to express one concise mathematical
equation. There are however many links to implementations in various
languages on the Wikipedia page.

http://en.wikipedia.org/wiki/Scale-invariant_feature...

I have had good experiences with SIFT for feature extraction from images
(I have used it for panorama stitching and robot mapping). It's
insensitive to scale and rotation. Note that it is a patented algorithm
and this may (or may not) pose a problem for you.

Daniel Fetchinson

3/10/2008 11:06:00 PM

> > There are a number of free tools for image matching but it's not very
> > easy to decipher the actual algorithm from the code that includes db
> > management, GUI, etc, etc. I have my own image database and GUI so all
> > I need is the actual algorithm preferably in pseudo code and not in
> > the form of a research paper (from which I also found a lot but since
> > I'm not that much interested in the actual science of image
> > recognition this seems like an over kill).
>
> I'd recommend SIFT. There's quite a bit of information on SIFT. In most
> cases, they don't cover the background science too much, but are still
> heavy on the math. Pseudo code is hard to come by since it will take
> many lines of pseudo code just to express one concise mathematical
> equation. There are however many links to implementations in various
> languages on the Wikipedia page.
>
> http://en.wikipedia.org/wiki/Scale-invariant_feature...
>
> I have had good experiences with SIFT for feature extraction from images
> (I have used it for panorama stitching and robot mapping). It's
> insensitive to scale and rotation. Note that it is a patented algorithm
> and this may (or may not) pose a problem for you.

Thanks for the info! SIFT really looks like a heavy weight solution,
but do you think the whole concept can be simplified if all I needed
was: given a photo, find similar ones? I mean SIFT first detects
objects on the image and find similarities, but I don't need the
detection part at all, all I care about is similarity for the whole
photo. I surely don't understand the big picture fully but just have
the general feeling that SIFT and other expert tools are an overkill
for me and a simplified version would be just as good with a much more
easily comprehensible core algorithm.

Or am I being too optimistic and there is no way out of going into the details?

Yu-Xi Lim

3/10/2008 11:48:00 PM

Daniel Fetchinson wrote:
> Thanks for the info! SIFT really looks like a heavy weight solution,
> but do you think the whole concept can be simplified if all I needed
> was: given a photo, find similar ones? I mean SIFT first detects
> objects on the image and find similarities, but I don't need the
> detection part at all, all I care about is similarity for the whole
> photo. I surely don't understand the big picture fully but just have
> the general feeling that SIFT and other expert tools are an overkill
> for me and a simplified version would be just as good with a much more
> easily comprehensible core algorithm.

Please describe the kind of photos you are dealing with. Are they
identical photos, in say different formats or with different metadata?
Or are they rescaled images? Or maybe they are the same photo cropped
differently?

SIFT will work in more or less the process you described in your first
post. It basically calculates the N sets of numbers for each image,
representing the unique features of that image and their relative
positions. The rest of the process if up to you. You have to compare the
different sets of numbers to find the image with the minimal difference,
as opposed to comparing the whole image.

Shane Geiger

3/11/2008 12:46:00 AM

Daniel Fetchinson wrote:
>>> There are a number of free tools for image matching but it's not very
>>> easy to decipher the actual algorithm from the code that includes db
>>> management, GUI, etc, etc. I have my own image database and GUI so all
>>> I need is the actual algorithm preferably in pseudo code and not in
>>> the form of a research paper (from which I also found a lot but since
>>> I'm not that much interested in the actual science of image
>>> recognition this seems like an over kill).
>>>
>> I'd recommend SIFT. There's quite a bit of information on SIFT. In most
>> cases, they don't cover the background science too much, but are still
>> heavy on the math. Pseudo code is hard to come by since it will take
>> many lines of pseudo code just to express one concise mathematical
>> equation. There are however many links to implementations in various
>> languages on the Wikipedia page.
>>
>> http://en.wikipedia.org/wiki/Scale-invariant_feature...
>>
>> I have had good experiences with SIFT for feature extraction from images
>> (I have used it for panorama stitching and robot mapping). It's
>> insensitive to scale and rotation. Note that it is a patented algorithm
>> and this may (or may not) pose a problem for you.
>>
>
> Thanks for the info! SIFT really looks like a heavy weight solution,
> but do you think the whole concept can be simplified if all I needed
> was: given a photo, find similar ones? I mean SIFT first detects
> objects on the image and find similarities, but I don't need the
> detection part at all, all I care about is similarity for the whole
> photo. I surely don't understand the big picture fully but just have
> the general feeling that SIFT and other expert tools are an overkill
> for me and a simplified version would be just as good with a much more
> easily comprehensible core algorithm.
>
> Or am I being too optimistic and there is no way out of going into the details?
>

Using the histogram of the picture may be good enough for your
application. Here's something I put together for comparing images (for
purposes of detecting motion) taken by the built-in web cam in my
Macbook Pro. This might be good enough if you play with the threshold.

"""
I'm writing a simple little app for doing motion detection with data
output from wacaw, a package for MacOSX. You could easily modify this
script to get data output from some other source.

cd /Applications ; for i in `jot 1024`; do /Applications/wacaw --png
picture-${i} && sleep 3 ; done; open *.png
cd /Applications; open picture-*

"""

# cd /Applications ; for i in `jot 1024`; do /Applications/wacaw --png
picture-${i} && sleep 3 ; done; open *.png

# SOURCE:
http://gumuz.looze.net/wordpress/index.php/archives/2005/06/06/python-webcam-fun-motion-...

import os
from time import sleep
import tempfile

import Image

# Sun Mar 18 16:40:51 CDT 2007
def diff_image(img1, img2, pix_threshold=50, img_threshold=4):
"""Compare 2 images to detect possible motion
You might want to choose the img_threshold amount based on the
conditions.
"""
img1 = Image.open(img1)
img2 = Image.open(img2)
if not img1 or not img2: return False
img1 = img1.getdata()
img2 = img2.getdata()
pixel_count = len(img1)
pixdiff = 0
for i in range(pixel_count):
#if abs(sum(img1[i]) - sum(img2[i])) > pix_threshold:
diffval = abs(sum(img1[i]) - sum(img2[i]))
#print "Pixel diffval:",diffval
if diffval > pix_threshold:
pixdiff += 1
diffperc = pixdiff / (pixel_count/100.0)
print "Photo diff percentage:",diffperc
if diffperc > img_threshold:
# motion detected
return True
else:
return False

photos = [] # consider automatically serializing this data

import commands

"""
def analyze_thresholds(list_of_photos):
last = list_of_photos[0]
for photo in list_of_photos[1:]:
diff_image(last, photo)
last = photo
"""

def detect():
number = 0
while True:
number += 1
sleep(3)
current = 'photo-'+str(number)
#tmp = tempfile.mktemp()
print commands.getoutput('/Applications/wacaw --png ' + current
) # ' + tmp +'.png')

# Here's the actual name of the file wacaw created:
current = '/Applications/'+current+'.png'
photos.append( current )
if len(photos) < 2: # pad the list for the first time
photos.append( current )

if diff_image(photos[-1],photos[-2], pix_threshold=50,
img_threshold=5):
print "motion detected"
else:
print "motion NOT detected"

detect()

#import glob
#analyze_thresholds(glob.glob('/Applications/photo-*')

--
Shane Geiger
IT Director
National Council on Economic Education
sgeiger@ncee.net | 402-438-8958 | http://ww...

Leading the Campaign for Economic and Financial Literacy

Daniel Fetchinson

3/11/2008 1:05:00 AM

> > Thanks for the info! SIFT really looks like a heavy weight solution,
> > but do you think the whole concept can be simplified if all I needed
> > was: given a photo, find similar ones? I mean SIFT first detects
> > objects on the image and find similarities, but I don't need the
> > detection part at all, all I care about is similarity for the whole
> > photo. I surely don't understand the big picture fully but just have
> > the general feeling that SIFT and other expert tools are an overkill
> > for me and a simplified version would be just as good with a much more
> > easily comprehensible core algorithm.
>
> Please describe the kind of photos you are dealing with. Are they
> identical photos, in say different formats or with different metadata?
> Or are they rescaled images? Or maybe they are the same photo cropped
> differently?

The photos are just coming straight from my digital camera. Same
format (JPEG), varying size (6-10 megapixel) and I would like to be
able to pick one and then query the database for similar ones. For
example: I pick a photo which is more or less a portrait of someone,
the query should return other photos with more or less portraits. If I
pick a landscape with lot of green and a mountain the query should
result in other nature (mostly green) photos. Something along these
lines, of course the matches won't be perfect because I'm looking for
a simple algorithm, but something along these lines.

> SIFT will work in more or less the process you described in your first
> post. It basically calculates the N sets of numbers for each image,
> representing the unique features of that image and their relative
> positions. The rest of the process if up to you. You have to compare the
> different sets of numbers to find the image with the minimal difference,
> as opposed to comparing the whole image.

Great, this sounds very good, I'll give SIFT a try (at least trying to
understand the basic concepts) although at the moment it looks a bit
scary :)

Daniel Fetchinson

3/11/2008 1:06:00 AM

> >>> There are a number of free tools for image matching but it's not very
> >>> easy to decipher the actual algorithm from the code that includes db
> >>> management, GUI, etc, etc. I have my own image database and GUI so all
> >>> I need is the actual algorithm preferably in pseudo code and not in
> >>> the form of a research paper (from which I also found a lot but since
> >>> I'm not that much interested in the actual science of image
> >>> recognition this seems like an over kill).
> >>>
> >> I'd recommend SIFT. There's quite a bit of information on SIFT. In most
> >> cases, they don't cover the background science too much, but are still
> >> heavy on the math. Pseudo code is hard to come by since it will take
> >> many lines of pseudo code just to express one concise mathematical
> >> equation. There are however many links to implementations in various
> >> languages on the Wikipedia page.
> >>
> >> http://en.wikipedia.org/wiki/Scale-invariant_feature...
> >>
> >> I have had good experiences with SIFT for feature extraction from images
> >> (I have used it for panorama stitching and robot mapping). It's
> >> insensitive to scale and rotation. Note that it is a patented algorithm
> >> and this may (or may not) pose a problem for you.
> >>
> >
> > Thanks for the info! SIFT really looks like a heavy weight solution,
> > but do you think the whole concept can be simplified if all I needed
> > was: given a photo, find similar ones? I mean SIFT first detects
> > objects on the image and find similarities, but I don't need the
> > detection part at all, all I care about is similarity for the whole
> > photo. I surely don't understand the big picture fully but just have
> > the general feeling that SIFT and other expert tools are an overkill
> > for me and a simplified version would be just as good with a much more
> > easily comprehensible core algorithm.
> >
> > Or am I being too optimistic and there is no way out of going into the
> details?
> >
>
>
> Using the histogram of the picture may be good enough for your
> application. Here's something I put together for comparing images (for
> purposes of detecting motion) taken by the built-in web cam in my
> Macbook Pro. This might be good enough if you play with the threshold.
>
>
> """
> I'm writing a simple little app for doing motion detection with data
> output from wacaw, a package for MacOSX. You could easily modify this
> script to get data output from some other source.

Thanks Shane, this is simple enough indeed, which is great. I'll give
this a try and maybe it'll be good enough for me.

Yu-Xi Lim

3/13/2008 12:45:00 AM

Daniel Fetchinson wrote:
> The photos are just coming straight from my digital camera. Same
> format (JPEG), varying size (6-10 megapixel) and I would like to be
> able to pick one and then query the database for similar ones. For
> example: I pick a photo which is more or less a portrait of someone,
> the query should return other photos with more or less portraits. If I
> pick a landscape with lot of green and a mountain the query should
> result in other nature (mostly green) photos. Something along these
> lines, of course the matches won't be perfect because I'm looking for
> a simple algorithm, but something along these lines.
>

Ah. In that case, SIFT isn't for you. SIFT would work well if you have
multiple photos of the same object. Say, a building from different
angles, or the a vase against different backdrops.

If I'm understanding your correctly, what you're attempting here is very
general and well into the highly experimental. I've been wishing for
such a feature to appear in something like Google Image Search
(pick/submit a photo and return similar images found on the web). I'm
sure if there's even a practical solution, Google (or MS) would be on it
already.

The problem is that there isn't really one. Despite what you may see
claimed in university press releases and research papers, the current
crop of algorithms don't work very well, at least according to my
understanding and discussion with researchers in this field. The glowing
results tend to be from tests done under ideal conditions and there's no
real practical and commercial solution.

If you restrict the domain somewhat, there are some solutions, but none
trivial. You are probably aware of the face searches available on Google
and Live.

The histogram approach suggested by Shane Geiger may work for some cases
and in fact would work very well for identical resized images. I doubt
it will work for the general case. A mountain with a grassy plain at
noon has quite a different histogram from one at sunset, and yet both
have related content. Manual tagging of the images, a la Flickr, would
probably be your best bet.

Good luck.

Daniel Fetchinson

3/13/2008 1:06:00 AM

> > The photos are just coming straight from my digital camera. Same
> > format (JPEG), varying size (6-10 megapixel) and I would like to be
> > able to pick one and then query the database for similar ones. For
> > example: I pick a photo which is more or less a portrait of someone,
> > the query should return other photos with more or less portraits. If I
> > pick a landscape with lot of green and a mountain the query should
> > result in other nature (mostly green) photos. Something along these
> > lines, of course the matches won't be perfect because I'm looking for
> > a simple algorithm, but something along these lines.
> >
>
> Ah. In that case, SIFT isn't for you. SIFT would work well if you have
> multiple photos of the same object. Say, a building from different
> angles, or the a vase against different backdrops.
>
> If I'm understanding your correctly, what you're attempting here is very
> general and well into the highly experimental. I've been wishing for
> such a feature to appear in something like Google Image Search
> (pick/submit a photo and return similar images found on the web). I'm
> sure if there's even a practical solution, Google (or MS) would be on it
> already.
>
> The problem is that there isn't really one. Despite what you may see
> claimed in university press releases and research papers, the current
> crop of algorithms don't work very well, at least according to my
> understanding and discussion with researchers in this field. The glowing
> results tend to be from tests done under ideal conditions and there's no
> real practical and commercial solution.
>
> If you restrict the domain somewhat, there are some solutions, but none
> trivial. You are probably aware of the face searches available on Google
> and Live.
>
> The histogram approach suggested by Shane Geiger may work for some cases
> and in fact would work very well for identical resized images. I doubt
> it will work for the general case. A mountain with a grassy plain at
> noon has quite a different histogram from one at sunset, and yet both
> have related content. Manual tagging of the images, a la Flickr, would
> probably be your best bet.

Since you seem to know quite a bit about this topic, what is your
opinion on the apparently 'generic' algorithm described here:
http://grail.cs.washington.edu/proje... ?
So far it seems to me that it does what I'm asking for, it does even
more because it can take a hand drawn sample image and query the
database for similar photos.

There is even a python implementation for it here:
http://members.tripod.com/~edcjones/p...

On the histogram method I agree that it won't work partly because of
what you say and partly because it is terribly slow since it's
comparing every single pixel.

Thanks,
Daniel

Yu-Xi Lim

3/13/2008 10:00:00 PM

Daniel Fetchinson wrote:
> Since you seem to know quite a bit about this topic, what is your
> opinion on the apparently 'generic' algorithm described here:
> http://grail.cs.washington.edu/proje... ?
> So far it seems to me that it does what I'm asking for, it does even
> more because it can take a hand drawn sample image and query the
> database for similar photos.
>
> There is even a python implementation for it here:
> http://members.tripod.com/~edcjones/p...
>
> On the histogram method I agree that it won't work partly because of
> what you say and partly because it is terribly slow since it's
> comparing every single pixel.

I'm hardly the expert and can't answer authoritatively, but here's my 2c.

I can't comment as to the actual accuracy of the algorithm, since it
will depend on your specific data set (set of photos). The algorithm is
sensitive to spatial and luminance information (because of the YIQ
colorspace), so there are simple ways in which it will fail.

The histogram method uses only color, but has a lot of numbers to
compare. You may find the histogram method insensitive to spatial
relations (a landscape with the mountain on the left and one with the
mountain on the right) compared to the wavelet approach.

This is a relatively old paper, and I've seen other more recent image
retrieval research using wavelets (some cases using only the
high-frequency wavelets for "texture" information instead of the
low-frequency ones used by this paper for "shape") and other information
retrieval-related research using lossy compressed data as the features.
If you have time, you may want to look at other research that cite this
particular paper.

And just a thought: Instead of merely cutting off at m largest-wavelets,
why not apply a quantization matrix to all the values?

Let me know how it works out.

comp.lang.python

image matching algorithms

Daniel Fetchinson

Aaron Brady

Yu-Xi Lim

Daniel Fetchinson

Yu-Xi Lim

Shane Geiger

Daniel Fetchinson

Daniel Fetchinson

Yu-Xi Lim

Daniel Fetchinson

Yu-Xi Lim

x Login to ForumsZone