Computational Photography

Computational Imaging III

by Francho Melendez

lab 3

  • difficulties?
  • did you start?
  • easier?
  • clearer?
  • Next week submit Lab 2 and 3 (final)

today's schedule

  • Recap
  • Using the Internet as a data source
  • Using millions of images for scene completion
  • Using millions of images for image retrieval
  • Borrowing information from neighbours
  • Using millions of images for defining visual similarity
  • other applications

recap

photons to RAW

RAW to JPEG

burst photography

idea: take 2 or more images (bursts of images) and combine them

  • super-resolution
  • focal stack
  • aperture stack - confocal stereo
  • blurry/noisy
  • flash/no flash
  • HDR

expanding the field of view

  • Compact Camera FOV = 50 x 35°
  • Human FOV = 200 x 135°
  • Panoramic Mosaic = 360 x 180°

how does it work?

homography

$\begin{bmatrix}wx'\\wy'\\w\end{bmatrix} = \begin{bmatrix}* & * & *\\* & * & *\\* & * & *\end{bmatrix} \begin{bmatrix}x\\y\\1\end{bmatrix}$

$p' = Hp$
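A minimal sketch of applying $p' = Hp$ to a set of points, assuming NumPy. The function name `apply_homography` and the example matrix `H_shift` are illustrative only. The key step is the division by $w$ after the matrix product.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homog = np.hstack([pts, ones])          # rows are (x, y, 1)
    mapped = homog @ H.T                    # rows are (w x', w y', w)
    return mapped[:, :2] / mapped[:, 2:3]   # divide by w to get (x', y')

# Hypothetical H: a pure translation by (10, 5)
H_shift = np.array([[1.0, 0.0, 10.0],
                    [0.0, 1.0, 5.0],
                    [0.0, 0.0, 1.0]])
shifted = apply_homography(H_shift, [[0, 0], [2, 3]])
```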

getting H

$\begin{bmatrix}wx'\\wy'\\w\end{bmatrix} = \begin{bmatrix}a & b & c\\d & e & f\\g & h & i\end{bmatrix} \begin{bmatrix}x\\y\\1\end{bmatrix}$

9 unknowns and $w$

$w$ is easy: $w = gx + hy + i$

Set up a system of linear equations:

$Ah = b$

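where the vector of unknowns is $h = [a,b,c,d,e,f,g,h]^T$ (the scale is fixed by setting $i = 1$)

Need at least 8 equations: each point correspondence gives 2, so at least 4 correspondences

Solve for h by least squares (in the homogeneous formulation $Ah = 0$, take the SVD singular vector with the smallest singular value):

$\min_h \|Ah-b\|^2$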
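A sketch of the linear system, assuming the scale is fixed with $i = 1$ so that 8 unknowns remain. Each correspondence $(x, y) \to (x', y')$ contributes two rows of $A$ and two entries of $b$. The function name `estimate_homography` and the test values are illustrative. `np.linalg.lstsq` (which uses an SVD internally) stands in for an explicit SVD step.

```python
import numpy as np

def estimate_homography(src, dst):
    """Least-squares homography from >= 4 point pairs, with i fixed to 1."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(np.asarray(src, float), np.asarray(dst, float)):
        # x' (g x + h y + 1) = a x + b y + c
        rows.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        # y' (g x + h y + 1) = d x + e y + f
        rows.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        rhs.extend([xp, yp])
    A, b = np.array(rows), np.array(rhs)
    h, *_ = np.linalg.lstsq(A, b, rcond=None)   # min ||Ah - b||^2
    return np.append(h, 1.0).reshape(3, 3)

# Synthetic check with a made-up ground-truth homography
H_true = np.array([[1.2, 0.1, 0.5],
                   [-0.2, 0.9, 0.3],
                   [0.1, 0.2, 1.0]])
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.3, 0.6]], float)
homog = np.hstack([src, np.ones((len(src), 1))]) @ H_true.T
dst = homog[:, :2] / homog[:, 2:3]
H_est = estimate_homography(src, dst)
```

With noisy correspondences the same call returns the least-squares fit. Outlier rejection (e.g. RANSAC) is not covered by this sketch.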

didn't see

  • Automatic correspondences: Feature detector
  • Projection in a different space
  • Blending

parametric (global) warping

forward vs. backward

Backward mapping eliminates holes

Needs an invertible warp function: not always possible
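A minimal sketch of backward mapping for a homography, assuming NumPy and nearest-neighbour sampling. Every destination pixel is sent through the inverse warp and looks up a source pixel, so no destination pixel is left unfilled inside the valid region. The name `backward_warp` and the toy image are illustrative.

```python
import numpy as np

def backward_warp(src, H, out_shape):
    """Warp a 2-D image; H maps source (x, y) to destination (x', y')."""
    h_out, w_out = out_shape
    H_inv = np.linalg.inv(H)                       # requires an invertible warp
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    back = H_inv @ dest                            # destination -> source
    sx = np.rint(back[0] / back[2]).astype(int)    # nearest-neighbour lookup
    sy = np.rint(back[1] / back[2]).astype(int)
    valid = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    out = np.zeros(h_out * w_out, dtype=src.dtype)
    out[valid] = src[sy[valid], sx[valid]]         # pixels outside src stay 0
    return out.reshape(h_out, w_out)

# Toy example: shift the image one pixel to the right
img = np.array([[1, 2, 3],
                [4, 5, 6]])
H_right = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
warped = backward_warp(img, H_right, img.shape)
```

Bilinear interpolation would normally replace the rounding step. It is left out here to keep the sketch short.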

non-parametric warping

  • Input correspondences at key feature points
  • Define a triangular mesh over the points
  • Same mesh in both images!
  • Warp each triangle separately from source to destination (affine)
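The per-triangle step can be sketched as follows, assuming NumPy. Three vertex pairs determine one affine map exactly. The names `triangle_affine` and `warp_point` and the example triangles are illustrative. Building the mesh itself (e.g. by Delaunay triangulation) is not shown.

```python
import numpy as np

def triangle_affine(src_tri, dst_tri):
    """2x3 affine matrix taking 3 source vertices onto 3 destination vertices."""
    src = np.hstack([np.asarray(src_tri, float), np.ones((3, 1))])  # rows (x, y, 1)
    dst = np.asarray(dst_tri, float)                                # rows (x', y')
    return np.linalg.solve(src, dst).T      # solve src @ X = dst, return X^T

def warp_point(M, p):
    """Apply the affine matrix M to a single point p = (x, y)."""
    return M @ np.array([p[0], p[1], 1.0])

src_tri = [(0, 0), (1, 0), (0, 1)]
dst_tri = [(2, 2), (4, 2), (2, 5)]
M = triangle_affine(src_tri, dst_tri)
centroid = warp_point(M, (1 / 3, 1 / 3))    # the centroid maps to the centroid
```

Because both images share the same mesh topology, the same vertex indices select `src_tri` and `dst_tri` for each triangle.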

internet data

Using billions of Internet images to...

  • Improve our photographs
  • tag
  • describe
  • fill holes
  • colorize
  • explore places
  • recognition
  • computer intelligence
  • learning to see

also

  • Computer vision problems are hard: more images can help
  • also can make things more complicated
  • Grand-goal: understand what is in images
  • data is unorganized
  • lots of industry applications: search, browsing, content-matched advertising
  • pics and video are fun! (and a huge part of content consumption)

internet as a data source

A.I. for the postmodern world:
all questions have already been answered… many times, in many ways.
Google is dumb; the “intelligence” is in the data.

Text is simple: clean, segmented, compact, 1D
Visual data is much harder: Noisy, unsegmented, high entropy, 2D/3D

what can we do with m(b)illions of images

Scene Completion Using Millions of Photographs. [Hays and Efros. 2007]

Texture synthesis

inpainting

looking for semantic information

[Hays and Efros. 2007]

Compute oriented edge response at multiple scales (5 spatial scales, 6 orientations)

Gist scene descriptor (Oliva and Torralba 2001)

Color descriptor – color of the query image downsampled to 4x4

Find 200 closest neighbors in database
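A rough sketch of the colour descriptor and the neighbour search, assuming NumPy. It uses block averaging for the 4x4 downsampling and squared L2 distance for the search. Both are assumptions of this sketch, as are the names `color_descriptor` and `nearest_neighbors`. The gist part of the descriptor is not included.

```python
import numpy as np

def color_descriptor(img, size=4):
    """Downsample an (H, W, 3) image to size x size by block averaging."""
    h, w, c = img.shape
    img = img[: h - h % size, : w - w % size].astype(float)  # crop to a multiple
    bh, bw = img.shape[0] // size, img.shape[1] // size
    blocks = img.reshape(size, bh, size, bw, c)
    return blocks.mean(axis=(1, 3)).ravel()                  # size*size*3 values

def nearest_neighbors(query, database, k=200):
    """Indices of the k database descriptors closest to the query."""
    dist = ((database - query) ** 2).sum(axis=1)
    return np.argsort(dist)[: min(k, len(dist))]

# Toy database of random images standing in for the photo collection
rng = np.random.default_rng(0)
images = rng.random((50, 16, 16, 3))
db = np.stack([color_descriptor(im) for im in images])
idx = nearest_neighbors(color_descriptor(images[7]), db, k=5)
```

A brute-force `argsort` is fine for a toy database. At millions of images an approximate or indexed search would be needed.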

Texture – 5x5 median filter of image gradient magnitude at each pixel
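The texture descriptor can be sketched like this, assuming NumPy 1.20 or later (for `sliding_window_view`), a grayscale input, and edge padding at the borders. The name `texture_descriptor` and the padding choice are assumptions of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def texture_descriptor(gray, win=5):
    """Median-filtered gradient magnitude, same shape as the input."""
    gy, gx = np.gradient(gray.astype(float))       # derivatives along rows, cols
    mag = np.hypot(gx, gy)                         # gradient magnitude per pixel
    padded = np.pad(mag, win // 2, mode="edge")    # keep the output size
    windows = sliding_window_view(padded, (win, win))
    return np.median(windows, axis=(-2, -1))       # win x win median filter

ramp = np.tile(np.arange(8.0), (6, 1))             # intensity grows left to right
tex = texture_descriptor(ramp)
```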

Graph-cut

  • The scene matching distance
  • The context matching distance (color + texture)
  • The graph cut cost

evaluation

why does it work?

10 nearest neighbors, 20,000-image database

why does it work?

10 nearest neighbors, 2.3M-image database

summary

conclusion

  • Simple way of describing images
  • Importance of having a lot of data
  • Using a number of close neighbours for robustness

80 Million Tiny Images. [Torralba et al. 2008]

53,464 nouns, 79 million images. Very low resolution (32x32)

  • want a minimal representation of images for the task: classifying scenes
  • compact representation
  • low-res 32x32 color images: fast for processing, low memory footprint
  • Humans can do recognition well at small scale
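One simple way to compare two 32x32 images is a sum of squared differences after normalising each image to zero mean and unit norm. The normalisation and the names `normalize_tiny` and `ssd` are assumptions of this sketch, not necessarily the exact metric used in the paper.

```python
import numpy as np

def normalize_tiny(img):
    """Flatten a tiny image and normalise to zero mean, unit norm."""
    v = img.astype(float).ravel()
    v = v - v.mean()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def ssd(a, b):
    """Sum of squared differences between two normalised tiny images."""
    return float(((normalize_tiny(a) - normalize_tiny(b)) ** 2).sum())

rng = np.random.default_rng(0)
a = rng.random((32, 32, 3))
b = 2.0 * a + 3.0                 # same image, different contrast/brightness
c = rng.random((32, 32, 3))       # unrelated image
d_same, d_other = ssd(a, b), ssd(a, c)
```

With this normalisation the distance ignores global brightness and contrast changes, so `d_same` is numerically zero while `d_other` is not.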

similarity metric

siblings quality

What can we use it for?

experiments

label an image as containing a person or not
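A toy sketch of labelling by neighbour voting, assuming NumPy. The query is compared with every database vector and the labels of the k closest ones are averaged. The function `knn_vote`, the value of k, and the synthetic clusters standing in for image descriptors are all hypothetical.

```python
import numpy as np

def knn_vote(query, db_vectors, db_labels, k=5):
    """Fraction of the k nearest neighbours that carry the label 1 (person)."""
    dist = ((db_vectors - query) ** 2).sum(axis=1)
    nearest = np.argsort(dist)[:k]
    return float(db_labels[nearest].mean())

# Two synthetic clusters: label 0 = no person, label 1 = person
rng = np.random.default_rng(0)
no_person = rng.normal(0.0, 1.0, size=(20, 10))
person = rng.normal(5.0, 1.0, size=(20, 10))
db_vectors = np.vstack([no_person, person])
db_labels = np.array([0] * 20 + [1] * 20)
score = knn_vote(np.full(10, 5.0), db_vectors, db_labels, k=5)
```

Thresholding `score` gives a person / no-person decision. Using several neighbours rather than one makes the vote more robust to noisy labels.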

experiments

colorize gray scale images

experiments

more later...

conclusions

  • Can get good results with simple algorithms and lots of data
  • issues with internet data: image bias & labeling noise
  • many metrics...

IM2GPS: estimating geographic information from a single image. [Hays et al. 2008]

where?

What can you say about where these photos were taken?

how?

Collect a large collection of geo-tagged photos

6.5 million images with both GPS coordinates and geographic keywords, removing images with keywords like birthday, concert, abstract, …

Test set – 400 randomly sampled images from this collection. Manually removed abstract photos and photos with recognizable people – 237 test photos.

how?

Features
  • Tiny images – 16x16 color images
  • Color histograms in CIE Lab color space
  • Texton histograms – clustered responses to a bank of filters.
  • Line features – histogram describing statistics of straight lines in image
  • Gist descriptor + color
  • Geometric context (ground, sky, vertical)
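As an illustration of one of these features, here is a sketch of a joint colour histogram and a chi-square distance between two histograms, assuming NumPy. It assumes the image is already converted to CIE Lab. The bin counts, the value ranges, the choice of chi-square, and the names `lab_histogram` and `chi_square` are assumptions of this sketch.

```python
import numpy as np

def lab_histogram(lab, bins=(4, 8, 8)):
    """Normalised joint histogram of an (H, W, 3) image given in Lab."""
    pixels = lab.reshape(-1, 3)
    ranges = [(0, 100), (-128, 128), (-128, 128)]   # assumed L, a, b ranges
    hist, _ = np.histogramdd(pixels, bins=bins, range=ranges)
    hist = hist.ravel()
    return hist / hist.sum()

def chi_square(p, q, eps=1e-10):
    """Chi-square distance between two normalised histograms."""
    return 0.5 * float(((p - q) ** 2 / (p + q + eps)).sum())

# Random values standing in for two Lab images
rng = np.random.default_rng(0)
low, high = [0, -128, -128], [100, 127, 127]
h_a = lab_histogram(rng.uniform(low, high, size=(16, 16, 3)))
h_b = lab_histogram(rng.uniform(low, high, size=(16, 16, 3)))
d_ab = chi_square(h_a, h_b)
```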

results

performance

across database size

performance

across features

Data-driven visual similarity for cross-domain image matching. [Shrivastava et al. 2011]

why is it so hard?

sift

what is unique?

what is unique given this world?

support vector machine (SVM)

per exemplar (SVM)

histogram of oriented gradients (HOG)
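The core of HOG is a histogram of gradient orientations weighted by gradient magnitude. The sketch below, assuming NumPy, computes it for a single cell with unsigned orientations (0 to 180 degrees) in 9 bins. A full HOG descriptor also tiles the image into cells and normalises over blocks, which is omitted here. The name `orientation_histogram` and the bin count are illustrative.

```python
import numpy as np

def orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    gy, gx = np.gradient(cell.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0        # fold into [0, 180)
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)

horizontal_ramp = np.tile(np.arange(8.0), (8, 1))       # gradient along x
vertical_ramp = horizontal_ramp.T                       # gradient along y
hist_h = orientation_histogram(horizontal_ramp)
hist_v = orientation_histogram(vertical_ramp)
```

The two ramps put all their weight in different bins, which is what lets the descriptor distinguish edge directions.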

search using paintings

search using sketches

painting2GPS

Where was the Painter Standing?

painting2GPS

rephotography

organizing data collections

other applications

photo-tourism

cg2Real

semantic photo-synthesis

photo clip art

Convolutional Neural Networks Deep Learning

  • Learns the "descriptors" (convolutional kernels) from the data
  • needs very large datasets
  • still tricky to converge (but much easier now)
  • tends to work better than previous approaches (because of more parameters and non-linearities)
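To make "kernel plus non-linearity" concrete, here is a sketch of one convolutional filter followed by a ReLU, assuming NumPy 1.20 or later. The kernel is hand-set to respond to vertical edges. In a CNN its values would be learned from data. The name `conv2d_relu` and the toy image are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_relu(img, kernel):
    """Valid cross-correlation of a 2-D image with a kernel, then ReLU."""
    windows = sliding_window_view(img, kernel.shape)     # (H', W', kh, kw)
    response = np.einsum("ijkl,kl->ij", windows, kernel)
    return np.maximum(response, 0.0)                     # the non-linearity

# Image with a vertical step edge between columns 2 and 3
img = np.zeros((5, 6))
img[:, 3:] = 1.0
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)           # hand-set, not learned
feature_map = conv2d_relu(img, edge_kernel)
```

The feature map is non-zero only around the edge. Flipping the sign of the kernel gives an all-zero map, because the ReLU discards negative responses.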

classification (like tiny images)

Limitations

  • Solves it with more data
  • Capsule Networks?
  • dangers of data

bias

  • Internet is a great source of visual data...
  • but it is not a random sample of the visual world
  • many sources of bias
    • Sampling Bias
    • Photographer Bias
    • Social Bias

bias

Flickr Paris

Real (random) Paris

photographer bias

people follow photographic conventions

social bias

conclusions

  • amount of data is key
  • simple descriptors and simple metrics
  • deep learning is the future (and present)
  • computationally challenging: tiny images took 9 months to get the data
  • data is biased
  • pick your data to suit your problem

announcements

http://franchomelendez.com/Uwr/teaching/COMPHO/_LECTURES/L5/computational_imaging_III.html

franchomelendez@cs.uni.wroc.pl

credits, references, and additional readings

These slides have been prepared with materials, slides, and discussions from the authors of the papers.