Computational Photography

Computational Imaging III

by Francho Melendez

lab 3

  • difficulties?
  • did you start?
  • easier?
  • clearer?
  • Next week submit Lab 2 and 3 (final)

today's schedule

  • Recap
  • Using internet as data source
  • Using millions of images for scene completion
  • Using millions of images for image retrieval
  • Borrowing information from neighbours
  • Using millions of images for defining visual similarity
  • other applications


photons to RAW


burst photography

idea: take 2 or more images (bursts of images) and combine them

  • super-resolution
  • focal stack
  • aperture stack - confocal stereo
  • blurry/noisy
  • flash/no flash
  • HDR

expanding the field of view

  • Compact Camera FOV = 50 x 35°
  • Human FOV = 200 x 135°
  • Panoramic Mosaic = 360 x 180°

how does it work?


$\begin{bmatrix}wx'\\wy'\\w\end{bmatrix} = \begin{bmatrix}* & * & *\\* & * & *\\* & * & *\end{bmatrix} \begin{bmatrix}x\\y\\1\end{bmatrix}$

$p' = Hp$

getting H

$\begin{bmatrix}wx'\\wy'\\w\end{bmatrix} = \begin{bmatrix}a & b & c\\d & e & f\\g & h & i\end{bmatrix} \begin{bmatrix}x\\y\\1\end{bmatrix}$

9 unknowns and $w$

$w$ is easy: $w = gx + hy + i$

Set up a system of linear equations:

$Ah = b$

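where vector of unknowns $h = [a,b,c,d,e,f,g,h]^T$ (set $i = 1$, since $H$ is only defined up to scale)

Need at least 8 eqs (4 point correspondences, 2 eqs each)

Solve for h. In the homogeneous form $Ah = 0$ (all 9 entries), use the SVD: take the singular vector whose singular value is ≈ 0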
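As a rough illustration of the SVD route, the sketch below builds the homogeneous system from point pairs and reads the solution off the last right singular vector. The function names `estimate_homography` and `apply_homography` are made up for this example, coordinates are not normalized beforehand (a simplification), and the final division assumes $i \neq 0$.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 H with dst ~ H @ src from N >= 4 point pairs (illustrative sketch)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        # x' = (ax + by + c) / (gx + hy + i)  ->  ax + by + c - x'(gx + hy + i) = 0
        rows.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y, -xp])
        # y' = (dx + ey + f) / (gx + hy + i)  ->  dx + ey + f - y'(gx + hy + i) = 0
        rows.append([0, 0, 0, x, y, 1, -yp * x, -yp * y, -yp])
    A = np.array(rows)
    # The right singular vector with the smallest singular value solves A h = 0.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the free scale so that i = 1 (assumes i != 0)

def apply_homography(H, pts):
    """Map Nx2 points through H and divide by w."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

With more than 4 correspondences the same code returns a least-squares style fit, which is usually preferable when the points are noisy.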


didn't see

  • Automatic correspondences: Feature detector
  • Projection in a different space
  • Blending

parametric (global) warping

forward VS backwards

Backward mapping eliminates holes

Needs an invertible warp function: not always possible
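A minimal sketch of backward mapping for a homography warp, assuming a 2D grayscale image and nearest-neighbour sampling (real pipelines would typically interpolate). The name `backward_warp` is hypothetical. Every output pixel looks up its own source pixel, which is why no holes appear.

```python
import numpy as np

def backward_warp(src, H, out_shape):
    """Warp 2D image src with H (source -> destination) by looping over destination pixels."""
    h_out, w_out = out_shape
    H_inv = np.linalg.inv(H)  # this is where the warp must be invertible
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dst_pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src_pts = H_inv @ dst_pts
    # Nearest-neighbour lookup of the source coordinate of each destination pixel.
    sx = np.rint(src_pts[0] / src_pts[2]).astype(int)
    sy = np.rint(src_pts[1] / src_pts[2]).astype(int)
    inside = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    out = np.zeros(h_out * w_out, dtype=src.dtype)
    out[inside] = src[sy[inside], sx[inside]]  # pixels mapping outside src stay 0
    return out.reshape(h_out, w_out)
```

Forward mapping would instead push each source pixel to its destination, so destination pixels that nothing lands on would remain empty.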

non-parametric warping

  • Input correspondences at key feature points
  • Define a triangular mesh over the points
  • Same mesh in both images!
  • Warp each triangle separately from source to destination (affine)
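The per-triangle affine warp in the last step can be sketched as follows: three vertex correspondences give six equations, exactly enough for the six affine parameters. The helper names `affine_from_triangle` and `apply_affine` are illustrative, and the triangle is assumed to be non-degenerate.

```python
import numpy as np

def affine_from_triangle(src_tri, dst_tri):
    """Return the 2x3 affine matrix M that maps 3 source vertices onto 3 destination vertices."""
    src = np.hstack([np.asarray(src_tri, dtype=float), np.ones((3, 1))])  # rows [x, y, 1]
    dst = np.asarray(dst_tri, dtype=float)                                # rows [x', y']
    # Solve src @ X = dst for the 3x2 matrix X, then transpose to get M (2x3).
    return np.linalg.solve(src, dst).T

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to Nx2 points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ M[:, :2].T + M[:, 2]

# Usage: points inside the source triangle move consistently with its vertices.
M = affine_from_triangle([(0, 0), (1, 0), (0, 1)], [(2, 3), (4, 3), (2, 6)])
centre = apply_affine(M, [(0.5, 0.5)])
```

Because neighbouring triangles share vertices, the warps agree along shared edges and the result stays continuous across the mesh.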

internet data

Using billions of Internet images to...

  • Improve our photographs
  • tag
  • describe
  • fill holes
  • colorize
  • explore places
  • recognition
  • computer intelligence
  • learning to see


  • Computer vision problems are hard: more images can help
  • also can make things more complicated
  • Grand-goal: understand what is in images
  • data is unorganized
  • lots of industry applications: search, browsing, content-matched advertising
  • pics and video are fun! (and a huge part of content consumption)

internet as a data source

A.I. for the postmodern world:
all questions have already been answered… many times, in many ways
Google is dumb, the “intelligence” is in the data

Text is simple: clean, segmented, compact, 1D
Visual data is much harder: Noisy, unsegmented, high entropy, 2D/3D

what can we do with m(b)illions of images

Scene Completion Using Millions of Photographs. [Hays and Efros. 2007]

Texture synthesis


looking for semantic information

[Hays and Efros. 2007]

Compute oriented edge response at multiple scales (5 spatial scales, 6 orientations)

Gist scene descriptor (Oliva and Torralba 2001)

Color descriptor – color of the query image downsampled to 4x4

Find 200 closest neighbors in database

Texture – 5x5 median filter of image gradient magnitude at each pixel
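A minimal sketch of the colour part of the matching step, assuming an H x W x 3 image array and plain squared Euclidean distance (the distance and any weighting against the gist descriptor are assumptions, not taken from the paper). The names `color_descriptor` and `nearest_neighbors` are hypothetical.

```python
import numpy as np

def color_descriptor(img, grid=4):
    """Downsample an HxWx3 image to grid x grid by block averaging and flatten it."""
    h, w, _ = img.shape
    ys = np.linspace(0, h, grid + 1).astype(int)
    xs = np.linspace(0, w, grid + 1).astype(int)
    cells = [img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean(axis=(0, 1))
             for i in range(grid) for j in range(grid)]
    return np.concatenate(cells)  # length grid * grid * 3

def nearest_neighbors(query_desc, db_descs, k=200):
    """Return indices and squared distances of the k database descriptors closest to the query."""
    d = np.sum((np.asarray(db_descs) - np.asarray(query_desc)) ** 2, axis=1)
    order = np.argsort(d)[:k]
    return order, d[order]
```

A brute-force scan like this is linear in the database size; it is only meant to show the idea, not how one would search millions of images efficiently.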


  • The scene matching distance
  • The context matching distance (color + texture)
  • The graph cut cost


why does it work?

10 nearest neighbors, 20,000 image database

why does it work?

10 nearest neighbors, 2.3M image database



  • Simple way of describing images
  • Importance of having a lot of data
  • Using a number of close neighbours for robustness

80 Million Tiny Images. [Torralba et al. 2008]

53,464 nouns, 79 million images. Very low resolution (32x32)

  • want a minimal representation of images for the task: classifying scenes
  • compact representation
  • low-res 32x32 color images: fast for processing, low memory footprint
  • Humans can do recognition well at small scale
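To make the compact representation concrete, here is a sketch that block-averages an image to 32x32x3 (3072 values) and compares two tiny images with a sum of squared differences. Normalizing each image to zero mean and unit norm before comparing is an assumption of this sketch, and `to_tiny`, `normalize` and `ssd` are illustrative names.

```python
import numpy as np

def to_tiny(img, size=32):
    """Block-average an HxWx3 image to size x size x 3 (H and W must be multiples of size)."""
    h, w, c = img.shape
    if h % size or w % size:
        raise ValueError("height and width must be multiples of size")
    return img.reshape(size, h // size, size, w // size, c).mean(axis=(1, 3))

def normalize(tiny):
    """Flatten to a vector with zero mean and unit norm."""
    v = np.asarray(tiny, dtype=float).ravel()
    v = v - v.mean()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def ssd(a, b):
    """Sum of squared differences between two normalized tiny images."""
    return float(np.sum((normalize(a) - normalize(b)) ** 2))
```

With this normalization the distance ignores global brightness and contrast changes, so an image and a brighter copy of it compare as identical.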

similarity metric


siblings quality

What can we use it for?


label an image as containing a person or not
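One simple way to turn neighbours into a label is a majority vote over the k closest database images. This is a generic k-nearest-neighbour sketch, not necessarily the voting scheme used in the paper, and `knn_label` is a hypothetical helper.

```python
from collections import Counter

def knn_label(distances, labels, k=5):
    """Majority vote over the labels of the k nearest neighbours; returns (label, vote share)."""
    order = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    votes = Counter(labels[i] for i in order)
    label, count = votes.most_common(1)[0]
    return label, count / len(order)

# Usage with made-up distances and labels:
label, share = knn_label([0.1, 0.9, 0.2, 0.3, 0.8],
                         ["person", "none", "person", "none", "none"], k=3)
```

The vote share can serve as a crude confidence: with noisy Internet labels, a larger k tends to be more robust than trusting the single nearest image.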


colorize gray scale images


more later...





  • Can get good results with simple algorithms and lots of data
  • issues with internet data: image bias & labeling noise
  • many metrics...

IM2GPS: estimating geographic information from a single image. [Hays et al. 2008]


What can you say about where these photos were taken?


Collect a large collection of geo-tagged photos

6.5 million images with both GPS coordinates and geographic keywords, removing images with keywords like birthday, concert, abstract, …

Test set – 400 randomly sampled images from this collection. Manually removed abstract photos and photos with recognizable people – 237 test photos.


  • Tiny images – 16x16 color images
  • Color histograms in CIE Lab color space
  • Texton histograms – clustered responses to a bank of filters.
  • Line features – histogram describing statistics of straight lines in image
  • Gist descriptor + color
  • Geometric context (ground, sky, vertical)
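As an example of one feature from this list, the sketch below builds a joint colour histogram and compares two histograms with a chi-square distance. It assumes the image is already in CIE Lab with L in [0, 100] and a, b in [-128, 128]; the bin counts and the choice of chi-square are illustrative assumptions, and both function names are made up.

```python
import numpy as np

def color_histogram(lab_img, bins=(4, 8, 8)):
    """Joint 3D histogram of an HxWx3 Lab image, normalized to sum to 1."""
    pixels = np.asarray(lab_img, dtype=float).reshape(-1, 3)
    ranges = [(0, 100), (-128, 128), (-128, 128)]  # assumed Lab value ranges
    hist, _ = np.histogramdd(pixels, bins=bins, range=ranges)
    hist = hist.ravel()
    return hist / hist.sum()  # assumes at least one pixel falls inside the ranges

def chi_square(p, q, eps=1e-12):
    """Chi-square distance between two normalized histograms."""
    return 0.5 * float(np.sum((p - q) ** 2 / (p + q + eps)))
```

Each feature in the list would yield its own distance, and the per-feature distances can then be combined or compared, as in the "across features" results.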







across database size


across features

Data-driven visual similarity for cross-domain image matching. [Shrivastava et al. 2011]

why is it so hard?



what is unique?

what is unique given this world?

support vector machine (SVM)

per exemplar (SVM)

histogram of oriented gradients (HOG)
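A simplified sketch of a HOG-style descriptor: gradient orientations are binned per cell and weighted by gradient magnitude. It skips the overlapping block normalization of the full method and uses a single global L2 normalization instead; the cell size, bin count and the name `simple_hog` are assumptions for illustration.

```python
import numpy as np

def simple_hog(gray, cell=8, n_bins=9):
    """Per-cell histograms of unsigned gradient orientation, weighted by magnitude."""
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)  # derivatives along rows (y) and columns (x)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation in [0, 180)
    bin_idx = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
    n_cy, n_cx = gray.shape[0] // cell, gray.shape[1] // cell
    hog = np.zeros((n_cy, n_cx, n_bins))
    for cy in range(n_cy):
        for cx in range(n_cx):
            rows = slice(cy * cell, (cy + 1) * cell)
            cols = slice(cx * cell, (cx + 1) * cell)
            hog[cy, cx] = np.bincount(bin_idx[rows, cols].ravel(),
                                      weights=mag[rows, cols].ravel(),
                                      minlength=n_bins)
    v = hog.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v  # global L2 normalization (simplification)
```

Because it encodes edge layout rather than colour, a descriptor of this kind is a natural input for comparing photos with paintings or sketches.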

search using paintings

search using sketches



Where was the Painter Standing?




organizing data collections

other applications




semantic photo-synthesis

photo clip art

Convolutional Neural Networks (Deep Learning)

  • Learns the "descriptors" (convolutional kernels) from the data
  • needs very large datasets
  • still tricky to converge (but much easier now)
  • tends to work better than previous approaches
  • (because of more parameters and non-linearities)
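To show what a convolutional kernel followed by a non-linearity computes, here is a tiny NumPy sketch of one filter and a ReLU. In a real network the kernel values are learned from data; the hand-set edge kernel and the function names below are purely illustrative.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Slide the kernel over a 2D image (cross-correlation, as in CNN layers), no padding."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """The non-linearity: negative responses are clipped to zero."""
    return np.maximum(x, 0.0)

# Usage: a hand-set vertical-edge kernel applied to an image with one vertical edge.
image = np.zeros((5, 5))
image[:, 3:] = 1.0
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
feature_map = relu(conv2d_valid(image, edge_kernel))
```

A CNN stacks many such learned filters and non-linearities, which is where the large number of parameters comes from.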

classification (like tiny images)


  • Solves it with more data
  • Capsule Networks?
  • dangers of data


    • Internet is a great source of visual data...
    • but is not a random sample of the visual world
    • many sources of bias
      • Sampling Bias
      • Photographer Bias
      • Social Bias


    Flickr Paris

    Real (random) Paris

    photographer bias

    people follow photographic conventions

    social bias



    • amount of data is key
    • simple descriptors and simple metrics
    • deep learning is the future (and present)
    • computationally challenging: tiny images took 9 months to get the data
    • data is biased
    • pick your data to suit your problem


    credits, references, and additional readings

    These slides have been prepared with materials, slides, and discussions from the authors of the papers.