Carl Olsson
Mathematical Imaging Group at the Centre for Mathematical Sciences
Lund Institute of Technology, Lund University.
Interests: Mathematics in general, global optimization with applications in computer vision,
robotics and related areas, rock n' roll...
Below is an example of my research. Given a number images of an object try to determine the camera positions and the structure
of the viewed object:
More movies on my YouTube channel.
Teaching.
Multidimensional Calculus (fall 2009)
Computer Vision (spring 2011)
Computer Vision (spring 2012)
Computer Vision (spring 2013)
Computer Vision (spring 2014)
Linear Algebra (fall 2014)
Linear Algebra (fall 2015)
Linear Algebra (Spring 2018)
Data Sets.
I am in the process of collecting all my structure from motion data sets on
this page.
Publications.
PhDthesis: Global Optimization in Computer Vision: Convexity, Cuts and Approximation Algorithms
Olsson, Carl, 2009.
Licentiatethesis: Global Optimization and Approximation Algorithms in Computer Vision
Olsson, Carl, 2007.
Masterthesis: Shape Optimization for Incompressible Laminar Flows
Olsson, Carl, 2004.
Papers:
This paper considers the problem of finding a low rank matrix from observations of linear combinations of its elements.
It is well known that if the problem fulfills a restricted isometry property (RIP), convex relaxations using the nuclear norm typically work well and come with theoretical performance guarantees.
On the other hand these formulations suffer from a shrinking bias that can severely degrade the solution in the presence of noise.
In this theoretical paper we study an alternative nonconvex relaxation that in contrast to the nuclear norm does not penalize the leading singular values and thereby avoids this bias. We show that despite its nonconvexity the proposed formulation will in many cases have a single stationary point if a RIP holds. Our numerical tests show that our approach typically converges to a better solution than nuclear norm based alternatives even in cases when the RIP does not hold.
This paper considers the problem of recovering either a low rank matrix or a sparse vector from observations of linear combinations of the vector or matrix elements.
Recent methods replace the nonconvex regularization with $\ell_1$ or nuclear norm relaxations.
It is well known that this approach can be guaranteed to recover a near optimal solutions
if a so called restricted isometry property (RIP) holds.
On the other hand it is also known to perform soft thresholding which results in a shrinking bias which can degrade the solution.
In this paper we study an alternative nonconvex regularization term.
This formulation does not penalize elements that are larger than a certain threshold making it much less prone to small solutions.
Our main theoretical results show that if a RIP holds then the stationary points are often well separated, in the sense that their differences must be of high cardinality/rank.
Thus, with a suitable initial solution the approach is unlikely to fall into a bad local minima.
Our numerical tests show that the approach is likely to converge to a better solution than standard $\ell_1$/nuclearnorm relaxation even when starting from trivial initializations.
In many cases our results can also be used to verify global optimality of our method.
A convex envelope for the problem of finding the best approximation to a given matrix with a prescribed rank is constructed. This convex envelope allows the usage of traditional optimization techniques when additional constraints are added to the finite rank approximation problem. Expression for the dependence of the convex envelope on the singular values of the given matrix is derived and global minimization properties are derived. The corresponding proximity operator is also studied.
Traditional matrix factorization methods approximate high dimensional data with a low dimensional subspace. This imposes constraints on the matrix elements which allow for estimation of missing entries. A lower rank provides stronger constraints and makes estimation of the missing entries less ambiguous at the cost of measurement fit.
In this paper we propose a new factorization model that further constrains the matrix entries.
Our approach can be seen as a unification of traditional lowrank matrix factorization and the more recent unionofsubspace approach. It adaptively finds clusters that can be modeled with low dimensional local subspaces and simultaneously uses a global rank constraint to capture the overall scene interactions. For inference we use an energy that penalizes a tradeoff between data fit and degreesoffreedom of the resulting factorization. We show qualitatively and quantitatively that regularizing both local and global dynamics yields significantly improved missing data estimation.
One of the key challenges for structure from motion systems in order to make them robust to failure is the ability to handle outliers among the correspondences. In this paper we present two new algorithms that find the optimal solution in the presence of outliers when the camera undergoes a pure translation. The first algorithm has polynomialtime computational complexity, independently of the amount of outliers. The second algorithm does not offer such a theoretical complexity guarantee, but we demonstrate that it is magnitudes faster in practice. No random sampling approaches such as RANSAC are guaranteed to find an optimal solution, while our two methods do. We evaluate and compare the algorithms both on synthetic and real experiments. We also embed the algorithms in a larger system, where we optimize for the rotation angle as well (the rotation axis is measured by other means). The experiments show that for problems with a large amount of outliers, the RANSAC estimates may deteriorate compared to our optimal methods.
Low rank approximation is an important tool in many applications. Given an observed matrix with elements corrupted by Gaussian noise it is possible to find the best approximating matrix of a given rank through singular value decomposition. However, due to the nonconvexity of the formulation it is not possible to incorporate any additional knowledge of the sought matrix without resorting to heuristic optimization techniques. In this paper we propose a convex formulation that is more flexible in that it can be combined with any other convex constraints and penalty functions. The formulation uses the so called convex envelope, which is the provably best possible convex relaxation. We show that for a general class of problems the envelope can be efficiently computed and may in some cases even have a closed form expression. We test the algorithm on a number of real and synthetic data sets and show stateoftheart results.
In computer vision, many problems can be formulated as finding a low rank approximation of a given matrix. Ideally, if all elements of the measurement matrix are available, this is easily solved in the L2norm using factorization. However, in practice this is rarely the case. Lately, this problem has been addressed using different approaches, one is to replace the rank term by the convex nuclear norm, another is to derive the convex envelope of the rank term plus a data term. In the latter case, matrices are divided into submatrices and the envelope is computed for each subblock individually. In this paper a new convex envelope is derived which takes all submatrices into account simultaneously. This leads to a simpler formulation, using only one parameter to control the tradeof between rank and data fit, for applications where one seeks low rank approximations of multiple matrices with the same rank. We show in this paper how our general framework can be used for manifold denoising of several images at once, as well as just denoising one image. Experimental comparisons show that our method achieves results similar to stateoftheart approaches while being applicable for other problems such as linear shape model estimation.
Previous work on estimating the epipolar geometry of two views relies on being able to reliably match feature points based on appearance. In this paper, we go one step further and show that it is feasible to compute both the epipolar geometry and the correspondences at the same time based on geometry only. We do this in a globally optimal manner. Our approach is based on an efficient branch and bound technique in combination with bipartite matching to solve the correspondence problem. We rely on several recent works to obtain good bounding functions to battle the combinatorial explosion of possible matchings. It is experimentally demonstrated that more difficult cases can be handled and that more inlier correspondences can be obtained by being less restrictive in the matching phase.
This paper presents a simple and effective way of solving the robust subspace estimation problem where the corruptions are columnwise. The method we present can handle a large class of robust loss functions and is simple to implement. It is based on Iteratively Reweighted Least Squares (IRLS) and works in an iterative manner by solving a weighted leastsquares rankconstrained problem in every iteration. By considering the special case of columnwise loss functions, we show that each such surrogate problem admits a closed form solution.
Unlike many other approaches to subspace estimation, we make no relaxation of the lowrank constraint and our method is guaranteed to produce a subspace estimate with the correct dimension.
Subspace estimation is a core problem for several applications in computer vision. We empirically demonstrate the performance of our method and compare it to several other techniques for subspace estimation. Experimental results are given for both synthetic and real image data including the following applications: linear shape basis estimation, plane fitting and nonrigid structure from motion.
Many standard optimization methods for segmentation and reconstruction compute ML model estimates for appearance or geometry of segments, e.g. ZhuYuille 1996, Torr 1998, ChanVese 2001, GrabCut 2004, Delong et al. 2012. We observe that the standard likelihood term in these formulations corresponds to a generalized probabilistic Kmeans energy. In learning it is well known that this energy has a strong bias to clusters of equal size, which can be expressed as a penalty for KL divergence from a uniform distribution of cardinalities. However, this volumetric bias has been mostly ignored in computer vision. We demonstrate significant artifacts in standard segmentation and reconstruction methods due to this bias. Moreover, we propose binary and multilabel optimization techniques that either (a) remove this bias or (b) replace it by a KL divergence term for any given target volume distribution. Our general ideas apply to many continuous or discrete energy formulations in segmentation, stereo, and other reconstruction problems.
Outliers pose a problem in all real structure from motion systems. Due to the use of automatic matching methods one has to expect that a (sometimes very large) portion of the detected correspondences can be incorrect. In this paper we propose a method that estimates the relative translation between two cameras and simultaneously maximizes the number of inlier correspondences. Traditionally, outlier removal tasks have been addressed using RANSAC approaches. However, these are random in nature and offer no guarantees of finding a good solution. If the amount of mismatches is large, the approach becomes costly because of the need to evaluate a large number of random samples. In contrast, our approach is based on the branch and bound methodology which guarantees that an optimal solution will be found. While most optimal methods trade speed for optimality, the proposed algorithm has competitive running times on problem sizes well beyond what is common in practice. Experiments on both real and synthetic data show that the method outperforms stateoftheart alternatives, including RANSAC, in terms of solution quality. In addition, the approach is shown to be faster than RANSAC in settings with a large amount of outliers.
In this paper we consider the classical problem of finding a low rank approximation of
a given matrix. In a least squares sense a closed form solution is available via factorization. However, with additional constraints, or in the presence of missing data, the problem becomes much more difficult.
In this paper we show how to efficiently compute the convex envelopes of a class of rank minimization formulations. This opens up the possibility of adding additional convex constraints and functions to the minimization problem resulting in strong convex relaxations. We evaluate the framework on both real and synthetic data sets and demonstrate stateoftheart performance.
The problem of finding a low rank approximation of a given measurement matrix is of key interest in computer vision. If all the elements of the measurement matrix are available, the problem can be solved using factorization. However, in the case of missing data no satisfactory solution exists. Recent approaches replace the rank term with the weaker (but convex) nuclear norm. In this paper we show that this heuristic works poorly on problems where the locations of the missing entries are highly correlated and structured which is a common situation in many applications. Our main contribution is the derivation of a much stronger convex relaxation that takes into account not only the rank function but also the data. We propose an algorithm which uses this relaxation to solve the rank approximation problem on matrices where the given measurements can be organized into overlapping blocks without missing data. The algorithm is computationally efficient and we have applied it to several classical problems including structure from motion and linear shape basis estimation. We demonstrate on both real and synthetic data that it outperforms stateoftheart alternatives.
Stereo matching is an inherently difficult problem due to ambiguous and noisy texture. The nonconvexity and non differentiability makes local linear (or quadratic) approximations poor, thereby preventing the use of standard local descent methods. Therefore recent methods are predominantly based on discretization and/or random sampling of some class of approximating surfaces (e.g. planes). While these methods are very efficient in generating a rough surface estimate, via either fusion of proposals or label propagation, the end result is usually not as smooth as desired. In this paper we show that, if the objective function is decomposed correctly, local refinement of candidate solutions can be performed using an ADMM approach. This allows searching over more general function classes, thereby resulting in visually more appealing smooth surface estimations.
One of the major research areas in computer vision is scene reconstruction from image streams. The advent of RGBD cameras, such as the Microsoft Kinect, has lead to new possibilities for performing accurate and dense 3D reconstruction. There are already wellworking algorithms to acquire 3D models from depth sensors, both for large and small scale scenes. However, these methods often break down when the scene geometry is not so informative, for example, in the case of planar surfaces. Similarly, standard imagebased methods fail for textureless scenes. We combine both color and depth measurements from an RGBD sensor to simultaneously reconstruct both the camera motion and the scene geometry in a robust manner. Experiments on real data show that we can accurately reconstruct largescale 3D scenes despite many planar surfaces.
Energies with highorder nonsubmodular interactions have been shown to be very useful in vision due to their high modeling power. Optimization of such energies, however, is generally NPhard. A naive approach that works for small problem instances is exhaustive search, that is, enumera tion of all possible labelings of the underlying graph. We propose a general minimization approach for large graphs based on enumeration of labelings of certain small patches. This partial enumeration technique reduces complex high order energy formulations to pairwise Constraint Satisfac tion Problems with unary costs (uCSP), which can be ef ficiently solved using standard methods like TRWS. Our approach outperforms a number of existing stateoftheart algorithms on well known difficult problems (e.g. curvature regularization, stereo, deconvolution); it gives near global minimum and better speed. Our main application of interest is curvature regularization.
In the context of segmentation, our partial enu meration technique allows to evaluate curvature directly on small patches using a novel integral geometry approach.
In this paper we extend the 3Dlabel approach and demonstrate how to simultaneously fuse more than two proposals
at the same time for a 2nd order stereo regularizer. The optimization is
made possible by effectively computing a generalized distance transform. This
allows for computation of messages in linear time in the number of proposals, similar to the 1Dlabel case.
In addition the approach provides a lower bound on the globally optimal solution of
the multifusion problem. We verify experimentally that the lower bound is very
close to the computed solution, thus providing a near optimal solution.
It is commonly believed that higher order smoothness should be modeled using higher order interactions.
For example, 2nd order derivatives for deformable (active) contours are represented by triple cliques.
Similarly, the 2nd order regularization methods in stereo predominantly use MRF models with scalar (1D) disparity labels
and triple clique interactions. In this paper we advocate a largely overlooked alternative approach to stereo where 2nd order surface smoothness is represented by pairwise interactions with 3Dlabels, e.g. tangent planes.
This general paradigm has been criticized due to perceived computational complexity
of optimization in higherdimensional label space. Contrary to popular beliefs, we demonstrate
that representing 2nd order surface smoothness with 3D labels leads to simpler optimization
problems with (nearly) submodular pairwise interactions. Our theoretical and experimental
results demonstrate advantages over stateoftheart methods for 2nd order smoothness stereo.
Multiple rotation averaging is an important problem in computer vision. The problem is challenging because of the nonlinear constraints required to represent the set of rotations. To our knowledge no one has proposed any globally optimal solution for the case of simultaneous updates of the rotations. In this paper we propose a simple procedure based on Lagrangian duality that can be used to verify global optimality of a local solution, by solving a linear system of equations. We show experimentally on real and synthetic data that unless the noise levels are extremely high this procedure always generates the globally optimal solution.
We consider the leastsquares (L2) minimization problems in multiple view geometry for triangulation, homography, camera resectioning and structureandmotion with known rotatation, or known plane. Although optimal algorithms have been given for these problems under an Linfinity cost function, finding optimal leastsquares solutions to these problems is difficult, since the cost functions are not convex, and in the worst case may have multiple minima. Iterative methods can be used to find a good solution, but this may be a local minimum. This paper provides a method for verifying whether a localminimum solution is globally optimal, by providing a simple and rapid test involving the Hessian of the cost function. The basic idea is that by showing that the cost function is convex in a restricted but large enough neighbourhood, a sufficient condition for global optimality is obtained.
The method is tested on numerous problem instances of real data sets. In the vast majority of cases we are able to verify that the solutions are optimal, in particular, for small to mediumscale problems.
Many problems in computer vision can be turned into a largescale boolean optimization problem, which is in general NPhard. In this paper, we further develop one of the most successful approaches, namely roof duality, for approximately solving such problems for higherorder models. Two new methods that can be applied independently or in combination are investigated. The first one is based on constructing relaxations using generators of the submodular function cone. In the second method, it is shown that the roof dual bound can be applied in an iterated way in order to obtain a tighter relaxation. We also provide experimental results that demonstrate better performance with respect to the stateoftheart, both in terms of improved bounds and the number of optimally assigned variables.
Geometric reconstruction from image collections is a classical computer vision problem. The problem essentially consists of two steps; First, the identification of matches and assembling of point tracks, and second, multiple view geometry computations. In this paper we address the problem of constructing point tracks using graph theoretical algorithms. From standard descriptor matches between all pairs of images we construct a graph representing all image points and all possible matches. Using GomoryHu trees we make cuts in the graph to construct the individual point tracks. We present both theoretical and experimental results (on real datasets) that clearly demonstrates the benefits of using our approach.
Here we propose an energybased framework for approximating surfaces from a cloud of point measurements corrupted by noise and outliers. Our energy assigns a tangent plane to each (noisy) data point by minimizing the squared distances to the points and the irregularity of the surface implicitly defined by the tangent planes. In order to avoid the wellknown "shrinking" bias associated with firstorder surface regularization, we choose a robust smoothing term that approximates curvature of the underlying surface. In contrast to a number of recent publications estimating curvature using discrete (e.g. binary) labellings with triplecliques we use higherdimensional labels that allows modeling curvature with only pairwise interactions. Hence, many standard optimization algorithms (e.g. message passing, graph cut, etc) can minimize the proposed curvaturebased regularization functional. The accuracy of our approach for representing curvature is demonstrated by theoretical and empirical results on synthetic and real data sets from multiview reconstruction and stereo.
This paper points out a few benefits of using a nonsequential approach to structure from motion.
Prior work on multiview structure from motion is dominated by sequential approaches starting from a single twoview reconstruction, then adding new images one by one. In contrast, we propose a nonsequential methodology based on rotational consistency and robust estimation using convex optimization. The resulting system is more robust with respect to (i) unreliable twoview estimations caused by short baselines, (ii) repetitive scenes with locally consistent structures that are not consistent with the global geometry and (iii) loop closing as errors are not propagated in a sequential manner. Both theoretical justifications and experimental comparisons are given to support these claims.
This paper presents a nonincremental system for structure and motion.
Our solution is based on robustly computing global rotations from relative geometries and
feeding these into the knownrotation framework to create an initial solution for bundle adjustment.
To increase robustness we present a new method for constructing reliable point tracks from pairwise matches.
We show that our method can be seen as maximizing the reliability of a point track if the
quality of the weakest link in the track is used to evaluate reliability.
To estimate the final geometry we alternate between bundle adjustment and a robust version of the knownrotation formulation.
The ability to compute both structure and camera translations independent of initialization makes our algorithm insensitive
to degenerate epipolar geometries.
This paper investigates triangulation of a scene plane from two images.
It shows that when minimizaing backprojection error of the plane induced homography,
this problem is quasiconvex and can be solved using standard methods.
This report points out a few benefits of using a nonsequential approach to structure from motion.
In particular we show that nonsequential approaches are insensitive to geometries with small baselines.
This is in contrast to sequential methods that in each step rely on a well estimated structure to be able
to add new cameras.
This paper extends the Linfinity framework to problems in 1dimmensional vision when using angular errors.
We show that a number of 1D geometry problems are quasiconvex.
Using these results we also propose a branch and bound method for the full (biquasiconvex) structure and motion problem for 1Dvision.
Furthermore, we show that due to limited field of view the problem is often ill posed, and there exist multiple good
(in terms of reprojection error) local minima.
The paper builds on the preliminary results of this one.
Recent work on geometric vision problems has exploited convexity properties in order to obtain globally optimal solutions.
In this paper we give an overview of these developments and show the tight connections between different types of convexity
and optimality conditions for a large class of multiview geometry problems.
We also show how the convexity properties are closely linked to different types of optimization algorithms for computing the solutions.
This paper is based on some of the preliminary results in this one and this one.
This paper proposes two methods for outlier removal in structure for motion.
The first is an improvment of the algorithm presented by Sim & Hartley in this
paper.
We achive a speedup or roughly an order of magnitude with the same guarantees.
The second one is related to the popular approach of
L1approximation.
While this version is even more efficient in terms of computation times, it does not have the same guarantees as the previous one.
We show using duality that these methods are in fact related and we demonstrate on a number of large scale data sets that both methods
works very well.
This paper proposes a branch and bound method for solving optimization problems involving orthogonal constraints,
and is based on the two conference papers here and here.
Our first application is a generalization of the absolute orientation problem introduced in
Horn et.al. Rather than only registering two point clouds we also use pointtoplane and pointtoline correspondences.
The second application considered is calibrated camera pose estimation.
We show on both real and synthetic data that local minima do occur for this problem.
In this paper we study optimization problems with quasiconvex residuals under the L2norm.
We consider the same framework as is introduced in
KahlHartley2008. However, since we are not using the maxnorm, but rather the least squares formulation,
we cannot be sure that the resulting function is local minima free.
Rather than resorting to more complex global optimization methods, such as branch and bound,
we derive a simple test for verifying that a given local minima is indeed global.
We show on a number of real data sets that this approach is very efficient and in the vast majority of all cases
we are able to verify optimality.
This paper extends this one by considering 1dimmensional vision problems.
In this case it is more natural to use the sum of squared angular error.
We show that similar to the 2Dcase it is possible to derive a similar test to verify optimality.
Further more, we present a probabilistic method that can with high probability verify global optimality
should the former one fail.
In this paper we study how to fill in the missing data entries of a matrix under a low rank constraint.
The well known
factorization method can be used to solve for both structure and motion simultaneously
if the entire observation matrix is known. However, if some of the entries are missing these have to be
filled in using a rank constraint. Since the rank function is nonconvex we propose to instead choose the
missing data that minimizes the sum of the singular values, the nuclear norm, of the measurement matrix.
This approach is similar to the successful
L1relaxation
method that has successfully been applied to
compressed sensing and other related areas.
Solving Quadratically Constrained Geometrical Problems using Lagrangian Duality
Olsson, Carl; Eriksson, Anders P., International Conference on Pattern Recognition, 2008.
(bibtex)
This paper studies the generalized registration problem introduced here.
In particular we investigate the quality of the lower bound on the global optimum which is provided by
Lagrangian duality. Since the orthogonallity constraints are non convex we cannot expect that there is no duality gap.
However, we show that the obtained solution is always better than direct linear methods, and in practice close to
the optimum.
In this paper we study outlier removal problems for problems with pseudoconvex residuals.
We show that these problems can be solved in polynomial time under the maxnorm.
Speciffcally, if the dimmension of the search space is n (eg. triangulation n=3, homography estimation n=8) then
there is always a set of no more than n+1 residuals that determines the solution.
This paper studies how to effectively solve multiple view geometry problems under the maxnorm.
The framework is the same as that of
KahlHartley2008. We show that these problems are not only quasiconvex but pseudoconvex, which is
a stronger criteria. This opens up the possibility of using other more efficient algorithms than the bisection algorithm.
This paper extends the Linfinity framework to problems in 1dimmensional vision when using angular errors.
We show that a number of 1D geometry problems are quasiconvex.
Using these results we also propose a branch and bound method
for the full (biquasiconvex) structure and motion problem for 1Dvision.
The Registration Problem Revisited: Optimal Solutions From Points, Lines and Planes
Olsson, Carl; Kahl, Fredrik; Oskarsson, Magnus, IEEE Conf. on Computer Vision and Pattern Recognition, 2006.
(bibtex)
This paper considers a generalization of the absolute orientation problem introduced in
Horn et.al. Rather than only registering two point clouds we also use pointtoplane and pointtoline correspondences.
We propose a branch and bound method that efficiently solves this problem.
In this paper we present a branch and bound method for calibrated camera pose estimation.
We show on both real and synthetic data that local minima do occur for this problem.
This paper presents a reformulation of the normalized cuts criterion that allows incorporation of linear equality constraints.
This is done by restating the problem and showing how linear constraints can
be enforced exactly in the optimization scheme through duality.
Extending Continuous Cuts: Anisotropic Metrics and Expansion Moves
Olsson, Carl; Byröd, Martin; Overgaard, Niels Christian; Kahl, Fredrik, International Conference on Computer Vision, 2009.
Improved Spectral Relaxation Methods for Binary Quadratic Optimization Problems
Olsson, Carl; Eriksson, Anders P.; Kahl, Fredrik, Published in: Computer Vision Image Understanding, 2008.
(bibtex)
Normalized Cuts Revisited: A Reformulation for Segmentation with Linear Grouping Constraints
Eriksson, Anders P; Olsson, Carl; Kahl, Fredrik, International Conference on Computer Vision, 2007.
(bibtex)
Efficiently Solving the Fractional Trust Region Problem
Eriksson, Anders; Olsson, Carl; Kahl, Fredrik, Asian Conference on Computer Vision, 2007.
(bibtex)
Image Segmentation with Context
Eriksson, Anders; Olsson, Carl; Kahl, Fredrik, Scandinavian Conference on Image Analysis, 2007.
(bibtex)
Solving Large Scale Binary Quadratic Problems: Spectral Methods vs. Semidefinite Programming
Olsson, Carl; Eriksson, Anders; Kahl, Fredrik, IEEE Conf. on Computer Vision and Pattern Recognition, 2007.
(bibtex)
Address: Centre for Mathematical Sciences, Lund University, PO Box 118, 221 00 Lund, Sweden
Phone: +46 46 222 85 65, Fax: +46 46 222 40 10
Email: calle@maths.lth.se
