For course Y196, November 2000

Ilkka Korpela

University Of Helsinki

Department of Forest Resource Management

This paper discusses 3-d reconstruction of complex real world objects (surfaces) from (aerial) photographs and airborne laser scanning data. This paper is based on the article by Baltsavias (1999a).

**1. Introduction**

The possibility to store and analyze 3-D real world data in digital form has become reality in many fields of human activity. More detailed information and data about the environment we live in can now be sought, analyzed and presented as the digital data storage and manipulation capacity has increased and made available for many. Accurate data collected with cost-efficient methods is always useful in planning and decision making. For example, fifteen to twenty years ago, accurate 3-D photogrammetry required highly sophisticated and expensive equipment together with trained personnel for many of its routine tasks. Now, with digital imagery, at least the demand for the special equipment has changed dramatically. Technical advancements in laser and positioning technology have also provided us with new ways of looking at the world (Baltsavias 1999a, Kraus 1993).

Photogrammetry is based on the processing of images. The main products are digital terrain models (DTM), digital surface models (DSM), orthoimages, 2D and 3D reconstruction and classification of real world objects, and visualization (virtual models). Although many photogrammetric theories date back more than a century the use of digital photogrammetry in extracting information has undergone only development for the last 20 years. The degree of automatization and digitalization of the processing work is now quite high. The quality assessment of photogrammetric products is well established and the overall possibilities and limits of photogrammetric methods are commonly known (Baltsavias 1999a).

Laser is said to be one of the most prominent technological discoveries of the 20th century. It was quite recently introduced to the photogrammetric community and currently interest in airborne laser scanning (ALS) has strongly increased. The fact that ALS is still a new technology influences the costs, the maturity of the data processing methods and the imaging system integration, the number of system and service providers etc. (Baltsavias 1999a).

The major differences between photogrammetry and ALS can be listed as follows: i) passive vs. active, high-power, collimated and monochromatic sensing, ii) frame or linear sensors vs. point sensors, iii) perspective imaging geometry vs. polar geometry, iv) full area coverage vs. pointwise sampling, v) indirect vs. direct acquisition or encoding of 3D coordinates, vi) geometrically and radiometrically high quality images vs. no imaging or monochromatic images of inferior quality, vii) mature technology and error modeling vs. both being under development, viii) desktop systems vs. strong dependence on the system and data provider. Many more differences can be seen as consequences of those mentioned above (Baltsavias 1999a).

Common aspects between ALS and image matching for DSM/DTM production
are: i) use of GPS/INS, ii) methods for the processing of raw (X,Y,Z) -data
such as filtering of outliers, DTM generation from DSM, detection of elevation
breaklines etc, and iii) image analysis/processing methods when ALS-data
is regularly interpolated into an image (Baltsavias 1999a).

**Picture 1. **Work phases
in DSM production. Surface approximation is where the production lines
meet. (POS = passive optical sensing ~ photogrammetric DSM production)

**2. The task of producing a DEM/DSM/DTM**

Deriving or reconstructing a 3-D or a 2½-D surface from 2-D projections or ranging data is sometimes called 3D scanning. As a semi-product there is always a 3D point-cloud which may or may not be ordered in some sense like the consecutive points in laser data. There are many ways to create such point-cloud data from real world objects and in the context of this paper the following are of importance: i) registration of control (correspondence ~, object ~) points in multiple views, ii) laser scanning using one or multiple views for objects of more complicated topological type (Hoppe 1994).

The somewhat irrational concept of a 2½-dimensional* surface
model* comes from the idea that it can be assumed that for each XY-position
there exists only one Z-coordinate in a model of a three-dimensional object
(Kraus 1997 p. 357). This type of model has clear advantages for being
so straightforward in its definition. In terms of surface parameterization
this type of model is rewarding. XY-plane is a firm base and a unique solution
in terms of acquiring three-dimensional coordinates for any point is guaranteed
by the definition of the representation. Limitations are also obvious.
A 2½-D model will not suffice if complex three-dimensional objects
need to be represented. An example could be a three-dimensional model of
a broad-leaved tree for visualization purposes for which a 2½-dimensional
representation would fail. On the other hand a representation for the surface
of the soil could be done using a 2½-dimensional model

In this paper the term DSM is used for the kind of 2½-dimensional representation where for a unique point in the XY-plane there is unique Z-value or a single limited range of Z-values representing a breakline in the surface model. A digital terrain model, DTM is usually considered to be a refinement or (including) a subset of a DSM. It can be said that a DSM is filtered to produce a DTM or a DEM. Objects and surface variability are removed in this filtering process.

The problem of deriving a DSM from measurements could be defined as
follows: Given a set of sample points **X Î
Â ^{3}**, assumed to lie on or
near an unknown surface

It must be remembered that the measurements in **X** are never free
from measurement errors and that often the set **X** contains outliers.
An example of a case when outliers are present in **X** is an ALS dataset
with mixed measurements for both the vegetation surface and ground surface.
To identify and delete an outlier is problematic and wrong identifications
easily lead to biased surface estimates.

The set of surface models {**S**_{j}} is in practice limitless
and many different surface models can be equally good approximations for
the true surface (data points). This makes surface approximation an ill-posed
problem. Usually the functional form of the surface model is fixed and
the approximation task is reduced to parameter-estimation or is changed
from a variation calculus problem into a regression problem. The Cartesian
[a,b] x [c,d] Î Â^{2}
grid or arbitrary area A Î Â^{2}
for which the surface model is constructed can be segmented into an irregular
or regular grid (graph) to create a 'piecewise' model - or the model can
be global. Different models take into account breaklines or irregularities
in the surface in different ways, some models produce a more smooth or
geometrically more continuous surface than others, the data structures
and computational complexity vary, etc. The existence of breaklines in
the surface and the ability of the surface modeling method to recognize
and represent them is a fundamental aspect when different methods are rated
for their performance.

**___________________________________________________________________________________**

**Some 2½-D and 3-D surface representations**

**Polygonal meshes and TINs**

Planar polygons, also knows as planar facets or faces, can be used to model complex 3D-objects. A set of surface polygons encloses the object interior. For example the surface features of a polyhedron are precisely defined by a set of surface polygons. For other objects surfaces need to be tessellated differently to produce the polygon-mesh approximation. For example a cone-shaped tree crown could be made up of triangular facets that approximate the surface of the cone. Geometric data is stored in vertex, edge and polygon-surface tables. Polygonal meshes are useful for example when the parametric form for a complex 3D surface cannot be found. An irregular network of triangles is a 2½-dimensional polygonal mesh that is often used to model topography. When the value for Z is to be approximated at a location (x, y) it is given by a planar equation (ax + by + c = 0) for the triangle that encloses the planar point. In this representation breaklines present a problem. The graph can be made locally very dense to diminish the effect of breaklines but that comes at the cost of extra storage and computing effort. To overcome the problem of breaklines or discontinuity in the Z-representation the graph is broken into a 'piecewise' structure, where the breaklines are conditioned to follow the edges of the 'piecewise' presentation (Jain, Kasturi & Schunk p.374-378).

**Surface patches**

Polynomial surface patches are good for modeling portions of a surface, such as the neighborhood around a point. They are not convenient for modeling an entire surface and can be applied on graph surfaces only. The patches are bivariate polynomials: bilinear, biquadratic, bicubic, biquartic etc. (Jain, Kasturi & Schunk p.378-379)

**Tensor product surfaces**

A complex 2½-D surface can be represented as a sequence of different functions: polynomials, B-splines, trigonometric functions etc. B-splines are commonly used because of their geometric continuity properties. (Jain, Kasturi & Schunk p.380)

_____________________________________________________________________

The quality assessment of a DSM is partly an application specific task.
A DSM that is used for the production of a large scale orthophoto has to
meet different criteria from a DSM that is used as a basis for a DEM to
be used for flood-protection. The density of the 3D-point cloud affects
the quality of the DSM. A high sampling density makes it possible for the
surface model to follow abrupt changes in the surface (any high frequency
components). Random errors in measurements (in point locations, s_{XY},
s_{Z}) on the other hand need to be
smoothed and therefore the high frequency components are often missing
from a modeled surface.

**3. Methods**

**3.1. Airborne laser scanning**

A typical airborne laser ranging system operates in the near-infrared band where the wavelength is 1040-1060 nm. Typically the pulses are short, 10 ns (10 kHz) in duration and of medium to high power. The beam divergence is ca. 1 mrad. Some systems allow the recording of multiple echoes from one laser pulse, for example first and last, or even additional ones at regular intervals in-between. Besides the range information intensity information is also recorded. The position of the aircraft is estimated with GPS or eq. and the attitude of the laser gun with INS. With this orientation information the echoes can be given real world coordinates and they convert directly into geocoded /referenced point information. Any errors in this orientation result in location errors for the points (e.g. Baltsavias 1999b p.207-212) (see picture 2a for illustration). The outer orientation of the laser system is a function of time and this sets high demands on the positioning system. Typically the field of view (FOV) of ranging laser is less than 30 degrees so the maximum strip width is from 100 to 500 meters for typical flying heights ranging from 200 to 1000 meters. The points do not form a regular grid on the ground surface and the concept of point is vague because of the beam divergence. The projection area of the beam on the target, also called the laser footprint, can be several square meters for high ranging altitudes which deteriorates the geometric and radiometric properties of the point data (see picture 3).

**Picture 2a. **Errors
in the GPS/INS system are one source of LOCATION inaccuracy for the measured
points.

**Example** of location errors for the points (see picture 2a for
illustration): Assume the laser location (X0,Y0,Z0) to be free of errors,
an average flying altitude of 500 meters and the INS errors for the angles
Ð**a** and Ð**b**
to be Gaussian distributed with 0.1 degree (= 0.001745 rad.) standard deviations.
The error of the laser range reading (**r**) is expected to be normally
distributed with zero mean and a standard deviation of 0.1 % of the flying
altitude.

Point location (X,Y,Z) is given by the set of equations:

**Picture 2b**. Simulated
point clouds with erroneous INS (Ð a, Ð b) and range observations
(r). Left FOV 10 degrees, right FOV 30 degrees. Angle b (see picture 2a)
has a mean of +5 degrees in both cases. Flying altitude 500 meters, point
was assumed to be at Z=0 level.

**Picture 3**. The flying
altitude affects the size of the footprint. In forest conditions multiple
echoes can be registered. To classify points into ground and vegetation
points is a task that is needed for DTM production.

Laser, being an active sensor, can theoretically be used for 24 hours a day. Illumination shadows that are cast by objects do not exist, only occlusion by other objects. For DTM generation, data acquisition is better when trees have no leaves, whereby flights can even take place in winter with a thin snow cover. The dynamic range of object reflections is high for monochromatic laser light so that the signal often saturates and on the other hand some objects may remain 'invisible' because of their low reflectance in the laser wavelength region (Baltsavias 1999a, Krauss & Pfeifer 1998).

To plan an ALS flight is a demanding task. For example tie points that
are used for strip adjustment and control points at the block borders need
to be well-defined 3D-structures. Height control points are replaced by
flat height control areas that are measured for height with for example
GPS. They need to be large enough to compensate the possible identification
error. The risk for failures in the scanning process is quite high. On
the other hand a successful flight delivers the data (unclassified point
set **X**) fast and directly geocoded in an object coordinate system.

**3.2 Photogrammetric data acquisition**

Central elements of photogrammetry that make it different from ALS are: the data is obtained with a passive frame (only frame sensor or metric aerial cameras and aerial photogrammetry are treated here onwards) sensor; the geometric transformation is determined by central projection and perspective geometry; the inner orientation of metric aerial cameras is usually well established (stable bundle of rays) and the images are of high geometric and radiometric quality; the objects that are detectable on these images are usually smaller than in the case of ALS; 3D coordinates are obtained indirectly; and the error theory of 3D-measurements is well established for photogrammetry.

Methods of analytical and digital photogrammetry have lead to an increase in the degree of automation of photogrammetry. i) Automated photogrammetric feature (point) measurement, ii) automated solutions of the stereo correspondence problems (image matching) and iii) the accuracy of such stereoscopic data capture are the key issues in automated DSM-production using photogrammetric data.

The definition given earlier said that a DSM is understood to be surface
model **S** approximating the true surface **U** from which we have
point samples **XÎ Â ^{3
}**containing errors. To obtain the set

**Picture 4. **In
photogrammetry the points in X have to be "matchable" distinctive geometric
features visible on more than one image.

Automated surface modeling or the reconstruction of geometric shapes of objects from digital images can be done by applying several image-matching methods. Different problems call for different solutions (Kraus 1993, p. 367). The main branches of image matching techniques are:

i) feature-based matching methods,

ii) signal/area based methods and

iii) relational methods.

The trouble with all these methods is in the nature of the correspondence problem: it is not solvable in the general case at all. Self-occlusion of images of non-convex objects and uniform intensity of large areas in the images make the correspondence problem inherently ambiguous. Constraints must be used to reduce the inherent ambiguity of the stereo correspondence problem.

Following constraints can be used: epipolar, uniqueness, photometric compatibility, geometric similarity, disparity smoothness, figural disparity, feature compatibility, disparity limit, ordering and mutual correspondence constraint (Sonka et al p. 477).

Epipolar constraint which says that the corresponding point can only lie on the epipolar line in the second image reduces the potential 2D search space into 1D (see picture 4). Uniqueness constraint stating that in most cases a pixel from the first image can correspond to only one pixel in the second image does often not hold true because of the geometric differences between images caused by perspective geometry. For example in the case of forest canopy the ordering constraint cannot be used, because surfaces (treetops) are usually not on the same depth. Perspective geometry is the cause for geometric differences between the images used for matching. These differences are modeled with a geometric transformation in the matching process. This enables the use of some the constraints listed above. Usually the two or more images used for matching also have radiometric differences between them that need to be modeled to enable the use of the photometric compatibility constraint.

**Correlation-based stereo correspondence**

Correlation-based algorithms use the assumption that pixels in correspondence have very similar intensities (photometric compatibility constraint). The intensity of a single pixel does not give sufficient information and thus intensities of several neighboring pixels in a window are considered. Correlation-based methods are therefore also called area-based or signal-based (Sonka et al p. 480).

Solving the stereo (3D) correspondence problem from two digital images
is possible when the orientation of images is known. Then it is possible
to apply one-dimensional correlation by means of epipolar geometry in the
two original images. In this method there is the *target image* and
the corresponding *search area* in the second image. The method works
only if the target image is large enough to include at least one more or
less well defined edge in the density profile, because such edges are essential
for accurate correlation (Kraus 1993, p. 367).

In correlation based methods hierarchical multi-level or multi-resolution matching is often used. The pyramid method is very efficient in finding approximate values. The results at a coarser level provide the start point for correlation at a finer level (Kraus, 1993. p. 374).

**An example of one correlation-based method: The vertical line locus
-method**

It is possible to perform the image correlation (matching) and the object formation in one process without having the matched 3D points as a sub-phase in the surface construction work. One such method that achieves this double operation is known as Vertical Line Locus (VLL) correlation. VLL can be applied after the orientation of the two (or more) images. In the VLL-method the first phase is to define the XY-locations for which Z is to be estimated. This can be a regular grid (graph). For each XY-location, collinearity equations can be used to locate areas corresponding to different Z-values on both images. These are made into windows on each image and homologous windows are found by finding the pair that maximizes the correlation. It defines the required XYZ-point. Geometric and radiometric corrections can be taken into account and the correlation equation is extended to include the additional unknowns of the geometric (plane) or radiometric transformations (Kraus 1993, p. 376).

**Picture 5.** In the
VLL-method XY-coordinates are first specified. A series of equidistant
planes is established for each xy-location. homologous windows are found
by means of image correlation.

**Feature-based stereo correspondence**

The quality of correlation depends on the texture of the image content
and therefore it may be advisable to select interesting regions of the
image first. So-called interest operators are suitable for this purpose.
The demands on the points **X** from the surface **U** of the 3D
object stated in chapter 2.2.1. are met by the interest operator. Demands
for the interest operator providing candidate features to be matched are:
features should be distinguishable; they should be invariant for as many
radiometric and geometric transformations as possible; features should
be stable in the sense that the probability to find the same feature from
the second image with the operator is high. In general feature-based methods
are less ambiguous since the number of potential candidates for correspondence
is smaller. There are fewer candidates to be checked or verified.

**3.3. Calculating 3D coordinates for the corresponding points**

The 3D-coordinates for the points (or other conjugate entities) that
are found on two or more images are obtained analytically using the collinearity
condition and spatial intersection principle. Accuracy of the point (feature)
locations (s_{XY}, s_{Z})
is affected among other things by the accuracy of the orientation of the
images (imaging geometry), the accuracy of the feature locations (measurement
noise) on the images (possible subpixel accuracy) and the flying altitude
or object distance in aerial photogrammetry (imaging geometry). These 3D
locations then form the data (set **X**) for the DSM estimation to follow.

**4. DSM (DTM) production from 3D-point clouds**

**4.1 ALS data**

Kraus and Pfeifer (1998) give a description for an iterative algorithm for computing a high-quality DTM by eliminating the non-terrain points from ALS data. At first an average surface is calculated based on all points, non-terrain and terrain. Residuals from this approximating surface are weighted so that the terrain points which are more likely to have negative residuals get the highest weight and large positive residuals (tree tops, branches) get zero weight. On the second and later iterations the terrain points 'attract' the computed surface. Kraus and Pfeifer conclude that the geomorphologic quality of the model is low: along valleys the DTM have artificial depressions and the natural breaklines do not exist. Manual editing is often needed. This calls for POS-images taken for the same locations.

**Picture 6.** Producing
a DTM involves a classification problem. Outliers or points not belonging
to the true surface U should be detected.

**4.2 Photogrammetric data**

A simple and tedious way of making a DTM with POS data is the method of following and vectoring contour lines manually using a digital (or analytic) stereo workstation. Field visits are needed where occlusion occurs. To automate DTM production from photogrammetric data requires that the matched points represent terrain. In non-forest this can be accomplished, but expensive fieldwork is needed for forest areas. Matching methods produce and measure a DSM (Baltsavias 1999 p. 89). Automatic aerial triangulation methods exist although they should rather be called semi-automatic because some work phases still need an operator (Heikkilä 2000).

**5. Conclusions**

ALS has certain advantages over photogrammetric data capture, which together with the fact that it is a developing field make it promising. ALS can be used in the mapping of surfaces that have very little texture. Badly definable surfaces like the forest canopy are measurable (volumetric) with ALS. Even in dense forest conditions the laser signal penetrates the vegetation layer and pulses from the terrain are captured. In automated or manual image matching dense forest represents an area where the POS captures no information from the terrain. Fieldwork in such conditions is also very cumbersome. ALS is more cost efficient in the 3D-mapping of long, narrow features. These include roads, powerline corridors, coastlines, waterways etc. In urban areas ALS can be made to provide dense and accurate measurements (low flying altitude, kinematic GPS) on which the modeling of 3D objects with sharp discontinuities like buildings can be based. Such a DSM can be used for example to plan the locations & orientation of communication antennas. Small objects like powerlines are visible to the laser. ALS has fast response. The information can quickly be converted into 3D coordinates, which can be of importance in some cases. Accuracy issues together with the classification and identification of points and objects are areas where ALS-technology needs to be improved and where POS still can compete with ALS. Other techniques like the interferometric SAR must also be kept in mind when it is to be decided what data capture is used to solve a specific 3D-modeling (DSM) task. Algorithm developments in aerial photogrammetry have been rather slow recently, and research is mainly focusing on the automation of feature extraction. Further advancements in DTM generation are said to be lacking or are very slow. ALS is to stay in photogrammetry. It has an overlap with existing photogrammetric processes; it is competing with them and will eventually, at least partly, replace them. An integration of ALS with digital POS may open up new approaches in the whole photogrammetric production chain (Baltsavias 1999).

**List of references**

Axelsson, P. 1999. *Processing of laser scanner data - algorithms
and applications*. ISPRS. Journal of Photogrammetry & Remote Sensing
54 (1999) pp. 138-147.

Baltsavias, E.P. 1999a. *A comparison between photogrammetry and laser
scanning*. ISPRS. Journal of Photogrammetry & Remote Sensing 54
(1999) pp. 83-94.

Baltsavias, E.P. 1999b. *Airborne laser scanning: basic relations
and formulas*. ISPRS. Journal of Photogrammetry & Remote Sensing
54 (1999) pp. 199-214.

Heikkilä, J. 2000. *Digitaalinen ilmakolmiointi.* Luentokalvot
Maa57.306.

Hoppe, H. 1994. *Surface Reconstruction from Unorganized Points*.
PhD Thesis. University of

Washington. 116 p.

Inkilä, K. 1999. *Digitaalisten kuvien yhteensovitus*. Luentomoniste.
Teknillinen korkeakoulu. Fotogrammetrian ja kaukokartoituksen laboratorio.
(*Matching of digital images*. Lecture Notes. Helsinki University
of Technology. Laboratory of Photogrammetry and Remote sensing.)

Jain, R., Kasturi,. R., Schunck, B. 1995. *Machine vision.* McGraw-Hill.
549 p.

Kraus, K. 1993. *Photogrammetry*. Vol 1. Fundamentals and Standard
Processes. Dummler/Bomm. 397 p.

Kraus, K., Pfeifer, N. 1998. *Determination of terrain models in wooded
areas with airborne laser scanner data.* ISPRS. Journal of Photogrammetry
& Remote Sensing 53 (1998) pp. 193-203.

Sonka, M., Hlavac, V., Boyle, R. 1999. Image processing, Analysis and Machine Vision. 2nd edition. PWS Publishing. 770 p.