Agricultura 14: No 1-2: 1-7 (2017)
DOI: 10.1515/agricultura-2017-0012

Estimating the size of plants by using two parallel views

Barbara VIDEC, Jurij RAKUN
Faculty of Agriculture and Life Sciences, Pivola 10, 2311 Hoce, Slovenia
*Correspondence to: E-mail: jurij.rakun@um.si

ABSTRACT

This paper presents a method of estimating the size of plants by using two parallel views of the scene, taken by a common digital camera. The approach relies on the principle of similar triangles, with the following constraints: the resolution of the camera is known, the object is always parallel to the camera sensor, and the intermediate distance between the two consecutive images is available. The approach was first calibrated and tested using an artificial object in a controlled environment. After that, real examples were taken from agriculture, where we measured the distance to and the size of a vine plant, an apple tree and a pear tree. By comparing the calculated values to the measured values, we concluded that the average absolute error in distance was 0.11 m, or around 3.7 %, and the average absolute error in height was 0.09 m, or 4.6 %.

Key words: digital image processing, size, digital camera, pixels, similar triangles

INTRODUCTION

In agriculture, as well as in other areas, it is important to know the size of the objects we are observing (Stajnko et al. 2004, Ehlert et al. 2009, Marcon et al. 2011). It is not always possible to measure them with conventional tools, such as a tape measure, and special equipment, such as laser range finders, is rarely available. The one piece of equipment that is almost always with us is a mobile telephone with an integrated digital camera. So why not measure distances with its help?

Such readings are used when estimating the size of trees to calculate biomass (Ehlert et al. 2009), the leaf area to estimate the productivity of a crop (Marcon et al. 2011), or the size of fruit (Stajnko et al. 2004) to make a prognosis of the yield at the end of the harvest. Biomass is usually estimated by measuring the diameter of the tree trunk and assessing the height with the help of a trained eye. The usual way of measuring leaf area involves destructive steps, as the leaves have to be removed from the tree and put inside a scanning device; Marcon et al., on the other hand, suggest a less accurate but non-destructive computer-vision-supported approach. In the last example, the fruit size is measured with a simple calliper. In combination with the number of fruit on a tree, this is important to know while the fruit is still in its developing stages, in order to make a prognosis of the yield at the end of the harvest, prepare enough storage, find potential buyers and set the right price.

When making snapshots, the digital camera takes an image of the scene and transforms it to pixel or spatial space (Gonzales et al. 2008), where the metric information is lost. If we take an image at closer range, the object appears bigger than if we take it from further away. By taking a closer look at the images, we see that the information is stored as a set of colour or grey pixels that describe the scene as well as the object. By comparing two such images of the same scene, taken at different distances, we see that the ratios are preserved (Videc 2015). This is the key property that can help to reconstruct the metric information otherwise lost when capturing images.

This paper is organised as follows. The second section presents the mathematical background on how to construct a spatial-to-metric transformation. Section three then evaluates the approach, first using examples from a controlled environment in order to calculate the necessary parameters, and then presenting three examples from agriculture, used to estimate the distance to an object as well as its size. Section four concludes the paper by suggesting some possible future improvements.

MATERIALS AND METHODS

In order to estimate the metrics of an observed object, at least two digital snapshots of the scene must be captured from two different distances. The position of the sensor must be parallel to the object, with a known intermediate distance between the two capturing steps and a known resolution of the camera. With these three constraints a pixel-to-metric transformation can be made when analysing the two images.

The pixel-metric space transformation is based on the use of similar triangles (Burger et al. 2009). Two triangles are similar if the ratio of their sides is the same and all their corresponding angles are equal. This relation can be applied to the transformation if one side of the triangle is taken as the distance between the object and the sensor and another as the height of the object. By using two triangles from the two images and comparing the known intermediate distance with the difference in pixel height, the rest of the metric information can be computed (Videc 2015). Fig. 1 depicts the two similar triangles, with e1 as the pixel height of the object observed at distance z1 and e2 as the pixel height observed at distance z2. Δz is the known intermediate distance between the two capturing steps.

Fig. 1: Two similar triangles.

Distance between the camera and the object

The triangular symmetries from Fig. 1 can be summarized by Eq. (1) as:

\frac{e_1}{e_2} = \frac{z_2}{z_1} \quad (1)

The left side of Eq. (1) defines a ratio that can represent either metric (real world) or pixel (spatial domain) distances. As the first is unknown at this point, the simple Euclidean pixel distance (Gonzales et al. 2008) is used for e1 and e2, measured from the image pair. If z1 is the unknown distance, then z2 can be written as the sum of z1 and the intermediate distance Δz, which allows z1 to be expressed as shown by Eq. (2) and Eq. (3):

z_2 = z_1 + \Delta z \quad (2)

z_1 = \frac{\Delta z}{e_1 / e_2 - 1} \quad (3)

Once z1 is known, computing z2 is a straightforward step. The height of the object, however, is defined as a Euclidean pixel distance and requires an additional step of determining the pixel-metric relation at the calculated distance, for which the viewing angle of the camera must first be known.
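As a minimal illustration (not part of the original experiment), Eq. (3) can be implemented in a few lines of Python; the function name is ours, and the sample readings are the ones later listed in Tab. 2:

```python
def distance_from_two_views(e1: float, e2: float, dz: float) -> float:
    """Return z1, the camera-object distance at the first capture (Eq. 3).

    e1 and e2 are the Euclidean pixel heights of the same object on two
    images taken dz metres apart, the first image closer to the object.
    """
    if e1 <= e2:
        raise ValueError("e1 must be larger than e2: the first image "
                         "must be taken closer to the object")
    return dz / (e1 / e2 - 1.0)

# Readings from Tab. 2 for the 0.50 m and 0.75 m positions (dz = 0.25 m):
print(distance_from_two_views(1344, 896, 0.25))  # -> 0.5
```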
Viewing angle

The viewing angle, or angle of view, of a camera can be defined with the help of the border points of the scene that lie at opposite sides and are still visible on the image. A third point, the origin, is the imaginary point where the camera is located. All three define the viewing angle, i.e. the extent of the scene that is visible to the camera. The horizontal and vertical viewing angles of a camera are usually different and depend on the lens.

If the viewing angles of the camera are unknown, they can be measured and calculated. For instance, if an object of known size spans from one edge of the image to the other and the distance to the object is known, simple trigonometric equations can be used to compute them. Fig. 2 depicts the principle and Eq. (4) and Eq. (5) give the mathematical background.

Fig. 2: Viewing angle of a camera.

If h from Fig. 2 is the width of the object and z the distance from the camera, then the angle α can be computed as follows:

\tan\frac{\alpha}{2} = \frac{h}{2 z} \quad (4)

\alpha = 2 \cdot \arctan\frac{h}{2 z} \quad (5)
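Since Eq. (5) is the basis of the calibration in section 3, a small sketch may help; the function name is ours, and the check value is the first row of Tab. 1:

```python
import math

def viewing_angle(h: float, z: float) -> float:
    """Return the viewing angle in degrees (Eq. 5).

    h is the metric size of an object that spans the whole image,
    z the known distance between the camera and the object.
    """
    return math.degrees(2.0 * math.atan(h / (2.0 * z)))

# First row of Tab. 1: a 0.25 m object spanning the image at 0.25 m.
print(round(viewing_angle(0.25, 0.25), 2))  # -> 53.13
```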
Size of an object

If the distance between the camera and the object and the viewing angle are known, or computed as shown in subsections 2.1 and 2.2, respectively, then the size of an object can be computed using the spatial-metric transformation. The spatial domain is built from pixels, and their number depends on the resolution of the sensor used in the camera. Each pixel describes a part of the scene, but its size in metric units depends on the distance. The first step is therefore to calculate the height of the area that is summarized by one pixel at a given distance. The corresponding metric size of this area is defined by Eq. (6):

\Delta y = \frac{2 z \tan(\alpha / 2)}{r_y} \quad (6)

with z as the calculated distance, r_y the resolution of the image on the y axis and Δy the height of the area covered by one pixel at that distance in metric space. In the second and final step, the number of pixels, or Euclidean pixel distance, between the two farthest points of the object (d) and the metric size of one pixel at the given distance (Δy) are used to compute the height of the object. This is done with a simple multiplication, as shown by Eq. (7), with s_h as the height:

s_h = d \cdot \Delta y \quad (7)

The same approach can be used to calculate the width of an object, but using a different resolution constant that corresponds to the resolution of the sensor on the x axis.
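A short sketch of Eq. (6) and Eq. (7) follows; the numerator of Eq. (6) is our reconstruction of the scene height at distance z from the viewing angle, and the function name is ours:

```python
import math

def object_height(d: float, z: float, alpha_deg: float, r_y: int) -> float:
    """Return the object height s_h in metres (Eq. 6 and Eq. 7)."""
    # Eq. (6): metric height covered by one pixel at distance z.
    dy = 2.0 * z * math.tan(math.radians(alpha_deg) / 2.0) / r_y
    # Eq. (7): Euclidean pixel height times per-pixel metric height.
    return d * dy
```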
RESULTS

The results were captured in two steps. The first step was taken to calculate the necessary parameters from subsections 2.2 and 2.3, with the additional goal of verifying the approach on a static object in an indoor environment. An indoor environment is better suited to test the approach, as it is not affected by the changing conditions that occur in nature, e.g. over- and under-illuminated scenes, movement caused by wind, etc. The second step used three real-world scenarios, where digital images were used to estimate the heights of an apple tree and a pear tree, as well as a vine plant.

Test phase - controlled environment

The test phase started with the search for the camera viewing angle (α from Eq. (5)). For this step an object of known size was used, and its height was observed on images taken from different distances: from 0.25 m to 2 m in 0.25 m steps. For each distance an angle was calculated according to Eq. (5), and the values were averaged in order to minimize the errors caused by human error. The results are summarized in Tab. 1.

Table 1: Viewing angle at different distances from the object.

Distance from the object [m]  Height of the object at given distance [m]  Viewing angle [°]
0.25                          0.25                                         53.13
0.50                          0.49                                         52.21
0.75                          0.72                                         51.28
1.00                          0.96                                         51.28
1.25                          1.19                                         50.91
1.50                          1.42                                         50.66
1.75                          1.66                                         50.75
2.00                          1.90                                         50.82
Average:                                                                   51.38 ± 0.86

Next, a static object was placed in front of the camera at different distances, so that its surface was parallel to the camera. The object is depicted in Fig. 3.

Fig. 3: Static object positioned 0.25 m from the camera.

For the test phase the static object from Fig. 3 was positioned from 0.25 m to 3 m from the camera, in 0.25 m steps up to 1.5 m and in 0.5 m steps beyond. The step was longer in the cases where the camera was positioned farther from the object, because the approach is less accurate at bigger distances due to the low number of pixels that describe the object. For all images the height of the object was measured on the image to get the Euclidean pixel distance between the coordinates of the farthest pixels at the top and the bottom. The measurements are shown in Tab. 2.

Table 2: The height of an object in pixels at different distances.

Distance from the object [m]  Euclidean pixel height [pixel]
0.25                          2664
0.50                          1344
0.75                          896
1.00                          672
1.25                          552
1.50                          461
2.00                          361
2.50                          289
3.00                          241

Next, according to Eq. (3), two consecutive Euclidean pixel distances from Tab. 2 were used to calculate the distance between the object and the camera. The calculated distances are summarized in Tab. 3, along with the real, measured distances and the average error.

Table 3: Calculated vs. measured distances.

Measured distance [m]  Calculated distance [m]  Absolute error [m]
0.50                   0.48                     0.02
0.75                   0.71                     0.04
1.00                   0.90                     0.10
1.25                   1.75                     0.50
1.50                   1.53                     0.03
2.00                   2.55                     0.55
2.50                   2.01                     0.49
3.00                   3.02                     0.02
Average:                                        0.22 ± 0.25

As shown by the results from Tab. 3, the accuracy of the calculated distance for the eight measurements falls within 0.22 m ± 0.25 m. In general, the greater the distance, the bigger the error. This is caused by the low number of pixels, which makes it hard to pinpoint the precise top and bottom pixels, and by human error.

The information about the distance, the viewing angle and the resolution of the camera makes it possible to calculate the size of objects that are parallel to the camera by measuring their Euclidean pixel distance. For the test object from Fig. 3 the results are summarized in Tab. 4.

Table 4: Measured vs. calculated distance and height of the test object from Fig. 3.

Actual        Calculated    Abs. error -   Actual      Calculated  Abs. error -
distance [m]  distance [m]  distance [m]   height [m]  height [m]  height [m]
0.25          0.24          0.01           0.25        0.24        0.01
0.50          0.48          0.02           0.25        0.24        0.01
0.75          0.72          0.03           0.25        0.24        0.01
1.00          0.96          0.04           0.25        0.24        0.01
1.25          1.20          0.05           0.25        0.25        0.00
1.50          1.44          0.06           0.25        0.25        0.00
2.00          1.92          0.08           0.25        0.26        0.01
2.50          2.41          0.09           0.25        0.26        0.01
3.00          2.89          0.11           0.25        0.26        0.01
Average:                    0.05 ± 0.03                            0.008 ± 0.004

The results from Tab. 4 prove that it is possible to use the approach from section 2 to calculate the distance to the object as well as its height. The absolute distance error increases by about 0.01 m for each 0.25 m of distance. On the other hand, this error does not affect the height measurements up to the selected test distance (3 m), where the error is more or less constant.
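Putting the steps together, the following sketch chains Eq. (3), Eq. (6) and Eq. (7) into a single measurement. The input values are illustrative only, and r_y is an assumed vertical sensor resolution, since the camera model is not stated in the paper:

```python
import math

def estimate_distance_and_height(e1, e2, dz, alpha_deg, r_y):
    """Chain Eq. (3), (6) and (7): two pixel heights to metres."""
    z1 = dz / (e1 / e2 - 1.0)                                      # Eq. (3)
    dy = 2.0 * z1 * math.tan(math.radians(alpha_deg) / 2.0) / r_y  # Eq. (6)
    return z1, e1 * dy                                             # Eq. (7)

# Hypothetical captures 0.5 m apart; alpha is the average from Tab. 1;
# r_y = 2560 is an assumed resolution.
z1, s_h = estimate_distance_and_height(e1=2600, e2=2230, dz=0.5,
                                       alpha_deg=51.38, r_y=2560)
print(z1, s_h)  # estimated distance [m] and object height [m]
```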
Real world examples

For this subsection three different examples were selected from agriculture, with the intent to calculate the height of the trees / plant and the distance from the camera, all compared to real, manually measured values. For these examples, images of an apple tree and a pear tree, as well as a vine plant, were selected. In all three cases the Euclidean pixel distance was measured from the bottom to the top of the tree / plant on two consecutive images, taken with an intermediate distance of 0.5 m. Fig. 4 depicts all three selected examples, each with a measuring tape for reference.

Fig. 4: Three real world examples - pear tree (top), apple tree (middle), vine plant (bottom).

In contrast to the images from the previous subsection, in this case we have little influence on the capturing process. The objects are illuminated by the sun, with areas that can be completely white and others that are in the shade and almost completely black. In addition, if there is some wind, the objects can move between taking the first and the second image. Even if no wind is present, it is not guaranteed that the object is in perfect alignment (parallel) with the sensor, which means that the distance actually changes from one part of the object to the next. All of this affects the results.

Tab. 5 shows the Euclidean pixel distances for all three selected examples, along with the measured distance to the object, for reference. The calculated distances and heights are summarized and compared to the real values in Tabs. 6, 7 and 8, respectively.

Table 5: Euclidean pixel distances for the three selected examples.

            Measured distance [m]  Euclidean pixel distance [pixels]
Pear tree   3.0                    2587
            3.5                    2226
Apple tree  3.0                    2642
            3.5                    2259
Vine plant  2.0                    2213
            2.5                    1783

As seen in Tabs. 6, 7 and 8, the calculated distance on average misses by 0.12 m for the pear tree, 0.12 m for the apple tree and 0.09 m for the vine plant. The errors of the calculated heights are 0.15 m, 0.06 m and 0.07 m, respectively. In both cases the results have a bigger error rate for the examples from the uncontrolled environment compared to the controlled one from the previous subsection, which was expected. In order to evaluate the approach, all three examples are summarized in Tab. 9.

Table 6: Calculated vs. measured distance and height for the pear tree.

Measured      Calculated    Abs. error -   Measured    Calculated  Abs. error -
distance [m]  distance [m]  distance [m]   height [m]  height [m]  height [m]
3.00          2.89          0.11           2.35        2.49        0.14
3.50          3.37          0.13           2.35        2.50        0.15
Average:                    0.12 ± 0.01                            0.15 ± 0.01

Table 7: Calculated vs. measured distance and height for the apple tree.

Measured      Calculated    Abs. error -   Measured    Calculated  Abs. error -
distance [m]  distance [m]  distance [m]   height [m]  height [m]  height [m]
3.00          2.89          0.11           2.60        2.54        0.06
3.50          3.37          0.13           2.60        2.54        0.06
Average:                    0.12 ± 0.014                           0.06 ± 0.00

Table 8: Calculated vs. measured distance and height for the vine plant.

Measured      Calculated    Abs. error -   Measured    Calculated  Abs. error -
distance [m]  distance [m]  distance [m]   height [m]  height [m]  height [m]
2.00          1.92          0.08           1.35        1.41        0.06
2.50          2.41          0.09           1.35        1.43        0.08
Average:                    0.09 ± 0.007                           0.07 ± 0.01

Table 9: The accuracy of the approach for all selected examples.

Tree / plant  Abs. error - distance [m]  Abs. error - height [m]
Pear          0.12 (3.6 %)               0.15 (6.3 %)
Apple         0.12 (3.6 %)               0.06 (2.3 %)
Vine          0.09 (3.8 %)               0.07 (5.2 %)
Average:      0.11 ± 0.02                0.09 ± 0.05

DISCUSSION

As described in section 2 and evaluated in section 3, the experiment from this paper showed that it is possible to reconstruct metric information that is otherwise lost when taking digital images of objects. There are of course limitations to this approach: the object has to be parallel to the camera; two (or more) accurately taken images with a known intermediate distance have to be available; and the resolution of the images and the viewing angle of the lens have to be known. The last can of course be calculated, as presented in section 3.

The results from Tab. 9 summarize the average absolute error in distance at 0.11 m, or around 3.7 %, and the average absolute error in height at 0.09 m, or 4.6 %. This is accurate enough for some agricultural applications, but not for others that require a higher degree of accuracy. In order to improve this, better equipment could be used (e.g. a camera with a higher resolution) and more iterations (more snapshots from different views) could be made to minimize human error with the help of averaging. Another possible approach to improve the results could be the introduction of image registration techniques. This way, by using corresponding pixel pairs on consecutive images, it would be possible to select precisely the same corresponding points when measuring the Euclidean pixel distance and to eliminate the error users make when selecting pixels on the opposite sides of an object.
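The image registration step suggested above is not implemented in the paper; as a sketch of one possible realization, corresponding pixel pairs on the two consecutive images could be found automatically with a feature detector and matcher such as ORB in OpenCV (an assumed dependency; all names below are ours):

```python
import cv2  # OpenCV, an assumed dependency not used in the paper

def matched_point_pairs(path1: str, path2: str, n: int = 50):
    """Return the n best (x, y) point pairs matched between two images."""
    img1 = cv2.imread(path1, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(path2, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches[:n]]
```

Such automatically matched pairs could replace the manually selected top and bottom pixels when measuring e1 and e2, removing the human selection error discussed above.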
REFERENCES

1. Burger W., Burge M. J. Principles of Digital Image Processing: Fundamental Techniques. Springer-Verlag, London, UK, 2009: 3-6.
2. Ehlert D., Adamek R., Horn H.-J. Laser rangefinder-based measuring of crop biomass under field conditions. Precision Agriculture. 2009;10(5):395-408.
3. Gonzales R. C., Woods R. E. Digital Image Processing. Pearson Prentice Hall, Upper Saddle River, New Jersey, USA, 2008: 46-68, 92-93.
4. Marcon M., Mariano K., Braga R. A., Paglis C. M., Scalco M. S., Horgan G. W. Estimation of total leaf area in perennial plants using image analysis. Revista Brasileira de Engenharia Agrícola e Ambiental. 2011;15(1):96-101.
5. Stajnko D., Lakota M. Application of image analysis for monitoring growth and development of apple fruits 'Malus domestica' Borkh. during the growing season. Agricultura. 2004;3:6-11.
6. Videc B. Ocenjevanje velikosti rastlin s pomočjo digitalnih posnetkov [diplomsko delo]. Maribor: Fakulteta za kmetijstvo in biosistemske vede, Univerza v Mariboru, 2015.

Estimating the size of plants with the help of two parallel views

ABSTRACT

This work describes a method for estimating the size of plants with the help of two parallel views of the same scene, captured with a digital camera. The procedure is based on the principle of similar triangles, with the following constraints: the resolution of the camera used is known; the observed object is always parallel to the camera sensor; and the intermediate distance between the two capturing steps is known. The first part of the work describes the calibration of the method with the help of an artificial object under controlled conditions. The second part describes the use of the procedure on real examples from the field of agriculture, where the sizes of a vine plant, an apple tree and a pear tree were estimated. Comparing the obtained values with the measured ones, the absolute error in the distance to the objects was found to be 0.11 m, or 3.7 % of the total distance, and the absolute error in size 0.09 m, or 4.6 %.

Key words: digital image processing, size, digital camera, pixels, similar triangles