
This may be the author’s version of a work that was submitted/accepted for publication in the following source:

Maire, Frederic (2007) Vision Based Anti-collision System for Rail Track Maintenance Vehicles. In Cavallaro, A (Ed.) Advanced Video and Signal Based Surveillance. IEEE Computer Society, United States of America, pp. 170-175.

This file was downloaded from: https://eprints.qut.edu.au/12540/

© Copyright 2007 IEEE

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as Submitted for peer review or as Accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appearance. If there is any doubt, please refer to the published source.

https://doi.org/10.1109/AVSS.2007.4425305


Vision Based Anti-collision System for Rail Track Maintenance Vehicles

Frederic Maire
NICTA, 300 Adelaide Street, Brisbane, QLD 4000, Australia

Faculty of IT, Queensland University of Technology, Box 2434, Brisbane Q 4001, Australia
Email: [email protected], [email protected]

Abstract

Maintenance trains travel in convoy. In Australia, only the first train of the convoy pays attention to the track signalization (the other convoy vehicles simply follow the preceding vehicle). Because of human errors, collisions can happen between the maintenance vehicles. Although an anti-collision system based on a laser distance meter is already in operation, the existing system has a limited range due to the curvature of the tracks. In this paper, we introduce an anti-collision system based on vision. The proposed system induces a 3D model of the track as a piecewise quadratic function (with continuity constraints on the function and its derivative). The geometric constraints of the rail tracks allow the creation of a completely self-calibrating system. Although road lane marking detection algorithms perform well most of the time for rail detection, the metallic surface of a rail does not always behave like a road lane marking. Therefore we had to develop new techniques to address the specific problems of the reflectance of rails.

1. Introduction

Train track maintenance vehicles travel in convoy while moving from one work site to another. The first vehicle of the convoy uses the same signalization system as passenger and freight trains, but the other convoy vehicles follow the preceding vehicle, ignoring this signalization system. Because of human errors, collisions can happen between the maintenance vehicles. An anti-collision system based on a laser range finder was introduced to help the track maintenance train operators keep a safe distance between the convoy vehicles. Because of the curvature of rail tracks, the current detection system is reliable only to about 25 metres. NICTA was approached by NSW Railcorp to investigate whether a computer vision system could extend this range for curved tracks. One of the project requirements was that any new solution was not to rely on radio links or GPS devices. An initial feasibility study showed that existing techniques for road lane marking detection could be adapted for rail detection in most lighting conditions. However, the metallic surface of a rail does not always behave

like a road lane marking. For example, in sunny conditions a rail looks like a bright band against a darker background, whereas in a tunnel a rail looks like a dark band against a brighter background. New detection techniques had to be developed to cater for this wide range of lighting conditions.

In this paper, we describe a prototype solution implemented in Matlab that has been tested on images extracted from video recordings. The final system will run on an embedded computer in the vehicle. The system induces in real time a 3D representation of the track in front of the train, and computes the length of the obstacle-free zone identified. In Section 2, we give an overview of the system. In Section 2.1, we explain how the system calibrates itself automatically. Section 2.2 and Section 2.3 describe how the 3D representation of the rail track is built. Section 3 presents some experimental results.

1.1. Previous Work

Existing train anti-collision systems like the European Rail Traffic Management System (ERTMS) used in Europe rely on infrastructure equipment. The ERTMS is divided into various levels, as explained on the Strategic Rail Authority website [1]. Level 1 corresponds to the simplest configuration with fixed blocks and consists of trackside equipment that monitors individual signals and passes this information to the trains via track-mounted transponders. Level 2 is also a fixed block system, but a radio link allows a continuous exchange of data between train and trackside through the GSM-R mobile communication network. This permits the train to reach its maximum permitted speed within its block while maintaining safe braking distances.

To the best of our knowledge, there exists no train anti-collision system that even partially relies on vision. However, more than a decade after autonomous-system technologies emerged in projects such as those from the Universitat der Bundeswehr Munich [2] and from the NavLab group at Carnegie Mellon University [3], road lane marking detection systems are mature enough to be commercially available as driver assistance systems [4, 6]. Variations of the Hough transform have previously been applied to road lane marking detection [7, 8]. It is natural to consider building an anti-collision train system using the same computer vision toolbox.

978-1-4244-1696-7/07/$25.00 ©2007 IEEE.

2. System Description

The proposed system is self-calibrating. It uses the known distance between rails and the spatial period of the sleepers to estimate the parameters of the camera (see Section 2.1 for details). Once the calibration has been completed, the system knows the distance in pixels between the rails at the bottom row of the image. When the train is in motion, even at the bottom of the image, the rails appear to move from one frame to the other because of the lateral swing of the vehicle. Therefore, for each new image, we look for a pair of initial straight segments that correspond to the beginning of the rails (see Section 2.2). The induced 3D model of the rail track is a piecewise quadratic function (with continuity constraints on the function and its derivative). It is obtained by extending the initial (straight) segments. The following (quadratic) segments are determined by considering candidate rail pixels and comparing the distributions of the pixel intensity of different candidate segments with the pixel intensity distribution of the preceding segment (details in Section 2.3).

2.1. Static Calibration

In this application, two coordinate systems are used: one for the vehicle frame (x, y), and one for the image (u, v). The y-axis is pointing forward away from the vehicle (parallel to the rails), the x-axis is pointing to the right (orthogonal to the rails), and the origin is the vertical projection of the camera on the ground. The u direction is horizontal right, the v direction is vertical down, and the origin is the top left pixel. Let (u0, vh) be the image coordinates of the intersection of two straight rails (the point at infinity). Then, it is easy to show [4] that there exist constants βU and βV such that the (x, y) coordinates of any object on the ground and its (u, v) image coordinates are linked by

u = βU × (x/y) + u0 (1)

v = βV × (1/y) + vh (2)
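Equations 1 and 2 define an invertible mapping between ground and image coordinates. A minimal Python sketch follows (the paper's prototype was written in Matlab); the constants u0, vh, βU, βV below are made-up illustrative values, not calibration results from the paper:

```python
# Ground-to-image mapping of Equations 1 and 2, with its inverse.
# All constants are illustrative assumptions, not values from the paper.
U0, VH = 320.0, 120.0          # image coordinates of the point at infinity
BETA_U, BETA_V = 800.0, 4000.0  # camera constants to be found by calibration

def ground_to_image(x, y):
    """Map ground coordinates (x, y) to pixel coordinates (u, v)."""
    u = BETA_U * (x / y) + U0    # Equation 1
    v = BETA_V * (1.0 / y) + VH  # Equation 2
    return u, v

def image_to_ground(u, v):
    """Invert the mapping: recover ground (x, y) from pixel (u, v)."""
    y = BETA_V / (v - VH)        # invert Equation 2
    x = y * (u - U0) / BETA_U    # invert Equation 1
    return x, y
```

The inverse mapping is what ultimately converts a detected rail model into ground distances, e.g. the length of the obstacle-free zone.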

The objective of the camera calibration phase is to determine the constants u0, vh, βU and βV. We first determine u0 and vh by fitting lines to candidate rail pixels. Candidate rail pixels are determined by the profile of the pixel intensity in their neighbourhoods. Most lane marking detectors are too sensitive to lighting conditions [6]. To address this problem, we use a variation of the algorithm presented in [5] to detect candidate rail centres. We scan each row of a grey image looking for pairs of points (A, B) having respectively a positive and a negative gradient such that both the positive and the negative gradients have a magnitude larger than a minimum threshold. Moreover, we require that the pixels between the two points A and B have an intensity larger than the sum of the intensity at A plus one half of the intensity gradient at A. Another constraint on (A, B) is that the distance between these two points lies in a predefined interval (prior knowledge that the width of a rail will be between 2 and 20 pixels). As can be seen in Figures 1, 2 and 3, a standard Radon search on the set of candidate rail pixels allows us to identify the image lines of the form u = m1 + m2 × v corresponding to the rails. The line angle range of the Radon search is limited to [−30, −10] degrees for the left rail and [10, 30] degrees for the right rail respectively. The point (u0, vh) is the intersection of these two image lines.
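The row-scan for candidate rail centres described above can be sketched as follows. This is a Python re-expression under stated assumptions: the gradient threshold, the pairing of each rising edge with only its nearest falling edge, and reporting the midpoint of (A, B) as the candidate centre are our illustrative choices, not specified in the paper:

```python
import numpy as np

def candidate_rail_centres(row, grad_min=8.0, min_w=2, max_w=20):
    """Scan one grey-image row for (A, B) pairs: a positive gradient at A,
    a negative gradient at B, B - A within the prior rail width of 2 to 20
    pixels, and the pixels strictly between A and B brighter than
    row[A] + 0.5 * grad[A]. Returns the midpoints of accepted pairs.
    Thresholds here are illustrative."""
    grad = np.gradient(row.astype(float))
    centres = []
    for a in np.flatnonzero(grad >= grad_min):
        for b in range(a + min_w, min(a + max_w, len(row) - 1) + 1):
            if grad[b] <= -grad_min:
                inner = row[a + 1 : b]
                if inner.size and inner.min() > row[a] + 0.5 * grad[a]:
                    centres.append((a + b) // 2)
                break  # pair A only with the nearest falling edge
    return sorted(set(centres))
```

Running this on every row of the lower part of the image yields the candidate set fed to the Radon search.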

In the next phase, βV is determined by exploiting the fact that sleepers are spaced regularly by an average distance ∆s. Let y1, y2, . . . , yk be the ground y-coordinates of k consecutive sleepers. The difference yi+1 − yi does not depend on i and its expectation is equal to ∆s. Consider a line u = m1 + m2 × v going through (u0, vh) and close to a rail. Consider also a pair of points P and P′ on the corresponding ground line and separated by a distance of ∆s. On average, the pixels corresponding to P and P′ should look similar. From Equation 2, we derive vP′ = vh + βV/(yP + ∆s). This equation allows us to compute the image coordinates of P′ given the image coordinates of P (using the fact that uP = m1 + m2 × vP and uP′ = m1 + m2 × vP′). The mapping from P to P′ depends on βV. Minimizing with respect to βV the empirical mean of the absolute difference of the pixel intensities of a sample of pairs of points P and P′ gives the optimal value of βV.
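The P-to-P′ mapping at the heart of this minimization is small enough to show directly. A sketch in Python; the sleeper spacing ∆s = 0.6 m and the other constants are assumed illustrative values, not figures from the paper:

```python
def next_sleeper_row(v_p, beta_v, v_h=120.0, delta_s=0.6):
    """Predict the image row v_P' of the ground point one sleeper spacing
    delta_s beyond the point imaged at row v_p, under a candidate beta_v
    (Equation 2: v = beta_v / y + v_h). Defaults are illustrative."""
    y_p = beta_v / (v_p - v_h)           # back-project the row to ground range
    return v_h + beta_v / (y_p + delta_s)  # re-project the shifted point
```

Calibration then searches over βV for the value that minimizes the mean absolute intensity difference between pixel pairs linked by this mapping: only the correct βV makes the predicted pairs land one sleeper period apart.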

Once βV is found, we use the known inter-rail distance ∆r to compute βU. Let u − u0 = m2^left × (v − vh) and u − u0 = m2^right × (v − vh) be the line equations of respectively the left and right rails. Consider a pair of points on the same horizontal line, one on the left rail at position (uleft, v), the other on the right rail at position (uright, v). The two points have different x and u, but the same y and v. The difference uright − uleft is proportional to v − vh. More precisely,

uright − uleft = (m2^right − m2^left) × (v − vh) (3)

Using Equation 1, we have uright − uleft = βU × (∆r/y). Combining with Equation 2, it is easy to show that

βU = βV × (m2^right − m2^left)/∆r
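This formula can be checked on a synthetic straight track. In the sketch below, βU, βV and the gauge ∆r are illustrative assumptions (1.435 m is the standard gauge, not a value quoted in the paper); the rail slopes are derived from Equations 1 and 2, and the formula recovers βU exactly:

```python
# Synthetic check of the beta_U formula; all numbers are illustrative.
beta_u_true, beta_v = 800.0, 4000.0
delta_r = 1.435  # assumed standard-gauge inter-rail distance (metres)

# From Equations 1-2, a straight rail at ground offset x maps to the image
# line u - u0 = (beta_u * x / beta_v) * (v - vh), so its slope m2 is:
m2_left = beta_u_true * (-delta_r / 2) / beta_v
m2_right = beta_u_true * (+delta_r / 2) / beta_v

# Recover beta_U from the two slopes and the known gauge (the text's formula).
beta_u_recovered = beta_v * (m2_right - m2_left) / delta_r
```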

2.2. Initial Segment Determination

Although the method described in Section 2.1 is robust enough to determine the initial rail segments on images with glare such as the one in Figure 4, it fails on tunnel images such as the one in Figure 5. Hence a different approach is needed. Given the current estimates of u0 and

Figure 1: An image used for calibration. The rails are straight and the train is not moving.

vh, and the distance in pixels ∆i (time invariant) between the two rails at the bottom of the image, there is only one degree of freedom for the position of the rails. The horizontal offset of the left rail on the bottom row of the image completely determines the left rail (assuming the point at infinity is known) and the right rail (with ∆i). To compute the actual offset k̂ of the left rail, we first collect the intensity means of all the rays going through the point at infinity and intersecting the bottom row of the image. That is, for the ray with offset k, we compute µ(k), the mean pixel intensity along the ray in the bottom third of the image. To score the hypothesis "the actual offset of the left rail is k", we compute the reverse similarity of the two bands centred at rays k and k + ∆i (the left and right rails are mirror images). To simplify the notation, we abbreviate subscript sequences using Matlab notation. For example, [7, 6, 5, 4] = 7 : −1 : 4. If we plot τ(k), the cosine of the angle between the vector µ(k − 15 : +1 : k + 15) and the vector µ(k + ∆i + 15 : −1 : k + ∆i − 15), versus k, we observe that k̂ is a local maximum. As on uniform backgrounds the score τ(k) is constant and maximum, to find k̂, we maximize with respect to k the expression 2 × τ(k) − τ(k − 10) − τ(k + 10). The reversing of the band is critical for the success of the comparison operator. To see why, assume that we were looking for rail bands of 3 pixels in µ = [3, 2, 5, 8, 9, 8, 5, 4, 5, 8, 9, 8, 5, 2] and that ∆i = 6. Without the reversing, τ(k) would be maximum for k = 4, 5 and 6. With the reversing, τ(k) reaches its maximum only for k = 5.
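The reverse-band similarity and the toy example above can be sketched in Python (0-based indexing here, so the paper's 1-based k = 5 corresponds to k = 4; the band half-width is a parameter, 15 in the paper, 1 in the toy example):

```python
import numpy as np

def tau(mu, k, di, w):
    """Reverse-band similarity: cosine of the angle between the band of
    ray means centred at offset k and the REVERSED band centred at
    k + di (the left and right rails are mirror images)."""
    left = mu[k - w : k + w + 1]
    right = mu[k + di - w : k + di + w + 1][::-1]
    return float(left @ right / (np.linalg.norm(left) * np.linalg.norm(right)))

def tau_no_reverse(mu, k, di, w):
    """Same score without the reversal, for comparison."""
    left = mu[k - w : k + w + 1]
    right = mu[k + di - w : k + di + w + 1]
    return float(left @ right / (np.linalg.norm(left) * np.linalg.norm(right)))

# Toy example from the text (0-based, so the paper's k = 5 is k = 4 here):
mu = np.array([3, 2, 5, 8, 9, 8, 5, 4, 5, 8, 9, 8, 5, 2], dtype=float)
```

On this µ with ∆i = 6 and half-width 1, the non-reversed score is 1 at three consecutive offsets, while the reversed score peaks at the true offset only, which is exactly the ambiguity the reversal removes.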

Figure 2: The candidate rail pixels are highlighted with white crosses (the image was darkened to make the crosses more visible). Only the bottom 2/3 of the image is processed for rail pixel detection.

2.3. Piecewise Quadratic Approximation

Each rail is approximated by a piecewise quadratic function with two continuity constraints: the curve is continuous, as well as its derivative. The ith quadratic segment is of the form x = a1(i) + a2(i) y + a3(i) y². Given the previous segment, there is only one degree of freedom for the determination of the next segment: segment i must be a continuation of segment i − 1. Let Vi−1 be the common vertex to segment i − 1 and segment i. Our assumptions translate into

a1(i − 1) + a2(i − 1) y(i − 1) + a3(i − 1) y(i − 1)² = a1(i) + a2(i) y(i − 1) + a3(i) y(i − 1)²

and

a2(i − 1) + 2 a3(i − 1) y(i − 1) = a2(i) + 2 a3(i) y(i − 1)

Given a3(i), these two equations allow us to compute a2(i) and then a1(i).
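Solving the two continuity equations for a2(i) and a1(i) is straightforward; a sketch in Python (function and variable names are ours):

```python
def continue_segment(a1_prev, a2_prev, a3_prev, y_join, a3_new):
    """Given the coefficients of segment i-1, the ground y-coordinate of
    the common vertex, and a candidate a3(i), return the coefficients of
    segment i implied by the continuity of the curve and its derivative."""
    # Derivative continuity at y_join fixes a2(i):
    a2_new = a2_prev + 2.0 * (a3_prev - a3_new) * y_join
    # Value continuity at y_join then fixes a1(i):
    a1_new = (a1_prev + (a2_prev - a2_new) * y_join
              + (a3_prev - a3_new) * y_join ** 2)
    return a1_new, a2_new, a3_new
```

So scoring a hypothesis for the next segment only requires sweeping the single free parameter a3(i).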

From Equations 1 and 2, we can derive the equation of the quadratic segment with respect to the image coordinates:

u = (u0 + a2 βU) + (βU a1/βV) × (v − vh) + (a3 βU βV)/(v − vh)

Figure 3: Despite the false positives in Figure 2, the Radon search successfully determines the rails and the horizon. These lines characterize u0 and vh.

The curvature of the curve x(y) is

c = (d²x/dy²) / (1 + (dx/dy)²)^(3/2)

Given that x = a1(i) + a2(i) y + a3(i) y², we have

c = 2 a3(i) / (1 + (a2(i − 1) + 2 a3(i − 1))²)^(3/2)

The turning radius 1/c of the rails must be larger than some constant Rm ≥ 600. Therefore the absolute value of a3(i) cannot be larger than

(1 + (a2(i − 1) + 2 a3(i − 1))²)^(3/2) / (2 Rm)
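The bound follows from setting |c| = 1/Rm in the curvature expression above; a one-function sketch (names are ours):

```python
def max_abs_a3(a2_prev, a3_prev, r_min=600.0):
    """Upper bound on |a3(i)| so that the turning radius 1/c of the
    candidate segment stays at least r_min, using the curvature
    expression from the text."""
    return (1.0 + (a2_prev + 2.0 * a3_prev) ** 2) ** 1.5 / (2.0 * r_min)
```

At the bound, the curvature formula gives |c| exactly 1/Rm, so sweeping a3(i) within ±max_abs_a3 covers all physically plausible continuations.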

Several fitness functions have been considered to score the different hypotheses regarding the value of a3(i). The fitness function based on the number of candidate rail pixels in the vicinity of the quadratic segment works well in non-tunnel situations.

For tunnel images, the rail-pixel detector becomes unreliable. In these situations, we exploit the smoothness of the rails. Statistical properties of the pixel intensity distribution can locate the rails. We have investigated several measures including entropy, maximum likelihood, and histogram distance. A segment corresponding to a rail is smooth compared to the background. The entropy of the pixel intensity

Figure 4: An image with glare. The rails are bright.

Figure 5: An image from a tunnel. The rails are dark.

of a set S of pixels corresponding to a rail is a local minimum. If µ and Σ are respectively the mean vector and covariance matrix of a band of pixels centred on the previous rail segment, the likelihood of S with respect to the Gaussian (µ, Σ) is also a local maximum. Experimentally, we found that the distance between the pixel intensity histograms of successive quadratic segments provides the most robust fitness function (to quantify the continuity of the pixel intensity along the curve being tested). This is not surprising, as the histograms convey more information.
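The paper does not specify which histogram distance is used; as one plausible instance, an L1 distance between normalised intensity histograms can be sketched as follows (the bin count is an illustrative choice):

```python
import numpy as np

def histogram_distance(pix_a, pix_b, bins=32):
    """L1 distance between normalised intensity histograms of the pixel
    sets of two successive candidate segments. A small distance means
    the candidate continues the appearance of the previous segment;
    the binning is an illustrative assumption."""
    ha, _ = np.histogram(pix_a, bins=bins, range=(0, 256), density=True)
    hb, _ = np.histogram(pix_b, bins=bins, range=(0, 256), density=True)
    return float(np.abs(ha - hb).sum())
```

The same score doubles as a stopping criterion: a sudden jump in the distance signals that the candidate segment has left the rail.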

This continuity test of the pixel intensity histograms of the successive quadratic segments also determines where to stop the rail model. A jump in the histogram distance triggers the end of the rail track model. The computation of the histograms is more expensive than the rail pixel detection (Section 2.1), but can run in real time.

3. Experimental Results

A prototype system written in Matlab has been evaluated on images extracted from video recordings. A wide spectrum of lighting conditions was tested, ranging from dim light (Figure 5) to bright sun (Figure 4). It is particularly encouraging that the system works with poor quality images. The MPEG compression artifacts create some challenges for the fitness functions. Fortunately, the embedded system will use a high quality camera. Figure 9 and Figure 10 show that, like the state-of-the-art road lane marking detection algorithms, the system is robust to shadows.

Figure 6: Despite the dim light and the rails looking black, the system properly identifies the curve of the left rail. The identified right rail is not shown, to highlight the fuzziness of the image.

4. Summary and Conclusions

We have designed and implemented in Matlab a prototype anti-collision system for rail track maintenance vehicles that allows the distance between two maintenance trains to be estimated. The system features an original self-calibration module that fully exploits the constrained geometry of rail tracks. An ad hoc algorithm was also developed for the determination of the initial segments of the rails. This ad hoc technique avoids the use of edge detection algorithms by using a reverse band similarity measure that does not suffer from the limitations of road lane marking detection algorithms. The system shows robustness under extreme illumination conditions. The performance of this new anti-collision system is at least four times better than the existing laser system. That is, on the images we were provided with, the vision algorithms detect reliably the obstacle-free zone

Figure 7: Candidate rail pixels for Figure 4.

to more than 100 metres. We expect better performance on the embedded system, as it will use a high quality camera. We have no data for rain and fog, but testing under these weather conditions will be performed with the embedded system.

Acknowledgments

This study was conducted under contract with NSW Railcorp, and the author would like to thank the company personnel for their help.

References

[1] SRA, 2004. European Rail Traffic Management System. Website: http://www.sra.gov.uk

[2] E. D. Dickmanns and A. Zapp, Autonomous High Speed Road Vehicle Guidance by Computer Vision, Automatic Control - World Congress, 1987: Selected Papers from the 10th Triennial World Congress of the Intl Federation of Automatic Control, Pergamon, 1987, pp. 221-226.

[3] C. Thorpe, Vision and Navigation: The Carnegie Mellon NavLab, Kluwer Academic Publishers, 1990.

[4] Ieng S.-S., Tarel J.-P. and Labayrade R., On the Design of a Single Lane-Markings Detector Regardless the On-board Camera's Position, in Proceedings of the IEEE Intelligent Vehicle Symposium (IV'2003), June 9-11, 2003, Columbus, OH, USA.

[5] Ieng S.-S., Méthodes robustes pour la détection et le suivi des marquages (Robust methods for the detection and tracking of lane markings), Thèse de doctorat de l'Université de Paris 6, November 2004.

Figure 8: The glare in the image does not prevent the correct detection of the left rail in the image shown in Figure 4.

[6] Apostoloff, N. and Zelinsky, A., Vision In and Out of Vehicles: Integrated Driver and Road Scene Monitoring, The International Journal of Robotics Research, Vol. 23, No. 4-5, April-May 2004, pp. 513-538.

[7] B. Yu and A. Jain, Lane Boundary Detection Using a Multiresolution Hough Transform, in Proceedings of the 1997 International Conference on Image Processing (ICIP'97), Volume 2, p. 748, 1997.

[8] Yue Wang, Eam Khwang Teoh, and Dinggang Shen, Lane detection and tracking using B-snake, Image and Vision Computing, 22(4):269-280, April 2004.

Figure 9: A view with shadows.

Figure 10: The detected obstacle-free zone is 108 metres long. The detected left rail model is displayed with white crosses. The system is robust to shadows.