For some, machine vision is the coming technology. For others, it’s already here. Although it remains a relative newcomer to the ITS sector, its effects look set to be profound and far-reaching.
Encapsulating in just a few short words the distinguishing features of complex technologies and their operating concepts can sometimes be difficult. Often, it is the most subtle of nuances which are both the most important and the most easily lost. Happily, this isn’t so with machine vision: it’s all about capturing a better-quality, full-resolution image at a precise moment in time; from there, it’s possible to apply some form of intelligence to that image and so extract more actionable data than would otherwise have been possible.
Originating in the process control sector, machine vision technology has been making inroads into the ITS sector for several years now. To date it has been utilised principally in high-end/high-value applications such as tolling and enforcement, where business cases can be made for systems with a higher-than-average up-front cost. But, according to the main systems providers, we are already seeing developments in both the technology and standards – in the latter case, the GigE Vision camera interface standard in particular – which will result in greater acceptance and market penetration.
Precisely how far that penetration will go is open to discussion; in a sharply cost-conscious environment, attempting to convince the market that a more expensive solution is the right one can seem foolhardy. However, there are always applications where only a very small number of (more expensive) technologies are appropriate. Then there are those nuances; machine vision systems tend to hold a lot of intelligence and processing capability at the network edge and there are major play-offs between individual unit costs and communication outlays, for instance.
This makes like-for-like comparisons rather more difficult but, as relative newcomers to the ITS sector, machine vision specialists tend to accept that it is for them to make the case, technologically and economically, to potential customers. For a start, having more capable, more intelligent cameras out on the network can significantly reduce the numbers needed and so dramatically sway that per-unit cost argument, says
Characteristics and trends
“Trying to use one camera to do many things is a clear trend, and if you can apply more intelligence to images and so extract more data you get more bang for your buck. Effectively, we’re talking about cameras with embedded PCs and by comparison older analogue and even newer IP-based security cameras yield much lower-resolution images.
“Yes, IP cameras might achieve megapixel resolution but image compression removes detail. Also, they tend to use CMOS sensors, so the pixels are small and sensitivity isn’t great. That said, on the machine vision side we are starting to see the use of high-performance CMOS sensors with global rather than rolling shutters. These offer a competitive price but not absolute quality.”
With that blurring of the technology edges comes also a blurring of the limits of application: “Where previously machine vision and security-type applications were very separate, we’re now seeing hybrid systems coming through which are very good for traffic management.”
Prime markets
“There was nothing similar in the French market before. There have been embedded installations in the UK and Italy for the last three to four years, and in the US for the last couple, but the big difference is that these have been simple in-vehicle cameras, often hand-operated, or external systems featuring large and highly visible roof-mounted solutions with associated installation and maintenance issues.
“The vehicle’s light bar is a very aggressive environment. We expect to see up to 90 per cent humidity and temperature ranges of -20 to +80°C but the big innovation is having up to eight cameras integrated into the light bar – ANPR cameras on the front left and right and at the rear, and five more cameras to provide 360° video coverage. From the frontal aspect, higher resolution means the ability to see for longer distances. The cameras are small enough to effectively be invisible and the processing technology is all integrated into the light bar, which means that the system can go onto literally any vehicle. Everything is IP-compatible, so cabling and installation are much simpler. All that is needed in-vehicle is a standard PC.”
ANPR effectively becomes a standard, autonomous application, he continues: “Detection rates are close to 95 per cent and the system requires no action from the in-vehicle team. Alerts are only generated when a stolen vehicle is detected and then an appropriate response is decided upon.” Testing started about five years ago and around 400 vehicles are currently equipped, with the potential to go to about 1,000. At the outset, each equipped vehicle was averaging one arrest per day, although this has decreased as a result of successes and, inevitably, as thieves adapt. A single vehicle can monitor around 8,000 vehicles per day. By comparison, previous foot patrol-based solutions would have occupied 30 operatives for 12 months to achieve the same productivity.
For e-tolling, SurVision applies mainly fixed cameras. Again, it uses Sony cameras because, Jouannais says, they are the only high-quality heads on which it can master everything. Machine vision is ideal for both fixed-lane and free-flow gantry applications, he adds. “For reconciliation for enforcement purposes, optical techniques are the only viable solution at the moment. Vehicle manufacturers might embed identification technology in the future but ANPR is what we have now.
“We use industrial-quality machine vision cameras because of their robustness. An MTBF of 56,000 hours equates to six years’ constant use. Using conventional cameras and a back-office computer reduces that to two years. The conventional solution is much more fragile, partly as a result of the much lower wattages associated with machine vision operation, the need for fans versus sealed, IP67-compatible units, and the fact that conventional systems’ operating systems are susceptible to viruses and so on.
“An hour’s maintenance in a tunnel can cost a concession €10,000 by the time a road has been blocked off and re-opened, and reliability is key in the tolling industry. In-camera processing means that footprint on the network is almost zero. An enforcement data package, including a JPEG image, can be just 40kb versus 4-6Mb for ‘normal’ cameras. Low bandwidth needs mean that low-bandwidth and potentially wireless networks are a real possibility.”
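The reliability and bandwidth figures quoted above can be checked with some simple arithmetic. The sketch below is illustrative only: it assumes "kb" and "Mb" in the quote mean kilobytes and megabytes in decimal units, and the function names and the 10,000 events per day used in the usage note are hypothetical, not SurVision figures.

```python
HOURS_PER_YEAR = 24 * 365  # constant use, ignoring leap days


def mtbf_years(mtbf_hours: float) -> float:
    """Convert a mean time between failures in hours to years of constant use."""
    return mtbf_hours / HOURS_PER_YEAR


def payload_ratio(conventional_mb: float, machine_vision_kb: float) -> float:
    """How many times larger the conventional payload is (assumes 1 MB = 1000 kB)."""
    return conventional_mb * 1000.0 / machine_vision_kb


def daily_traffic_mb(events_per_day: int, payload_kb: float) -> float:
    """Total daily network load in MB for a given number of enforcement events."""
    return events_per_day * payload_kb / 1000.0
```

Under these assumptions, `mtbf_years(56000)` comes to a little under six and a half years, consistent with the "six years" quoted, and `payload_ratio(4, 40)` and `payload_ratio(6, 40)` give a 100- to 150-fold difference in payload size. A hypothetical 10,000 events per day at 40 kB each would amount to 400 MB per day.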
“Processing of the compressed 25-30fps inputs from IP-based cameras is fine when you’re talking about one camera into one PC but when you’re trying to deal with images from hundreds or even thousands of cameras it’s an altogether different proposition. With machine vision, and the very high-quality image captures it makes possible, there’s the possibility of reducing dramatically the numbers of cameras needed.”
Although machine vision offers very high performance, it has until recently struggled with traffic management applications. In the main, this is because it has traditionally been applied in very strictly controlled environments, typically indoor, where light and other factors are fully understood.
“Earlier machine vision offerings struggled in the outdoor environment but the newer hybrid systems are offering very credible performance characteristics,” Hearn says. “Auto-iris functions, very important for bright days, and zoom and focus functions have all improved dramatically. The controlling algorithms have also improved dramatically, even within the last year.
“This has been a trend over the last two to three years, and every machine vision manufacturer targeting the traffic management market is now offering auto-exposure and lens control technology on production systems.
“Some of the ITS-related machine vision products are being introduced with two or more streams out of the camera. One stream could be a compressed stream and, for example, sent to a server gathering video evidence. The other stream could provide uncompressed high-resolution stills or a series of stills.”
In part, these developments have come through cooperation with partners from other verticals. Some machine vision companies had already forged partnerships with companies in the security sector; it was the ‘pure’ machine vision specialists who struggled the most to address the traffic management market’s needs, he notes.
“We’ve a strange situation, in that machine vision started with CCTV-type products and set out to become more controllable. Now we’re seeing it move back into CCTV-type applications. Is machine vision taking over? Perhaps, in some cases. Standard CCTV cameras and images can still provide lots of benefits and if you can still use standard CCTV and the appropriate algorithms why look to use machine vision?
“Where machine vision really scores is in the reduced numbers of cameras needed and real economies of scale. It also offers robustness; with machine vision technology, there’s also an ability to ensure that every packet sent over the network is delivered to the host PC. The GigE Vision standard includes a function to detect lost packets and resend them. A standard CCTV system would just transmit a stream of data – if some were lost, it could not be recovered.
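The detect-and-resend idea described above can be sketched in a few lines. This is a generic illustration, not the GigE Vision protocol itself: the packet numbering, the `request_resend` callback and the retry limit are all assumptions made for the example, and a real implementation would follow the message formats defined in the standard.

```python
from typing import Callable, Dict, List, Optional


def missing_packet_ids(received: Dict[int, bytes], expected_count: int) -> List[int]:
    """Return the IDs in 0..expected_count-1 that have not arrived."""
    return [i for i in range(expected_count) if i not in received]


def assemble_frame(
    received: Dict[int, bytes],
    expected_count: int,
    request_resend: Callable[[List[int]], Dict[int, bytes]],
    max_rounds: int = 3,
) -> Optional[bytes]:
    """Rebuild one image from numbered packets, asking for any that were lost.

    `request_resend` is a hypothetical callback standing in for the camera:
    it takes the missing IDs and returns whichever packets it could resend.
    Note that `received` is updated in place.
    """
    for _ in range(max_rounds):
        missing = missing_packet_ids(received, expected_count)
        if not missing:
            break
        for packet_id, payload in request_resend(missing).items():
            received[packet_id] = payload
    if missing_packet_ids(received, expected_count):
        return None  # still incomplete after all retries
    return b"".join(received[i] for i in range(expected_count))
```

The contrast with a plain video stream is in the `None` branch: here the receiver knows that an image is incomplete and has a chance to repair it, rather than silently passing on damaged data.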
“The market for machine vision in traffic management is growing but that’s within an overall market which is also growing. I’d still place it at the top end in terms of applications – anything where triggering is key, so ANPR for instance. In the past, Stemmer has looked at applications such as locating Hazmat signs on vehicles entering tunnels.
“Cost and capability will always come into it and that has to be viewed on a case-by-case basis. There are play-offs between unit cost and data transmission, for instance. Application of intelligence at the front end can range from capture of that ‘one’ image for license plate location and extraction, for instance, up to monitoring of illegal traffic movements.
A process of learning
“Single-need customers will continue to be perfectly happy with single-use solutions. You’ll still see convergence, though. There’s also a learning process involved – as functionality is added, awareness increases and leads to wider adoption.
“That makes sense, as many products have the same mounting, communications and power needs. We’re working with partners and also engaging in our own R&D to combine systems and functionalities.”
In some parts of the world, legislation dictates the use of single-use technologies for some applications – enforcement, for instance. With machine vision being applicable to so many tasks, there is a potential issue for the future, Shockley notes.
“That’s an issue that’s more prevalent in the UK and Europe at present as technology tends to have been more widely adopted there. It’s currently less of an issue in the US but we’d be foolish to think it’s not a coming one.”
At this year’s ITS World Congress, Aldis will be launching Gridsmart 3.0. This will further enhance the value and appeal associated with what is already a single-sensor intersection monitoring solution, according to Shockley. “We’re having to overcome perceptions of video’s performance caused by multi-camera solutions,” he continues. “With Gridsmart, we don’t have the problems with shadows, glare, sunlight, snow and fog that multi-camera set-ups do.”
“Gridsmart 3.0 will feature SCATS-compatible data collection with an integrated history viewer for incident response which will allow users to review imagery to see what happened. For TMCs there will also be a control station-level status viewer which will give information on severe delays, incidents and any system outages.”
“Five years from now I think we’ll see much more rounded and application-specific products. That reflects an overall trend in machine vision: better-quality images allowing better-quality data to be extracted.
“A big, remaining challenge is how to extract from the customer what it is they want. We can readily tell potential customers about industrial applications of machine vision which, while their relevance to traffic management might not be readily apparent, demonstrate machine vision’s capabilities. We need to develop the forums which will allow that to happen.”
Image quality and apps
Image quality dominates machine vision customers’ thinking, says Mike Gibbons, Product Marketing Manager with
“We sell into markets worldwide but the biggest growth area at the moment is Asia, and China in particular. It’s not just that we’re seeing huge infrastructure spends there; we’re also seeing the willingness to spend extra on quality systems which last. There’s still a cost-consciousness but the specifications we’re seeing match machine vision’s capabilities very closely.
“Size is the easy answer as to why machine vision companies are now targeting the transport market: there’s tremendous volume, especially for a company such as Point Grey which sells across the scale from low-cost USB cameras to higher-end $2,500 units. There are also a lot of integrators, so from a supplier’s perspective there are lots of upsides and few downsides.
“Lower-end cameras feature less functionality but still have CCD-level image quality and general-purpose I/O connectors. USB cameras connect directly to embedded PCs. There’s still a market for CCTV but we’re becoming ever more competitive. Also, when there’s already a system in place, let’s say one using analogue cameras, these can be replaced fairly easily with more advanced cameras offering an HD-SDI interface.
A brighter, clearer future
“It’s important to understand the application – high-end surveillance is different from standard surveillance, with significantly higher shock and vibration requirements, better S/N levels and extended temperature ranges,” says Tue Mørck, Director, Global Business Development with JAI.
“Traffic applications vary greatly. Some traffic management can be done with simple camera systems but the efficiency of high-end tolling systems is dependent on the actual license plate read rate.
“Today’s applied cameras are progressive-scan megapixel cameras, replacing interlaced low-resolution cameras which were often standard video cameras following the CCIR or EIA standards. The challenge from the camera side has been to achieve the same or better Signal-to-Noise levels [S/N] with significantly smaller pixel structures. It’s already possible to both have megapixel imagers and use affordable high-quality lenses – the smaller the imager the smaller and less bulky the lens can be. Lower noise and higher sensitivity have made it possible to achieve useful images where it was not possible a few years back.
“Also imager features like better blooming suppression and significantly lower smear levels – effects typically seen from headlights and strong reflections – have played a major role in the development of better traffic and security imaging systems. Both CCD and CMOS imagers have improved but with the improved CMOS sensor quality showing little shutter leakage and practically no smear it is possible to achieve higher frame rates as well. This opens the way to new applications since the time between images becomes longer or because you can grab several images within a short period.
“This is where we stand – superb images showing crisp colours if in colour, high resolution, little smear and blooming with great tap balancing and better S/N than ever before. The imager technology future is even higher resolutions, higher sensitivity per area, smaller pixels and higher frame rates at lower pricing, all to the benefit of the end-user. The camera technology future is more functionality like time protocol adaptation, compression, advanced colour conversion, built-in triggering and light metering. The standard interface technology has simplified interfacing and this process will continue to deal with broader bandwidth requirements and more image data. For the longer term, five to 10 years from now, the spectrum of the imagers is expected to expand at both ends and the sensitivity to increase dramatically. Cameras are expected to be able to see in 3D and directly output derived data of various kinds.
“As a result, ITS camera sub-systems will be able to cover wider areas, for example more lanes or signalised crossings, still detecting the details, and at a lower price. Imagers will shrink in size and the optics will get smaller, better and cheaper; the same goes for processing power. Seeing in 3D will make it possible to determine the size, position, speed and direction of vehicles, and the higher sensitivity and expanded spectrum will make it cheaper to see at night as well.
“Everything is heading in the right direction for the consumer. But it’s important to use those pixels well.”
“That spread of demand results in applications all over the map: we’re seeing lots of tolling and parking gate applications where plate and driver capture is sought. We’re also seeing highway applications demanding very high shutter speeds and low distortion – a good example of this is software-based vehicle speed analysis that accurately measures speed without the expense of hardware triggers such as ground loops.
“Increasingly, there are a lot of grey areas between surveillance and traffic management.”
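The software-based speed analysis mentioned above comes down to measuring how far a vehicle moves between two precisely timed frames. A minimal sketch follows, assuming a single fixed scale factor (`metres_per_pixel`) along the direction of travel; that calibration value and the function name are hypothetical, and a real system would also need to correct for perspective and lens distortion.

```python
def speed_kmh(
    pos1_px: float,
    pos2_px: float,
    t1_s: float,
    t2_s: float,
    metres_per_pixel: float,
) -> float:
    """Estimate speed from a tracked point's position in two timestamped frames."""
    if t2_s <= t1_s:
        raise ValueError("second frame must be later than the first")
    distance_m = abs(pos2_px - pos1_px) * metres_per_pixel
    return distance_m / (t2_s - t1_s) * 3.6  # m/s to km/h
```

For example, a number plate that moves 500 pixels in half a second at an assumed 0.02 m per pixel has travelled 10 m, giving 20 m/s or 72 km/h. The accuracy of the result depends directly on the accuracy of the two timestamps, which is why precise triggering and low-distortion images matter for this application.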
Machine vision can add value and reduce cost in several different ways, he says.
“Very low-cost PCs linked to USB or HD-SDI cameras result in very compact traffic management systems which can be mounted in a single location, and because the cameras have general-purpose I/O connectors the need for external triggers for strobes and flashes is eliminated.”
Further improvements will reduce lighting needs, and cost, even further: “Higher-quality CCDs are allowing integrators to reduce the amount of lighting that they need but even a lot of the new CMOS sensors with global shutters, which are typically less expensive, are also offering much better sensitivity. We can expect to see some real gains in sensitivity, to the point where sensors won’t need external lighting. That’ll result in some huge savings.
“The amount of ‘smarts’ in the camera will also increase. I can see on-camera black-and-white/colour conversion and ANPR happening quite readily. One thing that we might see, a few more years out, is 3D imagery. It is possible to derive 3D from a single camera in some circumstances but what I’d expect to see is 3D with a much-reduced camera array. We already supply machine vision cameras to GIS mapping companies and I suspect that future traffic management applications will use much of the same information that they already do.”
Industrial, not consumer-grade
A driver of system cost is the use of industrial-grade, not consumer-grade technology but that’s important for validation, to show that equipment gives accurate data, says
“This market is maturing but the technology remains more expensive than consumer products. Reliability and robustness drive the balance, so we can see a case for e-tolling and security applications. Decision-makers are persuaded by apps that generate revenue, on the other hand, and business models are still evolving. For example, if you discuss machine vision with some of the transport industry’s key players they’ll confirm machine vision’s performance but tell you that the cost’s too high because there’s no business model to generate revenue from traffic management.
“As a manufacturer we’re working on it. We realise that price is a penetration issue. We’re a camera supplier but we also take a vertical market approach, working to understand the needs of integrators. We don’t supply an ‘ANPR camera’, for example.
“ITS is a focus area for us but we don’t limit ourselves to certain types of applications; we see developments in smarter cities, for example, and have a mission to prove to users that adding software capabilities inside cameras is the way forward.”
A passing phase
“In terms of future development paths, I think we’ve already had the ‘bump’ and we’re now into a phase of steady development. I can see a lot more pre-processing being added at the camera end which reduces stresses at the PC side. That’s going to make data handling easier.
“Machine vision price is the product of three components: the camera itself; the lens; and lighting sources.
“With cameras, we’re seeing a huge shift to frame-grabber camera solutions offering direct connection to the PC. That overcomes USB’s bandwidth and FireWire’s distance issues. The GigE Vision standard introduced in the last few years has resulted in a move away from proprietary standards. Older technology used a data acquisition card which increased cost but pluggable IP-standard cameras have removed that cost.
“The lens component is still expensive and seems unlikely to go down in price as it is dependent on rare and raw materials prices. With the lighting component we’ve seen a shift to LED-type sources, the prices of which are also coming down. For the same lighting intensity, prices are probably 50 per cent lower than three years ago.
“Falling costs and newer networking technologies allow the camera to be completely remote from the vision acquiring system so we’re seeing applications such as surveillance cameras on police car roofs as well as remote data access from tablet computers.
“The military, another machine vision market, is using commercial off-the-shelf technology in generic ways and achieving vendor-independence. That’s resulting in some very low-cost solutions.
“The biggest development costs are actually at the software level. Most componentry offers a generic software programming API, so we’re seeing differentiation on the software side. In the traffic sector, for example, ANPR providers are using generic cameras but are remaining very closed on the software side.”
The GigE Vision standard is perhaps the biggest breakthrough in his eyes: “Lots of equipment, including light sources, is now conforming to this standard, and it’s not just plug and play, it’s about true interoperability; it’s about truly communicating rather than just connecting everything up and having things ‘working’. What it means is that multiple PCs can see the same images at the same time without interruptions to the flow – data is effectively ‘broadcast’.”
Holding the door open
Paul Kozik, Product Manager with
“The digital shutter and digital media, which reduce the time it takes to process a violation, have already resulted in applications which I’d consider to be at the top end of capability need, such as in tolling and enforcement. Basically, machine vision can address any application where there is a need for still images of moving vehicles, together with image processing of some kind such as licence plate recognition or vehicle detection.
“But it is the GigE Vision standard which has really opened the way for machine vision to insinuate its way into other areas of traffic management where cost has previously proven prohibitive.
“The GigE Vision standard has commoditised machine vision. It has reduced the need for users to come up with proprietary interfaces and that’s going to drive down cost significantly by allowing imaging libraries and cameras to talk more easily with each other.” Critics of GigE Vision have accused it of being something of a ‘club’, in that those using the standard have to sign up to it, but Kozik dismisses this.
“I can understand where that perspective comes from but GigE Vision’s not a members-only institution. The solutions currently being addressed using this technology are high-end; we’re servicing those who know how to develop applications using C++ or other programming languages, which is different from the end-user solutions typically developed for the security sector. Insofar as machine vision is an advanced technology, there needs to be a certain level of sign-up but only from the developers’ end – users need only know that the systems they buy are compatible and interoperable.
“This is well understood within the machine vision world; the ITS world, however, is not familiar with such high-end interfaces. The machine vision industry needs to adapt the selling proposition to the transport world and overcome such misconceptions.”
He still considers CCD-based systems to have the edge at present. “There are companies pushing CMOS technology, which performs well in bright conditions where blooming or smearing can occur, but I still see CCD as having the edge where limited light is available and sensor sensitivity is critical.”
The adaptability and real-time capabilities of machine vision are readily demonstrated by a whole host of applications related and unrelated to transport.
For instance, vision specialists from Firstsight Vision worked closely with food processing system company AEW Delford, now part of Marel Food Systems, to apply advanced computer and vision technology to look at and measure a cross-section of product such as bacon prior to it being sliced. This enabled the latter’s food slicing machines to produce greater volumes of more accurate slices to a closer tolerance with consistent pack weights.
The outcome, a patented, flexible and extremely accurate measurement solution that could not be achieved by traditional machine vision technology, has been adopted on a range of commercially available high-speed, high-volume, continuous-feed slicer products.
For bacon slicing, a high-speed camera is used to scan the leading face of the product before slicing. The system measures the area, lean/fat ratio and the lean/fat structure and takes into account their different densities before adjusting the thickness of the next slice and providing grading information to labelling technology down the line. This all helps to maximise the on-weight percentages and minimise giveaway.
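The thickness adjustment described above can be illustrated with a simple mass balance: the weight of a slice is roughly its thickness multiplied by the lean and fat areas of the cut face, each weighted by its density. The sketch below is an assumption about how such a calculation could be set up, not the patented method; the function name, units and any density values supplied to it are illustrative only.

```python
def slice_thickness_mm(
    target_weight_g: float,
    lean_area_cm2: float,
    fat_area_cm2: float,
    lean_density_g_cm3: float,
    fat_density_g_cm3: float,
) -> float:
    """Thickness needed for the next slice to hit a target weight.

    Assumes the measured face is representative of the whole slice, so
    weight = thickness * (lean_area * lean_density + fat_area * fat_density).
    """
    grams_per_cm = (
        lean_area_cm2 * lean_density_g_cm3 + fat_area_cm2 * fat_density_g_cm3
    )
    if grams_per_cm <= 0:
        raise ValueError("face must have a positive measured area and density")
    return target_weight_g / grams_per_cm * 10.0  # cm to mm
```

With illustrative values of 30 cm² of lean at 1.0 g/cm³ and 10 cm² of fat at 0.9 g/cm³, the face carries 39 g per centimetre of thickness, so a 19.5 g slice would need to be 5 mm thick. Because the face is measured before every cut, the thickness can be recalculated slice by slice as the lean/fat ratio changes.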
For products such as cheese and ham, the vision system is used in conjunction with laser profiling. This follows contours so closely at the cutting face that virtually any variation in product composition, however irregular, is detected. Holes in cheese, voids in ham, areas of fat and even lean/fat ratios are all carefully and accurately measured at the blade, slice by slice. This means high on-weights, extremely low giveaway, consistently accurate grading, high output and excellent product presentation with minimal manual intervention.
Sony’s proprietary ExView technology represents the state of the art, he continues: “Originally featured in the ICX285, arguably the most commonly used CCD device in the tolling sector, ExView HAD offers excellent sensitivity and near infrared response; near infrared wavelengths are often used with light strobes during night imaging. And with ExView now available in 3MP and 6MP versions, one camera can replace several where multiple lanes are to be monitored. All three devices are featured in the Prosilica GT lineup, a GigE Vision camera family designed by Allied Vision Technologies for the traffic market.
“Precision iris lens control is another noteworthy area of development, particularly what’s been done by Kowa with its P-Iris product and IP camera company Axis. Both are security market developments which are favourable for digital imaging. Existing solutions such as video iris do not support asynchronous triggering and have been rejected for more demanding traffic applications. Precise iris, in comparison, allows users to adjust the aperture to a fixed f-stop without additional drift, optimising the system for best depth of field, minimising blooming or smearing behaviour and improving system sensitivity on cloudy days.
“Also significant is IEEE 1588, a clock synchronisation protocol similar to NTP but with microsecond precision for real-time devices. Also known as PTP [Precision Time Protocol], this has already been adopted by the power grid industry. For traffic imaging, PTP allows a series of events to be identified by the same timebase. This capability can eliminate system components and reduce system complexity.”
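The core of PTP-style synchronisation is a four-timestamp exchange between a master clock and a device such as a camera. The sketch below shows the standard offset and delay calculation, assuming the network delay is the same in both directions; the function and variable names are chosen for illustration and are not taken from any particular PTP implementation.

```python
from typing import Tuple


def ptp_offset_and_delay(
    t1: float, t2: float, t3: float, t4: float
) -> Tuple[float, float]:
    """Estimate a device's clock offset and the one-way path delay.

    t1: master sends a sync message (master clock)
    t2: device receives it (device clock)
    t3: device sends a delay request (device clock)
    t4: master receives it (master clock)
    Assumes the path delay is symmetric.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2.0
    delay = ((t2 - t1) + (t4 - t3)) / 2.0
    return offset, delay
```

As a worked case in microseconds: if a camera's clock runs 50 µs ahead of the master and the one-way delay is 10 µs, a sync sent at `t1 = 1000` arrives at `t2 = 1060` on the camera's clock, and a delay request sent at `t3 = 1100` arrives at `t4 = 1060` on the master's clock. The function then returns an offset of 50 and a delay of 10, which the camera can use to correct its timestamps onto the shared timebase.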
Looking forward, we will see machine vision become more pervasive. Continued market growth during the difficult economic conditions of the last few years is evidence of this, Kozik feels.
“Things have become much more interoperable and they ‘play’ nicely together now. There’s a reason why the machine vision providers are targeting the transport sector: there’s a big and growing customer base worldwide. Regionally, we as a company are seeing a little more business in North America, where tolling is doing well but enforcement is less prevalent, but the prime growth markets at present are Asia and Europe.”
Getting more from CCTV
While some concentrate on advancing the cause of machine vision, getting the mix between machine vision and traditional CCTV right or looking at how to extract the maximum possible data from captured images, UK firm Visimetrics has been working on incorporating the maximum possible information into stored images to create FiND (Forensic investigation Network Database), an intelligent post-event analytics system.
Traditionally video analytics has been used as an event tool, with the detection of individuals or objects moving in a certain way triggering alarms. However, that approach means that operators often do not know what they are looking for until something has occurred. By contrast, FiND works in real time; metadata on all recorded imagery is stored as text in a database to produce an ‘index’ of all relevant points of information. This facilitates very fast searches and identification of vehicle classification, people classification, license plate identification using CCTV cameras, text/logo detection, baggage detection, complex background processing and PTZ compensation.
“Because you hold point-of-interest data on individuals, vehicles and so on, searches of the very large amounts of data generated by larger networks of cameras can be narrowed down significantly,” says Craig Howie, Commercial Director. “For instance, a search involving a 300-camera network which could typically take many days can be condensed into a search of a very small number of cameras’ images taking just minutes. As you’re scanning data rather than video, searches can literally take seconds per camera. A unique set of algorithms allows standardisation of, for instance, colour – CCTV cameras have problems standardising on colour, so our solution has been to do so ahead of any analysis.”
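Searching stored metadata rather than video can be illustrated with a small inverted index, in which each attribute value points to the camera and time at which it was seen. This is a generic sketch of the indexing idea, not FiND's actual design; the class, attribute names and camera identifiers are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Sighting = Tuple[str, int]  # (camera_id, timestamp in seconds)


class MetadataIndex:
    """Maps (attribute, value) pairs to the sightings that carry them."""

    def __init__(self) -> None:
        self._index: Dict[Tuple[str, str], Set[Sighting]] = defaultdict(set)

    def add(self, camera_id: str, timestamp: int, **attributes: str) -> None:
        """Record one detected object and its text attributes."""
        for key, value in attributes.items():
            self._index[(key, value)].add((camera_id, timestamp))

    def search(self, **criteria: str) -> List[Sighting]:
        """Return sightings matching every criterion, sorted by camera and time."""
        if not criteria:
            return []
        matches = [self._index.get((k, v), set()) for k, v in criteria.items()]
        return sorted(set.intersection(*matches))
```

A search such as `index.search(kind="vehicle", colour="red")` then touches only the stored text entries and returns a short list of cameras and times to review, which is the narrowing-down effect described above. Standardising values such as colour before they are indexed matters here, since the lookup relies on exact matches.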
FiND is the subject of a three-year collaborative development project between Visimetrics and the UK’s Loughborough University, and Howie sees a huge potential market for the product. “Essentially, we’re talking about anywhere there are large-scale video deployments,” he continues. “One of the biggest problems of digital video is the lack of compatibility and the development work behind FiND has looked to create a standard allowing any form of video to be used to produce the searchable metadata outputs. That’s useful for, say, retrospective analysis of a major incident by the police who can take the video from 24 hours either side and quickly find what they’re looking for among the outputs from a large number of cameras.
“Trials are ongoing in the UK and we’re well into beta testing. Currently we’re optimising, using a broader test base to take account of the effects of weather – what, for example, are the visual effects of rain on cobblestones on the colour of a car? We’re gathering a lot of data, are close to the point of allowing users to play with the system, and expect the system which hits the market to be fully mature.” The philosophy behind FiND stems from Visimetrics’ background as a data storage provider, he adds.
“People have tended to focus on how ‘clever’ video analytics is and sell it as an alarm-raising tool. In practice, we believe things tend to happen differently in real life. It’s often the most obscure piece of evidence which cracks a case. What we’ve come up with is effectively an indexing tool which identifies key points of interest across an entire CCTV system within a one-click search.”
Although the specifics have yet to be nailed down, FiND is likely to be licensed as a software product in its own right which can be incorporated into other manufacturers’ systems. It is scheduled to be market-ready by the year-end.
visimetrics.com