Computer vision, an AI technology that allows computers to understand and label images, is now used in convenience stores, driverless car testing, daily medical diagnostics, and in monitoring the health of crops and livestock.
From our research, we have seen that computers are proficient at recognizing images. Today, top technology companies such as Amazon, Google, Microsoft, and Facebook are investing billions of dollars in computer vision research and product development.
This post explores which uses of computer vision technology are currently popular across various industries:
• Retail and Retail Security
• Automotive
• Healthcare
• Agriculture
• Banking
• Industrial
We aim to give business leaders a high-level view of the available applications in the market, and a possible starting point for their own.
Computer Vision in Retail and Retail Security
Perhaps the most widely known use of this technology is the Amazon Go store, where shoppers need not wait in line at the checkout counter to pay for their purchases. Located in Seattle, Washington, the Go store is fitted with cameras specialized in computer vision. It initially admitted only Amazon employees but welcomed the public beginning in early 2018.
The technology that runs behind the Go store is called Just Walk Out. As shown in this one-minute video, shoppers activate the iOS or Android mobile app before entering the gates of the store.
Cameras are placed in the ceiling above the aisles and on shelves. Using computer vision technology, the company claims, these cameras can determine when an object is taken from a shelf and who has taken it. If an item is returned to the shelf, the system is also able to remove that item from a customer’s virtual basket. The network of cameras allows the app to track people in the store at all times, ensuring it bills the right items to the right shopper when they walk out, without having to use facial recognition.
As the name suggests, shoppers are free to walk out of the store once they have their products. The app will then send them an online receipt and charge the cost of the products to their Amazon account.
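To make the virtual basket idea concrete, here is a minimal sketch in Python. It is not Amazon’s implementation, which has not been published. The class name, event names and prices are all hypothetical, and the sketch assumes the camera system has already decided which shopper took or returned which item.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class VirtualBasket:
    """Tracks what one shopper is currently holding (hypothetical model)."""
    shopper_id: str
    items: Counter = field(default_factory=Counter)

    def take(self, sku: str) -> None:
        # A "taken from shelf" event adds one unit to the basket.
        self.items[sku] += 1

    def put_back(self, sku: str) -> None:
        # A "returned to shelf" event removes one unit, if the shopper had it.
        if self.items[sku] > 0:
            self.items[sku] -= 1
            if self.items[sku] == 0:
                del self.items[sku]

    def total_cents(self, prices_cents: dict[str, int]) -> int:
        # Charged on walk-out: price times quantity, summed over the basket.
        return sum(prices_cents[sku] * qty for sku, qty in self.items.items())


# Made-up shopper, items and prices, for illustration only.
basket = VirtualBasket("shopper-1")
basket.take("sandwich")
basket.take("soda")
basket.put_back("soda")
receipt_cents = basket.total_cents({"sandwich": 599, "soda": 149})
```

In this sketch the shopper is charged only for the sandwich, because the soda went back on the shelf before they walked out.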
In retail fashion, Amazon has applied for a patent for a virtual mirror. In the patent, the company said, “For entertainment purposes, unique visual displays can enhance the experiences of users.”
The virtual mirror technology, sketched in the patent image below, is described as a blended-reality display that puts a shopper’s images into an augmented scene and puts the individual in a virtual dress.
According to the patent, the virtual mirror will use enhanced facial detection, a subset of computer vision, whose algorithms will locate the eyes. Catching the user’s eye position will let the system know what objects the user is seeing in the mirror. The algorithms will then use this data to control the projectors.
Amazon has not made any announcement on this development and the virtual mirror has not been deployed, but the sketches released by the patent office show how a user could see illuminated objects reflected on the mirror combined with images transmitted from the display device to create a scene.
Amazon has previously released Echo Look, a voice-activated camera that takes pictures and six-second videos of an individual’s wardrobe and recommends combinations of outfits.
This 2-minute review shows how the device uses Amazon’s virtual assistant Alexa to help users compile images of their clothes and, according to Amazon, can even recommend which outfit looks better on the individual.
As the video shows, a user can speak to the gadget and instruct it to take full-body photos or a six-second video. The content is collated to create an inventory of the user’s wardrobe, according to Amazon. Alexa compares two photos of the user in different outfits and recommends which looks better.
In retail security specific to groceries, Massachusetts-based StopLift claims to have developed a computer-vision system that could reduce theft and other losses at store chains. The company’s product, called ScanItAll, is a system that detects checkout errors or cashiers who avoid scanning, also called “sweethearting.” Sweethearting is the cashier’s act of pretending to scan a product at the checkout in collusion with a customer who could be a friend, family member or fellow employee.
ScanItAll’s computer vision technology works with the grocery store’s existing ceiling-installed video cameras and point-of-sale (POS) systems. Through the camera, the software “watches” the cashier scan all products at the checkout counter. Any product that is not scanned at the POS is labelled as a “loss” by the software. Once management has been notified of the loss, the company says, it is up to them to take the next step: confronting the staff and taking measures to prevent similar incidents in the future.
The three-minute video below shows how ScanItAll detects the many ways items are skipped at checkout, such as pass around, random weight abuse and cover-up, among others, and how grocery owners can potentially stop the behaviour.
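The reconciliation step described above, comparing what the camera saw with what the POS recorded, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not StopLift’s actual method. The function name and item lists are hypothetical, and the sketch assumes the video analysis has already produced a list of items seen crossing the counter.

```python
from collections import Counter


def find_unscanned(seen_by_camera: list[str], scanned_at_pos: list[str]) -> Counter:
    """Return items the camera saw cross the counter that the POS never recorded."""
    # Counter subtraction keeps only positive counts, so items scanned as
    # many times as they were seen drop out of the result.
    return Counter(seen_by_camera) - Counter(scanned_at_pos)


# Made-up transaction: two steaks crossed the counter, one was scanned.
seen = ["milk", "steak", "steak", "bread"]
scanned = ["milk", "steak", "bread"]
losses = find_unscanned(seen, scanned)
```

Here `losses` contains a single unscanned steak, which is the kind of discrepancy the software would flag for management to review.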
Based on news stories, the company claims to have the technology installed in some supermarkets in Rhode Island, Massachusetts, and Australia.
Computer Vision in the Automotive Industry
According to the World Health Organization, more than 1.25 million people die each year as a result of traffic incidents. The WHO adds that road traffic injuries are predicted to become the seventh leading cause of death by 2030 if no sustained action is taken. Nearly half of the road casualties are “vulnerable road users”: pedestrians, cyclists and motorcyclists. According to this research, there’s a clear theme to the vast majority of these incidents: human error and inattention.
One company that claims to make driving safer is Waymo. Formerly known as the Google self-driving car project, Waymo is working to improve transportation for people, building on the self-driving car and sensor technology developed in Google Labs.
Waymo cars are equipped with sensors and software that can detect pedestrians, cyclists, vehicles, road work and other objects in all directions from up to three football fields away. The company also reports that the software has been tested on 7+ million miles of public roads to train its vehicles to navigate safely through daily traffic.
The 3-minute video below shows how the Waymo car navigates through the streets autonomously.
According to the video, it is able to follow traffic flow and regulations and detects obstacles in its way.
The company claims to use deep networks for prediction, planning, mapping and simulation to train the vehicles to manoeuvre through different situations, such as navigating construction sites, giving way to emergency vehicles, making room for cars that are parking, and stopping for crossing pedestrians.
Perhaps the best-known case is Tesla, whose Autopilot cars the company says are equipped for full self-driving capability.
Each vehicle is fitted with eight cameras for 360-degree visibility around the car, with a viewing distance of up to 250 meters. Twelve ultrasonic sensors enable the car to detect both hard and soft objects. The company claims that a forward-facing radar enables the car to see through heavy rain, fog, dust and even the car ahead.
Its camera system, called Tesla Vision, works with vision processing tools which the company claims are built on a deep neural network and able to deconstruct the environment to enable the car to navigate complex roads.
This 3-minute video shows a driver with his hands off the wheel and feet off the pedals as they move through rush hour traffic in a Tesla Autopilot car.
While there have been accidents while Autopilot was engaged, the driver almost always turned out to be at fault for ignoring the car’s warnings. In one such incident, during the six seconds that the driver’s hands were not on the wheel, his SUV hit a concrete divider, killing him. It was later discovered that neither the driver nor the car activated the brakes before the crash.
Computer Vision in the Healthcare Industry
In healthcare, computer vision technology is helping healthcare professionals classify conditions or illnesses more accurately, which may save patients’ lives by reducing or eliminating inaccurate diagnoses and incorrect treatment.
1. Gauss Surgical
Gauss Surgical has developed blood monitoring solutions that are described as estimating blood loss in real time during medical situations. This solution, the website reports, optimizes transfusions and recognizes haemorrhage better than the human eye.
Gauss Surgical’s Triton line of blood monitoring solutions includes Triton OR, which uses an iPad-based app to capture images of blood on surgical sponges and suction canisters. These images are processed by cloud-based computer vision and machine learning algorithms to estimate blood loss. The company says the application is currently used by medical professionals in hospital operating rooms during surgical operations or Caesarean deliveries.
This 6-minute video shows how Triton captures images of sponges or cloths that absorb blood during a medical procedure, works as a real-time scanner and estimates the patient’s potential blood loss. Gauss Surgical CEO Siddharth Satish explains how the solution uses computer vision technology to predict the onset of haemorrhage.
The Triton OR app underwent clinical studies in childbirth settings to validate accuracy and precision and was cleared by the US Food and Drug Administration in 2017.
2. DeepLens and DermLens
Amazon Web Services (AWS) also developed DeepLens, a programmable deep learning-enabled camera that can be integrated with open source software in any industry. In this video, DeepLens is described as a kit that programmers from various industries can use to develop their own computer-vision application.
One healthcare application that uses the DeepLens camera is DermLens, which was developed by an independent startup. DermLens aims to help patients monitor and manage a skin condition called psoriasis. Created by Terje Norderhaug of digital health startup Predictably Well, the DermLens app is intended as a continuing care service in which the reported data is available to the physician and care team.
The 4-minute video instructs developers on how to create and deploy an object-detection project using the
For DermLens, this short video explains how the application’s algorithms were trained to recognize psoriasis by feeding them 45 images of skin showing the red, scaly patches typical of the condition. Each image in the set comes with a mask indicating the abnormal skin. The computer vision device then sends data to the app, which in turn presents the user with an estimate of the severity of psoriasis.
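The source does not say how DermLens turns a mask into a severity estimate. One simple possibility, shown here purely as an assumption, is to measure what share of the photographed skin the mask marks as abnormal. The function name and the tiny 4x4 mask are hypothetical, and the result is not a clinical score.

```python
def affected_fraction(mask: list[list[int]]) -> float:
    """Share of pixels marked abnormal (1) in a binary segmentation mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    abnormal = sum(sum(row) for row in mask)
    return abnormal / total


# Made-up 4x4 mask: 1 marks abnormal skin, 0 marks normal skin.
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
fraction = affected_fraction(mask)  # 6 of the 16 pixels are marked abnormal
```

A real system would work on full-resolution masks produced by the trained model, and would likely combine the affected area with other signals before presenting anything to a patient or physician.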
The DermLens team also created a mobile app for self-reporting of additional symptoms such as itching and fatigue.
Computer Vision in the Agriculture Industry
Some farms are beginning to adopt computer vision technology to improve their operations. Our research suggests that these technologies aim to help farmers adopt more efficient growth methods, increase yields, and eventually increase profit. We’ve covered agricultural AI applications in great depth for readers with a more general interest in that area.
Slantrange claims to offer computer vision-equipped drones that are connected to what the company calls an “intelligence system” consisting of sensors, processors, storage devices, networks, artificial intelligence analytics software and other user interfaces to measure and monitor the condition of crops.
The company claims that the drone captures images of the fields to show the different signatures of healthy crops compared with “stressed” crops. The stressors include pest infestations, nutrient deficiencies and dehydration. The system also produces metrics to estimate potential yield at harvest, among others. These signatures are passed on to the SlantView analytics system, which interprets the data and ultimately helps farmers make decisions related to treatment for stress conditions.
This 5-minute video provides a tutorial on how to use the basic functions of the SlantView app, starting with how a user can use the solution to identify stressed areas through drone-captured images.
Animal facial recognition is one feature that Dublin-based Cainthus claims to offer. Cainthus uses predictive imaging analysis to monitor the health and well-being of crops and livestock.
The system can identify individual cows in seconds, based on hide patterns and facial recognition, and tracks key data such as food and water intake, heat detection and behaviour patterns. The AI-powered algorithms use this information to send health alerts to farmers, who make decisions about milk production, reproduction management and overall animal health.
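Cainthus has not published how its alerts are generated. As a rough illustration of the idea, the sketch below flags a cow when its latest feed intake falls well below its own recent average. The function name, the 80 per cent threshold and the herd data are all assumptions made for this example.

```python
from statistics import mean


def intake_alert(daily_intake_kg: list[float], drop_threshold: float = 0.8) -> bool:
    """Flag an animal whose latest intake falls well below its own recent average."""
    if len(daily_intake_kg) < 2:
        return False  # not enough history to compare against
    *history, latest = daily_intake_kg
    baseline = mean(history)
    # Alert when the latest reading is under the chosen share of the baseline.
    return latest < drop_threshold * baseline


# Made-up daily feed intake per cow, oldest reading first.
herd = {
    "cow-101": [22.0, 21.5, 22.5, 21.0],
    "cow-102": [23.0, 22.0, 24.0, 15.0],
}
alerts = [cow for cow, intake in herd.items() if intake_alert(intake)]
```

Only the second cow is flagged here, since its latest intake is far below its three-day average. Comparing each animal with its own baseline, rather than with a herd-wide figure, is one way per-animal identification could make alerts more specific.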
Cainthus also claims to provide features like all-weather crop analysis in rates of growth, general plant health, stressor identification, fruit ripeness and crop maturity, among others.
Cargill, a producer and distributor of agricultural products such as sugar, refined oil, cotton, chocolate and salt, recently partnered with Cainthus to bring facial recognition technology to dairy farms worldwide. The deal includes a minority equity investment from Cargill although terms were not disclosed.
According to news reports in February 2018, Cargill and Cainthus are working on trials using pigs, and aim to release the application commercially by the end of the year. There are also plans to expand the application to poultry and aquaculture.
Computer Vision in Banking
While most of our previous coverage of AI in banking involves fraud detection (rightly so) and natural language processing, some computer vision technology has also found its way into the banking industry.
1. Mitek Systems
Mitek Systems offers image recognition applications that use machine learning to classify, extract data, and authenticate documents such as passports, ID cards, driver’s licenses, and checks.
The applications work by having customers take a photo of an ID or a paper check using their mobile device and send it to their bank, where computer vision software on the bank’s side verifies authenticity. Once verified and accepted by the user’s bank, the application or check is processed. For deposits, funds typically become available to the customer within a business day, according to the Mitek company website.
The two-minute demo below shows how the Mitek software works on mobile phones to capture the image of a check to be deposited to an account:
To start the process, the user enters his mobile phone number into the bank’s application form. A text message will be sent to his phone with a link the user can click to open an image-capture experience. The customer can choose from a driver’s license, ID card or passport. Mitek’s technology recognizes thousands of ID documents from around the world. Front and back images of the ID or document are required.
Once the user has submitted the images, the application will provide real-time feedback to ensure that high-quality images are captured. The company claims that its algorithms correct images by dewarping and deskewing them and by compensating for distortion and poor lighting conditions.
Other industrial use cases:
1. Osprey Informatics
In the industrial sector, computer vision applications such as those from Osprey Informatics are being used to monitor the status of critical infrastructure, such as remote wells and industrial facilities, as well as work activity and site security. On its website, the company lists Shell and Chevron among its clients.
In a case study, a client claims that Osprey’s online visual monitoring system for remote oil wells helped it reduce site visits and the equivalent cost. The client was seeking ways to make oil production more efficient in the face of depressed commodity prices. The study notes that it turned to Osprey to deploy virtual monitoring systems at several facilities for operations monitoring and security, and to identify new applications to improve productivity.
The Osprey Reach computer vision system was deployed to the client’s high-priority well sites to provide 15-minute time-lapse images of specific areas of the well, with an option for on-demand images and a live video. Osprey also deployed at a remote tank battery, enabling operators to read tank levels and view the containment area.
According to the case study, the client was able to reduce routine site visits by 50 per cent since deploying the Osprey solution. The average cost for an in-person well site inspection was also reduced from $20 to $1.
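To put the per-inspection figures in proportion, the drop from $20 to $1 can be expressed as a percentage. The helper below is a small illustrative calculation, and its name is hypothetical. It uses only the two dollar amounts quoted in the case study.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a cost fell, relative to its starting value."""
    if before <= 0:
        raise ValueError("starting cost must be positive")
    return (before - after) * 100 / before


# Average cost per well site inspection, as quoted in the case study.
inspection_cut = percent_reduction(20, 1)
```

By this calculation the cost per inspection fell by 95 per cent, on top of the 50 per cent reduction in routine site visits.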
The 3-minute video below shows how the Osprey Reach solution claims to allow operators to remotely monitor oil wells, zooming in and out of images to ensure there are no leaks in the surrounding area. The on-demand video also shows the pump jack operating at a normal cadence.
Computer vision applications have emerged in more industries, although some have adopted the technology faster than others. Whatever computer vision technology exists continues to rely on the human element, to monitor, analyze, interpret, control, decide and take action.
In the automotive industry, global companies such as Google and Tesla are moving forward in improving self-driving cars equipped with computer vision cameras. However, with reported fatal accidents, it is clear that these cars are not completely ready to be commercially available and cannot be entirely autonomous.
In retail stores such as the Amazon Go store, human employees continue to work in customer service, and behind the scenes to train the algorithms and confirm that the machine learning capability is on track. In terms of retail security, the technology helps capture videos of theft incidents, but human resources must step in to correct erring employees.
The application of computer vision in agriculture has been slow to take off. But companies such as Cainthus have entered this market aiming to take the technology from other industries and apply it to agriculture. These applications claim to offer farmers the opportunity to conduct precision farming, to raise production at lower cost. The partnership of Cainthus and Cargill could potentially open other forms of artificial intelligence to the rest of the industry.