Researchers at Yale-NUS College have developed new computer vision and deep learning approaches for improving degraded images. The technology extracts more accurate data from videos by addressing low-level vision problems caused by environmental factors such as rain and night-time conditions. The research was presented at the 2021 Conference on Computer Vision and Pattern Recognition (CVPR).
Earlier technology used in applications such as automatic surveillance systems and autonomous vehicles is often affected by environmental factors, which results in poor data extraction.
How Do Environmental Factors Impact Images?
Night-time images are affected by low light and by artificial light effects such as glare, glow, and floodlights. Rain also degrades images, through rain streaks or rain accumulation.
According to Associate Professor of Science Robby Tan, who led the research team from Yale-NUS College, “Many computer vision systems like automatic surveillance and self-driving cars, rely on clear visibility of the input videos to work well. For instance, self-driving cars cannot work robustly in heavy rain and CCTV automatic surveillance systems often fail at night, particularly if the scenes are dark or there is significant glare or floodlights.”
The team's work comprised two separate studies, each introducing deep learning algorithms to enhance the quality of night-time videos and rain videos.
The first study focused on boosting brightness while simultaneously suppressing noise and light effects such as glare, glow, and floodlights, in order to produce clear night-time images. The technique aims to improve the clarity of night-time images and videos when glare is unavoidable, something existing methods have yet to achieve.
The second study addresses rain by introducing a method that employs frame alignment. Aligning frames yields better visual information that is not affected by rain streaks, which often appear randomly in different frames.
Using a moving camera, the team also employed depth estimation to remove the rain veiling effect. Earlier methods focused only on removing rain streaks, whereas the new method removes both the rain streaks and the rain veiling effect simultaneously.
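To illustrate why aligned frames help against streaks that land in different places from frame to frame, here is a minimal sketch. It is not the researchers' deep learning method: it assumes the frames have already been aligned, and it swaps in a simple per-pixel temporal median as a stand-in. The function name `temporal_median` and the toy data are hypothetical.

```python
from statistics import median


def temporal_median(aligned_frames):
    """Per-pixel median across frames that are ASSUMED to be already aligned.

    aligned_frames: list of frames, each a list of rows of grayscale values.
    A streak that hits a pixel in only a minority of frames is rejected.
    """
    if not aligned_frames:
        raise ValueError("need at least one frame")
    height = len(aligned_frames[0])
    width = len(aligned_frames[0][0])
    return [
        [median(frame[y][x] for frame in aligned_frames) for x in range(width)]
        for y in range(height)
    ]


# Toy 2x3 scene with a constant background value of 10.
background = [[10, 10, 10], [10, 10, 10]]

# Each frame gets one bright "streak" pixel (255) at a different position.
frames = []
for y, x in [(0, 0), (1, 2), (0, 1)]:
    frame = [row[:] for row in background]
    frame[y][x] = 255
    frames.append(frame)

restored = temporal_median(frames)
```

Because each streak pixel is bright in only one of the three frames, the median at every position falls back to the background value. Real footage would also need the alignment step itself, which this sketch takes as given.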
3D Human Pose Estimation
The team also presented work on 3D human pose estimation, a technique used in video surveillance, video gaming, and sports broadcasting.
Researchers have worked for years on 3D multi-person pose estimation from monocular video, that is, video taken from a single camera. Unlike multi-camera footage, monocular video is more flexible because it can be captured with a single device such as a mobile phone.
Heavy activity, such as multiple individuals in the same scene, reduces the accuracy of human detection, especially when individuals interact closely or overlap one another in the monocular video.
In their study, the researchers estimated 3D human pose from a video by combining two existing approaches, top-down and bottom-up. The combined method produces more reliable pose estimates in multi-person settings than either approach alone, and it is better equipped to handle the distance between individuals.
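One simple way to picture combining two pose estimators is a confidence-weighted average per joint, sketched below. This is an illustrative assumption only: the article does not describe how the researchers merge the top-down and bottom-up results, and the names `fuse_joint` and `fuse_pose`, the data layout, and the confidence values are all hypothetical.

```python
def fuse_joint(top_down, bottom_up):
    """Confidence-weighted average of two estimates of the same joint.

    Each estimate is ((x, y, z), confidence) with confidence >= 0.
    This weighting scheme is an assumption for illustration, not the
    integration used in the research.
    """
    (p_td, c_td), (p_bu, c_bu) = top_down, bottom_up
    total = c_td + c_bu
    if total == 0:
        raise ValueError("both estimates have zero confidence")
    return tuple((c_td * a + c_bu * b) / total for a, b in zip(p_td, p_bu))


def fuse_pose(top_down_pose, bottom_up_pose):
    """Fuse two poses joint by joint; both lists must use the same joint order."""
    return [fuse_joint(td, bu) for td, bu in zip(top_down_pose, bottom_up_pose)]


# Hypothetical two-joint pose from each branch.
# Joint 0: both branches see it, the top-down branch is more confident.
# Joint 1: the top-down branch has no confidence (e.g. an occluded joint).
td = [((0.0, 0.0, 2.0), 0.75), ((1.0, 1.0, 2.0), 0.0)]
bu = [((0.0, 0.0, 4.0), 0.25), ((1.0, 2.0, 3.0), 1.0)]

fused = fuse_pose(td, bu)
```

In this toy case the first joint lands closer to the more confident estimate, and the second joint is taken entirely from the bottom-up branch because the top-down confidence is zero.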
Assoc. Prof. Tan said, “As the next step in our 3D human pose estimation research, which is supported by the National Research Foundation, we will be looking at how to protect the private information of the videos. For the visibility enhancement methods, we strive to contribute to advancements in the field of computer vision, as they are critical to many applications that can affect our daily lives, such as enabling self-driving cars to work better in adverse weather conditions.”