Object detection is used across a wide range of industries. In autonomous driving, for example, the task is to accurately and efficiently identify and locate instances of many predefined object classes (vehicles, pedestrians, traffic signs, etc.) in road videos. In robotics, an industrial robot needs to recognize specific machine elements. In the security field, a camera should accurately recognize people’s faces. With the wide application of deep learning, the accuracy and efficiency of object detection have greatly improved, but object detection based on deep learning still faces challenges. Different applications have different requirements, including highly accurate detection, multi-category detection, real-time detection, and robustness to occlusion. To address these challenges, and based on extensive literature research, this paper analyzes methods for improving and optimizing mainstream object detection algorithms from the perspective of the evolution of one-stage and two-stage object detection algorithms. Furthermore, this paper proposes methods for improving object detection accuracy by changing the receptive field. The new model is based on the original YOLOv5 (You Only Look Once) with some modifications: the structure of the head of YOLOv5 is modified by adding asymmetrical pooling layers. As a result, the accuracy of the algorithm is improved while speed is preserved. The performance of the new model is compared with that of the original YOLOv5 model and analyzed with respect to several parameters. In addition, the new model is evaluated under four scenarios. Finally, a summary and an outlook on open problems and future research directions are presented.
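To make the receptive-field idea concrete, the following minimal sketch computes the theoretical receptive field of a stack of layers separately along height and width, using the standard recurrence in which each layer adds (kernel − 1) × jump to the receptive field and multiplies the jump by its stride. The layer stacks and kernel sizes are hypothetical and are not the configuration of the modified YOLOv5 head; they only illustrate how an asymmetrical pooling layer (here 1×5) can enlarge the receptive field along one axis while leaving the other unchanged.

```python
from typing import List, Tuple

# Each layer is (kernel_h, kernel_w, stride_h, stride_w).
Layer = Tuple[int, int, int, int]


def receptive_field(layers: List[Layer]) -> Tuple[int, int]:
    """Return the theoretical receptive field (height, width) of a layer stack."""
    rf_h, rf_w = 1, 1      # a single input pixel sees only itself
    jump_h, jump_w = 1, 1  # spacing, in input pixels, between adjacent outputs
    for k_h, k_w, s_h, s_w in layers:
        rf_h += (k_h - 1) * jump_h
        rf_w += (k_w - 1) * jump_w
        jump_h *= s_h
        jump_w *= s_w
    return rf_h, rf_w


# Hypothetical stacks, chosen only for illustration.
baseline = [(3, 3, 2, 2), (3, 3, 1, 1)]
with_asym_pool = baseline + [(1, 5, 1, 1)]  # asymmetrical 1x5 pooling, stride 1

print(receptive_field(baseline))        # (7, 7)
print(receptive_field(with_asym_pool))  # (7, 15)
```

In this toy case the extra layer roughly doubles the horizontal receptive field at the cost of a single stride-1 pooling operation, which is the kind of trade-off the abstract alludes to.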
The usefulness of mobile devices has increased greatly in recent years, allowing users to perform more tasks in daily life. Mobile devices and applications provide many benefits for users, perhaps most significantly increased access to point-of-use tools, navigation, and alert systems. This paper presents a prototype of a cross-platform mobile augmented reality (AR) system whose core purpose is to find a better means of keeping the campus community secure and connected. The mobile AR system consists of four core functionalities: an events system, a policing system, a directory system, and a notification system. The events system keeps the community up to date on events that are happening or will be happening on campus. The policing system keeps the community within arm’s reach of campus resources that help them stay secure. The directory system serves as a one-stop shop for campus resources, ensuring that staff, faculty, and students have a convenient and efficient means of accessing pertinent information on campus departments. The system also includes an integrated guided navigation system that users can use to get directions to destinations on campus, namely its buildings and departments. The application will help students and visitors navigate the campus efficiently and will send alerts and notifications in case of emergencies, allowing campus police to respond to emergencies in a timely manner. The mobile AR system was designed using the Unity game engine and Vuforia Engine for object detection and classification. The Google Maps API was integrated to provide GPS-based location services. Our contribution lies in our approach to creating a user-specific, customizable navigation and alert system to improve the safety of users at their workplace. Specifically, the paper describes the design and implementation of the proposed mobile AR system and reports the results of the pilot study conducted to evaluate its perceived ease of use and usability.
This research explores a fresh approach to the selection and weighting of classical image features for infrared object detection and target-like clutter rejection. Traditional statistical techniques are used to calculate individual features, while modern supervised machine learning techniques are used to rank-order the predictive value of each feature. This paper describes the use of decision trees to determine which features have the highest value in predicting the correct binary target/non-target class. This work is unique in that it focuses on infrared imagery and exploits interpretable machine learning techniques to select hand-crafted features that are integrated into a pre-screening algorithm.
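As a rough illustration of ranking features by predictive value, the sketch below scores each feature by the largest Gini impurity decrease that a single threshold split can achieve, which is the criterion a CART-style decision tree evaluates at a node. This is a simplified stand-in (a one-level "decision stump" per feature), not the paper's procedure; the feature names `mean_intensity` and `edge_density` and all values are invented for the example.

```python
from typing import Dict, List, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(values: Sequence[float], labels: Sequence[int]) -> float:
    """Largest Gini impurity decrease achievable with one threshold on one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Skip the maximum value: thresholding there would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(
    features: Dict[str, Sequence[float]], labels: Sequence[int]
) -> List[Tuple[str, float]]:
    """Return (feature name, gain) pairs sorted from most to least predictive."""
    scored = [(name, best_split_gain(vals, labels)) for name, vals in features.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (invented): 1 = target, 0 = target-like clutter.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "mean_intensity": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],   # separates the classes cleanly
    "edge_density":   [0.5, 0.1, 0.6, 0.4, 0.2, 0.55],  # only weakly informative
}
ranking = rank_features(features, labels)
print(ranking)
```

On this toy data `mean_intensity` ranks first because one threshold separates the two classes perfectly. A full decision tree would apply the same criterion recursively and accumulate the gains per feature.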
In this paper, we propose a video analytics system to identify the behavior of turkeys. Turkey behavior provides evidence for assessing turkey welfare, which can be negatively impacted by uncomfortable ambient temperatures and various diseases. In particular, healthy and sick turkeys behave differently in terms of the duration and frequency of activities such as eating, drinking, preening, and aggressive interactions. Our system incorporates recent advances in object detection and tracking to automate the process of identifying and analyzing turkey behavior captured by commercial-grade cameras. We combine deep learning and traditional image processing methods to address the challenges of this practical agricultural problem. Our system also includes a web-based user interface for creating visualizations of the automated analysis results. Together, these components provide an improved tool for turkey researchers to assess turkey welfare without time-consuming and labor-intensive manual inspection.
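The duration and frequency measures mentioned above can be sketched as follows, assuming the detection and tracking stage has already produced one behavior label per video frame for a single tracked bird. The function name `summarize_behavior`, the label strings, and the frame rate are hypothetical and serve only to show how consecutive frames can be collapsed into bouts.

```python
from collections import defaultdict
from itertools import groupby
from typing import Dict, List


def summarize_behavior(frame_labels: List[str], fps: float) -> Dict[str, Dict[str, float]]:
    """Collapse per-frame labels into bouts; report bout count and total seconds per behavior."""
    summary: Dict[str, Dict[str, float]] = defaultdict(lambda: {"bouts": 0, "seconds": 0.0})
    # groupby yields one run per stretch of identical consecutive labels, i.e. one bout.
    for behavior, run in groupby(frame_labels):
        n_frames = sum(1 for _ in run)
        summary[behavior]["bouts"] += 1
        summary[behavior]["seconds"] += n_frames / fps
    return dict(summary)


# Invented per-frame labels for one tracked bird, sampled at 2 frames per second.
demo_labels = ["eating"] * 4 + ["idle"] * 2 + ["drinking"] * 2 + ["eating"] * 6
stats = summarize_behavior(demo_labels, fps=2.0)
print(stats["eating"])    # {'bouts': 2, 'seconds': 5.0}
print(stats["drinking"])  # {'bouts': 1, 'seconds': 1.0}
```

Here the bout count plays the role of frequency and the accumulated seconds that of duration. In practice, per-frame labels are noisy, so some smoothing before grouping would likely be needed.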