ABSTRACT. All day long, our fingers touch, grasp, and move objects in various media such as air, water, and oil. We do this almost effortlessly: it feels like we do not spend time planning and reflecting on what our hands and fingers do, or on how the continuous integration of various sensory modalities such as vision, touch, proprioception, and hearing helps us to outperform any other biological system in the variety of interaction tasks that we can execute. Largely overlooked, and perhaps most fascinating, is the ease with which we perform these interactions, resulting in a belief that they are also easy to accomplish in artificial systems such as robots. However, there are still no robots that can easily hand-wash dishes, button a shirt, or peel a potato. Our claim is that this is fundamentally a problem of appropriate representation or parameterization. When interacting with objects, a robot needs to consider the geometric, topological, and physical properties of objects. This can be done either explicitly, by modeling and representing these properties, or implicitly, by learning them from data. The main objective of our work is to create new informative and compact representations of deformable objects that incorporate both analytical and learning-based approaches and encode geometric, topological, and physical information about the robot, the object, and the environment. We do this in the context of challenging multimodal, bimanual object interaction tasks. The focus is on physical interaction with deformable and soft objects.
Evaluation of Software for Risk Assessment Focusing on Human-Robot Collaboration
ABSTRACT. There is still limited support for performing human-robot collaboration (HRC) risk assessments, so users must rely on their expertise and experience. Therefore, risk assessment is challenging for inexperienced users without this knowledge. In this paper, we present a study that identifies evaluation criteria for risk assessment software based on expert opinions and evaluates the software against these criteria. In addition, the software is evaluated for its applicability to HRC applications. The results show that the available software takes different approaches and only partially meets the experts' requirements. The specifics of HRC applications are only rudimentarily addressed by the software, but they can be mapped with the available functions and the necessary knowledge.
Goal Inference via Corrective Path Demonstration for Human-robot Collaboration
ABSTRACT. Recently, collaborative robots, such as collaborative delivery robots, have been expected to improve the work efficiency of users. For natural human-robot collaboration, the robot must infer an appropriate goal position for transporting instruments, taking into account the user's convenience and the surrounding environment. In conventional research, the goal is inferred from demonstrations of the user's desired positions, but position demonstration requires many trials to obtain the inference model, which is burdensome for the user. Therefore, we focus on the user's corrections of the robot position and generate multiple position samples from the user's corrective path. In addition, these position samples are weighted based on the implicit intention of the correction, so that both desired and undesired positions are learned. Consequently, the robot improves its goal inference in fewer trials. The effectiveness of the proposed method was evaluated in a simulated human-robot collaborative environment.
Skeleton-based Action and Gesture Recognition for Human-Robot Collaboration
ABSTRACT. Human action recognition plays a major role in enabling an effective and safe collaboration between humans and robots. Considering, for example, a collaborative assembly task, the human worker can use gestures to communicate with the robot, while the robot can exploit the recognized actions to anticipate the next steps in the assembly process, improving safety and the overall productivity. In this work, we propose a novel framework for human action recognition based on 3D pose estimation and ensemble techniques. In this framework, we first estimate the 3D coordinates of the human hands and body joints by means of OpenPose and RGB-D data. The estimated joints are then fed to a set of graph convolutional networks derived from Shift-GCN, one network for each set of joints (i.e., body, left hand and right hand). Finally, using an ensemble approach, we average the output scores of all the networks to predict the final human action. The proposed framework was evaluated on a dedicated dataset, named the IAS-Lab Collaborative HAR dataset, which includes both actions and gestures commonly used in human-robot collaboration tasks. The experimental results demonstrate how the ensemble of the different action recognition models helps improve the accuracy and robustness of the overall system, even for very similar actions and gestures.
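The ensemble step above amounts to averaging the score vectors of the individual networks; a minimal sketch, where the class names and per-network scores are invented for illustration (not from the paper's dataset):

```python
import numpy as np

def ensemble_predict(score_list, class_names):
    """Average the per-network score vectors and return the top class."""
    avg = np.mean(np.stack(score_list), axis=0)
    return class_names[int(np.argmax(avg))], avg

# Hypothetical normalized scores from three networks (body, left hand, right hand)
body_scores  = np.array([0.6, 0.3, 0.1])
left_scores  = np.array([0.2, 0.7, 0.1])
right_scores = np.array([0.5, 0.4, 0.1])
label, avg = ensemble_predict([body_scores, left_scores, right_scores],
                              ["pick", "wave", "point"])
```

Note how the averaged prediction can differ from the single strongest network, which is the point of the ensemble.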
ABSTRACT. Designs for grippers using variable-stiffness principles have become common in recent years. Researchers have developed various tests and setups to measure and validate the properties of their designs. However, there are no clear standards or benchmarks to analyse, categorise, and compare this type of gripper. This paper proposes a set of benchmarks to evaluate and categorise variable-stiffness grippers. After reviewing the tests commonly applied to evaluate existing variable-stiffness and generic grippers, we propose four tests to measure different properties of a gripper. The tests are independent of one another and allow each gripper to be classified in a three-category taxonomy. In order to validate the benchmark tests, three simple variable-stiffness grippers have been built and analysed with the benchmark. The results show that the grippers can be easily analysed with our benchmarks.
Evaluation of Safe Reinforcement Learning with CoMirror Algorithm in a Non-Markovian Reward Problem
ABSTRACT. In reinforcement learning, an agent in an environment improves its skill based on a reward, which is the feedback from the environment. For practical use, reinforcement learning faces several important challenges. First, reinforcement learning algorithms often make assumptions about the environment, such as the Markov decision process; however, environments in the real world often cannot be represented under these assumptions. In particular, we focus on environments with non-Markovian rewards, which allow the reward to depend on past experiences. To handle non-Markovian rewards, researchers have used a reward machine, which decomposes the original task into sub-tasks. In those works, the sub-tasks are usually assumed to be representable as a Markov decision process. Second, safety is also one of the challenges in reinforcement learning. G-CoMDS is a safe reinforcement learning algorithm based on the CoMirror algorithm, an algorithm for constrained optimization problems. We developed the G-CoMDS algorithm to learn safely in environments that are not Markov decision processes. Therefore, a promising approach in complex situations would be to decompose the original task as the reward machine does, and then solve the sub-tasks with G-CoMDS. In this paper, we provide additional experimental results and discussions of G-CoMDS as a preliminary step toward combining G-CoMDS with a reward machine. We evaluate G-CoMDS and an existing reinforcement learning algorithm in a mobile robot simulation with a kind of non-Markovian reward. The experimental results show that G-CoMDS suppresses cost spikes and slightly exceeds the performance of the existing safe reinforcement learning algorithm.
Minimum Displacement Motion Planning for Movable Obstacles
ABSTRACT. This paper presents a minimum displacement motion planning problem wherein obstacles are displaced by a minimum amount to find a feasible path. We define a metric for robot-obstacle intersection that measures the extent of the intersection and use it to penalize robot-obstacle overlaps. Employing the actual robot dynamics, the planner first finds a path through the obstacles that minimizes the robot-obstacle intersections. The metric is then used to iteratively displace the obstacles to achieve a feasible path. Several examples are provided that successfully demonstrate the proposed approach.
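The idea of penalizing robot-obstacle overlaps can be sketched as a path cost that adds an intersection penalty to the path length. The penetration-depth metric below is an illustrative stand-in for the paper's intersection metric, and all names and values are hypothetical:

```python
import numpy as np

def path_cost(path, obstacles, robot_radius, penalty=10.0):
    """Path length plus a penalty proportional to robot-obstacle overlap.

    path: (N, 2) array of waypoints; obstacles: list of (center, radius).
    Penetration depth at each waypoint serves as the overlap measure.
    """
    length = float(np.sum(np.linalg.norm(np.diff(path, axis=0), axis=1)))
    overlap = 0.0
    for p in path:
        for center, r in obstacles:
            depth = (robot_radius + r) - float(np.linalg.norm(p - np.asarray(center)))
            overlap += max(depth, 0.0)
    return length + penalty * overlap

straight = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
obstacles = [((1.0, 0.0), 0.3)]  # one disc obstacle sitting on the straight path
cost = path_cost(straight, obstacles, robot_radius=0.2)
```

A planner minimizing this cost first tolerates small overlaps, which then indicate which obstacles to displace and by how much.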
Mixed Use of Pontryagin's Principle and the Hamilton-Jacobi-Bellman Equation in Infinite- and Finite-Horizon Constrained Optimal Control
ABSTRACT. This paper proposes a framework for solving a class of nonlinear infinite- and finite-horizon optimal control problems with constraints. Establishing the existence and uniqueness of solutions to the Hamilton-Jacobi-Bellman (HJB) equation plays a crucial role in verifying the well-posedness of a given problem and in streamlining numerical solutions. The proposed framework revolves around infinite-horizon Bolza-type cost functions with running costs exponentially decaying in time. We show \Gamma-convergence of solutions with such cost functions to the solutions of the initial constrained (in)finite-horizon problems (that is, without running costs exponentially decaying in time). Essentially, we demonstrate how to approximate solutions of (in)finite-horizon constrained optimal control problems using our framework. Employing a solver based on Pontryagin's Principle, we efficiently obtain optimal solutions for finite- and infinite-horizon problems. The efficiency of the proposed framework is demonstrated in simulation by solving a 3D path planning problem with obstacles for a full nonlinear model of an autonomous underwater vehicle (AUV).
Comparing SONN Types for Efficient Robot Motion Planning in the Joint Space
ABSTRACT. Motion planning in the configuration space induces benefits, such as smooth trajectories and self-collision avoidance. It becomes more complex as the degrees of freedom (DOF) increase, due to the direct relation between the dimensionality of the search space and the DOF. Self-organizing neural networks (SONN) and their most famous candidate, the Self-Organizing Map, have been proven to be useful tools for joint space reduction while preserving its underlying topology, as presented in [29]. In this work, we extend our previous study with additional models and adapt the approach from human motion data towards robots' kinematics. The evaluation includes the best-performing models from [29] and three additional SONN architectures, representing the consequent continuation of this previous work.
Physics-Based Motion Planning of a Fruit Harvesting Manipulator for Pushing Obstacles in a Cluttered Environment
ABSTRACT. The agricultural population is decreasing, and the demand for autonomous agricultural robots has increased. For a robot to successfully harvest fruits in diverse environments, various obstacles must be pushed away from its path. During the pushing motion, however, gravity acts on the fruits, causing them to move back and forth and return to their original state after being pushed away. This results in fruit-to-fruit contact, and fruit damage may occur. Therefore, a method to avoid obstacles while reducing the risk of fruit damage is required. However, directly accounting for fruit damage is difficult because it requires many parameters that are difficult to observe. Herein, we propose a motion planning method that uses the potential energy of the fruits as an evaluation index, which can be calculated from observable parameters; this also suppresses the reciprocating motion of the fruits, thus preventing fruit damage. The proposed method was evaluated through simulations, and its effectiveness was verified.
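The evaluation index can be illustrated as the summed gravitational potential energy of the fruits before and after a candidate push; a minimal sketch with hypothetical masses and heights, not the authors' exact formulation:

```python
# Hedged sketch: potential-energy evaluation index for candidate pushing motions.
# Fruit masses [kg] and heights [m] are observable; all values here are invented.
G = 9.81  # gravitational acceleration [m/s^2]

def potential_energy_index(masses, heights):
    """Sum of m*g*h over all fruits; a smaller increase after a push
    suggests less stored energy to drive the swing-back motion."""
    return sum(m * G * h for m, h in zip(masses, heights))

before = potential_energy_index([0.15, 0.2], [1.0, 1.2])   # fruits at rest
after  = potential_energy_index([0.15, 0.2], [1.05, 1.25])  # lifted by the push
delta = after - before
```

A planner would prefer pushing motions with a small `delta`, since they raise the fruits less and thus reduce the reciprocating motion.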
A Dynamics-Aware NMPC Method for Robot Navigation Among Moving Obstacles
ABSTRACT. We present a novel method for mobile robot navigation among obstacles. Our approach is based on Nonlinear Model Predictive Control (NMPC) and uses a dynamics-aware collision avoidance constraint. The constraint, built upon the notion of an avoidable collision state, considers not only the robot-obstacle distance but also their velocities as well as the robot's actuation capabilities. To highlight the effectiveness of this constraint, we compare the proposed method with a version of the NMPC that uses a constraint based purely on distance information, showing that the former achieves better performance than the latter, especially when the robot travels at higher speed among several moving obstacles. Results indicate that the method can work with relatively short prediction horizons and is therefore amenable to real-time implementation.
Learning Sequential Latent Variable Models from Multimodal Time Series Data
ABSTRACT. Sequential modelling of high-dimensional data is an important problem that appears in many domains including model-based reinforcement learning and dynamics identification for control. Latent variable models applied to sequential data (i.e., latent dynamics models) have been shown to be a particularly effective probabilistic approach to solve this problem, especially when dealing with images. However, in many application areas (e.g., robotics), information from multiple sensing modalities is available (beyond images alone)---existing latent dynamics methods have not yet been extended to effectively make use of such multimodal sequential data. Multimodal sensor streams can be correlated in a useful manner and often contain complementary information across modalities. In this work, we present a self-supervised generative modelling framework to jointly learn a probabilistic latent state representation of multimodal data and the respective dynamics. Using synthetic and real-world datasets from a multimodal robotic planar pushing task, we demonstrate that our approach leads to significant improvements in prediction and representation quality. Furthermore, we compare to the common learning baseline of concatenating each modality in the latent space and show that our principled probabilistic formulation performs better. Finally, despite being fully self-supervised, we demonstrate that our method is nearly as effective as an existing supervised approach that relies on ground truth labels.
Pushing the Limits of Learning-based Traversability Analysis for Autonomous Driving on CPU
ABSTRACT. Self-driving vehicles and autonomous ground robots require a reliable and accurate method to analyze the traversability of the surrounding environment for safe navigation. This paper proposes and evaluates a real-time machine-learning-based Traversability Analysis method that combines geometric features with appearance-based features in a hybrid approach based on an SVM classifier. In particular, we show that integrating a new set of geometric and visual features and focusing on important implementation details enables a noticeable boost in performance and reliability. The proposed approach has been compared with state-of-the-art deep learning approaches on a public dataset of outdoor driving scenarios. It reaches an accuracy of 89.2% in scenarios of varying complexity, demonstrating its effectiveness and robustness. The method runs fully on CPU, reaches results comparable with the other methods, operates faster, and requires fewer hardware resources.
Sensor-Based Navigation Using Hierarchical Reinforcement Learning
ABSTRACT. Robotic systems are nowadays capable of solving complex navigation tasks. However, their capabilities are limited to the knowledge of the designer and consequently lack generalizability to initially unconsidered situations. This makes deep reinforcement learning (DRL) especially interesting, as these algorithms promise a self-learning system only relying on feedback from the environment.
In this paper, we consider the problem of lidar-based robot navigation in continuous action space using DRL without providing any goal-oriented or global information. By relying solely on local sensor data to solve navigation tasks, we design an agent that assigns its own waypoints based on intrinsic motivation. Our agent is able to learn goal-directed navigation behavior even when facing only sparse feedback, i.e., delayed rewards when reaching the target. To address this challenge and the complexity of the continuous action space, we deploy a hierarchical agent structure in which the exploration is distributed across multiple layers.
Within the hierarchical structure, our agent self-assigns internal goals and learns to extract reasonable waypoints to reach the desired target position only based on local sensor data.
In our experiments, we demonstrate the navigation capabilities of our agent in two environments and show that the hierarchical structure substantially improves the performance in terms of success rate and success weighted by path length in comparison to a flat structure. Furthermore, we provide a real-robot experiment to illustrate that the trained agent can be easily transferred to a real-world scenario.
Traction optimization for robust navigation in unstructured environments using deep neural networks on the example of the off-road truck Unimog
ABSTRACT. Traction control is fundamental for the safe and robust off-road control of self-driving vehicles. Aspects such as the surface material heavily affect slippage and driving behavior and require consideration for a forward-thinking driving method. This paper presents an approach for optimized traction control based on deep convolutional neural networks, combined with OpenStreetMap data, to determine ground surface properties. The semantic surface segmentation provides information about estimated surface friction coefficients. The driving behavior is adjusted to maximize traction by selecting the best-suited gear, differential lock setting, and tire pressure. Additionally, steering and braking are adapted to the ground friction. The demonstration uses the autonomous off-road truck U5023 in a forest-like environment close to a landfill.
Flattening clothes with a single-arm robot based on reinforcement learning
ABSTRACT. Reinforcement learning has enabled robots to learn how to accomplish a wide range of tasks without explicit instructions. In this paper, we use a single-arm robot to flatten a piece of cloth that is crumpled and placed on a table. We create a simulation environment with a gripper and a piece of cloth to learn a policy for the robot to choose the best action based on the observation of the environment. The policy is then transferred to a real robot and successfully tested. We also introduce our method for recognizing the corners of the cloth using computer vision, comparing a classic computer vision approach to a deep learning one.
We use an ABB robot and a 2D camera for the experiments and PyBullet software for the simulation.
A Monte Carlo Framework for Incremental Improvement of Simulation Fidelity
ABSTRACT. Robot software developed in simulation often does not behave as expected when deployed because the simulation does not sufficiently represent reality - this is sometimes called the `reality gap' problem. We propose a novel algorithm to address the reality gap by injecting real-world experience into the simulation.
It is assumed that the robot program (control policy) is developed using simulation, but subsequently deployed on a real system, and that the program includes a performance objective monitor procedure with scalar output.
The proposed approach collects simulation and real world observations and builds conditional probability functions. These are used to generate paired roll-outs to identify points of divergence in simulation and real behavior. From these, state-space kernels are generated that, when integrated with the original simulation, coerce the simulation into behaving more like observed reality.
Performance results are presented for a long-term deployment of an autonomous delivery vehicle example.
Real2Sim or Sim2Real: Robotics Visual Insertion using Deep Reinforcement Learning and Real2Sim Policy Adaptation
ABSTRACT. Reinforcement learning is widely used in robotics tasks, such as insertion and grasping. However, without a practical sim2real strategy, a policy trained in simulation can fail on the real task. There is also extensive research on sim2real strategies, but most of these methods rely on heavy image rendering, domain randomization training, or tuning.
In this work, we solve the insertion task using a purely visual reinforcement learning solution with minimal infrastructure requirements. We also propose a novel sim2real strategy, Real2Sim, which provides a new and easier approach to policy adaptation. We discuss the advantages of Real2Sim over Sim2Real.
Towards Synthetic Data: Dealing with the Texture-Bias in Sim2real Learning
ABSTRACT. In this paper we test and ultimately confirm the texture-bias hypothesis for a state-of-the-art method for semantic segmentation, DeepLabv3+. However, our results show that even though the texture bias is evident, shape still plays an important role in the overall system performance. To perform the analysis, we propose using a synthetic dataset with highly realistic textures for sim2real learning in a leaf segmentation task. The labeled synthetic plant datasets used in the evaluation are created with the 3D modeling software Blender, using a custom-designed procedural generation pipeline. We validate the accuracy of the models on real images of sweet peppers, and we introduce an improved method for 6DOF pepper pose estimation that utilizes the presented leaf segmentation model. The accuracy of the pepper pose estimation method is experimentally validated.
On Scene Engineering and Domain Randomization: Synthetic Data for Industrial Item Picking
ABSTRACT. Synthetic data for training deep neural networks is increasingly used in computer vision. Different strategies, such as domain randomization or domain adaptation, exist to bridge the gap between synthetic training data and the real application. Despite recent progress and gain in knowledge in this area, the following question remains: How much adjustment to reality is required and which degree of randomization is useful for transferring precise object detectors to real use cases?
In this paper, we present a detailed study with more than 100 datasets and 2,700 trained convolutional neural networks (CNNs), comparing the influence of different degrees of manual optimization (scene engineering) and domain randomization techniques. To distinguish precision and robustness, the trained object detectors are evaluated on different domain shifts with respect to scene environment and object appearance.
Using the example of robot-based industrial item picking, we show that the scene context and structure as well as realistic textures are crucial for the simulation to reality transfer. The combination with well-chosen randomization parameters, especially lighting and distractor objects, improves the robustness of the CNNs at higher domain shifts.
Interspecies Collaboration in the Design of Visual Identity: A Case Study
ABSTRACT. Design usually relies on human ingenuity, but the past decade has seen the field's toolbox expanding to Artificial Intelligence (AI) and its adjacent methods, making room for hybrid, algorithmic creations. This article aims to substantiate the concept of interspecies collaboration – that of natural and artificial intelligence – in the active co-creation of a visual identity, describing a case study of the Regional Center of Excellence for Robotic Technology (CRTA), which opened in June 2021 on 750 m² within the University of Zagreb. The visual identity of the Center comprises three separately devised elements, each representative of the human-AI relationship and embedded in the institution's logo. Firstly, the letter "C" (from the CRTA acronym) was created using a Gaussian Mixture Model (GMM) applied to (x, y) coordinates that the neurosurgical robot RONNA, CRTA's flagship innovation, generated when hand-guided by a human operator. The second shape of the letter "C" was created by feeding the same (x, y) coordinates as inputs to a neural network whose goal was to output letters in a novel typography. A basic feedforward back-propagating neural network with two hidden layers was chosen for the task. The third design element was the trajectory the robot RONNA makes when performing a brain biopsy. As CRTA embodies a state-of-the-art venue for robotics research, the 'interspecies' approach was used to accentuate the importance of human-robot collaboration, which is at the core of the newly opened Center, illustrating the potential of the reciprocal and amicable relationship that humans could have with technology.
Hyperspectral 3D Point Cloud Segmentation with RandLA-Net
ABSTRACT. 3D maps are successfully used in robotics for localization and path planning. Point clouds or triangle meshes are most commonly used to represent 3D maps. To gain a further understanding of the map content, it is useful to annotate the map semantically. For this purpose, pre-trained machine learning classifiers are used. For 3D point clouds with RGB features, there are existing solutions that use deep learning to segment the data. Since it is not always possible to differentiate between some materials using only RGB channels, hyperspectral histograms can augment the 3D data. The aim of this paper is to evaluate whether hyperspectral information can improve the segmentation results for ambiguous objects, e.g. streets, sidewalks, and cars, using established deep learning methods. Given the reported performance on pure 3D data and the possibility to directly integrate point annotations, we chose to evaluate the neural network RandLA-Net for this purpose. In addition to evaluating RandLA-Net in this context, we also provide a ground truth dataset with semantically annotated hyperspectral 3D point clouds.
3D Semantic Scene Perception using Distributed Smart Edge Sensors
ABSTRACT. We present a system for 3D semantic scene perception consisting of a network of distributed smart edge sensors. The sensor nodes are based on an embedded CNN inference accelerator and RGB-D and thermal cameras.
Efficient vision CNN models for object detection, semantic segmentation, and human pose estimation run on-device in real time.
2D human keypoint estimations, augmented with the RGB-D depth estimate, as well as semantically annotated point clouds are streamed from the sensors to a central backend, where multiple viewpoints are fused into an allocentric 3D semantic scene model.
As the image interpretation is computed locally, only semantic information is sent over the network. The raw images remain on the sensor boards, significantly reducing the required bandwidth, and preserving the privacy of observed persons.
We evaluate the proposed system in challenging real-world multi-person scenes in our lab.
The proposed perception system provides a complete scene view containing semantically annotated 3D geometry and estimates 3D poses of multiple persons in real time.
Collision Warning by Rotating 2D LiDAR for Safe Crane Operation
ABSTRACT. Construction cranes are widely used for carrying very heavy items such as steel frames and concrete blocks at construction sites and harbors. One of the issues of using such large machines is how to prevent accidents, including collisions between people and the load. This paper proposes a collision detection and warning system composed of a long-range 2D LiDAR (light detection and ranging) sensor and a rotary table. By rotating the LiDAR, the system covers a spherical field of view. Since the rotation speed is limited, however, we need to deal with the trade-off between the scanning cycle time and the area to be covered. We propose a method to detect collisions by setting a warning margin volume around a danger volume. We present equations for determining the appropriate margin and the angular velocity of the table. We verified the system in simulation and in an actual scene.
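The trade-off between scan cycle time and margin size can be sketched as follows. The relations here (margin proportional to worst-case speed times scan period, with a safety factor) are a hedged illustration, not the paper's exact equations:

```python
import math

def scan_period(fov_rad, angular_velocity):
    """Time for the rotary table to sweep the required field of view [s]."""
    return fov_rad / angular_velocity

def warning_margin(v_max, period, safety=1.5):
    """Margin thickness [m] around the danger volume: an object moving at
    most v_max must not cross the margin within one scan cycle."""
    return safety * v_max * period

T = scan_period(math.pi, math.pi / 2)  # 180-degree sweep at 90 deg/s
margin = warning_margin(2.0, T)        # assumed pedestrian speed of 2 m/s
```

Doubling the angular velocity halves the scan period and therefore halves the required margin, which is the trade-off the abstract describes.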
Semantic Classification in Uncolored 3D Point Clouds using Multiscale Features
ABSTRACT. While the semantic segmentation of 2D images is already a well-researched field, the assignment of semantic labels to 3D data is lagging behind. This is partly due to the fact that prelabeled training data is only rarely available since not only the training and application of classification methods but also the manual labeling process are much more time-consuming in 3D.
This paper focuses on the more classical approach of first calculating features and subsequently applying a classification algorithm. Existing handcrafted feature definitions are enhanced by using multiple selected reductions of the point cloud as approximations. This serves as input to train a well-studied random forest classifier. A comparison to a recently presented deep learning approach, i.e., the Kernel Point Convolution method, a convolutional neural network that takes 3D point clouds directly as input, reveals that there are well-justified applications for both modern and classical machine learning methods.
To enable the smooth conversion of existing 3D scenes into semantically labeled 3D point clouds, the tool Blender2Helios is presented. We show that the artificial data generated with it is a good choice for training real-world classifiers.
On the Evaluation of RGB-D-based Categorical Pose and Shape Estimation
ABSTRACT. Recently, various methods for 6D pose and shape estimation of objects have been proposed. Typically, these methods evaluate their pose estimation in terms of average precision, and reconstruction quality with chamfer distance. In this work we take a critical look at this predominant evaluation protocol including metrics and datasets. We propose a new set of metrics, contribute new annotations for the Redwood dataset and evaluate state-of-the-art methods in a fair comparison. We find that existing methods do not generalize well to unconstrained orientations, and are actually heavily biased towards objects being upright. We contribute an easy-to-use evaluation toolbox with well-defined metrics, method and dataset interfaces, which readily allows evaluation and comparison with various state-of-the-art approaches (\url{https://github.com/roym899/pose_and_shape_evaluation}).
State-Aware Layered BTs — Behavior Tree Extensions for Post-Actions, Preferences and Local Priorities in Robotic Applications
ABSTRACT. In this paper we propose State-Aware Layered BTs, a behavior tree extension which facilitates the implementation of post-actions, preferences and local priorities for use in robotics applications. These procedures are hard to define in standard behavior tree formulations because behavior trees lack the ability to structurally retain any information on previous states. Therefore, the execution of localized heuristics can only be attained through the inference or emulation of previous states, which can be imprecise, cause reactivity losses, or introduce added complexity to the structure. In this work we cope with this problem by (i) adding a native operator which accesses the previous execution states of individual nodes; (ii) adding separate layers of interaction which expand the operator functionality and are used to describe multiple goals within the same task. The validity of the proposed system is verified through an extensive analysis of a series of annotated examples.
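The core idea, a node that structurally retains its previous execution status so that, for instance, a post-action fires exactly once after success, can be sketched minimally; all class and function names below are hypothetical, not the paper's API:

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

class Action:
    """Leaf node that remembers the status returned by its previous tick."""
    def __init__(self, fn):
        self.fn = fn
        self.prev_status = None  # retained across ticks

    def tick(self):
        status = self.fn()
        self.prev_status = status
        return status

class PostAction(Action):
    """Fires a callback once, on the tick where the action first succeeds."""
    def __init__(self, fn, on_success):
        super().__init__(fn)
        self.on_success = on_success

    def tick(self):
        before = self.prev_status  # access to the previous execution state
        status = super().tick()
        if status == SUCCESS and before != SUCCESS:
            self.on_success()
        return status

log = []
node = PostAction(lambda: SUCCESS, lambda: log.append("post-action"))
node.tick()
node.tick()  # already succeeded before, so the post-action does not fire again
```

Without the retained `prev_status`, the node would have to infer "did I just succeed?" from external state, which is exactly the imprecision the abstract describes.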
Multi-agent Coordination based on POMDPs and Consensus for Active Perception
ABSTRACT. This paper deals with coordination in multi-agent systems based on Partially Observable Markov Decision Processes (POMDPs). The well-known POMDP framework has been extended with a consensus protocol, allowing for decentralized sharing of beliefs, which improves the local decision making of individual agents and eliminates the need to track the entire history of actions in the system to achieve coordination. To emphasize the importance of the belief update after an observation is made in the system, we propose a novelty-biased consensus protocol. The efficiency of the proposed framework is demonstrated on a multi-agent system consisting of three agents carrying out search missions in a simulation environment. The results obtained show that the mission completion time is significantly reduced without degradation in the success rate, while simple POMDPs are used for the individual agents.
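The decentralized belief sharing can be sketched as a standard discrete POMDP belief update followed by a weighted consensus step; the novelty-biased weighting is the paper's contribution and is only stubbed here with fixed weights:

```python
import numpy as np

def belief_update(belief, T, O, action, obs):
    """Bayes filter for a discrete POMDP: predict with the transition model
    T[action][s, s'], then correct with the observation model O[action][s', o]."""
    predicted = belief @ T[action]
    posterior = predicted * O[action][:, obs]
    return posterior / posterior.sum()

def consensus(beliefs, weights):
    """Weighted mixture of neighbours' beliefs, renormalized."""
    mix = sum(w * b for w, b in zip(weights, beliefs))
    return mix / mix.sum()

# Toy two-state example with one action and a binary observation
T = {0: np.array([[0.9, 0.1], [0.2, 0.8]])}
O = {0: np.array([[0.8, 0.2], [0.3, 0.7]])}
b = belief_update(np.array([0.5, 0.5]), T, O, action=0, obs=0)
shared = consensus([b, np.array([0.5, 0.5])], [0.5, 0.5])
```

A novelty-biased protocol would increase the weight of a neighbour whose belief just changed sharply after an observation, instead of the uniform weights used above.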
KI5GRob: Fusing Cloud Computing and AI for Scalable Robotic System in Production and Logistics
ABSTRACT. Robotics and AI are essential components in current and future production scenarios. In object handling, for example, breakthroughs in AI have achieved almost revolutionary progress. The use of AI methods also increases the demands on resources in industrial settings. As systems scale and task complexity increases, the application of deep learning in industrial robotics raises specific questions of scalability, reliability, and safety, which are not the main focus of AI research. Some of these limitations can be overcome by utilizing cutting-edge cloud computing technologies such as on-demand computing, massive parallelization, microservices, and DevOps pipelines. With KI5GRob, we propose a novel approach that facilitates the development of robotic applications by fusing cloud computing and AI. Research targets machine learning methods for multi-modal sensor signal processing and transfer learning to solve industrial robotic manipulation tasks. The goal of the project is to develop a microservice-based software architecture that enables on-demand deployment of such methods as well as cloud-based motion planning and robot control. This paper gives a high-level overview of the goals and the ongoing research activities in KI5GRob. First results on the overall hardware and software architecture are also presented.
A Correlated Random Walk Model to Rapidly Approximate Hitting Time Distributions in Multi-Robot Systems
ABSTRACT. Multi-robot systems are frequently used for tasks involving searching, so it is important to be able to estimate the searching time. Yet, simulation approaches and real-world experiments to determine searching time can be cumbersome and even impractical. In this work, we propose a correlated-random-walk based model to efficiently approximate hitting time distributions of multi-robot systems in large arenas. We verified the computational results using ARGoS, a physics-based simulator. We found that the Gamma distribution can consistently fit the hitting time distributions of the random walkers.
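The two ingredients of this abstract, a correlated random walker and a Gamma fit to its hitting times, can be sketched in a few lines. The arena geometry, step model, and method-of-moments fit below are illustrative assumptions, not the paper's model.

```python
import math
import random

def gamma_mom_fit(samples):
    """Method-of-moments Gamma fit: shape k = mean^2/var, scale = var/mean."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean * mean / var, var / mean   # (shape, scale)

def hitting_time(arena=20.0, step=1.0, sigma=0.4, target=(15.0, 15.0),
                 radius=1.5, max_steps=100000, rng=random):
    """Hitting time of one correlated random walker: the heading changes
    by a small Gaussian increment each step (the 'correlated' part) and
    the position is clamped at the arena walls. All parameters are
    illustrative, not the paper's setup."""
    x, y = arena / 2, arena / 2
    heading = rng.uniform(0.0, 2.0 * math.pi)
    for t in range(1, max_steps + 1):
        heading += rng.gauss(0.0, sigma)
        x = min(max(x + step * math.cos(heading), 0.0), arena)
        y = min(max(y + step * math.sin(heading), 0.0), arena)
        if math.hypot(x - target[0], y - target[1]) <= radius:
            return t
    return max_steps
```

Collecting many `hitting_time` samples and passing them to `gamma_mom_fit` yields the shape and scale parameters of the approximating Gamma distribution.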
Synthesis and monitoring of complex tasks for heterogeneous robots in an Industry 4.0 scenario
ABSTRACT. Task synthesis and execution is a well-known problem pertaining to different fields.
With the advent of Industry 4.0, the need to schedule and execute very complex tasks that involve different machinery or robots has become a key topic.
In this article we propose a methodology for defining and monitoring the execution of high-level tasks for a heterogeneous multi-robot system involved in assembly tasks.
With the proposed approach, a state-chart is automatically synthesized from the definition of the high-level tasks. The state-chart is then used to monitor, online, the operations that the robots must execute.
The proposed methodology has been validated and tested in a research facility that reproduces a complete production line involving collaborative robotic arms, mobile robots and manufacturing machines.
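The online monitoring step can be sketched as a state-chart walked by observed robot events, with off-chart events flagged as violations. The transition-table representation and the `StateChartMonitor` API below are assumptions for illustration, not the paper's implementation.

```python
class StateChartMonitor:
    """Online monitor sketch: the synthesized state-chart is given as a
    transition table {state: {event: next_state}}. Observed robot events
    either advance the chart or are recorded as violations of the
    expected execution. Illustrative only."""
    def __init__(self, transitions, initial):
        self.transitions = transitions
        self.state = initial
        self.violations = []
    def observe(self, event):
        nxt = self.transitions.get(self.state, {}).get(event)
        if nxt is None:
            # event not allowed in the current state: flag, keep state
            self.violations.append((self.state, event))
        else:
            self.state = nxt
        return self.state
```

For example, a monitor built from the chart `idle --pick--> carry --place--> idle` would flag a second `pick` event issued while the robot is already carrying a part.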
Benchmark of Sampling-Based Optimizing Planners for Outdoor Robot Navigation
ABSTRACT. This paper evaluates Sampling-Based Optimizing (SBO) planners found in the Open Motion Planning Library (OMPL) in the context of mobile robot navigation in outdoor environments. A large number of SBO planners have been proposed, and revealing the performance differences among them for path planning problems can be burdensome and ambiguous. The probabilistic nature of SBO planners adds further difficulty, as one may get different results for the same planning problem even in consecutive queries to the same planner. We benchmark all available SBO planners in OMPL with an automated planning problem generation method designed specifically for outdoor robot navigation scenarios. Several evaluation metrics, such as the resulting path's length, smoothness, and success rate, are selected, and probability distributions for these metrics are presented. From the obtained experimental results, clear recommendations are made on high-performing planners for mobile robot path planning problems, which are useful to researchers and practitioners in the field of mobile robot planning and navigation.
3D Traversability Analysis in Forest Environments based on Mechanical Effort
ABSTRACT. Autonomous navigation in harsh and dynamic 3D environments poses a great challenge for modern robotics. This work presents a novel traversability analysis and path-planning technique that processes 3D pointcloud maps to generate terrain gradient information. An analysis of terrain roughness and the presence of obstacles is applied to the perceived environment in order to generate efficient paths. The generated paths avoid major hills when more conservative alternatives are available, potentially promoting fuel economy and reducing equipment wear and the associated risks. The proposed approach outperforms existing techniques based on results in realistic 3D simulation scenarios, which are discussed in detail.
Two-step Planning of Dynamic UAV Trajectories using Iterative δ-Spaces
ABSTRACT. UAV trajectory planning is often done in a two-step approach, where a low-dimensional path is refined to a dynamic trajectory. The resulting trajectories are only locally optimal, however.
On the other hand, direct planning in higher-dimensional state spaces generates globally optimal solutions but is time-consuming and thus infeasible for time-constrained applications.
To address this issue, we propose δ-Spaces, a pruned high-dimensional state space representation for trajectory refinement.
It not only contains the area around a single lower-dimensional path but consists of the union of multiple near-optimal paths.
Thus, it is less prone to local minima.
Furthermore, we propose an anytime algorithm using δ-Spaces of increasing sizes.
We compare our method against state-of-the-art search-based trajectory planning methods and evaluate it in 2D and 3D environments to generate second-order and third-order UAV trajectories.
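The pruning idea behind δ-Spaces can be sketched on a 2D grid: the refined search is restricted to cells within distance δ of any near-optimal low-dimensional path, and the anytime variant simply regrows the set with larger δ. The grid representation and Chebyshev neighborhood below are illustrative assumptions, not the paper's construction.

```python
def delta_space(paths, delta):
    """Union of grid cells within Chebyshev distance `delta` of any cell
    of any near-optimal low-dimensional path. The high-dimensional
    trajectory refinement is then restricted to this pruned set; growing
    `delta` yields an anytime sequence of progressively larger spaces.
    Illustrative sketch of the idea only."""
    cells = set()
    for path in paths:
        for (x, y) in path:
            for dx in range(-delta, delta + 1):
                for dy in range(-delta, delta + 1):
                    cells.add((x + dx, y + dy))
    return cells
```

Because the set is built from several near-optimal paths rather than one, a refinement search confined to it can still escape the local minimum of any single corridor.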
End-to-end path estimation and automatic dataset generation for robot navigation in plant-rich environments
ABSTRACT. This paper proposes a method to estimate a path to follow directly from an image for robot navigation in plant-rich environments such as greenhouses and unstructured outdoor scenes. In such environments, there are multiple factors that make it difficult for robots to determine a path to travel, such as plants covering the path and ambiguous path boundaries. Approaches based on segmentation of traversable regions cannot be applied to such environments because the regions may not be precisely defined or they may be occluded. In this work, we propose a method to estimate a path from a single image in an end-to-end fashion. We also develop an automatic annotation method utilizing the robot's trajectory during the data acquisition phase. We conducted a real-world experiment of robot navigation and confirmed that the proposed method is capable of navigating paths partially covered by plants. We also confirmed that the proposed data annotation method can generate training data more efficiently than manual annotation.
ABSTRACT. This paper introduces Scale Compatible Adaptive Monte-Carlo Localization (SCAM) to localize on topological maps, such as hand-drawn maps and floor plans. This enables fast modification of maps of indoor spaces whose layout changes frequently, via image editing instead of re-mapping the environment. SCAM uses x and y scale components in addition to a 2D pose to form a five-dimensional state, significantly modifying Adaptive Monte-Carlo Localization (AMCL) at the prediction and re-sampling steps to account for the extra scale components. The scale components influence the projection of the LiDAR scan on each particle, improving the scan match between the LiDAR scan and the imperfect map. The performance of SCAM is tested with real-life data gathered in an empty hallway, a cluttered lab, and a populous lobby. SCAM is evaluated on hand-drawn maps, where the introduction of scale components prevented major issues such as loss of localization and running over obstacles, but exhibited minor issues such as gliding past obstacles. This result is further verified via repetitions of the test cases and modifications to the map that preserve the topology, and is also applied to a floor plan. On scaled point cloud maps, SCAM only introduces relatively small positional errors of 0.270m, 0.161m, and 0.491m, and heading errors of 1.55$^\circ$, 8.10$^\circ$, and 10.29$^\circ$ in the three test areas respectively.
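The scale-aware scan projection described here can be sketched as follows: each particle carries per-axis scale components that stretch the beam endpoints before they are matched against the distorted map. The function signature and the simple per-axis stretching below are assumptions for illustration, not SCAM's actual code.

```python
import math

def project_scan(particle, ranges, angle_min, angle_inc):
    """Project a LiDAR scan from a SCAM-style 5-D particle
    (x, y, theta, sx, sy): each beam endpoint is computed in the particle
    frame and stretched by the per-axis scale components, so the scan can
    match a hand-drawn, not-to-scale map. Illustrative sketch only."""
    x, y, theta, sx, sy = particle
    pts = []
    for i, r in enumerate(ranges):
        a = theta + angle_min + i * angle_inc
        # per-axis scaling of the beam endpoint
        pts.append((x + sx * r * math.cos(a),
                    y + sy * r * math.sin(a)))
    return pts
```

With `sx = sy = 1` this reduces to the standard AMCL projection; other values let the same scan fit a map whose drawn proportions differ from reality.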
Training traffic light behavior with end-to-end learning
ABSTRACT. In this study, an autonomous-driving agent controls a vehicle by predicting future waypoints directly from a forward-facing camera. This end-to-end learning is made possible by following the teacher-student approach of Cheating by Segmentation, itself inspired by Learning by Cheating.
The objective of this study is to reduce the number of infractions made by the autonomous-driving agent. Many infractions can be attributed to how the agent handles traffic lights. In this study, a Pyramid Pooling Module and a Feature Pyramid Network are added to the network architecture with the aim of letting the network combine precision with overview, especially when traffic lights are encountered by the agent.
On-Demand Ride Sharing: Scheduling of an Autonomous Bus Fleet for Last-Mile Travel
ABSTRACT. Autonomous buses are expected to expand the mobility-on-demand options in cities in the next 10 years. An essential aspect of ensuring optimal usage of fleets of autonomous buses is the task of scheduling. This paper presents a scheduling approach based on the construction of all possible trips, which are used to formulate an optimization problem. While most research is focused on the scheduling of taxi trips, there is almost no research on other applications, for instance, the scheduling of last-mile travel options. A street network has been generated based on OpenStreetMap data as a basis for the scheduling. All possible combinations of buses and requests are calculated, and for each of those trips, the optimal order of requests is created. These are used to formulate the optimization problem, calculating an optimal assignment of requests to the buses. The approach considers constraints such as maximum waiting time and travel delay, and aims to exploit shared trips for higher efficiency. Experiments have been carried out in a simulated environment of a university campus area with fleets of up to 10 vehicles. By performing various trials with changing parameters, the influence of the waiting time and travel delay constraints on the scheduling is determined. Depending on the setup, service rates above $90\%$ are achieved, while trials with strict constraints show that the approach can handle short-term requests. Based on the results, a use-case-specific composition of the autonomous bus fleet can be derived.
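The per-trip step of this approach, finding the best pickup order for one bus and one set of requests under a waiting-time constraint, can be sketched with a brute-force search over permutations (feasible because the enumerated per-trip request sets are small). The function signature, the metric callable, and the single `max_wait` bound are assumptions for illustration.

```python
from itertools import permutations

def best_order(bus_pos, pickups, dist, max_wait):
    """Best pickup order for one bus and one set of requests: try every
    permutation, reject orders whose cumulative travel time to any pickup
    exceeds max_wait, and keep the order with the smallest total time.
    Returns (total_time, order) or None if no order is feasible.
    Illustrative sketch, not the paper's optimizer."""
    best = None
    for order in permutations(pickups):
        t, pos, ok = 0.0, bus_pos, True
        for p in order:
            t += dist(pos, p)     # travel time from current position
            if t > max_wait:      # this pickup would wait too long
                ok = False
                break
            pos = p
        if ok and (best is None or t < best[0]):
            best = (t, order)
    return best
```

In the full approach, the feasible (bus, request-set, order) trips produced this way would feed a fleet-wide assignment optimization; here only the inner ordering step is shown.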
Lane Change Classification with Neural Networks for Automated Conversion of Logical Scenarios
ABSTRACT. Validation and testing based on simulated scenarios is at the heart of the automotive industry for all vehicle development phases, since it enables perfect repeatability of the experiments and control of certain parameters. However, simulations alone can hardly capture all the complexities of the real world, so true driving scenarios also represent an indispensable part of the process. Although invaluable, they offer very little freedom in changing the parameters, which motivates approaches for the automated conversion of real-world driving scenarios into so-called logical scenarios, which offer a higher abstraction level. To perform the complex process of converting real-world driving data, it is first necessary to be able to perform vehicle motion classification. For that purpose, this paper proposes and analyzes five different neural network models. The networks were trained and evaluated on a custom generated dataset to classify lateral vehicle behaviours into three main classes with respect to road lanes: lane keep, lane change right and lane change left. The dataset represents highway driving scenarios on a road with 7 lanes in a curvilinear coordinate system. Model training and evaluation were performed on four different subsets, each with a different signal-to-noise ratio. In the end, the best overall result was achieved with the network model composed of bidirectional long short-term memory and multi-scale convolutional neural network layers.
Trajectory Analysis in a Lane-Based UAS Traffic Management System
ABSTRACT. The development of Advanced Air Mobility (AAM) systems requires not only a robust and safe approach to planning flights, but also a way to monitor Unmanned Aircraft Systems (UAS) flights in real time to determine whether flights are deviating from their nominal flight paths or whether there are rogue (i.e., unplanned) flights in the area. We have proposed a lane-based airways methodology for lane creation, scheduling and strategic deconfliction, and here we describe Nominal vs. Anomalous Behavior (NAB), an efficient and effective way to monitor flight trajectories to determine normal versus anomalous behavior.