Introduction to offline navigation for augmented reality
Computer systems that can be controlled without dedicated controllers represent a new stage in human-computer interaction. This field covers technologies that perceive the physical environment, including gesture recognition, voice recognition, face recognition, motion tracking, and environment reconstruction. The Intel RealSense F200 and R200 cameras provide a number of capabilities in this area. Thanks to depth sensing, the F200 and R200 can reconstruct a three-dimensional model of the environment and track the device's movement relative to it. Environment reconstruction combined with motion tracking enables augmented reality experiences in which virtual objects are embedded into the real world.
The purpose of this article is to introduce autonomous navigation and describe its use in augmented reality applications. The accompanying example uses the Intel RealSense R200 camera and the Unity 3D game engine. Familiarity with the capabilities of the Intel RealSense SDK and Unity is recommended; for information on integrating the Intel RealSense SDK with Unity, see the articles "Developing Games with Unity and the Intel RealSense 3D Camera" and "First Look: Augmented Reality in Unity with the Intel RealSense R200". Intel RealSense cameras can supply data to augmented reality applications, but creating a truly engaging virtual world is up to the developer. One way to bring an environment to life is to populate it with autonomous agents: objects that act independently under the control of artificial intelligence. The AI defines the operational parameters and the rules the agent must obey. The agent reacts dynamically, in real time, to the conditions of its environment, so even simple rules can produce complex patterns of behavior.
Autonomous agents come in many varieties, but here we focus on agents that move and navigate. Such agents range from non-player characters (NPCs) in games to flocking birds in educational animations. The agents' goals vary with the application, but the principles of movement and navigation are the same in all cases.
Autonomous navigation
Agents can navigate in ways that range from simple to complex, both in implementation and in computational cost. The simplest approach is to pick a waypoint and move the agent toward it in a straight line. This is easy to implement, but it raises several problems, the most obvious being: what happens if there is no direct path between the agent and the waypoint (Fig. 1)?
Figure 1. The agent moves toward the target along a straight path, but the path is blocked by an obstacle. Note: the issues described apply to navigation in both two-dimensional and three-dimensional space; 2D is used here for illustration.
To route the agent around obstacles, additional waypoints must be added (Fig. 2).
Figure 2. Additional waypoints are added so that the agent can bypass obstacles.
Large maps with many obstacles require far more waypoints and routes. In addition, increasing the density of waypoints (Fig. 3) allows more efficient routes to be built (the agent's path to the destination becomes shorter).
Figure 3. As the map size increases, the number of waypoints and possible routes increases.
With a large number of waypoints, a method is needed to construct a route between two waypoints that are not adjacent to each other. This is the pathfinding problem. Pathfinding is closely related to graph theory and is applied in many fields beyond navigation, so it has been the subject of extensive research, and many algorithms exist for its different variations. One of the best-known pathfinding algorithms is A*. The algorithm moves between adjacent waypoints toward the destination while maintaining a map of all visited waypoints and the waypoints connected to them. After reaching the destination, the algorithm reconstructs the path from this map, and the agent can then follow that path. A* does not search the entire available space, so the constructed path is not always optimal, but the algorithm is quite efficient in terms of computational load.
Figure 4. The A* algorithm traverses the map, trying to find a route to the target. Animation: Subh83 / CC BY 3.0
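To make the search concrete, below is a minimal, hypothetical A* sketch over a 4-connected 2D grid, written in C# for consistency with the rest of the article. It is not part of the sample project; the grid representation, uniform step cost, and Manhattan heuristic are assumptions made purely for illustration.

using System;
using System.Collections.Generic;

// Hypothetical helper: a minimal A* search over a 2D grid of walkable cells
static class AStarSketch {
    static readonly (int dx, int dy)[] Moves = { (1, 0), (-1, 0), (0, 1), (0, -1) };

    public static List<(int x, int y)> FindPath(bool[,] walkable, (int x, int y) start, (int x, int y) goal) {
        // Manhattan distance heuristic: never overestimates on a 4-connected grid
        float H((int x, int y) p) => Math.Abs(p.x - goal.x) + Math.Abs(p.y - goal.y);

        var open = new SortedSet<(float f, int x, int y)> { (H(start), start.x, start.y) };
        var gScore = new Dictionary<(int, int), float> { [start] = 0f };
        var cameFrom = new Dictionary<(int, int), (int, int)>();

        while (open.Count > 0) {
            var best = open.Min;          // Frontier waypoint with the lowest estimated total cost
            open.Remove(best);
            (int x, int y) node = (best.x, best.y);

            if (node == goal) {           // Reconstruct the path by walking the map backward
                var path = new List<(int x, int y)> { node };
                while (cameFrom.TryGetValue(node, out var prev)) { node = prev; path.Add(node); }
                path.Reverse();
                return path;
            }

            foreach (var (dx, dy) in Moves) {
                (int x, int y) next = (node.x + dx, node.y + dy);
                if (next.x < 0 || next.y < 0 ||
                    next.x >= walkable.GetLength(0) || next.y >= walkable.GetLength(1) ||
                    !walkable[next.x, next.y]) continue;

                float g = gScore[node] + 1f;  // Uniform step cost between adjacent cells
                if (!gScore.TryGetValue(next, out float old) || g < old) {
                    gScore[next] = g;
                    cameFrom[next] = node;
                    open.Add((g + H(next), next.x, next.y));
                }
            }
        }
        return null;  // No path exists
    }
}

The SortedSet orders frontier nodes by their estimated total cost f = g + h, so the cheapest candidate is always expanded first.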
By its nature, A* cannot adapt to changes in the environment, such as added or removed obstacles or moving boundaries. An augmented reality environment is inherently dynamic, since it is created and modified in response to the user's movement through physical space.
In a dynamic environment, agents should make decisions in real time, based on the agent's current knowledge of the environment. A structure is therefore needed that lets the agent decide and act in real time. For navigation, a convenient and common approach is to divide the agent's behavior into three levels: action selection, steering, and locomotion.
Action selection consists of setting goals and determining how to achieve them. For example, a rabbit roams a field in search of food, but flees as soon as a predator appears nearby. Finite-state machines are a convenient way to implement this behavior, since they define the agent's states and the conditions under which the states change (a minimal sketch follows these three definitions).
Steering is the calculation of movement based on the agent's current state. For example, if a predator is chasing the rabbit, the rabbit must run away from it. Steering computes both the magnitude and the direction of the movement.
Locomotion is the mechanics by which the agent moves. A rabbit, a human, a car, and a spaceship all move differently. Locomotion defines both the means of movement (for example, legs, wheels, or rocket engines) and its parameters (for example, mass, maximum speed, maximum force).
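As promised above, here is a minimal, hypothetical finite-state machine for the rabbit example; the class and parameter names are invented for illustration.

// A minimal finite-state machine for the rabbit: two states and one transition rule
public enum RabbitState { Forage, Flee }

public class RabbitBrain {
    public RabbitState State { get; private set; } = RabbitState.Forage;

    // Flee whenever a predator is within the danger radius; forage otherwise
    public void UpdateState(float distanceToPredator, float dangerRadius) {
        State = distanceToPredator < dangerRadius ? RabbitState.Flee : RabbitState.Forage;
    }
}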
Together, these three levels make up the agent's artificial intelligence. In the next section, we walk through a Unity application that implements all three levels. We then embed autonomous navigation into an augmented reality application using the R200 camera.
Implementing autonomous navigation
This section describes the implementation of an autonomously navigating agent in a Unity scene, starting with locomotion.
▍Locomotion
The agent's locomotion is based on Newtonian dynamics: force applied to a mass produces acceleration. We use a simplified model with uniformly distributed mass to which a force can be applied from any direction. To constrain the movement, a maximum force and a maximum speed are defined (code snippet 1).
public float mass = 1f;       // Mass (kg)
public float maxSpeed = 0.5f; // Maximum speed (m/s)
public float maxForce = 1f;   // Maximum force (N)
Code snippet 1. The agent's locomotion model parameters
The agent must have Rigidbody and Collider components, which are initialized at startup (see code snippet 2). For simplicity, gravity is disabled, but it can be turned on.
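The body of code snippet 2 is not reproduced in this translation; a minimal reconstruction consistent with its caption, reusing the fields from snippet 1, might look like this:

private Rigidbody rb;
private Collider col;

private void Start() {
    // Cache and configure the physics components
    this.rb = GetComponent<Rigidbody>();
    this.rb.mass = this.mass;
    this.rb.useGravity = false;  // Gravity is excluded for simplicity
    this.col = GetComponent<Collider>();
}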
Code snippet 2. The Rigidbody and Collider components are initialized in Start()
The agent moves by applying a force to its rigidbody in the FixedUpdate() step (see code snippet 3). FixedUpdate() is similar to Update(), but it is guaranteed to execute at fixed time intervals. The Unity engine performs the physics calculations (rigidbody operations) immediately after the FixedUpdate() step.
private void FixedUpdate() {
    Vector3 force = Vector3.forward;
    // Upper bound on the force
    if (force.magnitude > this.maxForce) {
        force = force.normalized * this.maxForce;
    }
    // Apply the force
    rb.AddForce(force, ForceMode.Force);
    // Upper bound on the speed
    if (rb.velocity.magnitude > this.maxSpeed) {
        rb.velocity = rb.velocity.normalized * this.maxSpeed;
    }
}
Code snippet 3. A force is applied to the rigidbody in the FixedUpdate() step. In this example, the agent moves along the z-axis.
If the magnitude of the force exceeds the agent's maximum force, the force is rescaled so that its magnitude equals the maximum force while its direction is preserved. The AddForce() function then applies the force via numerical integration (Equation 1):
Equation 1. Numerical integration of velocity, as performed by AddForce():

v₁ = v₀ + (f / m) · Δt

where v₁ is the new velocity, v₀ is the previous velocity, f is the force, m is the mass, and Δt is the time step between updates (Unity's fixed time step defaults to 0.02 s). For example, a force of 1 N applied to a 1 kg agent changes its velocity by 0.02 m/s per physics step. If the magnitude of the resulting velocity exceeds the agent's maximum speed, the velocity is scaled down to the maximum speed.
▍Steering
Steering computes the force that is handed to the locomotion model. We will implement three steering behaviors: seek, arrival, and obstacle avoidance.
Seek
The seek behavior tries to move the agent toward the target as quickly as possible. Its desired velocity points straight at the target at maximum speed. The steering force is computed as the difference between the desired velocity and the agent's current velocity (Fig. 5).
Figure 5. The seek behavior applies a steering force to change the current velocity toward the desired velocity
The implementation (code snippet 4) first computes the desired velocity by normalizing the offset between the agent and the target and multiplying it by the maximum speed. The function returns the steering force: the desired velocity minus the rigidbody's current velocity.
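Code snippet 4 itself is missing from this translation; a sketch matching the description above (the target field is an assumption) could be:

public Transform target;  // Assumed field: the object the agent seeks

private Vector3 Seek() {
    // Desired velocity: straight toward the target at maximum speed
    Vector3 desiredVelocity = (this.target.position - this.transform.position).normalized * this.maxSpeed;
    // The steering force is the desired velocity minus the current velocity
    return desiredVelocity - this.rb.velocity;
}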
The agent uses the seek behavior by calling Seek() when computing the force in FixedUpdate() (code snippet 5).
private void FixedUpdate() {
    Vector3 force = Seek();
    ...
Code snippet 5. Calling Seek() in FixedUpdate()
Video 1 shows the seek behavior in action. The agent is annotated with a blue arrow indicating the rigidbody's current velocity and a red arrow indicating the steering force applied during that time step.
Video 1. Because the agent's initial velocity is perpendicular to the direction toward the target, the agent moves along a curve
Arrival
With the seek behavior alone, the agent overshoots the target and then circles back, since it always moves at the highest possible speed. The arrival behavior is similar to seek, except that it tries to come to a complete stop at the target. The "deceleration radius" parameter defines the distance from the target at which the agent begins to slow down. When the agent is inside the deceleration radius, the desired speed is scaled down in proportion to the remaining distance between the agent and the target. Depending on the values of the maximum force, the maximum speed, and the deceleration radius, this behavior may not bring the agent to a complete stop.
The arrival behavior (code snippet 6) first computes the distance between the agent and the target. The scaled speed is the maximum speed multiplied by the distance divided by the deceleration radius. The desired speed is the smaller of the scaled speed and the maximum speed; thus, if the distance to the target is less than the deceleration radius, the desired speed is the scaled speed, and otherwise it is the maximum speed. The rest of the function works exactly like seek, using this desired speed.
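Code snippet 6 is likewise not reproduced here; a sketch matching the description (the decelerationRadius field name is assumed) could be:

public float decelerationRadius = 0.3f;  // Assumed field: distance (m) at which braking starts

private Vector3 Arrive() {
    Vector3 offset = this.target.position - this.transform.position;
    float distance = offset.magnitude;
    // Scale the speed down linearly inside the deceleration radius
    float scaledSpeed = (distance / this.decelerationRadius) * this.maxSpeed;
    float desiredSpeed = Mathf.Min(scaledSpeed, this.maxSpeed);
    // Steer toward the target at the desired speed, as in Seek()
    Vector3 desiredVelocity = offset.normalized * desiredSpeed;
    return desiredVelocity - this.rb.velocity;
}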
Video 2. The arrival behavior reduces the speed as the agent approaches the target.
Obstacle avoidance
The arrival and seek behaviors are fine for reaching a destination, but they cannot cope with obstacles. In a dynamic environment, the agent must be able to evade obstacles as they appear. The obstacle avoidance behavior looks ahead of the agent along its intended route and determines whether any obstacles lie in the way. If obstacles are detected, the behavior computes a force that alters the agent's path so that the collision is avoided (Fig. 6).
Figure 6. If an obstacle is detected on the current path, a force is returned to prevent the collision.

The implementation of obstacle avoidance (code snippet 7) uses a spherecast to detect collisions. A sphere is cast from the agent's center along the rigidbody's current velocity, and a RaycastHit is returned for each collision. The sphere's radius is the sum of the object's collider extents and the "avoidance radius" parameter, which lets the user define how much empty space to keep around the agent. The distance the sphere travels is bounded by the "forward detection" parameter.
// Avoidance radius (m). The desired amount of space between the agent and obstacles.
public float avoidanceRadius = 0.03f;
// Forward detection radius (m). The distance in front of the agent that is checked for obstacles.
public float forwardDetection = 0.5f;

private Vector3 ObstacleAvoidance() {
    Vector3 steeringForce = Vector3.zero;
    // Cast a sphere, bounding the avoidance zone of the agent, to detect obstacles
    RaycastHit[] hits = Physics.SphereCastAll(this.transform.position, this.col.bounds.extents.x + this.avoidanceRadius, this.rb.velocity, this.forwardDetection);
    // Compute and sum the forces across all hits
    for (int i = 0; i < hits.Length; i++) {
        // Ensure that the collider is on a different object
        if (hits[i].collider.gameObject.GetInstanceID() != this.gameObject.GetInstanceID()) {
            if (hits[i].distance > 0) {
                // Scale the force inversely proportional to the distance to the target
                float scaledForce = ((this.forwardDetection - hits[i].distance) / this.forwardDetection) * this.maxForce;
                float desiredForce = Mathf.Min(scaledForce, this.maxForce);
                // Compute the steering force
                steeringForce += hits[i].normal * desiredForce;
            }
        }
    }
    return steeringForce;
}
Code snippet 7. The obstacle avoidance behavior
A spherecast returns an array of RaycastHit objects. A RaycastHit contains information about a collision, including the distance to it and the normal of the surface that was hit. The normal is a vector perpendicular to the surface, and it can be used to steer the agent away from the collision point. The magnitude of each force is scaled inversely with the distance to the collision and clamped to the maximum force, so closer obstacles produce stronger responses. The forces from all collisions are summed, yielding the total avoidance force for that time step.
Several behaviors can be active at the same time to obtain more complex behavior (code snippet 8). Obstacle avoidance is only meaningful in combination with other behaviors. In this example (video 3), obstacle avoidance is used together with arrival. The behaviors are combined simply by summing their forces. More elaborate schemes are also possible, in which heuristics determine weights that prioritize the forces; a hypothetical weighted variant is sketched after the video.
private void FixedUpdate() {
    // Calculate the total steering force by summing the active steering behaviors
    Vector3 force = Arrive() + ObstacleAvoidance();
    ...
Code snippet 8. The arrival and obstacle avoidance behaviors are combined by summing their forces.
Video 3. The agent uses two behaviors at once: arrival and obstacle avoidance
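As mentioned above, more elaborate schemes weight the individual forces rather than simply summing them. A hypothetical weighted blend (the weight fields are invented for illustration) might look like:

public float arriveWeight = 1f;     // Assumed field: priority of reaching the target
public float avoidanceWeight = 2f;  // Assumed field: priority of avoiding collisions

private void FixedUpdate() {
    // Weight each steering force before summing, so avoidance dominates near obstacles
    Vector3 force = this.arriveWeight * Arrive() + this.avoidanceWeight * ObstacleAvoidance();
    ...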
▍Action selection
Action selection is the setting of overall goals and the making of decisions by the agent. Our agent already embodies a simple action-selection model in the combination of the arrival and obstacle avoidance behaviors: the agent tries to arrive at the target, but changes its trajectory when it detects an obstacle. The avoidance radius and forward detection parameters of the obstacle avoidance behavior determine when that happens.
R200 camera integration
Now that the agent can move on its own, it can be incorporated into an augmented reality application. The following example is built on the Scene Perception sample included in the Intel RealSense SDK. The application creates a three-dimensional mesh using Scene Perception, and the user can place and move a target in three-dimensional space. The agent then navigates across the generated mesh to reach the target.
▍Scene Manager
The Scene Manager script initializes the scene and handles user input. The only input is touch (or a mouse click if the device does not support touch). A raycast from the touch point determines whether the touch landed on the generated mesh. The first touch creates the target on the mesh, the second creates the agent, and every subsequent touch moves the target. The input logic is handled by a state machine (code snippet 9).
// State machine that controls the scene:
// Start => SceneInitialized -> TargetInitialized -> AgentInitialized
private enum SceneState { SceneInitialized, TargetInitialized, AgentInitialized };
private SceneState state = SceneState.SceneInitialized;  // Initial scene state

private void Update() {
    // Trigger when the user "clicks" with either the mouse or a touch-up gesture
    if (Input.GetMouseButtonUp(0)) {
        TouchHandler();
    }
}

private void TouchHandler() {
    RaycastHit hit;
    // Raycast from the point touched on the screen
    if (Physics.Raycast(Camera.main.ScreenPointToRay(Input.mousePosition), out hit)) {
        // Only register if the touch was on the generated mesh
        if (hit.collider.gameObject.name == "meshPrefab(Clone)") {
            switch (this.state) {
            case SceneState.SceneInitialized:
                SpawnTarget(hit);
                this.state = SceneState.TargetInitialized;
                break;
            case SceneState.TargetInitialized:
                SpawnAgent(hit);
                this.state = SceneState.AgentInitialized;
                break;
            case SceneState.AgentInitialized:
                MoveTarget(hit);
                break;
            default:
                Debug.LogError("Invalid scene state.");
                break;
            }
        }
    }
}
Code snippet 9. The touch handler and state machine of the sample application
The Scene Perception component generates many small meshes, each usually containing no more than 30 vertices. The orientation of the vertices varies, so some meshes are slightly tilted with respect to the surface they represent. An object placed on top of such a mesh (for example, the target or the agent) would therefore be incorrectly oriented. To work around this, the average normal of the mesh is used (code snippet 10).
private Vector3 AverageMeshNormal(Mesh mesh) {
    Vector3 sum = Vector3.zero;
    // Sum all the normals in the mesh
    for (int i = 0; i < mesh.normals.Length; i++) {
        sum += mesh.normals[i];
    }
    // Return the average
    return sum / mesh.normals.Length;
}
Code snippet 10. Computing the average normal of a mesh
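For example, a spawned object can be oriented along this average normal. The helper below is a hypothetical usage sketch, not code from the sample:

// Rotate a spawned object so that its up axis matches the mesh's average normal
private void AlignToMesh(GameObject obj, Mesh mesh) {
    Vector3 normal = AverageMeshNormal(mesh);
    obj.transform.rotation = Quaternion.FromToRotation(Vector3.up, normal);
}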
▍Building the application
All code developed for this example is available on GitHub. The following steps build the Scene Manager and agent implementation into the Intel RealSense application:
1. Open the RF_ScenePerception example in the Intel RealSense SDK folder RSSDK\framework\Unity.
2. Open RealSenseExampleScene in the Assets/AutoNavAR/Scenes/ folder.
3. Build and run the application on any compatible device with an Intel RealSense R200 camera.
Video 4. The completed integration with the Intel RealSense R200 camera
Further development of autonomous navigation
We have developed an example that demonstrates an autonomous agent in an augmented reality application with the R200 camera. There are several ways to extend this work to make the agent smarter and more realistic.
Our agent used a simplified mechanical model with uniformly distributed mass and no directional constraints on the applied force. More sophisticated models could distribute mass unevenly and constrain the forces applied to the body (for example, a car with different acceleration and braking forces, or a spacecraft with a main engine and lateral thrusters). The more accurate the mechanical model, the more realistic the resulting movement.
Craig Reynolds was the first to describe steering behaviors in detail in the context of animation and games. The seek, arrival, and obstacle avoidance behaviors demonstrated in our example are based on his work. Reynolds also described other behaviors, including flee, pursuit, wander, exploration, obstacle avoidance, and path following, as well as group behaviors such as separation, cohesion, and alignment. Another useful resource is Mat Buckland's book Programming Game AI by Example, which covers the implementation of steering behaviors along with a number of related topics, including finite-state machines and pathfinding.
In this example, the arrival and obstacle avoidance steering behaviors act on the agent simultaneously. Any number of behaviors can be combined in this way to obtain more complex models; for example, flocking is built from separation, cohesion, and alignment. Combining behaviors can sometimes produce unnatural results, so it is worth experimenting with different combinations to discover new possibilities.
In addition, some pathfinding techniques are designed for dynamic environments. The D* algorithm is similar to A*, but it can update the path based on new observations (such as added or removed obstacles). D* Lite behaves the same as D* and is simpler to implement. Pathfinding can also be combined with steering: a pathfinding algorithm supplies the waypoints, and steering moves the agent between them, as in the sketch below.
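A sketch of that combination, written as a fragment of the agent class from the earlier snippets (the waypoint list and the arrival threshold are assumptions), might look like:

public List<Vector3> waypoints;   // Assumed field: path produced by A*, D*, or D* Lite
private int currentWaypoint = 0;

private Vector3 FollowPath() {
    if (this.currentWaypoint >= this.waypoints.Count) return Vector3.zero;

    Vector3 offset = this.waypoints[this.currentWaypoint] - this.transform.position;
    // Advance to the next waypoint once the current one is close enough
    if (offset.magnitude < 0.05f) {
        this.currentWaypoint++;
        return Vector3.zero;
    }
    // Steer toward the current waypoint, exactly as in the seek behavior
    Vector3 desiredVelocity = offset.normalized * this.maxSpeed;
    return desiredVelocity - this.rb.velocity;
}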
Action selection is not discussed further in this article, but it is studied extensively in game theory, the mathematical study of strategy and decision making. Game theory is applied in many fields, including economics, political science, and psychology. For autonomous agents, it can inform how and when decisions are made. William Spaniel's Game Theory 101: The Complete Textbook, together with its companion YouTube video series, is a great starting resource.
Conclusion
There is a whole arsenal of techniques for shaping an agent's locomotion, steering, and action selection. Autonomous navigation is particularly well suited to dynamic environments, such as those created by Intel RealSense cameras in augmented reality applications. Even simple locomotion and steering models can produce complex behavior without prior knowledge of the environment, and the abundance of available models and algorithms offers the flexibility to implement autonomous agents in any application.