ResetScene-Henrik

Revision as of 10:49, 28 May 2024

Here is a full code example showing how to reset a scene to get deterministic behavior.

Unity

Here we have two objects: a robot manager (RobotManager.cs) that connects to ML-Agents on the Python side, and a Robot (Robot.cs) that is our agent. The Robot is controlled by ML-Agents and can collect observations for it.

RobotManager

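using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;
using Unity.MLAgents;
using Unity.MLAgents.SideChannels;

public class RobotManager : MonoBehaviour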
{
   // The important part for making a singleton
   private static RobotManager instance;
   public static RobotManager Instance { get { return instance; } }
   
   [Tooltip("Makes sure the physics start after n fixed updates")]
   [SerializeField] int warmupFixedUpdates = 80;
   
   // This is a custom sidechannel to send some more data. Not mandatory
   public ConfigSideChannel rc;
   
   public void Awake() {
       if (instance != null && instance != this) { // The important part for making a singleton
           this.gameObject.SetActive(false);
           Destroy(this.gameObject);
           return;
       }
       else if (instance == null) {
           instance = this;
       }
       DontDestroyOnLoad(this.gameObject);
       
       if (rc == null) {
           rc = new ConfigSideChannel();
           SideChannelManager.RegisterSideChannel(rc);
           Unity.MLAgents.Academy.Instance.OnEnvironmentReset += ResetScene; // Reload the entire scene when the reset command is sent from Python
       }
   }
   
   public void ResetScene() { // Runs when the reset command is sent from Python
       SceneManager.LoadScene("SampleScene"); // Loads the scene, replacing the existing one
       // WARNING: see below
       StartCoroutine(SpawnENV()); // Start a coroutine that waits n FixedUpdates so the scene can load
       // Ideally, instantiate the robot/agent once SpawnENV has finished waiting
   }
   
   IEnumerator SpawnENV() {
       // Wait for scene to load properly
       for (int i = 0; i < warmupFixedUpdates; i++)
           yield return new WaitForFixedUpdate();
   }
   
   public void OnDestroy(){
       if (Academy.IsInitialized && rc != null)
           SideChannelManager.UnregisterSideChannel(rc);
   }
}

The issue with this setup is that the result depends on how fast the simulation runs, which is likely due to how the coroutine and the scene are configured. Unity starts the simulation as soon as it can, without waiting for SpawnENV to complete. This is not a problem at normal speed (a timescale of 1), where the results are consistent. At any other timescale, however, the outcome can be unpredictable.

Currently, the robot is already placed in the scene, so it is created and starts receiving commands from ML-Agents as soon as the scene is loaded. A better approach would be to create ("spawn") the agent from code only after SpawnENV has finished waiting. That way, the agent starts receiving commands only once everything else is set up.
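
The suggestion above could be sketched roughly as follows. This is a minimal, untested sketch and not the code used in this project: the class name DelayedRobotSpawner and the fields robotPrefab and spawnPosition are hypothetical, and it assumes the robot is available as a prefab rather than being placed in SampleScene. Whether it fully removes the timescale dependence has not been verified.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical sketch: spawn the agent only after the warm-up wait has finished.
public class DelayedRobotSpawner : MonoBehaviour
{
    [Tooltip("Number of fixed updates to wait before spawning the robot")]
    [SerializeField] int warmupFixedUpdates = 80;

    // Assumption: the robot exists as a prefab and is NOT already in the scene.
    [SerializeField] GameObject robotPrefab;

    // Assumption: a fixed spawn point is enough for this example.
    [SerializeField] Vector3 spawnPosition = Vector3.zero;

    void Awake()
    {
        // Keep this object alive across scene loads so the coroutine keeps running.
        DontDestroyOnLoad(gameObject);
    }

    public void ResetScene()
    {
        SceneManager.LoadScene("SampleScene");
        StartCoroutine(SpawnAfterWarmup());
    }

    IEnumerator SpawnAfterWarmup()
    {
        // Wait n fixed updates so the freshly loaded scene can settle.
        for (int i = 0; i < warmupFixedUpdates; i++)
            yield return new WaitForFixedUpdate();

        // Only now create the robot, so it cannot receive commands earlier.
        Instantiate(robotPrefab, spawnPosition, Quaternion.identity);
    }
}
```

In RobotManager, the same idea would amount to moving the instantiation into SpawnENV, after the wait loop, instead of doing it directly in ResetScene.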
