ResetScene-Henrik

From Robin

Revision as of 11:03, 28 May 2024

Here is a full code example of how to reset a scene to get deterministic behavior.

Unity

Here we have two objects: a robot manager (RobotManager.cs) that connects to ML-Agents on the Python side, and a robot (Robot.cs) that is our agent. The robot is controlled by ML-Agents and collects observations for it.

RobotManager

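using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;
using Unity.MLAgents;
using Unity.MLAgents.SideChannels;

public class RobotManager : MonoBehaviour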
{
   // The important part for making a singleton
   private static RobotManager instance;
   public static RobotManager Instance { get { return instance; } }
   
   [Tooltip("Makes sure the physics start after n fixed updates")]
   [SerializeField] int warmupFixedUpdates = 80;
   
   // This is a custom sidechannel to send some more data. Not mandatory
   public ConfigSideChannel rc;
   
   public void Awake() {
       if (instance != null && instance != this) { // The important part for making a singleton: destroy any duplicate
           this.gameObject.SetActive(false);
           Destroy(this.gameObject);
           return;
       }
       else if (instance == null) {
           instance = this;
       }
       DontDestroyOnLoad(this.gameObject);
       
       if (rc == null) {
           rc = new ConfigSideChannel();
           SideChannelManager.RegisterSideChannel(rc);
           Unity.MLAgents.Academy.Instance.OnEnvironmentReset += ResetScene; //this is to reload the entire scene when sending the reset command from python
       }
   }
   
   public void ResetScene() { // Will be run when sending the reset command
       SceneManager.LoadScene("SampleScene"); // Loads the scene again, replacing the existing one
       // WARNING: see the notes below
       StartCoroutine(SpawnENV()); // Start a coroutine that waits n FixedUpdates for the scene to load
       // Ideally the robot/agent would be instantiated once SpawnENV has finished waiting
   }
   
   IEnumerator SpawnENV() {
       // Wait for scene to load properly
       for (int i = 0; i < warmupFixedUpdates; i++)
           yield return new WaitForFixedUpdate();
   }
   
   public void OnDestroy(){
       if (Academy.IsInitialized && rc != null)
           SideChannelManager.UnregisterSideChannel(rc);
   }
}

The issue with this setup is that its behavior depends on how fast the simulation runs, which is likely due to the way the coroutine and the scene load interact. Right now, Unity starts the simulation as soon as it can, without waiting for SpawnENV to complete. This is not a problem at normal speed (a timescale of 1), where the results are consistent. Running at a different timescale, however, can lead to unpredictable outcomes.

Currently, the robot is already placed in the scene. It is created and starts receiving commands from ML-Agents as soon as the scene is loaded. It would be better to create ("spawn") the agent in code only after SpawnENV is done waiting. That way, the agent only starts receiving commands once everything else is set up properly.
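
The spawn-after-warmup idea could look roughly like the sketch below. This is a minimal, untested sketch and not the setup used on this page. It assumes the Robot has been removed from SampleScene and saved as a prefab. The class name DelayedRobotSpawner and the field robotPrefab are hypothetical names introduced only for illustration. Whether this fully removes the timescale dependence has not been verified.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical helper: reloads the scene, waits for the warmup, then spawns the agent.
public class DelayedRobotSpawner : MonoBehaviour
{
    [Tooltip("Number of fixed updates to wait before spawning the robot")]
    [SerializeField] int warmupFixedUpdates = 80;

    // Assumption: a prefab of the Robot, assigned in the Inspector.
    [SerializeField] GameObject robotPrefab;

    void Awake()
    {
        // Keep this object alive across scene reloads so the coroutine keeps running.
        DontDestroyOnLoad(gameObject);
    }

    public void ResetScene()
    {
        // Cancel a warmup that is still pending from an earlier reset.
        StopAllCoroutines();
        SceneManager.LoadScene("SampleScene");
        StartCoroutine(SpawnENV());
    }

    IEnumerator SpawnENV()
    {
        // Wait n fixed updates so the scene and its physics can settle.
        for (int i = 0; i < warmupFixedUpdates; i++)
            yield return new WaitForFixedUpdate();

        // Only now create the agent, so it cannot receive commands earlier.
        Instantiate(robotPrefab);
    }
}
```

In the RobotManager above, the same effect could be had by adding the prefab field and moving the Instantiate call to the end of the existing SpawnENV coroutine.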


Robot/Agent

This robot is based on the ML-Agents Crawler example.

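using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.SideChannels;
using Unity.MLAgentsExamples; // JointDriveController from the ML-Agents example project

public class Robot : Agent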
{

   JointDriveController m_JdController; // Control joints
   private Transform Center; // Center of the robot, a prefab
   private ConfigSideChannel parameterChannel; // The custom sidechannel 
   
   public void Awake() // Nothing is needed here; this runs before Initialize()
   {
   }
   public override void Initialize()
   {
       parameterChannel = FindObjectOfType<RobotManager>().rc;
       m_JdController = GetComponent<JointDriveController>();
       Center = transform.Find("Center");
   }
  
   public override void OnEpisodeBegin() // Nothing is needed here, because the whole scene is reset
   {
       return;
   }

   public override void OnActionReceived(ActionBuffers actions)
   {
       // Handle the actions sent from the Python side here
       // ActionBuffers works like an array. To access a continuous action: actions.ContinuousActions[0]
   }


   public override void CollectObservations(VectorSensor sensor){
       sensor.AddObservation(Center.localPosition); // 3 observations 

       // Observe the agent's local rotation in Euler angles (3 observations)
       sensor.AddObservation(Center.localRotation.eulerAngles);
   }

   public void OnDestroy()
   {
       if (Academy.IsInitialized){
           SideChannelManager.UnregisterSideChannel(parameterChannel);
       }
   }
}