Progress for week 16 (2018)

(Difference between revisions)

Current revision as of 11:54, 20 April 2018

Start DDPG and TRPO implementation
Decide where to validate the algorithms and implement the validation setup
Extend REINFORCE with baseline
Read up on typical state representations and reward signals for locomotion tasks
Get an overview of the different potential tasks to explore with Dyret (balance, movement speed, etc.)

Wrote the first draft of Relearning
- Is the ending too short?
- Does the first half connect with the second?
Wrote a short intro
- Is more needed?
- Does the research question need to be made more concrete?
Started on the conclusion
- The sections are now:

Result summary

Discussion summary

Thesis conclusion

Future work

@@ Line 8: / Line 8: @@
 === Accounting ===
-*
+* DDPG implemented and apparently working. TRPO implementation not started
+* Validating in the continuous MountainCar and Pendulum environments from OpenAI Gym
+* REINFORCE not working yet.
+* Not investigated. Trying to get the algorithm itself working first
+* Not thoroughly explored
 == Martin Hovin ==