Progress for week 16 (2018)
From Robin
Current revision as of 11:54, 20 April 2018
Vetle Bu Solgård
Budget
- Start DDPG and TRPO implementation
- Decide where to validate the algorithms and set up that validation
- Extend REINFORCE with a baseline
- Read up on typical state representations and reward signals for locomotion tasks
- Get an overview of potential tasks to explore with Dyret (balance, movement speed, etc.)
Accounting
- DDPG implemented and apparently working. TRPO implementation not started.
- Validating in the continuous MountainCar and Pendulum environments from OpenAI Gym.
- REINFORCE not working yet.
- State representations and reward signals: not investigated. Trying to get the algorithm itself working first.
- Tasks for Dyret: not thoroughly explored.
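
For reference, the "REINFORCE with baseline" item can be sketched as below. This is only a minimal illustration, not the implementation in progress: the function names are hypothetical, and a constant mean-return baseline is assumed for simplicity (a learned value function is the more common choice). It computes the discounted return for each step of one episode and subtracts the baseline, giving the weights that would multiply the log-probability gradients in the policy update.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * G_{t+1} for one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def baseline_advantages(rewards, gamma=0.99):
    """Returns minus a constant baseline (assumed here: the episode's mean return)."""
    returns = discounted_returns(rewards, gamma)
    baseline = sum(returns) / len(returns)
    return [g - baseline for g in returns]
```

Subtracting a baseline that does not depend on the action leaves the expected policy gradient unchanged while usually reducing its variance.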
Martin Hovin
Budget
- Write results, discussion and conclusion in Relearning
- Write the introduction
- Write the conclusion of the thesis
Accounting
- Wrote a first draft of Relearning
  - Is the ending too short?
  - Does the first half hang together with the second?
- Wrote a short intro
  - Is more needed?
  - Does the research question need to be made more concrete?
- Started on the conclusion
  - The sections are now:
    - Result summary
    - Discussion summary
    - Thesis conclusion
    - Future work