PPO CartPole Trainer
Start Training
Stop Training
Evaluate Agent
Reset Agent
Model Configuration
Actor Learning Rate:
Critic Learning Rate:
Gamma (Discount):
Lambda (GAE):
Epsilon (Clip):
PPO Epochs:
Mini-Batch Size:
Dropout Rate:
L2 Regularization Rate:
Clip Norm:
Hidden Activation:
Leaky ReLU
ReLU
Sigmoid
Tanh
Linear
Apply Configuration
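
As a rough illustration of how the fields in the form above usually enter a PPO update, here is a minimal sketch in plain Python. It is not taken from this trainer: the names `PPOConfig`, `compute_gae`, and `clipped_surrogate` are hypothetical, and every default value is a placeholder chosen for illustration, not a value read from the form. Gamma and Lambda appear in the generalized advantage estimate, and Epsilon bounds the probability ratio in the clipped objective.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PPOConfig:
    # Hypothetical container mirroring the form fields.
    # All defaults are illustrative placeholders, not values from the form.
    actor_lr: float = 3e-4
    critic_lr: float = 1e-3
    gamma: float = 0.99           # Gamma (Discount)
    lam: float = 0.95             # Lambda (GAE)
    epsilon: float = 0.2          # Epsilon (Clip)
    ppo_epochs: int = 10
    mini_batch_size: int = 64
    dropout_rate: float = 0.0
    l2_reg_rate: float = 0.0
    clip_norm: float = 0.5
    hidden_activation: str = "tanh"


def compute_gae(rewards: List[float], values: List[float], dones: List[bool],
                last_value: float, gamma: float, lam: float) -> List[float]:
    """Generalized advantage estimation over one rollout, computed backwards."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        # A terminal step cuts off both the bootstrap value and the running sum.
        mask = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * mask - values[t]
        gae = delta + gamma * lam * mask * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


def clipped_surrogate(ratio: float, advantage: float, epsilon: float) -> float:
    """Per-sample PPO clipped objective (to be maximized)."""
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    cfg = PPOConfig()
    adv = compute_gae([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], [False, False, True],
                      last_value=0.0, gamma=cfg.gamma, lam=cfg.lam)
    print(adv)
    print(clipped_surrogate(ratio=1.5, advantage=adv[0], epsilon=cfg.epsilon))
```

In a full trainer, PPO Epochs and Mini-Batch Size would control how many passes are made over each rollout and how it is split, while the dropout, L2, clip-norm, and activation settings would apply to the actor and critic networks themselves; those parts are omitted here.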