Reinforcement Learning (Plugins)

Post » Tue Jan 05, 2016 11:53 am

Reinforcement Learning — Now for sale in the Scirra Store!
https://www.scirra.com/store/construct2 ... rning-1873

This plugin is built upon the Deep Q-Learning algorithm developed by Google's DeepMind. It enables the developer to give agents a 'brain' and train them using rewards/punishments. The implementation uses convnet.js by Andrej Karpathy.
A more detailed description of the plugin's features and abilities is available in the documentation, along with an example CAPX.

Please note that the plugin comes as-is: there is no guarantee that the artificial neural network will converge (or when), nor can a certain accuracy be guaranteed.

Use this topic to leave comments, ask questions, and talk about Reinforcement Learning.
Last edited by fundation2000 on Tue Mar 22, 2016 7:53 pm, edited 2 times in total.

Post » Wed Jan 06, 2016 4:25 pm

It would be really great if you could explain more in the CAPX event sheet about what is happening and why it needs to happen.
That would give us a more basic understanding of how to use the plugin in our own setup.
I like what I'm seeing so far!

Hope you can add to the CAPX :-)

Post » Wed Jan 06, 2016 6:04 pm

I admit the CAPX is a bit bloated, since there's quite a lot going on, but it all breaks down into only 4 steps (Train, triggerAction, Reward and manageEnvironment):

TRAIN:
The agent gets 9 sensors, and each sensor generates three inputs for the brain ("apple", "poison" and "wall"), using the distance to the touched object as the value.
(In this step, I also draw some lines from the agent to the touched object using the SensedApple and SensedPoison tiles, but these are purely optional. I guess I could take them out altogether.)
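
To make the shape of that input concrete, here is a rough Python sketch of how 9 sensors could be flattened into 27 brain inputs. This is not the plugin's code: the names `SensorReading`, `build_inputs` and `SENSOR_RANGE`, the 100 pixel range, and the choice to fill unused slots with the maximum range are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

KINDS = ("apple", "poison", "wall")  # the three input types named in the post
SENSOR_RANGE = 100.0                 # assumed maximum sensing distance


@dataclass
class SensorReading:
    kind: Optional[str] = None  # what the sensor touches, or None for nothing
    distance: float = 0.0       # distance to the touched object


def build_inputs(readings: List[SensorReading]) -> List[float]:
    """Flatten sensor readings into one input list, three slots per sensor."""
    inputs = []
    for reading in readings:
        for kind in KINDS:
            if reading.kind == kind:
                inputs.append(reading.distance)
            else:
                # Assumption: an untouched slot reports "nothing within range".
                inputs.append(SENSOR_RANGE)
    return inputs
```

With 9 readings this always returns 27 values, so the brain sees the same layout on every tick regardless of what the sensors touch.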

REWARD:
Then there's the reward, which is based on the agent's interaction with the apples, the poison or the walls.
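
As a minimal sketch of what such a reward step might look like (assuming made-up reward values, since the post does not state the ones used in the CAPX), each kind of object the agent touched in a tick contributes a positive or negative number:

```python
# Assumed reward values for illustration only; tune them for your own game.
REWARDS = {"apple": 1.0, "poison": -1.0, "wall": -0.5}


def compute_reward(touched_kinds):
    """Sum the rewards for everything the agent touched during this tick."""
    return sum(REWARDS.get(kind, 0.0) for kind in touched_kinds)
```

For example, `compute_reward(["apple"])` gives a positive reward, while touching nothing gives zero.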

TRIGGER ACTION:
And finally there's the output of the brain (for example "right") transformed into an action ("set angle of motion to current angle +50 degrees").
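
A hedged sketch of that mapping in Python: only "right" and the +50 degrees come from the description above. The "left" and "forward" entries and the `apply_action` helper are hypothetical additions to round out the example.

```python
# "right" -> +50 degrees is from the post; the other entries are assumptions.
TURN_DEGREES = {"left": -50.0, "forward": 0.0, "right": 50.0}


def apply_action(current_angle, action):
    """Return the new angle of motion after applying the brain's chosen action."""
    return (current_angle + TURN_DEGREES[action]) % 360.0
```

The modulo keeps the angle between 0 and 360, so repeated turns never drift out of range.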

MANAGE ENVIRONMENT:
This just adds more apples and poison if there's not enough lying around.
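
Roughly, that step could look like the following Python sketch. The `replenish` helper, the dictionary layout of an item and the minimum counts are assumptions for illustration and are not taken from the CAPX.

```python
import random


def replenish(items, kind, minimum, width, height, rng=random):
    """Spawn objects of the given kind at random positions until `minimum` exist."""
    count = sum(1 for item in items if item["kind"] == kind)
    while count < minimum:
        items.append({
            "kind": kind,
            "x": rng.uniform(0, width),
            "y": rng.uniform(0, height),
        })
        count += 1
    return items
```

Calling it once per tick for "apple" and once for "poison" keeps the layout stocked without ever removing anything.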

If you still find this confusing, I'll take a look at it on the weekend and trim it down. Although I usually consider more to be better :) .

Post » Wed Jan 06, 2016 7:56 pm

That makes sense.
Please keep the events like this, it's all nice and clean.
Also the lines drawn are a perfect visual guide to what is happening.

What would be great is an explanation of how the agent is actually training itself.

So we now know it has sensors connected to the brain.

We also know it gets rewarded.
(But how? Even more important, what does a reward mean to the agent? As in, does it compare good and bad, or something else?)

And it has a trigger output.
(But how does this trigger output correlate with the brain's inputs and rewards? I guess this is important to know for understanding its learned behavior?)

If this could be explained it becomes easier to understand the training process.
And then I can start training something of my own with this plugin.
(Instead of just a copy & paste ;-)

Hope this makes sense :-)

Post » Fri Jan 08, 2016 3:14 am

Garbage in, garbage out. It might be better to tell users to feed in well-preprocessed input data first, for example by normalizing the values to the range 0 to 1.
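
For example, a simple min-max scaling does this. The sketch below is plain Python and is not part of the plugin. The `normalize` helper and the bounds you pass in (such as a sensor's maximum range) are assumptions you would adapt to your own inputs.

```python
def normalize(value, lo, hi):
    """Scale value from the range [lo, hi] to [0, 1], clamping out-of-range input."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = max(lo, min(hi, value))
    return (clamped - lo) / (hi - lo)
```

A distance of 50 on a sensor with an assumed range of 0 to 100 becomes 0.5, and anything beyond the range is clamped to 0 or 1.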

Post » Fri Jan 08, 2016 11:39 am

I'll try to add some tips and hints on the weekend, but Rex is definitely right.

One should definitely invest some time and read up on reinforcement learning: what a neural network is, what inputs and outputs are, what forward and backward propagation are, etc. It's quite a broad topic and at first perhaps not too simple to grasp.

There is a plethora of great sources out there at the moment, as Deep Learning becomes a valuable tool for companies' data analysis. Start with some Wikipedia (like https://en.wikipedia.org/wiki/Reinforcement_learning and https://en.wikipedia.org/wiki/Q-learning) and also google around a bit.
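
To give a feel for what those articles describe, here is a minimal tabular Q-learning update in Python. It is a simplification and not the plugin's implementation: Deep Q-Learning replaces the table with a neural network that estimates the same values. The action names, `ALPHA` (learning rate) and `GAMMA` (discount factor) are assumed for illustration.

```python
from collections import defaultdict

ACTIONS = ("left", "forward", "right")  # assumed action set
ALPHA = 0.1   # learning rate: how far each update moves the estimate
GAMMA = 0.9   # discount factor: how much future reward counts

# Maps (state, action) to the estimated long-term reward; unseen pairs start at 0.
q_table = defaultdict(float)


def q_update(state, action, reward, next_state):
    """Apply one Q-learning step and return the updated estimate."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])
    return q_table[(state, action)]
```

The reward does not label an action as good or bad directly. It nudges the estimated value of taking that action in that state, and the agent then prefers whichever action has the highest estimate.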

Post » Sun Jan 10, 2016 5:10 pm

"I'll try to add some tips and hints on the weekend"


Thank you for this!

And you are correct, investing time in this subject is needed, but on that same note I have already invested a lot of time in it.
However, I find that most of the explanations given are not kept simple.
But I'm learning as I go :-)

Post » Sat Mar 12, 2016 3:52 am

Can you upload a preview, please?

Post » Tue Mar 22, 2016 12:30 am

Hello,

Nice idea, and so important for the future.

But I didn't find the CAPX on your web page.
Any idea where to find the documentation?

Thanks

Post » Tue Mar 22, 2016 7:36 pm

Hi guys,

So sorry I never got around to making more detailed documentation. I just can't find the time at the moment, but I'll try to fit it in sometime.

Until then, here is the capx file and here is the documentation.

If you have specific questions just ask here and I will gladly help.
