April 10, 2008
Simplifying solar harvesting model-development in situated agents using pre-deployment learning and information sharing
Huzaifa Zafar & Dan Corkill
Computer Science Department
University of Massachusetts, Amherst
Introduction
Problem Definition:
How much energy is this agent going to be able to harvest?
Clouds
Shading and tilting
How can an agent use its neighboring agents in developing its local models?
Two agents see the same (or very close) cloud attenuation
Two agents have different shade attenuation at any given time (unless in the deserts of Dubai)
[Figure: three agents with example {cloud, shade} attenuation pairs {30%, 20%}, {30%, 40%}, {30%, 0%}]
Related Work
Multi-Agent Reinforcement Learning
Extends traditional reinforcement learning to multiple agents
Each agent learns local policies given policies of neighboring agents
Requires a large observation set and time to converge to optimal policies.
Multi-Agent Inductive Learning
Learning models by interacting with other agents in the network
Each agent shares information with other agents in the network in order to better learn local models.
Again, requires a large observation set and time to converge to usable models
Observations
Agent performance is reduced while models are learned
Is it possible to reduce the time taken in developing local models once an agent has been deployed?
How can an agent better take advantage of the observations of its neighboring agents in developing its local models?
PLASMA (Pre-deployment Learning And Situated Model-development in Agents)
Two-phase strategy
Phase 1: Pre-deployment learning phase
Define and develop a parameterized model of the environment
The parameters of the model are the environmental effects
Phase 2: Post-deployment model-completion phase
Complete the local parameterized model by sharing information among agents
Solar Harvesting Model
Input: current time, location (GPS)
Energy harvested depends on:
The maximum energy provided (no attenuation)
Cloud attenuation
Shade attenuation
Tilt of the solar panel
Assume the geographical location and angle of the solar panel are constant for the lifetime of the agent
Combine shade attenuation and panel tilt into a single site attenuation
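As an illustration of this model, the sketch below computes observed energy as the attenuation-free maximum scaled by one minus the sum of cloud and site attenuation. The function names, the sinusoidal maximum, and the numeric values are assumptions for illustration, not the authors' implementation.

```python
import math

def harvested_energy(t, max_energy, cloud_att, site_att):
    # Observed energy = attenuation-free maximum * (1 - (cloud + site attenuation));
    # names and exact functional forms are illustrative assumptions.
    return max_energy(t) * (1.0 - (cloud_att(t) + site_att(t)))

# Example: sinusoidal maximum over a 24-hour day, 30% cloud and 20% site attenuation
max_e = lambda t: 1000.0 * max(math.sin(math.pi * t / 24.0), 0.0)
print(harvested_energy(12, max_e, lambda t: 0.30, lambda t: 0.20))  # -> 500.0
```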
Observations
Two agents have the same (or very close) cloud attenuation at any time of the day
Very small chance of two agents having exactly the same shade attenuation at any given time (unless you are in the deserts of Dubai)
Maximum energy does not depend on the exact location of the agent (approximate location is enough)
The relationship between cloud attenuation and energy harvested does not depend on the environment of the agent
The same holds for site attenuation
PLASMA: Pre-deployment learning phase
Learn the maximum observable energy and the relation between attenuations and observed energy
Model for the maximum observable energy: a·sin(time)
Model for the relation between attenuations and observed energy: a·log⁻¹(C(t))
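A minimal sketch of what a pre-deployment fit of the maximum-energy model a·sin(time) could look like, assuming a simple least-squares estimate of the amplitude from clear-sky training observations; the function name, the 24-hour sine basis, and the sample values are illustrative assumptions.

```python
import numpy as np

def fit_max_energy_amplitude(times_h, observed_energy):
    # Least-squares fit of the amplitude a in E_max(t) = a * sin(pi * t / 24),
    # using clear-sky (attenuation-free) pre-deployment observations.
    basis = np.sin(np.pi * np.asarray(times_h, dtype=float) / 24.0)
    a, *_ = np.linalg.lstsq(basis[:, None], np.asarray(observed_energy, dtype=float), rcond=None)
    return float(a[0])

# Example: noiseless clear-sky readings at 6h, 12h, and 18h
print(fit_max_energy_amplitude([6, 12, 18], [707.1, 1000.0, 707.1]))  # ~1000
```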
PLASMA: Post-deployment model completion
For Agent 1 on Day 2, we have the following equations:
Equations from Day 1:
400 = 1000 * (1 - (f(C(t1)) + k(S(t1, e1))))
600 = 1000 * (1 - (f(C(t1)) + k(S(t1, e2))))
Equations from Day 2:
450 = 1000 * (1 - (f(C(t2)) + k(S(t2, e1))))
670 = 1000 * (1 - (f(C(t2)) + k(S(t2, e2))))
[Figure: agents whose {cloud, shade} attenuation estimates are still unknown {??, ??}, with actual values {30%, 20%}, {30%, 40%}, {30%, 0%}]
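One way to picture the model-completion step is to treat the shared cloud terms f(C(t)) and the per-agent site terms k(S(·)) as unknowns and solve the pooled equations numerically. The sketch below does this in a least-squares sense; the variable ordering and solver choice are assumptions, not the paper's algorithm.

```python
import numpy as np

# Unknown ordering (an assumption): [f(C(t1)), f(C(t2)), k(S(e1)), k(S(e2))]
A = np.array([
    [1, 0, 1, 0],   # Day 1, agent e1: 400 = 1000 * (1 - (c1 + s1))
    [1, 0, 0, 1],   # Day 1, agent e2: 600 = 1000 * (1 - (c1 + s2))
    [0, 1, 1, 0],   # Day 2, agent e1: 450 = 1000 * (1 - (c2 + s1))
    [0, 1, 0, 1],   # Day 2, agent e2: 670 = 1000 * (1 - (c2 + s2))
], dtype=float)
b = 1.0 - np.array([400.0, 600.0, 450.0, 670.0]) / 1000.0
solution, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(solution)  # minimum-norm estimates of the four attenuation terms
print(rank)      # rank 3: one more diverse observation pins the values down
```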
PLASMA: Diversity - the deserts of Dubai phenomenon
Cloud attenuation remains exactly the same across consecutive days (generally low likelihood)
Site attenuation remains exactly the same across agents (generally low likelihood in most areas)
Takeaway: diversity is important. The probability of there being no diversity is very low
PLASMA: The know-it-all agent
Converged agent shares values with all neighboring agents
Neighboring agents can use meaningful values to converge themselves
Takeaway: if one agent converges, all agents will converge
[Figure: example {cloud, shade} attenuations {30%, 20%}, {30%, 40%}, {30%, 0%}]
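A minimal sketch of how a shared value from a converged neighbor could be used: given the neighbor's cloud-attenuation estimate for time t, a single local observation suffices to recover the local site attenuation. Names and numbers are illustrative assumptions.

```python
def site_attenuation_from_shared(observed, max_energy, shared_cloud_att):
    # Invert observed = max_energy * (1 - (cloud_att + site_att)) for site_att,
    # using the cloud attenuation reported by a converged neighbor.
    return 1.0 - observed / max_energy - shared_cloud_att

# Example: a neighbor reports 30% cloud attenuation; the local panel observed 500 of 1000
print(site_attenuation_from_shared(500.0, 1000.0, 0.30))  # ~0.20 (20% site attenuation)
```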
Experiment I
Evaluate PLASMA in a simulated environment
2 Agents, both learning their respective local models.
For one of the agents: shaded for 4 hours
Result: PLASMA is able to accurately predict the solar radiation collected for day 3
Experiment II - Load Balancing
Benefits of PLASMA in energy-dependent load balancing (Kansal et al.)
Each agent can undertake a certain task load depending on its available energy
Agents make load balancing decisions depending on predicted energy levels for the near future
10 agents; 20 days; mean cloud attenuation of 20%
Overall utility given no storage capacity and infinite energy storage capacity
Min utility = 2; max utility = 5
-1 utility for each unaccomplished task
Result: utility can be maximized both with and without residual energy storage (compared with Kansal et al.)
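As a rough illustration of the energy-dependent load-balancing decision, the sketch below has an agent accept only as many tasks as its predicted energy covers and charges -1 utility for each accepted task it cannot complete. The per-task utility, energy cost, and function names are assumptions for illustration.

```python
def utility(accepted_tasks, predicted_energy, energy_per_task, task_utility=5.0):
    # Complete as many accepted tasks as the predicted energy covers; each completed
    # task earns `task_utility`, each accepted-but-unaccomplished task costs 1.
    completable = int(predicted_energy // energy_per_task)
    completed = min(accepted_tasks, completable)
    missed = accepted_tasks - completed
    return completed * task_utility - missed * 1.0

# Example: an agent predicting 300 units of energy (100 per task) that accepted 4 tasks
print(utility(4, 300.0, 100.0))  # 3 completed, 1 missed -> 14.0
```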
Conclusions
Developed a two-phase model-development strategy called PLASMA
Minimize the time and number of observations required in developing models post-deployment by transferring all the learning to the pre-deployment phase
It's all about the diversity (in agent observations)
Agents converge:
On the first day, if there exists a converged agent that shares meaningful observations
On the second day, if there exists an agent that shares two meaningful observations