-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Halite helpers #43
Halite helpers #43
Conversation
… for the Board class
…, and added support for exporting a board to an observation
…vor of collision simulation
Fixed blank output lines on every call Added a --out parameter for writing to a file but still getting std out in the terminal Misc style fixes as I saw them Added some unit tests for halite
|
||
return board.action | ||
@board_agent | ||
def random_agent(board): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random weights have been tweaked for readability as random is meant to be an example for users not a strong bot.
|
||
|
||
agents = {"random": random_agent} | ||
|
||
|
||
def interpreter(state, env): | ||
def populate_board(state, env): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything in the populate_board method is the same as it was before, just refactored into its own method. Reviewers can skip to the next comment.
for uid, index in list(ships.items()): | ||
del ships[uid] | ||
del obs.players[index][2][uid] | ||
def interpreter(state, env): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewers should continue reading here.
All observation update logic is now handled in the Board class, this includes applying actions, resolving collisions, gathering halite, depositing halite, and regenerating halite. Logic to eliminate players or otherwise control the meta-state of the game remains in this method.
THash = TypeVar('TComparable') | ||
|
||
|
||
def group_by(elements: Iterable[TElement], selector: Callable[[TElement], THash]) -> Dict[THash, List[TElement]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed to bring a friend in from Linq 😄
TValue = TypeVar('TValue') | ||
|
||
|
||
class ReadOnlyDict(Generic[TKey, TValue]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything from here to the start of the Board
class is mainly just data binding models. Reviewers can mostly skim these, but there are some small bits of logic like ShipAction.to_point
and Player.next_actions
Please let me know though if you have suggestions or questions on the format, structure, or purpose of the binding models. I'd generally like to reuse this pattern for future competitions if the community gets good use out of it.
|
||
|
||
class Board: | ||
def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constructor is basically just used to deserialize the raw observation and configuration received from the halite interpreter into the richer Board model that we'll make use of in the next() method down below.
|
||
@property | ||
def observation(self) -> Dict[str, Any]: | ||
"""Converts a Board back to the normalized observation that constructed it.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the observation serializer to the constructor's observation deserializer
|
||
def __deepcopy__(self, _) -> 'Board': | ||
actions = [player.next_actions for player in self.players.values()] | ||
return Board(self.observation, self.configuration, actions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the transformation to an observation copies all of the data into new data structures we can use it for deepcopy as well to isolate a copied instance from its parent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ Could someone still mutate next_actions
dictionaries though after this call? Should those be explicitly deepcopy'd as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next_actions
is a computed property constructed from the player's ships and shipyards so we don't have to worry about modification
Player 1 is letter a/A | ||
Player 2 is letter b/B | ||
etc. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious if people think this belongs in __str__
or another method like print_board
or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine in __str__
. It's easy to add an additional helper later. You could also argue that observations could be __repr__
, but I think being explicit there makes sense because __repr__
is usually just a string.
result += '|\n' | ||
return result | ||
|
||
def _add_ship(self: 'Board', ship: Ship) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undocumented as these are meant to be internal only
The current board is unmodified. | ||
This can form a halite interpreter, e.g. | ||
next_observation = Board(current_observation, configuration, actions).next.observation | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the replacement for the observation update logic in the halite interpreter.
@@ -63,6 +63,9 @@ | |||
parser.add_argument( | |||
"--host", type=str, help="http-server Host (default=127.0.0.1)." | |||
) | |||
parser.add_argument( | |||
"--out", type=str, help="Output file to write the results of the episode." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows you to write the results to an html file while still getting print statements in std out.
|
||
|
||
def test_no_move_on_halite_gathers_halite(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are lots of other scenarios we could test in here (like multiship collisions on shipyards, spawning collisions, etc) but I tried to cover most core rules with this pass.
print(buffer.getvalue()) | ||
if fallback != None: | ||
output = buffer.getvalue() | ||
if output: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change gets rid of most of those annoying blank lines in the output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏
@@ -15,6 +15,7 @@ | |||
import json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is all style and formatting updates, reviewers can likely skip this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting to see this launch soon! Looks pretty good. I left a few comments/suggestions.
If you haven't already, make sure to run pylint
and mypy
and either correct the errors given or fix them (and/or explicitly ignore ones you want to ignore with the appropriate comment) to get them to pass without warnings.
Also, I'm assuming that agents can't exploit the runtime internal variables to allow cheating that otherwise wouldn't be possible?
shipyards = me.shipyards | ||
for shipyard in shipyards: | ||
# 20% chance to spawn | ||
shipyard.next_action = choice([ShipyardAction.SPAWN, None, None, None, None]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the weighted random.choices might be clearer here.
|
||
actions = [agent.action for agent in state] | ||
board = Board(obs, config, actions) | ||
board = board.next() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and concise 👏
# region Helper Classes and Methods | ||
Point = NewType('Point', Tuple[int, int]) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could potentially be useful to have a comment and/or ASCII art to show the coordinate system.
See index_to_position for the inverse. | ||
""" | ||
x, y = point | ||
return (size - y - 1) * size + x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ This seems like a change in coordinate system from before? Did you intentionally want (0,0)
to be at (size - 1) * size
? Is that clearer than it being at 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This puts 0 in the lower left corner and (size - 1, size - 1) in the upper right which I personally found a lot more intuitive, curious to get others' thoughts on this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I'm fully understanding, but my personal take is that (0, 0)
as top left makes more sense because that's how 2d arrays work, and in particular the format used in the replay files. I see how your way is intuitive relative to normal XY plane but IMO it's better to match the programming methodology. But just a personal take and IIUC a lot / all of this is internal and the higher-level structure here means users aren't necessarily forced into either paradigm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dster2 Yeah users would only even notice the coordinate system if they were constructing offsets by hand. If they only used NORTH/SOUTH/EAST/WEST they'd likely never even notice there are coordinates. I'll consider this a bit more, thanks for the feedback all.
if isinstance(data, dict): | ||
self._data = data | ||
else: | ||
# If it's not a Dict it must be a ReadOnlyDict based on our type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✋ Note that they are type hints and not actually enforced by the runtime. You might want to explicitly check and raise
explicitly if not true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal here wasn't to validate that it must be a readonlydict but that if the user passed in the correct type and it's not a dict then it must be a readonlydict.
|
||
player._halite += leftover_convert_halite | ||
|
||
def resolve_collision(ships: List[Ship]) -> Tuple[Optional[Ship], List[Ship]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods are specific to the game's rules and could be staticmethod
? I wonder if it's worth factoring out to a separate class
for clarity?
board._delete_shipyard(shipyard) | ||
board._delete_ship(ship) | ||
|
||
# Collect halite from cells into ships |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ Just to confirm, if a smaller ship collides with a bigger ship, it should also collect in that same turn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll double check the rules to make sure they're consistent with this behavior but I believe so
delta_halite = int(cell.halite * configuration.collect_rate) | ||
if cell.shipyard_id is None and delta_halite > 0: | ||
ship._halite += delta_halite | ||
cell._halite -= delta_halite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want another assert cell._halite > 0
after this? Just to confirm, a cell cannot run out of halite?
def agent_wrapper(obs, config) -> Dict[str, str]: | ||
board = Board(obs, config) | ||
agent(board) | ||
return board.current_player.next_actions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am assuming the JSON type/domain checking is already verifying these actions are valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These get filtered out and ignored in the board constructor if they're not valid
print(buffer.getvalue()) | ||
if fallback != None: | ||
output = buffer.getvalue() | ||
if output: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏
@moserware Re: internal only methods and exploits, no bots can't use them unless users are running untrusted code locally (which we ultimately can't protect against). In production we isolate agents in different containers so there's no risk of them being able to call each others' methods. |
Human rendering was not working anymore since Kaggle#43
REVIEWERS' NOTE: This PR is kind of a monster, apologies in advance. I've annotated each file with some commentary about what's worth reading and what's worth skipping, large parts of the PR are boring refactors on which I welcome comments but are not likely the best use of your time.
Feel free to skip any parts you don't feel are relevant to your interests. We are on a relatively tight timeline to get this out before Halite so if there are any larger pieces of feedback I'll likely save those for a future iteration. If you don't have time to read through this, then no worries and feel free to disregard the PR.
This PR provides wrapper classes for the halite environment and state as well as a convenient hook for the interpreter. This also reimplements the bulk of the original halite interpreter in the new board class, straightens out some logic inconsistencies in corner cases of the old implementation, and hopefully makes it a bit easier to parse what's happening in the code.
I tested the interpreter reimplementation by writing unit tests and by running actual episodes using the agents submitted to the shared repo. I can't validate that behavior is exactly the same but I can at least confirm that everyone's agents played complete and reasonably strategic-looking games that seemed similar or identical to the current implementation.