
Creation of the world: an experience of creating intelligent life with your own hands

Sometimes you spend the whole day trying to explain to the chief accountant, without using the terms "recursive call" and "idiots", why a simple change to the accounting system is actually delayed by almost a week because of a spelling mistake someone made in the code back in 2009. On days like that, I want to wrest the reins from the hands of the wise man who created this world and rewrite everything from scratch.


TL;DR
Under the cut is the story of how, as practice for learning Python, I am developing my own library for agent-based modeling with machine learning and gods.
Link to github. To work out of the box you need pygame; sklearn is required for the introductory example.

The origin of the idea


The idea to build the bicycle I am about to describe came about gradually. First, the popularity of machine learning did not pass me by. Several courses on Coursera gave me a deceptive sense of belonging. Several open contests and registering on kaggle adjusted my self-esteem somewhat, but did not cool my enthusiasm.

Secondly, as a representative of the untouchable caste of the domestic IT community, I rarely get the chance to practice my beloved Python. And smart people told me that a pet project is just the thing for that.

But the final push was my disappointment in No Man's Sky. Technically a clever idea, but the procedurally generated world turned out empty. And like any disappointed fan, I started thinking about what I would do if anyone asked me. And I came up with this: the world feels empty because there is actually very little intelligent life in it. The endless expanses, the habit of relying only on yourself, the joy of the discoverer: all of that is fine, of course. But there is no opportunity to return to base, look around the market, catch up on the latest gossip in the diner. To deliver a parcel and get your 100 gold for it, after all. Clearly, every city, dialogue, and quest in games is the fruit of a living person's labor, and it is impossible to populate such a huge world by human effort alone. But what if we could procedurally generate NPCs too, with their needs, little stories, and quests?

Plan in general


This is how the idea of a library appeared, or even, if you like, a framework, with the following usage scenarios:

  1. Classical agent-based modeling (whose existence I learned about only when I sat down to write this article). We create a world, describe the actions of agents in it, look at what happens, change some parameters, and run the simulation again. And so on in a circle, until we figure out how changes in the actions of individual agents affect the overall picture. A very useful thing.

  2. Reinforcement learning. Building models that learn to interact with a specific environment. A simple example is learning to play a game whose rules you do not know, but at any moment you can get information about the state of the match, choose one of a fixed set of actions, and see how it affected the number of points you earned (a competition on this topic has, however, already ended). There are many differences from the usual classification or regression models: possibly delayed outcomes, the need for planning, and many other features.

  3. And finally, after we create a world and populate it with creatures both reasonable and not so much, it would be nice to be able to go there in person, grabbing a trusty blaster, a favorite sword, a multi-functional pickaxe, or a red nail puller.
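The interaction loop from scenario 2 can be sketched in a dozen lines. Everything below is a toy stand-in written for this article (a two-button "game" with hidden payoffs and an epsilon-greedy agent), not part of the library:

```python
import random

class HiddenGame:
    """A 'game' whose rules the agent does not know: two actions,
    one of which pays off more often. The agent only sees its score."""
    def __init__(self):
        self._payoff = {0: 0.3, 1: 0.7}  # hidden from the agent
        self.score = 0

    def act(self, action):
        if random.random() < self._payoff[action]:
            self.score += 1
        return self.score

def play(epochs=2000, epsilon=0.1, seed=42):
    random.seed(seed)
    game = HiddenGame()
    totals = {0: 0.0, 1: 0.0}   # reward accumulated per action
    counts = {0: 0, 1: 0}       # how often each action was tried
    last_score = 0
    for _ in range(epochs):
        # explore sometimes, otherwise exploit the best-looking action
        if random.random() < epsilon or not all(counts.values()):
            action = random.choice([0, 1])
        else:
            action = max(totals, key=lambda a: totals[a] / counts[a])
        new_score = game.act(action)
        reward = new_score - last_score  # the only feedback we get
        last_score = new_score
        totals[action] += reward
        counts[action] += 1
    return counts

counts = play()
# the agent ends up preferring the better-paying action 1
```

The agent never looks inside `_payoff`; it learns which button is better purely from the change in its score, which is the whole point of the scenario.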

A few technical details


So, first we need to decide on the low-level physics of our world. It should be simple, but flexible enough to simulate different situations:

  1. As a basis we take the usual cellular automaton: a two-dimensional rectangular discrete world, in which each object occupies one square cell with a side of one Planck length. Distances shorter than the Planck length make no sense: you cannot place an object between two cells, and you cannot arrange it so that it occupies more than one cell, even partially.

  2. We will measure distance in steps, and steps are possible in only four directions; that is, each cell will have 4 neighbors, not 8. Moving to a diagonal neighbor takes 2 steps.

  3. To dilute the solidity of the resulting construction a little, let's add some depth: each object will have a passability attribute. At the same spatial coordinates in the world there can be at least one, but no more than two objects: a passable one and/or an impassable one. You can think of it as a surface on which things stand and over which things move. The types of surfaces differ, and so do the types of objects. You can put a nightstand (an impassable object) on a carpet (a passable object). But you cannot lay linoleum on laminate (who does that anyway?), and you cannot put a chair on a cabinet.

  4. But inside the nightstand, different items can be stored. And in the carpet they can be too, and in the pockets of active objects as well. That is, any object can be a container for items, but not for other objects, otherwise we would violate point three.

  5. Time also passes discretely. Each turn, every object lives through one Planck time, during which it can receive information about the surrounding world as of the current epoch. For now this is the weakest point: objects have to act in turn, which produces a certain desynchronization. Objects whose "move" comes later have to take into account the state of objects that have already "moved" in this epoch. If we allowed objects to rely only on the state at the beginning of the epoch, then two impassable objects could, for example, occupy the same free cell, or pull the same sock out of the dresser. This can be smoothed over a little by polling the objects in a random order each epoch, but that approach does not solve the whole problem.
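The distance rule from point 2 is simply the Manhattan metric; a small sketch (the function names here are mine, for illustration, not the library's API):

```python
def steps_between(x1, y1, x2, y2):
    """Number of moves between two cells when only the four
    cardinal directions are allowed (Manhattan distance)."""
    return abs(x1 - x2) + abs(y1 - y2)

def neighbours(x, y):
    """The 4 cells adjacent to (x, y); diagonals are not neighbours."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

print(steps_between(0, 0, 1, 1))  # a diagonal neighbour is 2 steps away
```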

This gives us a few necessary basic classes: the world itself (Field), an object of this world (Entity), and an item (Substance). Here and below, the code in the article is just an illustration; you can see it in full in the library on github.

Entity classes with examples
class Entity(object):
    def __init__(self):
        # home universe
        self.board = None
        # time-space coordinates
        self.x = None
        self.y = None
        self.z = None
        # lifecycle properties
        self.age = 0
        self.alive = False
        self.time_of_death = None
        # common properties
        self.passable = False
        self.scenery = True
        self._container = []
        # visualization properties
        self.color = None

    def contains(self, substance_type):
        for element in self._container:
            if type(element) == substance_type:
                return True
        return False

    def live(self):
        self.z += 1
        self.age += 1


class Blank(Entity):
    def __init__(self):
        super(Blank, self).__init__()
        self.passable = True
        self.color = "#004400"

    def live(self):
        super(Blank, self).live()
        if random.random() <= 0.0004:
            self._container.append(substances.Substance())
        if len(self._container) > 0:
            self.color = "#224444"
        else:
            self.color = "#004400"


class Block(Entity):
    def __init__(self):
        super(Block, self).__init__()
        self.passable = False
        self.color = "#000000"


Field class
class Field(object):
    def __init__(self, length, height):
        self.__length = length
        self.__height = height
        self.__field = []
        self.__epoch = 0
        self.pause = False
        for y in range(self.__height):
            row = []
            self.__field.append(row)
            for x in range(self.__length):
                if y == 0 or x == 0 or y == (height - 1) or x == (length - 1):
                    init_object = Block()
                else:
                    init_object = Blank()
                init_object.x = x
                init_object.y = y
                init_object.z = 0
                row.append([init_object])


The Substance class makes no sense to describe: there is nothing in it yet.

Time will be driven by the world itself. Each epoch, it will poll all the objects in it and make them take their move. How they make that move is their own business:

Time forward!
class Field(object):
    ...
    def make_time(self):
        if self.pause:
            return
        for y in range(self.height):
            for x in range(self.length):
                for element in self.__field[y][x]:
                    if element.z == self.epoch:
                        element.live()
        self.__epoch += 1
    ...


But what do we need a world for, especially one with the planned option of placing a protagonist in it, if we cannot see it? On the other hand, if you start fiddling with graphics, you can get distracted, and ruling the world will be postponed indefinitely. Therefore, without wasting time, we work through this wonderful article about writing a platformer with pygame (in fact, we only need the first third of it), give each object a color attribute, and now we already have some sort of map.

Visualization code
class Field(object):
    ...
    def list_obj_representation(self):
        representation = []
        for y in range(self.height):
            row_list = []
            for cell in self.__field[y]:
                row_list.append(cell[-1])
            representation.append(row_list)
        return representation
    ...

def visualize(field):
    pygame.init()
    screen = pygame.display.set_mode(DISPLAY)
    pygame.display.set_caption("Field game")
    bg = Surface((WIN_WIDTH, WIN_HEIGHT))
    bg.fill(Color(BACKGROUND_COLOR))
    myfont = pygame.font.SysFont("monospace", 15)
    f = field
    tick = 10
    timer = pygame.time.Clock()
    go_on = True
    while go_on:
        timer.tick(tick)
        for e in pygame.event.get():
            if e.type == QUIT:
                raise SystemExit("QUIT")
            if e.type == pygame.KEYDOWN:
                if e.key == pygame.K_SPACE:
                    f.pause = not f.pause
                elif e.key == pygame.K_UP:
                    tick += 10
                elif e.key == pygame.K_DOWN and tick >= 11:
                    tick -= 10
                elif e.key == pygame.K_ESCAPE:
                    go_on = False
        screen.blit(bg, (0, 0))
        f.integrity_check()
        f.make_time()
        level = f.list_obj_representation()
        label = myfont.render("Epoch: {0}".format(f.epoch), 1, (255, 255, 0))
        screen.blit(label, (630, 10))
        stats = f.get_stats()
        for i, element in enumerate(stats):
            label = myfont.render("{0}: {1}".format(element, stats[element]), 1, (255, 255, 0))
            screen.blit(label, (630, 25 + (i * 15)))
        x = y = 0
        for row in level:
            for element in row:
                pf = Surface((PLATFORM_WIDTH, PLATFORM_HEIGHT))
                pf.fill(Color(element.color))
                screen.blit(pf, (x, y))
                x += PLATFORM_WIDTH
            y += PLATFORM_HEIGHT
            x = 0
        pygame.display.update()


Of course, later on it will be possible to write a somewhat more intelligible visualization module, and more than one. But for now the colorful running squares are enough to immerse yourself in the atmosphere of emerging life. Besides, it develops the imagination.


Now we need to think about how active agents will act. First, all significant actions will be objects (Python objects, not objects of the world, sorry for the ambiguity). That way we can keep history, manipulate their state, and distinguish one action from another, even if they are of the same type. So actions will look like this:

  1. Every action must have a subject. Only an object of our world (Entity) can be the subject of an action.

  2. Every action must have results. At a minimum, "completed / not completed" and "goal achieved / goal not achieved". But there may be additional ones, depending on the type of action: for example, the action "find the nearest pizzeria" may have, in addition to the required ones, the coordinates or the pizzeria object itself among its results.

  3. An action may or may not have a set of parameters. For example, the action "pour myself a cup of coffee" may have no parameters, since it requires no clarification, while for the action "pour" you need to be able to specify what to pour and where.

  4. An action may be instantaneous or not. During one epoch, an object can perform at most one non-instantaneous action and any number of instantaneous ones. This is a controversial point: if space is discrete and we cannot move half a cell, the ability to perform an unlimited number of actions within one epoch looks strange and blurs the clean discrete flow of time. There was also the idea of giving each type of action a duration from 0 to 1, where an action of duration 1 takes up the whole epoch. For now I have settled on the version with the instantaneousness flag, since to keep time properly discrete, all actions essential to the simulation can always be made non-instantaneous, while the duration option makes everything too complicated.

Thus, from a technical point of view, an action object (Action) is a kind of function object: you can set its parameters, execute it, and get a result, and it stores within itself the parameters passed to it, the result, and everything connected with its execution, from who invoked it to the state of the surrounding world at the time. So we can create it at one moment, set its parameters at another, execute it at a third, and then take the return value and put it on the shelf for later analysis.

Action object
class Action(object):
    def __init__(self, subject):
        self.subject = subject
        self.accomplished = False
        self._done = False
        self.instant = False

    def get_objective(self):
        return {}

    def set_objective(self, control=False, **kwargs):
        valid_objectives = self.get_objective().keys()
        for key in kwargs.keys():
            if key not in valid_objectives:
                if control:
                    raise ValueError("{0} is not a valid objective".format(key))
            else:
                setattr(self, "_{0}".format(key), kwargs[key])

    def action_possible(self):
        return True

    def do(self):
        self.check_set_results()
        self._done = True

    def check_set_results(self):
        self.accomplished = True

    @property
    def results(self):
        out = {"done": self._done, "accomplished": self.accomplished}
        return out

    def do_results(self):
        self.do()
        return self.results


If anyone besides me suddenly wants to build a cozy little world with this library, the assumption is that out of the box it will contain a set of essential low-level actions: go to coordinates, follow an object, find a particular object or item, pick up an item, and so on. These actions can be used on their own or combined to produce more complicated manipulations. An example of such a complex action comes later, in the description of the first experiment.

Secondly, every self-respecting active agent should be able to plan its actions. So we divide the phase of its activity within an epoch into two stages: planning and acting. The planning tool will be a simple queue of actions to be performed in order. And if we already have a plan, there is nothing to ponder over again; we must act quickly and decisively. It works out like this: at the beginning of its move, the active object decides whether it needs to plan this turn (to start with, we assume it does when the action queue is empty), then plans if it decided that was necessary, and finally performs actions. Whether planning, as a serious process that brooks no haste, should take up a whole turn is a debatable question. For my own purposes I settled on no: my agents do not deliberate for long and start executing the plan on the same turn.

Planning and action
class Agent(Entity):
    ...
    def live(self):
        ...
        if self.need_to_update_plan():
            self.plan()
        if len(self.action_queue) > 0:
            current_action = self.action_queue[0]
            self.perform_action(current_action)
        while len(self.action_queue) > 0 and self.action_queue[0].instant:
            current_action = self.action_queue[0]
            self.perform_action(current_action)

    def need_to_update_plan(self):
        return len(self.action_queue) == 0

    def perform_action(self, action):
        results = action.do_results()
        if results["done"] or not action.action_possible():
            self.action_log.append(self.action_queue.pop(0))
        return results
    ...


In addition, it seemed convenient to me to introduce such an entity as a state of an object, which could influence its actions. After all, an agent can be tired, out of sorts, soaked, poisoned, or, on the contrary, cheerful and full of energy. Sometimes several of these at once. So we add to our objects an array of states, each of which will affect the object at the beginning of an epoch.

State code
class State(object):
    def __init__(self, subject):
        self.subject = subject
        self.duration = 0

    def affect(self):
        self.duration += 1


class Entity(object):
    def __init__(self):
        ...
        self._states_list = []
        ...
    ...
    def get_affected(self):
        for state in self._states_list:
            state.affect()

    def live(self):
        self.get_affected()
        self.z += 1
        self.age += 1
    ...


For modeling and training, we need a way to evaluate how well we wrote an action algorithm or chose a training model. To do this, we add a simple simulation-and-evaluation module, with the ability to specify how to detect the end of a simulation and how to collect the results.

Like this
import copy

def run_simulation(initial_field, check_stop_function, score_function, times=5, verbose=False):
    list_results = []
    for iteration in range(times):
        field = copy.deepcopy(initial_field)
        while not check_stop_function(field):
            field.make_time()
        current_score = score_function(field)
        list_results.append(current_score)
        if verbose:
            print("Iteration: {0} Score: {1}".format(iteration + 1, current_score))
    return list_results


At this stage, everything is, in principle, ready to close the first usage scenario of our library: modeling, when we do not want to train agents but want to write the logic of their actions ourselves. The procedure in this case is as follows:

  1. We decide which static objects we want to see in the world: walls, mountains, furniture, surface types, etc. We describe them by inheriting from the Entity class. We do the same for items with the Substance class.
  2. We create a world of the right size and fill it with a landscape of these objects and items.
  3. We inherit from the Action class and describe all the actions we need. We do the same with the State class and states, if our simulation needs them.
  4. We create a class for our agents by inheriting from Agent. We add service functions to it and describe the planning process.
  5. We populate our world with active agents.
  6. To debug the actions and enjoy contemplating our creation, we can run the visualization.
  7. And finally, having played enough with the visualization, we start the simulation and evaluate how well the agents we created play by the rules we invented in the world we built.
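To make these steps concrete, here is a deliberately tiny, self-contained imitation of the whole cycle. The Field and agent below are toy stand-ins written for this example, not the library's real classes (those are on github):

```python
import random

class Wanderer:  # steps 3-4: an agent with a trivial plan
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.steps_made = 0

    def live(self, field):
        # plan: pick a random free neighbouring cell and go there
        moves = [(self.x + dx, self.y + dy)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        free = [(x, y) for x, y in moves if field.is_free(x, y)]
        if free:
            self.x, self.y = random.choice(free)
            self.steps_made += 1

class Field:  # steps 1-2: the world (scenery omitted for brevity)
    def __init__(self, length, height):
        self.length, self.height = length, height
        self.agents = []
        self.epoch = 0

    def populate(self, agent_cls, n):  # step 5
        for _ in range(n):
            self.agents.append(agent_cls(random.randrange(self.length),
                                         random.randrange(self.height)))

    def is_free(self, x, y):
        inside = 0 <= x < self.length and 0 <= y < self.height
        return inside and all((a.x, a.y) != (x, y) for a in self.agents)

    def make_time(self):  # step 7: run the simulation
        random.shuffle(self.agents)  # soften the turn-order desync
        for agent in self.agents:
            agent.live(self)
        self.epoch += 1

random.seed(0)
world = Field(10, 10)
world.populate(Wanderer, 5)
for _ in range(100):
    world.make_time()
```

Step 6 (visualization) is skipped here; in the library it is the pygame loop shown earlier.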

Proof of concept I


So, we announce the conditions of the first experiment.


Our task will be to write a planning procedure for our creatures such that the population reproduces as quickly as possible.

As thanks to those who have read this far, I will not tire you with long listings; I will show only two things:
A variant implementation of a composite action, using mating as the example
class GoMating(Action):
    def __init__(self, subject):
        super(GoMating, self).__init__(subject)
        self.search_action = SearchMatingPartner(subject)
        self.move_action = MovementToEntity(subject)
        self.mate_action = Mate(subject)
        self.current_action = self.search_action

    def action_possible(self):
        if not self.current_action:
            return False
        return self.current_action.action_possible()

    def do(self):
        if self.subject.has_state(states.NotTheRightMood):
            self._done = True
            return
        if self.results["done"]:
            return
        if not self.action_possible():
            self._done = True
            return
        first = True
        while first or (self.current_action and self.current_action.instant) and not self.results["done"]:
            first = False
            current_results = self.current_action.do_results()
            if current_results["done"]:
                if current_results["accomplished"]:
                    if isinstance(self.current_action, SearchMatingPartner):
                        if current_results["accomplished"]:
                            self.current_action = self.move_action
                            self.current_action.set_objective(**{"target_entity": current_results["partner"]})
                    elif isinstance(self.current_action, MovementXY):
                        self.current_action = self.mate_action
                        self.current_action.set_objective(**{"target_entity": self.search_action.results["partner"]})
                    elif isinstance(self.current_action, Mate):
                        self.current_action = None
                        self.accomplished = True
                        self._done = True
                else:
                    self.current_action = None
                    self._done = True
            else:
                break

    def check_set_results(self):
        self.accomplished = self._done


And the planning variant with which I decided that the model works
class Creature(Agent):
    ...
    def plan(self):
        nearest_partner = actions.SearchMatingPartner(self).do_results()["partner"]
        if nearest_partner is None:
            chosen_action = actions.HarvestSubstance(self)
            chosen_action.set_objective(**{"target_substance_type": type(substances.Substance())})
            self.queue_action(chosen_action)
        else:
            self_has_substance = self.count_substance_of_type(substances.Substance)
            partner_has_substance = nearest_partner.count_substance_of_type(substances.Substance)
            if partner_has_substance - self_has_substance > 2:
                self.queue_action(actions.GoMating(self))
            else:
                chosen_action = actions.HarvestSubstance(self)
                chosen_action.set_objective(**{"target_substance_type": type(substances.Substance())})
                self.queue_action(chosen_action)
    ...


About machine learning and gods


Having made sure that simple modeling works, let's raise the degree of fun and add machine learning. At the time of writing, not all the planned capabilities are implemented; however, I promised to tell you about the gods.

But first we need to decide how we want to train our creatures. Take the same task of resource gathering and mating. If we solved it the traditional way, we would first have to settle on a set of features on which we plan to base decisions. Then, acting randomly or by some heuristic, collect and save the training and test datasets. Then train a couple of models on these datasets, compare them, and choose the best one. Finally, rewrite the planning procedure to use this model, run the simulation, and see what happens. And then we would think of a new feature, which means reassembling the data, retraining the models, comparing them again, and rerunning everything to see, once more, what happens.

And what would we want ideally? Ideally, we would define a set of features, configure a training model, and launch a simulation that would itself assemble the datasets, train the model, plug it into the planning process, and hand us the ready results of several runs, which we could compare with the results of other models or other feature sets.
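To make the idea tangible, here is a sketch of that loop with a tiny hand-rolled perceptron standing in for the model (in the library itself this role is played by sklearn's SGDClassifier); the "world" here is fake and secretly rewards positive feature values:

```python
import random

class OnlineModel:
    """Tiny stand-in for an sklearn-style online classifier:
    predict on demand, retrain incrementally on each batch."""
    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, features):
        score = self.b + sum(w * f for w, f in zip(self.w, features))
        return score > 0

    def partial_fit(self, rows):
        # rows: [feature_1, ..., feature_n, outcome] -- the same shape
        # the creatures' memory returns as its table
        for row in rows:
            features, outcome = row[:-1], row[-1]
            error = (1.0 if outcome else -1.0) - (1.0 if self.predict(features) else -1.0)
            self.b += 0.1 * error
            self.w = [w + 0.1 * error * f for w, f in zip(self.w, features)]

# the loop: act, remember, retrain on every full batch, act on predictions
random.seed(1)
model = OnlineModel(n_features=1)
memory = []
for epoch in range(500):
    feature = random.uniform(-1, 1)
    outcome = feature > 0        # the 'world' rewards positive features
    memory.append([feature, outcome])
    if len(memory) >= 20:        # flush a batch into the model
        model.partial_fit(memory)
        memory = []
```

The point is that dataset collection, training, and use in planning all happen inside one running simulation, with no manual export/import steps in between.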

And that's how I imagine it:


We need a few new objects here. First, creatures should have some kind of memory into which they will add their datasets. It should be able to memorize a set of features; to separately attach to them the result of the decision made with that set of features; and to return it all to us in a convenient form. And, of course, to forget everything it has been taught, just like after high school.

Secrets of memory
class LearningMemory(object):
    def __init__(self, host):
        self.host = host
        self.memories = {}

    def save_state(self, state, action):
        self.memories[action] = {"state": state}

    def save_results(self, results, action):
        if action in self.memories:
            self.memories[action]["results"] = results

    def make_table(self, action_type):
        table_list = []
        for memory in self.memories:
            if isinstance(memory, action_type):
                if "state" not in self.memories[memory] or "results" not in self.memories[memory]:
                    continue
                row = self.memories[memory]["state"][:]
                row.append(self.memories[memory]["results"])
                table_list.append(row)
        return table_list

    def obliviate(self):
        self.memories = {}


Secondly, we need to teach agents to accept memorization tasks, and to remember their environment and the results of their actions.

Receiving tasks
class Agent(Entity):
    def __init__(self):
        ...
        self.memorize_tasks = {}
        ...
    ...
    def set_memorize_task(self, action_types, features_list, target):
        if isinstance(action_types, list):
            for action_type in action_types:
                self.memorize_tasks[action_type] = {"features": features_list, "target": target}
        else:
            self.memorize_tasks[action_types] = {"features": features_list, "target": target}

    def get_features(self, action_type):
        if action_type not in self.memorize_tasks:
            return None
        features_list_raw = self.memorize_tasks[action_type]["features"]
        features_list = []
        for feature_raw in features_list_raw:
            if isinstance(feature_raw, dict):
                if "kwargs" in feature_raw:
                    features_list.append(feature_raw["func"](**feature_raw["kwargs"]))
                else:
                    features_list.append(feature_raw["func"]())
            elif callable(feature_raw):
                features_list.append(feature_raw())
            else:
                features_list.append(feature_raw)
        return features_list

    def get_target(self, action_type):
        if action_type not in self.memorize_tasks:
            return None
        target_raw = self.memorize_tasks[action_type]["target"]
        if callable(target_raw):
            return target_raw()
        elif isinstance(target_raw, dict):
            if "kwargs" in target_raw:
                return target_raw["func"](**target_raw["kwargs"])
            else:
                return target_raw["func"]()
        else:
            return target_raw

    def queue_action(self, action):
        if type(action) in self.memorize_tasks:
            self.private_learning_memory.save_state(self.get_features(type(action)), action)
            self.public_memory.save_state(self.get_features(type(action)), action)
        self.action_queue.append(action)

    def perform_action_save_memory(self, action):
        self.chosen_action = action
        if type(action) in self.memorize_tasks:
            results = self.perform_action(action)
            if results["done"]:
                self.private_learning_memory.save_results(self.get_target(type(action)), action)
                self.public_memory.save_results(self.get_target(type(action)), action)
        else:
            results = self.perform_action(action)
    ...


And who is going to hand out these tasks, pool the collected memory, and train the model? An individual creature lives too little and sees too little to gather a decent dataset on its own. It would be good to have someone standing above the world, eternal and all-seeing, who could share one public memory and one decision model among all the creatures at once. In short, a god.

And this is where the Demiurge comes in. Every object placed into the world through insert_object passes through his hands first: he can set the newcomer up however he likes, or refuse its creation altogether. Something like this:

class Demiurge(object):
    def handle_creation(self, creation, refuse):
        pass


class Field(object):
    def __init__(self, length, height):
        ...
        self.demiurge = None
        ...

    def insert_object(self, x, y, entity_object, epoch_shift=0):
        if self.demiurge is not None:
            refuse = False
            self.demiurge.handle_creation(entity_object, refuse)
            if refuse:
                return
        assert x < self.length
        assert y < self.height
        self.__field[y][x][-1] = entity_object
        entity_object.z = self.epoch + epoch_shift
        entity_object.board = self
        entity_object.x = x
        entity_object.y = y
    ...


What exactly a demiurge does with this power is up to him. He can hand every newborn creature a shared public memory and a shared decision model alongside the private ones. He can prescribe which features to remember for which actions and which target value to attach to them. He can even substitute his own planning procedure for the creature's. In other words, he takes over everything that in the "traditional" workflow we would have done by hand between simulation runs.

Proof of concept II


For the second experiment we take the same world and the same task: the creatures must multiply as fast as possible. But this time we will not write the planning logic by hand. Instead, we will put a deity in charge, who will give the creatures a shared memory and a shared model, define the features, and let the model decide when to go mating and when to go gather resources.

The conditions, then, are these:


What happened
# Create deity
class Priapus(field.Demiurge):
    def __init__(self):
        self.public_memory = brain.LearningMemory(self)
        self.public_decision_model = SGDClassifier(warm_start=True)

    def handle_creation(self, creation, refuse):
        if isinstance(creation, entities.Creature):
            creation.public_memory = self.public_memory
            creation.public_decision_model = self.public_decision_model
            creation.memory_type = "public"
            creation.model_type = "public"
            creation.memory_batch_size = 20
            if creation.sex:
                def difference_in_num_substance(entity):
                    nearest_partner = actions.SearchMatingPartner(entity).do_results()["partner"]
                    if nearest_partner is None:
                        return 9e10
                    else:
                        self_has_substance = entity.count_substance_of_type(substances.Substance)
                        partner_has_substance = nearest_partner.count_substance_of_type(substances.Substance)
                        return partner_has_substance - self_has_substance

                def possible_partners_exist(entity):
                    find_partner = actions.SearchMatingPartner(entity)
                    search_results = find_partner.do_results()
                    return float(search_results["accomplished"])

                features = [{"func": lambda creation: float(creation.has_state(states.NotTheRightMood)),
                             "kwargs": {"creation": creation}},
                            {"func": difference_in_num_substance, "kwargs": {"entity": creation}},
                            {"func": possible_partners_exist, "kwargs": {"entity": creation}}]
                creation.set_memorize_task(actions.GoMating, features,
                                           {"func": lambda creation: creation.chosen_action.results["accomplished"],
                                            "kwargs": {"creation": creation}})

            def plan(creature):
                if creature.sex:
                    try:
                        # raise NotFittedError
                        current_features = creature.get_features(actions.GoMating)
                        current_features = np.asarray(current_features).reshape(1, -1)
                        if creature.public_decision_model.predict(current_features):
                            go_mating = actions.GoMating(creature)
                            creature.queue_action(go_mating)
                            return
                        else:
                            harvest_substance = actions.HarvestSubstance(creature)
                            harvest_substance.set_objective(
                                **{"target_substance_type": type(substances.Substance())})
                            creature.queue_action(harvest_substance)
                            return
                    except NotFittedError:
                        chosen_action = random.choice(
                            [actions.GoMating(creature), actions.HarvestSubstance(creature)])
                        if isinstance(chosen_action, actions.HarvestSubstance):
                            chosen_action.set_objective(
                                **{"target_substance_type": type(substances.Substance())})
                        creature.queue_action(chosen_action)
                        return
                else:
                    harvest_substance = actions.HarvestSubstance(creature)
                    harvest_substance.set_objective(**{"target_substance_type": type(substances.Substance())})
                    creature.queue_action(harvest_substance)

            creation.plan_callable = plan


universe = field.Field(60, 40)    # Create sample universe (length, height)
universe.set_demiurge(Priapus())  # Assign deity to universe

# Fill universe with blanks, blocks, other scenery if necessary
for y in range(10, 30):
    universe.insert_object(20, y, field.Block())
for x in range(21, 40):
    universe.insert_object(x, 10, field.Block())
for y in range(10, 30):
    universe.insert_object(40, y, field.Block())

universe.populate(entities.Creature, 20)  # Populate universe with creatures

def check_stop_function(field):
    return field.epoch >= 500

def score_function(field):
    stats = field.get_stats()
    if "Creature" not in stats:
        return 0
    else:
        return stats["Creature"]

res = modelling.run_simulation(universe, check_stop_function, score_function, verbose=True, times=30)
print(res)
print(np.asarray(res).mean())


Conclusion


That, for now, is all. The library is, of course, still more of a sketch than a finished product: of the three scenarios announced at the beginning, only the first is properly closed, and the second only partially. The plans are roughly these:


To finish proper support for reinforcement learning, with delayed rewards and real planning. To deal with the desynchronization of moves within an epoch. To write a more intelligible visualization module, and more than one. And, eventually, to get to the third scenario: entering the created world in person, with a faithful blaster or a red nail puller in hand.

Once again, the link to github. Out of the box you will need pygame, and sklearn for the introductory example.

Source: https://habr.com/ru/post/315424/

