博弈论Python仿真(二)
Link to the previous part: Game Theory Python Simulation (1).
1. Agenda
1. Prisoner's dilemma game
2. When a finite number of games is played
3. When an infinite number of games is played
4. Payoff matrix in the two cases
5. Game visualization using sparklines
2. Axelrod
A Python library with the following principles and goals:
1、Enabling the reproduction of previous Iterated Prisoner's Dilemma research as easily as possible.
2、Creating the de-facto tool for future Iterated Prisoner's Dilemma research.
3、Providing as simple a means as possible for anyone to define and contribute new and original Iterated Prisoner's Dilemma strategies.
4、Emphasizing readability along with an open and welcoming community that is accommodating for developers and researchers of a variety of skill levels.
In short, this library makes it easy to reproduce Iterated Prisoner's Dilemma research and ships with ready-made strategies.
GitHub repository: https://github.com/Axelrod-Python/Axelrod
Page on cnpython (Chinese Python community): https://www.cnpython.com/pypi/axelrod
Requires Python 3.5 or later.

```shell
pip install axelrod
```

3. Prisoner's Dilemma Payoff Matrix
| | Prisoner B: Cooperate (Deny) | Prisoner B: Defect (Confess) |
| --- | --- | --- |
| Prisoner A: Cooperate (Deny) | 3, 3 | 0, 5 |
| Prisoner A: Defect (Confess) | 5, 0 | 1, 1 |
As we saw in the previous part, each prisoner has only two strategies: Confess and Deny. The table above differs from the one in Part 1 in that it is a payoff matrix, and it is the default payoff matrix used by axelrod.
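Before reaching for a solver, we can sanity-check this matrix by brute force. The sketch below (plain Python, independent of any library; the names are mine) enumerates the row player's best responses:

```python
# Default Prisoner's Dilemma payoffs: (row player's, column player's).
PAYOFF = {
    ("Deny", "Deny"): (3, 3),
    ("Deny", "Confess"): (0, 5),
    ("Confess", "Deny"): (5, 0),
    ("Confess", "Confess"): (1, 1),
}

def best_response(opponent_move):
    """The row player's best reply to a fixed opponent move."""
    return max(("Deny", "Confess"), key=lambda m: PAYOFF[(m, opponent_move)][0])

# Confess is the best reply to either opponent move, i.e. a dominant strategy.
print(best_response("Deny"), best_response("Confess"))  # Confess Confess
```

By symmetry the same holds for the column player, which is why mutual Confess is the equilibrium.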
Using the game theory from the previous part, we can verify this with a Python simulation (or by manual enumeration):
```python
import nashpy
import numpy

a = numpy.array([[3, 0], [5, 1]])
b = numpy.array([[3, 5], [0, 1]])
rps = nashpy.Game(a, b)
print(rps)

equilibrium = rps.support_enumeration()
for eq in equilibrium:
    print(eq)
```

which gives:
```
Bi matrix game with payoff matrices:

Row player:
[[3 0]
 [5 1]]

Column player:
[[3 5]
 [0 1]]
(array([0., 1.]), array([0., 1.]))
```

The Nash equilibrium is (Confess, Confess): each player confesses with probability 1. Yet this is clearly not the best joint outcome. From a cooperation standpoint, both players choosing Confess is a deficient kind of "cooperation"; it is mutual Deny that would constitute genuine cooperation.
In the previous part, the players interacted only once. When the number of rounds increases, each player gains a range of additional strategic possibilities, and the strategies available depend on whether the game is played a finite or an infinite number of times.
4. Finitely Repeated Games
Case 1: Game is played for a finite number of times (say 10)
Let us reason it out. With a finite number of rounds, consider the final round: after it there is no future play, so there is no reason to cooperate, and each prisoner will pursue his own payoff and choose Confess. Working backwards, every round before the last becomes "free play" by the same logic, so cooperation never gets off the ground.
Hence, if the game is played a finite/fixed number of times, then each player will defect at each round. Players cooperate, thinking this will induce cooperation in the future. But if there is no chance of future play there is no need to cooperate now.
Let's simulate this in Python:
(Note: in the Axelrod package, "C" denotes the cooperate action and "D" the defect action.)
```python
import axelrod

players = (axelrod.Defector(), axelrod.Defector())
match = axelrod.Match(players, turns=10)
match.play()
```

Output:
```
[(D, D), (D, D), (D, D), (D, D), (D, D), (D, D), (D, D), (D, D), (D, D), (D, D)]
```

As you can see, both prisoners defect in every round; they never cooperate.
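The library call above hides very little. A hand-rolled version of the match loop (a sketch with invented names, not the axelrod API) behaves the same way:

```python
def play_match(strategy_a, strategy_b, turns):
    """Run two strategy functions against each other for a number of turns."""
    history_a, history_b = [], []
    for _ in range(turns):
        move_a = strategy_a(history_a, history_b)  # each sees both histories
        move_b = strategy_b(history_b, history_a)
        history_a.append(move_a)
        history_b.append(move_b)
    return list(zip(history_a, history_b))

def defector(own_history, opponent_history):
    """Always defect, whatever the histories say."""
    return "D"

print(play_match(defector, defector, 10))
# [('D', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D'),
#  ('D', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D'), ('D', 'D')]
```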
5. Infinitely Repeated Games
Case 2: Game is played for an infinite number of times
When the game is played an unbounded number of times, strategies change. If prisoner A refuses to cooperate in one round, prisoner B can refuse to cooperate in the next. Because mutual defection pays poorly, this threat of retaliation pushes both prisoners toward cooperation, i.e. the Pareto-efficient choice of Deny.
Here we have to mention the "Tit-for-Tat" strategy: whatever you did to me last round, I do to you this round. If you cooperated with me last round, I cooperate with you this round; if you defected, I defect.
The strategy works because it immediately punishes defectors and immediately rewards cooperators, which in the long run produces the Pareto-efficient outcome (Deny, Deny).
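The rule fits in a few lines. A minimal sketch (my own function, not the library's class, which appears in full at the end of this post):

```python
def tit_for_tat(opponent_history):
    """Cooperate on the first move, then mirror the opponent's last move."""
    if not opponent_history:
        return "C"
    return opponent_history[-1]

# First move, then replies after the opponent has played D, then D followed by C:
print(tit_for_tat([]), tit_for_tat(["D"]), tit_for_tat(["D", "C"]))  # C D C
```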
Python simulation:
For the purpose of illustration, let us play the game 20 times. Player A plays the Tit-for-Tat strategy, and Player B plays a random strategy.
```python
import axelrod

players = (axelrod.TitForTat(), axelrod.Random())
match = axelrod.Match(players, turns=20)
match.play()
```

Result:
One possible outcome:
```
[(C, C), (C, C), (C, D), (D, D), (D, C), (C, C), (C, C), (C, C), (C, D), (D, D), (D, C), (C, C), (C, C), (C, D), (D, C), (C, C), (C, D), (D, D), (D, C), (C, D)]
```

Another possible outcome:
```
[(C, D), (D, D), (D, D), (D, C), (C, D), (D, C), (C, C), (C, D), (D, C), (C, D), (D, C), (C, D), (D, D), (D, D), (D, C), (C, C), (C, C), (C, C), (C, C), (C, C)]
```

A third possible outcome:
```
[(C, D), (D, D), (D, C), (C, C), (C, D), (D, C), (C, D), (D, D), (D, D), (D, D), (D, C), (C, C), (C, D), (D, D), (D, C), (C, C), (C, D), (D, C), (C, D), (D, C)]
```

…and so on.
As you can see, Player B chooses at random, while Player A rewards (by cooperating, C) or punishes (by defecting, D) according to what Player B played in the previous round.
6. Payoffs and Game Visualization
The default payoffs in the Axelrod library are as follows:
| | Prisoner B: Cooperate (Deny) | Prisoner B: Defect (Confess) |
| --- | --- | --- |
| Prisoner A: Cooperate (Deny) | R, R | S, T |
| Prisoner A: Defect (Confess) | T, S | P, P |
where:
R: Reward payoff (+3)
P: Punishment payoff (+1)
S: Sucker/Loss payoff (+0)
T: Temptation payoff (+5)
That is, the same table as at the top:
| | Prisoner B: Cooperate (Deny) | Prisoner B: Defect (Confess) |
| --- | --- | --- |
| Prisoner A: Cooperate (Deny) | 3, 3 | 0, 5 |
| Prisoner A: Defect (Confess) | 5, 0 | 1, 1 |
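Given R, P, S and T, per-round scores like those returned by `match.scores()` can be computed with a small lookup table (a sketch using the default values; `score` is my name for it, not the library's):

```python
R, P, S, T = 3, 1, 0, 5  # reward, punishment, sucker, temptation

def score(move_a, move_b):
    """Score one round for both players from their C/D moves."""
    table = {
        ("C", "C"): (R, R),
        ("C", "D"): (S, T),
        ("D", "C"): (T, S),
        ("D", "D"): (P, P),
    }
    return table[(move_a, move_b)]

# Three rounds of mutual defection score as mutual punishment:
print([score(a, b) for a, b in [("D", "D")] * 3])  # [(1, 1), (1, 1), (1, 1)]
```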
Case 1: play 10 rounds, both players using the Defector strategy
Python:
```python
import axelrod

players = (axelrod.Defector(), axelrod.Defector())
match = axelrod.Match(players, turns=10)
match.play()
print(match.scores())
print(match.sparklines())
```

Result:
```
[(1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1)]
```

The line `print(match.sparklines())` visualizes Player A's and Player B's decisions: each cooperation is drawn as a filled block and each defection as a blank. In this case both players always defect, so both sparklines are entirely blank.
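The sparkline itself is easy to reproduce: one filled block per cooperation, one space per defection (a sketch of the idea, not the library's implementation):

```python
def sparkline(moves):
    """Render one player's C/D moves the way match.sparklines() draws them."""
    return "".join("█" if move == "C" else " " for move in moves)

print(repr(sparkline(["C", "C", "D", "D", "C"])))  # '██  █'
print(repr(sparkline(["D"] * 10)))  # '          ' (all blank, as in Case 1)
```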
Case 2: Game is played for an "infinite" number of times (simulated here with 20 rounds)
Python:
```python
import axelrod

players = (axelrod.Random(), axelrod.TitForTat())
match = axelrod.Match(players, turns=20)
match.play()
print(match.scores())
print(match.sparklines())
```

Result:
Console output: (the original post shows a screenshot of the per-round scores and the sparklines here.)
7. Examples of Other Player Strategies
Case 1: Cooperator and Alternator
Axelrod also ships with the Cooperator and Alternator strategies. A player using Cooperator always cooperates; a player using Alternator keeps switching between cooperating and defecting.
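Both rules are one-liners. A sketch (hypothetical re-implementations keyed on the turn number, not the library's classes):

```python
def cooperator(turn):
    """Always cooperate."""
    return "C"

def alternator(turn):
    """Cooperate on even turns, defect on odd turns."""
    return "C" if turn % 2 == 0 else "D"

print([(cooperator(t), alternator(t)) for t in range(4)])
# [('C', 'C'), ('C', 'D'), ('C', 'C'), ('C', 'D')]
```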
Python:
```python
import axelrod

players = (axelrod.Cooperator(), axelrod.Alternator())
match = axelrod.Match(players, turns=15)
match.play()
print(match.sparklines())
```

Result:
```
███████████████
█ █ █ █ █ █ █ █
```

Case 2: TrickyCooperator
Official description:

A cooperator that is trying to be tricky. Almost always cooperates, but will try to trick the opponent by defecting. Defects once in a while in order to get a better payout. After 3 rounds, if the opponent has not defected within a maximum history depth of 10, defect.

Python:
Example 1: Cooperator vs. TrickyCooperator
```python
import axelrod

players = (axelrod.Cooperator(), axelrod.TrickyCooperator())
match = axelrod.Match(players, turns=15)
match.play()
print(match.sparklines())
```

Result:
```
███████████████
███
```

As you can see, TrickyCooperator cooperates only for the first three rounds and defects from then on.
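That pattern follows directly from the rule quoted above. A sketch of the rule (my own function, not the library's class), run against an opponent who always cooperates:

```python
def tricky_cooperator(own_history, opponent_history):
    """After 3 rounds, defect while the opponent's last 10 moves contain no D."""
    if len(own_history) >= 3 and "D" not in opponent_history[-10:]:
        return "D"
    return "C"

tricky, opponent = [], []
for _ in range(6):
    tricky.append(tricky_cooperator(tricky, opponent))
    opponent.append("C")  # the opponent cooperates unconditionally
print(tricky)  # ['C', 'C', 'C', 'D', 'D', 'D']
```

Against an unconditional cooperator the "no D in the last 10 moves" test always passes, so the defection from round 4 onward never stops.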
Example 2: TitForTat vs. TrickyCooperator
Python:
```python
import axelrod

players = (axelrod.TitForTat(), axelrod.TrickyCooperator())
match = axelrod.Match(players, turns=15)
match.play()
print(match.sparklines())
```

Result:
```
████  █████████
███  ██████████
```

As you can see, after the third round TrickyCooperator defects, but TitForTat immediately retaliates tit for tat, and TrickyCooperator soon returns to cooperating. TitForTat holds up well against TrickyCooperator.
Case 3: TitFor2Tats
Unlike TitForTat, TitFor2Tats forgives a single defection: betray it once and it still cooperates; betray it twice in a row and, sorry, it retaliates.
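The rule itself is tiny (a minimal sketch; the library's TitFor2Tats class, shown in full at the end of this post, performs the same check):

```python
def tit_for_2_tats(opponent_history):
    """Defect only after two consecutive opponent defections."""
    return "D" if opponent_history[-2:] == ["D", "D"] else "C"

print(tit_for_2_tats(["C", "D"]), tit_for_2_tats(["D", "D"]))  # C D
```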
Python:
```python
import axelrod

players = (axelrod.TitFor2Tats(), axelrod.TrickyCooperator())
match = axelrod.Match(players, turns=20)
match.play()
print(match.sparklines())
```

Result:
```
█████  ████████████
███   ███████████
```

8. Other Strategies
In short, there are many more strategies. If you are interested, take a closer look at the players bundled with axelrod.
The relevant source code is pasted below:
1. The TitForTat family
```python
# Imports added for context; in the library this code lives in
# axelrod/strategies/titfortat.py.
from axelrod.action import Action, actions_to_str
from axelrod.player import Player
from axelrod.strategy_transformers import FinalTransformer, TrackHistoryTransformer

C, D = Action.C, Action.D


class TitForTat(Player):
    """
    A player starts by cooperating and then mimics the previous action of the
    opponent.

    This strategy was referred to as the *'simplest'* strategy submitted to
    Axelrod's first tournament. It came first.

    Note that the code for this strategy is written in a fairly verbose way.
    This is done so that it can serve as an example strategy for those who
    might be new to Python.

    Names:

    - Rapoport's strategy: [Axelrod1980]_
    - TitForTat: [Axelrod1980]_
    """

    # These are various properties for the strategy
    name = "Tit For Tat"
    classifier = {
        "memory_depth": 1,  # Four-Vector = (1.,0.,1.,0.)
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def strategy(self, opponent: Player) -> Action:
        """This is the actual strategy"""
        # First move
        if not self.history:
            return C
        # React to the opponent's last move
        if opponent.history[-1] == D:
            return D
        return C


class TitFor2Tats(Player):
    """A player starts by cooperating and then defects only after two defects
    by opponent.

    Submitted to Axelrod's second tournament by John Maynard Smith; it came in
    24th in that tournament.

    Names:

    - Tit for two Tats: [Axelrod1984]_
    - Slow tit for two tats: Original name by Ranjini Das
    - JMaynardSmith: [Axelrod1980b]_
    """

    name = "Tit For 2 Tats"
    classifier = {
        "memory_depth": 2,  # Long memory, memory-2
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        return D if opponent.history[-2:] == [D, D] else C


class TwoTitsForTat(Player):
    """A player starts by cooperating and replies to each defect by two
    defections.

    Names:

    - Two Tits for Tats: [Axelrod1984]_
    """

    name = "Two Tits For Tat"
    classifier = {
        "memory_depth": 2,  # Long memory, memory-2
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        return D if D in opponent.history[-2:] else C


class DynamicTwoTitsForTat(Player):
    """
    A player starts by cooperating and then punishes its opponent's
    defections with defections, but with a dynamic bias towards cooperating
    based on the opponent's ratio of cooperations to total moves
    (so their current probability of cooperating regardless of the
    opponent's move (aka: forgiveness)).

    Names:

    - Dynamic Two Tits For Tat: Original name by Grant Garrett-Grossman.
    """

    name = "Dynamic Two Tits For Tat"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": True,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def strategy(self, opponent):
        """Actual strategy definition that determines player's action."""
        # First move
        if not opponent.history:
            # Make sure we cooperate first turn
            return C
        if D in opponent.history[-2:]:
            # Probability of cooperating regardless
            return self._random.random_choice(
                opponent.cooperations / len(opponent.history)
            )
        else:
            return C


class Bully(Player):
    """A player that behaves opposite to Tit For Tat, including first move.

    Starts by defecting and then does the opposite of opponent's previous
    move. This is the complete opposite of Tit For Tat, also called Bully in
    the literature.

    Names:

    - Reverse Tit For Tat: [Nachbar1992]_
    """

    name = "Bully"
    classifier = {
        "memory_depth": 1,  # Four-Vector = (0, 1, 0, 1)
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        return C if opponent.history[-1:] == [D] else D


class SneakyTitForTat(Player):
    """Tries defecting once and repents if punished.

    Names:

    - Sneaky Tit For Tat: Original name by Karol Langner
    """

    name = "Sneaky Tit For Tat"
    classifier = {
        "memory_depth": float("inf"),  # Long memory
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if len(self.history) < 2:
            return C
        if D not in opponent.history:
            return D
        if opponent.history[-1] == D and self.history[-2] == D:
            return C
        return opponent.history[-1]


class SuspiciousTitForTat(Player):
    """A variant of Tit For Tat that starts off with a defection.

    Names:

    - Suspicious Tit For Tat: [Hilbe2013]_
    - Mistrust: [Beaufils1997]_
    """

    name = "Suspicious Tit For Tat"
    classifier = {
        "memory_depth": 1,  # Four-Vector = (1.,0.,1.,0.)
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        return C if opponent.history[-1:] == [C] else D


class AntiTitForTat(Player):
    """A strategy that plays the opposite of the opponents previous move.
    This is similar to Bully, except that the first move is cooperation.

    Names:

    - Anti Tit For Tat: [Hilbe2013]_
    - Psycho (PSYC): [Ashlock2009]_
    """

    name = "Anti Tit For Tat"
    classifier = {
        "memory_depth": 1,  # Four-Vector = (1.,0.,1.,0.)
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        return D if opponent.history[-1:] == [C] else C


class HardTitForTat(Player):
    """A variant of Tit For Tat that uses a longer history for retaliation.

    Names:

    - Hard Tit For Tat: [PD2017]_
    """

    name = "Hard Tit For Tat"
    classifier = {
        "memory_depth": 3,  # memory-three
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        # Cooperate on the first move
        if not opponent.history:
            return C
        # Defects if D in the opponent's last three moves
        if D in opponent.history[-3:]:
            return D
        # Otherwise cooperates
        return C


class HardTitFor2Tats(Player):
    """A variant of Tit For Two Tats that uses a longer history for
    retaliation.

    Names:

    - Hard Tit For Two Tats: [Stewart2012]_
    """

    name = "Hard Tit For 2 Tats"
    classifier = {
        "memory_depth": 3,  # memory-three
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        # Cooperate on the first move
        if not opponent.history:
            return C
        # Defects if two consecutive D in the opponent's last three moves
        history_string = actions_to_str(opponent.history[-3:])
        if "DD" in history_string:
            return D
        # Otherwise cooperates
        return C


class OmegaTFT(Player):
    """OmegaTFT modifies Tit For Tat in two ways:

    - checks for deadlock loops of alternating rounds of (C, D) and (D, C),
      and attempting to break them
    - uses a more sophisticated retaliation mechanism that is noise tolerant

    Names:

    - OmegaTFT: [Slany2007]_
    """

    name = "Omega TFT"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(
        self, deadlock_threshold: int = 3, randomness_threshold: int = 8
    ) -> None:
        super().__init__()
        self.deadlock_threshold = deadlock_threshold
        self.randomness_threshold = randomness_threshold
        self.randomness_counter = 0
        self.deadlock_counter = 0

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        # Cooperate on the first move
        if not self.history:
            return C
        # TFT on round 2
        if len(self.history) == 1:
            return opponent.history[-1]

        # Are we deadlocked? (in a CD -> DC loop)
        if self.deadlock_counter >= self.deadlock_threshold:
            move = C
            if self.deadlock_counter == self.deadlock_threshold:
                self.deadlock_counter = self.deadlock_threshold + 1
            else:
                self.deadlock_counter = 0
        else:
            # Update counters
            if opponent.history[-2:] == [C, C]:
                self.randomness_counter -= 1
            # If the opponent's move changed, increase the counter
            if opponent.history[-2] != opponent.history[-1]:
                self.randomness_counter += 1
            # If the opponent's last move differed from mine,
            # increase the counter
            if self.history[-1] != opponent.history[-1]:
                self.randomness_counter += 1
            # Compare counts to thresholds
            # If randomness_counter exceeds Y, Defect for the remainder
            if self.randomness_counter >= self.randomness_threshold:
                move = D
            else:
                # TFT
                move = opponent.history[-1]
                # Check for deadlock
                if opponent.history[-2] != opponent.history[-1]:
                    self.deadlock_counter += 1
                else:
                    self.deadlock_counter = 0
        return move


class OriginalGradual(Player):
    """
    A player that punishes defections with a growing number of defections
    but after punishing for `punishment_limit` number of times enters a
    calming state and cooperates no matter what the opponent does for two
    rounds.

    The `punishment_limit` is incremented whenever the opponent defects and
    the strategy is not in either calming or punishing state.

    Note that `Gradual` appears in [CRISTAL-SMAC2018]_ however that version
    of `Gradual` does not give the results reported in [Beaufils1997]_ which
    is the paper that first introduced the strategy. For a longer discussion
    of this see: https://github.com/Axelrod-Python/Axelrod/issues/1294. This
    is why this strategy has been renamed to `OriginalGradual`.

    Names:

    - Gradual: [Beaufils1997]_
    """

    name = "Original Gradual"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self) -> None:
        super().__init__()
        self.calming = False
        self.punishing = False
        self.punishment_count = 0
        self.punishment_limit = 0

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if self.calming:
            self.calming = False
            return C
        if self.punishing:
            if self.punishment_count < self.punishment_limit:
                self.punishment_count += 1
                return D
            else:
                self.calming = True
                self.punishing = False
                self.punishment_count = 0
                return C
        if D in opponent.history[-1:]:
            self.punishing = True
            self.punishment_count += 1
            self.punishment_limit += 1
            return D
        return C


class Gradual(Player):
    """
    Similar to OriginalGradual, this is a player that punishes defections
    with a growing number of defections but after punishing for
    `punishment_limit` number of times enters a calming state and cooperates
    no matter what the opponent does for two rounds.

    This version of Gradual is an update of `OriginalGradual` and the
    difference is that the `punishment_limit` is incremented whenever the
    opponent defects (regardless of the state of the player).

    Note that this version of `Gradual` appears in [CRISTAL-SMAC2018]_
    however this version of `Gradual` does not give the results reported in
    [Beaufils1997]_ which is the paper that first introduced the strategy.
    For a longer discussion of this see:
    https://github.com/Axelrod-Python/Axelrod/issues/1294.

    This version is based on
    https://github.com/cristal-smac/ipd/blob/master/src/strategies.py#L224

    Names:

    - Gradual: [CRISTAL-SMAC2018]_
    """

    name = "Gradual"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self) -> None:
        super().__init__()
        self.calm_count = 0
        self.punish_count = 0

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if len(self.history) == 0:
            return C
        if self.punish_count > 0:
            self.punish_count -= 1
            return D
        if self.calm_count > 0:
            self.calm_count -= 1
            return C
        if opponent.history[-1] == D:
            self.punish_count = opponent.defections - 1
            self.calm_count = 2
            return D
        return C


@TrackHistoryTransformer(name_prefix=None)
class ContriteTitForTat(Player):
    """
    A player that corresponds to Tit For Tat if there is no noise. In the
    case of a noisy match: if the opponent defects as a result of a noisy
    defection then ContriteTitForTat will become 'contrite' until it
    successfully cooperates.

    Names:

    - Contrite Tit For Tat: [Axelrod1995]_
    """

    name = "Contrite Tit For Tat"
    classifier = {
        "memory_depth": 3,
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self):
        super().__init__()
        self.contrite = False
        self._recorded_history = []

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if not opponent.history:
            return C

        # If contrite but managed to cooperate: apologise.
        if self.contrite and self.history[-1] == C:
            self.contrite = False
            return C

        # Check if noise provoked opponent
        if self._recorded_history[-1] != self.history[-1]:  # Check if noise
            if self.history[-1] == D and opponent.history[-1] == C:
                self.contrite = True

        return opponent.history[-1]


class AdaptiveTitForTat(Player):
    """ATFT - Adaptive Tit For Tat (Basic Model)

    Algorithm

    if (opponent played C in the last cycle) then
        world = world + r*(1-world)
    else
        world = world + r*(0-world)
    If (world >= 0.5) play C, else play D

    Attributes

    world : float [0.0, 1.0], set to 0.5
        continuous variable representing the world's image
        1.0 - total cooperation
        0.0 - total defection
        other values - something in between of the above
        updated every round, starting value shouldn't matter as long as
        it's >= 0.5

    Parameters

    rate : float [0.0, 1.0], default=0.5
        adaptation rate - r in Algorithm above
        smaller value means more gradual and robust
        to perturbations behaviour

    Names:

    - Adaptive Tit For Tat: [Tzafestas2000]_
    """

    name = "Adaptive Tit For Tat"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }
    world = 0.5

    def __init__(self, rate: float = 0.5) -> None:
        super().__init__()
        self.rate = rate
        self.world = rate

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if len(opponent.history) == 0:
            return C

        if opponent.history[-1] == C:
            self.world += self.rate * (1.0 - self.world)
        else:
            self.world -= self.rate * self.world

        if self.world >= 0.5:
            return C
        return D


class SpitefulTitForTat(Player):
    """
    A player starts by cooperating and then mimics the previous action of
    the opponent until opponent defects twice in a row, at which point
    player always defects.

    Names:

    - Spiteful Tit For Tat: [Prison1998]_
    """

    name = "Spiteful Tit For Tat"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self) -> None:
        super().__init__()
        self.retaliating = False

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        # First move
        if not self.history:
            return C

        if opponent.history[-2:] == [D, D]:
            self.retaliating = True

        if self.retaliating:
            return D
        else:
            # React to the opponent's last move
            if opponent.history[-1] == D:
                return D
            return C


class SlowTitForTwoTats2(Player):
    """
    A player plays C twice, then if the opponent plays the same move twice,
    plays that move, otherwise plays previous move.

    Names:

    - Slow Tit For Tat: [Prison1998]_
    """

    name = "Slow Tit For Two Tats 2"
    classifier = {
        "memory_depth": 2,
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        # Start with two cooperations
        if len(self.history) < 2:
            return C
        # Mimic if opponent plays the same move twice
        if opponent.history[-2] == opponent.history[-1]:
            return opponent.history[-1]
        # Otherwise play previous move
        return self.history[-1]


@FinalTransformer((D,), name_prefix=None)
class Alexei(Player):
    """
    Plays similar to Tit-for-Tat, but always defect on last turn.

    Names:

    - Alexei: [LessWrong2011]_
    """

    name = "Alexei"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if not self.history:
            return C
        if opponent.history[-1] == D:
            return D
        return C


@FinalTransformer((D,), name_prefix=None)
class EugineNier(Player):
    """
    Plays similar to Tit-for-Tat, but with two conditions:
    1) Always Defect on Last Move
    2) If other player defects five times, switch to all defects.

    Names:

    - Eugine Nier: [LessWrong2011]_
    """

    name = "EugineNier"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self):
        super().__init__()
        self.is_defector = False

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if not self.history:
            return C
        if not (self.is_defector) and opponent.defections >= 5:
            self.is_defector = True
        if self.is_defector:
            return D
        return opponent.history[-1]


class NTitsForMTats(Player):
    """
    A parameterizable Tit-for-Tat.
    The arguments are:
    1) M: the number of defection before retaliation
    2) N: the number of retaliations

    Names:

    - N Tit(s) For M Tat(s): Original name by Marc Harper
    """

    name = "N Tit(s) For M Tat(s)"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self, N: int = 3, M: int = 2) -> None:
        """
        Parameters
        ----------
        N: int
            Number of retaliations
        M: int
            Number of defection before retaliation

        Special Cases
        -------------
        NTitsForMTats(1,1) is equivalent to TitForTat
        NTitsForMTats(1,2) is equivalent to TitFor2Tats
        NTitsForMTats(2,1) is equivalent to TwoTitsForTat
        NTitsForMTats(0,*) is equivalent to Cooperator
        NTitsForMTats(*,0) is equivalent to Defector
        """
        super().__init__()
        self.N = N
        self.M = M
        self.classifier["memory_depth"] = max([M, N])
        self.retaliate_count = 0

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        # if opponent defected consecutively M times, start the retaliation
        if not self.M or opponent.history[-self.M :].count(D) == self.M:
            self.retaliate_count = self.N
        if self.retaliate_count:
            self.retaliate_count -= 1
            return D
        return C


@FinalTransformer((D,), name_prefix=None)
class Michaelos(Player):
    """
    Plays similar to Tit-for-Tat with two exceptions:
    1) Defect on last turn.
    2) After own defection and opponent's cooperation, 50 percent of the
    time, cooperate. The other 50 percent of the time, always defect for the
    rest of the game.

    Names:

    - Michaelos: [LessWrong2011]_
    """

    name = "Michaelos"
    classifier = {
        "memory_depth": float("inf"),
        "stochastic": True,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self):
        super().__init__()
        self.is_defector = False

    def strategy(self, opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        if not self.history:
            return C
        if self.is_defector:
            return D
        if self.history[-1] == D and opponent.history[-1] == C:
            decision = self._random.random_choice()
            if decision == C:
                return C
            else:
                self.is_defector = True
                return D

        return opponent.history[-1]


class RandomTitForTat(Player):
    """
    A player starts by cooperating and then follows by copying its
    opponent (tit for tat style). From then on the player
    will switch between copying its opponent and randomly
    responding every other iteration.

    Name:

    - Random TitForTat: Original name by Zachary M. Taylor
    """

    # These are various properties for the strategy
    name = "Random Tit for Tat"
    classifier = {
        "memory_depth": 1,
        "stochastic": True,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self, p: float = 0.5) -> None:
        """
        Parameters
        ----------
        p, float
            The probability to cooperate
        """
        super().__init__()
        self.p = p
        self.act_random = False
        if p in [0, 1]:
            self.classifier["stochastic"] = False

    def strategy(self, opponent: Player) -> Action:
        """This is the actual strategy"""
        if not self.history:
            return C

        if self.act_random:
            self.act_random = False
            try:
                return self._random.random_choice(self.p)
            except AttributeError:
                return D if self.p == 0 else C

        self.act_random = True
        return opponent.history[-1]
```

2. The Cooperator family
```python
from axelrod.action import Action
from axelrod.player import Player

C, D = Action.C, Action.D


class Cooperator(Player):
    """A player who only ever cooperates.

    Names:

    - Cooperator: [Axelrod1984]_
    - ALLC: [Press2012]_
    - Always cooperate: [Mittal2009]_
    """

    name = "Cooperator"
    classifier = {
        "memory_depth": 0,
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        return C


class TrickyCooperator(Player):
    """A cooperator that is trying to be tricky.

    Names:

    - Tricky Cooperator: Original name by Karol Langner
    """

    name = "Tricky Cooperator"
    classifier = {
        "memory_depth": 10,
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    _min_history_required_to_try_trickiness = 3
    _max_history_depth_for_trickiness = -10

    def strategy(self, opponent: Player) -> Action:
        """Almost always cooperates, but will try to trick the opponent by
        defecting.

        Defect once in a while in order to get a better payout.
        After 3 rounds, if opponent has not defected to a max history depth
        of 10, defect.
        """
        if (
            self._has_played_enough_rounds_to_be_tricky()
            and self._opponents_has_cooperated_enough_to_be_tricky(opponent)
        ):
            return D
        return C

    def _has_played_enough_rounds_to_be_tricky(self):
        return len(self.history) >= self._min_history_required_to_try_trickiness

    def _opponents_has_cooperated_enough_to_be_tricky(self, opponent):
        rounds_to_be_checked = opponent.history[
            self._max_history_depth_for_trickiness :
        ]
        return D not in rounds_to_be_checked
```

3. The Defector family
```python
from axelrod.action import Action
from axelrod.player import Player

C, D = Action.C, Action.D


class Defector(Player):
    """A player who only ever defects.

    Names:

    - Defector: [Axelrod1984]_
    - ALLD: [Press2012]_
    - Always defect: [Mittal2009]_
    """

    name = "Defector"
    classifier = {
        "memory_depth": 0,
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    @staticmethod
    def strategy(opponent: Player) -> Action:
        """Actual strategy definition that determines player's action."""
        return D


class TrickyDefector(Player):
    """A defector that is trying to be tricky.

    Names:

    - Tricky Defector: Original name by Karol Langner
    """

    name = "Tricky Defector"
    classifier = {
        "memory_depth": float("inf"),  # Long memory
        "stochastic": False,
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def strategy(self, opponent: Player) -> Action:
        """Almost always defects, but will try to trick the opponent into
        cooperating.

        Defect if opponent has cooperated at least once in the past and has
        defected for the last 3 turns in a row.
        """
        if (
            opponent.history.cooperations > 0
            and opponent.history[-3:] == [D] * 3
        ):
            return C
        return D
```