Implementing Pool on a for loop with a lot of inputs
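I have been trying to speed up my code (with numba and with multiprocessing), but I can't quite get it to work, because my function takes a lot of arguments.
I have already simplified it by moving parts into other functions (see below).
Since each agent (an instance of a class) is independent of the others for these actions, I would like to replace the for loop with a Pool.
So I would end up with one big function, pooling(), that I would call, passing in the agents: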
from multiprocessing import Pool
p = Pool(4)
p.map(pooling, list(agents))
But where do I add all the other arguments that the pooled function needs?
As it stands, the code is:
def check_demographics(month, my_agents, families, firms, year, mortality_men, mortality_women, fertility, state_id):
    dummy = list(my_agents)
    d = str(state_id.iloc[0])
    # Place where I would like to replace the LOOP. All below would be a function
    for agent in dummy:
        if agent.get_region_id()[:2] == d:
            # Birthday
            if month % 12 == agent.month - 1:
                agent.update_age()
            # Mortality probability
            if agent.get_gender() == 'Male':
                prob = mortality_men[mortality_men['age'] == agent.get_age()][year].iloc[0]
            # When gender is Female
            else:
                # Extract specific agent data to calculate mortality 'Female'
                prob = mortality_women[mortality_women['age'] == agent.get_age()][year].iloc[0]
                # Give birth decision
                age = agent.get_age()
                if 14 < age < 50:
                    pregnant(agent, fertility, year, families, my_agents)
            # Mortality procedures
            if fixed_seed.random() < prob:
                mortal(my_agents, my_graveyard, families, agent, firms)
This is the function that takes up most of my program's running time, and @jit did not help much.
Thanks a bunch.
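Yes, that is a lot of parameters! Consider using a class.
Since Pool.map only accepts a single iterable argument, you need to bundle everything together. I suggest the "Facade" pattern: an intermediate class that stores all the required parameters and has one method (I call it check) that takes no arguments besides self (it is a method, after all).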
class Facade(object):
    def __init__(self, agent, d, families, fertility, firms, month, mortality_men, mortality_women, my_agents,
                 my_graveyard, year):
        self.agent = agent
        self.d = d
        self.families = families
        self.fertility = fertility
        self.firms = firms
        self.month = month
        self.mortality_men = mortality_men
        self.mortality_women = mortality_women
        self.my_agents = my_agents
        self.my_graveyard = my_graveyard
        self.year = year

    def check(self):
        # Unpack the stored parameters into locals so the body matches the original loop
        (agent, d, families, fertility, firms,
         month, mortality_men, mortality_women,
         my_agents, my_graveyard, year) = (
            self.agent, self.d, self.families, self.fertility, self.firms,
            self.month, self.mortality_men, self.mortality_women,
            self.my_agents, self.my_graveyard, self.year)
        if agent.get_region_id()[:2] == d:
            # Birthday
            if month % 12 == agent.month - 1:
                agent.update_age()
            # Mortality probability
            if agent.get_gender() == 'Male':
                prob = mortality_men[mortality_men['age'] == agent.get_age()][year].iloc[0]
            # When gender is Female
            else:
                # Extract specific agent data to calculate mortality 'Female'
                prob = mortality_women[mortality_women['age'] == agent.get_age()][year].iloc[0]
                # Give birth decision
                age = agent.get_age()
                if 14 < age < 50:
                    pregnant(agent, fertility, year, families, my_agents)
            # Mortality procedures
            if fixed_seed.random() < prob:
                mortal(my_agents, my_graveyard, families, agent, firms)
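Note: my refactoring is really ugly, but I wanted to keep the variable names unchanged for clarity.
Your loop could then look something like this: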
from multiprocessing import Pool

def check_demographics(month, my_agents, families, firms,
                       year, mortality_men, mortality_women,
                       fertility, state_id, my_graveyard):
    d = str(state_id.iloc[0])
    pool = Pool(4)
    # One Facade per agent, each carrying every argument the check needs
    facades = [Facade(agent, d, families, fertility, firms,
                      month, mortality_men, mortality_women,
                      my_agents, my_graveyard, year)
               for agent in my_agents]
    pool.map(Facade.check, facades)
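If a whole class feels heavy, the same "one argument per call" constraint can also be met with functools.partial, which binds the shared arguments up front. The following is only a minimal sketch of that idea: check_agent, check_all, and the tuple-shaped agents are hypothetical stand-ins, not the asker's real code.

```python
from functools import partial
from multiprocessing import Pool


def check_agent(agent, month, region_prefix):
    """Hypothetical per-agent check: True if the agent has a birthday this month.

    `agent` is a plain (region_id, birth_month) tuple here, standing in for
    the real agent objects, which are not shown in the post.
    """
    region_id, birth_month = agent
    if region_id[:2] != region_prefix:
        return False
    return month % 12 == birth_month - 1


def check_all(agents, month, region_prefix, processes=2):
    # partial() freezes the shared arguments, leaving a one-argument callable
    # that Pool.map can apply to each agent in turn.
    worker = partial(check_agent, month=month, region_prefix=region_prefix)
    with Pool(processes) as pool:
        return pool.map(worker, list(agents))


if __name__ == "__main__":
    agents = [("53001", 3), ("53002", 7), ("35001", 3)]
    print(check_all(agents, month=14, region_prefix="53"))
```

Either way, everything bound to the call (the partial's arguments or the Facade's attributes) is pickled and sent to the workers, so large tables are copied along with it.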
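You said that each agent is independent of the others, but after analysing the loop I see that you need the full list of agents (the my_agents parameter). The Facade class makes this obvious. So your list of agents must not change, and the internal state of each agent must stay frozen for the duration of the loop.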
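One consequence worth spelling out: Pool.map pickles each argument and sends a copy to a worker process, so anything a worker mutates is changed only on that copy. A minimal sketch of returning the results and writing them back in the parent follows; Agent, birthday, and apply_results are hypothetical names, since the real agent class is not shown.

```python
from multiprocessing import Pool


class Agent:
    """Hypothetical stand-in for the agent class, which the post does not show."""

    def __init__(self, agent_id, age):
        self.agent_id = agent_id
        self.age = age


def birthday(agent):
    # Runs in a worker process on a pickled copy of the agent.
    agent.age += 1
    # Return the new value explicitly; the parent's object is not touched.
    return agent.agent_id, agent.age


def apply_results(agents, results):
    # Write the values computed by the workers back onto the parent's agents.
    by_id = {agent.agent_id: agent for agent in agents}
    for agent_id, new_age in results:
        by_id[agent_id].age = new_age


if __name__ == "__main__":
    agents = [Agent(1, 30), Agent(2, 41)]
    with Pool(2) as pool:
        results = pool.map(birthday, agents)
    print([a.age for a in agents])  # still the original ages in the parent
    apply_results(agents, results)
    print([a.age for a in agents])  # updated after the explicit write-back
```

The same applies to shared containers such as my_agents or my_graveyard: a worker that appends to or removes from them only changes its own copy.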
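Very nice, thank you. But you are right that 'my_agents' changes. That is why I created a new 'list(agents)' and iterate over that. Would that work in this case?
At the very least, apply 'map' to a copy of the agents: 'list(agents)'. Why does the list change?
I did that. I also ran into this error: '_pickle.PicklingError: Can not pickle
Note: the global variable or parameter 'my_graveyard' is missing.
Indeed, thank you.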
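The error message quoted in the comments is cut off, so it does not say which object failed to pickle. One way to narrow down an error like this is to try pickling each attribute of the object being sent to the pool. This is only a diagnostic sketch; unpicklable_attributes and Example are hypothetical names introduced for illustration.

```python
import pickle


def unpicklable_attributes(obj):
    """Return the names of instance attributes that pickle cannot serialise."""
    failed = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:  # pickle raises several different exception types
            failed.append(name)
    return failed


class Example:
    """Hypothetical object: one plain attribute, one that cannot be pickled."""

    def __init__(self):
        self.year = "2010"
        self.callback = lambda x: x  # lambdas cannot be pickled


if __name__ == "__main__":
    print(unpicklable_attributes(Example()))
```

Running the helper on one Facade instance before calling pool.map would list the attributes that need to be replaced or rebuilt inside the worker.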