Python multiprocessing ThreadPool
Problem description:
I am trying to read files using multiprocessing in Python. Here is a small example:
import multiprocessing
from time import *

class class1():
    def function(self, datasheetname):
        # here I start reading my datasheet
        pass

if __name__ == '__main__':
    # Test with multiprocessing
    pool = multiprocessing.Pool(processes=4)
    pool.map(class1("Datasheetname"))
    pool.close()
Now I get the following error:
TypeError: map() missing 1 required positional argument: 'iterable'
In other threads on this board I got the hint to do this with ThreadPool, but I don't know how. Any ideas?
Answer
map(func, iterable[, chunksize])
A parallel equivalent of the map() built-in function (it supports only one iterable argument though). It blocks until the result is ready.
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
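You have to pass an iterable as the second argument; each of its elements is handed to the target func as its argument in one of the worker processes. Your call passes only a single argument to pool.map, which is why the error says 'iterable' is missing.
Example: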
from multiprocessing import Pool

def function(sheet):
    # do something with sheet
    return "foo"

if __name__ == '__main__':
    pool = Pool(processes=4)
    result = pool.map(function, ['sheet1', 'sheet2', 'sheet3', 'sheet4'])
    # result will be ['foo', 'foo', 'foo', 'foo']
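Since the question also asks about ThreadPool: it lives in multiprocessing.pool and its map() takes the same arguments as Pool.map, function first and iterable second. Below is a minimal sketch of how that might look with a method on a class. SheetReader, read and read_all are hypothetical names standing in for your class1 and its function, and the method body is only a placeholder for the real file reading.

```python
from multiprocessing.pool import ThreadPool


class SheetReader:
    """Hypothetical stand-in for class1 from the question."""

    def read(self, datasheetname):
        # Placeholder: the real code would open and parse the datasheet here.
        return "read " + datasheetname


def read_all(sheet_names, workers=4):
    """Call SheetReader.read once per sheet name, using a pool of threads."""
    reader = SheetReader()
    # Same call shape as Pool.map: the callable first, then the iterable.
    with ThreadPool(processes=workers) as pool:
        # map() blocks until every result is ready and preserves input order.
        return pool.map(reader.read, sheet_names)


if __name__ == "__main__":
    print(read_all(["sheet1", "sheet2", "sheet3", "sheet4"]))
    # prints ['read sheet1', 'read sheet2', 'read sheet3', 'read sheet4']
```

Threads share the interpreter, so the bound method does not have to be pickled to reach the workers. Threads tend to suit work that mostly waits on file I/O; for CPU-heavy parsing a process pool may still be the better fit.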
Do you need to do this in parallel, or do you just need to read a bunch of CSV/Excel sheets? If the latter, consider [pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html) or [pandas.read_excel](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html); read_excel can load multiple sheets of a workbook in a single call. – David