Configuring CUDA Acceleration for PyTorch on Windows

PyTorch is not linked with support for cuda devices (getDeviceGuardImpl at C:\w\b\windows\pytorch\c10/core/impl/DeviceGuardImplInterface.h:216)
(no backtrace available)

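Configuring CUDA acceleration for PyTorch on Windows seems to have become somewhat tricky.

I tested 1.4, 1.5.0, 1.5.1, nightly, and the Python install (pip install ...), and none of them could use CUDA after installation. Searching online turned up other people who hit the same error, but no one who had solved it. Enough preamble.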

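Add the following to the linker options. If you are not familiar with where this goes, in Visual Studio one place to enter it is Project Properties > Linker > Command Line > Additional Options: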

-INCLUDE:THCudaCharTensor_zero


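After adding this option you also need to link torch_cuda.lib. With both in place, the CUDA device becomes usable and you get the dramatic speedup.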
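As an alternative to editing the project settings, MSVC can take the same two settings from a source file through `#pragma comment`. The fragment below is only a sketch and was not part of the original post: it assumes an x64 MSVC build, assumes the libtorch `lib` directory is already on the linker's library search path, and reuses the symbol and library names given above.

```cpp
// Sketch only: embeds the linker settings in a source file instead of the project options.
// Assumption: MSVC, x64, with the libtorch "lib" directory on the library search path.
#ifdef _MSC_VER
// Link against the CUDA part of libtorch.
#pragma comment(lib, "torch_cuda.lib")
// Force a reference to a symbol from torch_cuda so the linker keeps the dependency.
#pragma comment(linker, "/INCLUDE:THCudaCharTensor_zero")
#endif
```

Placing this in one translation unit of the executable should be enough. Other compilers skip the guarded block, so it does not affect non-MSVC builds.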

Appendix:

CUDA test code

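#include <torch/torch.h>
#include <cstdio>
#include <iostream>

// deviceGPU is not defined in the original snippet; assumed here to be the default CUDA device.
const torch::Device deviceGPU(torch::kCUDA);

// A single linear layer: forward(x) = b + x * W.
struct Net : torch::nn::Module {
   Net(int64_t N, int64_t M) {
       W = register_parameter("W", torch::randn({ N, M }));
       b = register_parameter("b", torch::randn(M));
   }
   torch::Tensor forward(torch::Tensor input) {
       return torch::addmm(b, input, W);
   }
   torch::Tensor W, b;
};

void testCuda() {
   Net lmodule(4096, 4096);

   try
   {
       // Keep the tensor returned by to(): that is the copy that lives on the GPU.
       torch::Tensor tensor = torch::eye(4096, torch::kFloat).to(deviceGPU);
       // Module::to moves the registered parameters W and b to the device.
       lmodule.to(deviceGPU);
       for (size_t i = 0; i < 1024 * 64; i++)
           lmodule.forward(tensor);
   }
   catch (const std::exception& ex)
   {
       // Prints the "not linked with support for cuda devices" error if linking is wrong.
       std::cout << ex.what();
   }
   getchar();  // keep the console window open
}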

The following code is a case where acceleration fails:

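       // Tensor::to returns a new tensor and leaves the original unchanged.
       // The results are discarded below, so tensor1 and tensor2 stay on the CPU.
       // To run on the GPU, keep the result: tensor1 = tensor1.to(deviceGPU);
       torch::Tensor tensor1 = torch::eye(9128);
       torch::Tensor tensor2 = torch::eye(9128);
       tensor1.to(deviceGPU);
       tensor2.to(deviceGPU);
       for (size_t i = 0; i < 1024 * 64; i++)
       {
           tensor1 * tensor2;  // element-wise multiply, still executed on the CPU
       }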
