15 Undiscovered Open-Source Machine Learning Frameworks You Need to Know in 2020

Machine Learning (ML) is one of the fastest emerging technologies today, and its application to different areas of computing is rapidly gaining popularity.

This is not only because cheap and powerful hardware is available. It's also because of the increasing availability of free and open-source machine learning frameworks, which allow developers to implement machine learning easily.

This wide range of open-source machine learning frameworks lets data scientists and machine learning engineers build, implement, and maintain machine learning systems, start new projects, and create impactful new applications.

Choosing a machine learning framework or library involves assessing what is right for your use case. Several factors are important for this assessment, such as:

Ease of use.
Support in the market (community).
Running speed.
Openness.
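
One way to make this assessment concrete is a simple weighted score over the four factors above. The sketch below is only an illustration: the weights, the 1-5 ratings, and the names framework_a and framework_b are all made up, and you would substitute your own.

```python
# Hypothetical weights for the four factors (higher = more important to you).
WEIGHTS = {"ease_of_use": 4, "community": 3, "speed": 2, "openness": 1}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted average of 1-5 ratings over the given factors."""
    total_weight = sum(weights.values())
    return sum(weights[name] * ratings[name] for name in weights) / total_weight


# Hypothetical ratings for two placeholder frameworks.
candidates = {
    "framework_a": {"ease_of_use": 5, "community": 3, "speed": 2, "openness": 4},
    "framework_b": {"ease_of_use": 3, "community": 4, "speed": 5, "openness": 4},
}

# Rank the candidates from best to worst weighted score.
ranking = sorted(
    candidates, key=lambda name: weighted_score(candidates[name]), reverse=True
)
for name in ranking:
    print(f"{name}: {weighted_score(candidates[name]):.2f}")
```

Changing the weights changes the ranking, which is the point: the "right" framework depends on which factors matter most for your use case.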

Who's this article for?

This article is for those who want to put their knowledge into practice after learning the theory.

It's also for those who want to explore other potential open-source machine learning frameworks for their future ML projects.

Here is the list of undiscovered open-source frameworks and libraries that businesses and individuals can use to build machine learning systems.

1. Blocks

Blocks Repository

You can also learn about Fuel, the data processing engine developed primarily for Blocks.

Programming Language: Python
GitHub link: https://github.com/mila-iqia/blocks

2. Analytics Zoo

Analytics Zoo Repository

When you should use Analytics Zoo to develop your AI solution:

When you want to easily prototype AI models.
When scaling matters to you.
When you want to add automation processes, such as feature engineering and model selection, to your machine learning pipeline.

This project is maintained by Intel-analytics.

Programming Language: Python
GitHub link: https://github.com/intel-analytics/analytics-zoo

3. ML5.js

ml5.js provides access to machine learning algorithms and models in the browser, building on top of TensorFlow.js.

ml5.js Repository

ml5.js is inspired by Processing and p5.js.

This open-source project is developed and maintained by NYU's Interactive Telecommunications/Interactive Media Arts program and by artists, designers, students, technologists, and developers across the world.

NOTE: This project is currently in development.

Programming Language: JavaScript
GitHub link: https://github.com/ml5js/ml5-library

4. AdaNet

AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. AdaNet builds on recent AutoML efforts to be fast and flexible while providing learning guarantees. Importantly, AdaNet provides a general framework not only for learning a neural network architecture but also for learning to ensemble to obtain even better models.

AdaNet Repository

AdaNet provides familiar APIs, such as Keras, for training, evaluating, and serving your models in production.

Programming Language: Python
GitHub link: https://github.com/tensorflow/adanet

5. Mljar

If you are looking for a platform to create prototype models and a deployment service, Mljar is the right choice for you. Mljar searches across different algorithms and performs hyperparameter tuning to find the best model.

It also provides quick results by running all computations in the cloud and finally creating ensemble models. Then it creates Markdown reports from the AutoML training.
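
To make "searching algorithms and tuning hyperparameters" more concrete, here is a minimal, library-free sketch of an exhaustive grid search. It is not Mljar's API or its actual search strategy: the search space, the two "algorithms", and the toy_validation_score function (which stands in for training and validating a real model) are all hypothetical.

```python
from itertools import product

# Hypothetical search space: two "algorithms", each with its own hyperparameter grid.
SEARCH_SPACE = {
    "tree": {"max_depth": [2, 4, 8, 16]},
    "linear": {"l2_penalty": [0.01, 0.1, 1.0, 10.0]},
}


def toy_validation_score(algorithm, params):
    """Stand-in for training a model and scoring it on held-out data."""
    if algorithm == "tree":
        return 1.0 - abs(params["max_depth"] - 8) / 16
    return 0.9 - abs(params["l2_penalty"] - 0.1)


def grid_search(search_space, score_fn):
    """Try every algorithm/hyperparameter combination and keep the best one."""
    best = None
    for algorithm, grid in search_space.items():
        names = sorted(grid)
        for values in product(*(grid[name] for name in names)):
            params = dict(zip(names, values))
            score = score_fn(algorithm, params)
            if best is None or score > best[0]:
                best = (score, algorithm, params)
    return best


best_score, best_algorithm, best_params = grid_search(SEARCH_SPACE, toy_validation_score)
print(best_algorithm, best_params, best_score)
```

An AutoML tool automates this loop for you, typically with smarter search strategies than a full grid and with real training runs in place of the toy scoring function.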

Mljar can train ML models for:

Binary classification.
Multi-class classification.
Regression.

Mljar provides two types of interfaces:

A Python wrapper over the Mljar API.
Running machine learning models in your web browser.

Programming Language: Python
GitHub link: https://github.com/mljar/mljar-supervised

6. ConvNetJS

ConvNetJS Repository

Like TensorFlow.js, ConvNetJS is a JavaScript library that supports training different deep learning models in your web browser. You don't need GPUs or other heavy software.

ConvNetJS supports:

Neural network modules.
Training convolutional networks for images.
Regression and classification cost functions.
A reinforcement learning module, based on Deep Q Learning.

Note: Not actively maintained.

Programming Language: JavaScript
GitHub link: https://github.com/karpathy/convnetjs

7. NNI (Neural Network Intelligence)

NNI (Neural Network Intelligence) is a lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture Search, Hyperparameter Tuning, and Model Compression. The tool manages automated machine learning (AutoML) experiments, and dispatches and runs the experiments' trial jobs generated by tuning algorithms to search for the best neural architecture and/or hyperparameters in different training environments like Local Machine, Remote Servers, OpenPAI, Kubeflow, and other cloud options.

NNI Repository

When you should consider using NNI:

If you want to try different AutoML algorithms.
If you want to run AutoML trial jobs in different environments.
If you want to support AutoML in your platform.

NOTE: NNI is an open-source project by Microsoft.

Programming Language: Python
GitHub link: https://github.com/Microsoft/nni

8. Datumbox

Datumbox Repository

Datumbox provides a number of pre-trained models for different tasks such as spam detection, sentiment analysis, language detection, topic classification, and so on.

Programming Language: Java
GitHub link: https://github.com/datumbox/datumbox-framework

9. XAI (An eXplainability toolbox for ML)

XAI is a machine learning library that is designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models. The XAI library is maintained by The Institute for Ethical AI & ML, and it was developed based on the 8 principles for Responsible Machine Learning.

XAI Repository

The 8 principles for Responsible Machine Learning include:

Human augmentation
Bias evaluation
Explainability by justification
Reproducible operations
Displacement strategy
Practical accuracy
Trust by privacy
Data risk awareness

To learn more about XAI, you can check out this talk at TensorFlow London. It contains insight on the definitions and principles of this library.

XAI is currently in early-stage development. The current version is 0.05 (Alpha).

Programming Language: Python
GitHub link: https://github.com/EthicalML/xai

10. Plato

Plato is a flexible framework for developing conversational AI agents in different environments. Plato was designed both for users with a limited background in conversational AI and for seasoned researchers in the field. It provides a clean and understandable design, integrates with existing deep learning and Bayesian optimization frameworks, and reduces the need to write code.

It supports interactions through text, speech, and dialogue acts. To learn how the Plato Research Dialogue System works, read the article here.

NOTE: Plato is an open-source project by Uber.

Programming Language: Python
GitHub link: https://github.com/uber-research/plato-research-dialogue-system

11. DeepDetect

DeepDetect implements support for supervised and unsupervised deep learning on images, text, time series, and other data, with a focus on simplicity and ease of use, testing, and connection with existing applications. It supports classification, object detection, segmentation, regression, and autoencoders.

DeepDetect Repository

DeepDetect relies on external machine learning libraries such as:

The gradient boosting library XGBoost.
Deep learning libraries (Caffe, TensorFlow, Caffe2, Torch, NCNN, and Dlib).
Clustering with t-SNE.
Similarity search with Annoy and FAISS.

DeepDetect is designed, implemented, and supported by Jolibrain with the help of other contributors.

Programming Language: C++
GitHub link: https://github.com/jolibrain/deepdetect

12. Streamlit

Streamlit: the fastest way to build custom ML tools.

Streamlit is a tool that allows data scientists, ML engineers, and developers to quickly build highly interactive web applications for their machine learning projects.

Streamlit doesn't require any knowledge of web development. If you know Python, then you're good to go!

It also supports hot-reloading, which means your app updates live while you're editing and saving your files.

[Demo: Streamlit in action]

Programming Language: JavaScript & Python
GitHub link: https://github.com/streamlit/streamlit

13. Dopamine

Dopamine Repository

The design principles for Dopamine include:

Easy experimentation.
Flexible development.
Compact and reliable.
Reproducible.

Last year (2019), Dopamine switched its network definitions to use tf.keras.Model. The previous tf.contrib.slim based networks have been removed.

To learn how to use Dopamine, check out the Colaboratory notebooks.

Note: Dopamine is an open-source project from Google.

Programming Language: Python
GitHub link: https://github.com/google/dopamine

14. TuriCreate

TuriCreate is an open-source toolset for creating custom Core ML models.

With TuriCreate you can accomplish different ML tasks such as image classification, sound classification, object detection, style transfer, activity classification, image similarity recommender, text classification, and clustering.

The framework is simple to use, flexible, and visual. It works on large datasets and is ready to deploy. The trained models can be used right away in iOS, macOS, tvOS, and watchOS apps without any extra conversion.

Check out the TuriCreate talks at WWDC 2019 and WWDC 2018 to learn more about TuriCreate.

NOTE: TuriCreate is an open-source project by Apple.

Programming Language: Python
GitHub link: https://github.com/apple/turicreate

15. Flair

Flair is a simple natural language processing (NLP) framework, developed and open-sourced by the Humboldt University of Berlin. Flair is an official part of the PyTorch ecosystem and is used in hundreds of industrial and academic projects.

Flair Repository

Flair outperforms the previous best methods on a range of NLP tasks: named entity recognition, part-of-speech tagging, and chunking.

[Table: Flair compared with the previous best methods]

Note: The F1 score is an evaluation metric primarily used for classification tasks. It is the harmonic mean of precision and recall, so it remains informative when the classes are unevenly distributed.
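
As a rough illustration of the F1 score, the sketch below computes it by hand for a binary task, using the standard definition (the harmonic mean of precision and recall). The labels are made up for the example and are not taken from the Flair benchmarks.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up imbalanced labels: 3 positives out of 10, and the model finds only one.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f}, f1={f1_score(y_true, y_pred):.2f}")
```

Here accuracy looks high because most examples are negative, while the F1 score is noticeably lower because two of the three positives were missed.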

Learn how to perform text classification using Flair embeddings in this article.

Programming Language: Python
GitHub link: https://github.com/flairNLP/flair

Conclusion

Before you start to build a machine learning application, you need to select one ML framework from the many options out there. This can be a difficult task.

Therefore, it's important to evaluate several options before making a final decision. The open-source machine learning frameworks mentioned above can help anyone build machine learning models efficiently and easily.

Are you wondering what the most popular machine learning frameworks are? Here is a list of the ones that most data scientists and machine learning engineers use most of the time.

TensorFlow
PyTorch
fastai
Keras
scikit-learn
Microsoft Cognitive Toolkit
Theano
Caffe2
DL4J
MXNet
H2O
Accord.NET
Apache Spark

I'll see you in the next post! I can also be reached on Twitter @Davis_McDavid.

Translated from: https://www.freecodecamp.org/news/15-undiscovered-open-source-machine-learning-frameworks-you-need-to-know-in-2020/