
An Abundance of Machine Learning Tools

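When it comes to training computers to act without being explicitly programmed, there is an abundance of tools from the field of machine learning. Academics and industry professionals use these tools to build applications ranging from speech recognition to cancer detection in MRI scans. Many of these tools are freely available on the web. If you are interested, I have compiled a ranking of them (see the table further down this page), along with an overview of some important features that set them apart. These include a description of each tool taken from its home page, the machine learning paradigm it focuses on, and some notable uses in academia and industry.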

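Researchers may use many different libraries at once, write their own, or not cite any particular tool, so it is difficult to quantify the relative adoption of each library. Instead, the search rank reflects the relative volume of Google searches for each tool in the month of May. The score does not measure widespread adoption, but it gives a good indication of which tools are being used. Note: ambiguous names such as "Caffe" were evaluated as "Caffe Machine Learning" to make them less ambiguous.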
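As a rough illustration of how a relative score of this kind can be derived, the sketch below scales raw search volumes so that the most-searched tool receives 100. The volumes, the tool names, and the function name `relative_search_rank` are hypothetical and made up for this example; the article does not state the exact formula that was used.

```python
def relative_search_rank(volumes):
    """Scale raw search volumes to 0-100, relative to the largest volume."""
    top = max(volumes.values())
    if top <= 0:
        raise ValueError("at least one tool needs a positive search volume")
    return {tool: round(100 * v / top) for tool, v in volumes.items()}


# Hypothetical monthly search volumes, NOT real measurements.
example_volumes = {"Tool A": 5000, "Tool B": 3900, "Tool C": 250}
ranks = relative_search_rank(example_volumes)

# Print the tools from most to least searched.
for tool, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{score:>3} {tool}")
```

A score built this way only orders the tools relative to one another, which is why it should not be read as an absolute measure of adoption.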

 

 

Machine Learning Tools Overview

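I have distinguished between two machine learning sub-fields, Deep Learning and Shallow Learning, a split that has become important over the past few years. Deep Learning is responsible for record results in image classification and speech recognition, and it is therefore spearheaded by large data companies such as Google, Facebook, and Baidu. Shallow Learning, by contrast, covers a variety of less cutting-edge classification, clustering, and boosting techniques, such as Support Vector Machines. Shallow learning methods are still widely used in fields such as natural language processing, brain-computer interfaces, and information retrieval.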

Detailed Comparison of Machine Learning Packages and Libraries

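The table below also records each tool's support for GPUs. GPU interfacing has become an important feature for machine learning tools because it can accelerate large-scale matrix operations. Its importance to deep learning methods is apparent. For example, at the GPU Technology Conference in early May 2015, 39 of the 45 talks under machine learning were about GPU-accelerated deep learning applications, and these came from 31 major tech companies and 8 universities. This appeal reflects the massive speed-up that GPU-assisted training gives Deep Networks, which is why GPU support is an important feature.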
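To see why matrix operations benefit from GPUs, consider a minimal pure-Python sketch of a matrix product, the core operation of a neural network layer. Every output entry is an independent dot product, so a GPU can compute many of them in parallel. The function name `matmul` and the tiny matrices are illustrative assumptions; real tools call optimized CUDA or OpenCL kernels rather than Python loops.

```python
def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix, both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    # Each out[i][j] depends only on row i of a and column j of b,
    # so all n * m entries could be computed at the same time.
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


# Hypothetical layer: 2 samples with 2 features, mapped to 3 units.
inputs = [[1.0, 2.0], [3.0, 4.0]]
weights = [[0.5, -1.0, 0.0], [1.5, 2.0, 1.0]]
activations = matmul(inputs, weights)
print(activations)
```

The work grows as n * k * m multiply-adds, so for the large layers used in deep networks the number of independent operations quickly reaches the millions, which is the regime where parallel hardware pays off.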

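Information is also provided on each tool's ability to distribute computation across a cluster through Hadoop or Spark. This has become an important talking point for shallow learning techniques, which are well suited to distributed computation. Likewise, distributed computation for Deep Networks has become a talking point as new techniques have been developed for distributed training algorithms.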
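One reason many shallow methods distribute well is that their gradients are sums over training examples, so each worker can process its own shard of the data and the partial results can simply be added. The sketch below imitates this map-and-reduce pattern in a single process for least-squares regression with one weight. The data, the two-shard split, and the function names are hypothetical, and no Hadoop or Spark API is used.

```python
def partial_gradient(shard, w):
    """Map step: gradient of the summed squared error on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard)


def distributed_gradient(shards, w):
    """Reduce step: add the per-shard gradients, then average over all points."""
    total = sum(partial_gradient(shard, w) for shard in shards)
    count = sum(len(shard) for shard in shards)
    return total / count


# Hypothetical data following y = 2x, split as if stored on two machines.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]

# Plain gradient descent using the combined gradient.
w = 0.0
for _ in range(200):
    w -= 0.01 * distributed_gradient(shards, w)
print(round(w, 3))
```

Because the combined gradient equals the gradient computed on the full data set, the split changes where the work happens but not the result of training.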

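Finally, some supplementary notes are included on the varied use of these tools in academia and industry. This information was gathered by searching machine learning publications, presentations, and distributed code. Researchers at Google, Facebook, and Oracle also helped with some of the information; many thanks to Greg Mori, Adam Pocock, and Ronan Collobert.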

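The results of this research show that many tools are currently in use, and it is not yet certain which one will win the lion's share of use in industry or academia.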

| Search Rank | Tool | Language | Type | Description (quoted) | Use | GPU acceleration | Distributed computing |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 100 | Theano | Python | Library | Library for efficient numerical computation on multi-dimensional arrays | Deep and Shallow Learning | CUDA and OpenCL, cuDNN | Cutorch |
| 78 | Torch 7 | Lua | Framework | Scientific computing framework with wide support for machine learning algorithms | Deep and Shallow Learning | CUDA and OpenCL, cuDNN | Cutorch |
| 64 | R | R | Environment/Language | Functional language and environment for statistics | Shallow Learning | RPUD | HiPLAR |
| 52 | LIBSVM | Java and C++ | Library | A Library for Support Vector Machines | Support Vector Machines | CUDA | Not Yet |
| 34 | scikit-learn | Python | Library | Machine Learning in Python | Shallow Learning | Not Yet | Not Yet |
| 28 | Spark MLlib | C++, APIs in Java and Python | Library/API | Apache Spark's scalable machine learning library | Shallow Learning | ScalaCL | Spark and Hadoop |
| 24 | Matlab | Matlab | Environment/Language | High-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numerical analysis | Deep and Shallow Learning | Parallel Computing Toolbox (not free, not open source) | Distributed Computing Package (not free, not open source) |
| 18 | Pylearn2 | Python | Library | Machine Learning | Deep Learning | CUDA and OpenCL, cuDNN | Not Yet |
| 14 | Vowpal Wabbit | C++ | Library | Out-of-core learning system | Shallow Learning | CUDA | Not Yet |
| 13 | Caffe | C++ | Framework | Deep learning framework made with expression, speed, and modularity in mind | Deep Learning | CUDA and OpenCL, cuDNN | Not Yet |
| 11 | LIBLINEAR | Java and C++ | Library | A Library for Large Linear Classification | Support Vector Machines and Logistic Regression | CUDA | Not Yet |
| 6 | Mahout | Java | Environment/Framework | An environment for building scalable algorithms | Shallow Learning | JCUDA | Spark and Hadoop |
| 5 | Accord.NET | .Net | Framework | Machine learning | Deep and Shallow Learning | CUDA.net | Not Yet |
| 5 | NLTK | Python | Library | Programs to work with human language data | Text Classification | scikits.cuda | Not Yet |
| 4 | Deeplearning4j | Java | Framework | Commercial-grade, open-source, distributed deep-learning library | Deep and Shallow Learning | JCublas | Spark and Hadoop |
| 4 | Weka 3 | Java | Library | Collection of machine learning algorithms for data mining tasks | Shallow Learning | Not Yet | Distributed Weka Spark |
| 4 | MLPY | Python | Library | Machine Learning | Shallow Learning | scikits.cuda | Not Yet |
| 3 | Pandas | Python | Library | Data analysis and manipulation | Shallow Learning | scikits.cuda | Not Yet |
| 1 | H2O | Java, Python and R | Environment/Language | Open source predictive analytics platform | Deep and Shallow Learning | Not Yet | Spark and Hadoop |
| 0 | cuda-convnet | C++ | Library | Machine learning library for neural-network applications | Deep Neural Networks | CUDA | Coming in cuda-convnet2 |
| 0 | Mallet | Java | Library | Package for statistical natural language processing | Shallow Learning | JCUDA | Spark and Hadoop |
| 0 | JSAT | Java | Library | Statistical Analysis Tool | Shallow Learning | JCUDA | Spark and Hadoop |
| 0 | MultiBoost | C++ | Library | Machine Learning | Boosting Algorithms | CUDA | Not Yet |
| 0 | Shogun | C++ | Library | Machine Learning | Shallow Learning | CUDA | Not Yet |
| 0 | MLPACK | C++ | Library | Machine Learning | Shallow Learning | CUDA | Not Yet |
| 0 | DLIB | C++ | Library | Machine Learning | Shallow Learning | CUDA | Not Yet |
| 0 | Ramp | Python | Library | Machine Learning | Shallow Learning | scikits.cuda | Not Yet |
| 0 | Deepnet | Python | Library | GPU-based Machine Learning | Deep Learning | CUDA | Not Yet |
| 0 | CUV | Python | Library | GPU-based Machine Learning | Deep Learning | CUDA | Not Yet |
| 0 | APRIL-ANN | Lua | Library | Machine Learning | Deep Learning | Not Yet | Not Yet |
| 0 | nnForge | C++ | Framework | GPU-based Machine Learning | Convolutional and fully-connected neural networks | CUDA | Not Yet |
| 0 | PYML | Python | Framework | Object oriented framework for machine learning | SVMs and other kernel methods | scikits.cuda | Not Yet |
| 0 | Milk | Python | Library | Machine Learning | Shallow Learning | scikits.cuda | Not Yet |
| 0 | MDP | Python | Library | Machine Learning | Shallow Learning | scikits.cuda | Not Yet |
| 0 | Orange | Python | Library | Machine Learning | Shallow Learning | scikits.cuda | Not Yet |
| 0 | PYMVPA | Python | Library | Machine Learning | Classification only | scikits.cuda | Not Yet |
| 0 | Monte | Python | Library | Machine Learning | Shallow Learning | scikits.cuda | Not Yet |
| 0 | RPY2 | Python to R | API | Low-level interface to R | Shallow Learning | scikits.cuda | Not Yet |
| 0 | NeuroLab | Python | Library | Machine Learning | Feed Forward Neural Networks | scikits.cuda | Not Yet |
| 0 | PythonXX | Python | Library | Machine Learning | Shallow Learning | scikits.cuda | Not Yet |
| 0 | Hcluster | Python | Library | Machine Learning | Clustering Algorithms | scikits.cuda | Not Yet |
| 0 | FYANN | C | Library | Machine Learning | Feed Forward Neural Networks | Not Yet | Not Yet |
| 0 | PyANN | Python | Library | Machine Learning | Nearest Neighbours Classification | Not Yet | Not Yet |
| 0 | FFNET | Python | Library | Machine Learning | Feed Forward Neural Networks | Not Yet | Not Yet |

Help Us Build Bridges to Neuromemristive Processors

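Knowm Inc is focused on developing neuromemristive processors such as kT-RAM. Machine learning pioneers like Geoffrey Hinton know very well that machine learning is fundamentally tied to computational power. We call this the adaptive power problem, and to solve it we need new tools to usher in the next wave of intelligent machines. While GPUs have (finally!) enabled us to demonstrate learning algorithms that approach human level on some tasks, they are still about one billion to ten billion times less energy and space efficient than biology. We are taking that gap to zero.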

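We are interested in knowing which people, frameworks, and algorithms are most useful for solving real-world machine learning problems, so that we can focus our efforts on building bridges to kT-RAM and the KnowmAPI. Please leave a comment below or contact us to let us know.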

Misc. References

  1. Bryan Catanzaro, Senior Researcher, Baidu. “Speech: The Next Generation” 05/28/2015 Talk given @ GPUTech conference 2015

  2. Dhruv Batra. “CloudCV: Large-Scale Distributed Computer Vision as a Cloud Service” 05/28/2015 Talk given @ GPUTech conference 2015

  3. Dilip Patolla. “A GPU based Satellite Image Analysis Tool” 05/28/2015 Talk given @ GPUTech conference 2015

  4. Franco Mana. “A High-Density GPU Solution for DNN Training” 05/28/2015 Talk given @ GPUTech conference 2015

  5. Hailin Jin. “Collaborative Feature Learning from Social Media” 05/28/2015 Talk given @ GPUTech conference 2015

  6. Noel, Cyprian & Simon Osindero. “S5552 – Transparent Parallelization of Neural Network Training” 05/28/2015 Talk given @ GPUTech conference 2015

  7. Rob Fergus. “S5581 – Visual Object Recognition using Deep Convolution Neural Networks” 05/28/2015 Talk given @ GPUTech conference 2015

  8. Rodrigo Benenson. “Machine Learning Benchmark Results: MNIST” 05/28/2015

  9. Rodrigo Benenson. “Machine Learning Benchmark Results: CIFAR” 05/28/2015

  10. Tom Simonite. “Baidu’s Artificial-Intelligence Supercomputer Beats Google at Image Recognition” 05/28/2015

 

 
