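[Computer Science] [2006] Design and Training of Support Vector Machines
This is a PhD thesis in electrical engineering from the University of Melbourne, Australia (author: Alistair Shilton), 364 pages in total.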
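Support vector machines (SVMs), first introduced by Vapnik in the early 1990s, have found applications in a wide variety of areas. As binary classifiers, SVMs have been used in face recognition, speaker identification and text categorisation, among others. As regressors, they have been used in control systems and communications. In addition to the standard SVM formulations of binary classification and regression, the thesis introduces a new form of SVM known as regression with inequalities. This extension encompasses both binary classification and regression, which reduces the workload when extending the general form, and it offers theoretical insight into the underlying connections between the two formulations.
This new SVM formulation is extended to cover general cost functions and general tube shrinking functions. Particular attention is paid to quadric (quadratic) cost functions, which have a number of pleasing properties not present in the traditional formulations. The quadric ν-SVR formulation presented in the thesis provides a novel combination of the useful features of the least-squares SVM and the standard SVM. It may be argued that the sheer generality of the SVM methodology is both a blessing and a curse. In particular, the great freedom allowed by the kernel function and the various parameters means that an SVM can be constructed to suit almost any situation. The downside is that choosing these parameters can be a long and time-consuming process.
Many attempts have been made to tackle this problem using different performance bounds, with varying degrees of success (for example, [22]). Of these, the work of Schölkopf and Smola stands apart, insofar as it represents an attempt to bridge the gap between the emerging field of support vector regression and the theoretically rich, well-explored area of maximum-likelihood estimation. The thesis extends this approach to cover a very general set of SVM formulations under a range of conditions, and explores the theoretical and practical issues involved.
As SVMs gained wider acceptance in the academic and industrial communities during the late 1990s, a need arose for incremental training algorithms that would allow SVMs to be applied in settings where the training set is not constant, for example adaptive control systems or the equalization of time-varying communications channels. The later part of the thesis addresses this problem.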
Support Vector Machines (SVMs), which were first introduced by Vapnik in the early 90s [28], have found applications in a wide variety of areas. As binary classifiers, SVMs have been used in face recognition [62], speaker identification [69] and text categorisation [49], to name a few. As regressors, they have been used in control systems [30] [89] and communications [77] [24], amongst others.
In this thesis I introduce a new and novel form of SVM known as regression with inequalities, in addition to the standard SVM formulations of binary classification and regression. This extension encompasses both binary classification and regression, reducing the workload when extending the general form; and also provides theoretical insight into the underlying connections between the two formulations. This new SVM formulation is extended to cover general cost functions and general tube shrinking functions. Particular attention has been paid to the quadric cost functions, which present a number of pleasing properties not present in the traditional formulations. The quadric ν-SVR formulation presented provides a novel combination of the useful features of the least-squares SVM and the standard SVM.
It may be argued that the sheer generality of the SVM methodology is both a blessing and a curse. In particular, the great freedom allowed by the kernel function and the various parameters means that one may construct an SVM to suit pretty much any situation. The downside of this, however, is that choosing these parameters can be a long and time-consuming process. Many attempts have been made to tackle this problem using different performance bounds with varying degrees of success (for example, [22]). Of these, the work of Schölkopf and Smola [76] stands apart, insofar as it represents an attempt to bridge the gap between the emerging field of support vector regression and the theoretically rich, well explored area of maximum-likelihood estimation. This thesis extends this approach significantly to cover a very general set of SVM formulations under a range of conditions, and explores the theoretical and practical issues involved.
As SVMs gained wider acceptance in the academic and industrial communities during the late 90s, there arose a need for incremental training algorithms, which would allow application of SVMs in settings where the training set was non-constant, for example, adaptive control systems or equalization of time-varying communications channels. Our work on this problem is presented in the later chapters of this thesis.
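The abstract contrasts the traditional SVM regression cost with a quadric (quadratic) one but gives no formulas. As a rough illustration only, the sketch below compares the textbook linear ε-insensitive cost with a squared variant. The function names, the factor of 1/2 and the sample values are assumptions made for this example; the thesis's own definitions may differ.

```python
# Illustrative sketch only: textbook epsilon-insensitive costs, not the
# thesis's exact formulation. All names here are hypothetical.

def eps_insensitive_linear(residual: float, eps: float) -> float:
    """Zero inside the tube |residual| <= eps, linear in the excess outside it."""
    return max(0.0, abs(residual) - eps)


def eps_insensitive_quadratic(residual: float, eps: float) -> float:
    """Zero inside the tube, half the squared excess outside it (assumed scaling)."""
    excess = max(0.0, abs(residual) - eps)
    return 0.5 * excess * excess


if __name__ == "__main__":
    eps = 0.5  # assumed tube half-width
    for r in (-2.0, -0.5, 0.0, 0.25, 1.0, 2.0):
        lin = eps_insensitive_linear(r, eps)
        quad = eps_insensitive_quadratic(r, eps)
        print(f"residual={r:+.2f}  linear={lin:.3f}  quadratic={quad:.3f}")
```

In this sketch, setting `eps` to zero reduces the quadratic version to an ordinary least-squares cost, while a positive `eps` keeps the insensitive tube of standard support vector regression. That is one informal way to read the abstract's remark about combining features of the least-squares SVM and the standard SVM.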
Download link for the original English text:
http://page2.dfpan.com/fs/dlcj82217291c655802/