Posted by: DMIR  Date: 2016-06-18
Talk title: Large-scale linear classification: status and challenges
Speaker: Dr. Chih-Jen Lin, Distinguished Professor, National Taiwan University
Chih-Jen Lin is currently a distinguished professor in the Department of Computer Science, National Taiwan University. He obtained his B.S. degree from National Taiwan University in 1993 and his Ph.D. degree from the University of Michigan in 1998. His major research areas include machine learning, data mining, and numerical optimization. He is best known for his work on support vector machines (SVM) for data classification. His software LIBSVM is one of the most widely used and cited SVM packages. For his research work he has received many awards, including the ACM KDD 2010 and ACM RecSys 2013 best paper awards. He is an IEEE fellow, an AAAI fellow, and an ACM fellow for his contributions to machine learning algorithms and software design. More information about him can be found at http://www.csie.ntu.edu.tw/~cjlin.
Many classification techniques, such as kernel methods and decision trees, are nonlinear approaches. However, linear methods, which use a simple weight vector as the model, remain very useful for many applications. With careful feature engineering and data in a rich dimensional space, their performance may be competitive with that of a highly nonlinear classifier. Successful application areas include document classification and computational advertising. In the first part of this talk, we give an overview of linear classification by introducing commonly used formulations. We discuss optimization techniques developed in our linear-classification package LIBLINEAR for fast training. The flexibility over kernel methods in selecting and employing optimization methods can be clearly seen in our discussion. In the second part of the talk, we select a few examples to demonstrate how linear classification is applied in practice. They range from small to big data. The third part of the talk discusses issues in applying linear classification to big-data analytics. In particular, we demonstrate our recent work on multi-core and distributed linear classification.