A relatively objective look at machine learning vs. statistics


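I certainly couldn't write anything this impressive myself, but below are some excerpts from it along with my reactions.
The original post is at: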
http://brenocon.com/blog/2008/12/statistics-vs-machine-learning-fight/



## A comparison chart

[Comparison chart from the original post, not reproduced here.]



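## As the post says earlier: machine learning is largely built on the probability theory of statistics...
## Below is the point I find most apt, especially the last few sentences.
## In practice, the two sides have different goals.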
I’ll also note that there are definitely a number of topics in ML that aren’t very related to statistics or probability. Max-margin methods: if all we care about is prediction, why bother using a probability model at all? Why not just optimize the spatial geometry instead? SVM’s don’t require a lick of probability theory to understand. (Of course probability-based approaches are huge in ML, but it’s important to remember they’re not the only game in town, and there is no necessary reason they must be.) And then there are non-traditional settings such as online learning, reinforcement learning, and active learning, where the structure of access to information is in play. There are certainly plenty of things in statistics that aren’t considered part of ML — say, regression diagnostics and significance testing. Finally, many ML problems involve large, high dimensional data and models, where computational issues are very important. For example, in statistical machine translation, alignment models are described with probability theory and fit to data, but their structure is complex enough that optimal inference is intractable, and how you do approximate inference (EM, Viterbi, beam search, etc.) is a very major issue.
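The max-margin remark in the excerpt above can be made concrete with a small sketch. The code below fits a linear classifier by minimizing the L2-regularized hinge loss with plain subgradient descent, and at no point does it define a probability model. Everything here is an illustrative assumption rather than anything from the quoted post: the function names (`hinge_loss`, `train_linear_svm`, `predict`), the toy data, and the hyperparameters are all made up, and subgradient descent is just one simple way to optimize this objective.

```python
import random


def hinge_loss(w, b, X, y, lam):
    """Mean hinge loss plus an L2 penalty on w; labels in y are +1 or -1."""
    margins = [
        yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        for xi, yi in zip(X, y)
    ]
    data_term = sum(max(0.0, 1.0 - m) for m in margins) / len(X)
    return data_term + 0.5 * lam * sum(wj * wj for wj in w)


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi, yi = X[i], y[i]
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            # The L2 penalty always contributes lam * w to the subgradient.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge term contributes only when the margin is below 1.
            if yi * score < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Classify by which side of the hyperplane x falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data.
X = [(2.0, 2.0), (3.0, 1.5), (2.5, 3.0), (-2.0, -1.5), (-3.0, -2.0), (-1.5, -2.5)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print("weights:", w, "bias:", b)
print("predictions:", [predict(w, b, x) for x in X])
```

The only quantities involved are distances to a separating hyperplane, which is the sense in which the excerpt talks about optimizing "the spatial geometry".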

This point is also quite interesting:
I think this is reflective of the differences in institutional culture between CS and Stats. There’s an interesting John Langford post on part of the issue, which he calls “The Stats Handicap”. He points out that stats Ph.D.’s have a big disadvantage in the job market because statistics has an old-school journal-oriented publishing culture, so students publish much less and have less experience engaging with a research community. CS is conference-oriented (certain conferences have a higher prestige than many journals, e.g. NIPS in ML, CHI in HCI), and this results in faster turnaround, dissemination, and collaboration. (I’ve heard others make similar comparisons between CS and psychology.) I’d expect any discipline with a larger conference emphasis to have better courses since they should reward presentation/teaching skills, or at least encourage practice, more than in journal world.

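## Doing data mining with machine learning algorithms (many of which, of course, were refined on the basis of statistical theory)
## Below are some views on statistics and data mining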
Another issue is the definition of statistics itself. In 1997, Jerome Friedman wrote an extremely interesting analysis of the situation: “Data Mining and Statistics: What’s the Connection?”. He points out, quite correctly, the statistical impoverishment of some common approaches to data mining. You can certainly blame statistics for not marketing its ideas well enough, or blame CS for ignoring statistics.

## Some views below: statisticians have taken quite a beating, so here is how they might find some Ah Q-style consolation.
That is not to say statistics is not important; it’s incredibly important. He quotes Efron (the main contributor to bootstrapping in statistics) as saying “Statistics has been the most successful information science.” However, information science is becoming bigger and broader and more exciting, thanks to computation and ever-increasing amounts of data. What should statisticians do? Friedman continues (light editing and emphasis is mine):


One view says that our field should concentrate on that small part of information science that we do best, namely probabilistic inference based on mathematics. If this view is adopted, we should become resigned to the fact that the role of Statistics as a player in the “information revolution” will steadily diminish over time.

Another point of view holds that statistics ought to be concerned with data analysis. The field should be defined in terms of a set of problems — rather than a set of tools — that pertain to data. Should this point of view ever become the dominant one, a big change would be required in our practice and academic programs.
First and foremost, we would have to make peace with computing. It’s here to stay; that’s where the data is. This has been one of the most glaring omissions in the set of tools that have so far defined Statistics. Had we incorporated computing methodology from its inception as a fundamental statistical tool (as opposed to simply a convenient way to apply our existing tools) many of the other data related fields would not have needed to exist. They would have been part of our field.

Friedman wrote this article more than 10 years ago. All his observations about the importance and increasing prevalence of data and computing power are even more true today than back then. Has the field of statistics changed? Not clear. (I’d appreciate seeing evidence to the contrary.)


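## To sum up, and speaking honestly: econometrics in the University of Auckland's economics department also works as a kind of "quasi-statistical analysis".
## By "quasi-statistical analysis" I mean this: the statistics department is where you learn why the methods work, but the other departments are all using them, and they hand you relevant data and show you how to apply them.
## Statistics at Auckland often disappoints people. They expect it to be taught like actuarial science in Australia, all probability models, or like statistics in China, mostly mathematics.
## It is neither! Statistics at Auckland now mainly serves biology, medicine, and other natural sciences. If you want statistics with a social-science bent, you are better off escaping early and choosing economics, sociology, or psychology (though psychology at Auckland actually leans more toward brain and cognitive science).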
I know that I’m interested in quantitative information science, including statistics and data analysis. Machine learning has many strengths, but it is definitely an odd way to go about analysis. But there’s a good case that statistics, as traditionally defined, is only going to have a smaller role in the future. “Data mining” sounds more relevant, but does it even exist as a coherent subject? Maybe it’s time to study a more applied statistical field like econometrics.


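Comment
Below is some discussion from students who are in neither computing nor statistics but who use both. That is more convincing than the one-sided logic of a statistics student praising statistics and a CS student praising CS, each talking up their own wares.

Chemometrics: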
I come from yet another closely related field: chemometrics which is usually defined as applying statistics to chemical problems/data. Never heard machine learning in the place of statistics here. But chemometrics is heavily focused on prediction (also DoE, but far less about hypothesis testing)

I don't think it is fair to exclude prediction from statistics.  

I rather see a difference in the approach (Ahmed's culture): My guess would be that machine learning is maybe more pragmatic than "pure statistics": if machine learning has an algorithm that solves a problem that's good. Statisticians tend to want thorough theoretical foundations as well. Chemometrics would also be more on the pragmatic side.
(Source: personal experience with chemometrics, where e.g. partial least squares regression has an extremely successful track of records for some 30 years now, including industrial application. Statistics now start to take the approach seriously because finally some statisticians bothered to have a look at the mathematical properties - before it was just an algorithm that happened to work very well with the chemometric data sets).
