Syllabus for “Machine Learning and Empirical Applications in Economics”

Source: 南风金融网 | Author: 南风金融网 | Published: 2020-08-28 13:04:59

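Note: In the fall semester of 2019, an exploratory elective course on machine learning was offered at Shanghai University of Finance and Economics, with more than 20 students enrolled or auditing. The 11 weeks of class were not enough to complete the teaching plan, so the course ran until the end of the semester. During the spring semester and summer break of 2020, students were also organized online to keep reading the machine learning literature and to learn more advanced Python and machine learning skills. Only after a full year of exploration was a reasonably complete pass through machine learning and its applications finished. This year will follow the same model: the 11 weeks of planned classes will eventually become a year-long study program. More content at no extra cost. Master's and doctoral students and upper-level undergraduates are welcome to enroll or audit (classes are held in person), but a foundation in both Python programming and econometrics is required.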


Machine Learning and Empirical Applications in Economics

Syllabus

(Fall 2020)

I. Course Title

Machine Learning and Empirical Applications in Economics (Machine Learning Methods in Economics)

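II. Instructor

Associate Professor, Department of Investment, School of Public Economics and Administration, Shanghai University of Finance and Economics

Office: Fenghuang Building (凤凰楼) 521; E-mail: guo.feng@mail.shufe.edu.cn

Phone: 021-65903586

Homepage: http://www.guof1984.net/

WeChat official account: 经济数据勘探小分队 (guofeng0406)

Office hours: Tuesday 13:30-17:00, or by appointment

Teaching assistant: students who wish to enroll or audit should add the assistant, Zheng, on WeChat (15249263672)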

III. Course Type

Major elective

IV. Intended Audience

Doctoral and master's students

V. Time and Location

Time: Tuesday 18:00-20:35 (first class on September 8); Classroom: Teaching Building 4 (四教), Room 109

VI. Class Hours

3 × 11 = 33 class hours, 2 credits

VII. Prerequisites

Mathematical analysis; probability and statistics; intermediate econometrics; basic Python


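VIII. Course Objectives

Big data has become an important foundation of economic and financial activity and a focus of attention across disciplines. This course covers the basic principles of machine learning and their application to big-data analysis in economics. Students will come to understand the core ideas of machine learning, master the basic principles of representative algorithms in supervised learning, unsupervised learning, and natural language processing, and implement these algorithms in Python. By studying academic economics papers that use machine learning for empirical analysis, students will be able to apply the principles and algorithms learned in this course to empirical work in economics.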

IX. Assessment

Class attendance (20%), literature presentation (30%), reading notes (50%)


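X. Course Outline

Part 1: Lectures (21 class hours)


1. Principles of machine learning and their implications for empirical research in economics (3 class hours)

2. Supervised learning: K-nearest neighbors and Bayesian classification, principles and practice (3 class hours)

3. Natural language processing: word segmentation, TF-IDF, and text similarity, principles and practice (3 class hours)

4. Supervised learning: decision trees, random forests, and support vector machines, principles and practice (3 class hours)

5. Unsupervised learning: clustering and dimensionality reduction, principles and practice (3 class hours)

6. Deep learning: neural networks, principles and practice (3 class hours)

7. Natural language processing: LDA, Word2vec, and BERT models, principles and practice (3 class hours)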
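As a rough illustration of the kind of hands-on exercise that topic 3 (TF-IDF and text similarity) points to, the following is a minimal sketch and not the course's own material. It assumes a hypothetical toy corpus of English sentences that are already split into tokens (Chinese text would first need word segmentation), and it uses the plain textbook weighting tf = count / document length and idf = log(N / df). Libraries such as scikit-learn use smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (one common textbook variant):
    tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a document made only of ubiquitous terms has a zero vector
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, already split into tokens.
corpus = [
    "central bank raises interest rate".split(),
    "central bank cuts interest rate".split(),
    "satellite images predict poverty".split(),
]
vectors = tf_idf_vectors(corpus)
sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
print(f"doc0 vs doc1: {sim_01:.3f}")
print(f"doc0 vs doc2: {sim_02:.3f}")
```

The two monetary-policy sentences share four of their five terms and so score well above the unrelated third sentence, which shares none. Terms that appear in every document receive zero weight under this idf definition, which is why rare terms dominate the similarity score.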


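Part 2: Selected Readings in Machine Learning and Economics (12 class hours)

Enrolled and auditing students choose papers from the reading list below (or pick papers from their own field of interest) to study and present. Each student has about 45 minutes; if class time runs short, “overtime” sessions will be held after the course ends.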


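XI. Reading List

(Entries marked with * at the end are available for students to claim for their course presentations.)

1. Algorithm Textbooks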
[1] Burkov, A., The Hundred-Page Machine Learning Book, Quebec City, Can.: Andriy Burkov, 2019. (introductory)
[2] James, G., Witten, D., Hastie, T., and Tibshirani, R., An Introduction to Statistical Learning, Springer, 2013. (intermediate)
[3] Hastie, T., Tibshirani, R., and Friedman, J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2017. (advanced)

2. Surveys and Overviews
[4] Abadie, A., and Kasy, M., “Choosing among Regularized Estimators in Empirical Economics: The Risk of Machine Learning”, The Review of Economics and Statistics, 2018, 101(5), 743-762. (advanced)
[5] Athey, S., “Beyond Prediction: Using Big Data for Policy Problems”, Science, 2017, 355(6324), 483-485. (beginner)
[6] Athey, S., “The Impact of Machine Learning on Economics”, chapter in the NBER book The Economics of Artificial Intelligence: An Agenda, 2019, pp. 507-547, edited by Agrawal, A., Gans, J., and Goldfarb, A. (advanced)
[7] Athey, S., and Imbens, G., “Machine Learning Methods Economists Should Know About”, Annual Review of Economics, 2019, 11(1), 685-725. (advanced)
[8] Athey, S., and Luca, M., “Economists (and Economics) in Tech Companies”, Journal of Economic Perspectives, 2019, 33(1), 209-230. (beginner)
[9] Berger, J., Humphreys, A., Ludwig, S., Moe, W., Netzer, O., and Schweidel, D., “Uniting the Tribes: Using Text for Marketing Insight”, Journal of Marketing, 2020, 84(1), 1-25. (beginner)
[10] Glaeser, L. E., Kominers, D. S., Luca, M., and Naik, N., “Big Data and Big Cities: The Promises and Limitations of Improved Measures of Urban Life”, Economic Inquiry, 2018, 56(1), 114-137. (intermediate)
[11] Gentzkow, M., Kelly, T. B., and Taddy, M., “Text as Data”, Journal of Economic Literature, 2019, 57(3), 535-74. (advanced)
[12] Guo, R., Cheng, L., Hahn, R., and Liu, H., “A Survey of Learning Causality with Data: Problems and Methods”, arXiv:1809.09337, arXiv.org, 2020. (advanced)
[13] Igami, M., “Artificial Intelligence as Structural Estimation: Economic Interpretations of Deep Blue, Bonanza, and AlphaGo”, arXiv:1710.10967, arXiv.org, 2018. (advanced)
[14] Kleinberg, J., Ludwig, J., Mullainathan, S., and Obermeyer, Z., “Prediction Policy Problems”, American Economic Review, 2015, 105(5), 491-95. (intermediate)
[15] Loughran, T., and McDonald, B., “Textual Analysis in Accounting and Finance: A Survey”, Journal of Accounting Research, 2016, 54(4), 1187-1230. (intermediate)
[16] Mullainathan, S., and Spiess, J., “Machine Learning: An Applied Econometric Approach”, Journal of Economic Perspectives, 2017, 31(2), 87-106. (intermediate)
[17] Varian, H. R., “Big Data: New Tricks for Econometrics”, Journal of Economic Perspectives, 2014, 28(2), 3-28. (intermediate)
[18] Huang Naijing and Yu Mingzhe (黄乃静、于明哲), “机器学习对经济学研究的影响研究进展” [Research Progress on the Impact of Machine Learning on Economics Research], 《经济学动态》 (Economic Perspectives), 2018, No. 7, 115-129. (beginner)
[19] Shen Yan, Chen Yun, and Huang Zhuo (沈艳、陈赟、黄卓), “文本大数据分析在经济学和金融学中的应用:一个文献综述” [Applications of Textual Big Data Analysis in Economics and Finance: A Literature Review], 《经济学季刊》 (China Economic Quarterly), 2019, No. 4, 1153-1186. (beginner)
[20] Wang Fang, Wang Xuanyi, and Chen Shuo (王芳、王宣艺、陈硕), “经济学研究中的机器学习:回顾与展望” [Machine Learning in Economics Research: Review and Prospects], 《数量经济技术经济研究》 (The Journal of Quantitative & Technical Economics), 2020, No. 4, 146-164. (beginner)
3. Machine Learning and Causal Inference

[21] Athey, S., and Imbens, G., “Machine Learning Methods for Estimating Heterogeneous Causal Effects”, Statistics, 2015, 113(27), 7353-7360. (causal forests, advanced*)
[22] Athey, S., Tibshirani, J., and Wager, S., “Generalized Random Forests”, Annals of Statistics, 2019, 47(2), 1148-1178. (causal forests, advanced*)
[23] Athey, S., and Wager, S., “Estimating Treatment Effects with Causal Forests: An Application”, Working Paper, 2019. (causal forests, advanced*)
[24] Belloni, A., Chen, D., Chernozhukov, V., and Hansen, C., “Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain”, Econometrica, 2012, 80(6), 2369-2429. (Lasso for selecting instrumental variables, advanced*)
[25] Branson, Z., Rischard, M., Bornn, L., and Miratrix, L., “A Nonparametric Bayesian Methodology for Regression Discontinuity Designs”, Journal of Statistical Planning and Inference, 2019, 202, 14-30. (constructing control groups, regression discontinuity, intermediate*)
[26] Brodersen, K. H., Gallusser, F., Koehler, J., Remy, N., and Scott, S., “Inferring Causal Impact using Bayesian Structural Time-series Models”, Annals of Applied Statistics, 2015, 9(1), 247-274. (Bayesian structural time-series models, intermediate*)
[27] Chernozhukov, V., Demirer, M., Duflo, E., and Fernandez-Val, I., “Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments”, NBER Working Paper 24678, 2018. (heterogeneous causal effects, intermediate*)
[28] Chin, S., Kahn, E. M., and Moon, R. H., “Estimating the Gains from New Rail Transit Investment: A Machine Learning Tree Approach”, Real Estate Economics, 2018, 1-29. (heterogeneity, decision trees, intermediate*)
[29] Doudchenko, N., and Imbens, G. W., “Balancing, Regression, Difference-In-Differences and Synthetic Control Methods: A Synthesis”, NBER Working Paper No. 22791, 2016. (constructing control groups, DID, synthetic control, intermediate*)
[30] Egami, N., Fong, C., Grimmer, J., Roberts, M., and Stewart, M., “How to Make Causal Inferences Using Texts”, arXiv:1802.02163, 2018. (causal inference with text data, intermediate*)
[31] Gilchrist, D. S., and Sands, E. G., “Something to Talk About: Social Spillovers in Movie Consumption”, Journal of Political Economy, 2016, 124(5), 1339-1382. (Lasso for selecting instrumental variables, intermediate*)
[32] Guo, R., Cheng, L., Li, J., Hahn, R., and Liu, H., “A Survey of Learning Causality with Data: Problems and Methods”, ACM Trans, 2020, 9(4), Article 39. (survey, advanced)
[33] Hansen, C., and Kozbur, D., “Instrumental Variables Estimation with Many Weak Instruments using Regularized JIVE”, Journal of Econometrics, 2014, 182(2), 290-308. (ridge regression for selecting instrumental variables, advanced*)
[34] Herlands, W., McFowland, E., Wilson, A., and Neil, D., “Automated Local Regression Discontinuity Design Discovery”, KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2018, pp. 1512-1520, https://doi.org/10.1145/3219819.3219982. (constructing control groups, regression discontinuity, intermediate*)
[35] Kinn, D., “Synthetic Control Methods and Big Data”, arXiv:1803.00096, 2018. (constructing control groups, synthetic control, intermediate*)
[36] Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., and Mullainathan, S., “Human Decisions and Machine Predictions”, Quarterly Journal of Economics, 2018, 133(1), 237-293. (heterogeneous causal effects, LASSO-logit regression, intermediate*)
[37] Knaus, M., Lechner, M., and Strittmatter, A., “Heterogeneous Employment Effects of Job Search Programmes: A Machine Learning Approach”, Journal of Human Resources, 2020, forthcoming. (heterogeneity, Lasso, intermediate*)
[38] Knittel, C. R., and Stolper, S., “Using Machine Learning to Target Treatment: The Case of Household Energy Use”, NBER Working Papers, No. 26531, 2019. (heterogeneity, causal trees, intermediate*)
[39] Kreif, N., and DiazOrdaz, K., “Machine Learning in Policy Evaluation: New Tools for Causal Inference”, arXiv:1903.00402, 2019. (survey, intermediate)
[40] Kreif, N., Grieve, R., Díaz, I., and Harrison, D., “Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury”, Health Economics, 2015, 24(9), 1213-1228. (constructing control groups, matching, intermediate*)
[41] Li, S., Vlassis, N., Kawale, J., and Fu, Y., “Matching via Dimensionality Reduction for Estimation of Treatment Effects in Digital Marketing Campaigns”, Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016, 3768-3774. (constructing control groups, propensity score matching, intermediate*)
[42] Mozer, R., Miratrix, L., Kaufman, A. R., and Anastasopoulos, L. J., “Matching with Text Data: An Experimental Evaluation of Methods for Matching Documents and of Measuring Match Quality”, arXiv preprint, arXiv:1801.00644v7, 2019, available at https://arxiv.org/abs/1801.00644v7. (constructing control groups, matching with text data, intermediate*)
[43] Mühlbach, N. N., “Tree-based Synthetic Control Methods: Consequences of moving the US Embassy”, Institut for Økonomi, Aarhus Universitet, CREATES Research Papers, No. 2020-04, 2020. (constructing control groups, synthetic control with random forests, intermediate*)
[44] Peysakhovich, A., and Eckles, D., “Learning Causal Effects From Many Randomized Experiments Using Regularized Instrumental Variables”, arXiv:1701.01140v3, 2017. (multiple randomized experiments, intermediate*)
[45] Pham, T., and Shen, Y., “A Deep Causal Inference Approach to Measuring the Effects of Forming Group Loans in Online Non-profit Microfinance Platform”, arXiv preprint, arXiv:1706.02795, 2017, available at https://arxiv.org/abs/1706.02795. (neural networks for estimating counterfactuals, textual big data, intermediate*)
[46] Qiu, Y., Chen, X., and Shi, W., “Impacts of Social and Economic Factors on the Transmission of Coronavirus Disease 2019 (COVID-19) in China”, Journal of Population Economics, forthcoming. (LASSO for selecting instrumental variables, beginner*)
[47] Rischard, M., Branson, Z., Miratrix, L., and Bornn, L., “A Bayesian Nonparametric Approach to Geographic Regression Discontinuity Designs: Do School Districts Affect NYC House Prices?”, arXiv preprint, arXiv:1807.04516, 2018, available at https://arxiv.org/abs/1807.04516. (constructing control groups, geographic regression discontinuity, intermediate*)
[48] Roberts, M., Stewart, B., and Nielsen, R., “Adjusting for Confounding with Text Matching”, American Journal of Political Science, 2020, forthcoming. (constructing control groups, matching with textual big data, intermediate*)
[49] Sommervoll, A., and Sommervoll, D. E., “Learning from Man or Machine: Spatial Fixed Effects in Urban Econometrics”, Regional Science and Urban Economics, 2019, 77, 239-252. (machine learning algorithms for better aggregation of micro data)
[50] Wager, S., and Athey, S., “Estimation and Inference of Heterogeneous Treatment Effects using Random Forests”, Journal of the American Statistical Association, 2018, 113(523), 1228-1242. (causal forests, advanced*)
[51] Guo Feng and Tao Xuhui (郭峰、陶旭辉), “机器学习与社会科学中的因果关系:一个文献综述” [Machine Learning and Causality in the Social Sciences: A Literature Review], Working Paper, School of Public Economics and Administration, Shanghai University of Finance and Economics, 2020. (survey, intermediate)

4. Economic Forecasting
[52] Basuchoudhary, A., Bang, J., and Sen, T., Machine Learning Techniques in Economics: New Tools for Predicting Economic Growth, Springer, 2017. (multiple algorithms, intermediate)
[53] Björkegren, D., and Grissen, D., “Behavior Revealed in Mobile Phone Usage Predicts Loan Repayment”, Policy Research Working Paper Series 9074, The World Bank, 2019. (random forests, logistic regression, beginner)
[54] Goel, S., Rao, M. J., and Shroff, R., “Precinct or Prejudice? Understanding Racial Disparities in New York City’s Stop-and-Frisk Policy”, The Annals of Applied Statistics, 2016, 10(1), 365-394. (logistic regression, intermediate*)
[55] Gu, S., Kelly, B., and Xiu, D., “Empirical Asset Pricing via Machine Learning”, The Review of Financial Studies, 2020, 33(5), 2223-2273. (asset pricing, intermediate*)
[56] Kang, J. S., Kuznetsova, P., Luca, M., and Choi, Y., “Where Not to Eat? Improving Public Policy by Predicting Hygiene Inspections Using Online Reviews”, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1443-1448. (support vector machines, beginner)
[57] Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., and Ermon, S., “Combining Satellite Imagery and Machine Learning to Predict Poverty”, Science, 2016, 353(6301), 790-794. (satellite data, intermediate)
[58] Plakandaras, V., Papadimitriou, T., and Gogas, P., “Forecasting Daily and Monthly Exchange Rates with Machine Learning Techniques”, Journal of Forecasting, 2015, 34(7), 560-573. (time series, support vector machines, neural networks, intermediate)

5. Variable Generation
[59] Burgess, R., Hansen, M., Olken, A. B., Potapov, P., and Sieber, S., “The Political Economy of Deforestation in the Tropics”, The Quarterly Journal of Economics, 2012, 127(4), 1707-1754. (tree-bagging algorithm, intermediate*)
[60] Dubé, J., and Misra, S., “Scalable Price Targeting”, NBER Working Paper No. 23775, 2017. (Lasso regression, intermediate*)
[61] Galdo, V., Li, Y., and Rama, M., “Identifying Urban Areas by Combining Human Judgment and Machine Learning: An Application to India”, Journal of Urban Economics, 2019, forthcoming. (Lasso and random forests, calibrating official urbanization rates, intermediate*)
[62] Gründler, K., and Krieger, T., “Democracy and Growth: Evidence from a Machine Learning Indicator”, European Journal of Political Economy, 2016, 45, 85-107. (measuring democracy with ML, intermediate*)
[63] Edelman, B., Luca, M., and Svirsky, D., “Racial Discrimination in the Sharing Economy: Evidence from a Field Experiment”, American Economic Journal: Applied Economics, 2017, 9(2), 1-22. (identifying gender from images, intermediate*)

6. Text Similarity
[64] Iaria, A., Schwarz, C., and Waldinger, F., “Frontier Knowledge and Scientific Production: Evidence from the Collapse of International Science”, Quarterly Journal of Economics, 2018, 133(2), 927-991. (computing text similarity, advanced*)
[65] Kelly, B., Papanikolaou, D., Seru, A., and Taddy, M., “Measuring Technological Innovation over the Long Run”, NBER Working Paper No. 25266, 2018. (computing text similarity, intermediate*)

7. Text Sentiment Analysis
[66] Algaba, A., Ardia, D., Bluteau, K., Borm, S., and Boudt, K., “Econometrics Meets Sentiment: An Overview of Methodology and Applications”, Journal of Economic Surveys, 2020, 34(3), 512-547. (survey, advanced*)
[67] Antweiler, W., and Frank, M. Z., “Is All That Talk Just Noise? The Information Content of Internet Stock Message Boards”, The Journal of Finance, 2004, 59(3), 1259-1294. (naive Bayes algorithm, intermediate*)
[68] Li, J., Chen, Y., Shen, Y., Wang, J., and Huang, X., “Measuring China's Stock Market Sentiment”, Working Paper, 2019. (convolutional neural networks, intermediate*)
[69] Cookson, J. A., and Niessner, M., “Why Don't We Agree? Evidence from a Social Network of Investors”, Journal of Finance, 2020, 75(1), 173-228. (intermediate*)

8. LDA Topic Models
[70] Anders, T. L., “Words are the New Numbers: A Newsy Coincident Index of the Business Cycle”, Journal of Business & Economic Statistics, 2018, 1-35. (economic forecasting, intermediate*)
[71] Bandiera, O., Hansen, S., Prat, A., and Sadun, R., “CEO Behavior and Firm Performance”, Journal of Political Economy, 2020, 128(4), 1325-1369. (intermediate*)
[72] Hansen, S., McMahon, M., and Prat, A., “Transparency and Deliberation Within the FOMC: A Computational Linguistics Approach”, Quarterly Journal of Economics, 2018, 133(2), 801-870. (central bank communication, intermediate*)
[73] Mueller, H., and Rauh, C., “Reading Between the Lines: Prediction of Political Violence Using Newspaper Text”, American Political Science Review, 2018, 112(2), 358-375. (LDA topic model, advanced*)
[74] Larsen, H. V., “Components of Uncertainty”, Working Paper, Norges Bank Research, 2017, No. 5. (intermediate*)
[75] Wong, F., Wong, T. J., and Zhang, T., “Politics and Specificity of Information: Evidence from Financial Analysts’ Earnings Forecasts in a Relationship-based Economy”, Working Paper, 2018. (intermediate*)
[76] Wang Jingyi and Huang Yiping (王靖一、黄益平), “金融科技媒体情绪的刻画与对网贷市场的影响” [Characterizing Fintech Media Sentiment and Its Impact on the Online Lending Market], 《经济学季刊》 (China Economic Quarterly), 2018, 17(4), 1623-1650. (multiple algorithms, advanced*)

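9. Text Readability
[77] Chen Xiao, Ye Dezhu, and Deng Jie (陈霄、叶德珠、邓洁), “借款描述的可读性能够提高网络借款成功率吗” [Can the Readability of Loan Descriptions Raise the Success Rate of Online Borrowing?], 《中国工业经济》 (China Industrial Economics), 2018, No. 3, 174-192. (text readability, intermediate*)
[78] Qiu Xinying, Zheng Xiaocui, and Deng Kebin (丘心颖、郑小翠、邓可斌), “分析师能有效发挥专业解读信息的作用吗?基于汉字年报复杂性指标的研究” [Can Analysts Effectively Play Their Role as Professional Interpreters of Information? A Study Based on a Complexity Index for Chinese-Language Annual Reports], 《经济学季刊》 (China Economic Quarterly), 2016, 15(4), 1483-1506. (text readability, intermediate*)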

10. Unsupervised Learning
[79] Athey, S., Mobius, M., and Pál, J., “The Impact of Aggregators on Internet News Consumption”, Stanford University Graduate School of Business Research Paper No. 17-8, 2017. (dimensionality reduction, intermediate*)
[80] Qin, B., Stromberg, D., and Wu, Y., “Media Bias in China”, American Economic Review, 2018, 108(9), 2442-2476. (principal component analysis, intermediate*)

11. Other Natural Language Processing
[81] Baker, S., Bloom, N., and Davis, S., “Measuring Economic Policy Uncertainty”, Quarterly Journal of Economics, 2016, 131(4), 1593-1636. (policy uncertainty, intermediate*)
[82] Gentzkow, M., and Shapiro, J., “What Drives Media Slant? Evidence from U.S. Daily Newspapers”, Econometrica, 2010, 78(1), 35-71. (media bias, intermediate*)
[83] Gentzkow, M., Shapiro, J., and Taddy, M., “Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech”, Econometrica, 2019, 87(4), 1307-1340. (congressional speech, intermediate*)
[84] Hassan, T., Hollander, S., van Lent, L., and Tahoun, A., “Firm-Level Political Risk: Measurement and Effects”, The Quarterly Journal of Economics, 2019, 134(4), 2135-2202. (political risk, intermediate*)
[85] Hillert, A., Jacobs, H., and Müller, S., “Journalist Disagreement”, Journal of Financial Markets, 2018, 41, 57-76. (media disagreement, dictionary method, intermediate*)
[86] Hoberg, G., and Phillips, G., “Text-Based Network Industries and Endogenous Product Differentiation”, Journal of Political Economy, 2016, 124(5), 1423-1465. (reclassifying industries using text, intermediate*)
[87] Li, K., Mai, F., Shen, R., and Yan, X., “Measuring Corporate Culture Using Machine Learning”, Review of Financial Studies, 2020, forthcoming. (word embeddings, intermediate*)
[88] Manela, A., and Moreira, A., “News Implied Volatility and Disaster Concerns”, Journal of Financial Economics, 2017, 123, 137-162. (implied volatility, intermediate*)
[89] Shapiro, A. H., Sudhof, M., and Wilson, D., “Measuring News Sentiment”, Working Paper 2017-01, Federal Reserve Bank of San Francisco, 2020. (media sentiment, intermediate*)
[90] Wu, A., “Gender Bias in Rumors Among Professionals: An Identity-based Interpretation”, Review of Economics and Statistics, online publication, 2019. (textual big data, intermediate*)
[91] Zhong, W., and Chan, J. T., “Reading China: Predicting Policy Change with Machine Learning”, AEI Economics Working Paper Series, 2018-11, 2018. (index of policy changes implied by newspapers, word embeddings and recurrent neural networks; intermediate*)


