• Basic Knowledge (Part 1)


    Great, I collected some more good material (2014-11-05)

    1 http://www.huxiu.com/article/6550/1.html

    2 http://blog.csdn.net/lzt1983/article/details/7696578

    3 https://code.google.com/p/recsyscode/

    4 http://www.lifecrunch.biz/

    5 http://iamcaihuafeng.blog.sohu.com/150048878.html

    6 我爱自然语言 (the "I Love Natural Language" blog)

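    KDD Cup 2012

    Track 1 task: personalized recommendation in a social network

    Using user attributes (User Profile) from Tencent Weibo, SNS social relationships, interaction records in the social network (retweet, comment, at), and the item recommendation history of the past 30 days, predict the list of recommended items that a user is most likely to accept next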

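    Track 2 task: pCTR (predicted click-through rate) estimation for a search advertising system

    Given users' query terms on Tencent's search engine, the displayed ad information (ad title, description, URL, etc.), each ad's relative position (its rank among multiple ads), users' click records, and attribute information for advertisers and users, predict users' ad clicks in a subsequent time period

    Dataset: http://www.kddcup2012.org/c/kddcup2012-track1/data

    Papers: http://www.kddcup2012.org/workshop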

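    KDD Cup 2011

    Track 1 task: music rating prediction

    Using users' historical ratings of items on Yahoo! Music, predict their ratings of other items (songs, albums, etc.). Predictions are scored by RMSE (root mean squared error) against the actual ratings. The album, artist, and genre of each song are also provided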
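    The RMSE metric named for Track 1 can be sketched in a few lines of Python. This is a minimal illustration, not the competition's official scorer; the function name `rmse` and the sample ratings are assumptions made up for the example.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Sum the squared differences, average them, then take the square root.
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


# Hypothetical ratings, invented for illustration only.
predicted_ratings = [80.0, 50.0, 90.0, 30.0]
actual_ratings = [70.0, 50.0, 100.0, 30.0]
print(rmse(predicted_ratings, actual_ratings))
```

    A lower RMSE means the predicted ratings sit closer to the actual ones; because errors are squared, a few large misses cost more than many small ones.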

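    Track 2 task: identify whether a user has rated a song

    For each user, six candidate songs are provided: three that the user has rated, and three that the user has not rated but that are drawn from songs that users overall rated highly. Song attributes (album, artist, genre, etc.) are provided as well. Participants submit a binary (0/1) classification, and the final ranking is computed from overall accuracy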
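    The Track 2 setup (six candidates per user, three of them positive, scored by accuracy) can be sketched as follows. This is only an illustration under stated assumptions: the helper names `label_top_three` and `accuracy` and the model scores are hypothetical, and labelling the three highest-scoring candidates as 1 is one simple way to exploit the known 3/3 split, not a method taken from the competition.

```python
def label_top_three(scores):
    """Label the three highest-scoring candidates 1 and the rest 0."""
    # Rank candidate indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:3])
    return [1 if i in top else 0 for i in range(len(scores))]


def accuracy(predicted_labels, true_labels):
    """Fraction of labels that match the ground truth."""
    if len(predicted_labels) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, t in zip(predicted_labels, true_labels) if p == t)
    return correct / len(true_labels)


# Hypothetical model scores for one user's six candidate songs.
scores = [0.9, 0.7, 0.2, 0.1, 0.3, 0.8]
true_labels = [1, 1, 1, 0, 0, 0]

predicted = label_top_three(scores)
print(predicted, accuracy(predicted, true_labels))
```

    Here the third song is a rated song that receives a low score and the sixth is an unrated song that receives a high one, so two of the six labels are wrong.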

    Dataset: http://kddcup.yahoo.com/datasets.php#

    Papers: http://kddcup.yahoo.com/workshop.php

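    KDD Cup 2009

    The large-scale data of the French telecom operator Orange contains extensive records of customer behavior. Contestants must design a good customer relationship management (CRM) system that uses fast, robust methods to predict three customer attributes: 1. Loyalty: the likelihood that a customer switches operators (Churn); 2. Purchase intent: the likelihood of buying a new service (Appetency); 3. Up-sell potential: the likelihood that a customer upgrades or buys additional high-margin products (Up-selling). Results are evaluated by AUC (the area under the ROC curve)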
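    AUC, the metric named above, equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it directly from that definition by comparing every positive/negative pair; the function name `auc` and the sample scores are assumptions for illustration, and the quadratic loop is meant for clarity rather than for large datasets.

```python
def auc(scores, labels):
    """Pairwise AUC: P(score of a positive > score of a negative), ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical churn scores and labels (1 = churned), invented for illustration.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
print(auc(scores, labels))
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why it suits tasks such as churn prediction where the positive class is rare.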

    Dataset: http://www.sigkdd.org/kddcup/index.php

    Papers: http://jmlr.csail.mit.edu/proceedings/papers/v7/

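    Below are the reference links I collected, roughly in the format "URL + title + page number where it appears in the book". Some links may require getting past the "wall" (the firewall), and for references cited more than once I did not repeat the link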
       
       
      http://en.wikipedia.org/wiki/Information_overload 
       P1 
       
      http://www.readwriteweb.com/archives/recommender_systems.php 
      (A Guide to Recommender System) P4 
       
      http://en.wikipedia.org/wiki/Cross-selling 
       (Cross Selling) P6 
       
      http://blog.kiwitobes.com/?p=58 , http://stanford2009.wikispaces.com/ 
      (Course: Data Mining and E-Business: The Social Data Revolution) P7 
       
       http://thesearchstrategy.com/ebooks/an%20introduction%20to%20search%20engines%20and%20web%20navigation.pdf 
      (An Introduction to Search Engines and Web Navigation) p7 
       
      http://www.netflixprize.com/ 
      p8 
       
      http://cdn-0.nflximg.com/us/pdf/Consumer_Press_Kit.pdf 
       p9 
       
       http://stuyresearch.googlecode.com/hg-history/c5aa9d65d48c787fd72dcd0ba3016938312102bd/blake/resources/p293-davidson.pdf 
      (The YouTube Video Recommendation System) p9 
       
       http://www.slideshare.net/plamere/music-recommendation-and-discovery 
      ( PPT: Music Recommendation and Discovery) p12 
       
      http://www.facebook.com/instantpersonalization/ 
      P13 
       
       http://about.digg.com/blog/digg-recommendation-engine-updates 
       (Digg Recommendation Engine Updates) P16 
       
       http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/36955.pdf 
       (The Learning Behind Gmail Priority Inbox)p17 
       
      http://www.grouplens.org/papers/pdf/mcnee-chi06-acc.pdf 
      (Being Accurate is Not Enough: How Accuracy Metrics have hurt Recommender Systems) P20 
       
      http://www-users.cs.umn.edu/~mcnee/mcnee-cscw2006.pdf 
       (Don’t Look Stupid: Avoiding Pitfalls when Recommending Research Papers)P23 
       
      http://www.sigkdd.org/explorations/issues/9-2-2007-12/7-Netflix-2.pdf 
       (Major components of the gravity recommender system) P25 
       
      http://cacm.acm.org/blogs/blog-cacm/22925-what-is-a-good-recommendation-algorithm/fulltext 
      (What is a Good Recommendation Algorithm?) P26 
       
      http://research.microsoft.com/pubs/115396/evaluationmetrics.tr.pdf 
       (Evaluating Recommendation Systems) P27 
       
      http://mtg.upf.edu/static/media/PhD_ocelma.pdf 
      (Music Recommendation and Discovery in the Long Tail) P29 
       
      http://ir.ii.uam.es/divers2011/ 
      (International Workshop on Novelty and Diversity in Recommender Systems) p29 
       
      http://www.cs.ucl.ac.uk/fileadmin/UCL-CS/research/Research_Notes/RN_11_21.pdf 
      (Auralist: Introducing Serendipity into Music Recommendation ) P30 
       
      http://www.springerlink.com/content/978-3-540-78196-7/#section=239197&page=1&locus=21 
      (Metrics for evaluating the serendipity of recommendation lists) P30 
       
      http://dare.uva.nl/document/131544 
      (The effects of transparency on trust in and acceptance of a content-based art recommender) P31 
       
      http://brettb.net/project/papers/2007%20Trust-aware%20recommender%20systems.pdf 
       (Trust-aware recommender systems) P31 
       
      http://recsys.acm.org/2011/pdfs/RobustTutorial.pdf 
      (Tutorial on robustness of recommender systems) P32 
       
      http://youtube-global.blogspot.com/2009/09/five-stars-dominate-ratings.html 
       (Five Stars Dominate Ratings) P37 
       
      http://www.informatik.uni-freiburg.de/~cziegler/BX/ 
      (Book-Crossing Dataset) P38 
       
      http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-1K.html 
      (Lastfm Dataset) P39 
       
      http://mmdays.com/2008/11/22/power_law_1/ 
      (A Brief Discussion of the Power Law Phenomenon in the Online World) P39 
       
      http://www.grouplens.org/node/73/ 
      (MovieLens Dataset) P42 
       
      http://research.microsoft.com/pubs/69656/tr-98-12.pdf 
      (Empirical Analysis of Predictive Algorithms for Collaborative Filtering) P49 
       
      http://vimeo.com/1242909 
      (Digg Video) P50 
       
      http://glaros.dtc.umn.edu/gkhome/fetch/papers/itemrsCIKM01.pdf 
       (Evaluation of Item-Based Top-N Recommendation Algorithms) P58 
       
      http://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf 
      (Amazon.com Recommendations Item-to-Item Collaborative Filtering) P59 
       
      http://glinden.blogspot.com/2006/03/early-amazon-similarities.html 
       (Greg Linden Blog) P63 
       
      http://www.hpl.hp.com/techreports/2008/HPL-2008-48R1.pdf 
      (One-Class Collaborative Filtering) P67 
       
      http://en.wikipedia.org/wiki/Stochastic_gradient_descent 
      (Stochastic Gradient Descent) P68 
       
      http://www.ideal.ece.utexas.edu/seminar/LatentFactorModels.pdf 
       (Latent Factor Models for Web Recommender Systems) P70 
       
      http://en.wikipedia.org/wiki/Bipartite_graph 
      (Bipartite Graph) P73 
       
      http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4072747&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D4072747 
      (Random-Walk Computation of Similarities between Nodes of a Graph with Application to Collaborative Recommendation) P74 
       
      http://www-cs-students.stanford.edu/~taherh/papers/topic-sensitive-pagerank.pdf 
      (Topic Sensitive Pagerank) P74 
       
      http://www.stanford.edu/dept/ICME/docs/thesis/Li-2009.pdf 
      (FAST ALGORITHMS FOR SPARSE MATRIX INVERSE COMPUTATIONS) P77 
       
      https://www.aaai.org/ojs/index.php/aimagazine/article/view/1292 
       (LIFESTYLE FINDER: Intelligent User Profiling Using Large-Scale Demographic Data) P80 
       
      http://research.yahoo.com/files/wsdm266m-golbandi.pdf 
      ( adaptive bootstrapping of recommender systems using decision trees) P87 
       
      http://en.wikipedia.org/wiki/Vector_space_model 
      (Vector Space Model) P90 
       
      http://tunedit.org/challenge/VLNetChallenge 
      (A competition on the cold-start problem) P92 
       
      http://www.cs.princeton.edu/~blei/papers/BleiNgJordan2003.pdf 
       (Latent Dirichlet Allocation) P92 
       
      http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence 
       (Kullback–Leibler divergence) P93 
       
      http://www.pandora.com/about/mgp 
      (About The Music Genome Project) P94 
       
      http://en.wikipedia.org/wiki/List_of_Music_Genome_Project_attributes 
      (Pandora Music Genome Project Attributes) P94 
       
      http://www.jinni.com/movie-genome.html 
      (Jinni Movie Genome) P94 
       
      http://www.shilad.com/papers/tagsplanations_iui2009.pdf 
       (Tagsplanations: Explaining Recommendations Using Tags) P96 
       
      http://en.wikipedia.org/wiki/Tag_(metadata) 
      (Tag Wikipedia) P96 
       
      http://www.shilad.com/shilads_thesis.pdf 
      (Nurturing Tagging Communities) P100 
       
      http://www.stanford.edu/~morganya/research/chi2007-tagging.pdf 
       (Why We Tag: Motivations for Annotation in Mobile and Online Media ) P100 
       
      http://www.google.com/url?sa=t&rct=j&q=delicious%20dataset%20dai-larbor&source=web&cd=1&ved=0CFIQFjAA&url=http%3A%2F%2Fwww.dai-labor.de%2Fen%2Fcompetence_centers%2Firml%2Fdatasets%2F&ei=1R4JUKyFOKu0iQfKvazzCQ&;usg=AFQjCNGuVzzKIKi3K2YFybxrCNxbtKqS4A&cad=rjt 
      (Delicious Dataset) P101 
       
      http://research.microsoft.com/pubs/73692/yihgoca-www06.pdf 
       (Finding Advertising Keywords on Web Pages) P118 
       
      http://www.kde.cs.uni-kassel.de/ws/rsdc08/ 
      (A tag-based recommender system competition) P119 
       
      http://delab.csd.auth.gr/papers/recsys.pdf 
      (Tag recommendations based on tensor dimensionality reduction)P119 
       
      http://www.l3s.de/web/upload/documents/1/recSys09.pdf 
      (latent dirichlet allocation for tag recommendation) P119 
       
      http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.94.5271&rep=rep1&type=pdf 
      (Folkrank: A ranking algorithm for folksonomies) P119 
       
      http://www.grouplens.org/system/files/tagommenders_numbered.pdf 
       (Tagommenders: Connecting Users to Items through Tags) P119 
       
      http://www.grouplens.org/system/files/group07-sen.pdf 
      (The Quest for Quality Tags) P120 
       
      http://2011.camrachallenge.com/ 
      (Challenge on Context-aware Movie Recommendation) P123 
       
      http://bits.blogs.nytimes.com/2011/09/07/the-lifespan-of-a-link/ 
      (The Lifespan of a link) P125 
       
      http://www0.cs.ucl.ac.uk/staff/l.capra/publications/lathia_sigir10.pdf 
       (Temporal Diversity in Recommender Systems) P129 
       
      http://staff.science.uva.nl/~kamps/ireval/papers/paper_14.pdf 
       (Evaluating Collaborative Filtering Over Time) P129 
       
      http://www.google.com/places/ 
      (Hotpot) P139 
       
      http://www.readwriteweb.com/archives/google_launches_recommendation_engine_for_places.php 
      (Google Launches Hotpot, A Recommendation Engine for Places) P139 
       
      http://xavier.amatriain.net/pubs/GeolocatedRecommendations.pdf 
       (geolocated recommendations) P140 
       
      http://www.nytimes.com/interactive/2010/01/10/nyregion/20100110-netflix-map.html 
      (A Peek Into Netflix Queues) P141 
       
      http://www.cs.umd.edu/users/meesh/420/neighbor.pdf 
      (Distance Browsing in Spatial Databases) P142 
       
      http://www.eng.auburn.edu/~weishinn/papers/MDM2010.pdf 
       (Efficient Evaluation of k-Range Nearest Neighbor Queries in Road Networks) P143 
       
       
      http://blog.nielsen.com/nielsenwire/consumer/global-advertising-consumers-trust-real-friends-and-virtual-strangers-the-most/ 
      (Global Advertising: Consumers Trust Real Friends and Virtual Strangers the Most) P144 
       
      http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/36371.pdf 
      (Suggesting Friends Using the Implicit Social Graph) P145 
       
      http://blog.nielsen.com/nielsenwire/online_mobile/friends-frenemies-why-we-add-and-remove-facebook-friends/ 
      (Friends & Frenemies: Why We Add and Remove Facebook Friends) P147 
       
      http://snap.stanford.edu/data/ 
      (Stanford Large Network Dataset Collection) P149 
       
      http://www.dai-labor.de/camra2010/ 
      (Workshop on Context-awareness in Retrieval and Recommendation) P151 
       
      http://www.comp.hkbu.edu.hk/~lichen/download/p245-yuan.pdf 
       (Factorization vs. Regularization: Fusing Heterogeneous Social Relationships in Top-N Recommendation) P153 
       
      http://www.infoq.com/news/2009/06/Twitter-Architecture/ 
      (Twitter, an Evolving Architecture) P154 
       
      http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CGQQFjAB&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.165.3679%26rep%3Drep1%26type%3Dpdf&ei=dIIJUMzEE8WviQf5tNjcCQ&usg=AFQjCNGw2bHXJ6MdYpksL66bhUE8krS41w&sig2=5EcEDhRe9S5SQNNojWk7_Q 
      (Recommendations in taste related domains) P155 
       
      http://www.ercim.eu/publication/ws-proceedings/DelNoe02/RashmiSinha.pdf 
      (Comparing Recommendations Made by Online Systems and Friends) P155 
       
      http://techcrunch.com/2010/04/22/facebook-edgerank/ 
      (EdgeRank: The Secret Sauce That Makes Facebook's News Feed Tick) P157 
       
      http://www.grouplens.org/system/files/p217-chen.pdf 
      (Speak Little and Well: Recommending Conversations in Online Social Streams) P158
       
      http://blog.linkedin.com/2008/04/11/learn-more-abou-2/ 
      (Learn more about “People You May Know”) P160 
       
      http://domino.watson.ibm.com/cambridge/research.nsf/58bac2a2a6b05a1285256b30005b3953/8186a48526821924852576b300537839/$FILE/TR%202009.09%20Make%20New%20Frends.pdf 
      (“Make New Friends, but Keep the Old” – Recommending People on Social Networking Sites) P164 
       
      http://www.google.com.hk/url?sa=t&rct=j&q=social+recommendation+using+prob&source=web&cd=2&ved=0CFcQFjAB&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.141.465%26rep%3Drep1%26type%3Dpdf&ei=LY0JUJ7OL9GPiAfe8ZzyCQ&usg=AFQjCNH-xTUWrs9hkxTA8si5fztAdDAEng 
      (SoRec: Social Recommendation Using Probabilistic Matrix Factorization) P165 
       
      http://olivier.chapelle.cc/pub/DBN_www2009.pdf 
      (A Dynamic Bayesian Network Click Model for Web Search Ranking) P177 
       
      http://www.google.com.hk/url?sa=t&rct=j&q=online+learning+from+click+data+spnsored+search&source=web&cd=1&ved=0CFkQFjAA&url=http%3A%2F%2Fwww.research.yahoo.net%2Ffiles%2Fp227-ciaramita.pdf&ei=HY8JUJW8CrGuiQfpx-XyCQ&usg=AFQjCNE_CYbEs8DVo84V-0VXs5FeqaJ5GQ&cad=rjt 
      (Online Learning from Click Data for Sponsored Search) P177 
       
      http://www.cs.cmu.edu/~deepay/mywww/papers/www08-interaction.pdf 
      (Contextual Advertising by Combining Relevance with Click Feedback) P177 
       
      http://tech.hulu.com/blog/2011/09/19/recommendation-system/ 
      (Hulu recommender system architecture) P178 
       
      http://mymediaproject.codeplex.com/ 
      (MyMedia Project) P178 
       
      http://www.grouplens.org/papers/pdf/www10_sarwar.pdf 
      (item-based collaborative filtering recommendation algorithms) P185 
       
      http://www.stanford.edu/~koutrika/Readings/res/Default/billsus98learning.pdf 
      (Learning Collaborative Information Filters) P186 
       
      http://sifter.org/~simon/journal/20061211.html 
      (Simon Funk Blog:Funk SVD) P187 
       
      http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf 
      (Factor in the Neighbors: Scalable and Accurate Collaborative Filtering) P190 
       
      http://nlpr-web.ia.ac.cn/2009papers/gjhy/gh26.pdf 
      (Time-dependent Models in Collaborative Filtering based Recommender System) P193 
       
      http://sydney.edu.au/engineering/it/~josiah/lemma/kdd-fp074-koren.pdf 
      (Collaborative filtering with temporal dynamics) P193 
       
      http://en.wikipedia.org/wiki/Least_squares 
      (Least Squares Wikipedia) P195 
       
      http://www.mimuw.edu.pl/~paterek/ap_kdd.pdf 
      (Improving regularized singular value decomposition for collaborative filtering) P195 
       
      http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf 
       (Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model) P195 

    Where to Learn Deep Learning – Courses, Tutorials, Software

    Deep Learning is a very hot Machine Learning technique that has been achieving remarkable results recently. We give a list of free resources for learning and using Deep Learning.

    By Gregory Piatetsky, @kdnuggets, May 26, 2014. 

    Deep Learning is a very hot area of Machine Learning Research, with many remarkable recent successes, such as 97.5% accuracy on face recognition, nearly perfect German traffic sign recognition, or even Dogs vs Cats image recognition with 98.9% accuracy. Many winning entries in recent Kaggle Data Science competitions have used Deep Learning. 

    The term "deep learning" refers to the method of training multi-layered neural networks, and became popular after papers by Geoffrey Hinton and his co-workers which showed a fast way to train such networks. 

    Yann LeCun, who was a postdoc in Geoff Hinton's lab, also developed a very effective algorithm for deep learning, called ConvNet (convolutional network), which was successfully used in the late 1980s and early 1990s for automatic reading of amounts on bank checks. 

    See more on ConvNet and the factors that enabled the recent success of Deep Learning in my exclusive interview with Yann LeCun.

    In May 2014, Baidu, the Chinese search giant, hired Andrew Ng, a leading Machine Learning and Deep Learning expert (and co-founder of Coursera), to head their new AI Lab in Silicon Valley, setting up an AI & Deep Learning race with Google (which hired Geoff Hinton) and Facebook (which hired Yann LeCun to head Facebook AI Lab). 

    Here are some useful and free (!) resources for learning and using Deep Learning:
     
    The packages which support Deep Learning include
    • Torch7, an extension of the LuaJIT language which includes an object-oriented package for deep learning and computer vision. The main advantage of Torch7 is that LuaJIT is extremely fast and very flexible.
    • Theano + Pylearn2, which has the advantage of using Python (widely used), and the disadvantage of using Python (slow for big data).
    • cuda-convnet, a high-performance C++/CUDA implementation of convolutional neural networks, based on Yann LeCun's work.

     
  • Original URL: https://www.cnblogs.com/abc8023/p/4063756.html