• Collection of materials for newbies in Deep Learning, Machine Learning, and Data Science


    https://itnext.io/collection-of-materials-for-newbie-in-deep-learning-and-machine-learning-and-data-science-56ccaa73c18

    Over several engagements and projects, I have noticed a few distinct patterns that emerge whenever we need to on-board new people onto an ML project, or give quick training to people already on a project who would like to get hands-on with ML, AI and DL work. In short, the question we had to answer repeatedly was: “How do we get people who are interested in learning ML and AI trained with minimal time away from work, and get them hands-on as much as possible?” Initially, answering that question took quite a bit of legwork: we came up with a few approaches and went through many rounds of trial and error. Ultimately, the outcome was satisfactory, and the team still uses these frameworks and processes to get people up to speed with ML. But this post is not about that. The aim of this post is to share the collection of materials, training courses and links that we gathered and compiled while developing the process for training and on-boarding.

    All materials and links posted here were selected at the discretion of the research team and are somewhat subjective, given the team’s background and objectives.


    First things first — some terminology articles and differences between AI, DL and ML.

    I absolutely adore simplicity. One can always build something more complex out of something simple. That’s why I prefer to start simple and add complexity later. So, the first link is to an article called “The simplest explanation of machine learning you’ll ever read”.

    The simplest explanation of machine learning you’ll ever read.

    You’ve probably heard of machine learning and artificial intelligence, but are you sure you know what they are? If…
    hackernoon.com
     

    Are you using the term ‘AI’ incorrectly?

    There, I said it: I don’t mind that industry uses ‘AI’ and ‘machine learning’ to mean the same thing. But is it…
    becominghuman.ai
     

    A simple way to understand machine learning vs deep learning

    A simple way to understand machine learning vs deep learning - Zendesk
    Understanding the latest advancements in artificial intelligence can seem overwhelming, but it really boils down to two…
    www.zendesk.com
     

    What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

    The Difference Between AI, Machine Learning, and Deep Learning? | NVIDIA Blog
    This is the first of a multi-part series explaining the fundamentals of deep learning by long-time tech journalist…
    blogs.nvidia.com
     

    Learning Courses

    Machine Learning (Class Central) — Study free online Machine Learning courses & MOOCs from top universities and colleges. Read reviews to decide if a class is right for you.

    Machine Learning — Stanford University — Free course created by Stanford University, taught by Andrew Ng.

    Artificial Intelligence (AI) — Learn the fundamentals of Artificial Intelligence (AI). Design intelligent agents to solve real-world problems, including search, games, machine learning, logic, and constraint satisfaction problems.

    Machine Learning. Master the essentials of machine learning and algorithms to help improve learning from data without human intervention.

    Learning From Data — Online Course (MOOC) — More than 4M views on Youtube and iTunes. Featured on edX. Free Course, Lecture Videos Available.

    Machine Learning — Free Course with Lecture Slides and Video recordings
    Department of Computer Science, 2014–2015, ml, Machine Learning.

    Machine Learning — University of Washington — Master machine learning fundamentals in four hands-on courses.

    Neural Networks for Machine Learning — Learn about artificial neural networks and how they’re being applied to speech and object recognition, image segmentation, modeling language and human motion. Free Course.


    Video Courses and Video Lessons

    Practical Machine Learning Tutorial with Python Intro (playlist)

    DeepLearning.TV

    DeepLearning.TV is all about Deep Learning, the field of study that teaches machines to perceive the world. Starting…
    www.youtube.com
     

    Machine Learning Recipes

    Sentdex

    Practical Deep Learning For Coders (Jeremy Howard & Rachel Thomas)

    Deep Learning For Coders-36 hours of lessons for free
    fast.ai's practical deep learning MOOC for coders. Learn CNNs, RNNs, computer vision, NLP, recommendation systems…
    course.fast.ai
     

    Machine Learning, Udacity (Georgia Tech)

    Machine Learning | Udacity
    In this course, you'll learn how to apply Supervised, Unsupervised and Reinforcement Learning techniques for solving a…
    www.udacity.com
     

    Intro to Machine Learning, Udacity (Sebastian Thrun)

    Udacity
    classroom.udacity.com

    Neural Networks for Machine Learning, Coursera (Geoffrey Hinton)

    Neural Networks for Machine Learning | Coursera
    About this course: Learn about artificial neural networks and how they're being used for machine learning, as applied…
    www.coursera.org
     

    Keras — Python Deep Learning Neural Network API

    TensorFlow Tutorials

    Deep Learning TensorFlow and Deep Learning With Neural Networks by Simplilearn

    Deep Learning with R tutorial

    The art of neural networks by Mike Tyka

    Blogs

    Below is a list of blogs on ML/DL/AI-related topics.

    Andrej Karpathy blog
    Musings of a Computer Scientist.
    karpathy.github.io
    i am trask
    A machine learning craftsmanship blog.
    iamtrask.github.io
    Machine Learning in Practice
    Practical insights for executives, managers, and project managers eager to deploy machine learning inside their…
    medium.com
     
    Blog · Explosion AI
    Explosion AI is a digital studio specialising in Artificial Intelligence and Natural Language Processing. We're the…
    explosion.ai
     
    Adventures in NI
    artificial and natural intelligence, including politics, art, and higher education
    joanna-bryson.blogspot.de
     
    Machine Learning Mastery Blog
    What neural network is appropriate for your predictive modeling problem? It can be difficult for a beginner to the…
    machinelearningmastery.com
     
    WildML
    The academic Deep Learning research community has largely stayed away from the financial markets. Maybe that's because…
    www.wildml.com
    FastML
    Last year, we published a new dataset for book recommendations, goodbooks-10k. As the name suggests, it contains…
    fastml.com

    Cheat sheets

    1. Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data
    The Most Complete List of Best AI Cheat Sheets
    becominghuman.ai
     

    2. Essential Cheat Sheets for Machine Learning and Deep Learning Engineers

    Learning machine learning and deep learning is difficult for newbies. As well as deep learning libraries are difficult…
    startupsventurecapital.com
     

    Github resources

    Build software better, together
    GitHub is where people build software. More than 28 million people use GitHub to discover, fork, and contribute to over…
    github.com
     

    Where can you practice?

    Machine Learning Kaggle Competition

    I highly recommend trying Kaggle competitions, even though you have only a slim chance of reaching the top 100. The real value of Kaggle competitions is the community. Read the kernels and adopt the good practices you find in them. Read the comments and engage in discussions. That’s where you will learn the most.

    Machine Learning Kaggle Competition Part One: Getting Started
    Learning the Kaggle Environment and an Introductory Notebook
    towardsdatascience.com
     

    …and a few articles related to the subject

    Every single Machine Learning course on the internet, ranked by your reviews
    A year and a half ago, I dropped out of one of the best computer science programs in Canada. I started creating my own…
    medium.freecodecamp.org
     
    Ultimate Guide to Leveraging NLP & Machine Learning for your Chatbot
    Code Snippets and Github Included
    chatbotslife.com
     
    How Machine Learning is changing Software Development
    I’m not here to talk to you about how amazing A.I. is, what Deepmind is working on, or speculate about robotic…
    medium.com
     

    Data Sources

    (Courtesy of Elite Data Science)

    Datasets for Exploratory Analysis

    Exploratory analysis is your first step in most data science exercises. The best datasets for practicing exploratory analysis should be fun, interesting, and non-trivial (i.e. require you to dig a little to uncover all the insights).

    • Game of Thrones — Game of Thrones is a popular TV series based on George R.R. Martin’s A Song of Ice and Fire book series. With this dataset, you can explore its political landscape, characters, and battles.
    • World University Rankings — Ranking universities can be difficult and controversial. There are hundreds of ranking systems, and they rarely reach a consensus. This dataset contains three global university rankings.
    • IMDB 5000 Movie Dataset — This dataset explores the question of whether we can anticipate a movie’s popularity before it’s even released.

    Aggregators:

    • Kaggle Datasets — Open datasets contributed by the Kaggle community. Here, you’ll find a grab bag of topics. Plus, you can learn from the short tutorials and scripts that accompany the datasets.
    • r/datasets — Open datasets contributed by the Reddit community. This is another source of interesting and quirky datasets, but the datasets tend to be less refined.

    Datasets for General Machine Learning

    In this context, we refer to “general” machine learning as Regression, Classification, and Clustering with relational (i.e. table-format) data. These are the most common ML tasks.
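
    As a rough illustration of what a “general” task on table-format data looks like, here is a minimal sketch of one-feature least-squares regression in plain Python. The rows and their meaning (floor area and price) are invented for the example and do not come from any dataset listed here.

```python
# Hypothetical table: each row is (square_meters, price). Values are invented.
rows = [(50.0, 150.0), (60.0, 180.0), (80.0, 240.0), (100.0, 300.0)]


def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the 1/n factors cancel in the ratio).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var_x = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(rows)
predicted = slope * 70.0 + intercept  # estimate for an unseen row
```

    In practice you would reach for a library rather than hand-rolling this, but the shape of the problem stays the same: rows in, fitted parameters out, predictions for new rows.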

    Aggregators:

    • UCI Machine Learning Repository — The UCI ML repository is an old and popular aggregator for machine learning datasets. Tip: Most of their datasets have linked academic papers that you can use for benchmarks.

    Datasets for Deep Learning

    While not appropriate for general-purpose machine learning, deep learning has been dominating certain niches, especially those that use image, text, or audio data. From our experience, the best way to get started with deep learning is to practice on image data because of the wealth of tutorials available.

    • MNIST — MNIST contains images for handwritten digit classification. It’s considered a great entry dataset for deep learning because it’s complex enough to warrant neural networks, while still being manageable on a single CPU. (We also have a tutorial.)
    • CIFAR — The next step up in difficulty is the CIFAR-10 dataset, which contains 60,000 images broken into 10 different classes. For a bigger challenge, you can try the CIFAR-100 dataset, which has 100 different classes.
    • ImageNet — ImageNet hosts a computer vision competition every year, and many consider it to be the benchmark for modern performance. The current image dataset has 1000 different classes.
    • YouTube 8M — Ready to tackle videos, but can’t spare terabytes of storage? This dataset contains millions of YouTube video IDs and billions of audio and visual features that were pre-extracted using the latest deep learning models.

    Aggregators:

    • Deeplearning.net — Up-to-date list of datasets for benchmarking deep learning algorithms.
    • DeepLearning4J.org — Up-to-date list of high-quality datasets for deep learning research.

    Datasets for Natural Language Processing

    Natural Language Processing (N.L.P.) is about text data. And for messy data like text, it’s especially important for the datasets to have real-world applications so that you can perform easy sanity checks.

    • Enron Dataset — Email data from the senior management of Enron, organized into folders. This dataset was originally made public and posted to the web by the Federal Energy Regulatory Commission during its investigation.
    • Amazon Reviews — Contains ~35 million reviews from Amazon spanning 18 years. Data include product and user information, ratings, and the plaintext review.
    • Newsgroup Classification — Collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups. Great for practicing text classification and topic modeling.

    Aggregators:

    Datasets for Cloud Machine Learning

    Technically, any dataset can be used for cloud-based machine learning if you just upload it to the cloud. However, if you’re just starting out and evaluating a platform, you may wish to skip all the data piping.

    Fortunately, the major cloud computing services all provide public datasets that you can easily import. Their datasets are all comparable.

    Datasets for Time Series Analysis

    Time series analysis requires observations marked with a timestamp. In other words, each subject and/or feature is tracked across time.
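
    To make the “observations marked with a timestamp” idea concrete, here is a small sketch that sorts timestamped values and smooths them with a moving average. The dates and prices are invented, and `moving_average` is a hypothetical helper written for this example, not part of any dataset’s tooling.

```python
from datetime import date

# Hypothetical end-of-day closing prices; dates and values are invented.
# They are deliberately out of order, as raw data often is.
observations = [
    (date(2020, 1, 6), 10.0),
    (date(2020, 1, 2), 8.0),
    (date(2020, 1, 3), 9.0),
    (date(2020, 1, 7), 13.0),
]


def moving_average(observations, window):
    """Sort by timestamp, then average each run of `window` consecutive values."""
    ordered = sorted(observations)  # tuples sort by their first element, the date
    result = []
    for i in range(window - 1, len(ordered)):
        chunk = [value for _, value in ordered[i - window + 1 : i + 1]]
        result.append((ordered[i][0], sum(chunk) / window))
    return result


smoothed = moving_average(observations, window=2)
```

    Note that the window here counts observations, not calendar days, so the gap over the weekend is ignored. Whether that is acceptable depends on the analysis.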

    • EOD Stock Prices — End of day stock prices, dividends, and splits for 3,000 US companies, curated by the Quandl community.
    • Zillow Real Estate Research — Home prices and rents by size, type, and tier, sliced by zip code, neighborhood, city, metro area, county and state.
    • Global Education Statistics — Over 4,000 internationally comparable indicators for education access, progression, completion, literacy, teachers, population, and expenditures.

    Aggregators:

    • Quandl — Quandl contains free and premium time series datasets for financial analysis.
    • The World Bank — Contains global macroeconomic time series, searchable by country or indicator.

    Datasets for Recommender Systems

    Recommender systems have taken the entertainment and e-commerce industries by storm. Amazon, Netflix, and Spotify are great examples.

    • MovieLens — Rating data sets from the MovieLens web site. Perfect for getting started thanks to the various dataset sizes available.
    • Jester — Ideal for building a simple collaborative filter. Contains 4.1 Million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,421 users.
    • Million Song Dataset — Large, rich dataset for music recommendations. You can start with a pure collaborative filter and then expand it with other methods such as content-based models or web scraping.
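
    For a sense of what a “simple collaborative filter” involves, here is a minimal user-based sketch in plain Python. The ratings use the same -10.00 to +10.00 range as Jester, but the users, jokes, and values are all invented, and the similarity-weighted average is just one of several reasonable choices.

```python
from math import sqrt

# Hypothetical ratings in the Jester range (-10.0 to +10.0); all values invented.
ratings = {
    "ann": {"joke1": 8.0, "joke2": -4.0, "joke3": 6.0},
    "bob": {"joke1": 7.0, "joke2": -5.0},
    "cat": {"joke1": -6.0, "joke2": 9.0, "joke3": -8.0},
}


def cosine(a, b):
    """Cosine similarity computed over the items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    numerator = denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        similarity = cosine(ratings[user], their_ratings)
        numerator += similarity * their_ratings[item]
        denominator += abs(similarity)
    return numerator / denominator if denominator else 0.0


guess = predict("bob", "joke3")  # bob has not rated joke3 yet
```

    Because the ratings are centered on zero, a user with opposite taste (negative similarity) who disliked a joke pushes the prediction up, which is why bob’s predicted rating for `joke3` comes out positive here.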

    Aggregators:

    • entaroadun (Github) — Collection of datasets for recommender systems. Tip: Check the comments section for recent datasets.

    Datasets for Specific Industries

    In this compendium, we’ve organized datasets by their use case. This is helpful if you need to practice a certain skill, such as deep learning or time series analysis.

    However, you may also wish to search by a specific industry, such as datasets for neuroscience, weather, or manufacturing. Here are a couple options:

    Aggregators:

    Datasets for Streaming

    Streaming datasets are used for building real-time applications, such as data visualization, trend tracking, or updatable (i.e. “online”) machine learning models.
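
    An updatable (“online”) model is one that adjusts itself after every new observation instead of being retrained on the full dataset. The sketch below shows the idea with a one-feature linear model and a single stochastic-gradient step per event. `OnlineLinearModel` and `fake_stream` are hypothetical names made up for this example, and the stream is synthetic rather than a real API feed.

```python
class OnlineLinearModel:
    """One-feature linear model (no intercept), updated one observation at a time."""

    def __init__(self, learning_rate=0.05):
        self.weight = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.weight * x

    def update(self, x, y):
        """Take a single gradient step on the squared error for (x, y)."""
        error = y - self.predict(x)
        self.weight += self.learning_rate * error * x


def fake_stream(n):
    """Stand-in for a live feed: yields (x, y) pairs where y = 2 * x."""
    for i in range(n):
        x = float(i % 3 + 1)
        yield x, 2.0 * x


model = OnlineLinearModel()
for x, y in fake_stream(60):
    model.update(x, y)  # the model never sees more than one event at a time
```

    After enough events the learned weight settles close to 2, the slope used to generate the synthetic stream.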

    • Twitter API — The Twitter API is a classic source for streaming data. You can track tweets, hashtags, and more.
    • StockTwits API — StockTwits is like Twitter for traders and investors. You can expand this dataset in many interesting ways by joining it to time series datasets using the timestamp and ticker symbol.
    • Weather Underground — A reliable weather API with global coverage. Features a free tier and paid options for scaling up.
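
    The join on timestamp and ticker symbol mentioned for StockTwits could look roughly like the sketch below. The record layout, field names, and values are assumptions made for illustration; they do not reflect the actual StockTwits or Quandl response formats.

```python
# Hypothetical records; field names and values are invented for illustration.
messages = [
    {"date": "2020-01-02", "ticker": "AAA", "text": "looking strong"},
    {"date": "2020-01-02", "ticker": "BBB", "text": "selling today"},
    {"date": "2020-01-03", "ticker": "AAA", "text": "holding"},
]
prices = [
    {"date": "2020-01-02", "ticker": "AAA", "close": 10.0},
    {"date": "2020-01-03", "ticker": "AAA", "close": 11.0},
]


def join_on_date_and_ticker(messages, prices):
    """Inner join: keep messages that have a price for the same day and ticker."""
    # Index prices by the composite key so each lookup is a dictionary access.
    price_index = {(p["date"], p["ticker"]): p["close"] for p in prices}
    joined = []
    for message in messages:
        key = (message["date"], message["ticker"])
        if key in price_index:
            joined.append({**message, "close": price_index[key]})
    return joined


joined = join_on_date_and_ticker(messages, prices)
```

    Real feeds carry full timestamps rather than dates, so an actual pipeline would first need to round or bucket the message times to match the granularity of the price series.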

    Aggregators:

    • Satori — Satori is a platform that lets you connect to streaming live data at ultra-low latency (for free). They frequently add new datasets.

    Datasets for Web Scraping

    Web scraping is a common part of data science research, but you must be careful not to violate websites’ terms of service. Fortunately, there’s a whole site that’s designed to be freely scraped.

    Datasets for Current Events

    Finding datasets for current events can be tricky. Fortunately, some publications have started releasing the datasets they use in their articles.

    Aggregators:

    • FiveThirtyEight — FiveThirtyEight is a news and sports site with data-driven articles. They make their datasets openly available on Github.
    • BuzzFeedNews — BuzzFeed became (in)famous for their listicles and superficial pieces, but they’ve since expanded into investigative journalism. Their datasets are available on Github.

    This article is a living document, and your feedback matters to me. Please leave comments on which resources I should add and what you found most helpful when you were learning ML/DL/AI. Thanks a lot for reading!

  • Original article: https://www.cnblogs.com/dhcn/p/12393945.html