LearningDataMiningwithPython-SecondEditi.epub

The mrjob package

2022-01-24 10:04:12
  • Preface
    • What this book covers
    • What you need for this book
    • Who this book is for
    • Conventions
    • Reader feedback
    • Customer support
      • Downloading the example code
      • Errata
      • Piracy
      • Questions
  • Getting Started with Data Mining
    • Introducing data mining
    • Using Python and the Jupyter Notebook
      • Installing Python
      • Installing Jupyter Notebook
      • Installing scikit-learn
    • A simple affinity analysis example
      • What is affinity analysis?
    • Product recommendations
      • Loading the dataset with NumPy
        • Downloading the example code
      • Implementing a simple ranking of rules
      • Ranking to find the best rules
    • A simple classification example
    • What is classification?
      • Loading and preparing the dataset
      • Implementing the OneR algorithm
      • Testing the algorithm
    • Summary
  • Classifying with scikit-learn Estimators
    • scikit-learn estimators
      • Nearest neighbors
      • Distance metrics
      • Loading the dataset
      • Moving towards a standard workflow
      • Running the algorithm
      • Setting parameters
    • Preprocessing
      • Standard pre-processing
      • Putting it all together
    • Pipelines
    • Summary
  • Predicting Sports Winners with Decision Trees
    • Loading the dataset
      • Collecting the data
      • Using pandas to load the dataset
      • Cleaning up the dataset
      • Extracting new features
    • Decision trees
      • Parameters in decision trees
      • Using decision trees
    • Sports outcome prediction
      • Putting it all together
    • Random forests
      • How do ensembles work?
      • Setting parameters in Random Forests
      • Applying random forests
      • Engineering new features
    • Summary
  • Recommending Movies Using Affinity Analysis
    • Affinity analysis
      • Algorithms for affinity analysis
      • Overall methodology
    • Dealing with the movie recommendation problem
      • Obtaining the dataset
        • Loading with pandas
        • Sparse data formats
    • Understanding the Apriori algorithm and its implementation
      • Looking into the basics of the Apriori algorithm
      • Implementing the Apriori algorithm
      • Extracting association rules
      • Evaluating the association rules
    • Summary
  • Features and scikit-learn Transformers
    • Feature extraction
      • Representing reality in models
      • Common feature patterns
      • Creating good features
    • Feature selection
      • Selecting the best individual features
    • Feature creation
    • Principal Component Analysis
    • Creating your own transformer
      • The transformer API
      • Implementing a Transformer
    • Unit testing
    • Putting it all together
    • Summary
  • Social Media Insight using Naive Bayes
    • Disambiguation
    • Downloading data from a social network
      • Loading and classifying the dataset
      • Creating a replicable dataset from Twitter
    • Text transformers
      • Bag-of-words models
      • n-gram features
      • Other text features
    • Naive Bayes
      • Understanding Bayes' theorem
      • Naive Bayes algorithm
      • How it works
    • Applying Naive Bayes
      • Extracting word counts
      • Converting dictionaries to a matrix
      • Putting it all together
      • Evaluation using the F1-score
    • Getting useful features from models
    • Summary
  • Follow Recommendations Using Graph Mining
    • Loading the dataset
      • Classifying with an existing model
    • Getting follower information from Twitter
      • Building the network
    • Creating a graph
      • Creating a similarity graph
    • Finding subgraphs
      • Connected components
      • Optimizing criteria
    • Summary
  • Beating CAPTCHAs with Neural Networks
    • Artificial neural networks
      • An introduction to neural networks
    • Creating the dataset
      • Drawing basic CAPTCHAs
      • Splitting the image into individual letters
      • Creating a training dataset
    • Training and classifying
      • Back-propagation
    • Predicting words
      • Improving accuracy using a dictionary
      • Ranking mechanisms for word similarity
      • Putting it all together
    • Summary
  • Authorship Attribution
    • Attributing documents to authors
      • Applications and use cases
      • Authorship attribution
    • Getting the data
    • Using function words
      • Counting function words
      • Classifying with function words
    • Support Vector Machines
      • Classifying with SVMs
      • Kernels
    • Character n-grams
      • Extracting character n-grams
    • The Enron dataset
      • Accessing the Enron dataset
      • Creating a dataset loader
    • Putting it all together
    • Evaluation
    • Summary
  • Clustering News Articles
    • Trending topic discovery
      • Using a web API to get data
      • Reddit as a data source
      • Getting the data
    • Extracting text from arbitrary websites
      • Finding the stories in arbitrary websites
      • Extracting the content
    • Grouping news articles
    • The k-means algorithm
      • Evaluating the results
      • Extracting topic information from clusters
      • Using clustering algorithms as transformers
    • Clustering ensembles
      • Evidence accumulation
      • How it works
      • Implementation
    • Online learning
      • Implementation
    • Summary
  • Object Detection in Images using Deep Neural Networks
    • Object classification
      • Use cases
    • Application scenario
    • Deep neural networks
      • Intuition
      • Implementing deep neural networks
    • An Introduction to TensorFlow
    • Using Keras
      • Convolutional Neural Networks
    • GPU optimization
      • When to use GPUs for computation
      • Running our code on a GPU
      • Setting up the environment
    • Application
      • Getting the data
      • Creating the neural network
      • Putting it all together
    • Summary
  • Working with Big Data
    • Big data
      • Applications of big data
    • MapReduce
      • The intuition behind MapReduce
        • A word count example
      • Hadoop MapReduce
    • Applying MapReduce
      • Getting the data
    • Naive Bayes prediction
      • The mrjob package
    • Extracting the blog posts
    • Training Naive Bayes
    • Putting it all together
    • Training on Amazon's EMR infrastructure
    • Summary
  • Next Steps...
    • Getting Started with Data Mining
      • Scikit-learn tutorials
      • Extending the Jupyter Notebook
      • More datasets
      • Other Evaluation Metrics
      • More application ideas
    • Classifying with scikit-learn Estimators
      • Scalability with the nearest neighbor
      • More complex pipelines
      • Comparing classifiers
      • Automated Learning
    • Predicting Sports Winners with Decision Trees
      • More complex features
      • Dask
      • Research
    • Recommending Movies Using Affinity Analysis
      • New datasets
      • The Eclat algorithm
      • Collaborative Filtering
    • Extracting Features with Transformers
      • Adding noise
      • Vowpal Wabbit
      • word2vec
    • Social Media Insight Using Naive Bayes
      • Spam detection
      • Natural language processing and part-of-speech tagging
    • Discovering Accounts to Follow Using Graph Mining
      • More complex algorithms
        • NetworkX
    • Beating CAPTCHAs with Neural Networks
      • Better (worse?) CAPTCHAs
      • Deeper networks
      • Reinforcement learning
    • Authorship Attribution
      • Increasing the sample size
      • Blogs dataset
      • Local n-grams
    • Clustering News Articles
      • Clustering Evaluation
      • Temporal analysis
      • Real-time clusterings
    • Classifying Objects in Images Using Deep Learning
      • Mahotas
      • Magenta
    • Working with Big Data
      • Courses on Hadoop
      • Pydoop
      • Recommendation engine
      • W.I.L.L
    • More resources
      • Kaggle competitions
        • Coursera