ML Strategy

Why ML Strategy

Ideas:

  • Collect more data

  • Collect more diverse training set

  • Train algorithm longer with gradient descent

  • Try Adam instead of gradient descent

  • Try bigger network

  • Try smaller network

  • Try dropout

  • Add L2 regularization

  • Network architecture

    • Activation functions

    • num of hidden units

    • ...


Orthogonalization

Orthogonalization, or orthogonality, is a system design property that ensures that modifying an instruction or a component of an algorithm will not create or propagate side effects to other components of the system. It becomes easier to verify the algorithms independently from one another, and it reduces testing and development time. When a supervised learning system is designed, these are the 4 assumptions that need to be true and orthogonal.

  1. Fit training set well on cost function

    • If it doesn’t fit well, using a bigger neural network or switching to a better optimization algorithm might help.

  2. Fit development set well on cost function

    • If it doesn’t fit well, regularization or using a bigger training set might help.

  3. Fit test set well on cost function

    • If it doesn’t fit well, using a bigger development set might help.

  4. Performs well in real world

    • If it doesn’t perform well, the dev/test set is not set up correctly or the cost function is not measuring the right thing.

Single-number evaluation metric

To choose a classifier, a well-defined development set and an evaluation metric speed up the iteration process.

For example, F1-score combines precision and recall into one number, F1 = 2PR / (P + R), their harmonic mean, so you only need to look at a single metric in cases where precision and recall alone cannot decide between classifiers:

| Classifier | Precision (P) | Recall (R) | F1-Score |
| --- | --- | --- | --- |
| A | 95% | 90% | 92.4% |
| B | 98% | 85% | 91.0% |

Judged by F1-score, classifier A is better.
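
As a quick check on the table, a minimal sketch computing F1 (the harmonic mean of precision and recall) for both classifiers:

```python
def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)

print(f"A: {f1_score(0.95, 0.90):.1%}")  # 92.4%
print(f"B: {f1_score(0.98, 0.85):.1%}")  # 91.0%
```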

Satisficing and optimizing metrics

There are different metrics to evaluate the performance of a classifier; they are called evaluation metrics. They can be categorized as satisficing and optimizing metrics. It is important to note that these evaluation metrics must be evaluated on a training set, a development set, or the test set.

For example, in some tasks accuracy is the objective we want to maximize: it is the optimizing metric. But we also don't want the running time to be too long, so we can require the running time to be under 100 ms: that is a satisficing metric.
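
A minimal sketch of this selection rule, with made-up candidate models: drop anything that violates the satisficing constraint, then maximize the optimizing metric.

```python
# Hypothetical candidates: (model, accuracy, running time in ms).
candidates = [("A", 0.90, 80), ("B", 0.92, 95), ("C", 0.95, 1500)]

# Satisficing metric: running time must be at most 100 ms.
feasible = [c for c in candidates if c[2] <= 100]

# Optimizing metric: among feasible models, maximize accuracy.
best = max(feasible, key=lambda c: c[1])
print(best)  # ('B', 0.92, 95): C is more accurate but too slow.
```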

Training/dev/test set distributions

Setting up the training, development and test sets has a huge impact on productivity. It is important to choose the development and test sets from the same distribution, taken randomly from all the data.

Guideline: choose a development set and test set that reflect the data you expect to get in the future and that you consider important to do well on.

Size of the dev and test sets

Old way of splitting data: datasets were smaller, so a greater percentage of the data had to be used to develop and test ideas and models:

70/30 or 60/20/20

Modern era (big data): because a large amount of data is now available, we don't have to compromise as much and can use a greater portion to train the model:

98/1/1
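
A minimal sketch of such a 98/1/1 split (assuming the data can be shuffled freely; sizes are illustrative):

```python
import random

data = list(range(1_000_000))  # stand-in for a large labeled dataset
random.shuffle(data)

n = len(data)
train = data[:int(0.98 * n)]             # 98% to train the model
dev = data[int(0.98 * n):int(0.99 * n)]  # 1% to evaluate ideas
test = data[int(0.99 * n):]              # 1% for the final estimate
print(len(train), len(dev), len(test))   # 980000 10000 10000
```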

Guidelines

  • Set up the size of the test set to give a high confidence in the overall performance of the system.

  • The test set helps evaluate the performance of the final classifier, and could be much less than 30% of the whole data set.

  • The development set has to be big enough to evaluate different ideas.

When to change dev/test sets and metrics

When the evaluation metric fails to correctly rank-order preferences between algorithms, the evaluation metric, the development set, or the test set should be changed.

If doing well on your metric + dev/test set does not correspond to doing well on your application, change your metric and/or dev/test set.

Guideline

  1. Define an evaluation metric that correctly rank-orders classifiers (see the sketch after this list)

  2. Optimize the evaluation metric
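
To illustrate guideline 1, a hedged sketch of one common fix: re-weight the errors you care most about so the metric rank-orders classifiers the way the application does (the weight of 10 and the helper `weighted_error` are illustrative, not from the lecture):

```python
def weighted_error(examples):
    """examples: list of (is_wrong, is_critical) pairs over the dev set.
    Critical mistakes get weight 10 -- an arbitrary, illustrative choice."""
    weights = [10 if critical else 1 for _, critical in examples]
    wrong = sum(w for (is_wrong, _), w in zip(examples, weights) if is_wrong)
    return wrong / sum(weights)

# Hypothetical dev set: 5 ordinary mistakes, 2 critical ones, 93 correct.
dev = [(True, False)] * 5 + [(True, True)] * 2 + [(False, False)] * 93
print(f"{weighted_error(dev):.3f}")  # 0.212, vs. a plain error rate of 0.07
```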

Comparing to human-level performance

Today, machine learning algorithms can compete with human-level performance, since they are more productive and more feasible in a lot of applications. Also, the workflow of designing and building a machine learning system is much more efficient than before. Moreover, some of the tasks that humans do are close to ''perfection'', which is why machine learning tries to mimic human-level performance.

Machine learning progresses slowly once it surpasses human-level performance. One of the reasons is that human-level performance can be close to the Bayes optimal error, especially for natural perception problems. The Bayes optimal error is defined as the best possible error; in other words, no function mapping from x to y can surpass that level of accuracy. Also, while the performance of machine learning is worse than that of humans, you can improve it with different tools; these are harder to use once it surpasses human-level performance. The tools are:

  • Get labeled data from humans

  • Gain insight from manual error analysis: Why did a person get this right?

  • Better analysis of bias/variance.

Avoidable bias

By knowing what the human-level performance is, it is possible to tell whether the model is fitting the training set well or not.

In some tasks, such as computer vision, human-level error can be used as a proxy for Bayes error, because human error in these areas is very small and close to optimal.

Scenario A: there is a 7% gap between the training-set performance and human-level error. The algorithm isn't fitting the training set well, since the target is around 1%. To resolve the issue, use bias reduction techniques such as training a bigger neural network or running the training longer.

Scenario B: the training set is doing well, since there is only a 0.5% difference with human-level error. The difference between the training error and human-level error is called avoidable bias. The focus here is on reducing variance, since the difference between the training error and the development error is 2%. To resolve the issue, use variance reduction techniques such as regularization or a bigger training set.
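
The two scenarios boil down to comparing two gaps; a minimal sketch (numbers taken from the scenarios above):

```python
def diagnose(human_err, train_err, dev_err):
    """Compare the two gaps and say where to focus."""
    avoidable_bias = train_err - human_err
    variance = dev_err - train_err
    if avoidable_bias > variance:
        focus = "bias reduction (bigger network, train longer)"
    else:
        focus = "variance reduction (regularization, bigger training set)"
    return avoidable_bias, variance, focus

# Scenario A: human 1%, training 8%, dev 10%  -> 7% avoidable bias dominates.
print(diagnose(0.01, 0.08, 0.10))
# Scenario B: human 7.5%, training 8%, dev 10% -> 2% variance dominates.
print(diagnose(0.075, 0.08, 0.10))
```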

Understanding human-level performance

The definition of human-level error depends on the purpose of the analysis. For example, if the goal is to use it as a proxy for Bayes error, the smallest achievable error should be chosen as the human-level error.

Summary of bias/variance with human-level performance

  • Human-level error: a proxy for Bayes error

  • If the difference between human-level error and the training error is bigger than the difference between the training error and the development error, the focus should be on bias reduction techniques.

  • If the difference between the training error and the development error is bigger than the difference between human-level error and the training error, the focus should be on variance reduction techniques.

Surpassing human-level performance

There are many problems where machine learning significantly surpasses human-level performance, especially with structured data:

  • Online advertising

  • Product recommendations

  • Logistics (predicting transit time)

  • Loan approvals

Improving your model performance

The two fundamental assumptions of supervised learning

The first is low avoidable bias, which means the model fits the training set well. The second is low or acceptable variance, which means the training-set performance generalizes well to the development and test sets.

Avoidable bias (human-level error ↔ training error):

  • Train bigger model

  • Train longer, better optimization algorithms

  • Neural Networks architecture / hyperparameters search

Variance (training error ↔ development error):

  • More data

  • Regularization

  • Neural Networks architecture / hyperparameters search

Carrying out error analysis

Error analysis:

  • Get ~100 mislabeled dev set examples.

  • Count up how many are dogs, i.e., tally the fraction of each type of error (see the sketch below).
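
A minimal sketch of this tally (the category tags and counts are hypothetical):

```python
from collections import Counter

# Hypothetical category tags for ~100 mislabeled dev-set examples.
mislabeled = ["dog", "blurry", "great_cat", "dog", "blurry", "other"] * 17

counts = Counter(mislabeled)
for category, n in counts.most_common():
    # The biggest fractions show where fixing errors pays off most.
    print(f"{category}: {n / len(mislabeled):.0%}")
```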

Ideas for cat detection:

  • Fix pictures of dogs being recognized as cats

  • Fix great cats (lions, panthers, etc..) being misrecognized

  • Improve performance on blurry images

Cleaning up incorrectly labeled data

DL algorithms are quite robust to random errors in the training set, but not to systematic errors (e.g., all white dogs labeled as cats).

Error analysis:

  • Overall dev set error

  • Errors due to incorrect labels

  • Errors due to other causes

Decide which source of error to focus on correcting based on these fractions.

Correcting incorrect dev/test set examples

  • Apply same process to your dev and test sets to make sure they continue to come from the same distribution

  • Consider examining examples your algorithm got right as well as ones it got wrong (usually skipped, because there are too many correctly classified examples).

  • Train and dev/test data may now come from slightly different distributions.

Build your first system quickly

  • Set up dev/test set and metric

  • Build initial system quickly

  • Use Bias/Variance analysis & Error analysis to prioritize next steps

Guideline:

Build your first system quickly, then iterate

Training and testing on different distributions

You can put the better data (e.g., clear images) in the training set, and split the less ideal data (e.g., blurry images) into three parts: additions to the training set, the dev set, and the test set.

偏差和方差(不匹配数据分布中)

Training-dev set: Same distribution as training set, but not used for training

Human-level error
  ↕ avoidable bias
Training error
  ↕ variance
Training-dev error
  ↕ data mismatch
Dev error
  ↕ degree of overfitting to the dev set
Test error
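
A sketch that reads the four gaps off this ladder (error rates are illustrative):

```python
# Illustrative error rates at each rung of the ladder.
errors = {"human": 0.01, "train": 0.05, "train_dev": 0.06,
          "dev": 0.10, "test": 0.11}

print("avoidable bias:", round(errors["train"] - errors["human"], 3))    # 0.04
print("variance:", round(errors["train_dev"] - errors["train"], 3))      # 0.01
print("data mismatch:", round(errors["dev"] - errors["train_dev"], 3))   # 0.04
print("overfitting to dev:", round(errors["test"] - errors["dev"], 3))   # 0.01
```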

Addressing data mismatch

If you suspect a data mismatch problem, carry out error analysis, or inspect the training and dev sets to find out how the two distributions differ, and then see whether you can collect more training data that resembles the dev set. One approach is artificial data synthesis, which is used in speech recognition. But be careful: the synthesis may be drawing from only a tiny part of the space of all possibilities.
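
A minimal sketch of artificial data synthesis for speech (assuming `clean` and `noise` are waveform arrays; the mixing weight is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.standard_normal(16_000)  # stand-in for 1 s of clean speech at 16 kHz
noise = rng.standard_normal(16_000)  # stand-in for recorded background noise

synthetic = clean + 0.1 * noise      # a synthesized "noisy speech" example

# Caveat from the text: if one short noise clip is reused across thousands of
# hours of speech, the synthesized data covers only a tiny corner of the space
# of possible noise, and the model can overfit to that clip.
```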

Transfer learning

In transfer learning, knowledge a neural network learns on one task is applied to a separate task.

For example, knowledge learned from image recognition can be applied, or transferred, to radiology diagnosis.

Suppose you have already trained an image recognition neural network. To apply it to radiology diagnosis, you can re-initialize the weights of the last layer and retrain it.

If the new dataset is small, you can retrain only the last one or two layers; if it is very large, you can retrain all the parameters.

Training the network's parameters on the image recognition data is called pre-training.

Retraining the weights on the radiology data is called fine-tuning.

This is useful when the target task has little data while a related task has a large amount of data.
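
A minimal PyTorch sketch of this pre-train/fine-tune recipe (ResNet18 and the 2-class radiology head are illustrative; assumes torchvision ≥ 0.13):

```python
import torch.nn as nn
from torchvision import models

# Pre-training: start from a network trained on the data-rich source task
# (ImageNet weights).
model = models.resnet18(weights="IMAGENET1K_V1")

# Small target dataset: freeze the pre-trained layers...
for param in model.parameters():
    param.requires_grad = False

# ...and re-initialize only the last layer for the new task
# (2 radiology classes here, purely for illustration).
model.fc = nn.Linear(model.fc.in_features, 2)

# Fine-tuning: train only model.fc on the radiology data.
# With a large target dataset, skip the freezing and retrain all parameters.
```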

When transfer learning makes sense:

  • Tasks A and B have the same input x

  • You have a lot more data for Task A than Task B

  • Low level features from A could be helpful for learning B

Multi-task learning

A single neural network does several tasks at once, and each task can help all of the others.

It is used less often than transfer learning. Multi-task learning is common in computer vision, e.g., object detection.

When multi-task learning makes sense

  • Training on a set of tasks that could benefit from having shared lower-level features.

  • Usually, the amount of data you have for each task is quite similar; at a minimum, each task's data should be less than the combined data of all the other tasks.

  • Can train a big enough neural network to do well on all the tasks.
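
A minimal sketch of the "one network, many tasks" setup (a hedged PyTorch illustration with 4 binary tasks, not the lecture's exact architecture):

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, n_tasks=4):
        super().__init__()
        # Lower-level features are shared, so every task benefits from them.
        self.trunk = nn.Sequential(nn.Linear(64, 32), nn.ReLU())
        self.head = nn.Linear(32, n_tasks)  # one logit per task

    def forward(self, x):
        return self.head(self.trunk(x))

net = MultiTaskNet()
x = torch.randn(8, 64)                   # a batch of 8 examples
y = torch.randint(0, 2, (8, 4)).float()  # 4 independent binary labels each
# Unlike softmax classification, an example can be positive for many tasks;
# the loss sums a binary cross-entropy per task.
loss = nn.functional.binary_cross_entropy_with_logits(net(x), y)
loss.backward()
```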

End-to-end deep learning

End-to-end learning learns a direct mapping from input to output, omitting the intermediate steps of a hand-designed pipeline.

It requires a very large dataset.

End-to-end learning is not always best; sometimes a multi-step pipeline performs better.

Whether to use end-to-end learning

Pros:

  • Let the data speak

  • Less hand-designing of components needed

Cons:

  • May need large amount of data

  • Excludes potentially useful hand-designed components

Key question: Do you have sufficient data to learn a function of the complexity needed to map x to y?
