How to Use the HTML Meta Tag to Block Search Engine Spiders


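Today someone mentioned that a meta tag in an HTML page can keep search engine spiders from indexing the page, or from following the other links on it. Out of curiosity I looked up some material and found HTML such as <meta name='robots' content='noindex,nofollow' />, which tells a visiting spider whether the page may be indexed and whether the links on it may be followed.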

SEO: anatomy of the meta tag

The HTML meta tag looks like this:

<meta name='robots' content='noindex,nofollow' />

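Explanation:

name: specifies which search engine spider the tag addresses. Several values are possible; a few are listed below.

robots: all search engines

Baiduspider: Baidu

Googlebot: Google

content: tells the spider how to behave. It accepts the following values.

index: the page may be indexed

noindex: the page must not be indexed

follow: the other links on the page may be followed and crawled

nofollow: the other links on the page must not be followed

noarchive: search engines must not keep a cached snapshot of the page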
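
As a minimal sketch of where the tag goes: robots meta tags belong inside the page's <head>. The title and body text below are placeholders, and the second tag assumes the named spider honors a tag addressed to it by name, which varies between search engines.

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset='utf-8' />
  <title>Example page (placeholder)</title>
  <!-- Applies to every spider that honors the robots meta tag -->
  <meta name='robots' content='noindex,nofollow' />
  <!-- Addresses one spider only; assumes Googlebot reads a tag carrying its own name -->
  <meta name='Googlebot' content='noindex' />
</head>
<body>
  <p>Placeholder content.</p>
</body>
</html>
```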

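SEO: using the meta tag

Based on the content values above, the following combinations are commonly used:

<meta name='robots' content='index,follow' />: the page may be indexed, and spiders may follow its links on to other pages

<meta name='robots' content='noindex,follow' />: the page must not be indexed, but its links may be followed

<meta name='robots' content='index,nofollow' />: the page may be indexed, but its links must not be followed

<meta name='robots' content='noindex,nofollow' />: the page must not be indexed, and its links must not be followed

<meta name='robots' content='noarchive' />: search engines must not keep a cached snapshot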
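
The values can also be combined with noarchive in a single comma-separated content attribute. A small sketch, assuming the goal is a page that may be indexed and whose links may be followed, but which should not be cached:

```html
<!-- Index the page and follow its links, but do not keep a snapshot -->
<meta name='robots' content='index,follow,noarchive' />
```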

Points to note:

1. The combination of index and follow can be shortened to all

<meta name='robots' content='index,follow' />

can be written as

<meta name='robots' content='all' />

2. The combination of noindex and nofollow can be shortened to none

<meta name='robots' content='noindex,nofollow' />

can be written as

<meta name='robots' content='none' />

3. Opposite values must not appear together: index cannot be combined with noindex, and follow cannot be combined with nofollow

The following two examples are wrong

<meta name='robots' content='index,noindex' />

<meta name='robots' content='follow,nofollow' />
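
A corrected form picks one value from each pair. The sketch below assumes the intent is to keep the page out of the index while still letting spiders follow its links. How a given search engine treats a contradictory tag is not covered here, so it is safer not to rely on it.

```html
<!-- One indexing value and one link-following value, with no contradiction -->
<meta name='robots' content='noindex,follow' />
```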

Reposted from: https://www.feiniaomy.com/post/596.html
Original URL: https://www.cnblogs.com/zinging/p/13569434.html
