• failed with: java.lang.NullPointerException


    failed with: java.lang.NullPointerException
    
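    This error appears when the agent properties are missing from Nutch's configuration file, 'conf/nutch-site.xml'. Set them as shown below.
    
    In addition, crawl-urlfilter.txt must be configured to match the domains listed in urls/url.txt.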
    
    
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>http.agent.name</name>
        <value>MySearch</value>
        <description>My Search Engine</description>
      </property>
    
      <property>
        <name>http.agent.description</name>
        <value></value>
        <description>Further description of our bot; this text is used in
        the User-Agent header. It appears in parentheses after the agent name.
        </description>
      </property>
    
      <property>
        <name>http.agent.url</name>
        <value></value>
        <description>A URL to advertise in the User-Agent header. This will
        appear in parentheses after the agent name. Custom dictates that this
        should be a URL of a page explaining the purpose and behavior of this
        crawler.
        </description>
      </property>
    
      <property>
        <name>http.agent.email</name>
        <value></value>
        <description>An email address to advertise in the HTTP 'From' request
        header and User-Agent header. A good practice is to mangle this
        address (e.g. 'info at example dot com') to avoid spamming.
        </description>
      </property>
    
    </configuration>
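    
    The crawl-urlfilter.txt change mentioned above might look like the excerpt below. This is only a sketch: example.com is a placeholder for whatever domain appears in urls/url.txt, and the pattern follows the form used in the stock filter file shipped with older Nutch 1.x releases, so check it against your own copy.
    
    ```
    # conf/crawl-urlfilter.txt (excerpt)
    # Accept URLs on the seed domain and its subdomains.
    # example.com is a placeholder; use the domain from urls/url.txt.
    +^http://([a-z0-9]*\.)*example.com/
    
    # Skip everything else.
    -.
    ```
    
    Rules are evaluated top to bottom and the first match wins, so the accept line has to come before the final catch-all reject.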
  • Original article: https://www.cnblogs.com/i80386/p/3972350.html