Installing pyhanlp
pyhanlp is a Python wrapper around HanLP, which is written in Java.
For beginners, getting it to run can be a little tricky.
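Because the Java part runs in a JVM that is started from Python (through JPype), a Java runtime has to be installed before any of the steps below will work. Here is a minimal sketch for checking that from Python. The helper name find_java is made up for illustration, and it only looks at JAVA_HOME and the PATH, which is an assumption about how Java is installed on your machine.

```python
import os
import shutil


def find_java():
    """Report where a Java runtime might be found (hypothetical helper)."""
    return {
        # JAVA_HOME is the conventional variable pointing at a JDK/JRE install
        "JAVA_HOME": os.environ.get("JAVA_HOME"),
        # shutil.which returns the full path of `java` on PATH, or None
        "java_on_path": shutil.which("java"),
    }


if __name__ == "__main__":
    info = find_java()
    print(info)
    if not any(info.values()):
        print("No Java runtime found; install a JDK before importing pyhanlp.")
```

If both values come back as None, installing a JDK first will probably save you a confusing error later.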
1. Download the source code
https://github.com/hankcs/pyhanlp
git clone https://github.com/hankcs/pyhanlp.git
2. Create a virtual environment
python3 -m venv env
source env/bin/activate
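To confirm that the interpreter you are running really is the one from env, you can compare sys.prefix with sys.base_prefix; inside an environment created by venv the two differ. A small sketch (the function name in_virtualenv is just for illustration):

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv-created environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

The same check is handy later in PyCharm, to make sure the project is using this environment and not the system Python.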
3. Install pyhanlp
cd pyhanlp
pip install -e .
The install log:
Obtaining file:///Users/huihui/git/pyhanlp
Collecting jpype1==0.7.0 (from pyhanlp==0.1.62)
Using cached https://files.pythonhosted.org/packages/28/63/784834e8a24ec2e1ad7f703c3dc6c6fb372a77cc68a2fdff916e18a4449e/JPype1-0.7.0.tar.gz
Installing collected packages: jpype1, pyhanlp
Running setup.py install for jpype1 ... done
Running setup.py develop for pyhanlp
Successfully installed jpype1-0.7.0 pyhanlp
You are using pip version 19.0.3, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Try importing it, which loads the dictionary. On the first import pyhanlp tries to download the HanLP jar and the data package (its messages are printed in Chinese; they are translated here):
(env) huihui@192 pyhanlp % python
Python 3.7.3 (default, Nov 15 2019, 04:04:52)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhanlp
Downloading http://hanlp.com/static/release/hanlp-1.7.6-release.zip to /Users/huihui/git/pyhanlp/pyhanlp/static/hanlp-1.7.6-release.zip
100.00%, 1 MB, 514 KB/s, 0 min 0 s remaining
Downloading https://file.hankcs.com/hanlp/data-for-1.7.zip to /Users/huihui/git/pyhanlp/pyhanlp/static/data-for-1.7.6.zip
0.38%, 2 MB, 795 KB/s, 13 min 37 s remaining
Download failed: https://file.hankcs.com/hanlp/data-for-1.7.zip due to timeout('The read operation timed out')
Please refer to https://github.com/hankcs/pyhanlp for manual installation.
Or manually download https://file.hankcs.com/hanlp/data-for-1.7.zip to /Users/huihui/git/pyhanlp/pyhanlp/static/data-for-1.7.6.zip
Open https://github.com/hankcs/pyhanlp ? (y/n)y
(env) huihui@192 pyhanlp %
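4. Download the dictionary files
Download the archive manually.
It is about 668 MB. Once downloaded, put it at the path given in the message above and unzip it.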
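If a browser download is inconvenient, the same two steps (fetch the archive, then unzip it) can be sketched with the standard library. The function names, the 60-second timeout, and the choice to unzip into the directory that holds the archive are assumptions for illustration; use the URL and target path that pyhanlp printed in your own error message.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def download(url, target, timeout=60):
    """Stream url to target; timeout applies to each blocking read, in seconds."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(target, "wb") as out:
        # Copy in chunks so the whole archive is never held in memory
        shutil.copyfileobj(response, out)
    return target


def unzip_here(archive):
    """Extract archive into the directory that contains it (assumed layout)."""
    archive = Path(archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(archive.parent)
    return archive.parent


# Example call, using the URL and path from the error message shown earlier:
# unzip_here(download(
#     "https://file.hankcs.com/hanlp/data-for-1.7.zip",
#     "/Users/huihui/git/pyhanlp/pyhanlp/static/data-for-1.7.6.zip",
# ))
```

A longer timeout only helps with slow reads; if the connection keeps dropping, a download tool that can resume is probably the better choice.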
5. Test pyhanlp
Test example 1: on the command line
(env) huihui@192 pyhanlp % python
Python 3.7.3 (default, Nov 15 2019, 04:04:52)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyhanlp import *
>>>
>>> print(HanLP.segment('出事了电脑'))
[出事/vi, 了/ule, 电脑/n]
>>>
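Each item in the printed list is a word, a slash, and its part-of-speech tag. In code you would normally read term.word and term.nature directly, but if all you have is the printed text (for example, copied from a log), a rough way to split it back into pairs is sketched below. parse_terms is a hypothetical helper, not part of pyhanlp, and it assumes the items are separated by a comma and a space, as in the output above.

```python
def parse_terms(printed):
    """Split text like '[出事/vi, 了/ule, 电脑/n]' into (word, tag) pairs."""
    body = printed.strip().strip("[]")
    pairs = []
    for item in body.split(", "):
        if not item:
            continue
        # rpartition splits on the last '/', so a '/' inside the word survives
        word, _, tag = item.rpartition("/")
        pairs.append((word, tag))
    return pairs


print(parse_terms("[出事/vi, 了/ule, 电脑/n]"))
# prints [('出事', 'vi'), ('了', 'ule'), ('电脑', 'n')]
```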
Test example 2: in PyCharm
Select the virtual environment created earlier as the project interpreter.
from pyhanlp import *
print(HanLP.segment('你好,欢迎在Python中调用HanLP的API'))
for term in HanLP.segment('下雨天地面积水'):
    print('{} {}'.format(term.word, term.nature))  # get each word and its part-of-speech tag