Can't find model 'en_core_web_sm'

06-01, 1026 views

The following code uses the spaCy library to run named entity recognition on a piece of text:

import spacy
# Load a pipeline using the name of an installed package, a string path or a Path-like object.
nlp = spacy.load("en_core_web_sm")
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)

Running it in the Python IDLE Shell produces the following error:

Traceback (most recent call last):
  File "D:\source\python demo\entity-recognition.py", line 3, in <module>
    nlp = spacy.load("en_core_web_sm") # load a pipelinef using the name of an installed package, a string path or a Path-like object.
  File "C:\Users\JX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\spacy\__init__.py", line 51, in load
    return util.load_model(
  File "C:\Users\JX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\spacy\util.py", line 472, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.

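Error message: OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.

In other words, spaCy cannot find a Python package named "en_core_web_sm": the model has not been installed in this environment.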
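
One quick way to confirm this diagnosis is to ask Python whether a package with that name is importable at all. The following is a minimal sketch that uses only the standard library; the helper name `is_model_installed` is made up for illustration and is not part of spaCy.

```python
import importlib.util


def is_model_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be imported."""
    # find_spec returns None when no such top-level package is installed.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    name = "en_core_web_sm"
    print(name, "installed:", is_model_installed(name))
```

If this reports that the package is not installed, the E050 error above is expected, and the steps below should resolve it.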

[Solution and steps]:

1. Check which version of spaCy is installed by entering this command in cmd:

>pip show spacy

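The output shows that the current version is 3.8.4.

2. Find the en_core_web_sm package that matches the spaCy version

Go to the en_core_web_sm release page, find the link for the matching version (en_core_web_sm-3.8.0, since the model's major.minor version should agree with spaCy 3.8.x), and download the package en_core_web_sm-3.8.0.tar.gz.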
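
The version matching in this step can be sketched as a small check. This assumes the rule of thumb that a model fits a spaCy install when both share the same major.minor series; the compatibility range declared in the model's own metadata remains the authority. The function names are hypothetical.

```python
def minor_series(version: str) -> str:
    """Return the 'major.minor' part of a version string such as '3.8.4'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def model_matches_spacy(spacy_version: str, model_version: str) -> bool:
    """Assumed rule of thumb: versions match when major.minor are equal."""
    return minor_series(spacy_version) == minor_series(model_version)


print(model_matches_spacy("3.8.4", "3.8.0"))  # True
print(model_matches_spacy("3.8.4", "3.7.1"))  # False
```

With the versions from this article, spaCy 3.8.4 and en_core_web_sm-3.8.0 both fall in the 3.8 series, so they pair up.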

3. Install en_core_web_sm-3.8.0 by entering this command in cmd:

pip install XXX\en_core_web_sm-3.8.0.tar.gz

Note: XXX stands for the directory where en_core_web_sm-3.8.0.tar.gz was saved; adjust it to your own setup.

A successful installation prints:
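
When several Python installations coexist, a plain `pip` may belong to a different interpreter than the one that runs the script. A hedged sketch of building the same install command so that it targets the current interpreter; `pip_install_command` is a hypothetical helper and the archive path is a placeholder.

```python
import sys
from pathlib import Path


def pip_install_command(archive: str) -> list[str]:
    """Build a pip command that targets the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", str(Path(archive))]


# Placeholder location; replace with wherever the archive was saved.
cmd = pip_install_command("en_core_web_sm-3.8.0.tar.gz")
print(cmd)
# To actually install, pass the list to subprocess.run(cmd, check=True).
```

The sketch only builds and prints the command; running it is left as an explicit extra step so nothing is installed by accident.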

Defaulting to user installation because normal site-packages is not writeable
Processing f:\chromedownload\en_core_web_sm-3.8.0.tar.gz
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: en_core_web_sm
  Building wheel for en_core_web_sm (pyproject.toml) ... done
  Created wheel for en_core_web_sm: filename=en_core_web_sm-3.8.0-py3-none-any.whl size=12806171 sha256=4234cf698b46566be25d426394d8d095dee4b7563936c44c6714d8cf56619469
  Stored in directory: c:\users\jx\appdata\local\pip\cache\wheels\90\e0\2a\2251f0107678422c64ebd606676a42192a19277237c4575e03
Successfully built en_core_web_sm
Installing collected packages: en_core_web_sm
Successfully installed en_core_web_sm-3.8.0

4. Run the entity recognition code from the beginning of this article again. It now prints:

Apple ORG
U.K. GPE
$1 billion MONEY

This shows that entity recognition on the text now works.
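
To make a script fail more gracefully when the model is missing, the load call can be wrapped so that the `OSError` seen earlier turns into a short hint. This is only a sketch: `try_load` is a hypothetical helper, and the loader is passed in as an argument (in real use it would be `spacy.load`).

```python
def try_load(name, loader):
    """Call loader(name); return (pipeline, None) or (None, hint) on OSError."""
    try:
        return loader(name), None
    except OSError as err:
        # spaCy raises OSError (E050) when the model package is not installed.
        hint = f"Could not load '{name}': {err}. Install the model package first."
        return None, hint


# Intended usage, assuming spaCy is installed:
#   import spacy
#   nlp, problem = try_load("en_core_web_sm", spacy.load)
#   if problem:
#       print(problem)
```

Passing the loader as a parameter keeps the helper independent of spaCy itself, which also makes it easy to exercise with a stand-in function.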

[Author's note]: Many blog posts online install the model directly with the command:

python -m spacy download en_core_web_sm

to install en_core_web_sm, and the official documentation recommends the same approach. When I tried it, however, I got the following error:

Traceback (most recent call last):
  File "C:\Users\JX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\urllib3\connection.py", line 199, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\JX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\urllib3\util\connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.2544.0_x64__qbz5n2kfra8p0\Lib\socket.py", line 978, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno 11004] getaddrinfo failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\JX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\urllib3\connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\JX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\urllib3\connectionpool.py", line 490, in _make_request
    raise new_e

......

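As the socket.gaierror line indicates, the root cause is a Winsock error. "[Errno 11004] getaddrinfo failed" means the host name could not be resolved to an IP address, i.e. a DNS lookup failure; it is not WSAETIMEDOUT (connection timed out), which has error code 10060. The likely cause is my own network, which prevented the download from reaching the GitHub server that hosts the model.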
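
A name-resolution failure like this can be reproduced outside of pip with the same standard-library call that appears in the traceback. A minimal sketch; `can_resolve` is a hypothetical helper, and `github.com` is used as the host only because the download is assumed to go through GitHub.

```python
import socket


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if the host name resolves to at least one address."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Same exception type as in the traceback: name resolution failed.
        return False


if __name__ == "__main__":
    host = "github.com"
    print(host, "resolves:", can_resolve(host))
```

If this prints that the host does not resolve, the problem lies with DNS or the network rather than with spaCy, and downloading the archive by other means and installing it locally, as in the steps above, is a reasonable workaround.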
