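Overview:
After learning how to install Miniconda on FreeBSD, I ran into a problem installing it on one particular server: once installed, running python fails with Segmentation fault (core dumped).
The earlier successful install was on FreeBSD 13 and the failing one is on FreeBSD 14; I don't know whether the version is the cause.
The problem is still unsolved for now, but running a file with python xx.py works, so it is usable in a pinch. The Nvidia card was not working at that point either... CUDA did come up later; see the CSDN blog post: 安装英伟达nvidia p4计算卡驱动@FreeBSD14 (installing the NVIDIA P4 compute card driver on FreeBSD 14).
To solve the problem, let's get to work.
Step 1: reinstall
I was going to try installing python again and figured I would go all the way and install Anaconda directly!
Download: https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
Wait, the server has no display, so never mind; back to Miniconda.
Install Miniconda python3.11
Download: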
wget https://mirrors4.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py311_23.11.0-1-Linux-x86_64.sh
Install:
sh download/Miniconda3-py311_23.11.0-1-Linux-x86_64.sh
It prints:
WARNING:
Your operating system appears not to be 64-bit, but you are trying to
install a 64-bit version of Miniconda3.
Are sure you want to continue the installation? [yes|no]
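I see, it is saying the system is not 64-bit; I need to track this down.
At first I thought this was what made python crash after installation, but on a closer look the warning probably appears because the script was run with FreeBSD's sh. Try Linux's sh instead: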
/compat/linux/bin/sh ~/download/Miniconda3-py311_24.3.0-0-Linux-x86_64.sh
Welcome to Miniconda3 py311_24.3.0-0
In order to continue the installation process, please review the license
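A likely explanation for the 64-bit warning, offered as an assumption rather than something verified in this session: the installer script compares the output of uname -m with x86_64, while FreeBSD's native userland reports the same hardware as amd64. Run under /compat/linux/bin/sh, the script sees the Linux spelling and the warning goes away. The small sketch below only illustrates that naming difference; normalize_machine and is_64bit_x86 are hypothetical helper names, not part of any installer.

```python
import platform

# Names that different systems use for the same 64-bit x86 hardware.
# FreeBSD reports "amd64" where Linux reports "x86_64".
_X86_64_ALIASES = {"x86_64", "amd64", "x64"}


def normalize_machine(name: str) -> str:
    """Map OS-specific machine names onto the Linux spelling."""
    lowered = name.strip().lower()
    return "x86_64" if lowered in _X86_64_ALIASES else lowered


def is_64bit_x86(name: str) -> bool:
    """True when the machine name denotes 64-bit x86 under any alias."""
    return normalize_machine(name) == "x86_64"


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, "->", normalize_machine(machine))
```

A check that only accepts the literal string x86_64 would reject a native FreeBSD amd64 system even though the hardware is 64-bit, which would match the warning seen here.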
After the installation finishes it prompts:
Do you wish to update your shell profile to automatically initialize conda?
This will activate conda on startup and change the command prompt when activated.
If you'd prefer that conda's base environment not be activated on startup,
run the following command when conda is activated:
conda config --set auto_activate_base false
You can undo this by running `conda init --reverse $SHELL`? [yes|no]
[no] >>>
You have chosen to not have conda modify your shell scripts at all.
To activate conda's base environment in your current shell session:
eval "$(/home/skywalk/conda311/bin/conda shell.YOUR_SHELL_NAME hook)"
To install conda's shell functions for easier access, first activate, then:
conda init
Thank you for installing Miniconda3!
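Try running this:
./conda311/bin/conda shell.csh hook
Still no good; it seems to just print the hook script rather than apply it (the installer message above wraps this command in eval).
Set up the environment
Enter bash first, then run source /home/skywalk/conda311/etc/profile.d/conda.sh
At this point conda is at least usable.
But the python under the newly installed conda still fails with Segmentation fault (core dumped).
Look at python with gdb
Install gdb
pkg install gdb
Debug with gdb
I'm not very experienced with it, so I followed posts found online and used the simplest steps: start gdb first, then run the python script from inside gdb.
Under gdb it looks perfectly normal, doesn't it?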
(base) bash-4.2$ gdb python
GNU gdb (GDB) 14.1 [GDB v14.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...
(gdb) run testpython.py
Starting program: /home/skywalk/conda311/bin/python testpython.py
hello
[Inferior 1 (process 6662
Debugging with gdb shows that python can run a .py file normally. In other words, the interactive prompt is unusable, but it just about works.
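To narrow down where the crash happens without typing into the broken prompt, one option is to drive the interpreter from a small script and look at how each run ends. This is a minimal sketch, not a diagnosis: describe_exit and probe are hypothetical helper names, and the readline check is only a guess (the interactive prompt loads readline while a plain script run does not). The -X faulthandler option asks CPython to dump a traceback on a fatal signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable status."""
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with code {returncode}"


def probe(python: str, args: list[str]) -> str:
    """Run one interpreter invocation with no terminal attached to stdin."""
    result = subprocess.run(
        [python, *args],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        timeout=60,
    )
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Pass the interpreter to test as the first argument,
    # for example /home/skywalk/conda311/bin/python.
    python = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    checks = {
        "run a one-line script": ["-X", "faulthandler", "-c", "print('hello')"],
        # Guess only: rules readline in or out as the crashing import.
        "import readline (guess)": ["-X", "faulthandler", "-c", "import readline"],
    }
    for label, args in checks.items():
        print(f"{label}: {probe(python, args)}")
```

If a check reports "killed by SIGSEGV", that invocation reproduces the crash in a form that can be handed to gdb directly.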
After that I tried PaddlePaddle, PyTorch, and so on.
Trying PaddlePaddle
Install PaddlePaddle:
python -m pip install paddlepaddle-gpu==2.6.1.post120 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
Write the PaddlePaddle test code:
import io
print("hello")
import paddle
print("import")
x = paddle.randn((2,2))
print("x", x)
print("x+1", x+1)
Running it fails with:
from . import libpaddle
ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/skywalk/conda311/lib/python3.11/site-packages/paddle/base/libpaddle.so)
Setting this problem aside for now.
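The error says the libstdc++.so.6 that got loaded does not export the GLIBCXX_3.4.20 version tag. A quick way to see which tags a given library file does carry is to scan its bytes for them. The sketch below does only that; glibcxx_versions and provides are hypothetical helper names, and the default path is taken from the error message (seen from the FreeBSD side, the file presumably lives under /compat/linux, which is an assumption).

```python
import re
import sys

# Version tags are stored as plain strings such as "GLIBCXX_3.4.19".
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(data: bytes) -> list[str]:
    """Return the GLIBCXX version tags found in the bytes, sorted numerically."""
    found = {m.group(1).decode("ascii") for m in _GLIBCXX.finditer(data)}
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def provides(data: bytes, wanted: str) -> bool:
    """True when the library bytes contain the wanted version, e.g. '3.4.20'."""
    return wanted in glibcxx_versions(data)


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/lib64/libstdc++.so.6"
    with open(path, "rb") as fh:
        blob = fh.read()
    versions = glibcxx_versions(blob)
    print("newest tag:", versions[-1] if versions else "none found")
    print("has 3.4.20:", provides(blob, "3.4.20"))
```

Running it against each libstdc++.so.6 on the machine shows which copy, if any, is new enough for libpaddle.so.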
Trying to install and test PyTorch and fastai
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
The conda install took too long and the connection dropped, so I installed with pip instead: pip3 install torch torchvision torchaudio
I also installed fastai.
PyTorch test code:
import io
print("hello")
import torch
print("import")
x = torch.randn((2,2))
print("x", x)
print("x+1", x+1)
The test passes, but it does not run on the GPU.
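The test above never asks PyTorch about the GPU, so it cannot show why the GPU is unused. A short report along the following lines makes that visible. It is a sketch under the assumption that torch may or may not be importable; cuda_report is a hypothetical helper name, while torch.version.cuda, torch.cuda.is_available() and torch.cuda.device_count() are existing PyTorch calls.

```python
def cuda_report() -> dict:
    """Collect what the installed PyTorch build reports about CUDA."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment.
        return {"torch": None}

    report = {
        "torch": torch.__version__,
        # None here means a CPU-only build of PyTorch was installed.
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": 0,
        "device": "cpu",
    }
    if report["cuda_available"]:
        report["device_count"] = torch.cuda.device_count()
        report["device"] = "cuda"
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A CUDA build that still reports cuda_available as False points at the driver or the OS layer rather than at the pip package.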
fastai example code used for testing:
cat testtorch.py
import io
print("hello")
import torch
print("import")
x = torch.randn((2,2))
print("x", x)
print("x+1", x+1)
(base) bash-4.2$ cat testai.py
from fastai.text.all import *
path = untar_data(URLs.IMDB)
path.ls()
dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test')
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
learn.fine_tune(4, 1e-2)
learn.show_results()
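It turned out to need more than an hour..... clearly the GPU is not being used.
Summary:
I have not yet found the cause of, or a fix for, the python (core dumped) crash. Tasks can still be run with python xx.py.
The Nvidia card under FreeBSD was not working yet. It was sorted out on 2024-05-02; see: 安装英伟达nvidia p4计算卡驱动@FreeBSD14 (CSDN blog).
Debugging
PaddlePaddle error
ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/skywalk/conda311/lib/python3.11/site-packages/paddle/base/libpaddle.so)
Setting this aside for now.
Installing torch with conda: CondaHTTPError: HTTP 000 CONNECTION FAILED for url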
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/nvidia/linux-64/libcusolver-11.4.4.55-0.tar.bz2>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
Switched to pip: pip3 install torch torchvision torchaudio
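The conda message itself notes that such HTTP errors are often intermittent and that a simple retry helps. For a long download over a flaky link, the retry can be automated. This is a generic sketch, not a conda feature; retry is a hypothetical helper and the delays are arbitrary choices.

```python
import time


def retry(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or attempts run out.

    The wait doubles after each failure: base_delay, 2*base_delay, ...
    The last failure is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

It could wrap the install command, for example retry(lambda: subprocess.run(["pip3", "install", "torch", "torchvision", "torchaudio"], check=True)), after importing subprocess.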
torch could not use the GPU; it reports:
/home/skywalk/conda311/lib/python3.11/site-packages/torch/cuda/__init__.py:118: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 304: OS call failed or operation not supported on this OS (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
return torch._C._cuda_getDeviceCount() > 0
So with my own approach, the GPU did not come up.