InternLM LLM Hands-on Camp: A Step-by-Step Guide to Evaluating Llama 3 (OpenCompass Edition)
Environment setup
conda create -n llama3 python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y
conda activate llama3
conda install git
git-lfs install
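Before moving on, it can help to confirm that the new environment has what the later steps rely on. The following is a minimal sketch, not part of the original tutorial: the helper name `check_environment` is hypothetical, and it only reports whether the listed modules are importable and the listed tools are on `PATH`.

```python
import importlib.util
import shutil
import sys


def check_environment(modules=("torch", "torchvision"), tools=("git", "git-lfs")):
    """Report which Python modules are importable and which CLI tools are on PATH.

    Hypothetical helper for illustration; the default names mirror the
    packages installed by the conda commands above.
    """
    report = {"python": "%d.%d" % sys.version_info[:2]}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report
```

Calling `check_environment()` inside the activated `llama3` environment returns a dictionary; any entry that is `False` points at a step worth repeating.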
✨ Download the Llama 3 model
Download the Llama-3-8B-Instruct model from OpenXLab:
mkdir -p ~/model
cd ~/model
git clone https://code.openxlab.org.cn/MrCat/Llama-3-8B-Instruct.git Meta-Llama-3-8B-Instruct
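Alternatively, symlink the copy of the model that is already available on InternStudio (replace the placeholder path with the shared model directory on your instance):
mkdir -p ~/model
ln -s /path/to/shared/Meta-Llama-3-8B-Instruct ~/model/Meta-Llama-3-8B-Instruct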
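If Git LFS was not active during the clone, large weight files can be left behind as small text pointer stubs instead of the real data. The sketch below is an illustration under that assumption: `find_lfs_pointers` is a hypothetical helper that flags files still starting with the Git LFS pointer header.

```python
from pathlib import Path

# Git LFS pointer files begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that still look like Git LFS pointer stubs."""
    root = Path(model_dir)
    stubs = []
    for path in sorted(root.rglob("*")):
        # Skip Git's own metadata and anything that is not a regular file.
        if ".git" in path.relative_to(root).parts or not path.is_file():
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                stubs.append(path)
    return stubs
```

For example, `find_lfs_pointers(Path.home() / "model" / "Meta-Llama-3-8B-Instruct")` should return an empty list once the weights are fully downloaded; a non-empty list suggests running `git lfs pull` inside the clone.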
🛠️ Install OpenCompass
cd ~
git clone https://github.com/open-compass/opencompass opencompass
cd opencompass
pip install -e .
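One way to confirm that the editable install registered the package is to query the installed distribution metadata. This is a minimal sketch: `installed_version` is a hypothetical helper, and the distribution name `opencompass` is assumed from the repository name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

Running `installed_version("opencompass")` in the `llama3` environment should return a version string after `pip install -e .` succeeds, and `None` otherwise.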
📂 Data preparation
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip
The output is:
(llama3) root@intern-studio-061925:~/opencompass# wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
--2024-05-07 13:40:46-- https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
Resolving proxy.intern-ai.org.cn (proxy.intern-ai.org.cn)... 172.18.128.194
Connecting to proxy.intern-ai.org.cn (proxy.intern-ai.org.cn)|172.18.128.194|:50000... connected.
Proxy request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/654124617/b6ea57a4-4c8c-4be6-afa3-c63a5e511564?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240507%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240507T054046Z&X-Amz-Expires=300&X-Amz-Signature=4749bc808fdbd810c85662ceb9e23ad83b63084cf74d66a9248506d94cce24f5&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=654124617&response-content-disposition=attachment%3B%20filename%3DOpenCompassData-core-20240207.zip&response-content-type=application%2Foctet-stream [following]
--2024-05-07 13:40:46-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/654124617/b6ea57a4-4c8c-4be6-afa3-c63a5e511564?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240507%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240507T054046Z&X-Amz-Expires=300&X-Amz-Signature=4749bc808fdbd810c85662ceb9e23ad83b63084cf74d66a9248506d94cce24f5&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=654124617&response-content-disposition=attachment%3B%20filename%3DOpenCompassData-core-20240207.zip&response-content-type=application%2Foctet-stream
Connecting to proxy.intern-ai.org.cn (proxy.intern-ai.org.cn)|172.18.128.194|:50000... connected.
Proxy request sent, awaiting response... 200 OK
Length: 156098144 (149M) [application/octet-stream]
Saving to: 'OpenCompassData-core-20240207.zip'
OpenCompassData-core-20240207.zi 100%[==========================================================>] 148.87M 396KB/s in 5m 41s
2024-05-07 13:46:28 (447 KB/s) - 'OpenCompassData-core-20240207.zip' saved [156098144/156098144]
(llama3) root@intern-studio-061925:~/opencompass# unzip OpenCompassData-core-20240207.zip
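Because the download above took several minutes through a proxy, it may be worth checking the archive before relying on it. The sketch below is an assumption-labeled illustration: `verify_archive` is a hypothetical helper that compares the file size with the `Length` value wget reported (156098144 bytes) and then runs the standard library's CRC check over the archive members.

```python
import zipfile
from pathlib import Path

# "Length" reported by wget in the log above.
EXPECTED_BYTES = 156098144


def verify_archive(zip_path, expected_bytes=EXPECTED_BYTES):
    """Return "ok" if the archive has the expected size and passes a CRC check."""
    path = Path(zip_path)
    size = path.stat().st_size
    if size != expected_bytes:
        return f"size mismatch: {size} bytes, expected {expected_bytes}"
    with zipfile.ZipFile(path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = archive.testzip()
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return "ok"
```

Calling `verify_archive("OpenCompassData-core-20240207.zip")` from `~/opencompass` returns `"ok"` only when both checks pass; any other string describes what to investigate before re-downloading.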
The dataset contains 85 directories and 1,062 files.
(llama3) root@intern-studio-061925:~/opencompass/data# tree
.
|-- AGIEval
| `-- data
| |-- few_shot_prompts.csv
| `-- v1
| |-- LICENSE
| |-- aqua-rat.jsonl
| |-- gaokao-biology.jsonl
| |-- gaokao-chemistry.jsonl
| |-- gaokao-chinese.jsonl
| |-- gaokao-english.jsonl
| |-- gaokao-geography.jsonl
| |-- gaokao-history.jsonl
| |-- gaokao-mathcloze.jsonl
| |-- gaokao-mathqa.jsonl
| |-- gaokao-physics.jsonl
| |-- jec-qa-ca.jsonl
| |-- jec-qa-kd.jsonl
| |-- logiqa-en.jsonl
| |-- logiqa-zh.jsonl
| |-- lsat-ar.jsonl
| |-- lsat-lr.jsonl
| |-- lsat-rc.jsonl
| |-- math.jsonl
| |-- sat-en-without-passage.jsonl
| |-- sat-en.jsonl
| `-- sat-math.jsonl
|-- ARC
| |-- ARC-c
| | |-- ARC-Challenge-Dev.jsonl
| | |-- ARC-Challenge-Test.jsonl
| | `-- ARC_c_test_contamination_annotations.json
| `-- ARC-e
| |-- ARC-Easy-Dev.jsonl
| `-- ARC-Easy-Test.jsonl
|-- BBH
| |-- data
| | |-- README.md
| | |-- boolean_expressions.json
| | |-- causal_judgement.json
| | |-- date_understanding.json
| | |-- disambiguation_qa.json
| | |-- dyck_languages.json
| | |-- formal_fallacies.json
| | |-- geometric_shapes.json
| | |-- hyperbaton.json
| | |-- logical_deduction_five_objects.json
| | |-- logical_deduction_seven_objects.json
| | |-- logical_deduction_three_objects.json
| | |-- movie_recommendation.json
| | |-- multistep_arithmetic_two.json
| | |-- navigate.json
| | |-- object_counting.json
| | |-- penguins_in_a_table.json
| | |-- reasoning_about_colored_objects.json
| | |-- ruin_names.json
| | |-- salient_translation_error_detection.json
| | |-- snarks.json
| | |-- sports_understanding.json
| | |-- temporal_sequences.json
| | |-- tracking_shuffled_objects_five_objects.json
| | |-- tracking_shuffled_objects_seven_objects.json
| | |-- tracking_shuffled_objects_three_objects.json
| | |-- web_of_lies.json
| | `-- word_sorting.json
| `-- lib_prompt
| |-- boolean_expressions.txt
| |-- causal_judgement.txt
| |-- date_understanding.txt
| |-- disambiguation_qa.txt
| |-- dyck_languages.txt
| |-- formal_fallacies.txt
| |-- geometric_shapes.txt
| |-- hyperbaton.txt
| |-- logical_deduction_five_objects.txt
| |-- logical_deduction_seven_objects.txt
| |-- logical_deduction_three_objects.txt
| |-- movie_recommendation.txt
| |-- multistep_arithmetic_two.txt
| |-- navigate.txt
| |-- object_counting.txt
| |-- penguins_in_a_table.txt
| |-- reasoning_about_colored_objects.txt
| |-- ruin_names.txt
| |-- salient_translation_error_detection.txt
| |-- snarks.txt
| |-- sports_understanding.txt
| |-- temporal_sequences.txt
| |-- tracking_shuffled_objects_five_objects.txt
| |-- tracking_shuffled_objects_seven_objects.txt
| |-- tracking_shuffled_objects_three_objects.txt
| |-- web_of_lies.txt
| `-- word_sorting.txt
|-- CLUE
| |-- AFQMC
| | |-- dev.json
| | `-- test_public.json
| |-- C3
| | |-- d-dev.json
| | |-- dev_0.json
| | `-- m-dev.json
| |-- CMRC
| | |-- dev.json
| | `-- test_public.json
| |-- DRCD
| | |-- dev.json
| | `-- test_public.json
| |-- OCNLI
| | `-- dev.json
| `-- cmnli
| `-- cmnli_public
| `-- dev.json
|-- FewCLUE
| |-- bustm
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- chid
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- cluewsc
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- label_distribution.json
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- csl
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- csldcp
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- labelDesc2label.py
| | |-- labels_all.txt
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- eprstmt
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- dev_public.json
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- iflytek
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- label_id2des_desc2short.py
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- ocnli
| | |-- dev_0.json
| | |-- dev_1.json
| | |-- dev_2.json
| | |-- dev_3.json
| | |-- dev_4.json
| | |-- dev_few_all.json
| | |-- test.json
| | |-- test_public.json
| | |-- train_0.json
| | |-- train_1.json
| | |-- train_2.json
| | |-- train_3.json
| | |-- train_4.json
| | |-- train_few_all.json
| | `-- unlabeled.json
| |-- readme.md
| `-- tnews
| |-- dev_0.json
| |-- dev_1.json
| |-- dev_2.json
| |-- dev_3.json
| |-- dev_4.json
| |-- dev_few_all.json
| |-- label_index2en2zh.json
| |-- test.json
| |-- test_public.json
| |-- train_0.json
| |-- train_1.json
| |-- train_2.json
| |-- train_3.json
| |-- train_4.json
| |-- train_few_all.json
| `-- unlabeled.json
|-- GAOKAO-BENCH
| `-- data
| |-- Fill-in-the-blank_Questions
| | |-- 2010-2022_Chinese_Language_Famous_Passages_and_Sentences_Dictation.json
| | |-- 2010-2022_Math_II_Fill-in-the-Blank.json
| | |-- 2010-2022_Math_I_Fill-in-the-Blank.json
| | `-- 2014-2022_English_Language_Cloze_Passage.json
| |-- Multiple-choice_Questions
| | |-- 2010-2013_English_MCQs.json
| | |-- 2010-2022_Biology_MCQs.json
| | |-- 2010-2022_Chemistry_MCQs.json
| | |-- 2010-2022_Chinese_Lang_and_Usage_MCQs.json
| | |-- 2010-2022_Chinese_Modern_Lit.json
| | |-- 2010-2022_English_Fill_in_Blanks.json
| | |-- 2010-2022_English_Reading_Comp.json
| | |-- 2010-2022_Geography_MCQs.json
| | |-- 2010-2022_History_MCQs.json
| | |-- 2010-2022_Math_II_MCQs.json
| | |-- 2010-2022_Math_I_MCQs.json
| | |-- 2010-2022_Physics_MCQs.json
| | |-- 2010-2022_Political_Science_MCQs.json
| | `-- 2012-2022_English_Cloze_Test.json
| `-- Open-ended_Questions
| |-- 2010-2022_Biology_Open-ended_Questions.json
| |-- 2010-2022_Chemistry_Open-ended_Questions.json
| |-- 2010-2022_Chinese_Language_Ancient_Poetry_Reading.json
| |-- 2010-2022_Chinese_Language_Classical_Chinese_Reading.json
| |-- 2010-2022_Chinese_Language_Language_and_Writing_Skills_Open-ended_Questions.json
| |-- 2010-2022_Chinese_Language_Literary_Text_Reading.json
| |-- 2010-2022_Chinese_Language_Practical_Text_Reading.json
| |-- 2010-2022_Geography_Open-ended_Questions.json
| |-- 2010-2022_History_Open-ended_Questions.json
| |-- 2010-2022_Math_II_Open-ended_Questions.json
| |-- 2010-2022_Math_I_Open-ended_Questions.json
| |-- 2010-2022_Physics_Open-ended_Questions.json
| |-- 2010-2022_Political_Science_Open-ended_Questions.json
| `-- 2012-2022_English_Language_Error_Correction.json
|-- LCSTS
| |-- test.src.txt
| `-- test.tgt.txt
|-- SuperGLUE
| |-- AX-b
| | `-- AX-b.jsonl
| |-- AX-g
| | `-- AX-g.jsonl
| |-- BoolQ
| | |-- test.jsonl
| | `-- val.jsonl
| |-- CB
| | |-- test.jsonl
| | `-- val.jsonl
| |-- COPA
| | |-- test.jsonl
| | `-- val.jsonl
| |-- MultiRC
| | |-- test.jsonl
| | `-- val.jsonl
| |-- RTE
| | |-- test.jsonl
| | `-- val.jsonl
| |-- ReCoRD
| | |-- test.jsonl
| | `-- val.jsonl
| |-- WSC
| | |-- test.jsonl
| | `-- val.jsonl
| `-- WiC
| |-- test.jsonl
| `-- val.jsonl
|-- TheoremQA
| `-- test.csv
|-- Xsum
| |-- dev.csv
| |-- dev.json
| `-- dev.jsonl
|-- ceval
| `-- formal_ceval
| |-- dev
| | |-- accountant_dev.csv
| | |-- advanced_mathematics_dev.csv
| | |-- art_studies_dev.csv
| | |-- basic_medicine_dev.csv
| | |-- business_administration_dev.csv
| | |-- chinese_language_and_literature_dev.csv
| | |-- civil_servant_dev.csv
| | |-- clinical_medicine_dev.csv
| | |-- college_chemistry_dev.csv
| | |-- college_economics_dev.csv
| | |-- college_physics_dev.csv
| | |-- college_programming_dev.csv
| | |-- computer_architecture_dev.csv
| | |-- computer_network_dev.csv
| | |-- discrete_mathematics_dev.csv
| | |-- education_science_dev.csv
| | |-- electrical_engineer_dev.csv
| | |-- environmental_impact_assessment_engineer_dev.csv
| | |-- fire_engineer_dev.csv
| | |-- high_school_biology_dev.csv
| | |-- high_school_chemistry_dev.csv
| | |-- high_school_chinese_dev.csv
| | |-- high_school_geography_dev.csv
| | |-- high_school_history_dev.csv
| | |-- high_school_mathematics_dev.csv
| | |-- high_school_physics_dev.csv
| | |-- high_school_politics_dev.csv
| | |-- ideological_and_moral_cultivation_dev.csv
| | |-- law_dev.csv
| | |-- legal_professional_dev.csv
| | |-- logic_dev.csv
| | |-- mao_zedong_thought_dev.csv
| | |-- marxism_dev.csv
| | |-- metrology_engineer_dev.csv
| | |-- middle_school_biology_dev.csv
| | |-- middle_school_chemistry_dev.csv
| | |-- middle_school_geography_dev.csv
| | |-- middle_school_history_dev.csv
| | |-- middle_school_mathematics_dev.csv
| | |-- middle_school_physics_dev.csv
| | |-- middle_school_politics_dev.csv
| | |-- modern_chinese_history_dev.csv
| | |-- operating_system_dev.csv
| | |-- physician_dev.csv
| | |-- plant_protection_dev.csv
| | |-- probability_and_statistics_dev.csv
| | |-- professional_tour_guide_dev.csv
| | |-- sports_science_dev.csv
| | |-- tax_accountant_dev.csv
| | |-- teacher_qualification_dev.csv
| | |-- urban_and_rural_planner_dev.csv
| | `-- veterinary_medicine_dev.csv
| |-- test
| | |-- accountant_test.csv
| | |-- advanced_mathematics_test.csv
| | |-- art_studies_test.csv
| | |-- basic_medicine_test.csv
| | |-- business_administration_test.csv
| | |-- chinese_language_and_literature_test.csv
| | |-- civil_servant_test.csv
| | |-- clinical_medicine_test.csv
| | |-- college_chemistry_test.csv
| | |-- college_economics_test.csv
| | |-- college_physics_test.csv
| | |-- college_programming_test.csv
| | |-- computer_architecture_test.csv
| | |-- computer_network_test.csv
| | |-- discrete_mathematics_test.csv
| | |-- education_science_test.csv
| | |-- electrical_engineer_test.csv
| | |-- environmental_impact_assessment_engineer_test.csv
| | |-- fire_engineer_test.csv
| | |-- high_school_biology_test.csv
| | |-- high_school_chemistry_test.csv
| | |-- high_school_chinese_test.csv
| | |-- high_school_geography_test.csv
| | |-- high_school_history_test.csv
| | |-- high_school_mathematics_test.csv
| | |-- high_school_physics_test.csv
| | |-- high_school_politics_test.csv
| | |-- ideological_and_moral_cultivation_test.csv
| | |-- law_test.csv
| | |-- legal_professional_test.csv
| | |-- logic_test.csv
| | |-- mao_zedong_thought_test.csv
| | |-- marxism_test.csv
| | |-- metrology_engineer_test.csv
| | |-- middle_school_biology_test.csv
| | |-- middle_school_chemistry_test.csv
| | |-- middle_school_geography_test.csv
| | |-- middle_school_history_test.csv
| | |-- middle_school_mathematics_test.csv
| | |-- middle_school_physics_test.csv
| | |-- middle_school_politics_test.csv
| | |-- modern_chinese_history_test.csv
| | |-- operating_system_test.csv
| | |-- physician_test.csv
| | |-- plant_protection_test.csv
| | |-- probability_and_statistics_test.csv
| | |-- professional_tour_guide_test.csv
| | |-- sports_science_test.csv
| | |-- tax_accountant_test.csv
| | |-- teacher_qualification_test.csv
| | |-- urban_and_rural_planner_test.csv
| | `-- veterinary_medicine_test.csv
| `-- val
| |-- accountant_val.csv
| |-- advanced_mathematics_val.csv
| |-- art_studies_val.csv
| |-- basic_medicine_val.csv
| |-- business_administration_val.csv
| |-- ceval_contamination_annotations.json
| |-- chinese_language_and_literature_val.csv
| |-- civil_servant_val.csv
| |-- clinical_medicine_val.csv
| |-- college_chemistry_val.csv
| |-- college_economics_val.csv
| |-- college_physics_val.csv
| |-- college_programming_val.csv
| |-- computer_architecture_val.csv
| |-- computer_network_val.csv
| |-- discrete_mathematics_val.csv
| |-- education_science_val.csv
| |-- electrical_engineer_val.csv
| |-- environmental_impact_assessment_engineer_val.csv
| |-- fire_engineer_val.csv
| |-- high_school_biology_val.csv
| |-- high_school_chemistry_val.csv
| |-- high_school_chinese_val.csv
| |-- high_school_geography_val.csv
| |-- high_school_history_val.csv
| |-- high_school_mathematics_val.csv
| |-- high_school_physics_val.csv
| |-- high_school_politics_val.csv
| |-- ideological_and_moral_cultivation_val.csv
| |-- law_val.csv
| |-- legal_professional_val.csv
| |-- logic_val.csv
| |-- mao_zedong_thought_val.csv
| |-- marxism_val.csv
| |-- metrology_engineer_val.csv
| |-- middle_school_biology_val.csv
| |-- middle_school_chemistry_val.csv
| |-- middle_school_geography_val.csv
| |-- middle_school_history_val.csv
| |-- middle_school_mathematics_val.csv
| |-- middle_school_physics_val.csv
| |-- middle_school_politics_val.csv
| |-- modern_chinese_history_val.csv
| |-- operating_system_val.csv
| |-- physician_val.csv
| |-- plant_protection_val.csv
| |-- probability_and_statistics_val.csv
| |-- professional_tour_guide_val.csv
| |-- sports_science_val.csv
| |-- tax_accountant_val.csv
| |-- teacher_qualification_val.csv
| |-- urban_and_rural_planner_val.csv
| `-- veterinary_medicine_val.csv
|-- cmmlu
| |-- dev
| | |-- agronomy.csv
| | |-- anatomy.csv
| | |-- ancient_chinese.csv
| | |-- arts.csv
| | |-- astronomy.csv
| | |-- business_ethics.csv
| | |-- chinese_civil_service_exam.csv
| | |-- chinese_driving_rule.csv
| | |-- chinese_food_culture.csv
| | |-- chinese_foreign_policy.csv
| | |-- chinese_history.csv
| | |-- chinese_literature.csv
| | |-- chinese_teacher_qualification.csv
| | |-- clinical_knowledge.csv
| | |-- college_actuarial_science.csv
| | |-- college_education.csv
| | |-- college_engineering_hydrology.csv
| | |-- college_law.csv
| | |-- college_mathematics.csv
| | |-- college_medical_statistics.csv
| | |-- college_medicine.csv
| | |-- computer_science.csv
| | |-- computer_security.csv
| | |-- conceptual_physics.csv
| | |-- construction_project_management.csv
| | |-- economics.csv
| | |-- education.csv
| | |-- electrical_engineering.csv
| | |-- elementary_chinese.csv
| | |-- elementary_commonsense.csv
| | |-- elementary_information_and_technology.csv
| | |-- elementary_mathematics.csv
| | |-- ethnology.csv
| | |-- food_science.csv
| | |-- genetics.csv
| | |-- global_facts.csv
| | |-- high_school_biology.csv
| | |-- high_school_chemistry.csv
| | |-- high_school_geography.csv
| | |-- high_school_mathematics.csv
| | |-- high_school_physics.csv
| | |-- high_school_politics.csv
| | |-- human_sexuality.csv
| | |-- international_law.csv
| | |-- journalism.csv
| | |-- jurisprudence.csv
| | |-- legal_and_moral_basis.csv
| | |-- logical.csv
| | |-- machine_learning.csv
| | |-- management.csv
| | |-- marketing.csv
| | |-- marxist_theory.csv
| | |-- modern_chinese.csv
| | |-- nutrition.csv
| | |-- philosophy.csv
| | |-- professional_accounting.csv
| | |-- professional_law.csv
| | |-- professional_medicine.csv
| | |-- professional_psychology.csv
| | |-- public_relations.csv
| | |-- security_study.csv
| | |-- sociology.csv
| | |-- sports_science.csv
| | |-- traditional_chinese_medicine.csv
| | |-- virology.csv
| | |-- world_history.csv
| | `-- world_religions.csv
| `-- test
| |-- agronomy.csv
| |-- anatomy.csv
| |-- ancient_chinese.csv
| |-- arts.csv
| |-- astronomy.csv
| |-- business_ethics.csv
| |-- chinese_civil_service_exam.csv
| |-- chinese_driving_rule.csv
| |-- chinese_food_culture.csv
| |-- chinese_foreign_policy.csv
| |-- chinese_history.csv
| |-- chinese_literature.csv
| |-- chinese_teacher_qualification.csv
| |-- clinical_knowledge.csv
| |-- college_actuarial_science.csv
| |-- college_education.csv
| |-- college_engineering_hydrology.csv
| |-- college_law.csv
| |-- college_mathematics.csv
| |-- college_medical_statistics.csv
| |-- college_medicine.csv
| |-- computer_science.csv
| |-- computer_security.csv
| |-- conceptual_physics.csv
| |-- construction_project_management.csv
| |-- economics.csv
| |-- education.csv
| |-- electrical_engineering.csv
| |-- elementary_chinese.csv
| |-- elementary_commonsense.csv
| |-- elementary_information_and_technology.csv
| |-- elementary_mathematics.csv
| |-- ethnology.csv
| |-- food_science.csv
| |-- genetics.csv
| |-- global_facts.csv
| |-- high_school_biology.csv
| |-- high_school_chemistry.csv
| |-- high_school_geography.csv
| |-- high_school_mathematics.csv
| |-- high_school_physics.csv
| |-- high_school_politics.csv
| |-- human_sexuality.csv
| |-- international_law.csv
| |-- journalism.csv
| |-- jurisprudence.csv
| |-- legal_and_moral_basis.csv
| |-- logical.csv
| |-- machine_learning.csv
| |-- management.csv
| |-- marketing.csv
| |-- marxist_theory.csv
| |-- modern_chinese.csv
| |-- nutrition.csv
| |-- philosophy.csv
| |-- professional_accounting.csv
| |-- professional_law.csv
| |-- professional_medicine.csv
| |-- professional_psychology.csv
| |-- public_relations.csv
| |-- security_study.csv
| |-- sociology.csv
| |-- sports_science.csv
| |-- traditional_chinese_medicine.csv
| |-- virology.csv
| |-- world_history.csv
| `-- world_religions.csv
|-- commonsenseqa
| |-- dev_rand_split.jsonl
| |-- test_rand_split_no_answers.jsonl
| `-- train_rand_split.jsonl
|-- drop
| |-- drop_dataset_dev.json
| |-- drop_dataset_train.json
| `-- license.txt
|-- flores_first100
| |-- dev
| | |-- afr_Latn.dev
| | |-- amh_Ethi.dev
| | |-- arb_Arab.dev
| | |-- asm_Beng.dev
| | |-- ast_Latn.dev
| | |-- azj_Latn.dev
| | |-- bel_Cyrl.dev
| | |-- ben_Beng.dev
| | |-- bos_Latn.dev
| | |-- bul_Cyrl.dev
| | |-- cat_Latn.dev
| | |-- ceb_Latn.dev
| | |-- ces_Latn.dev
| | |-- ckb_Arab.dev
| | |-- cym_Latn.dev
| | |-- dan_Latn.dev
| | |-- deu_Latn.dev
| | |-- ell_Grek.dev
| | |-- eng_Latn.dev
| | |-- est_Latn.dev
| | |-- fin_Latn.dev
| | |-- fra_Latn.dev
| | |-- fuv_Latn.dev
| | |-- gaz_Latn.dev
| | |-- gle_Latn.dev
| | |-- glg_Latn.dev
| | |-- guj_Gujr.dev
| | |-- hau_Latn.dev
| | |-- heb_Hebr.dev
| | |-- hin_Deva.dev
| | |-- hrv_Latn.dev
| | |-- hun_Latn.dev
| | |-- hye_Armn.dev
| | |-- ibo_Latn.dev
| | |-- ind_Latn.dev
| | |-- isl_Latn.dev
| | |-- ita_Latn.dev
| | |-- jav_Latn.dev
| | |-- jpn_Jpan.dev
| | |-- kam_Latn.dev
| | |-- kan_Knda.dev
| | |-- kat_Geor.dev
| | |-- kaz_Cyrl.dev
| | |-- kea_Latn.dev
| | |-- khk_Cyrl.dev
| | |-- khm_Khmr.dev
| | |-- kir_Cyrl.dev
| | |-- kor_Hang.dev
| | |-- lao_Laoo.dev
| | |-- lin_Latn.dev
| | |-- lit_Latn.dev
| | |-- ltz_Latn.dev
| | |-- lug_Latn.dev
| | |-- luo_Latn.dev
| | |-- lvs_Latn.dev
| | |-- mal_Mlym.dev
| | |-- mar_Deva.dev
| | |-- mkd_Cyrl.dev
| | |-- mlt_Latn.dev
| | |-- mri_Latn.dev
| | |-- mya_Mymr.dev
| | |-- nld_Latn.dev
| | |-- nob_Latn.dev
| | |-- npi_Deva.dev
| | |-- nso_Latn.dev
| | |-- nya_Latn.dev
| | |-- oci_Latn.dev
| | |-- ory_Orya.dev
| | |-- pan_Guru.dev
| | |-- pbt_Arab.dev
| | |-- pes_Arab.dev
| | |-- pol_Latn.dev
| | |-- por_Latn.dev
| | |-- ron_Latn.dev
| | |-- rus_Cyrl.dev
| | |-- slk_Latn.dev
| | |-- slv_Latn.dev
| | |-- sna_Latn.dev
| | |-- snd_Arab.dev
| | |-- som_Latn.dev
| | |-- spa_Latn.dev
| | |-- srp_Cyrl.dev
| | |-- swe_Latn.dev
| | |-- swh_Latn.dev
| | |-- tam_Taml.dev
| | |-- tel_Telu.dev
| | |-- tgk_Cyrl.dev
| | |-- tgl_Latn.dev
| | |-- tha_Thai.dev
| | |-- tur_Latn.dev
| | |-- ukr_Cyrl.dev
| | |-- umb_Latn.dev
| | |-- urd_Arab.dev
| | |-- uzn_Latn.dev
| | |-- vie_Latn.dev
| | |-- wol_Latn.dev
| | |-- xho_Latn.dev
| | |-- yor_Latn.dev
| | |-- zho_Hans.dev
| | |-- zho_Hant.dev
| | |-- zsm_Latn.dev
| | `-- zul_Latn.dev
| `-- devtest
| |-- afr_Latn.devtest
| |-- amh_Ethi.devtest
| |-- arb_Arab.devtest
| |-- asm_Beng.devtest
| |-- ast_Latn.devtest
| |-- azj_Latn.devtest
| |-- bel_Cyrl.devtest
| |-- ben_Beng.devtest
| |-- bos_Latn.devtest
| |-- bul_Cyrl.devtest
| |-- cat_Latn.devtest
| |-- ceb_Latn.devtest
| |-- ces_Latn.devtest
| |-- ckb_Arab.devtest
| |-- cym_Latn.devtest
| |-- dan_Latn.devtest
| |-- deu_Latn.devtest
| |-- ell_Grek.devtest
| |-- eng_Latn.devtest
| |-- est_Latn.devtest
| |-- fin_Latn.devtest
| |-- fra_Latn.devtest
| |-- fuv_Latn.devtest
| |-- gaz_Latn.devtest
| |-- gle_Latn.devtest
| |-- glg_Latn.devtest
| |-- guj_Gujr.devtest
| |-- hau_Latn.devtest
| |-- heb_Hebr.devtest
| |-- hin_Deva.devtest
| |-- hrv_Latn.devtest
| |-- hun_Latn.devtest
| |-- hye_Armn.devtest
| |-- ibo_Latn.devtest
| |-- ind_Latn.devtest
| |-- isl_Latn.devtest
| |-- ita_Latn.devtest
| |-- jav_Latn.devtest
| |-- jpn_Jpan.devtest
| |-- kam_Latn.devtest
| |-- kan_Knda.devtest
| |-- kat_Geor.devtest
| |-- kaz_Cyrl.devtest
| |-- kea_Latn.devtest
| |-- khk_Cyrl.devtest
| |-- khm_Khmr.devtest
| |-- kir_Cyrl.devtest
| |-- kor_Hang.devtest
| |-- lao_Laoo.devtest
| |-- lin_Latn.devtest
| |-- lit_Latn.devtest
| |-- ltz_Latn.devtest
| |-- lug_Latn.devtest
| |-- luo_Latn.devtest
| |-- lvs_Latn.devtest
| |-- mal_Mlym.devtest
| |-- mar_Deva.devtest
| |-- mkd_Cyrl.devtest
| |-- mlt_Latn.devtest
| |-- mri_Latn.devtest
| |-- mya_Mymr.devtest
| |-- nld_Latn.devtest
| |-- nob_Latn.devtest
| |-- npi_Deva.devtest
| |-- nso_Latn.devtest
| |-- nya_Latn.devtest
| |-- oci_Latn.devtest
| |-- ory_Orya.devtest
| |-- pan_Guru.devtest
| |-- pbt_Arab.devtest
| |-- pes_Arab.devtest
| |-- pol_Latn.devtest
| |-- por_Latn.devtest
| |-- ron_Latn.devtest
| |-- rus_Cyrl.devtest
| |-- slk_Latn.devtest
| |-- slv_Latn.devtest
| |-- sna_Latn.devtest
| |-- snd_Arab.devtest
| |-- som_Latn.devtest
| |-- spa_Latn.devtest
| |-- srp_Cyrl.devtest
| |-- swe_Latn.devtest
| |-- swh_Latn.devtest
| |-- tam_Taml.devtest
| |-- tel_Telu.devtest
| |-- tgk_Cyrl.devtest
| |-- tgl_Latn.devtest
| |-- tha_Thai.devtest
| |-- tur_Latn.devtest
| |-- ukr_Cyrl.devtest
| |-- umb_Latn.devtest
| |-- urd_Arab.devtest
| |-- uzn_Latn.devtest
| |-- vie_Latn.devtest
| |-- wol_Latn.devtest
| |-- xho_Latn.devtest
| |-- yor_Latn.devtest
| |-- zho_Hans.devtest
| |-- zho_Hant.devtest
| |-- zsm_Latn.devtest
| `-- zul_Latn.devtest
|-- gsm8k
| |-- test.jsonl
| |-- test_socratic.jsonl
| |-- train.jsonl
| `-- train_socratic.jsonl
|-- hellaswag
| |-- hellaswag.jsonl
| |-- hellaswag_train.jsonl
| |-- hellaswag_train_sampled25.jsonl
| `-- hellaswag_val_contamination_annotations.json
|-- humaneval
| `-- human-eval-v2-20210705.jsonl
|-- lambada
| `-- test.jsonl
|-- math
| `-- math.json
|-- mbpp
| |-- mbpp.jsonl
| `-- sanitized-mbpp.jsonl
|-- mmlu
| |-- README.txt
| |-- dev
| | |-- abstract_algebra_dev.csv
| | |-- anatomy_dev.csv
| | |-- astronomy_dev.csv
| | |-- business_ethics_dev.csv
| | |-- clinical_knowledge_dev.csv
| | |-- college_biology_dev.csv
| | |-- college_chemistry_dev.csv
| | |-- college_computer_science_dev.csv
| | |-- college_mathematics_dev.csv
| | |-- college_medicine_dev.csv
| | |-- college_physics_dev.csv
| | |-- computer_security_dev.csv
| | |-- conceptual_physics_dev.csv
| | |-- econometrics_dev.csv
| | |-- electrical_engineering_dev.csv
| | |-- elementary_mathematics_dev.csv
| | |-- formal_logic_dev.csv
| | |-- global_facts_dev.csv
| | |-- high_school_biology_dev.csv
| | |-- high_school_chemistry_dev.csv
| | |-- high_school_computer_science_dev.csv
| | |-- high_school_european_history_dev.csv
| | |-- high_school_geography_dev.csv
| | |-- high_school_government_and_politics_dev.csv
| | |-- high_school_macroeconomics_dev.csv
| | |-- high_school_mathematics_dev.csv
| | |-- high_school_microeconomics_dev.csv
| | |-- high_school_physics_dev.csv
| | |-- high_school_psychology_dev.csv
| | |-- high_school_statistics_dev.csv
| | |-- high_school_us_history_dev.csv
| | |-- high_school_world_history_dev.csv
| | |-- human_aging_dev.csv
| | |-- human_sexuality_dev.csv
| | |-- international_law_dev.csv
| | |-- jurisprudence_dev.csv
| | |-- logical_fallacies_dev.csv
| | |-- machine_learning_dev.csv
| | |-- management_dev.csv
| | |-- marketing_dev.csv
| | |-- medical_genetics_dev.csv
| | |-- miscellaneous_dev.csv
| | |-- moral_disputes_dev.csv
| | |-- moral_scenarios_dev.csv
| | |-- nutrition_dev.csv
| | |-- philosophy_dev.csv
| | |-- prehistory_dev.csv
| | |-- professional_accounting_dev.csv
| | |-- professional_law_dev.csv
| | |-- professional_medicine_dev.csv
| | |-- professional_psychology_dev.csv
| | |-- public_relations_dev.csv
| | |-- security_studies_dev.csv
| | |-- sociology_dev.csv
| | |-- us_foreign_policy_dev.csv
| | |-- virology_dev.csv
| | `-- world_religions_dev.csv
| |-- possibly_contaminated_urls.txt
| |-- test
| | |-- MMLU_test_contamination_annotations.json
| | |-- abstract_algebra_test.csv
| | |-- anatomy_test.csv
| | |-- astronomy_test.csv
| | |-- business_ethics_test.csv
| | |-- clinical_knowledge_test.csv
| | |-- college_biology_test.csv
| | |-- college_chemistry_test.csv
| | |-- college_computer_science_test.csv
| | |-- college_mathematics_test.csv
| | |-- college_medicine_test.csv
| | |-- college_physics_test.csv
| | |-- computer_security_test.csv
| | |-- conceptual_physics_test.csv
| | |-- econometrics_test.csv
| | |-- electrical_engineering_test.csv
| | |-- elementary_mathematics_test.csv
| | |-- formal_logic_test.csv
| | |-- global_facts_test.csv
| | |-- high_school_biology_test.csv
| | |-- high_school_chemistry_test.csv
| | |-- high_school_computer_science_test.csv
| | |-- high_school_european_history_test.csv
| | |-- high_school_geography_test.csv
| | |-- high_school_government_and_politics_test.csv
| | |-- high_school_macroeconomics_test.csv
| | |-- high_school_mathematics_test.csv
| | |-- high_school_microeconomics_test.csv
| | |-- high_school_physics_test.csv
| | |-- high_school_psychology_test.csv
| | |-- high_school_statistics_test.csv
| | |-- high_school_us_history_test.csv
| | |-- high_school_world_history_test.csv
| | |-- human_aging_test.csv
| | |-- human_sexuality_test.csv
| | |-- international_law_test.csv
| | |-- jurisprudence_test.csv
| | |-- logical_fallacies_test.csv
| | |-- machine_learning_test.csv
| | |-- management_test.csv
| | |-- marketing_test.csv
| | |-- medical_genetics_test.csv
| | |-- miscellaneous_test.csv
| | |-- moral_disputes_test.csv
| | |-- moral_scenarios_test.csv
| | |-- nutrition_test.csv
| | |-- philosophy_test.csv
| | |-- prehistory_test.csv
| | |-- professional_accounting_test.csv
| | |-- professional_law_test.csv
| | |-- professional_medicine_test.csv
| | |-- professional_psychology_test.csv
| | |-- public_relations_test.csv
| | |-- security_studies_test.csv
| | |-- sociology_test.csv
| | |-- us_foreign_policy_test.csv
| | |-- virology_test.csv
| | `-- world_religions_test.csv
| `-- val
| |-- abstract_algebra_val.csv
| |-- anatomy_val.csv
| |-- astronomy_val.csv
| |-- business_ethics_val.csv
| |-- clinical_knowledge_val.csv
| |-- college_biology_val.csv
| |-- college_chemistry_val.csv
| |-- college_computer_science_val.csv
| |-- college_mathematics_val.csv
| |-- college_medicine_val.csv
| |-- college_physics_val.csv
| |-- computer_security_val.csv
| |-- conceptual_physics_val.csv
| |-- econometrics_val.csv
| |-- electrical_engineering_val.csv
| |-- elementary_mathematics_val.csv
| |-- formal_logic_val.csv
| |-- global_facts_val.csv
| |-- high_school_biology_val.csv
| |-- high_school_chemistry_val.csv
| |-- high_school_computer_science_val.csv
| |-- high_school_european_history_val.csv
| |-- high_school_geography_val.csv
| |-- high_school_government_and_politics_val.csv
| |-- high_school_macroeconomics_val.csv
| |-- high_school_mathematics_val.csv
| |-- high_school_microeconomics_val.csv
| |-- high_school_physics_val.csv
| |-- high_school_psychology_val.csv
| |-- high_school_statistics_val.csv
| |-- high_school_us_history_val.csv
| |-- high_school_world_history_val.csv
| |-- human_aging_val.csv
| |-- human_sexuality_val.csv
| |-- international_law_val.csv
| |-- jurisprudence_val.csv
| |-- logical_fallacies_val.csv
| |-- machine_learning_val.csv
| |-- management_val.csv
| |-- marketing_val.csv
| |-- medical_genetics_val.csv
| |-- miscellaneous_val.csv
| |-- moral_disputes_val.csv
| |-- moral_scenarios_val.csv
| |-- nutrition_val.csv
| |-- philosophy_val.csv
| |-- prehistory_val.csv
| |-- professional_accounting_val.csv
| |-- professional_law_val.csv
| |-- professional_medicine_val.csv
| |-- professional_psychology_val.csv
| |-- public_relations_val.csv
| |-- security_studies_val.csv
| |-- sociology_val.csv
| |-- us_foreign_policy_val.csv
| |-- virology_val.csv
| `-- world_religions_val.csv
|-- nq
| |-- nq-dev.qa.csv
| `-- nq-test.qa.csv
|-- openbookqa
| |-- Additional
| | |-- crowdsourced-facts.txt
| | |-- dev_complete.jsonl
| | |-- test_complete.jsonl
| | `-- train_complete.jsonl
| `-- Main
| |-- dev.jsonl
| |-- dev.tsv
| |-- openbook.txt
| |-- test.jsonl
| |-- test.tsv
| |-- train.jsonl
| `-- train.tsv
|-- piqa
| |-- dev-labels.lst
| |-- dev.jsonl
| |-- train-labels.lst
| `-- train.jsonl
|-- race
| |-- test
| | |-- high.jsonl
| | `-- middle.jsonl
| `-- validation
| |-- high.jsonl
| `-- middle.jsonl
|-- siqa
| |-- dev-labels.lst
| |-- dev.jsonl
| |-- train-labels.lst
| `-- train.jsonl
|-- strategyqa
| `-- strategyQA_train.json
|-- summedits
| `-- summedits.jsonl
|-- triviaqa
| |-- trivia-dev.qa.csv
| |-- trivia-test.qa.csv
| |-- triviaqa-train.jsonl
| `-- triviaqa-validation.jsonl
|-- tydiqa
| `-- dev
| |-- arabic-dev.jsonl
| |-- bengali-dev.jsonl
| |-- english-dev.jsonl
| |-- finnish-dev.jsonl
| |-- indonesian-dev.jsonl
| |-- japanese-dev.jsonl
| |-- korean-dev.jsonl
| |-- russian-dev.jsonl
| |-- swahili-dev.jsonl
| |-- telugu-dev.jsonl
| `-- thai-dev.jsonl
|-- winogrande
| |-- README.md
| |-- dev-labels.lst
| |-- dev.jsonl
| |-- eval.py
| |-- sample-submission-labels.lst
| |-- test.jsonl
| |-- train_debiased-labels.lst
| |-- train_debiased.jsonl
| |-- train_l-labels.lst
| |-- train_l.jsonl
| |-- train_m-labels.lst
| |-- train_m.jsonl
| |-- train_s-labels.lst
| |-- train_s.jsonl
| |-- train_xl-labels.lst
| |-- train_xl.jsonl
| |-- train_xs-labels.lst
| `-- train_xs.jsonl
`-- xstory_cloze
|-- ar_eval.jsonl
|-- ar_train.jsonl
|-- en_eval.jsonl
|-- en_train.jsonl
|-- es_eval.jsonl
|-- es_train.jsonl
|-- eu_eval.jsonl
|-- eu_train.jsonl
|-- hi_eval.jsonl
|-- hi_train.jsonl
|-- id_eval.jsonl
|-- id_train.jsonl
|-- my_eval.jsonl
|-- my_train.jsonl
|-- ru_eval.jsonl
|-- ru_train.jsonl
|-- sw_eval.jsonl
|-- sw_train.jsonl
|-- te_eval.jsonl
|-- te_train.jsonl
|-- zh_eval.jsonl
`-- zh_train.jsonl
85 directories, 1062 files
Look up the config file paths for the Llama models and the C-Eval datasets
(llama3) root@intern-studio-061925:~/opencompass# python tools/list_configs.py llama ceval
+----------------------------+-------------------------------------------------------+
| Model | Config Path |
|----------------------------+-------------------------------------------------------|
| accessory_llama2_7b | configs/models/accessory/accessory_llama2_7b.py |
| hf_codellama_13b | configs/models/codellama/hf_codellama_13b.py |
| hf_codellama_13b_instruct | configs/models/codellama/hf_codellama_13b_instruct.py |
| hf_codellama_13b_python | configs/models/codellama/hf_codellama_13b_python.py |
| hf_codellama_34b | configs/models/codellama/hf_codellama_34b.py |
| hf_codellama_34b_instruct | configs/models/codellama/hf_codellama_34b_instruct.py |
| hf_codellama_34b_python | configs/models/codellama/hf_codellama_34b_python.py |
| hf_codellama_7b | configs/models/codellama/hf_codellama_7b.py |
| hf_codellama_7b_instruct | configs/models/codellama/hf_codellama_7b_instruct.py |
| hf_codellama_7b_python | configs/models/codellama/hf_codellama_7b_python.py |
| hf_gsm8k_rft_llama7b2_u13b | configs/models/others/hf_gsm8k_rft_llama7b2_u13b.py |
| hf_llama2_13b | configs/models/hf_llama/hf_llama2_13b.py |
| hf_llama2_13b_chat | configs/models/hf_llama/hf_llama2_13b_chat.py |
| hf_llama2_70b | configs/models/hf_llama/hf_llama2_70b.py |
| hf_llama2_70b_chat | configs/models/hf_llama/hf_llama2_70b_chat.py |
| hf_llama2_7b | configs/models/hf_llama/hf_llama2_7b.py |
| hf_llama2_7b_chat | configs/models/hf_llama/hf_llama2_7b_chat.py |
| hf_llama3_70b | configs/models/hf_llama/hf_llama3_70b.py |
| hf_llama3_70b_instruct | configs/models/hf_llama/hf_llama3_70b_instruct.py |
| hf_llama3_8b | configs/models/hf_llama/hf_llama3_8b.py |
| hf_llama3_8b_instruct | configs/models/hf_llama/hf_llama3_8b_instruct.py |
| hf_llama_13b | configs/models/hf_llama/hf_llama_13b.py |
| hf_llama_30b | configs/models/hf_llama/hf_llama_30b.py |
| hf_llama_65b | configs/models/hf_llama/hf_llama_65b.py |
| hf_llama_7b | configs/models/hf_llama/hf_llama_7b.py |
| llama2_13b | configs/models/llama/llama2_13b.py |
| llama2_13b_chat | configs/models/llama/llama2_13b_chat.py |
| llama2_70b | configs/models/llama/llama2_70b.py |
| llama2_70b_chat | configs/models/llama/llama2_70b_chat.py |
| llama2_7b | configs/models/llama/llama2_7b.py |
| llama2_7b_chat | configs/models/llama/llama2_7b_chat.py |
| llama_13b | configs/models/llama/llama_13b.py |
| llama_30b | configs/models/llama/llama_30b.py |
| llama_65b | configs/models/llama/llama_65b.py |
| llama_7b | configs/models/llama/llama_7b.py |
+----------------------------+-------------------------------------------------------+
+--------------------------------+------------------------------------------------------------------+
| Dataset | Config Path |
|--------------------------------+------------------------------------------------------------------|
| base_medium_llama | configs/datasets/collections/base_medium_llama.py |
| ceval_clean_ppl | configs/datasets/ceval/ceval_clean_ppl.py |
| ceval_contamination_ppl_810ec6 | configs/datasets/contamination/ceval_contamination_ppl_810ec6.py |
| ceval_gen | configs/datasets/ceval/ceval_gen.py |
| ceval_gen_2daf24 | configs/datasets/ceval/ceval_gen_2daf24.py |
| ceval_gen_5f30c7 | configs/datasets/ceval/ceval_gen_5f30c7.py |
| ceval_internal_ppl_1cd8bf | configs/datasets/ceval/ceval_internal_ppl_1cd8bf.py |
| ceval_ppl | configs/datasets/ceval/ceval_ppl.py |
| ceval_ppl_1cd8bf | configs/datasets/ceval/ceval_ppl_1cd8bf.py |
| ceval_ppl_578f8d | configs/datasets/ceval/ceval_ppl_578f8d.py |
| ceval_ppl_93e5ce | configs/datasets/ceval/ceval_ppl_93e5ce.py |
| ceval_zero_shot_gen_bd40ef | configs/datasets/ceval/ceval_zero_shot_gen_bd40ef.py |
+--------------------------------+------------------------------------------------------------------+
(llama3) root@intern-studio-061925:~/opencompass#
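The listing above is long, and only the Llama 3 model rows and the ceval_gen dataset row matter for the rest of this walkthrough. Since the tool prints plain text, standard grep can narrow it. Below is a minimal sketch; the helper name filter_rows is hypothetical, and a few rows copied from the output above stand in for the real command so the snippet runs without OpenCompass installed.

```shell
# Hypothetical helper: keep only the table rows that match an extended regex.
filter_rows() {
  grep -E "$1"
}

# Stand-in input: rows copied from the listing above.
sample='| hf_llama2_7b_chat | configs/models/hf_llama/hf_llama2_7b_chat.py |
| hf_llama3_8b | configs/models/hf_llama/hf_llama3_8b.py |
| hf_llama3_8b_instruct | configs/models/hf_llama/hf_llama3_8b_instruct.py |
| ceval_gen | configs/datasets/ceval/ceval_gen.py |
| ceval_ppl | configs/datasets/ceval/ceval_ppl.py |'

# The trailing space after ceval_gen keeps variants such as ceval_gen_2daf24 out.
matches=$(printf '%s\n' "$sample" | filter_rows 'llama3|ceval_gen ')
printf '%s\n' "$matches"
```

With OpenCompass installed, the same pattern would presumably be applied directly to the tool's output: python tools/list_configs.py llama ceval | grep -E 'llama3|ceval_gen '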
🏗️ Quick evaluation from the command line
Taking ceval_gen (C-Eval) as the example:
(llama3) root@intern-studio-061925:~/opencompass# python run.py --datasets ceval_gen --hf-path /root/model/Meta-Llama-3-8B-Instruct/ --tokenizer-path /root/model/Meta-Llama-3-8B-Instruct --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 2048 --max-out-len 16 --batch-size 4 --num-gpus 1 --debug
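Command breakdown:
python run.py \
--datasets ceval_gen \
--hf-path /root/model/Meta-Llama-3-8B-Instruct/ \
--tokenizer-path /root/model/Meta-Llama-3-8B-Instruct/ \
--tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True \
--model-kwargs device_map='auto' trust_remote_code=True \
--max-seq-len 2048 \
--max-out-len 16 \
--batch-size 4 \
--num-gpus 1 \
--debug
What each option does:
--datasets: the dataset config to evaluate (here ceval_gen)
--hf-path: HuggingFace model path
--tokenizer-path: HuggingFace tokenizer path (can be omitted if it is the same as the model path)
--tokenizer-kwargs: arguments used to build the tokenizer
--model-kwargs: arguments used to build the model
--max-seq-len: maximum sequence length the model can accept
--max-out-len: maximum number of tokens to generate
--batch-size: batch size
--num-gpus: number of GPUs needed to run the model
--debug: debug mode; tasks run one after another and their output is printed to the terminal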
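A comment placed after a trailing backslash ends the shell's line continuation, so a multi-line command cannot carry per-option comments and still be pasted as-is. One paste-safe alternative is to build the argument list one line at a time with the POSIX set -- idiom. This is a sketch rather than anything OpenCompass prescribes; the variable name MODEL_DIR is an assumption, and the last line only prints the command (a dry run) instead of starting an evaluation.

```shell
MODEL_DIR=/root/model/Meta-Llama-3-8B-Instruct   # model path used in this tutorial

set --                                     # start from an empty argument list
set -- "$@" --datasets ceval_gen           # dataset config to evaluate
set -- "$@" --hf-path "$MODEL_DIR"         # HuggingFace model path
set -- "$@" --tokenizer-path "$MODEL_DIR"  # tokenizer path (same as the model here)
set -- "$@" --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True
set -- "$@" --model-kwargs device_map='auto' trust_remote_code=True
set -- "$@" --max-seq-len 2048             # longest input sequence accepted
set -- "$@" --max-out-len 16               # maximum number of generated tokens
set -- "$@" --batch-size 4                 # batch size
set -- "$@" --num-gpus 1                   # GPUs needed to run the model
set -- "$@" --debug                        # debug mode

# Dry run: print the assembled command instead of executing it.
echo python run.py "$@"
```

Dropping the echo on the last line would run the evaluation itself, from the opencompass directory.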
Check the GPU status, then set export CUDA_VISIBLE_DEVICES=0
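The tutorial does not show how the GPU status was checked. A minimal sketch, assuming the NVIDIA driver tools are installed and that GPU 0 is the one to use:

```shell
# Show per-GPU memory use if the NVIDIA driver tools are available.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv
else
  echo "nvidia-smi not found; cannot list GPUs" >&2
fi

# Expose only GPU 0 to processes started from this shell.
export CUDA_VISIBLE_DEVICES=0
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
```

The export lasts only for the current shell session, so it has to be repeated in each new terminal.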
Then rerun:
python run.py --datasets ceval_gen --hf-path /root/model/Meta-Llama-3-8B-Instruct/ --tokenizer-path /root/model/Meta-Llama-3-8B-Instruct --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 2048 --max-out-len 16 --batch-size 4 --num-gpus 1 --debug
The output is:
The run hits an error. Fix:
pip install protobuf
(llama3) root@intern-studio-061925:~/opencompass# pip install protobuf
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting protobuf
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/2c/2a/d2741cad35fa5f06d9c59dda3274e5727ca11075dfd7de3f69c100efdcad/protobuf-5.26.1-cp37-abi3-manylinux2014_x86_64.whl (302 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 302.8/302.8 kB 4.1 MB/s eta 0:00:00
Installing collected packages: protobuf
Successfully installed protobuf-5.26.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Rerunning the command still produces an error.
Install LMDeploy:
pip install 'lmdeploy[all]==0.3.0'
Then set one of the following environment variables:
export MKL_SERVICE_FORCE_INTEL=1
# or
export MKL_THREADING_LAYER=GNU
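Both variables are named in the mkl-service message that appears in the log ("set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it"). If exporting for the whole session is not wanted, a variable can be scoped to a single command by prefixing it. In this sketch, env stands in for the real evaluation command so the snippet runs anywhere; the variable name seen is hypothetical.

```shell
# Apply the setting to one command only; the parent shell is left unchanged.
# `env` prints the environment the child process actually receives.
seen=$(MKL_THREADING_LAYER=GNU env | grep '^MKL_THREADING_LAYER=')
echo "$seen"
```

For the real run, the prefix would go in front of the evaluation command, for example MKL_THREADING_LAYER=GNU python run.py followed by the same options as before.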
The output is:
(llama3) root@intern-studio-061925:~/opencompass# python run.py --datasets ceval_gen --hf-path /root/model/Meta-Llama-3-8B-Instruct/ --tokenizer-path /root/model/Meta-Llama-3-8B-Instruct --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 2048 --max-out-len 16 --batch-size 4 --num-gpus 1 --debug
05/07 18:45:31 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
05/07 18:45:31 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
/root/.conda/envs/llama3/lib/python3.10/site-packages/mmengine/utils/path.py
/root/opencompass/outputs/default/20240507_184531/configs/20240507_184531.py
05/07 18:45:31 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
05/07 18:45:31 - OpenCompass - DEBUG - Modules of opencompass's partitioner registry have been automatically imported from opencompass.partitioners
05/07 18:45:31 - OpenCompass - DEBUG - Get class `SizePartitioner` from "partitioner" registry in "opencompass"
05/07 18:45:31 - OpenCompass - DEBUG - An `SizePartitioner` instance is built from registry, and its implementation can be found in opencompass.partitioners.size
05/07 18:45:31 - OpenCompass - DEBUG - Key eval.runner.task.judge_cfg not found in config, ignored.
05/07 18:45:31 - OpenCompass - DEBUG - Key eval.runner.task.dump_details not found in config, ignored.
05/07 18:45:31 - OpenCompass - DEBUG - Key eval.given_pred not found in config, ignored.
05/07 18:45:31 - OpenCompass - DEBUG - Additional config: {}
05/07 18:45:31 - OpenCompass - INFO - Partitioned into 1 tasks.
05/07 18:45:31 - OpenCompass - DEBUG - Task 0: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_economics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-accountant,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-physician,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_programming,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-business_administration,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-art_studies,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-education_science,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-law,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-veter
inary_medicine,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-logic,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_network,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-operating_system,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_physics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-midd
le_school_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-marxism,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-sports_science,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_geography]
05/07 18:45:31 - OpenCompass - DEBUG - Modules of opencompass's runner registry have been automatically imported from opencompass.runners
05/07 18:45:31 - OpenCompass - DEBUG - Get class `LocalRunner` from "runner" registry in "opencompass"
05/07 18:45:31 - OpenCompass - DEBUG - An `LocalRunner` instance is built from registry, and its implementation can be found in opencompass.runners.local
05/07 18:45:31 - OpenCompass - DEBUG - Modules of opencompass's task registry have been automatically imported from opencompass.tasks
05/07 18:45:31 - OpenCompass - DEBUG - Get class `OpenICLInferTask` from "task" registry in "opencompass"
05/07 18:45:31 - OpenCompass - DEBUG - An `OpenICLInferTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_infer
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp-a34b3233.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
/root/.conda/envs/llama3/lib/python3.10/site-packages/mmengine/utils/path.py
/root/opencompass/tmp/1100_params.py
05/07 18:46:00 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_economics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-accountant,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-physician,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_programming,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-business_administration,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-art_studies,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-education_science,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-law,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-veterinar
y_medicine,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-logic,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_network,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-operating_system,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_physics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_s
chool_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-marxism,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-sports_science,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_geography]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
05/07 18:46:08 - OpenCompass - WARNING - pad_token_id is not set for the tokenizer.
05/07 18:46:08 - OpenCompass - WARNING - Using eos_token_id <|end_of_text|> as pad_token_id.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████| 4/4 [02:25<00:00, 36.38s/it]
05/07 18:49:07 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_economics]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 55/55 [00:00<00:00, 1469342.17it/s]
[2024-05-07 18:49:07,887] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/14 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
7%|██████▊ | 1/14 [00:08<01:45, 8.08s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
14%|█████████████▋ | 2/14 [00:11<01:02, 5.22s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
21%|████████████████████▌ | 3/14 [00:15<00:52, 4.73s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
29%|███████████████████████████▍ | 4/14 [00:18<00:39, 3.98s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
36%|██████████████████████████████████▎ | 5/14 [00:21<00:32, 3.59s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
43%|█████████████████████████████████████████▏ | 6/14 [00:24<00:27, 3.43s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████ | 7/14 [00:27<00:23, 3.41s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
57%|██████████████████████████████████████████████████████▊ | 8/14 [00:31<00:20, 3.46s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
64%|█████████████████████████████████████████████████████████████▋ | 9/14 [00:33<00:15, 3.19s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
71%|███████████████████████████████████████████████████████████████████▊ | 10/14 [00:37<00:13, 3.33s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
79%|██████████████████████████████████████████████████████████████████████████▋ | 11/14 [00:40<00:10, 3.34s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
86%|█████████████████████████████████████████████████████████████████████████████████▍ | 12/14 [00:44<00:06, 3.41s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
93%|████████████████████████████████████████████████████████████████████████████████████████▏ | 13/14 [00:47<00:03, 3.34s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 14/14 [00:50<00:00, 3.58s/it]
05/07 18:49:58 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-accountant]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1447330.25it/s]
[2024-05-07 18:49:58,230] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/13 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8%|███████▍ | 1/13 [00:03<00:45, 3.78s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
15%|██████████████▊ | 2/13 [00:08<00:47, 4.30s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
23%|██████████████████████▏ | 3/13 [00:13<00:48, 4.81s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
31%|█████████████████████████████▌ | 4/13 [00:18<00:42, 4.70s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
38%|████████████████████████████████████▉ | 5/13 [00:22<00:35, 4.41s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
46%|████████████████████████████████████████████▎ | 6/13 [00:26<00:31, 4.45s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
54%|███████████████████████████████████████████████████▋ | 7/13 [00:36<00:36, 6.02s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
62%|███████████████████████████████████████████████████████████ | 8/13 [00:40<00:26, 5.39s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
69%|██████████████████████████████████████████████████████████████████▍ | 9/13 [00:43<00:19, 4.80s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
77%|█████████████████████████████████████████████████████████████████████████ | 10/13 [00:46<00:13, 4.35s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
85%|████████████████████████████████████████████████████████████████████████████████▍ | 11/13 [00:52<00:09, 4.59s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
92%|███████████████████████████████████████████████████████████████████████████████████████▋ | 12/13 [00:56<00:04, 4.56s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:57<00:00, 4.42s/it]
05/07 18:50:55 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-tax_accountant]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1053953.31it/s]
[2024-05-07 18:50:55,905] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/13 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8%|███████▍ | 1/13 [00:04<00:49, 4.16s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
15%|██████████████▊ | 2/13 [00:08<00:43, 3.99s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
23%|██████████████████████▏ | 3/13 [00:12<00:39, 4.00s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
31%|█████████████████████████████▌ | 4/13 [00:15<00:34, 3.82s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
38%|████████████████████████████████████▉ | 5/13 [00:19<00:31, 3.95s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
46%|████████████████████████████████████████████▎ | 6/13 [00:23<00:26, 3.74s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
54%|███████████████████████████████████████████████████▋ | 7/13 [00:28<00:25, 4.31s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
62%|███████████████████████████████████████████████████████████ | 8/13 [00:32<00:20, 4.14s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
69%|██████████████████████████████████████████████████████████████████▍ | 9/13 [00:36<00:16, 4.11s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
77%|█████████████████████████████████████████████████████████████████████████ | 10/13 [00:39<00:11, 3.92s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
85%|████████████████████████████████████████████████████████████████████████████████▍ | 11/13 [00:43<00:07, 3.93s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
92%|███████████████████████████████████████████████████████████████████████████████████████▋ | 12/13 [00:47<00:03, 3.75s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:48<00:00, 3.70s/it]
05/07 18:51:44 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-physician]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1468006.40it/s]
[2024-05-07 18:51:44,149] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/13 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8%|███████▍ | 1/13 [00:03<00:45, 3.82s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
15%|██████████████▊ | 2/13 [00:06<00:35, 3.20s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
23%|██████████████████████▏ | 3/13 [00:09<00:30, 3.04s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
31%|█████████████████████████████▌ | 4/13 [00:12<00:28, 3.13s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
38%|████████████████████████████████████▉ | 5/13 [00:15<00:24, 3.05s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
46%|████████████████████████████████████████████▎ | 6/13 [00:18<00:21, 3.02s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
54%|███████████████████████████████████████████████████▋ | 7/13 [00:21<00:17, 2.97s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
62%|███████████████████████████████████████████████████████████ | 8/13 [00:24<00:14, 2.95s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
69%|██████████████████████████████████████████████████████████████████▍ | 9/13 [00:27<00:11, 2.95s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
77%|█████████████████████████████████████████████████████████████████████████ | 10/13 [00:31<00:10, 3.40s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
85%|████████████████████████████████████████████████████████████████████████████████▍ | 11/13 [00:34<00:06, 3.33s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
92%|███████████████████████████████████████████████████████████████████████████████████████▋ | 12/13 [00:37<00:03, 3.18s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:38<00:00, 2.97s/it]
05/07 18:52:22 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-civil_servant]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 47/47 [00:00<00:00, 1516402.22it/s]
[2024-05-07 18:52:22,900] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/12 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8%|████████ | 1/12 [00:05<01:05, 5.94s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████ | 2/12 [00:11<00:55, 5.53s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
25%|████████████████████████ | 3/12 [00:16<00:47, 5.32s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████ | 4/12 [00:21<00:41, 5.21s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
42%|████████████████████████████████████████ | 5/12 [00:27<00:38, 5.44s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████ | 6/12 [00:33<00:34, 5.67s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
58%|████████████████████████████████████████████████████████ | 7/12 [00:38<00:27, 5.55s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████ | 8/12 [00:44<00:22, 5.55s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
75%|████████████████████████████████████████████████████████████████████████ | 9/12 [00:48<00:15, 5.28s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|███████████████████████████████████████████████████████████████████████████████▏ | 10/12 [00:53<00:10, 5.14s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
92%|███████████████████████████████████████████████████████████████████████████████████████ | 11/12 [00:58<00:04, 4.98s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [01:02<00:00, 5.20s/it]
05/07 18:53:25 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-urban_and_rural_planner]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 46/46 [00:00<00:00, 1495643.29it/s]
[2024-05-07 18:53:25,477] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/12 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
8%|████████ | 1/12 [00:03<00:36, 3.35s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████ | 2/12 [00:07<00:39, 3.96s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
25%|████████████████████████ | 3/12 [00:11<00:33, 3.68s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████ | 4/12 [00:14<00:29, 3.68s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
42%|████████████████████████████████████████ | 5/12 [00:18<00:24, 3.54s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████ | 6/12 [00:21<00:21, 3.62s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
58%|████████████████████████████████████████████████████████ | 7/12 [00:25<00:18, 3.79s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████ | 8/12 [00:30<00:15, 3.96s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
75%|████████████████████████████████████████████████████████████████████████ | 9/12 [00:33<00:11, 3.75s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|███████████████████████████████████████████████████████████████████████████████▏ | 10/12 [00:36<00:07, 3.59s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
92%|███████████████████████████████████████████████████████████████████████████████████████ | 11/12 [00:40<00:03, 3.54s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:42<00:00, 3.53s/it]
05/07 18:54:07 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-teacher_qualification]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 44/44 [00:00<00:00, 1356980.71it/s]
[2024-05-07 18:54:07,972] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/11 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
9%|████████▋ | 1/11 [00:02<00:29, 2.99s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
18%|█████████████████▍ | 2/11 [00:06<00:29, 3.32s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
27%|██████████████████████████▏ | 3/11 [00:08<00:22, 2.86s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
36%|██████████████████████████████████▉ | 4/11 [00:11<00:18, 2.68s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
45%|███████████████████████████████████████████▋ | 5/11 [00:14<00:16, 2.76s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
55%|████████████████████████████████████████████████████▎ | 6/11 [00:16<00:13, 2.78s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
64%|█████████████████████████████████████████████████████████████ | 7/11 [00:19<00:10, 2.67s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
73%|█████████████████████████████████████████████████████████████████████▊ | 8/11 [00:21<00:07, 2.60s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
82%|██████████████████████████████████████████████████████████████████████████████▌ | 9/11 [00:24<00:05, 2.57s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
91%|██████████████████████████████████████████████████████████████████████████████████████▎ | 10/11 [00:27<00:02, 2.62s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:29<00:00, 2.70s/it]
05/07 18:54:37 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_programming]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:00<00:00, 1048576.00it/s]
[2024-05-07 18:54:37,775] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/10 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
10%|█████████▌ | 1/10 [00:03<00:28, 3.12s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
20%|███████████████████▏ | 2/10 [00:06<00:24, 3.04s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
30%|████████████████████████████▊ | 3/10 [00:08<00:19, 2.77s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
40%|██████████████████████████████████████▍ | 4/10 [00:11<00:16, 2.71s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████ | 5/10 [00:13<00:13, 2.74s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
60%|█████████████████████████████████████████████████████████▌ | 6/10 [00:16<00:10, 2.63s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
70%|███████████████████████████████████████████████████████████████████▏ | 7/10 [00:18<00:07, 2.61s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
80%|████████████████████████████████████████████████████████████████████████████▊ | 8/10 [00:21<00:05, 2.55s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
90%|██████████████████████████████████████████████████████████████████████████████████████▍ | 9/10 [00:24<00:02, 2.62s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:25<00:00, 2.50s/it]
05/07 18:55:02 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-electrical_engineer]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:00<00:00, 749706.51it/s]
[2024-05-07 18:55:02,946] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/10 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
10%|█████████▌ | 1/10 [00:02<00:23, 2.60s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
20%|███████████████████▏ | 2/10 [00:05<00:20, 2.55s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
30%|████████████████████████████▊ | 3/10 [00:07<00:18, 2.63s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
40%|██████████████████████████████████████▍ | 4/10 [00:10<00:16, 2.69s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████ | 5/10 [00:13<00:13, 2.69s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
60%|█████████████████████████████████████████████████████████▌ | 6/10 [00:15<00:10, 2.66s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
70%|███████████████████████████████████████████████████████████████████▏ | 7/10 [00:18<00:08, 2.80s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
80%|████████████████████████████████████████████████████████████████████████████▊ | 8/10 [00:21<00:05, 2.70s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
90%|██████████████████████████████████████████████████████████████████████████████████████▍ | 9/10 [00:24<00:02, 2.67s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:24<00:00, 2.47s/it]
05/07 18:55:27 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-business_administration]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:00<00:00, 1116226.06it/s]
[2024-05-07 18:55:27,818] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/9 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
11%|██████████▊ | 1/9 [00:02<00:20, 2.57s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
22%|█████████████████████▌ | 2/9 [00:05<00:17, 2.50s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 3/9 [00:07<00:15, 2.65s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
44%|███████████████████████████████████████████ | 4/9 [00:10<00:12, 2.52s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
56%|█████████████████████████████████████████████████████▉ | 5/9 [00:13<00:11, 2.80s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 6/9 [00:16<00:08, 2.91s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
78%|███████████████████████████████████████████████████████████████████████████▍ | 7/9 [00:19<00:05, 2.78s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
89%|██████████████████████████████████████████████████████████████████████████████████████▏ | 8/9 [00:21<00:02, 2.74s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:22<00:00, 2.48s/it]
05/07 18:55:50 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-art_studies]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:00<00:00, 1116226.06it/s]
[2024-05-07 18:55:50,273] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/9 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
11%|██████████▊ | 1/9 [00:01<00:15, 1.95s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
22%|█████████████████████▌ | 2/9 [00:03<00:14, 2.00s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 3/9 [00:06<00:12, 2.15s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
44%|███████████████████████████████████████████ | 4/9 [00:08<00:11, 2.26s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
56%|█████████████████████████████████████████████████████▉ | 5/9 [00:10<00:08, 2.22s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 6/9 [00:13<00:07, 2.41s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
78%|███████████████████████████████████████████████████████████████████████████▍ | 7/9 [00:15<00:04, 2.36s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
89%|██████████████████████████████████████████████████████████████████████████████████████▏ | 8/9 [00:18<00:02, 2.28s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:19<00:00, 2.17s/it]
05/07 18:56:09 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-fire_engineer]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 1160923.43it/s]
[2024-05-07 18:56:09,920] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/8 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
12%|████████████▏ | 1/8 [00:02<00:20, 2.92s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
25%|████████████████████████▎ | 2/8 [00:06<00:19, 3.21s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
38%|████████████████████████████████████▍ | 3/8 [00:09<00:15, 3.01s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 4/8 [00:12<00:12, 3.01s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
62%|████████████████████████████████████████████████████████████▋ | 5/8 [00:15<00:09, 3.17s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
75%|████████████████████████████████████████████████████████████████████████▊ | 6/8 [00:18<00:06, 3.21s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
88%|████████████████████████████████████████████████████████████████████████████████████▉ | 7/8 [00:21<00:03, 3.04s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:23<00:00, 2.92s/it]
05/07 18:56:33 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-environmental_impact_assessment_engineer]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 1111311.32it/s]
[2024-05-07 18:56:33,444] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/8 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
12%|████████████▏ | 1/8 [00:02<00:19, 2.74s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
25%|████████████████████████▎ | 2/8 [00:05<00:16, 2.80s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
38%|████████████████████████████████████▍ | 3/8 [00:08<00:13, 2.75s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 4/8 [00:10<00:10, 2.69s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
62%|████████████████████████████████████████████████████████████▋ | 5/8 [00:13<00:08, 2.82s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
75%|████████████████████████████████████████████████████████████████████████▊ | 6/8 [00:16<00:05, 2.79s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
88%|████████████████████████████████████████████████████████████████████████████████████▉ | 7/8 [00:19<00:02, 2.76s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:22<00:00, 2.80s/it]
05/07 18:56:55 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-education_science]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 29/29 [00:00<00:00, 1022141.31it/s]
[2024-05-07 18:56:55,970] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/8 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
12%|████████████▏ | 1/8 [00:02<00:15, 2.18s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
25%|████████████████████████▎ | 2/8 [00:04<00:12, 2.16s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
38%|████████████████████████████████████▍ | 3/8 [00:06<00:11, 2.28s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 4/8 [00:09<00:09, 2.41s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
62%|████████████████████████████████████████████████████████████▋ | 5/8 [00:11<00:07, 2.40s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
75%|████████████████████████████████████████████████████████████████████████▊ | 6/8 [00:14<00:04, 2.38s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
88%|████████████████████████████████████████████████████████████████████████████████████▉ | 7/8 [00:16<00:02, 2.30s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:16<00:00, 2.11s/it]
05/07 18:57:12 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-professional_tour_guide]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 29/29 [00:00<00:00, 1057694.05it/s]
[2024-05-07 18:57:12,914] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/8 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
12%|████████████▏ | 1/8 [00:02<00:14, 2.05s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
25%|████████████████████████▎ | 2/8 [00:04<00:13, 2.19s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
38%|████████████████████████████████████▍ | 3/8 [00:06<00:11, 2.21s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 4/8 [00:08<00:08, 2.23s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
62%|████████████████████████████████████████████████████████████▋ | 5/8 [00:11<00:06, 2.26s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
75%|████████████████████████████████████████████████████████████████████████▊ | 6/8 [00:13<00:04, 2.23s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
88%|████████████████████████████████████████████████████████████████████████████████████▉ | 7/8 [00:15<00:02, 2.35s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:16<00:00, 2.06s/it]
05/07 18:57:29 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_chemistry]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 940778.47it/s]
[2024-05-07 18:57:29,509] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:12, 2.53s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:04<00:09, 2.43s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:07<00:07, 2.52s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:10<00:05, 2.52s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:12<00:02, 2.46s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:14<00:00, 2.46s/it]
05/07 18:57:44 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-metrology_engineer]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 906876.54it/s]
[2024-05-07 18:57:44,380] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:13, 2.70s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:05<00:11, 2.76s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:08<00:08, 2.73s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:11<00:05, 2.90s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:14<00:02, 2.81s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:16<00:00, 2.82s/it]
05/07 18:58:01 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-mao_zedong_thought]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 906876.54it/s]
[2024-05-07 18:58:01,403] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:03<00:17, 3.47s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:06<00:11, 2.98s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:09<00:08, 2.95s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:11<00:05, 2.75s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:14<00:02, 2.72s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:18<00:00, 3.06s/it]
05/07 18:58:19 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-law]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 774333.05it/s]
[2024-05-07 18:58:19,872] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:03<00:18, 3.70s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:07<00:16, 4.04s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:11<00:11, 3.98s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:14<00:07, 3.63s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:18<00:03, 3.63s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:21<00:00, 3.62s/it]
05/07 18:58:41 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-veterinary_medicine]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 846219.23it/s]
[2024-05-07 18:58:41,701] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:13, 2.61s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:04<00:09, 2.46s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:07<00:07, 2.49s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:10<00:05, 2.74s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:13<00:02, 2.93s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:15<00:00, 2.65s/it]
05/07 18:58:57 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-modern_chinese_history]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 790729.44it/s]
[2024-05-07 18:58:57,693] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:12, 2.52s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:05<00:11, 2.82s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:09<00:09, 3.24s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:11<00:06, 3.02s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:14<00:02, 2.98s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:16<00:00, 2.82s/it]
05/07 18:59:14 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-chinese_language_and_literature]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 665303.39it/s]
[2024-05-07 18:59:14,689] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:10, 2.11s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:04<00:09, 2.50s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:07<00:08, 2.71s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:10<00:05, 2.65s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:12<00:02, 2.47s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:14<00:00, 2.36s/it]
05/07 18:59:28 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-legal_professional]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 853707.89it/s]
[2024-05-07 18:59:28,987] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:04<00:20, 4.04s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:09<00:19, 4.85s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:14<00:14, 4.87s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:18<00:09, 4.68s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:22<00:04, 4.28s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:24<00:00, 4.13s/it]
05/07 18:59:53 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-logic]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 795471.45it/s]
[2024-05-07 18:59:53,922] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:05<00:25, 5.17s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:14<00:30, 7.70s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:20<00:20, 6.73s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:24<00:11, 5.87s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:29<00:05, 5.44s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:32<00:00, 5.40s/it]
05/07 19:00:26 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_history]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 623477.62it/s]
[2024-05-07 19:00:26,437] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:10, 2.05s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:04<00:08, 2.12s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:06<00:06, 2.12s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:08<00:04, 2.18s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:10<00:02, 2.19s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:11<00:00, 1.98s/it]
05/07 19:00:38 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-plant_protection]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 846556.77it/s]
[2024-05-07 19:00:38,399] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:03<00:16, 3.26s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:05<00:11, 2.94s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:08<00:07, 2.64s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:10<00:05, 2.55s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:12<00:02, 2.37s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:13<00:00, 2.31s/it]
05/07 19:00:52 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-clinical_medicine]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 704386.93it/s]
[2024-05-07 19:00:52,363] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:12, 2.44s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:05<00:11, 2.80s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:08<00:08, 2.81s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:10<00:05, 2.72s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:13<00:02, 2.62s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:14<00:00, 2.48s/it]
05/07 19:01:07 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_architecture]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 599186.29it/s]
[2024-05-07 19:01:07,330] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:14, 2.97s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:06<00:12, 3.13s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:09<00:09, 3.05s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:11<00:05, 2.88s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:14<00:02, 2.87s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:15<00:00, 2.61s/it]
05/07 19:01:23 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_biology]
100%|███████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 699050.67it/s]
[2024-05-07 19:01:23,091] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/6 [00:00<?, ?it/s]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
17%|████████████████▏ | 1/6 [00:02<00:12, 2.57s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
33%|████████████████████████████████▎ | 2/6 [00:05<00:10, 2.59s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
50%|████████████████████████████████████████████████▌ | 3/6 [00:07<00:07, 2.66s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
67%|████████████████████████████████████████████████████████████████▋ | 4/6 [00:10<00:05, 2.58s/it]Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
83%|████████████████████████████████████████████████████████████████████████████████▊ | 5/6 [00:12<00:02, 2.58s/it]
....
05/07 19:07:23 - OpenCompass - DEBUG - Additional config: {'eval': {'runner': {'task': {}}}}
05/07 19:07:23 - OpenCompass - INFO - Partitioned into 52 tasks.
05/07 19:07:23 - OpenCompass - DEBUG - Task 0: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_network]
05/07 19:07:23 - OpenCompass - DEBUG - Task 1: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-operating_system]
05/07 19:07:23 - OpenCompass - DEBUG - Task 2: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_architecture]
05/07 19:07:23 - OpenCompass - DEBUG - Task 3: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_programming]
05/07 19:07:23 - OpenCompass - DEBUG - Task 4: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_physics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 5: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_chemistry]
05/07 19:07:23 - OpenCompass - DEBUG - Task 6: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-advanced_mathematics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 7: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-probability_and_statistics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 8: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-discrete_mathematics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 9: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-electrical_engineer]
05/07 19:07:23 - OpenCompass - DEBUG - Task 10: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-metrology_engineer]
05/07 19:07:23 - OpenCompass - DEBUG - Task 11: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_mathematics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 12: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_physics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 13: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_chemistry]
05/07 19:07:23 - OpenCompass - DEBUG - Task 14: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_biology]
05/07 19:07:23 - OpenCompass - DEBUG - Task 15: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_mathematics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 16: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_biology]
05/07 19:07:23 - OpenCompass - DEBUG - Task 17: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_physics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 18: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_chemistry]
05/07 19:07:23 - OpenCompass - DEBUG - Task 19: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-veterinary_medicine]
05/07 19:07:23 - OpenCompass - DEBUG - Task 20: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_economics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 21: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-business_administration]
05/07 19:07:23 - OpenCompass - DEBUG - Task 22: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-marxism]
05/07 19:07:23 - OpenCompass - DEBUG - Task 23: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-mao_zedong_thought]
05/07 19:07:23 - OpenCompass - DEBUG - Task 24: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-education_science]
05/07 19:07:23 - OpenCompass - DEBUG - Task 25: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-teacher_qualification]
05/07 19:07:23 - OpenCompass - DEBUG - Task 26: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_politics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 27: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_geography]
05/07 19:07:23 - OpenCompass - DEBUG - Task 28: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_politics]
05/07 19:07:23 - OpenCompass - DEBUG - Task 29: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_geography]
05/07 19:07:23 - OpenCompass - DEBUG - Task 30: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-modern_chinese_history]
05/07 19:07:23 - OpenCompass - DEBUG - Task 31: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-ideological_and_moral_cultivation]
05/07 19:07:23 - OpenCompass - DEBUG - Task 32: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-logic]
05/07 19:07:23 - OpenCompass - DEBUG - Task 33: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-law]
05/07 19:07:23 - OpenCompass - DEBUG - Task 34: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-chinese_language_and_literature]
05/07 19:07:23 - OpenCompass - DEBUG - Task 35: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-art_studies]
05/07 19:07:23 - OpenCompass - DEBUG - Task 36: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-professional_tour_guide]
05/07 19:07:23 - OpenCompass - DEBUG - Task 37: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-legal_professional]
05/07 19:07:23 - OpenCompass - DEBUG - Task 38: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_chinese]
05/07 19:07:23 - OpenCompass - DEBUG - Task 39: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_history]
05/07 19:07:23 - OpenCompass - DEBUG - Task 40: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-middle_school_history]
05/07 19:07:23 - OpenCompass - DEBUG - Task 41: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-civil_servant]
05/07 19:07:23 - OpenCompass - DEBUG - Task 42: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-sports_science]
05/07 19:07:23 - OpenCompass - DEBUG - Task 43: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-plant_protection]
05/07 19:07:23 - OpenCompass - DEBUG - Task 44: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-basic_medicine]
05/07 19:07:23 - OpenCompass - DEBUG - Task 45: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-clinical_medicine]
05/07 19:07:23 - OpenCompass - DEBUG - Task 46: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-urban_and_rural_planner]
05/07 19:07:23 - OpenCompass - DEBUG - Task 47: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-accountant]
05/07 19:07:23 - OpenCompass - DEBUG - Task 48: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-fire_engineer]
05/07 19:07:23 - OpenCompass - DEBUG - Task 49: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-environmental_impact_assessment_engineer]
05/07 19:07:23 - OpenCompass - DEBUG - Task 50: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-tax_accountant]
05/07 19:07:23 - OpenCompass - DEBUG - Task 51: [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-physician]
05/07 19:07:23 - OpenCompass - DEBUG - Get class `LocalRunner` from "runner" registry in "opencompass"
05/07 19:07:23 - OpenCompass - DEBUG - An `LocalRunner` instance is built from registry, and its implementation can be found in opencompass.runners.local
05/07 19:07:23 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:23 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:25 - OpenCompass - DEBUG - Modules of opencompass's load_dataset registry have been automatically imported from opencompass.datasets
05/07 19:07:25 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:25 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:25 - OpenCompass - DEBUG - Modules of opencompass's text_postprocessors registry have been automatically imported from opencompass.utils.text_postprocessors
05/07 19:07:25 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:25 - OpenCompass - DEBUG - Modules of opencompass's icl_evaluators registry have been automatically imported from opencompass.openicl.icl_evaluator
05/07 19:07:25 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:25 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:25 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_network]: {'accuracy': 63.1578947368421}
05/07 19:07:25 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:25 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:27 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:27 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:27 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:27 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:27 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:27 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-operating_system]: {'accuracy': 63.1578947368421}
05/07 19:07:27 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:27 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:29 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:29 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:29 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:29 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:29 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:29 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-computer_architecture]: {'accuracy': 52.38095238095239}
05/07 19:07:29 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:29 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:31 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:31 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:31 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:31 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:31 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:31 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_programming]: {'accuracy': 62.16216216216216}
05/07 19:07:31 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:31 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:33 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:33 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:34 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:34 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:34 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:34 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_physics]: {'accuracy': 42.10526315789473}
05/07 19:07:34 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:34 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:35 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:36 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:36 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:36 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:36 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:36 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-college_chemistry]: {'accuracy': 29.166666666666668}
05/07 19:07:36 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:36 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:37 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:37 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:38 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:38 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:38 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:38 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-advanced_mathematics]: {'accuracy': 42.10526315789473}
05/07 19:07:38 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:38 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:39 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:40 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:40 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:40 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:40 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:40 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-probability_and_statistics]: {'accuracy': 27.77777777777778}
05/07 19:07:40 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:40 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:42 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:42 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:42 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:42 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:42 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:42 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-discrete_mathematics]: {'accuracy': 25.0}
05/07 19:07:42 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:42 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:44 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:44 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:44 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:44 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:44 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:44 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-electrical_engineer]: {'accuracy': 32.432432432432435}
05/07 19:07:44 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:44 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built from registry, and its implementation can be found in opencompass.tasks.openicl_eval
05/07 19:07:46 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_dataset" registry in "opencompass"
05/07 19:07:46 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:46 - OpenCompass - DEBUG - Get class `first_capital_postprocess` from "text_postprocessors" registry in "opencompass"
05/07 19:07:46 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evaluators" registry in "opencompass"
05/07 19:07:46 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:07:46 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-metrology_engineer]: {'accuracy': 62.5}
05/07 19:07:46 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:46 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built f rom registry, and its implementation can be found in opencompass.tasks.openicl_ eval
05/07 19:07:48 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_data set" registry in "opencompass"
05/07 19:07:48 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:48 - OpenCompass - DEBUG - Get class `first_capital_postprocess` fr om "text_postprocessors" registry in "opencompass"
05/07 19:07:48 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evalu ators" registry in "opencompass"
05/07 19:07:48 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evalu ator.icl_hf_evaluator
05/07 19:07:48 - OpenCompass - INFO - Task [opencompass.models.huggingface.Hugg ingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_mathematics]: {'accura cy': 5.555555555555555}
05/07 19:07:48 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:48 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built f rom registry, and its implementation can be found in opencompass.tasks.openicl_ eval
05/07 19:07:50 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_data set" registry in "opencompass"
05/07 19:07:50 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:50 - OpenCompass - DEBUG - Get class `first_capital_postprocess` fr om "text_postprocessors" registry in "opencompass"
05/07 19:07:50 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evalu ators" registry in "opencompass"
05/07 19:07:50 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evalu ator.icl_hf_evaluator
05/07 19:07:50 - OpenCompass - INFO - Task [opencompass.models.huggingface.Hugg ingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_physics]: {'accuracy': 26.31578947368421}
05/07 19:07:50 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:50 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built f rom registry, and its implementation can be found in opencompass.tasks.openicl_ eval
05/07 19:07:52 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_data set" registry in "opencompass"
05/07 19:07:52 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:52 - OpenCompass - DEBUG - Get class `first_capital_postprocess` fr om "text_postprocessors" registry in "opencompass"
05/07 19:07:52 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evalu ators" registry in "opencompass"
05/07 19:07:52 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evalu ator.icl_hf_evaluator
05/07 19:07:52 - OpenCompass - INFO - Task [opencompass.models.huggingface.Hugg ingFace_model_Meta-Llama-3-8B-Instruct/ceval-high_school_chemistry]: {'accuracy ': 63.1578947368421}
05/07 19:07:52 - OpenCompass - DEBUG - Get class `OpenICLEvalTask` from "task" registry in "opencompass"
05/07 19:07:52 - OpenCompass - DEBUG - An `OpenICLEvalTask` instance is built f rom registry, and its implementation can be found in opencompass.tasks.openicl_ eval
05/07 19:07:54 - OpenCompass - DEBUG - Get class `CEvalDataset` from "load_data set" registry in "opencompass"
05/07 19:07:54 - OpenCompass - DEBUG - An `CEvalDataset` instance is built from registry, and its implementation can be found in opencompass.datasets.ceval
05/07 19:07:54 - OpenCompass - DEBUG - Get class `first_capital_postprocess` fr om "text_postprocessors" registry in "opencompass"
05/07 19:07:54 - OpenCompass - DEBUG - Get class `AccEvaluator` from "icl_evalu ators" registry in "opencompass"
....
05/07 19:09:12 - OpenCompass - DEBUG - An `AccEvaluator` instance is built from registry, and its implementation can be found in opencompass.openicl.icl_evaluator.icl_hf_evaluator
05/07 19:09:12 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct/ceval-physician]: {'accuracy': 57.14285714285714}
05/07 19:09:12 - OpenCompass - DEBUG - An `DefaultSummarizer` instance is built from registry, and its implementation can be found in opencompass.summarizers.default
dataset                                         version    metric         mode    opencompass.models.huggingface.HuggingFace_model_Meta-Llama-3-8B-Instruct
----------------------------------------------  ---------  -------------  ------  ---------------------------------------------------------------------------
ceval-computer_network                          db9ce2     accuracy       gen     63.16
ceval-operating_system                          1c2571     accuracy       gen     63.16
ceval-computer_architecture                     a74dad     accuracy       gen     52.38
ceval-college_programming                       4ca32a     accuracy       gen     62.16
ceval-college_physics                           963fa8     accuracy       gen     42.11
ceval-college_chemistry                         e78857     accuracy       gen     29.17
ceval-advanced_mathematics                      ce03e2     accuracy       gen     42.11
ceval-probability_and_statistics                65e812     accuracy       gen     27.78
ceval-discrete_mathematics                      e894ae     accuracy       gen     25
ceval-electrical_engineer                       ae42b9     accuracy       gen     32.43
ceval-metrology_engineer                        ee34ea     accuracy       gen     62.5
ceval-high_school_mathematics                   1dc5bf     accuracy       gen     5.56
ceval-high_school_physics                       adf25f     accuracy       gen     26.32
ceval-high_school_chemistry                     2ed27f     accuracy       gen     63.16
ceval-high_school_biology                       8e2b9a     accuracy       gen     36.84
ceval-middle_school_mathematics                 bee8d5     accuracy       gen     31.58
ceval-middle_school_biology                     86817c     accuracy       gen     71.43
ceval-middle_school_physics                     8accf6     accuracy       gen     57.89
ceval-middle_school_chemistry                   167a15     accuracy       gen     80
ceval-veterinary_medicine                       b4e08d     accuracy       gen     52.17
ceval-college_economics                         f3f4e6     accuracy       gen     45.45
ceval-business_administration                   c1614e     accuracy       gen     30.3
ceval-marxism                                   cf874c     accuracy       gen     47.37
ceval-mao_zedong_thought                        51c7a4     accuracy       gen     50
ceval-education_science                         591fee     accuracy       gen     51.72
ceval-teacher_qualification                     4e4ced     accuracy       gen     72.73
ceval-high_school_politics                      5c0de2     accuracy       gen     68.42
ceval-high_school_geography                     865461     accuracy       gen     42.11
ceval-middle_school_politics                    5be3e7     accuracy       gen     57.14
ceval-middle_school_geography                   8a63be     accuracy       gen     50
ceval-modern_chinese_history                    fc01af     accuracy       gen     52.17
ceval-ideological_and_moral_cultivation         a2aa4a     accuracy       gen     78.95
ceval-logic                                     f5b022     accuracy       gen     40.91
ceval-law                                       a110a1     accuracy       gen     33.33
ceval-chinese_language_and_literature           0f8b68     accuracy       gen     34.78
ceval-art_studies                               2a1300     accuracy       gen     54.55
ceval-professional_tour_guide                   4e673e     accuracy       gen     55.17
ceval-legal_professional                        ce8787     accuracy       gen     30.43
ceval-high_school_chinese                       315705     accuracy       gen     31.58
ceval-high_school_history                       7eb30a     accuracy       gen     65
ceval-middle_school_history                     48ab4a     accuracy       gen     59.09
ceval-civil_servant                             87d061     accuracy       gen     34.04
ceval-sports_science                            70f27b     accuracy       gen     63.16
ceval-plant_protection                          8941f9     accuracy       gen     68.18
ceval-basic_medicine                            c409d6     accuracy       gen     57.89
ceval-clinical_medicine                         49e82d     accuracy       gen     54.55
ceval-urban_and_rural_planner                   95b885     accuracy       gen     52.17
ceval-accountant                                002837     accuracy       gen     44.9
ceval-fire_engineer                             bc23f5     accuracy       gen     38.71
ceval-environmental_impact_assessment_engineer  c64e2d     accuracy       gen     45.16
ceval-tax_accountant                            3a5e3c     accuracy       gen     34.69
ceval-physician                                 6e277d     accuracy       gen     57.14
ceval-stem                                      -          naive_average  gen     46.34
ceval-social-science                            -          naive_average  gen     51.52
ceval-humanities                                -          naive_average  gen     48.72
ceval-other                                     -          naive_average  gen     50.05
ceval-hard                                      -          naive_average  gen     32.65
ceval                                           -          naive_average  gen     48.63
05/07 19:09:12 - OpenCompass - INFO - write summary to /root/opencompass/outputs/default/20240507_184531/summary/summary_20240507_184531.txt
05/07 19:09:12 - OpenCompass - INFO - write csv to /root/opencompass/outputs/default/20240507_184531/summary/summary_20240507_184531.csv
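The naive_average rows in the summary table are unweighted means of the per-subject accuracies. Here is a minimal sketch that recomputes the ceval-hard row from the scores printed above. The list of eight subjects is an assumption based on the C-Eval "hard" subset, not something the log states; it does reproduce the reported 32.65.

```python
# Accuracy scores (percent) copied from the summary table above.
# Assumption: ceval-hard consists of these eight subjects (per the C-Eval
# "hard" subset); the log itself does not list the group membership.
hard_scores = {
    "ceval-advanced_mathematics": 42.11,
    "ceval-discrete_mathematics": 25.0,
    "ceval-probability_and_statistics": 27.78,
    "ceval-college_chemistry": 29.17,
    "ceval-college_physics": 42.11,
    "ceval-high_school_mathematics": 5.56,
    "ceval-high_school_chemistry": 63.16,
    "ceval-high_school_physics": 26.32,
}


def naive_average(scores):
    """Unweighted mean: each subject counts equally, whatever its size."""
    return sum(scores.values()) / len(scores)


hard_avg = naive_average(hard_scores)
print(f"ceval-hard naive_average: {hard_avg:.2f}")
```

Because the average is unweighted, one very weak subject (here high_school_mathematics at 5.56) pulls the group score down as much as any other subject, regardless of how many questions it contains.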
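LLM Technology Sharing
"Enterprise-Grade Generative AI and LLM Technology, Algorithms, and Hands-On Case Studies": an advanced online seminar
Module 1: Generative AI fundamentals, core technology, and the engineering practice lifecycle in detail
Module 2: Inside industrial-grade prompting, plus an end-to-end LLM-based meeting assistant project
Module 3: The three Llama 2 models in detail, plus building a safe and reliable conversational system
Module 4: The five core problems of GenAI/LLMs in production, plus building robust applications
Module 5: LLM application development: agentic-based application techniques and case studies
Module 6: LLM fine-tuning and model quantization techniques, with case studies
Module 7: Parameter-efficient fine-tuning (PEFT): algorithms, techniques, workflow, and advanced coding practice
Module 8: LLM alignment techniques and workflow, plus hands-on text toxicity analysis
Module 9: Red Teaming, the core technique for building secure GenAI/LLMs, explained in practice
Module 10: Responsible AI in practice: building a trustworthy, secure, private enterprise LLM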
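Llama 3 Key Technologies in Depth: Building Responsible AI, Algorithms, and Hands-On Development
1. The Llama open-source model family: LLM technology, tools, and multimodality in detail. Participants take a close look at what is new in Meta Llama 3, such as its advances in language modeling, and learn how to build trust and safety AI with Llama 3. The module covers Llama 3's five technical branches and their tools in detail, along with a hands-on case of Llama instruction fine-tuning on AWS.
2. Decoding the Llama 3 Foundation Model, distinctive architecture techniques and code: a close look at the techniques used in Llama 3, such as Tiktokenizer, KV Cache, and Grouped Multi-Query Attention. Project 2 walks through the Llama 3 source code line by line to deepen understanding.
3. Decoding the Llama 3 Foundation Model, core architecture techniques and code: SwiGLU Activation Function, FeedForward Block, Encoder Block, and more. Project 3 studies the Llama 3 inference code to reinforce practical understanding.
4. Building Responsible AI with LangGraph on Llama 3: Project 4 is a hands-on LangGraph-based Responsible AI project on Llama 3. Participants learn LangGraph's three core components, its execution model, and its workflow steps, strengthening their practical Responsible AI skills.
5. Inside the Llama family's techniques for building safe, trustworthy enterprise AI applications: a close look at the key technologies needed for safe and reliable enterprise AI applications, such as Code Llama and Llama Guard. Project 5 builds an upgraded version of the safe and reliable conversational AI project.
6. Llama family fine-tuning techniques and algorithms in practice: participants learn fine-tuning techniques and algorithms such as Supervised Fine-Tuning (SFT), Reward Models, PPO, and DPO. Project 6 implements PPO and DPO by hand to build understanding and applied skill.
7. Decoding the Llama family's reinforcement learning from AI feedback: an in-depth study of feedback-based reinforcement learning in the Llama family, such as RLAIF and RLHF. Project 7 implements RLAIF-based Constitutional AI.
8. DPO in Llama 3, principles, algorithm, components, implementation, and advanced variants: learn how Llama 3 combines PPO and DPO, examine how DPO works, break down its key algorithmic components, and implement and test DPO from scratch in capstone Project 8. The course also covers the advanced DPO techniques Iterative DPO and IPO.
9. Safety design and implementation in the Llama family: participants learn how safety is designed and implemented in the Llama family, including Safety in Pretraining and Safety Fine-Tuning, in order to develop safe and reliable GenAI/LLM projects.
10. Building a trustworthy, secure, private enterprise Responsible AI system with Llama 3: build such a system and master Constitutional AI and Red Teaming for Llama 3.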
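Decoding Sora: Architecture, Technology, and Applications
I. Why is Sora a milestone on the road to AGI?
1. Explore the key shift from large language models (LLMs) to large vision models (LVMs) and its role in achieving artificial general intelligence (AGI).
2. Present successful cases of combining visual data with text data, and explain the key role Sora plays in that process.
3. Describe in detail how Sora generates video with 3D consistency from text instructions.
4. Explain the technical path by which Sora generates high-fidelity content from images or videos.
5. Discuss Sora's practical value in different application scenarios, along with its challenges and limitations.
II. Decoding the Sora architecture
1. The DiT (Diffusion Transformer) architecture in detail.
2. How does DiT help Sora produce consistent, realistic, and imaginative video content?
3. Why a Transformer, rather than a network such as U-Net, was chosen as the core network for diffusion.
4. The principle and workflow of DiT patchification, and why it matters for processing video and image data.
5. The conditional diffusion process in detail, and its role in content generation.
III. Decoding Sora's key technologies
1. How Sora uses Transformer and diffusion techniques to understand interactions between objects, and why that matters for simulating complex interactive scenes.
2. Why space-time patches are considered the core of Sora's technology, and how they improve video generation.
3. Spacetime latent patches in detail, and their key role in video compression and generation.
4. How the Sora simulator uses space-time patches to build digital and physical worlds, and what that means for simulating real-world change.
5. How Sora generates content that faithfully follows the user's input text, and the techniques and innovations behind it.
6. Why Sora generates content from abstract concepts rather than concrete pixels, and how that affects generation quality and diversity.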