Scene Recognition with a Bag-of-Words Model (Source Code Included)


Contents

  • 1. Task requirements
  • 2. Dataset
  • 3. Implemented algorithms
  • 4. Experimental results
  • 5. Source code


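1. Task requirements

  • Input: given the test-set images, predict which of the 15 scene categories each image belongs to.
  • Tasks
    • Implement the tiny images representation.
    • Implement a nearest neighbor classifier.
    • Implement the SIFT bag-of-words representation.
  • Output
    • For both the tiny images representation and the SIFT bag-of-words representation, report the per-category accuracy and the mean accuracy.
    • For both approaches, pick examples of correct and incorrect predictions and visualize them.
    • Explore how different parameter settings affect the results and summarize the findings in a table.
    • Use experiments to discuss how vocabulary size affects classification, e.g. which category has the highest or lowest accuracy, and why.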
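
The per-category accuracy and mean accuracy asked for above could be computed along the following lines. This is only a minimal sketch: the function name `per_class_accuracy` and the toy labels are made up for illustration and are not part of the project, which delegates its scoring to `create_results`.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels, categories):
    """Return ({category: accuracy}, mean accuracy over the categories seen)."""
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    # Categories with no test samples are skipped rather than counted as 0.
    acc = {c: correct[c] / total[c] for c in categories if total[c] > 0}
    mean_acc = sum(acc.values()) / len(acc) if acc else 0.0
    return acc, mean_acc


# Toy labels, purely illustrative.
true_labels = ['Kitchen', 'Kitchen', 'Forest', 'Forest']
predicted = ['Kitchen', 'Store', 'Forest', 'Forest']
acc, mean_acc = per_class_accuracy(true_labels, predicted,
                                   ['Kitchen', 'Store', 'Forest'])
print(acc, mean_acc)
```

Averaging over categories rather than over images keeps the mean from being dominated by large categories if the test set is unbalanced.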

2. Dataset

http://www.cad.zju.edu.cn/home/gfzhang/course/cv/Homework3.zip


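3. Implemented algorithms

  1. Tiny images representation
  2. SIFT bag-of-words representation
  3. Classifiers: nearest neighbor (NN) and SVM. HOG is also supported as an alternative feature descriptor to SIFT (see the --feature_detector option in main.py).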
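
To make the first and third items concrete, here is a rough sketch of a tiny-image feature combined with a 1-nearest-neighbor classifier. It is not the project's `get_tiny_images` or `nearest_neighbor_classify`; the function names, the 16x16 size, the sampling-based downscale (a stand-in for a real image resize) and the synthetic gradient images are all assumptions chosen so the example needs only NumPy.

```python
import numpy as np


def tiny_image(gray, h_size=16, w_size=16):
    """Shrink a 2-D grayscale array by sampling, then zero-mean / unit-norm it."""
    gray = np.asarray(gray, dtype=np.float64)
    # Nearest-neighbour sampling stands in for a proper image resize.
    rows = np.arange(h_size) * gray.shape[0] // h_size
    cols = np.arange(w_size) * gray.shape[1] // w_size
    feat = gray[np.ix_(rows, cols)].ravel()
    feat = feat - feat.mean()
    norm = np.linalg.norm(feat)
    return feat / norm if norm > 0 else feat


def nearest_neighbor(train_feats, train_labels, test_feats):
    """1-NN with Euclidean distance; returns one predicted label per test row."""
    train_feats = np.asarray(train_feats, dtype=np.float64)
    test_feats = np.asarray(test_feats, dtype=np.float64)
    # Pairwise squared distances via broadcasting: shape (n_test, n_train).
    d2 = ((test_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(axis=2)
    return [train_labels[i] for i in d2.argmin(axis=1)]


# Synthetic "images": a horizontal and a vertical brightness gradient.
rng = np.random.default_rng(0)
horiz = np.tile(np.linspace(0, 255, 64), (64, 1))
vert = horiz.T
train = np.stack([tiny_image(horiz), tiny_image(vert)])
labels = ['horizontal', 'vertical']
test = np.stack([tiny_image(horiz + rng.normal(0, 5, horiz.shape)),
                 tiny_image(vert + rng.normal(0, 5, vert.shape))])
pred = nearest_neighbor(train, labels, test)
print(pred)
```

Normalizing each tiny image to zero mean and unit length makes the distance less sensitive to overall brightness and contrast, which is a common choice for this baseline.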

4. Experimental results

Scene recognition with bag of words



5. Source code

  • Full project: https://github.com/Jurio0304/Scene_Recognition_with_Bag_of_Words. If you use or cite it, a star would be much appreciated!
  • The source code of main.py is as follows:
#!/usr/bin/python
import sys

import numpy as np
import os
import argparse

from create_results import create_results
from get_image_path import get_image_paths
from get_tiny_images import get_tiny_images
from build_vocabulary import build_vocabulary, build_vocabulary_sift
from get_bags_of_words import get_bags_of_words, get_bags_of_words_sift
from svm_classify import svm_classify
from nearest_neighbor_classify import nearest_neighbor_classify


# from create_results_webpage import create_results_webpage


def scene_recognition(feature='chance_feature', feature_detector='sift', classifier='chance_classifier'):
    """
    For this project, you will need to report performance for three
    combinations of features / classifiers. We recommend that you code them in
    this order:
        1) Tiny image features and nearest neighbor classifier
        2) Bag of word features and nearest neighbor classifier
        3) Bag of word features and linear SVM classifier
    The starter code is initialized to 'chance_' just so that the starter
    code does not crash when run unmodified and you can get a preview of how
    results are presented.

    Interpreting your performance with 100 training examples per category:
     accuracy  =   0 -> Something is broken.
     accuracy ~= .07 -> Your performance is equal to chance.
                        Something is broken or you ran the starter code unchanged.
     accuracy ~= .20 -> Rough performance with tiny images and nearest
                        neighbor classifier. Performance goes up a few
                        percentage points with K-NN instead of 1-NN.
     accuracy ~= .20 -> Rough performance with tiny images and linear SVM
                        classifier. Although the accuracy is about the same as
                        nearest neighbor, the confusion matrix is very different.
     accuracy ~= .40 -> Rough performance with bag of word and nearest
                        neighbor classifier. Can reach .60 with K-NN and
                        different distance metrics.
     accuracy ~= .50 -> You've gotten things roughly correct with bag of
                        word and a linear SVM classifier.
     accuracy >= .70 -> You've also tuned your parameters well. E.g. number
                        of clusters, SVM regularization, number of patches
                        sampled when building vocabulary, size and step for
                        dense features.
     accuracy >= .80 -> You've added in spatial information somehow or you've
                        added additional, complementary image features. This
                        represents state of the art in Lazebnik et al 2006.
     accuracy >= .85 -> You've done extremely well. This is the state of the
                        art in the 2010 SUN database paper from fusing many
                        features. Don't trust this number unless you actually
                        measure many random splits.
     accuracy >= .90 -> You used modern deep features trained on much larger
                        image databases.
     accuracy >= .96 -> You can beat a human at this task. This isn't a
                        realistic number. Some accuracy calculation is broken
                        or your classifier is cheating and seeing the test
                        labels.
    """

    # Step 0: Set up parameters, category list, and image paths.
    FEATURE = feature
    CLASSIFIER = classifier

    # This is the path the script will look at to load images from.
    data_path = './data/'

    # This is the list of categories / directories to use. The categories are
    # somewhat sorted by similarity so that the confusion matrix looks more
    # structured (indoor and then urban and then rural).
    categories = ['Kitchen', 'Store', 'Bedroom', 'LivingRoom', 'Office',
                  'Industrial', 'Suburb', 'InsideCity', 'TallBuilding', 'Street',
                  'Highway', 'OpenCountry', 'Coast', 'Mountain', 'Forest']

    # This list of shortened category names is used later for visualization.
    abbr_categories = ['Kit', 'Sto', 'Bed', 'Liv', 'Off', 'Ind', 'Sub',
                       'Cty', 'Bld', 'St', 'HW', 'OC', 'Cst', 'Mnt', 'For']

    # Number of training examples per category to use. Max is 100.
    # For simplicity, we assume this is the number of test cases per category as well.
    num_train_per_cat = 100

    # This function returns string arrays containing the file path for each train and test image
    print('Getting paths and labels for all train and test data.')

    train_image_paths, test_image_paths, train_labels, test_labels = \
        get_image_paths(data_path, categories, num_train_per_cat)
    #   train_image_paths  1500x1   list
    #   test_image_paths   1500x1   list
    #   train_labels       1500x1   list
    #   test_labels        1500x1   list

    ############################################################################
    # Step 1: Represent each image with the appropriate feature
    # Each function to construct features should return an N x d matrix, where
    # N is the number of paths passed to the function and d is the
    # dimensionality of each image representation. See the starter code for
    # each function for more details.
    ############################################################################

    print('Using %s representation for images.' % FEATURE)

    if FEATURE.lower() == 'tiny_image':
        print('Loading tiny images...')
        h, w = 16, 32

        train_image_feats = get_tiny_images(train_image_paths, h_size=h, w_size=w)
        test_image_feats = get_tiny_images(test_image_paths, h_size=h, w_size=w)
        print('Tiny images loaded.')

    elif FEATURE.lower() == 'bag_of_words':
        # Because building the vocabulary takes a long time, we save the generated
        # vocab to a file and re-load it each time to make testing faster.

        # Larger values will work better (to a point), but are slower to compute
        vocab_size = 50
        if not os.path.isfile(f'{feature_detector}_vocab_{vocab_size}.npy'):
            print('No existing visual word vocabulary found. Computing one from training images.')

            if feature_detector.lower() == 'sift':
                vocab = build_vocabulary_sift(train_image_paths, vocab_size)
            else:
                vocab = build_vocabulary(train_image_paths, vocab_size)

            np.save(f'{feature_detector}_vocab_{vocab_size}.npy', vocab)

        if feature_detector.lower() == 'sift':
            train_image_feats = get_bags_of_words_sift(train_image_paths, vocab_size, feature_detector)
            test_image_feats = get_bags_of_words_sift(test_image_paths, vocab_size, feature_detector)
        else:
            train_image_feats = get_bags_of_words(train_image_paths, vocab_size)
            test_image_feats = get_bags_of_words(test_image_paths, vocab_size)

    elif FEATURE.lower() == 'chance_feature':
        train_image_feats = []
        test_image_feats = []

    else:
        raise ValueError('Unknown feature type!')

    ############################################################################
    # Step 2: Classify each test image by training and using the appropriate classifier
    # Each function to classify test features will return an N x 1 string array,
    # where N is the number of test cases and each entry is a string indicating
    # the predicted category for each test image. Each entry in
    # 'predicted_categories' must be one of the 15 strings in 'categories',
    # 'train_labels', and 'test_labels'. See the starter code for each function
    # for more details.
    ############################################################################

    print('Using %s classifier to predict test set categories.' % CLASSIFIER)

    if CLASSIFIER.lower() == 'nearest_neighbor':
        predicted_categories = nearest_neighbor_classify(train_image_feats, train_labels, test_image_feats)

    elif CLASSIFIER.lower() == 'support_vector_machine':
        predicted_categories = svm_classify(train_image_feats, train_labels, test_image_feats)

    elif CLASSIFIER.lower() == 'chance_classifier':
        # The placeholder classifier shuffles the ground-truth test labels, which
        # yields chance-level predictions for every test case
        random_permutation = np.random.permutation(len(test_labels))
        predicted_categories = [test_labels[i] for i in random_permutation]

    else:
        raise ValueError('Unknown classifier type')

    ############################################################################
    # Step 3: Build a confusion matrix and score the recognition system
    # You do not need to code anything in this section.

    # If we wanted to evaluate our recognition method properly we would train
    # and test on many random splits of the data. You are not required to do so
    # for this project.

    # This function will recreate results_webpage/index.html and various image
    # thumbnails each time it is called. View the webpage to help interpret
    # your classifier performance. Where is it making mistakes? Are the
    # confusions reasonable?
    ############################################################################
    result_path = f'results/{feature}_{classifier}'
    if not os.path.isdir('./results'):
        print('Making results directory.')
    # makedirs creates ./results and the per-run subdirectory as needed.
    os.makedirs(result_path, exist_ok=True)

    create_results(train_image_paths, test_image_paths, train_labels, test_labels, categories, abbr_categories,
                   predicted_categories, result_path)


if __name__ == '__main__':
    '''
    Command line usage:
    python main.py [-f | --feature <representation to use>]
                   [-fd | --feature_detector <sift or hog>]
                   [-c | --classifier <classifier method>]
    '''
    # create the command line parser
    parser = argparse.ArgumentParser()

    parser.add_argument('-f', '--feature', default='bag_of_words',
                        help='Either chance_feature, tiny_image, or bag_of_words')
    parser.add_argument('-fd', '--feature_detector', default='sift',
                        help='Either sift or hog')
    parser.add_argument('-c', '--classifier', default='support_vector_machine',
                        help='Either chance_classifier, nearest_neighbor, or support_vector_machine')

    args = parser.parse_args()

    # RUN THE MAIN SCRIPT
    scene_recognition(args.feature, args.feature_detector, args.classifier)

    sys.exit(0)
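
The bag-of-words branch of main.py relies on `build_vocabulary_sift` and `get_bags_of_words_sift`, which are not listed here. As a rough illustration of the idea only, the sketch below clusters local descriptors with plain k-means and turns one image's descriptors into a normalized histogram of visual words. The function names `build_vocab` and `bag_of_words`, the iteration count, and the random 128-dimensional vectors standing in for SIFT descriptors are all assumptions, not the project's implementation.

```python
import numpy as np


def build_vocab(descriptors, vocab_size, n_iter=10, seed=0):
    """Plain k-means (Lloyd's algorithm) over local descriptors; returns centroids."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Initialise the centres from randomly chosen descriptors (copied by indexing).
    centers = descriptors[rng.choice(len(descriptors), vocab_size, replace=False)]
    for _ in range(n_iter):
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for k in range(vocab_size):
            members = descriptors[assign == k]
            if len(members):  # keep the old centre if a cluster empties
                centers[k] = members.mean(axis=0)
    return centers


def bag_of_words(descriptors, vocab):
    """L1-normalised histogram of nearest visual words for one image."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    d2 = ((descriptors[:, None, :] - vocab[None, :, :]) ** 2).sum(axis=2)
    hist = np.bincount(d2.argmin(axis=1), minlength=len(vocab)).astype(np.float64)
    return hist / hist.sum() if hist.sum() > 0 else hist


# Random vectors stand in for real SIFT descriptors (128-D each).
rng = np.random.default_rng(1)
fake_desc = rng.normal(size=(300, 128))
vocab = build_vocab(fake_desc, vocab_size=5)
hist = bag_of_words(fake_desc[:40], vocab)
print(hist)
```

Normalizing the histogram keeps images with many descriptors from looking different from images with few, so the feature reflects the mix of visual words rather than their raw count.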
