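Improving Matches on a Dating Site with the k-Nearest Neighbors Algorithm (kNN)

Contents

Google Colab notebook (optional)

Prepare: parsing data from a text file

Write the algorithm: implementing kNN

Analyze: creating scatter plots with Matplotlib

Prepare: normalizing numeric values

Test: testing the classifier as a whole program

Use: putting together a useful system


Google Colab notebook (optional)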


from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


Prepare: parsing data from a text file


from numpy import zeros

def file2matrix(filename):
  # Read every line up front so the matrix can be sized before it is filled.
  with open(filename) as fr:
    arrayOfLines = fr.readlines()
  numberOfLines = len(arrayOfLines)
  returnMat = zeros((numberOfLines, 3))  # one row per sample, three features
  classLabelVector = []
  index = 0
  for line in arrayOfLines:
    line = line.strip()
    listFromLine = line.split('\t')
    returnMat[index, :] = listFromLine[0:3]  # first three columns are the features
    classLabelVector.append(int(listFromLine[-1]))  # last column is the class label
    index += 1
  return returnMat, classLabelVector
datingDataMat, datingLabels = file2matrix('/content/drive/MyDrive/MachineLearning/机器学习/k-近邻算法/使用k-近邻算法改进约会网站的配对效果/datingTestSet2.txt')
datingDataMat

array([[4.0920000e+04, 8.3269760e+00, 9.5395200e-01],
       [1.4488000e+04, 7.1534690e+00, 1.6739040e+00],
       [2.6052000e+04, 1.4418710e+00, 8.0512400e-01],
       ...,
       [2.6575000e+04, 1.0650102e+01, 8.6662700e-01],
       [4.8111000e+04, 9.1345280e+00, 7.2804500e-01],
       [4.3757000e+04, 7.8826010e+00, 1.3324460e+00]])

datingLabels[:10]

[3, 2, 1, 1, 1, 1, 3, 3, 1, 3]
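
The same tab-separated layout can also be read with NumPy's own loader. The snippet below is only a sketch: it writes three rows to a temporary file (the values are taken from the outputs shown above, and the file itself is a stand-in for datingTestSet2.txt), then splits the loaded array into features and labels. The variable names are illustrative and not part of the original notebook.

```python
import os
import tempfile

import numpy as np

# Three sample rows in the assumed format: three features, then an integer label.
rows = (
    "40920\t8.326976\t0.953952\t3\n"
    "14488\t7.153469\t1.673904\t2\n"
    "26052\t1.441871\t0.805124\t1\n"
)

# Write the rows to a temporary file that stands in for the real data file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(rows)
    path = tmp.name

raw = np.loadtxt(path, delimiter="\t")      # parse all columns as floats
features = raw[:, :3]                        # first three columns
labels = raw[:, -1].astype(int).tolist()     # last column as Python ints
os.remove(path)

print(features.shape, labels)  # prints (3, 3) [3, 2, 1]
```

Compared with file2matrix, this assumes every column is numeric, which holds for the integer-labelled file used here.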


Write the algorithm: implementing kNN


from numpy import *
import operator

def classify0(inX, dataSet, labels, k):
  dataSetSize = dataSet.shape[0]
  # Euclidean distance from inX to every row of dataSet
  diffMat = tile(inX, (dataSetSize, 1)) - dataSet
  sqDiffMat = diffMat ** 2
  sqDistances = sqDiffMat.sum(axis=1)
  distances = sqDistances**0.5
  # Indices of the training rows, nearest first
  sortedDistIndicies = distances.argsort()
  classCount = {}
  # Tally the labels of the k nearest neighbors
  for i in range(k):
    voteIlabel = labels[sortedDistIndicies[i]]
    classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
  # Return the label that received the most votes
  sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
  return sortedClassCount[0][0]
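
To see the voting step on something small enough to check by hand, here is a minimal sketch of the same idea on four 2-D points. It is not the notebook's classify0: the function name knn_classify and the toy data are made up for illustration, broadcasting replaces tile, and collections.Counter replaces the manual dictionary tally.

```python
from collections import Counter

import numpy as np


def knn_classify(query, data, labels, k):
    """Return the majority label among the k rows of data nearest to query."""
    # Broadcasting subtracts query from every row; no tile() needed.
    distances = np.sqrt(((data - query) ** 2).sum(axis=1))
    nearest = distances.argsort()[:k]            # indices of the k closest rows
    votes = Counter(labels[i] for i in nearest)  # label -> number of votes
    return votes.most_common(1)[0][0]


# Toy training set (illustrative only): two points per class.
group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
group_labels = ['A', 'A', 'B', 'B']

print(knn_classify(np.array([0.0, 0.0]), group, group_labels, 3))  # prints B
print(knn_classify(np.array([1.0, 1.2]), group, group_labels, 3))  # prints A
```

With k=3 the query at the origin has two 'B' neighbors and one 'A' neighbor, so 'B' wins the vote.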

Analyze: creating scatter plots with Matplotlib


import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2])
plt.show()

 

import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2],
           15.0*array(datingLabels), 15.0*array(datingLabels))
plt.show()

import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1],
           15.0*array(datingLabels), 15.0*array(datingLabels))
plt.show()


Prepare: normalizing numeric values


def autoNorm(dataSet):
  minVals = dataSet.min(0)
  maxVals = dataSet.max(0)
  ranges = maxVals - minVals
  normDataSet = zeros(shape(dataSet))
  m = dataSet.shape[0]
  normDataSet = dataSet - tile(minVals, (m,1))
  normDataSet = normDataSet/tile(ranges, (m,1))
  return normDataSet, ranges, minVals
normMat, ranges, minVals = autoNorm(datingDataMat)
normMat

array([[0.44832535, 0.39805139, 0.56233353],
       [0.15873259, 0.34195467, 0.98724416],
       [0.28542943, 0.06892523, 0.47449629],
       ...,
       [0.29115949, 0.50910294, 0.51079493],
       [0.52711097, 0.43665451, 0.4290048 ],
       [0.47940793, 0.3768091 , 0.78571804]])

ranges

array([9.1273000e+04, 2.0919349e+01, 1.6943610e+00])

minVals

array([0.      , 0.      , 0.001156])
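
autoNorm applies min-max scaling, newValue = (oldValue - min) / (max - min), column by column. The sketch below shows the same computation using NumPy broadcasting instead of tile. The function name min_max_normalize, the three-row sample, and the guard for constant columns are all additions for illustration and are not part of the original notebook.

```python
import numpy as np


def min_max_normalize(data):
    """Scale each column of data to the range [0, 1]."""
    min_vals = data.min(axis=0)
    ranges = data.max(axis=0) - min_vals
    # Assumption: a constant column (range 0) is left at 0 rather than dividing by zero.
    safe_ranges = np.where(ranges == 0, 1.0, ranges)
    return (data - min_vals) / safe_ranges, ranges, min_vals


# Small illustrative sample with columns on very different scales.
sample = np.array([[40920.0, 8.3, 0.95],
                   [14488.0, 7.2, 1.67],
                   [26052.0, 1.4, 0.81]])

norm, ranges, min_vals = min_max_normalize(sample)
print(norm.min(axis=0))  # every column now starts at 0
print(norm.max(axis=0))  # and ends at 1
```

Without this step the first column, which is in the tens of thousands, would dominate the Euclidean distance used by the classifier.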

Test: testing the classifier as a whole program


def datingClassTest():
  hoRatio = 0.1
  datingDataMat, datingLabels = file2matrix('/content/drive/MyDrive/MachineLearning/机器学习/k-近邻算法/使用k-近邻算法改进约会网站的配对效果/datingTestSet2.txt')
  normMat, ranges, minVals = autoNorm(datingDataMat)
  m = normMat.shape[0]
  numTestVecs = int(m*hoRatio)
  errorCount = 0
  for i in range(numTestVecs):
    classifierResult = classify0(normMat[i,:], normMat[numTestVecs:m,:],
                                 datingLabels[numTestVecs:m],3)
    print("the classifierResult came back with: %d,\
    the real answer is: %d" % (classifierResult, datingLabels[i]))
    if (classifierResult != datingLabels[i]):
      errorCount += 1
  print("the total error rate is: %f" % (errorCount/float(numTestVecs)))

datingClassTest()

the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 3
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 3,    the real answer is: 3
the classifierResult came back with: 2,    the real answer is: 2
the classifierResult came back with: 1,    the real answer is: 1
the classifierResult came back with: 3,    the real answer is: 1
the total error rate is: 0.050000
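
The run above holds out the first 10% of the rows as test vectors and classifies each one against the remaining 90%; 5 mismatches out of 100 gives the 0.05 error rate. The sketch below repeats that hold-out procedure on synthetic data so it can run without the dating file. Everything in it (the helper names, the three cluster centers, the random seed) is made up for illustration, and the error rate it prints says nothing about the dating data.

```python
import numpy as np


def nearest_vote(query, train, train_labels, k):
    """Majority label among the k training rows nearest to query."""
    order = np.sqrt(((train - query) ** 2).sum(axis=1)).argsort()[:k]
    values, counts = np.unique(np.asarray(train_labels)[order], return_counts=True)
    return values[counts.argmax()]


def holdout_error_rate(data, labels, classify, k=3, ho_ratio=0.1):
    """Test on the first ho_ratio of the rows, train on the rest."""
    n_test = int(len(data) * ho_ratio)
    errors = 0
    for i in range(n_test):
        predicted = classify(data[i], data[n_test:], labels[n_test:], k)
        if predicted != labels[i]:
            errors += 1
    return errors / n_test


# Synthetic, already-shuffled data: three noisy clusters (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
toy_labels = rng.integers(0, 3, size=200)
toy_data = centers[toy_labels] + rng.normal(scale=0.5, size=(200, 2))

rate = holdout_error_rate(toy_data, toy_labels, nearest_vote)
print("hold-out error rate: %.3f" % rate)
```

Taking the first rows as the test set only works if the rows are in random order; a file sorted by class would need shuffling first.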


Use: putting together a useful system


def classifyPerson():
  resultList = ['not at all',
          'in small doses',
          'in large doses',]
  percentTats = float(input("percentage of time spent playing video games?"))
  ffMiles = float(input("frequent flier miles earned per year?"))
  iceCream = float(input("liters of ice cream consumed per year?"))
  datingDataMat, datingLabels = file2matrix('/content/drive/MyDrive/MachineLearning/机器学习/k-近邻算法/使用k-近邻算法改进约会网站的配对效果/datingTestSet2.txt')
  normMat, ranges, minVals = autoNorm(datingDataMat)
  inArr = array([ffMiles, percentTats, iceCream])
  classifierResult = classify0((inArr - minVals)/ranges, normMat, datingLabels, 3)
  print("You will probably like this person:", resultList[classifierResult - 1])
classifyPerson()

percentage of time spent playing video games?10
frequent flier miles earned per year?10000
liters of ice cream consumed per year?0.5
You will probably like this person: in small doses
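
A new sample has to be scaled with the same minVals and ranges as the training data before it is compared with normMat, which is what (inArr - minVals)/ranges does inside classifyPerson. The sketch below reproduces that single step for the inputs entered above, using the ranges and minVals values printed earlier in this post. The variable names are illustrative.

```python
import numpy as np

# Values copied from the ranges and minVals outputs shown earlier in this post.
ranges = np.array([9.1273000e+04, 2.0919349e+01, 1.6943610e+00])
min_vals = np.array([0.0, 0.0, 0.001156])

# Same column order as inArr: flier miles, % of time gaming, liters of ice cream.
new_person = np.array([10000.0, 10.0, 0.5])

# Scale the new sample exactly as the training rows were scaled.
scaled = (new_person - min_vals) / ranges
print(scaled)
```

Note that the prompts ask for the gaming percentage first, while the feature vector puts frequent flier miles first, so the order in inArr matters.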
