• Thou_
    Author
  • Python
    Development tool
  • 334KB
    File size
  • rar
    File format
  • 2020-10-23 08:59
    Upload date
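Simple data preprocessing: min-max normalization; filling missing values; removing outliers with the 3σ rule; removing duplicate rows.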
deal_grape.rar
  • deal_grape
  • __pycache__
  • firstdeal.cpython-37.pyc
    1.1KB
  • .idea
  • inspectionProfiles
  • profiles_settings.xml
    174B
  • misc.xml
    193B
  • modules.xml
    279B
  • workspace.xml
    14.1KB
  • .gitignore
    229B
  • deal_grape.iml
    291B
  • 附件2-处理后指标.xlsx
    86KB
  • wine.csv
    11KB
  • firstdeal.py
    1.5KB
  • 预处理指标.xlsx
    86KB
  • deal_Grape_Wine.py
    1.2KB
  • 预处理指标2.xlsx
    86KB
  • 附件2-指标总表.xls
    174.5KB
Description
import numpy as np


def dealdata(GrapeOrWine, key=1):
    # Drop duplicate rows by ID column: '样品编号' (sample ID) or '品种编号' (variety ID)
    if key == 1:
        GrapeOrWine = GrapeOrWine.drop_duplicates('样品编号')
    else:
        GrapeOrWine = GrapeOrWine.drop_duplicates('品种编号')
    # Reset the index after dropping rows, so that labels run 0..N-1 and match
    # the positional counter used with .loc below
    GrapeOrWine = GrapeOrWine.reset_index(drop=True)
    ncloumn = np.size(GrapeOrWine, 1)

    # 3-sigma rule: replace outliers and zero entries with the mean of the rest
    for i in GrapeOrWine.columns[1:ncloumn]:  # scan column by column
        listErro = []   # flagged values
        indexErro = []  # row positions of the flagged values
        # .loc selects by label, .iloc selects by position
        thesum = GrapeOrWine.loc[:, i].sum()
        aver = GrapeOrWine.loc[:, i].mean()
        std = GrapeOrWine.loc[:, i].std()
        N = len(GrapeOrWine.loc[:, i])
        for cnt, item in enumerate(GrapeOrWine.loc[:, i]):
            if np.abs(item - aver) >= 3 * std or item == 0:
                listErro.append(item)
                indexErro.append(cnt)
        thesum = thesum - sum(listErro)
        aver_now = thesum / (N - len(listErro))
        for k in indexErro:
            GrapeOrWine.loc[k, i] = aver_now

    # Min-max normalization of every data column to [0, 1]
    for i in GrapeOrWine.columns[1:ncloumn]:
        minVal = GrapeOrWine.loc[:, i].min()
        maxVal = GrapeOrWine.loc[:, i].max()
        for j in range(len(GrapeOrWine.loc[:, i])):
            GrapeOrWine.loc[j, i] = (GrapeOrWine.loc[j, i] - minVal) / (maxVal - minVal)
    return GrapeOrWine
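The same three steps (de-duplicate, replace 3σ outliers and zeros, min-max scale) could also be written without explicit row loops. The following is only a sketch under stated assumptions, not the uploader's code: the function name `preprocess`, the parameter `id_col`, and the choices to treat NaN as a missing value and to return NaN for constant columns are all hypothetical additions.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame, id_col: str) -> pd.DataFrame:
    """Hypothetical vectorized variant: de-duplicate, repair, min-max scale."""
    # Drop duplicate IDs, then reset the index so labels run 0..n-1.
    out = df.drop_duplicates(id_col).reset_index(drop=True)
    value_cols = out.columns.drop(id_col)
    values = out[value_cols].astype(float)

    # Flag entries at least 3 sample standard deviations from the column mean.
    # Zeros and NaN are also flagged (assumption: both mark missing data).
    mean, std = values.mean(), values.std()
    bad = (values - mean).abs().ge(3 * std) | values.eq(0) | values.isna()

    # Replace flagged entries with the mean of the unflagged entries.
    clean_mean = values.mask(bad).mean()
    values = values.mask(bad).fillna(clean_mean)

    # Min-max scaling; a constant column gives NaN instead of dividing by zero.
    lo, hi = values.min(), values.max()
    span = (hi - lo).replace(0, np.nan)
    scaled = (values - lo) / span

    # The ID column is placed first, followed by the scaled data columns.
    return pd.concat([out[[id_col]], scaled], axis=1)
```

A call might look like `preprocess(pd.read_csv("wine.csv"), "样品编号")`, assuming the CSV actually contains that ID column. Note that with the sample standard deviation, a single value can only exceed 3σ when a column has at least 11 rows, so the outlier step has no effect on very short columns.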