# Finding any Waldo with Zero-shot Invariant and Efficient Visual Search

Authors: Mengmi Zhang, Jiashi Feng, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, and Gabriel Kreiman

This repository contains an implementation of a zero-shot deep learning model for invariant and efficient visual search. Our paper is published in Nature Communications.

Free access to our manuscript [HERE](https://www.nature.com/articles/s41467-018-06217-x), supplementary material [HERE](http://klab.tch.harvard.edu/publications/PDFs/gk7627_supplement.pdf)

## Project Description

Searching for a target object in a cluttered scene constitutes a fundamental challenge in daily vision. Visual search must be selective enough to discriminate the target from distractors, invariant to changes in the appearance of the target, efficient to avoid exhaustive exploration of the image, and must generalize to locate novel target objects with zero-shot training. Previous work has focused on searching for perfect matches of a target after extensive category-specific training. Here we show for the first time that humans can efficiently and invariantly search for natural objects in complex scenes. To gain insight into the mechanisms that guide visual search, we propose a biologically inspired computational model that can selectively, invariantly and efficiently locate targets, generalizing to novel objects. The model provides an approximation to the mechanisms integrating bottom-up and top-down signals during search in natural scenes.

| [![Stimuli](sampleimg/cropped_2_1.jpg)](sampleimg/cropped_2_1.jpg) | [![Target](sampleimg/waldo.JPG)](sampleimg/waldo.JPG) | [![attentionmap](GIF/AM.gif)](GIF/AM.gif) | [![fixatedplace](GIF/FP.gif)](GIF/FP.gif) |
|:---:|:---:|:---:|:---:|
| Stimuli | Target | Attention Map predicted by our model | Fixated Place |

## Pre-requisite

The code has been successfully tested in MAC OSX and Ubuntu 14.04. Only CPU is required.
To speed up computation, a GPU is highly recommended (at least 3GB of GPU memory).

It requires the deep learning platform Torch7. Refer to [link](http://torch.ch/docs/getting-started.html) for installation.

The Matio package is required (to save and load MATLAB arrays from Torch7). Refer to [link](https://github.com/soumith/matio-ffi.torch) for installation.

The Loadcaffe package is required (to load pre-trained caffe models into Torch7). Refer to [link](https://github.com/szagoruyko/loadcaffe) for installation.

Run the commands:
```
luarocks install image
luarocks install tds
```

Download our repository:
```
git clone https://github.com/kreimanlab/VisualSearchZeroShot.git
```

Download the caffe VGG16 model from [HERE](https://drive.google.com/open?id=1AEJse0liaT8uJoLmImqhyJN2y2_6mDsJ) and place it in folder ```/Models/caffevgg16/```

## Usage

### Visual search in object arrays

Navigate to the repository folder. To run our search model, copy the following command in the command window:
```
th IVSNtopdown_30_31_array.lua
```
Visualize the generated attention map in MATLAB: ```visAttentionMap_array.m```

### Visual search in natural images

Navigate to the repository folder and run ```PreprocessNaturalDesign.m``` in MATLAB. It preprocesses the search images by cropping them into uniform pieces and saving the paths of the cropped images in ```croppednaturaldesign_img.txt```. To run our search model, copy the following command in the command window:
```
th IVSNtopdown_30_31_naturaldesign.lua
```
Visualize the generated attention map in MATLAB: ```visAttentionMap_naturaldesign.m```

### Visual search in Waldo images

Navigate to the repository folder and run ```PreprocessWaldoImage.m``` in MATLAB. It preprocesses the search images by cropping them into uniform pieces and saving the paths of the cropped images in ```croppednaturaldesign_img.txt```.
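
The cropping step used by both preprocessing scripts can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB code. It assumes a 224-pixel tile stride and 1-based `row_col` indices, both inferred from tile names in the repository's `choppednaturaldesign` folder such as `img_id004_225_1121.jpg`. The 1028x1280 image size and the helper names `tile_origins` and `tile_name` are hypothetical, chosen for the example. How the scripts treat tiles that overrun the image border (padding or truncation) is not stated here, so the sketch only lists tile origins.

```python
def tile_origins(height, width, tile=224):
    """Return 1-based (row, col) top-left corners of a uniform tile grid.

    Assumption: tiles are laid out on a regular grid with stride `tile`,
    starting at pixel 1, as the repository's tile file names suggest.
    """
    rows = range(1, height + 1, tile)
    cols = range(1, width + 1, tile)
    return [(r, c) for r in rows for c in cols]


def tile_name(prefix, row, col):
    """Build a tile file name in the observed `<prefix>_<row>_<col>.jpg` pattern."""
    return f"{prefix}_{row}_{col}.jpg"


# Hypothetical image size; the real size depends on the search image.
origins = tile_origins(height=1028, width=1280)
names = [tile_name("img_id004", r, c) for r, c in origins]

# Write one tile path per line, similar in spirit to the text file
# that the preprocessing scripts are described as producing.
listing = "\n".join(names)
print(len(names), names[0], names[-1])
```

With these assumed dimensions the grid has 5 rows and 6 columns of tile origins, which agrees with the row offsets (1 to 897) and column offsets (1 to 1121) seen in the sample tile names.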
To run our search model, copy the following command in the command window:
```
th IVSNtopdown_30_31_waldo.lua
```
Visualize the generated attention map in MATLAB: ```visAttentionMap_waldo.m```

## Datasets

We have collected human eyetracking data in three increasingly complex visual search tasks: object arrays, natural images and Waldo images. Below is a preview of an example trial in each task. Yellow circles denote the eye movements. The red bounding box denotes the ground truth. Correspondingly, we show the eye movements predicted by our computational model.

| [![Objectarray](GIF/array_6.gif)](GIF/array_6.gif) | [![Naturalimage](GIF/naturaldesign_21_subj1.gif)](GIF/naturaldesign_21_subj1.gif) | [![Waldoimage](GIF/waldo_31_subj1.gif)](GIF/waldo_31_subj1.gif) |
|:---:|:---:|:---:|
| Object array (Human) | Natural image (Human) | Waldo image (Human) |

| [![Objectarray](GIF/array_6_model.gif)](GIF/array_6_model.gif) | [![Naturalimage](GIF/naturaldesign_21_model.gif)](GIF/naturaldesign_21_model.gif) | [![Waldoimage](GIF/waldo_31_model.gif)](GIF/waldo_31_model.gif) |
|:---:|:---:|:---:|
| Object array (Model) | Natural image (Model) | Waldo image (Model) |

**Note** that we also include one variation of the search task in object arrays. Instead of familiar objects, we collect **novel** objects from [[1]](http://wiki.cnbc.cmu.edu/Novel_Objects) [[2]](http://michaelhout.com/?page_id=759) [[3]](https://www.turbosquid.com/Search/Index.cfm?keyword=alien&max_price=0&min_price=0) and create a dataset of **novel** objects on arrays.
See below for **novel** object examples:

![Novel objects](sampleimg/montagenovel.jpg)

You can **download** the complete dataset including the novel object dataset from [Part1](https://drive.google.com/file/d/1ZvmugJDds-CrwTvhIXmyYVxnniNmx7ce/view?usp=sharing), [Part2](https://drive.google.com/open?id=1C4T2Siz6zWxksvDbL549-KnWvJMgoCeh) and [Part3](https://drive.google.com/open?id=1eQzrTVFov1OjGoRAGy75vgDabPrHMoS7) (total size: ~9GB)

It contains the following:
- Part 1: ```Datasets``` folder: contains search images, targets, ground truth, and psychophysics (human eyetracking data, MATLAB functions to process and extract fixations)
- Part 2: ```Eval``` folder: contains MATLAB files to evaluate cumulative search performance as a function of the number of fixations for our computational model across the three search tasks
- Part 2: ```Plot``` folder: contains MATLAB files to reproduce the figures in our paper
- Part 2: ```supportingFunc``` folder: add this directory to your MATLAB search path
- Part 3: ```DataForPlot``` folder: pre-processed results saved as .mat files. Place this folder in the ```Plot``` folder (Part 2) before using the plotting functions

## Errata

- Physical screen size used for stimulus presentation (in mm): 360x295
- Physical distance between the computer monitor and the eyetracking camera (in mm): 76.2
- Averaged sitting distance between human subjects and the screen (in mm): 660.4

(The measurement errors do not alter major results or conclusions reported in the paper.)

## Notes

The source code is for illustration purposes only. Path reconfigurations may be needed to run some MATLAB scripts. We do not provide technical support, but we would be happy to discuss SCIENCE!

## License

See [Kreiman lab](http://klab.tch.harvard.edu/code/license_agreement.pdf) for license agreements before downloading and using our source codes and datasets.
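
The `Eval` folder listed under Datasets is described as computing cumulative search performance as a function of the number of fixations. The metric can be illustrated with a short Python sketch. It is not the authors' MATLAB evaluation code, and it assumes that each trial is summarized by the number of fixations needed to reach the target, with `None` for trials where the target was never found. The function name and the toy data are hypothetical.

```python
def cumulative_performance(fixations_to_target, max_fixations):
    """Fraction of trials in which the target was found within n fixations.

    fixations_to_target: per-trial fixation count at which the target was
        reached, or None if it was never found (assumed encoding).
    Returns a list whose entry n-1 is the proportion for n fixations.
    """
    n_trials = len(fixations_to_target)
    curve = []
    for n in range(1, max_fixations + 1):
        found = sum(1 for f in fixations_to_target if f is not None and f <= n)
        curve.append(found / n_trials)
    return curve


# Toy data: six trials, one of which never reaches the target.
trials = [1, 3, 2, None, 1, 6]
curve = cumulative_performance(trials, max_fixations=6)
for n, p in enumerate(curve, start=1):
    print(f"{n} fixations: {p:.2f}")
```

The resulting curve is non-decreasing by construction, so it can be compared directly between human subjects and the model at each fixation count.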