reinforced-agglomerative-clustering: learning agglomerative clustering via reinforcement learning

  • Author: v8_497552
  • File size: 24.6KB
  • Format: zip
  • Favorites: 0
  • Resource type: VIP exclusive
  • Downloads: 0
  • Upload date: 2022-05-16 08:02
Reinforced agglomerative clustering. To overcome the greediness of traditional linkage criteria in agglomerative clustering, we propose a reinforcement learning approach that learns a non-greedy merge policy by modeling agglomerative clustering as a Markov Decision Process. Agglomerative clustering is a "bottom-up" approach to hierarchical clustering, in which each observation starts in its own cluster and pairs of clusters are merged as one moves up the hierarchy. It is a sequential decision problem, and it comes with the difficulty that decisions made earlier affect later results. Traditional linkage criteria cannot handle this, because they only measure the similarity of clusters at the current step. This motivated us to model clustering as a Markov Decision Process and solve it with reinforcement learning. The agent should learn a non-greedy merge policy, so that each merge operation is chosen for a better long-term discounted reward. The state is defined as the feature representation of the current clustering; we use pooling to aggregate the features of all clusters. The action is defined as merging cluster i and cluster j. We use Q-learning to compute the value of a state-action pair. During training, the reward is computed from the ground-truth labels of the images; at test time, the agent is evaluated in a different domain to see how well it generalizes.
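To make the MDP formulation above concrete, the following is a minimal sketch of such a clustering environment. It is an illustration only, not the repository's actual env.py: the name ClusteringEnv, the mean-pooled state, and the purity-based per-merge reward are all assumptions chosen to match the description, nothing more.

```python
# Hypothetical sketch of agglomerative clustering as an MDP; this is NOT
# the repository's env.py, just an illustration of the formulation above.
import numpy as np


class ClusteringEnv:
    """Each observation starts as its own cluster; an action merges two clusters."""

    def __init__(self, features, labels, n_target_clusters):
        self.features = np.asarray(features)  # (n_points, d) feature matrix
        self.labels = np.asarray(labels)      # integer ground-truth labels (training reward only)
        self.n_target = n_target_clusters
        self.reset()

    def reset(self):
        # "Bottom up": every observation begins in its own singleton cluster.
        self.clusters = [[i] for i in range(len(self.features))]
        return self._pooled_state()

    def _pooled_state(self):
        # State: a pooled (here, mean-pooled) feature vector per current cluster.
        return np.stack([self.features[c].mean(axis=0) for c in self.clusters])

    def step(self, action):
        # Action: merge cluster i and cluster j (indices into the current state).
        i, j = action
        merged = self.clusters[i] + self.clusters[j]
        self.clusters = [c for k, c in enumerate(self.clusters) if k not in (i, j)]
        self.clusters.append(merged)
        # Assumed reward: purity of the merged cluster under the ground-truth
        # labels, so good merges earn a higher long-term discounted return.
        counts = np.bincount(self.labels[merged])
        reward = counts.max() / len(merged)
        done = len(self.clusters) <= self.n_target
        return self._pooled_state(), reward, done
```

The Q-learning sketch after the README below plugs into the same (state, action, reward) interface.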
reinforced-agglomerative-clustering-master.zip
  • reinforced-agglomerative-clustering-master
    • env
      • __init__.py (0B)
      • env.py (8.8KB)
      • tree.py (7.7KB)
      • test_env.py (1018B)
    • dataset
      • download_data.sh (533B)
    • utils
      • vae_example.py (5.7KB)
      • baseline_clustering.py (7.8KB)
      • utils_pad.py (11.1KB)
    • agent.py (19.5KB)
    • rl.config (387B)
    • README.md (1.7KB)
    • requirements.txt (48B)
    • .gitignore (72B)
    • main.py (19.4KB)
Description
# Reinforced agglomerative clustering

To overcome the greediness of traditional linkage criteria in agglomerative clustering, we propose a reinforcement learning approach that learns a non-greedy merge policy by modeling agglomerative clustering as a Markov Decision Process.

[Agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering) is a "bottom-up" approach to hierarchical clustering, where each observation starts in its own cluster and pairs of clusters are merged as one moves up the hierarchy. Agglomerative clustering is therefore a sequential decision problem, which comes with the difficulty that a decision made earlier affects later results. Traditional linkage criteria fail to handle this problem, since they simply measure the similarity of clusters at the current step. This motivated us to model the clustering process as a Markov Decision Process and solve it with reinforcement learning.

The agent should learn a non-greedy merge policy, so that each merge operation is chosen for a better long-term discounted reward. The state is defined as a feature representation of the current clustering; we use pooling to aggregate the features of all clusters. The action is defined as merging cluster i and cluster j. We use Q-learning to compute the value of a state-action pair (a minimal sketch of this update follows below). In training, the reward is computed from the ground-truth labels of the images. At test time, we test the agent in a different domain to see how it generalizes.

## Installation

1. Download the MNIST dataset

   ```
   cd dataset/ && bash download_data.sh
   ```

2. Install all the dependencies

   ```
   pip install -r requirements.txt
   ```

## Usage

1. Train

   ```
   python main.py --train
   ```

2. Test

   ```
   python main.py --test [MODEL_DIR]
   ```
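To illustrate the Q-learning update described in the README, here is a minimal sketch with a linear Q-function over the pooled features of a candidate merge pair. QAgent and its methods are hypothetical names, not the API of the repository's agent.py; the linear weights stand in for whatever function approximator the actual agent uses.

```python
# Hypothetical Q-learning sketch for merge actions; not the repository's
# agent.py. Q(s, (i, j)) is a linear function of the pooled features of
# clusters i and j.
import numpy as np


class QAgent:
    def __init__(self, feature_dim, lr=0.01, gamma=0.9, epsilon=0.1):
        self.w = np.zeros(2 * feature_dim)  # linear weights over a merge pair
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def _pairs(self, state):
        # All candidate merge actions (i, j) over the current clusters.
        n = len(state)
        return [(i, j) for i in range(n) for j in range(i + 1, n)]

    def q_value(self, state, action):
        i, j = action
        return self.w @ np.concatenate([state[i], state[j]])

    def select_action(self, state):
        # Epsilon-greedy over all candidate merges.
        pairs = self._pairs(state)
        if np.random.rand() < self.epsilon:
            return pairs[np.random.randint(len(pairs))]
        return max(pairs, key=lambda a: self.q_value(state, a))

    def update(self, state, action, reward, next_state, done):
        # One-step Q-learning target: r + gamma * max_a' Q(s', a').
        target = reward
        if not done:
            target += self.gamma * max(self.q_value(next_state, a)
                                       for a in self._pairs(next_state))
        x = np.concatenate([state[action[0]], state[action[1]]])
        self.w += self.lr * (target - self.w @ x) * x
```

Wiring this to the ClusteringEnv sketch earlier on this page gives a complete episode loop: `action = agent.select_action(state)`, then `next_state, reward, done = env.step(action)`, then `agent.update(state, action, reward, next_state, done)`.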