Development tool: Jupyter Notebook
Upload date: 2023-03-29 22:30:52
Uploader: sh-1993
Description: Q-Learning Out of Africa model. This was the project for the Statistical Machine Learning course at the University of Torino.

Images (0, 2023-03-30)
Images\earth.jpg (72827, 2023-03-30)
Images\pure-bw-earth.jpg (44026, 2023-03-30)
LICENSE (1073, 2023-03-30)
Metrics (0, 2023-03-30)
Metrics\Images (0, 2023-03-30)
Metrics\Images\avg_lifetime.jpg (25061, 2023-03-30)
Metrics\Images\avg_reward.jpg (24713, 2023-03-30)
Metrics\Images\migration_exploration.jpg (64885, 2023-03-30)
Metrics\lifetime.npy (16128, 2023-03-30)
Metrics\q_table.npy (14420352, 2023-03-30)
Metrics\reward_per_episode.npy (16128, 2023-03-30)
Report (0, 2023-03-30)
Report\ML_Project.pdf (1339606, 2023-03-30)
diffusion_process.ipynb (548369, 2023-03-30) (1759, 2023-03-30) (7214, 2023-03-30) (321, 2023-03-30)
q-learning.png (180154, 2023-03-30)
requirements.txt (28, 2023-03-30) (606, 2023-03-30)

# Human Diffusion

Human Diffusion is a Q-Learning model that simulates human migration out of Africa. This project was developed for the Statistical Machine Learning course at the University of Torino.

## Overview

The model uses Q-Learning to simulate the movement of early humans out of Africa and into other parts of the world. It represents the world map as a grid and simulates migration by learning the most rewarding paths. It takes into account physical barriers such as oceans, as well as potential migration paths like the Arabian Bridge, Indonesian Bridge, and Bering Strait. Mountain ranges and deserts are not implemented: doing so would have taken too much time, and the extra detail would not have fit within the report's 6-page limit.

The simulation begins with a starting area in Africa and generates migration routes using Q-Learning, a reinforcement learning technique. The model learns to take actions (move west, east, north, or south) that maximize rewards, which are assigned based on the desirability of a location.

Below is a simple explanation of the Q-Learning algorithm, which I unfortunately couldn't include in the LaTeX report due to space constraints.

[Image: explanation of the Q-Learning algorithm]

### Results

The simulation generates a visual representation of the migration routes taken by early humans as they moved out of Africa. The model learns the most efficient routes given the constraints of the physical environment, and produces a map that closely resembles the actual migration patterns of early humans.

## Getting Started

### Prerequisites

To run this project, you will need Python 3.x installed, along with the following Python libraries:

- NumPy
- Matplotlib
- tqdm

You can install them using `pip`:

```bash
pip install numpy matplotlib tqdm
```

or

```bash
pip install -r "requirements.txt"
```

### Running the Simulation

1. Clone this repository:

   ```bash
   git clone
   cd Human-Diffusion
   ```

2. Run the `` script to train the Q-Learning model:

   ```bash
   python
   ```

   This will start the training process and save the Q-table, reward per episode, and lifetime data in `.npy` files.

3. Visualize the results of the trained model by running the `` script:

   ```bash
   python
   ```

   This will display the average reward, the average lifetime of the group, and the final migration path.

## Code Structure

The project consists of three main files:

- ``: Contains the `Earth` class, which creates the map.
- ``: Contains the `HumanMigration` class, which encapsulates the Q-Learning model, training process, and visualization methods.
- ``: A script that imports the `HumanMigration` class, trains the model, and saves the results.

The `` script is used to visualize the results of the trained model.

## Additional Visualizations

The code also includes two additional visualizations to help analyze the performance of the Q-Learning algorithm throughout the simulation:

1. **Average Reward**: This plot shows the average reward per episode over a defined interval (e.g., every 10% of the total episodes). It indicates how well the algorithm is learning to navigate the environment and find desirable locations.
2. **Average Lifetime of Group**: This plot shows the average number of generations the group survived before either reaching a desirable location or encountering a barrier. It can be used to evaluate the effectiveness of the learned migration routes.

## License

This project is licensed under the MIT License - see the LICENSE file for details.
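To make the Q-Learning idea from the Overview concrete, here is a minimal, dependency-free sketch of tabular Q-Learning on a toy grid. It is not the project's `HumanMigration` implementation: the 3x4 map, the reward values, the hyperparameters, and the names `GRID`, `step`, `train`, and `greedy_path` are all assumptions chosen for illustration. Water cells and the map edge act as barriers that end an episode, and a single goal cell stands in for a desirable location.

```python
import random

# Hypothetical toy map (NOT the project's world map): 1 = land, 0 = water.
GRID = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # north, south, west, east
START, GOAL = (0, 0), (0, 3)                  # assumed start and target cells


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    r, c = state[0] + action[0], state[1] + action[1]
    off_map = not (0 <= r < len(GRID) and 0 <= c < len(GRID[0]))
    if off_map or GRID[r][c] == 0:
        return state, -10.0, True   # barrier: penalty, episode ends
    if (r, c) == GOAL:
        return (r, c), 10.0, True   # desirable location reached
    return (r, c), -1.0, False      # ordinary move: small cost


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(GRID)) for c in range(len(GRID[0]))}
    for _ in range(episodes):
        state, done, steps = START, False, 0
        while not done and steps < 100:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))                      # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-Learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state, steps = nxt, steps + 1
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START until the episode ends."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

On this toy map the only land route from the start to the goal is a single corridor, so the greedy path after training should trace it cell by cell. The real model works on a much larger grid and defines its own rewards, so treat the numbers here purely as placeholders.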
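The interval averaging described under Additional Visualizations (for example, one average per 10% of the episodes) can be sketched as follows. This is an illustrative helper, not code from the repository: the name `interval_means` and the choice to drop any leftover episodes that do not fill a whole interval are assumptions.

```python
def interval_means(values, n_intervals=10):
    """Average `values` over n_intervals equal, consecutive chunks.

    Trailing items that do not fill a whole chunk are ignored
    (an assumption made for this sketch).
    """
    if n_intervals <= 0 or len(values) < n_intervals:
        raise ValueError("need at least one value per interval")
    size = len(values) // n_intervals
    return [
        sum(values[i * size:(i + 1) * size]) / size
        for i in range(n_intervals)
    ]


if __name__ == "__main__":
    # Hypothetical per-episode rewards, only to show the call.
    rewards = [float(i) for i in range(20)]
    print(interval_means(rewards, n_intervals=10))
```

With NumPy installed, the same helper could be applied to the saved metrics, for instance `interval_means(np.load("Metrics/reward_per_episode.npy").tolist())`, assuming that file holds a one-dimensional array of per-episode rewards.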