caltrans-data-exploration

Category: Data visualization
Development tool: Jupyter Notebook
File size: 13656KB
Downloads: 0
Upload date: 2020-01-17 08:47:17
Uploader: sh-1993
Description: Exploring California's traffic using OmniSci tools

File list:
bin (0, 2020-01-17)
bin\__init__.py (0, 2020-01-17)
bin\extract.py (343, 2020-01-17)
bin\extract_darksky_weather.py (1414, 2020-01-17)
bin\old (0, 2020-01-17)
bin\old\_process.py (12357, 2020-01-17)
bin\old\extract_ncdc_weather.py (3687, 2020-01-17)
bin\transform_traffic_data_load_omnisci.py (2924, 2020-01-17)
bin\utils.py (567, 2020-01-17)
config.ini (419, 2020-01-17)
data (0, 2020-01-17)
data\html_files (0, 2020-01-17)
data\html_files\hour (0, 2020-01-17)
data\html_files\hour\caltrans_d4_2015_1hr.html (2164, 2020-01-17)
data\html_files\hour\caltrans_d4_2016_1hr.html (2164, 2020-01-17)
data\html_files\hour\caltrans_d4_2017_1hr.html (2164, 2020-01-17)
data\html_files\hour\caltrans_d4_2018_1hr.html (2164, 2020-01-17)
data\html_files\hour\caltrans_d4_2019_1hr.html (932, 2020-01-17)
data\html_files\min (0, 2020-01-17)
data\html_files\min\caltrans_d4_2015_5min.html (57619, 2020-01-17)
data\html_files\min\caltrans_d4_2016_5min.html (57462, 2020-01-17)
data\html_files\min\caltrans_d4_2017_5min.html (57621, 2020-01-17)
data\html_files\min\caltrans_d4_2018_5min.html (57621, 2020-01-17)
data\html_files\min\caltrans_d4_2019_5min.html (19470, 2020-01-17)
data\incident (0, 2020-01-17)
data\incident\all_text_chp_incidents_month_2019_01.txt (6543195, 2020-01-17)
data\incident\all_text_chp_incidents_month_2019_02.txt (6340180, 2020-01-17)
data\incident\all_text_chp_incidents_month_2019_03.txt (6850690, 2020-01-17)
data\incident\all_text_chp_incidents_month_2019_04.txt (6471878, 2020-01-17)
data\weather_noaa (0, 2020-01-17)
data\weather_noaa\ncdc_data.csv (21850759, 2020-01-17)
models (0, 2020-01-17)
models\190515_2030_TrafficAndWeather.h5 (1815768, 2020-01-17)
models\190516_0000_TrafficAndWeather.h5 (1729368, 2020-01-17)
models\archive (0, 2020-01-17)
models\archive\190511_1539.h5 (2027176, 2020-01-17)
models\archive\190512_0930_traffic_final.h5 (2066784, 2020-01-17)
... ...

Analyzing San Francisco's traffic with Python and OmniSci
==============================================

**Note:** Follow the instructions step by step to extract the data from the sources. If you just want to try the notebooks, go straight to them, although you'll still need to load data from somewhere.

## Table of Contents

* [General Info](#general-info)
* [Setup](#setup)
* [Extracting traffic data from Caltrans](#extracting-traffic-data)
* [Extracting weather](#extracting-weather)
* [Blog posts](#blog-posts)

## General Info

The state of California provides an enormous database containing years of traffic sensor data. In this repo, there is code to:

* Extract weather data from Dark Sky
* Extract traffic data from PeMS
* Extract weather data from NOAA
* Transform and load data to OmniSci

**But the best way to start is to go through the Jupyter notebooks!**

## Setup

1. Preferably, use Python 3.6.
2. Install the requirements in requirements.txt: `pip install -r requirements.txt`
3. Create the accounts needed to download the data (Caltrans PeMS for traffic, Dark Sky for weather).
4. Fill in the fields in `config.ini`. The code reads critical information, such as your Caltrans login, from this file. **You will not be able to extract data without creating a free account.**
5. Download the correct HTML files with the appropriate links for data extraction (see [Extracting traffic data](#extracting-traffic-data) below).

Once everything is ready, you only need to run the files in `bin/` to extract the data and load it to OmniSci. Run the files in this order:

1. `python bin/extract.py`
2. `python bin/extract_darksky_weather.py`
3. `python bin/transform_traffic_data_load_omnisci.py`

## Extracting traffic data

The data is provided by the California Department of Transportation (Caltrans) and is found in their Performance Measurement System (PeMS) database. Caltrans collects data in real time from around 40,000 sensors!

To extract Caltrans traffic data, follow these steps:

0. Follow the setup steps.
1. Set up the login info, paths, etc. in `config.ini`.
2. Go to the Caltrans PeMS website (http://pems.dot.ca.gov/) and log in.
3. Once logged in, navigate to the Data Clearinghouse (http://pems.dot.ca.gov/?dnode=Clearinghouse).
4. The Data Clearinghouse has the data you need. Unfortunately, scraping with Scrapy hasn't been implemented yet for this project, so you'll need to download the HTML page for the desired traffic data type and district from the website and place it in `./html_files/`. I've already placed some sample files in there.
5. Also important: make sure to download the meta files for your district. These are necessary because they contain metadata about the stations. When transforming and loading to OmniSci, the code reads all meta files in the folder and joins them together. All meta files for district 04 from 2015 to 2019 can be found in `data/meta/`.
6. You're ready to run: `python bin/extract.py`

## Extracting Weather

0. Follow the setup steps.
1. Set up the login info, paths, etc. in `config.ini`.
2. Create an API key at [darksky](https://darksky.net/dev) and add it to `config.ini`.
3. Open `bin/extract_darksky_weather.py` and configure the location, dates, etc.
4. You're ready to run: `python bin/extract_darksky_weather.py`

**Note:** Data from NOAA is already included in `data/weather_noaa`. The script used to download this data is also included, but it still has some bugs.

## Transforming and loading to OmniSci

To load the data, make sure OmniSci is running and that you have entered your OmniSci credentials in `config.ini`.

1. Make sure you have all the data correctly downloaded and ready.
2. Open `bin/transform_traffic_data_load_omnisci.py` and set the table name and other input parameters.
3. Run `python bin/transform_traffic_data_load_omnisci.py`

The data should now be in OmniSci and ready to visualize!

## Notebooks

The notebooks all require reading from OmniSci. Check them out to see how we created models to:

1. Forecast traffic: `notebooks/Train_Models.ipynb` and `Prediction.ipynb`
2. Identify the severity of an accident: `notebooks/IncidentClassification.ipynb`

Try them out, and try some new ideas with the data too!

## Blog posts

If you want to check out some of the insights we've found from the traffic data, you can read the blog posts here:

1. [Analyzing historic traffic data](https://www.omnisci.com/blog/analyzing-historical-traffic-flow-in-real-time-with-omnisci)
2. [Traffic weather and Incidents](https://www.omnisci.com/blog/traffic-weather-and-incidents-a-360-degree-view-of-california-commutes)
3. [Modeling Traffic Behavior](https://www.omnisci.com/blog/modeling-traffic-behavior-as-a-function-of-real-time-traffic-flow-and-weather)

Feel free to contact me with any questions or to get in touch with OmniSci.
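
## Example: checking `config.ini` before running the scripts

The setup steps say the scripts read logins and keys from `config.ini`, so it can help to validate that file before starting a long extraction. The following is only a minimal sketch using Python's standard `configparser`. The section names (`caltrans`, `darksky`, `omnisci`), the key names, and the helper `load_config` are assumptions made for illustration. They are not taken from this repository, so adapt them to the fields that actually appear in your `config.ini`.

```python
import configparser

# Hypothetical layout: these section and key names are illustrative
# assumptions, not the actual fields of this repository's config.ini.
SAMPLE_CONFIG = """
[caltrans]
username = you@example.com
password = change-me

[darksky]
api_key = your-darksky-key

[omnisci]
user = admin
password = change-me
host = localhost
dbname = omnisci
"""

# Fields that each (assumed) section must provide.
REQUIRED = {
    "caltrans": ("username", "password"),
    "darksky": ("api_key",),
    "omnisci": ("user", "password", "host", "dbname"),
}


def load_config(text):
    """Parse INI text and return {section: {key: value}}.

    Raises KeyError as soon as a required section or value is missing,
    so a misconfigured file fails before any download starts.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    settings = {}
    for section, keys in REQUIRED.items():
        if not parser.has_section(section):
            raise KeyError(f"missing section [{section}]")
        settings[section] = {}
        for key in keys:
            value = parser.get(section, key, fallback="").strip()
            if not value:
                raise KeyError(f"missing value for {section}.{key}")
            settings[section][key] = value
    return settings


if __name__ == "__main__":
    config = load_config(SAMPLE_CONFIG)
    print(config["caltrans"]["username"])
```

To check a real file, read it first (for example `load_config(open("config.ini").read())`) and adjust `REQUIRED` to match the sections it contains.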
