covid19_scenarios_data-master

Category: Other
Development tool: Python
File size: 377 KB
Downloads: 0
Upload date: 2020-08-10 23:43:25
Uploader: MainaBenaa
Description: Real dataset simulation of covid19

File list:
LICENSE (1276, 2020-04-02)
__init__.py (0, 2020-04-02)
case-counts (0, 2020-04-02)
case-counts\brazil (0, 2020-04-02)
case-counts\brazil\BRA-Acre.tsv (547, 2020-04-02)
case-counts\brazil\BRA-Alagoas.tsv (658, 2020-04-02)
case-counts\brazil\BRA-Amapá.tsv (480, 2020-04-02)
case-counts\brazil\BRA-Amazonas.tsv (550, 2020-04-02)
case-counts\brazil\BRA-Bahia.tsv (729, 2020-04-02)
case-counts\brazil\BRA-Ceará.tsv (580, 2020-04-02)
case-counts\brazil\BRA-Distrito Federal.tsv (734, 2020-04-02)
case-counts\brazil\BRA-Espírito Santo.tsv (757, 2020-04-02)
case-counts\brazil\BRA-Goiás.tsv (646, 2020-04-02)
case-counts\brazil\BRA-Maranhão.tsv (485, 2020-04-02)
case-counts\brazil\BRA-Mato Grosso do Sul.tsv (605, 2020-04-02)
case-counts\brazil\BRA-Mato Grosso.tsv (426, 2020-04-02)
case-counts\brazil\BRA-Minas Gerais.tsv (712, 2020-04-02)
case-counts\brazil\BRA-Paraná.tsv (614, 2020-04-02)
case-counts\brazil\BRA-Paraíba.tsv (635, 2020-04-02)
case-counts\brazil\BRA-Pará.tsv (523, 2020-04-02)
case-counts\brazil\BRA-Pernambuco.tsv (646, 2020-04-02)
case-counts\brazil\BRA-Piauí.tsv (459, 2020-04-02)
case-counts\brazil\BRA-Rio Grande do Norte.tsv (525, 2020-04-02)
case-counts\brazil\BRA-Rio Grande do Sul.tsv (634, 2020-04-02)
case-counts\brazil\BRA-Rio de Janeiro.tsv (644, 2020-04-02)
case-counts\brazil\BRA-Rondônia.tsv (460, 2020-04-02)
case-counts\brazil\BRA-Roraima.tsv (466, 2020-04-02)
case-counts\brazil\BRA-Santa Catarina.tsv (653, 2020-04-02)
case-counts\brazil\BRA-Sergipe.tsv (447, 2020-04-02)
case-counts\brazil\BRA-São Paulo.tsv (919, 2020-04-02)
case-counts\brazil\BRA-Tocantins.tsv (476, 2020-04-02)
case-counts\canada (0, 2020-04-02)
case-counts\canada\CAN-Alberta.tsv (1197, 2020-04-02)
case-counts\canada\CAN-BC.tsv (1283, 2020-04-02)
case-counts\canada\CAN-Manitoba.tsv (1169, 2020-04-02)
case-counts\canada\CAN-NL.tsv (1167, 2020-04-02)
case-counts\canada\CAN-NWT.tsv (1153, 2020-04-02)
... ...

# NOTE: This repo has been moved directly within covid19-scenarios. Please continue the discussion there

COVID-19 Scenarios Data

Data preprocessing scripts and preprocessed data storage for COVID-19 Scenarios project


## Overview

This repository serves as the source of observational data for [covid19_scenarios](https://neherlab.org/covid19/). It ingests data from a variety of sources listed in [sources.json](sources.json). For each source there is a parser written in Python in the directory `parsers`. The data is stored as `tsv` files (tab-separated values) for each location or country. These tabular files are mainly meant to enable data curation and storage, while the web application needs JSON files as input.

The following commands assume that you have cloned this repository as `covid19_scenarios_data` and that you run them from **outside** this repository. To run the parsers, call

```shell
python3 covid19_scenarios_data/generate_data.py --fetch
```

This will update the tables in the directory `case-counts`. For each parser there is a separate directory which contains individual case counts for each location covered by the parser. To run only specific parsers, run

```shell
python3 covid19_scenarios_data/generate_data.py --fetch --parsers netherlands switzerland
```

To generate the JSON files for the app, specify the path to the target location. This can be done either together with updating the `tsv` files or separately, depending on whether the command is run with `--fetch` or not.

```shell
python3 covid19_scenarios_data/generate_data.py \
        --output-cases path/case-counts.json \
        --output-population path/population.json
```

To generate the integrated scenario JSON, run

```shell
python3 covid19_scenarios_data/generate_data.py \
        --output-cases path/case-counts.json \
        --output-scenarios path/scenarios.json
```

## Contents

### Country codes

List of countries associated with regions, subregions, and three-letter codes supplied by the U.N.

### Population data

List of settings used by the default scenario of the COVID-19 epidemic simulation for different regions of interest.
### Case count data

Within the directory `./case-counts` is a structured set of `tsv` files containing aggregated data for selected countries and subregions/cities. We welcome contributions to keep this data up to date. The format chosen is:

```
time    cases    deaths    hospitalized    ICU    recovered
2020-03-14    ...
```

We are actively looking for people to supply data to be used for our modeling!

## Contributing and curating data:

### Adding parser or case count data for a new region:

The steps to follow are:

##### Identify a source for case counts data that is updated frequently (at least daily) as the outbreak evolves.

- Write a script that downloads and converts raw data into a dict of lists of lists `{'<location>': [['2020-03-20', 1, 0, ...], ['2020-03-21', 2, 0, ...]]}`
    - Columns: `[time, cases, deaths, hospitalized, ICU, recovered]`
    - **Important:** all columns must be cumulative data.
    - The time column **must** be a string formatted as `YYYY-MM-DD`
    - Try to keep the same order of columns for hygiene, although it should not ultimately matter
    - If data is missing, please leave the entry empty (e.g., `['2020-03-20', 1, None, None, ...]`)
- Use the `store_data()` function in `utils` to store the data into `.tsv` automatically
    - Ensure that the data provided to `store_data()` is well formatted
    - The keys in the data structure provided to `utils` should be
        - For countries: U.N. country names (see `country_codes.csv`), or
        - For states within countries: `<code>-<state>`, where `<code>` is the three-letter code for the country (see `country_codes.csv`), and `<state>` is the state name
    - The second parameter is the string identifying your parser (see the `sources.json` entry below)
- Place the script into the `parsers` directory
    - The name should correspond to the region name desired in the scenario.
    - There **must** be a function `parse()` defined that calls `store_data()` from `utils`

##### Update the _sources.json_ file to contain all relevant metadata.
- The three fields are:
    - primarySource = The URL/path to the raw data
    - dataProvenance = The organization behind the data collection
    - license = The license governing the usage of data

##### Test your parser and create a Pull Request

- Create the appropriate directory in `case-counts/`
- Test your parser from the directory above (outside your `covid19_scenarios_data` folder) using

```shell
python3 covid19_scenarios_data/generate_data.py --fetch --parsers <your_parser_name>
```

- Check the resulting output in `case-counts/<your_parser_name>/`, and add the files to your Pull Request together with the parser and `sources.json`

##### Add population data for the additional regions/states.

Case count data is most useful when tied to data on the population it refers to. To ensure new case counts are correctly included in the population presets, add a line to `populationData.tsv` for each new region (see [Adding/editing population data for a country and/or region](#adding/editing-population-data-for-a-country-and/or-region) below).

### Updating/editing case count data for the existing region:

We note that this option is not preferred relative to a script that automatically updates as outlined above. However, if there are no accessible data sources, one can manually enter the data. To do so:

##### Commit a manually entered file into the "manuals" directory

- Please use only the U.N. designated name for the country; the file name should be `<country_name>.tsv`.

### Adding/editing population data for a country and/or region:

As of now, all data used to initialize scenarios used by our model is found within `populationData.tsv`. It has the following form:

    name    populationServed    ageDistribution    hospitalBeds    ICUBeds    suspectedCaseMarch1st    importsPerDay    hemisphere
    Switzerland    ...

- Names: the U.N. designated name found within `country_codes.csv`
    - For a sub-region/city, please prefix the name with the three-letter country code of the containing country. See `country_codes.csv` for the correct letters.
- populationServed: a number with the population size
- ageDistribution: name of the country the region is within. Must be the U.N. designated name
- hospitalBeds: number of hospital beds within the region
- ICUBeds: number of ICU beds
- suspectedCasesMarch1st: the number of cases thought to be within the region on March 1st
- importsPerDay: number of suspected import cases per day
- hemisphere: either 'Northern', 'Southern', or 'Tropical', used to determine parameters for the epidemiology

At least one of `suspectedCasesMarch1st` and `importsPerDay` needs to be non-zero. Otherwise there is no outbreak (good news in principle, but not useful for exploring scenarios).

## License

[Mixed](LICENSE)
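
As an illustration of the `populationData.tsv` rules described above (allowed `hemisphere` values, and at least one of `suspectedCasesMarch1st` and `importsPerDay` being non-zero), here is a minimal Python sketch of a row check. It is not part of this repository: the function name `check_population_row` and the sample rows are hypothetical, and the column names follow the field list, so they may need adjusting to match the header of the actual file.

```python
import csv
import io

# Column names taken from the field list in this README (assumed spelling;
# verify against the header line of the real populationData.tsv).
COLUMNS = [
    "name", "populationServed", "ageDistribution", "hospitalBeds",
    "ICUBeds", "suspectedCasesMarch1st", "importsPerDay", "hemisphere",
]
HEMISPHERES = {"Northern", "Southern", "Tropical"}


def check_population_row(row):
    """Return a list of problems found in one row (empty list if none)."""
    problems = []
    if row["hemisphere"] not in HEMISPHERES:
        problems.append(f"unknown hemisphere: {row['hemisphere']!r}")
    # At least one seeding parameter must be non-zero,
    # otherwise the scenario has no outbreak to simulate.
    no_seed_cases = float(row["suspectedCasesMarch1st"]) == 0
    no_imports = float(row["importsPerDay"]) == 0
    if no_seed_cases and no_imports:
        problems.append("suspectedCasesMarch1st and importsPerDay are both zero")
    return problems


# Made-up rows for illustration only; the names and numbers are not real data.
sample = (
    "\t".join(COLUMNS) + "\n"
    "Exampleland\t1000000\tExampleland\t5000\t200\t10\t0.5\tNorthern\n"
    "XXX-Example City\t50000\tExampleland\t300\t10\t0\t0\tEquatorial\n"
)

for row in csv.DictReader(io.StringIO(sample), delimiter="\t"):
    print(row["name"], check_population_row(row))
```

Running a check like this before opening a Pull Request could catch rows that would produce an empty scenario, but it does not replace testing with `generate_data.py`.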
