news_clippings

Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Shell
File size: 5KB
Downloads: 0
Upload date: 2021-09-11 19:27:08
Uploader: sh-1993
Description: NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media

File list:
download.sh (862, 2021-09-12)
news_clippings (0, 2021-09-12)
news_clippings\.keep (0, 2021-09-12)
news_clippings_training (0, 2021-09-12)
visual_news (0, 2021-09-12)
visual_news\.keep (0, 2021-09-12)

# [NewsCLIPpings Dataset](https://arxiv.org/abs/2104.05893) [![DOI](https://zenodo.org/badge/355308357.svg)](https://zenodo.org/badge/latestdoi/355308357)

Our dataset with automatically generated out-of-context image-caption pairs in the news media. For inquiries and requests, please contact graceluo@berkeley.edu.

## Requirements
Make sure you are running Python 3.6+.

## Getting Started
1. Request the [VisualNews Dataset](https://github.com/FuxiaoLiu/VisualNews-Repository). Place the files under the `visual_news` folder.
2. Run [`./download.sh`](https://github.com/g-luo/news_clippings/blob/master/download.sh) to download our matches and populate the `news_clippings` folder (place into `news_clippings/data/`).
3. Consider doing analyses of your own using the embeddings we have provided (place into `news_clippings/embeddings/`).

All of the ids and image paths provided in our `data/` folder exactly correspond to those listed in the `data.json` file in VisualNews. Your file structure should look like this:
```
news_clippings
│
└── data/
└── embeddings/
visual_news
│
└── origin/
│   └── data.json
│   ...
└── ...
```
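As a quick sanity check after these steps, a minimal sketch like the one below (the paths follow the examples later in this README; adjust them if your layout differs) confirms the expected files are in place:
```
import os

# Paths referenced elsewhere in this README; adjust if your layout differs.
expected = [
    "visual_news/origin/data.json",
    "news_clippings/data/merged_balanced/val.json",
    "news_clippings/embeddings/clip_image_embeddings/test.pkl",
]

for path in expected:
    status = "found" if os.path.exists(path) else "MISSING"
    print(status, path)
```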
## Data Format
The data is ordered such that every even sample is pristine, and the next sample is its associated falsified sample.
- `id`: the id of the VisualNews sample associated with the caption
- `image_id`: the id of the VisualNews sample associated with the image
- `similarity_score`: the similarity measure used to generate the sample (i.e. `clip_text_image`, `clip_text_text`, `sbert_text_text`, `resnet_place`)
- `falsified`: a binary indicator if the caption / image pair was the original pair in VisualNews or a mismatch we generated
- `source_dataset` (Merged / Balanced only): the index of the sub-split name in `source_datasets`

Here's an example of how you can start using our matches:
```
import json

visual_news_data = json.load(open("visual_news/origin/data.json"))
visual_news_data_mapping = {ann["id"]: ann for ann in visual_news_data}

data = json.load(open("news_clippings/data/merged_balanced/val.json"))
annotations = data["annotations"]

ann = annotations[0]
caption = visual_news_data_mapping[ann["id"]]["caption"]
image_path = visual_news_data_mapping[ann["image_id"]]["image_path"]

print("Caption: ", caption)
print("Image Path: ", image_path)
print("Is Falsified: ", ann["falsified"])
```
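Building on that example and on the pristine/falsified ordering described above, here is a hedged sketch of how you might walk the annotations in pairs and resolve `source_dataset`. It assumes the split file exposes the sub-split names under a top-level `source_datasets` key, which the field description above implies but does not show:
```
import json

data = json.load(open("news_clippings/data/merged_balanced/val.json"))
annotations = data["annotations"]

# Assumption: sub-split names live under a top-level "source_datasets" key.
source_datasets = data.get("source_datasets", [])

# Every even-indexed sample is pristine; the next sample is its falsified match.
for pristine, falsified in list(zip(annotations[0::2], annotations[1::2]))[:5]:
    idx = falsified.get("source_dataset")
    sub_split = source_datasets[idx] if source_datasets and idx is not None else None
    print(pristine["id"], "->", falsified["image_id"], "| sub-split:", sub_split)
```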
## Embeddings
We include the following precomputed embeddings:
- `clip_image_embeddings`: 512-dim image embeddings from [CLIP](https://github.com/openai/CLIP) ViT-B/32. Contains embeddings for samples in all splits.
- `clip_text_embeddings`: 512-dim caption embeddings from [CLIP](https://github.com/openai/CLIP) ViT-B/32. Contains embeddings for samples in all splits.
- `sbert_embeddings`: 768-dim caption embeddings from [SBERT-WK](https://github.com/BinWang28/SBERT-WK-Sentence-Embedding). Contains embeddings for samples in all splits.
- `places_resnet50`: 2048-dim image embeddings using ResNet50 trained on [Places365](https://github.com/CSAILVision/places365). Contains embeddings only for samples in the `scene_resnet_place` split (where [PERSON] entities were not detected in the caption).

The following embedding types were not used in the construction of our dataset, but you may find them useful.
- `facenet_embeddings`: 512-dim embeddings for each face detected in the images using [FaceNet](https://github.com/TIBHannover/cross-modal_entity_consistency/blob/master/visual_descriptors/person_embedding.py). If no faces were detected, returns `None`. Contains embeddings only for samples in the `person_sbert_text_text` split (where [PERSON] entities were detected in the caption).

All embeddings are dictionaries of {id: numpy array} stored in pickle files for train / val / test. You can access the features for each image / caption by its id like so:
```
import pickle

clip_image_embeddings = pickle.load(open("news_clippings/embeddings/clip_image_embeddings/test.pkl", "rb"))
id = 7018***
print(clip_image_embeddings[id])
```
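The CLIP embeddings can also be used to score caption/image pairs directly. The sketch below computes the cosine similarity between a sample's caption embedding and its (possibly mismatched) image embedding; the `val.pkl` filenames follow the train / val / test naming above and are assumptions, not verified paths:
```
import json
import pickle
import numpy as np

# Both dictionaries map VisualNews ids to numpy arrays, as described above.
clip_text_embeddings = pickle.load(open("news_clippings/embeddings/clip_text_embeddings/val.pkl", "rb"))
clip_image_embeddings = pickle.load(open("news_clippings/embeddings/clip_image_embeddings/val.pkl", "rb"))

data = json.load(open("news_clippings/data/merged_balanced/val.json"))
ann = data["annotations"][0]

# ann["id"] keys the caption embedding, ann["image_id"] keys the image embedding.
text_emb = clip_text_embeddings[ann["id"]]
image_emb = clip_image_embeddings[ann["image_id"]]
score = np.dot(text_emb, image_emb) / (np.linalg.norm(text_emb) * np.linalg.norm(image_emb))
print("CLIP text-image cosine similarity:", score, "| falsified:", ann["falsified"])
```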
## Available Upon Request
We have additional metadata available upon request, such as the [spaCy](https://spacy.io) and [REL](https://github.com/informagi/REL) named entities, timestamp, location of the original article content, etc.

We also have `sbert_embeddings_dissecting`, which has an embedding for each token and its weighting from running the "dissecting" setting of [SBERT-WK](https://github.com/BinWang28/SBERT-WK-Sentence-Embedding), available upon request.

## Training
To run the benchmarking experiments we reported in our paper, look at the README for `news_clippings_training/`.

## Citing
If you find our dataset useful for your research, please cite the following paper:
```
@article{luo2021newsclippings,
  title={NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media},
  author={Luo, Grace and Darrell, Trevor and Rohrbach, Anna},
  journal={arXiv:2104.05893},
  year={2021}
}
```
