---
Fundus is:
* **A static news crawler.**
Fundus lets you crawl online news articles with only a few lines of Python code!
Be it from live websites or the CC-NEWS dataset.
* **An open-source Python package.**
Fundus is built on the idea of building something together.
We welcome your contribution to help Fundus [grow](https://github.com/flairNLP/fundus/blob/master/docs/how_to_contribute.md)!
## Quick Start
To install from pip, simply do:
```
pip install fundus
```
Fundus requires Python 3.8+.
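If you are unsure whether your interpreter meets this requirement, a quick check like the one below can help. This is only a minimal sketch; the helper name `meets_fundus_requirement` is made up for illustration and is not part of Fundus.

```python
import sys

# Minimum version taken from the requirement stated above.
MIN_PYTHON = (3, 8)


def meets_fundus_requirement(version_info=sys.version_info):
    """Return True if the given interpreter version is at least 3.8."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    print("Python version OK:", meets_fundus_requirement())
```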
## Example 1: Crawl a bunch of English-language news articles
Let's use Fundus to crawl 2 articles from publishers based in the US.
```python
from fundus import PublisherCollection, Crawler
# initialize the crawler for news publishers based in the US
crawler = Crawler(PublisherCollection.us)
# crawl 2 articles and print
for article in crawler.crawl(max_articles=2):
    print(article)
```
That's it!
If you run this code, it should print out something like this:
```console
Fundus-Article:
- Title: "Feinstein's Return Not Enough for Confirmation of Controversial New [...]"
- Text: "Democrats jammed three of President Joe Biden's controversial court nominees
through committee votes on Thursday thanks to a last-minute [...]"
- URL: https://freebeacon.com/politics/feinsteins-return-not-enough-for-confirmation-of-controversial-new-hampshire-judicial-nominee/
- From: FreeBeacon (2023-05-11 18:41)
Fundus-Article:
- Title: "Northwestern student government freezes College Republicans funding over [...]"
- Text: "Student government at Northwestern University in Illinois "indefinitely" froze
the funds of the university's chapter of College Republicans [...]"
- URL: https://www.foxnews.com/us/northwestern-student-government-freezes-college-republicans-funding-poster-critical-lgbtq-community
- From: FoxNews (2023-05-09 14:37)
```
This printout tells you that you successfully crawled two articles!
For each article, the printout details:
- the "Title" of the article, i.e. its headline
- the "Text", i.e. the main article body text
- the "URL" from which it was crawled
- the news source it is "From"
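To work with these fields in code rather than through the printout, one option is to copy them into a plain dictionary. The sketch below is not taken from the Fundus documentation: the attribute names `title`, `plaintext`, and `publishing_date` are assumptions about the article object (Tutorial 3 describes the actual interface), and `summarize` is a hypothetical helper. It reads attributes with `getattr` and a default, so a missing attribute yields `None` instead of raising.

```python
from types import SimpleNamespace


def summarize(article, max_chars=80):
    """Copy a few fields of an article-like object into a dict.

    The attribute names are assumptions, not confirmed Fundus API.
    """
    text = getattr(article, "plaintext", None)
    if text is not None and len(text) > max_chars:
        # Shorten long bodies, similar to the "[...]" in the printout.
        text = text[:max_chars].rstrip() + " [...]"
    return {
        "title": getattr(article, "title", None),
        "text": text,
        "publishing_date": getattr(article, "publishing_date", None),
    }


# Stand-in object so the sketch runs without network access.
demo = SimpleNamespace(title="Example headline", plaintext="Body text. " * 20)
print(summarize(demo, max_chars=20))
```

With Fundus installed, you would pass the objects yielded by `crawler.crawl(...)` instead of the stand-in.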
## Example 2: Crawl a specific news source
Maybe you want to crawl a specific news source instead. Let's crawl news articles from Washington Times only:
```python
from fundus import PublisherCollection, Crawler
# initialize the crawler for Washington Times
crawler = Crawler(PublisherCollection.us.WashingtonTimes)
# crawl 2 articles and print
for article in crawler.crawl(max_articles=2):
    print(article)
```
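If you would rather keep the articles than print them as they arrive, you can collect the crawler's output in a list. The following is a minimal sketch: `collect_articles` is a hypothetical helper that relies only on the `crawl(max_articles=...)` call shown above. A small stand-in crawler is used so the snippet runs offline; with Fundus installed you would pass a real `Crawler` instead.

```python
def collect_articles(crawler, limit=2):
    """Run the crawler and return up to `limit` articles as a list."""
    return list(crawler.crawl(max_articles=limit))


class FakeCrawler:
    """Offline stand-in that mimics the crawl(max_articles=...) call."""

    def crawl(self, max_articles=None):
        produced = 0
        while max_articles is None or produced < max_articles:
            produced += 1
            yield f"article-{produced}"


articles = collect_articles(FakeCrawler(), limit=3)
print(len(articles), "articles collected")
```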
## Example 3: Crawl articles from CC-NEWS
If you're not familiar with CC-NEWS, check out their [paper](https://paperswithcode.com/dataset/cc-news).
```python
from fundus import PublisherCollection, CCNewsCrawler
# initialize the crawler for news publishers based in the US
crawler = CCNewsCrawler(*PublisherCollection.us)
# crawl 2 articles and print
for article in crawler.crawl(max_articles=2):
    print(article)
```
## Tutorials
We provide **quick tutorials** to get you started with the library:
1. [**Tutorial 1: How to crawl news with Fundus**](https://github.com/flairNLP/fundus/blob/master/docs/1_getting_started.md)
2. [**Tutorial 2: How to crawl articles from CC-NEWS**](https://github.com/flairNLP/fundus/blob/master/docs/2_crawl_from_cc_news.md)
3. [**Tutorial 3: The Article Class**](https://github.com/flairNLP/fundus/blob/master/docs/3_the_article_class.md)
4. [**Tutorial 4: How to filter articles**](https://github.com/flairNLP/fundus/blob/master/docs/4_how_to_filter_articles.md)
5. [**Tutorial 5: How to search for publishers**](https://github.com/flairNLP/fundus/blob/master/docs/5_how_to_search_for_publishers.md)
If you wish to contribute, check out these tutorials:
1. [**How to contribute**](https://github.com/flairNLP/fundus/blob/master/docs/how_to_contribute.md)
2. [**How to add a publisher**](https://github.com/flairNLP/fundus/blob/master/docs/how_to_add_a_publisher.md)
## Currently Supported News Sources
You can find the publishers currently supported [**here**](https://github.com/flairNLP/fundus/blob/master/docs/supported_publishers.md).
Also: **Adding a new publisher is easy - consider contributing to the project!**
## Contact
Please email your questions or comments to [**Max Dallabetta**](mailto:max.dallabetta@googlemail.com?subject=[GitHub]%20Fundus).
## Contributing
Thanks for your interest in contributing! There are many ways to get involved;
start with our [contributor guidelines](https://github.com/flairNLP/fundus/blob/master/docs/how_to_contribute.md) and then
check these [open issues](https://github.com/flairNLP/fundus/issues) for specific tasks.
## License
[MIT](https://github.com/flairNLP/fundus/blob/master/LICENSE)