# @nberlette/f1
## Scraping photos of the Las Vegas Formula 1 track construction
This is an autonomous image scraper developed using [TypeScript], [Deno], and
[GitHub Actions]. It was purpose-built to document the historic [Formula 1][formula1]
track construction in [Las Vegas, Nevada][formula1-official-site], slated to host the
inaugural Heineken Silver Grand Prix on November 18th, 2023. The images will be stitched
together into timelapse videos of the track's lifecycle, from _start_ to _finish_.
> **Note**: this video was created with images from **2023-08-15** - **2023-10-12**
---
## About
The first scrape happened on June 3rd, 2023. As of October 18th, 2023, the repository
has surpassed **18,500 commits**, equivalent to over **1.2 GB** of image data. Photos are stored
in the `./assets` folder of this repository, and also persisted to a **[Deno KV]**
database backed by [FoundationDB].
The scraped images come from a real-time photo feed, sourced directly
from the official Formula 1 website.
> This project is for educational purposes and is not affiliated with Formula 1.
#### [**Click here for an in-depth explanation of the scrape process**](#scrape-process-step-by-step)
### Tools Used
- [x] [`Deno v1.37.2`][Deno v1.37.2]
  - Rust-based JS runtime, sandboxed, with great TS/TSX support.
  - Provides the tools for network and file system operations.
- [x] [`TypeScript 5.2.2`][TypeScript]
  - Superset of JavaScript featuring advanced static typechecking.
  - Better type safety means more readable and maintainable code.
- [x] [`GitHub Actions`][GitHub Actions]
  - Provides **free** macOS virtual machines powering the scraper.
  - Responsible for scheduled execution of the scraper [workflow].
  - Temporarily stores the image [artifacts](#workflow-artifacts).
- [x] [`Deno KV`][Deno KV Docs] _(currently in beta)_
  - Provides us with global data persistence and caching.
- [x] [`ffmpeg`][ffmpeg] _(timelapse feature is **unstable**)_
  - Leveraged to automatically generate timelapse videos.
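
As a rough illustration of the `ffmpeg` step, the sketch below builds an argument list that turns the dated image folders into a video. It is not the project's actual timelapse code: the `timelapseArgs` helper, the `assets/*/*.jpg` glob, the frame rate, and the output filename are all assumptions made for this example.

```typescript
// Hypothetical helper (not from the repository): build ffmpeg arguments
// for a timelapse. The defaults below are assumptions for illustration.
interface TimelapseOptions {
  pattern?: string; // glob matching the dated image folders
  fps?: number; // how many stills to show per second of video
  output?: string; // name of the generated video file
}

function timelapseArgs(options: TimelapseOptions = {}): string[] {
  const {
    pattern = "assets/*/*.jpg",
    fps = 30,
    output = "timelapse.mp4",
  } = options;
  return [
    "-framerate", String(fps), // read the stills at this rate
    "-pattern_type", "glob", // let ffmpeg expand the glob (not available on Windows builds)
    "-i", pattern,
    "-c:v", "libx264", // encode as H.264
    "-pix_fmt", "yuv420p", // pixel format most players can decode
    output,
  ];
}

// Print the full command line for a 24 fps timelapse.
console.log(["ffmpeg", ...timelapseArgs({ fps: 24 })].join(" "));
```

In Deno, an argument list like this could be passed to `new Deno.Command("ffmpeg", { args })`, which requires the `--allow-run` permission. Because the folders are ISO dates and the files are `HH_MM_SS.jpg`, a lexicographic sort of the matched paths is also chronological.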
---
> AI-generated F1 art created with [SDXL 1.0][sdxl] and the prompt
> `"Formula 1 cars on the Las Vegas Strip"`
---
## How it Works
The entry point is [`main.ts`][main.ts], which is only 3 lines of code. It is
responsible for invoking the scraper located in [`src/scrape.ts`][src-scrape.ts],
where the majority of the work happens, and is run roughly every 10 minutes by a
GitHub Action defined by the workflow in [`main.yml`][workflow].
### Assets and Data
Each image is saved as a `JPEG` file named after its capture time in **UTC**. For
example, an image captured at `2023-07-09T04:28:57` would be saved as
[`./assets/2023-07-09/04_28_57.jpg`](https://github.com/nberlette/f1/blob/main/assets/2023-07-09/04_28_57.jpg?raw=true).
The latest image is always saved as [`./assets/latest.jpg`][latest-img] for easy access.
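
The naming scheme can be sketched as a small pure function. This is a minimal sketch, not the `Image` API from the repository: `assetPath` and its `root` parameter are hypothetical names used only for illustration.

```typescript
// Hypothetical helper (not from the repository): derive the asset path for
// an image from its capture time, using the UTC date and time components.
function assetPath(capturedAt: Date, root = "./assets"): string {
  // toISOString() always reports UTC, e.g. "2023-07-09T04:28:57.000Z".
  const iso = capturedAt.toISOString();
  const date = iso.slice(0, 10); // "2023-07-09"
  const time = iso.slice(11, 19).replace(/:/g, "_"); // "04_28_57"
  return `${root}/${date}/${time}.jpg`;
}

console.log(assetPath(new Date("2023-07-09T04:28:57Z")));
// → ./assets/2023-07-09/04_28_57.jpg
```

Grouping files into one folder per UTC day keeps each directory small and makes the paths sort chronologically.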
### Scrape Process, Step-by-Step
1. GitHub Actions runs the [scrape workflow][workflow] every ~10 minutes, depending on traffic.
2. The runner checks out the repo, installs [Deno][deno], and prepares to scrape.
3. `deno task scrape` is executed, which runs the [`main.ts`][main.ts] file.
4. [`main.ts`][main.ts] imports [`scrape()`][function-scrape] from [`src/scrape.ts`][src-scrape.ts],
   which defines two inner functions, [**`read`**][function-read] and [**`write`**][function-write].
   - Once it has checked that `import.meta.main` is set, the following steps are taken:
     1. **READ**: [`read()`][function-read] is called with [`IMAGE_URL`][const-image-url].
        - Internally, the [Fetch API] is used to download the image.
          If the request fails, it will be retried up to [`ATTEMPTS`][const-attempts]
          times, with a short pause between each successive attempt.
        - If all attempts are exhausted without success, the run will **terminate**.
        - Otherwise, a new instance of the [`Image`][src-image.ts] class is returned.
     2. **WRITE**: [`write()`][function-write] is called, with the [`Image`][src-image.ts]
        as its only argument. Before writing, however, it runs through some checks:
        1. The [`Image.hash`][image-hash] is checked against the hash "table" in [Deno KV].
           If an entry exists, the image is stale and won't make it any further. If Deno KV
           is unavailable, the image data is checked against [`latest.jpg`][latest-img] via
           a timing-safe equality comparison, avoiding exposure to timing-based attacks.
           - If they are equal, the image has not updated at the origin. The process
             starts over at **step 4** and repeats until a new image is found.
           - If the maximum number of [`ATTEMPTS`][const-attempts] is reached and
             no new image was found, the job **terminates unsuccessfully**.
        2. If we've made it this far, we have a fresh image and we need to store it.
           - [`Image.write()`][image-write] persists the image to [Deno KV].
             > The key is generated by the `Image` API, using the image timestamp.
           - The image timestamp is indexed with its unique SHA-256 hash in [Deno KV].
             > This prevents later scrapes from duplicating this image. It also means
             > if you try to instantiate a new Image from an old hash, it will always
             > return the original image and its original timestamp.
           - [`Image.writeFile()`][image-writefile] saves it to the local file system.
             > The filename is generated by the `Image` API, using the image timestamp.
           - `Image.writeFile()` also saves it to [`./assets/latest.jpg`][latest-img].
        3. The [`setOutput`][src-helpers-actions.ts] helper pipes the image metadata
           to the GitHub Actions runner, to be used in the commit step.
     3. The scrape is now complete and the runner proceeds to the final steps.
5. The photo is stored as a workflow artifact for **90 days**.
6. All changes are committed + pushed to the repository.
7. The job finishes successfully and the runner is terminated. Hooray!
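
The READ and WRITE checks described above can be sketched roughly as follows. This is not the repository's code: `readWithRetry`, `sha256Hex`, `timingSafeEqual`, and `isFresh` are hypothetical names, the attempt count and delay are assumed values, the fetcher is injected so the example runs without network access, and a plain `Set` stands in for the Deno KV hash "table".

```typescript
// Illustrative stand-ins only; none of these names come from the repository.
type Fetcher = (url: string) => Promise<Uint8Array>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// READ: call `fetcher` up to `attempts` times, pausing between failures.
async function readWithRetry(
  url: string,
  fetcher: Fetcher,
  attempts = 3, // assumed value for ATTEMPTS
  delayMs = 1000, // assumed length of the pause
): Promise<Uint8Array> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fetcher(url);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) await sleep(delayMs);
    }
  }
  throw new Error(`No image after ${attempts} attempts: ${String(lastError)}`);
}

// Hex-encoded SHA-256 digest, computed with the Web Crypto API.
async function sha256Hex(data: Uint8Array): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", data);
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, "0"))
    .join("");
}

// Fallback check: for equal-length inputs every byte is visited, so the
// running time does not reveal where the first difference is.
function timingSafeEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a[i] ^ b[i];
  return diff === 0;
}

// WRITE check: an image is fresh only if its hash has not been seen before.
async function isFresh(image: Uint8Array, seen: Set<string>): Promise<boolean> {
  const hash = await sha256Hex(image);
  if (seen.has(hash)) return false; // stale: this image was already stored
  seen.add(hash); // index the hash so later scrapes skip it
  return true;
}

// Usage with a fake fetcher that fails once, then returns three bytes.
async function demo(): Promise<void> {
  let calls = 0;
  const flaky: Fetcher = async () => {
    if (++calls < 2) throw new Error("temporary failure");
    return new Uint8Array([1, 2, 3]);
  };
  const seen = new Set<string>();
  const image = await readWithRetry("https://example.com/image.jpg", flaky, 3, 10);
  console.log(await isFresh(image, seen)); // true: first time this hash is seen
  console.log(await isFresh(image, seen)); // false: duplicate of the stored image
}

demo();
```

In a real run the fetcher would wrap `fetch()` and the `Set` would be replaced by lookups against Deno KV, with `timingSafeEqual` serving only as the fallback comparison against `latest.jpg`.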
---
## Previous Snapshots
- October 17th
- October 15th
- October 13th
- October 11th
- October 9th
- October 7th
- October 5th
- October 3rd
- October 2nd
- September 28th
- September 24th
- September 20th
- September 16th
- September 12th
- September 8th
- September 4th
---
[latest-snapshot]: #latest-snapshot "View the latest snapshot from the construction site"
[timelapse-preview]: #timelapse-preview "View a short sample timelapse video, created from the last 8 weeks of photos."
[previous-snapshots]: #previous-snapshots "View some of the previously captured snapshots"
[about]: #about "Interested in how it works? Click here for more info!"
[MIT]: https://nick.mit-license.org "MIT License"
[Nicholas Berlette]: https://github.com/nberlette "Nicholas Berlette's GitHub profile"
[nberlette]: https://github.com/nberlette "Nicholas Berlette's GitHub profile"
[n.berlette.com/f1]: https://n.berlette.com/f1 "View the GitHub Pages site at n.berlette.com/f1"
[Star on GitHub]: https://github.com/nberlette/f1/stargazers "Star this project on GitHub!"
[readme]: https://github.com/nberlette/f1#readme "View the README on GitHub"
[workflow]: https://github.com/nberlette/f1/blob/main/.github/workflows/main.yml "GitHub Actions workflow file"
[assets]: https://github.com/nberlette/f1/tree/main/assets "View the 'assets' folder on GitHub"
[main.ts]: https://github.com/nberlette/f1/blob/main/main.ts "View the source code for the 'main.ts' file on GitHub"
[src-scrape.ts]: https://github.com/nberlette/f1/blob/main/src/scrape.ts "View the source code for the 'src/scrape.ts' file on GitHub"
[src-helpers.ts]: https://github.com/nberlette/f1/blob/main/src/helpers.ts "View the source code for the 'src/helpers.ts' file on GitHub"
[src-constants.ts]: https://github.com/nberlette/f1/blob/main/src/constants.ts "View the source code for the 'src/constants.ts' file on GitHub"
[src-image.ts]: https://github.com/nberlette/f1/blob/main/src/image.ts "View the source code for the 'src/image.ts' file on GitHub"
[image-hash]: https://github.com/nberlette/f1/blob/main/src/image.ts "View the source code for the 'src/image.ts' file on GitHub"
[src-helpers-actions.ts]: https://github.com/nberlette/f1/blob/main/src/helpers/actions.ts "View the source code for the 'src/helpers/actions.ts' file on GitHub"
[const-attempts]: https://github.com/nberlette/f1/blob/main/src/constants.ts#L37 "View the source for the 'ATTEMPTS' constant on GitHub"
[const-delay]: https://github.com/nberlette/f1/blob/main/src/constants.ts#L30 "View the source for the 'DELAY' constant on GitHub"
[const-image-url]: https://github.com/nberlette/f1/blob/main/src/constants.ts#L47 "View the source for the 'IMAGE_URL' constant on GitHub"
[image-write]: https://github.com/nberlette/f1/blob/main/src/image.ts#L205 "View the source for the 'Image.write()' method on GitHub"
[image-writefile]: https://github.com/nberlette/f1/blob/main/src/image.ts#L250 "View the source for the 'Image.writeFile()' method on GitHub"
[function-scrape]: https://github.com/nberlette/f1/blob/main/src/scrape.ts#L30 "View the source for the 'scrape()' function on GitHub"
[function-read]: https://github.com/nberlette/f1/blob/main/src/scrape.ts#L51 "View the source for the 'read()' function on GitHub"
[function-write]: https://github.com/nberlette/f1/blob/main/src/scrape.ts#L96 "View the source for the 'write()' function on GitHub"
[latest-img]: https://github.com/nberlette/f1/blob/main/assets/latest.jpg?raw=true&no-cache&cache=no-cache "The latest snapshot of the Formula 1 track construction site in Las Vegas, Nevada."
[artwork-1]: ./docs/img/f1_artwork_1.png "AI-Generated artwork of a Formula 1 car racing down the Las Vegas Strip"
[artwork-2]: ./docs/img/f1_artwork_2.png "AI-generated artwork of a Formula 1 car racing down the Las Vegas Strip"
[artwork-3]: ./docs/img/f1_artwork_3.png "AI-generated artwork of a Formula 1 car racing down the Las Vegas Strip"
[artwork-4]: ./docs/img/f1_artwork_4.png "AI-generated artwork of a Formula 1 car racing down the Las Vegas Strip"
[Fetch API]: https://mdn.io/Fetch%20API
[GitHub Actions]: https://github.com/features/actions "GitHub Actions Official Landing Page"
[sdxl]: https://github.com/Stability-AI/stablediffusion "Stable Diffusion XL 1.0"
[ffmpeg]: https://ffmpeg.org "The FFmpeg Project Official Website"
[Track Layout]: https://www.f1lasvegasgp.com/track-layout "Formula 1's Las Vegas Grand Prix Track Layout"
[Formula 1]: https://www.formula1.com
[formula1]: https://www.formula1.com/en/latest/article.las-vegas-to-host-formula-1-grand-prix-from-2022.3Z1Z3ZQZw8Zq8QZq8QZq8Q.html "Formula 1's announcement of the Las Vegas Grand Prix"
[formula1-official-site]: https://www.formula1.com/en/racing/2023/Las_Vegas.html "Official Site for the Formula 1 Heineken Silver Las Vegas Grand Prix 2023"
[oxblue]: https://oxblue.com "OxBlue Construction Cameras"
[typescript]: https://typescriptlang.org "TypeScript's Official Website"
[deno]: https://deno.land "Deno's Official Website - A secure runtime for JavaScript and TypeScript"
[Deno KV]: https://deno.land/manual@v1.36.0/runtime/kv "Deno KV - key-value store built directly into the Deno runtime."
[FoundationDB]: https://www.foundationdb.org "FoundationDB's Official Website"
[Deno v1.37.2]: https://deno.land/manual@v1.37.2
[Deno KV Docs]: https://docs.deno.com/kv/manual