# Video summary as a service
> _This is not an official Google product. This is a tutorial aiming at giving you ideas..._
## Hello!
Dear developers,
- Do you like the adage _"a picture is worth a thousand words"_? I do!
- Let's check if it also works for _"a picture is worth a thousand frames"_.
- In this tutorial, you'll see the following:
  - how to understand the content of a video in a blink,
  - in fewer than 300 lines of Python (3.7) code.
Here is a visual summary example, generated from a 2'42" video made of 35 sequences (shots):

> Note: The summary is a grid where each cell is a frame representing a video shot.
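
To make the grid idea concrete, here is a minimal sketch of one possible layout rule. It assumes a near-square grid filled row by row; the helper names `grid_size` and `cell_origins` are hypothetical and are not taken from the tutorial's code.

```python
import math
from typing import List, Tuple


def grid_size(shot_count: int) -> Tuple[int, int]:
    """Return (columns, rows) for a near-square grid holding shot_count cells."""
    if shot_count < 1:
        raise ValueError("shot_count must be positive")
    cols = math.ceil(math.sqrt(shot_count))
    rows = math.ceil(shot_count / cols)
    return cols, rows


def cell_origins(shot_count: int, cell_w: int, cell_h: int) -> List[Tuple[int, int]]:
    """Return the top-left pixel (x, y) of each cell, filled row by row."""
    cols, _ = grid_size(shot_count)
    return [((i % cols) * cell_w, (i // cols) * cell_h) for i in range(shot_count)]


# A 35-shot video, as in the example above, fits in a 6 x 6 grid
# (assumed layout rule; the last cell stays empty).
print(grid_size(35))
```

Other layouts (for example a fixed number of columns) would work just as well; the near-square rule is only one simple choice.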
## Objectives
This tutorial has 2 objectives, 1 practical and 1 technical:
- Automatically generate visual summaries of videos
- Build a processing pipeline with these properties:
  - managed (always ready and easy to set up)
  - scalable (able to ingest several videos in parallel)
  - not costing anything when not used
## Tools
A few tools are enough:
- Storage space for videos and results
- A serverless solution to run the code
- A machine learning model to analyze videos
- A library to extract frames from videos
- A library to generate the visual summaries
## Architecture
Here is a possible architecture using 3 Google Cloud services ([Cloud Storage](https://cloud.google.com/storage/docs), [Cloud Functions](https://cloud.google.com/functions/docs), and [Video Intelligence API](https://cloud.google.com/video-intelligence/docs)):
The processing pipeline follows these steps:
1. You upload a video to the 1st bucket (a bucket is a storage space in the cloud)
2. The upload event automatically triggers the 1st function
3. The function sends a request to the Video Intelligence API to detect the shots
4. The Video Intelligence API analyzes the video and uploads the results (annotations) to the 2nd bucket
5. The upload event triggers the 2nd function
6. The function downloads both annotation and video files
7. The function renders and uploads the summary to the 3rd bucket
8. The video summary is ready!
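
The hand-off between the two functions can be sketched with plain string handling. This is only an illustration under stated assumptions: the event is taken to be a dictionary with `bucket` and `name` keys, the annotation file is assumed to mirror the video path with a `.json` suffix, and the function names `shot_detection_uris` and `video_location` are hypothetical.

```python
from typing import Dict, Tuple


def shot_detection_uris(event: Dict[str, str], annotation_bucket: str) -> Tuple[str, str]:
    """Steps 2-4: map a video upload event to the video URI and the results URI."""
    video_uri = f"gs://{event['bucket']}/{event['name']}"
    # Assumed convention: mirror the source bucket and path, then append ".json"
    annotation_uri = f"gs://{annotation_bucket}/{event['bucket']}/{event['name']}.json"
    return video_uri, annotation_uri


def video_location(annotation_event: Dict[str, str]) -> Tuple[str, str]:
    """Steps 5-6: recover the video bucket and object name from an annotation upload event."""
    suffix = ".json"
    bucket, _, rest = annotation_event["name"].partition("/")
    if not rest.endswith(suffix):
        raise ValueError(f"Unexpected annotation name: {annotation_event['name']}")
    return bucket, rest[: -len(suffix)]


# Hypothetical bucket and object names, for illustration only
event = {"bucket": "b1-videos_example", "name": "clips/demo.mp4"}
print(shot_detection_uris(event, "b2-annotations_example"))
```

Encoding the source location in the annotation path is one way to let the 2nd function find the video without any shared state between the two functions.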
## Python libraries
Open source client libraries let you interface with Google Cloud services in idiomatic Python. You'll use the following:
- `Cloud Storage`
  - To manage downloads and uploads
  - <https://pypi.org/project/google-cloud-storage>
- `Video Intelligence API`
  - To analyze videos
  - <https://pypi.org/project/google-cloud-videointelligence>
Here is a choice of 2 additional Python libraries for the graphical needs:
- `OpenCV`
  - To extract video frames
  - There's even a headless version (without GUI features), which is ideal for a service
  - <https://pypi.org/project/opencv-python-headless>
- `Pillow`
  - To generate the visual summaries
  - `Pillow` is a very popular imaging library, both extensive and easy to use
  - <https://pypi.org/project/Pillow>
## Project setup
Assuming you have a Google Cloud account, you can set up the architecture from Cloud Shell with the `gcloud` and `gsutil` commands. This lets you script everything from scratch in a reproducible way.
### Environment variables
```bash
# Project
PROJECT_NAME="Visual Summary"
PROJECT_ID="visual-summary-REPLACE_WITH_UNIQUE_SUFFIX"
# Cloud Storage region (https://cloud.google.com/storage/docs/locations)
GCS_REGION="europe-west1"
# Cloud Functions region (https://cloud.google.com/functions/docs/locations)
GCF_REGION="europe-west1"
# Source
GIT_REPO="cherry-on-py"
PROJECT_SRC=~/$PROJECT_ID/$GIT_REPO/gcf_video_summary
# Cloud Storage buckets (environment variables)
export VIDEO_BUCKET="b1-videos_${PROJECT_ID}"
export ANNOTATION_BUCKET="b2-annotations_${PROJECT_ID}"
export SUMMARY_BUCKET="b3-summaries_${PROJECT_ID}"
```
> Note: You can use your GitHub username as a unique suffix.
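
The three bucket variables are exported, presumably so that the Python code can read them at run time. As a minimal sketch under that assumption, the hypothetical helper `bucket_settings` below reads them and fails early with a clear message if one is missing:

```python
import os
from typing import Dict, Mapping

BUCKET_VARIABLES = ("VIDEO_BUCKET", "ANNOTATION_BUCKET", "SUMMARY_BUCKET")


def bucket_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the bucket names found in the environment, or raise if any is missing."""
    missing = [name for name in BUCKET_VARIABLES if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in BUCKET_VARIABLES}


# Example with an explicit mapping (hypothetical values) instead of the real environment
example_env = {
    "VIDEO_BUCKET": "b1-videos_example",
    "ANNOTATION_BUCKET": "b2-annotations_example",
    "SUMMARY_BUCKET": "b3-summaries_example",
}
print(bucket_settings(example_env))
```

Calling `bucket_settings()` without arguments reads the variables of the current shell session.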
### New project
```bash
gcloud projects create $PROJECT_ID \
    --name="$PROJECT_NAME" \
    --set-as-default
```
```text
Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/PROJECT_ID].
Waiting for [operations/cp...] to finish...done.
Enabling service [cloudapis.googleapis.com] on project [PROJECT_ID]...
Operation "operations/acf..." finished successfully.
Updated property [core/project] to [PROJECT_ID].
```
### Billing account
```bash
# Option 1: you have a single billing account
BILLING_ACCOUNT=$(gcloud beta billing accounts list \
    --format 'value(name)')
# Option 2: you have multiple billing accounts (select one by display name)
BILLING_ACCOUNT=$(gcloud beta billing accounts list \
    --format 'value(name)' \
    --filter "displayName='My Billing Account'")
# Link the project with the selected billing account
gcloud beta billing projects link $PROJECT_ID --billing-account $BILLING_ACCOUNT
```
```text
billingAccountName: billingAccounts/XXXXXX-YYYYYY-ZZZZZZ
billingEnabled: true
name: projects/PROJECT_ID/billingInfo
projectId: PROJECT_ID
```
### Buckets
```bash
# Create buckets with uniform bucket-level access
gsutil mb -b on -c regional -l $GCS_REGION gs://$VIDEO_BUCKET
gsutil mb -b on -c regional -l $GCS_REGION gs://$ANNOTATION_BUCKET
gsutil mb -b on -c regional -l $GCS_REGION gs://$SUMMARY_BUCKET
```
```text
Creating gs://VIDEO_BUCKET/...
Creating gs://ANNOTATION_BUCKET/...
Creating gs://SUMMARY_BUCKET/...
```
You can check what it looks like in the [Cloud Console](https://console.cloud.google.com/storage/browser).

### Service account
Create a service account. It is for development purposes only (not needed for production) and provides you with credentials to run your code locally.
```bash
mkdir ~/$PROJECT_ID
cd ~/$PROJECT_ID
SERVICE_ACCOUNT_NAME="dev-service-account"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
gcloud iam service-accounts create $SERVICE_ACCOUNT_NAME
gcloud iam service-accounts keys create ~/$PROJECT_ID/key.json --iam-account $SERVICE_ACCOUNT
```
```text
Created service account [SERVICE_ACCOUNT_NAME].
created key [...] of type [json] as [~/PROJECT_ID/key.json] for [SERVICE_ACCOUNT]
```
Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable and check that it points to the service account key. When you run the application code in the current shell session, client libraries will use these credentials for authentication. If you open a new shell session, set the variable again.
```bash
export GOOGLE_APPLICATION_CREDENTIALS=~/$PROJECT_ID/key.json
cat $GOOGLE_APPLICATION_CREDENTIALS
```
```text
{
"type": "service_account",
"project_id": "PROJECT_ID",
"private_key_id": "...",
"private_key": "-----BEGIN PRIVATE KEY-----\n...",
"client_email": "SERVICE_ACCOUNT",
...
}
```
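The same check can be done from Python before running any application code. The sketch below uses only the standard library; the helper name `service_account_email` is hypothetical, and it relies only on the `type` and `client_email` fields shown in the key file above.

```python
import json
import os
from typing import Mapping


def service_account_email(environ: Mapping[str, str] = os.environ) -> str:
    """Return the client_email of the key file that GOOGLE_APPLICATION_CREDENTIALS points to."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(os.path.expanduser(path)) as key_file:
        key = json.load(key_file)
    if key.get("type") != "service_account":
        raise ValueError(f"{path} is not a service account key")
    return key["client_email"]
```

Calling `service_account_email()` in the current shell session should return the address stored in `SERVICE_ACCOUNT`, assuming the variable was set as shown above.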
Authorize the service account to access the buckets:
```bash
IAM_BINDING="serviceAccount:${SERVICE_ACCOUNT}:roles/storage.objectAdmin"
gsutil iam ch $IAM_BINDING gs://$VIDEO_BUCKET
gsutil iam ch $IAM_BINDING gs://$ANNOTATION_BUCKET
gsutil iam ch $IAM_BINDING gs://$SUMMARY_BUCKET
```
### APIs
A few APIs are enabled by default:
```bash
gcloud services list
```
```text
NAME TITLE
bigquery.googleapis.com BigQuery API
bigquerystorage.googleapis.com BigQuery Storage API
cloudapis.googleapis.com Google Cloud APIs
clouddebugger.googleapis.com Cloud Debugger API
cloudtrace.googleapis.com Cloud Trace API
datastore.googleapis.com Cloud Datastore API
logging.googleapis.com Cloud Logging API
monitoring.googleapis.com Cloud Monitoring API
servicemanagement.googleapis.com Service Management API
serviceusage.googleapis.com Service Usage API
sql-component.googleapis.com Cloud SQL
storage-api.googleapis.com Go