I was looking for a simple command-line script to upload files to Google Drive and stumbled upon gdrive, a command-line utility for managing files in Google Drive. It was too heavyweight for my requirements, though, and had dependencies that I was not able to install on my servers. So I decided to write one myself that catered to my needs.

Google offers quite a number of REST APIs for integrating with Google Drive, and they are really simple to use. In this post I use the Google Drive v2 APIs to upload files and folders to Google Drive. The complete script is available to download on GitHub (https://github.com/labbots/google-drive-upload).

Dependencies

My intention was to write a script with minimal dependencies. Most of the ones it does have are available by default on most Linux platforms. The script requires the following packages:

  • curl
  • sed (Stream editor)
  • find command
  • awk
  • getopt
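
If you want to confirm these tools are present before running anything, probing each one with `command -v` is usually enough. The snippet below is only a minimal sketch; the `check_dependencies` function name is my own invention and is not part of the script.

```bash
# Hypothetical helper (not part of the original script): report every
# required command that cannot be found on PATH.
check_dependencies() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" > /dev/null 2>&1; then
      echo "Missing dependency: $cmd" >&2
      missing=1
    fi
  done
  # Return 0 when everything was found, 1 otherwise
  return "$missing"
}

# Usage: check_dependencies curl sed find awk getopt || exit 1
```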

Create Google API key

Accessing Google APIs requires authentication credentials, which can be created in the Google Developers Console. Make sure the Google Drive API is enabled for the project created in the console. These credentials (client ID and client secret) are used by the script to generate the OAuth 2.0 token needed to access the Google Drive APIs.

Bash script

To access and manage the user's Google Drive seamlessly, the script relies on device authorization. The V3 Google APIs restrict the scopes available to device authorization, so they cannot be used to upload or manage files in Google Drive through the device code authorization workflow. So I decided to use the V2 APIs to achieve my requirements (I know older versions will be deprecated and are not a good idea to develop against, but this is a weekend project and it works :P )

The idea is to have a script that takes a filename and a folder name as arguments and uploads the file to the specified folder in Google Drive. I wanted all other configuration, such as API keys and refresh tokens, to be stored in a config file that is set up during the first run of the script.

Step 1 : Parsing arguments and options passed to the script.

To achieve this I used the getopt utility, which is available on most Linux distributions. It can parse both short and long options. The options can be parsed as shown below.


PROGNAME=${0##*/}
SHORTOPTS="vhr:C:z:"
LONGOPTS="verbose,help,create-dir:,root-dir:,config:"

set -o errexit -o noclobber -o pipefail -o nounset
OPTS=$(getopt -s bash --options "$SHORTOPTS" --longoptions "$LONGOPTS" --name "$PROGNAME" -- "$@")

eval set -- "$OPTS"

# Initialise every variable that is read later, because nounset aborts on unset ones
VERBOSE=false
HELP=false
CONFIG=""
ROOTDIR=""
ROOT_FOLDER=""
FOLDERNAME=""
curl_args=""

while true; do
  case "$1" in
    # --progress-bar is the full name of the curl option
    -v | --verbose )    VERBOSE=true; curl_args="--progress-bar"; shift ;;
    -h | --help )       usage; shift ;;
    -C | --create-dir ) FOLDERNAME="$2"; shift 2 ;;
    -r | --root-dir )   ROOTDIR="$2"; ROOT_FOLDER="$2"; shift 2 ;;
    -z | --config )     CONFIG="$2"; shift 2 ;;
    -- ) shift; break ;;
    * )  break ;;
  esac
done
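
The loop above calls a `usage` function that is not shown in this post. Something along the following lines would fit; treat it as a sketch. Only the option names come from the parsing code, while the descriptions and the `FILE [FOLDER]` argument layout are my assumptions.

```bash
# Hypothetical usage() helper. The option names match SHORTOPTS and
# LONGOPTS above; the descriptions are assumptions made for illustration.
PROGNAME=${0##*/}

usage() {
  printf '%s\n' \
    "Usage: $PROGNAME [options] FILE [FOLDER]" \
    "" \
    "Options:" \
    "  -C, --create-dir NAME  Folder to create (or reuse) for the upload" \
    "  -r, --root-dir ID      ID of the Drive folder to use as the root" \
    "  -z, --config FILE      Config file to read" \
    "  -v, --verbose          Show upload progress" \
    "  -h, --help             Show this message"
}
```

A real script would most likely exit right after printing this text instead of continuing to parse options.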

The default config parameters are stored in a config file in the home directory of the user for future use.


if [ -e "$HOME/.googledrive.conf" ]
then
    . "$HOME/.googledrive.conf"
fi

# Restrict permissions so that a newly created config file is readable only by its owner
old_umask=$(umask)
umask 0077

if [ -z "${ROOT_FOLDER:-}" ]
then
    read -r -p "Root Folder ID (Default: root): " ROOT_FOLDER
    # Treat empty input or any capitalisation of "root" as the Drive root
    if [ -z "$ROOT_FOLDER" ] || [ "$(echo "$ROOT_FOLDER" | tr '[:upper:]' '[:lower:]')" = "root" ]
    then
        ROOT_FOLDER="root"
        echo "ROOT_FOLDER=$ROOT_FOLDER" >> "$HOME/.googledrive.conf"
    else
        # Accept only IDs made of exactly 28 letters, digits or underscores
        # (expr patterns are anchored at the start, so no leading ^ is needed)
        if expr "$ROOT_FOLDER" : '[A-Za-z0-9_]\{28\}$' > /dev/null
        then
            echo "ROOT_FOLDER=$ROOT_FOLDER" >> "$HOME/.googledrive.conf"
        else
            echo "Invalid root folder id"
            exit 1
        fi
    fi
fi

if [ -z "${CLIENT_ID:-}" ]
then
    read -r -p "Client ID: " CLIENT_ID
    echo "CLIENT_ID=$CLIENT_ID" >> "$HOME/.googledrive.conf"
fi

if [ -z "${CLIENT_SECRET:-}" ]
then
    read -r -p "Client Secret: " CLIENT_SECRET
    echo "CLIENT_SECRET=$CLIENT_SECRET" >> "$HOME/.googledrive.conf"
fi
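
Since the config file is sourced as shell code, it may be worth quoting whatever gets written to it. Here is a small sketch of how the repeated append lines could be folded into one helper. The `save_config` function and the `CONFIG_FILE` variable are hypothetical names of mine, and `printf %q` is a bash feature.

```bash
# Hypothetical helper: append KEY=VALUE to the config file, shell-quoting
# the value so that sourcing the file later assigns it verbatim.
CONFIG_FILE="${CONFIG_FILE:-$HOME/.googledrive.conf}"

save_config() {
  local key="$1" value="$2"
  # %q escapes spaces, quotes, $ and other special characters
  printf '%s=%q\n' "$key" "$value" >> "$CONFIG_FILE"
}

# Usage: save_config CLIENT_ID "$CLIENT_ID"
```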

Step 2 : Generate access token.

We need an access token to call the APIs. The script uses the OAuth device code authorization workflow to generate an access token and a refresh token for the user. When the user runs the script for the first time, they are asked to authorize the application so that we can get an access token from Google on their behalf.
Google's REST APIs can be called from a shell script with the curl command, and they respond with JSON. So we need a simple parser to retrieve values from the JSON object in the response. The following function lets us extract the required value from a JSON response.


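# Method to extract data from json response
# $1: key to look up
# $2: (optional) which match to print when the key occurs more than once
function jsonValue() {
    KEY=$1
    # Default to empty so the function also works under "set -o nounset"
    num=${2:-}
    awk -F"[,:}][^://]" '{for(i=1;i<=NF;i++){if($i~/\042'$KEY'\042/){print $(i+1)}}}' | tr -d '"' | sed -n "${num}p" | sed -e 's/[}]*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's/[,]*$//'
}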
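
To see what the function does, here it is applied to a made-up response. The JSON values are placeholders of mine, not real codes, and the function is repeated so that the snippet runs on its own. Note that this sketch assumes the response is laid out with one key per line, as in the sample. From what I can tell, the awk field separator does not cope with several keys packed onto a single line.

```bash
# Copy of jsonValue from the post so that this example is self-contained
function jsonValue() {
  KEY=$1
  num=${2:-}
  awk -F"[,:}][^://]" '{for(i=1;i<=NF;i++){if($i~/\042'$KEY'\042/){print $(i+1)}}}' | tr -d '"' | sed -n "${num}p" | sed -e 's/[}]*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's/[,]*$//'
}

# Placeholder response in the shape of a device code reply (values invented)
SAMPLE='{
  "device_code": "4/abcd",
  "user_code": "WXYZ-1234",
  "verification_url": "https://www.google.com/device"
}'

echo "$SAMPLE" | jsonValue user_code          # prints WXYZ-1234
echo "$SAMPLE" | jsonValue verification_url   # prints https://www.google.com/device
```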

To get the application authorized by the user through device code authorization, we need to make an API call to the Google OAuth endpoint as follows:


# SCOPE must be defined before this point; it is not set in this excerpt
RESPONSE=$(curl --silent "https://accounts.google.com/o/oauth2/device/code" --data "client_id=$CLIENT_ID&scope=$SCOPE")
DEVICE_CODE=$(echo "$RESPONSE" | jsonValue "device_code")
USER_CODE=$(echo "$RESPONSE" | jsonValue "user_code")
URL=$(echo "$RESPONSE" | jsonValue "verification_url")

echo -n "Go to $URL and enter $USER_CODE to grant access to this application. Hit enter when done..."
read

# Exchange the device code for an access token and a refresh token
RESPONSE=$(curl --silent "https://accounts.google.com/o/oauth2/token" --data "client_id=$CLIENT_ID&client_secret=$CLIENT_SECRET&code=$DEVICE_CODE&grant_type=http://oauth.net/grant_type/device/1.0")

ACCESS_TOKEN=$(echo "$RESPONSE" | jsonValue access_token)
REFRESH_TOKEN=$(echo "$RESPONSE" | jsonValue refresh_token)

echo "REFRESH_TOKEN=$REFRESH_TOKEN" >> "$HOME/.googledrive.conf"

The resulting access token can be used to call the Google Drive API. The refresh token is stored in the config file so that a new access token can be generated when the current one expires.
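
The refresh step itself is not shown in this post, so here is a rough sketch of what it could look like using the standard OAuth 2.0 `refresh_token` grant. The two function names are hypothetical, and I am assuming the same token endpoint as in the device code exchange above; check Google's current OAuth documentation before relying on it.

```bash
# Build the form body for an OAuth 2.0 refresh_token grant.
refresh_request_body() {
  local client_id="$1" client_secret="$2" refresh_token="$3"
  printf 'client_id=%s&client_secret=%s&refresh_token=%s&grant_type=refresh_token' \
    "$client_id" "$client_secret" "$refresh_token"
}

# Exchange the stored refresh token for a new access token.
# Assumption: the token URL is the one used earlier in the post.
refresh_access_token() {
  curl --silent "https://accounts.google.com/o/oauth2/token" \
    --data "$(refresh_request_body "$CLIENT_ID" "$CLIENT_SECRET" "$REFRESH_TOKEN")"
}

# Usage (needs network access and real credentials):
#   ACCESS_TOKEN=$(refresh_access_token | jsonValue access_token)
```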

Step 3 : Create Directory and Upload file.

Once the access token is generated, the last step is to upload the file to the specified directory or to the root directory of Google Drive. Google Drive identifies items by ID and not by name, so a drive can have two folders with the same name. For my use case, though, I wanted to upload into the existing directory rather than create a new one when a directory with that name already exists. So I first check whether the directory exists. If it does, I use its folder ID; otherwise I create a new folder in Drive and use the ID of that folder.


function createDirectory(){
    DIRNAME="$1"
    ROOTDIR="$2"
    ACCESS_TOKEN="$3"
    FOLDER_ID=""
    # Search query for folders titled DIRNAME, URL-escaped with the sed rules
    # in url_escape.sed (DIR is the directory that holds that file)
    QUERY="mimeType='application/vnd.google-apps.folder' and title='$DIRNAME'"
    QUERY=$(echo "$QUERY" | sed -f "${DIR}/url_escape.sed")

    # Look for an existing folder with that title under ROOTDIR
    SEARCH_RESPONSE=$(/usr/bin/curl \
        --silent \
        -X GET \
        -H "Authorization: Bearer ${ACCESS_TOKEN}" \
        "https://www.googleapis.com/drive/v2/files/${ROOTDIR}/children?orderBy=title&q=${QUERY}&fields=items%2Fid")

    FOLDER_ID=$(echo "$SEARCH_RESPONSE" | jsonValue id)

    # No match: create the folder and use the ID of the new folder
    if [ -z "$FOLDER_ID" ]
    then
        CREATE_FOLDER_POST_DATA="{\"mimeType\": \"application/vnd.google-apps.folder\",\"title\": \"$DIRNAME\",\"parents\": [{\"id\": \"$ROOTDIR\"}]}"
        CREATE_FOLDER_RESPONSE=$(/usr/bin/curl \
            --silent \
            -X POST \
            -H "Authorization: Bearer ${ACCESS_TOKEN}" \
            -H "Content-Type: application/json; charset=UTF-8" \
            -d "$CREATE_FOLDER_POST_DATA" \
            "https://www.googleapis.com/drive/v2/files?fields=id")
        FOLDER_ID=$(echo "$CREATE_FOLDER_RESPONSE" | jsonValue id)
    fi
    echo "$FOLDER_ID"
}
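
The `url_escape.sed` file used to escape the query is not shown in this post. If you don't have it at hand, a small pure-bash percent-encoder could serve the same purpose. This is a stand-in of my own, not the contents of that file, and it is written with ASCII input in mind; folder names with non-ASCII characters would need more care.

```bash
# Hypothetical stand-in for url_escape.sed: percent-encode every character
# except the RFC 3986 unreserved set (letters, digits, . _ ~ -).
urlencode() {
  local LC_ALL=C          # make the character ranges below byte-based
  local s="$1" i c
  for ((i = 0; i < ${#s}; i++)); do
    c="${s:i:1}"
    case "$c" in
      [A-Za-z0-9._~-]) printf '%s' "$c" ;;
      # "'$c" makes printf use the numeric code of the character
      *) printf '%%%02X' "'$c" ;;
    esac
  done
  echo
}

urlencode "title='My Folder'"   # prints title%3D%27My%20Folder%27
```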

So the final step is to upload the file to the specified directory in the drive. I decided to use a resumable upload link so that a failed upload can be resumed instead of restarted from the beginning, which would be an inconvenience for larger files.


function uploadFile(){

    FILE="$1"
    FOLDER_ID="$2"
    ACCESS_TOKEN="$3"
    MIME_TYPE=$(file --brief --mime-type "$FILE")
    SLUG=$(basename "$FILE")
    # stat -c%s is the GNU coreutils way to get the size in bytes
    FILESIZE=$(stat -c%s "$FILE")

    # JSON post data to specify the file name and the folder under which the file is to be created
    postData="{\"mimeType\": \"$MIME_TYPE\",\"title\": \"$SLUG\",\"parents\": [{\"id\": \"$FOLDER_ID\"}]}"

    # Curl command to initiate resumable upload session and grab the location URL.
    # The header name is matched in either case because HTTP/2 responses use lowercase names.
    log "Generating upload link for file $FILE ..."
    uploadlink=$(/usr/bin/curl \
        --silent \
        -X POST \
        -H "Host: www.googleapis.com" \
        -H "Authorization: Bearer ${ACCESS_TOKEN}" \
        -H "Content-Type: application/json; charset=UTF-8" \
        -H "X-Upload-Content-Type: $MIME_TYPE" \
        -H "X-Upload-Content-Length: $FILESIZE" \
        -d "$postData" \
        "https://www.googleapis.com/upload/drive/v2/files?uploadType=resumable" \
        --dump-header - | sed -n 's/^[Ll]ocation: //p' | tr -d '\r\n')

    # Curl command to push the file to google drive.
    # If the file size is large then the content can be split to chunks and uploaded.
    # In that case content range needs to be specified.
    log "Uploading file $FILE to google drive..."
    # $curl_args is left unquoted on purpose so that an empty value adds no argument
    curl \
        -X PUT \
        -H "Authorization: Bearer ${ACCESS_TOKEN}" \
        -H "Content-Type: $MIME_TYPE" \
        -H "Content-Length: $FILESIZE" \
        -H "Slug: $SLUG" \
        --data-binary "@$FILE" \
        --output /dev/null \
        "$uploadlink" \
        $curl_args
}
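
The comments in the function mention splitting a large file into chunks and specifying a content range. The script does not do that, but the arithmetic behind it can be sketched as below. The `content_ranges` helper is hypothetical; it only prints the `Content-Range` value each chunk would carry and performs no upload.

```bash
# Hypothetical helper: print one Content-Range value per chunk for a file
# of TOTAL bytes split into CHUNK-byte pieces (the last one may be shorter).
content_ranges() {
  local total="$1" chunk="$2" start=0 end
  while [ "$start" -lt "$total" ]; do
    end=$((start + chunk - 1))
    # Clamp the final chunk to the last byte of the file
    if [ "$end" -ge "$total" ]; then
      end=$((total - 1))
    fi
    echo "bytes ${start}-${end}/${total}"
    start=$((end + 1))
  done
}

# Example: a 600000-byte file in 262144-byte (256 KiB) chunks
content_ranges 600000 262144
# bytes 0-262143/600000
# bytes 262144-524287/600000
# bytes 524288-599999/600000
```

Each value would go into a `Content-Range` header on a separate PUT request to the upload link. As far as I understand the Drive documentation, every chunk except the last has to be a multiple of 256 KiB, which is why the example uses that size.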

Resuming an interrupted upload is not implemented yet. Hopefully I will cover it in a following post.
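
Until then, here is a sketch of the first half of resuming: working out how much of the file the server already has. As I read the Drive documentation, a status request on the upload link answers with a `Range: bytes=0-N` header when some bytes have been stored. The `next_offset` function below is a hypothetical helper that turns those headers into the offset to continue from.

```bash
# Hypothetical helper: read HTTP response headers on stdin and print the
# offset of the first byte the server has not received yet.
next_offset() {
  local last
  last=$(tr -d '\r' | sed -n 's/^[Rr]ange: bytes=0-\([0-9][0-9]*\)$/\1/p' | head -n 1)
  if [ -n "$last" ]; then
    echo $((last + 1))
  else
    echo 0   # no Range header: nothing has been stored yet
  fi
}

# Status query, shown for context only (needs network access and a live session):
#   curl --silent -X PUT -H "Authorization: Bearer $ACCESS_TOKEN" \
#     -H "Content-Length: 0" -H "Content-Range: bytes */$FILESIZE" \
#     --dump-header - --output /dev/null "$uploadlink" | next_offset
```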

The complete script is available for download from the google-drive-upload GitHub repository (https://github.com/labbots/google-drive-upload).
