Using Rclone for SharePoint Shared Files

Use Rclone for SharePoint shared files. In this example, the SharePoint user doesn't own the files shared with them, so Rclone does not show them...

Rclone OneDrive and SharePoint integration

Rclone is a very powerful, storage-agnostic tool for transferring files across the web. You can copy files to and from roughly 30 different storage providers including S3, FTP, and OneDrive, and yes, you can copy files from SharePoint sites using WebDAV.

How To Use Rclone?

Rclone is a multi-threaded command-line program for managing data in cloud storage. It can perform many different tasks, such as sync, crypt, cache, transfer, compress, union, and mount.

However, when you try to automate the download of files "Shared" with a SharePoint user on a SharePoint site, Rclone has a known issue with both the onedrive and webdav configurations.


The problem is our user doesn't own the files shared with them, so Rclone does not show them when you list files.

Let's assume we were given the following SharePoint information:

url = https://myacct.sharepoint.com/sites/MYSITE
user = me@mycompany.com
pass = somepassword

As mentioned, you cannot see files that have been shared with this user using Rclone, so the workaround is to list the files with the Microsoft Graph API and then reconfigure the Rclone webdav remote with the information it returns.

To accomplish this we use the Microsoft Graph API Explorer. Click the link to load the explorer, and then sign in as the SharePoint user.

Microsoft Graph Sign-on page to use Rclone

Once logged in, scroll down. On the left under "OneDrive" click on "files shared with me" to see a list of files shared with this user.

Microsoft Graph API - files shared with me

The information we need to reconfigure Rclone is under the key "remoteItem": webDavUrl. It should be something like:

https://myacct.sharepoint.com/personal/useremail_domain_suffix/Documents/Path/To/Shared%20Folder/filename.xlsx

The myacct and useremail_domain_suffix parts will change depending on your account name and the user who shared the file. NOTE: This URL is encoded, and you will need to decode it before it can be used with Rclone and Zuar Runner (for example, %20 becomes a space). Zuar Runner is a fast, lightweight, automated data staging platform that connects to APIs, databases, or flat files to model your data in preparation for analytics.
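If you'd rather not decode the URL by hand, the step can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, and the sample response is a hypothetical, heavily trimmed version of what the "files shared with me" query returns (assuming each entry under "value" carries a "remoteItem" with a "webDavUrl"). It decodes the URL and splits it at Documents/ into the part Rclone needs as the remote URL and the path to the file.

```python
from urllib.parse import unquote


def shared_webdav_urls(response):
    """Collect every remoteItem.webDavUrl found in the response's "value" list."""
    urls = []
    for item in response.get("value", []):
        url = item.get("remoteItem", {}).get("webDavUrl")
        if url:
            urls.append(url)
    return urls


def split_webdav_url(webdav_url, marker="/Documents/"):
    """Decode the URL and split it into (remote URL, path inside the remote)."""
    decoded = unquote(webdav_url)  # percent-decoding, e.g. %20 becomes a space
    base, found, relative_path = decoded.partition(marker)
    if not found:
        raise ValueError(f"{marker!r} not found in {decoded!r}")
    return base + marker, relative_path


# Hypothetical response, trimmed to the one field this walkthrough uses.
sample_response = {
    "value": [
        {
            "name": "filename.xlsx",
            "remoteItem": {
                "webDavUrl": "https://myacct.sharepoint.com/personal/"
                "useremail_domain_suffix/Documents/Path/To/Shared%20Folder/filename.xlsx"
            },
        }
    ]
}

for url in shared_webdav_urls(sample_response):
    remote_url, file_path = split_webdav_url(url)
    print(remote_url)  # the part to give Rclone as the WebDAV URL
    print(file_path)   # the decoded path to use in the copy command
```

Splitting on "/Documents/" is an assumption that matches the example URL above; a shared file that lives in a differently named library would need a different marker.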

Configure Rclone

With the information above, you're ready to configure Rclone to copy the file. If you don't have Rclone installed locally, install it first. Then pop open a terminal and type:

rclone config

Follow the instructions here to configure for "WebDAV." For the URL, use the WebDAV link you got in the previous step, excluding everything after Documents/. In the example above, you would use: https://myacct.sharepoint.com/personal/useremail_domain_suffix/Documents/. Select "SharePoint" as the vendor, then enter the username and password for your SharePoint user. Leave "Bearer Token" blank.
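For reference, here is a rough sketch of what the resulting remote looks like in the config file, rendered with Python's configparser. The helper name and the placeholder password are made up for illustration. Rclone stores the password in obscured form (the rclone config wizard does this for you, and rclone obscure does it by hand), so the value below is a stand-in, not a real obscured password.

```python
import configparser
import io


def render_webdav_remote(name, url, user, obscured_pass):
    """Render one WebDAV remote as INI text, the format rclone.conf uses."""
    # interpolation=None so a '%' in any value is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser[name] = {
        "type": "webdav",
        "url": url,
        "vendor": "sharepoint",
        "user": user,
        "pass": obscured_pass,  # must be the obscured form, not the plain password
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(
    render_webdav_remote(
        name="sharepoint-webdav",
        url="https://myacct.sharepoint.com/personal/useremail_domain_suffix/Documents/",
        user="me@mycompany.com",
        obscured_pass="OBSCURED_PASSWORD_PLACEHOLDER",  # hypothetical stand-in
    )
)
```

Running rclone config remains the simpler route; the sketch is mainly useful for checking that the remote you created has the fields you expect.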

Using Rclone with Zuar Runner

If you're a Zuar Runner user (highly recommended!), you can test your Rclone config locally before you upload it. In a terminal, type:

rclone lsl sharepoint-webdav:

sharepoint-webdav is the name of the remote in the config. In this case it will not list any files, but if it runs without an error Rclone is configured properly.
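If you want to script this check, a minimal sketch could look like the following. The function names are hypothetical; the only Rclone features used are the lsl command and the --config flag. It treats a zero exit code as "configured properly", which mirrors the manual test above.

```python
import subprocess


def lsl_command(remote, config_path=None):
    """Build the argument list for `rclone lsl <remote>:`."""
    command = ["rclone", "lsl", f"{remote}:"]
    if config_path:
        command += ["--config", config_path]
    return command


def remote_is_configured(remote, config_path=None):
    """Return True if `rclone lsl` exits without an error for this remote."""
    try:
        result = subprocess.run(
            lsl_command(remote, config_path),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # rclone itself is not installed or not on PATH
        return False
    return result.returncode == 0
```

Calling remote_is_configured("sharepoint-webdav") would run the same command as above and report whether it succeeded.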

If you already have Rclone installed and configured on your Runner instance, download the existing config file to your local machine and append the new Rclone remote to the end of it. Otherwise, just copy the new config to a new file.

To view the new remote in your local config file, type:

cat ~/.config/rclone/rclone.conf
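Appending by hand works fine, but if you'd like a guard against accidentally defining the same remote twice, here is one possible sketch. It assumes both config files are plain, unencrypted INI text, and append_remote is a made-up helper name. The existing text is left untouched and the new section is added at the end.

```python
import configparser


def section_names(config_text):
    """Return the remote names ([section] headers) found in rclone config text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return parser.sections()


def append_remote(existing_text, new_text):
    """Append new_text to existing_text, refusing to duplicate a remote name."""
    clashes = set(section_names(existing_text)) & set(section_names(new_text))
    if clashes:
        raise ValueError(f"remote(s) already defined: {sorted(clashes)}")
    return existing_text.rstrip("\n") + "\n\n" + new_text.strip("\n") + "\n"


# Hypothetical contents of the config downloaded from the Runner instance.
existing = "[s3-prod]\ntype = s3\nprovider = AWS\n"
# The new remote created locally (password line omitted for brevity).
new = "[sharepoint-webdav]\ntype = webdav\nvendor = sharepoint\n"

merged = append_remote(existing, new)
print(merged)
```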

With the new remote in the config file, upload the file back to your Runner instance.


Create a cmd job

Finally, we'll create the job. In your Runner UI, on the bottom left, click on "Add Job" and then choose "Generic." Give the job a name like "[cmd] copy shared file", and edit the job's JSON so that it looks like this:

{
    "cmd": "rclone copy sharepoint-webdav:Path/To/Shared\\ Folder/filename.xlsx /var/runner/data/ --config /var/runner/data/rclone.conf",
    "shell": true
}

NOTE: After creating the job, click the little pencil next to "job type" and change it from io to cmd.

sharepoint-webdav is the name of the Rclone remote (inside [] in the config file).

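Path/To/Shared\\ Folder/filename.xlsx is the path to the file. NOTE: the path has been decoded, and the space is escaped twice: one backslash escapes the space for the shell, and the second escapes that backslash inside the JSON string.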
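The double escaping is easy to get wrong by hand, so here is a small sketch that builds the job JSON from the decoded path. build_cmd_job is a hypothetical helper, not part of Runner or Rclone. It only escapes spaces, as in the example above; a path with other shell-special characters would need more careful quoting (Python's shlex.quote is one option).

```python
import json
import shlex


def build_cmd_job(remote, decoded_path, dest_dir, config_path):
    """Return the job JSON with the file path escaped for the shell."""
    shell_path = decoded_path.replace(" ", "\\ ")  # first escape: for the shell
    cmd = f"rclone copy {remote}:{shell_path} {dest_dir} --config {config_path}"
    # json.dumps adds the second escape: each backslash becomes \\ in the JSON text
    return json.dumps({"cmd": cmd, "shell": True}, indent=4)


job_json = build_cmd_job(
    remote="sharepoint-webdav",
    decoded_path="Path/To/Shared Folder/filename.xlsx",
    dest_dir="/var/runner/data/",
    config_path="/var/runner/data/rclone.conf",
)
print(job_json)

# Sanity check: what the shell will see as the source argument
source_arg = shlex.split(json.loads(job_json)["cmd"])[2]
print(source_arg)
```

The printed JSON should match the job shown above, and the last line shows the path as the shell receives it, with the space restored.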
