
Chris Ruggieri (Neocount Phoenix)

Security Blog, Rants, Raves, Write-ups, and Code

Project Phoenix Rebuilding: End-to-End Backup and Access Architecture

May 30, 2026 · Chris Ruggieri


This writeup documents the full buildout of the environment, including OneDrive to S3 backup, NAS to S3 backup, lifecycle rules into Deep Archive, EC2 role permissions, non-EC2 key usage, Microsoft Entra app registrations for stable OAuth handling, and Cloudflare Tunnel for EC2, Windows, and Synology DSM. What started as “protect my own data without spending a fortune” became a full architecture exercise around identity, object storage, validation, lifecycle economics, and zero-trust access.

1. Objectives

The design goals were simple on paper:

The final design naturally split into two planes:

Data plane
OneDrive / NAS → rclone → S3 bucket → Glacier Deep Archive

Access plane
Client → Cloudflare Access → Entra ID → Tunnel → protected service

Project Phoenix Rebuilding backup architecture overview

High-level architecture showing OneDrive, EC2, AWS S3 Glacier lifecycle storage, Synology NAS backups, and Cloudflare Tunnel access.

2. Core Components

3. S3 Bucket Layout

The primary bucket used was:

phoenix-backup-archive

Prefixes were organized by purpose:

onedrive-<user1>/
onedrive-<user2>/
nas/

Under nas/, prefixes mirrored the exact NAS folder structure:

nas/Movies - NAS/
nas/UHD Movies - NAS/
nas/TV Shows - NAS/
nas/Documents/
nas/music/
nas/video/

That exact namespace symmetry mattered because it ensured future delta copies from the NAS would line up with already-uploaded content instead of duplicating or restructuring it.
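A minimal sketch of that symmetry, assuming a hypothetical helper named nas_dest (the function name is mine, the remote and bucket names are the ones used in this writeup). The point is that the destination is derived from the NAS folder name and never typed by hand:

```shell
# Hypothetical helper: map a NAS top-level folder name to its S3 destination.
# The folder name is passed through unchanged, spaces included, so the
# S3 prefix always mirrors the NAS folder exactly.
nas_dest() {
  # $1 = folder name exactly as it appears on the NAS
  printf 'aws_archive:phoenix-backup-archive/nas/%s\n' "$1"
}

# Example: nas_dest "Movies - NAS"
```

Any later delta copy that builds its destination the same way lands on the prefix the bulk seed already populated.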

AWS S3 bucket layout for Project Phoenix Rebuilding

S3 bucket layout showing the top-level backup prefixes for OneDrive and NAS data before lifecycle transition into long-term archive storage.

AWS S3 NAS prefix subfolder layout for Project Phoenix Rebuilding

Expanded view of the NAS backup prefix in S3, showing the top-level Synology folder structure being archived.

4. S3 Bucket Configuration and Lifecycle Rules

After the bucket structure was created, the next step was locking down the bucket and defining how data would move into long-term archive storage. The bucket is private, encrypted, versioned, and governed by lifecycle rules that transition backup data into Glacier Deep Archive.

The storage cost model depended on not leaving backup data in S3 Standard any longer than necessary. S3 Standard was treated as the landing zone. Glacier Deep Archive was treated as the resting state.

The most important lesson here was that S3 lifecycle rules match on bucket-relative prefixes, not URIs and not filesystem-style paths.

Correct examples:

nas/video/
nas/music/
onedrive-<user1>/

Incorrect examples that matched nothing:

s3://phoenix-backup-archive/nas/video/
/nas/video/
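One way to avoid that mistake is to normalize whatever was typed into a bucket-relative prefix before it goes into a rule. This is only a sketch: to_lifecycle_prefix is a hypothetical helper, and appending a trailing slash is my own choice so that nas/video does not also match a sibling such as nas/video-old/.

```shell
# Hypothetical helper: turn a URI or filesystem-style path into the
# bucket-relative prefix form that lifecycle rules match on.
to_lifecycle_prefix() (
  bucket=$1
  p=$2
  p=${p#"s3://$bucket/"}   # drop a leading s3://<bucket>/ if present
  p=${p#/}                 # drop a leading slash if present
  case $p in
    ''|*/) ;;              # empty (whole bucket) or already ends in /
    *) p=$p/ ;;            # assumption: always scope to a "folder" prefix
  esac
  printf '%s\n' "$p"
)
```

Both of the incorrect forms above come out as nas/video/ once they pass through this function.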

Because versioning was enabled, lifecycle had to consider both current and noncurrent versions:

Another operational lesson: lifecycle does not run immediately when a rule is created or corrected. Objects become eligible, then AWS evaluates and processes them asynchronously. There is no “run now” button.

That meant lifecycle should not be enabled for a prefix until the copy had completed, validation had completed, and the dataset was considered ready for archive.
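For reference, a rule of this shape can be written as a lifecycle configuration document. The sketch below is not the exact rule set from this bucket: the rule ID and the day counts are placeholders I picked for illustration, and lifecycle_rule_json is a hypothetical helper that only prints JSON.

```shell
# Hypothetical helper: print a one-rule lifecycle configuration.
# $1 = rule ID, $2 = bucket-relative prefix,
# $3 = days before transition (assumed value),
# $4 = days before noncurrent versions expire (assumed value)
lifecycle_rule_json() (
  id=$1
  prefix=$2
  days=$3
  expire_days=$4
  cat <<EOF
{
  "Rules": [
    {
      "ID": "$id",
      "Status": "Enabled",
      "Filter": { "Prefix": "$prefix" },
      "Transitions": [
        { "Days": $days, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "NoncurrentVersionTransitions": [
        { "NoncurrentDays": $days, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "NoncurrentVersionExpiration": { "NoncurrentDays": $expire_days }
    }
  ]
}
EOF
)

# Example (not executed here):
#   lifecycle_rule_json nas-video nas/video/ 30 365 > lifecycle.json
#   aws s3api put-bucket-lifecycle-configuration \
#     --bucket phoenix-backup-archive \
#     --lifecycle-configuration file://lifecycle.json
```

Keep in mind that put-bucket-lifecycle-configuration replaces the bucket's entire lifecycle configuration, so every rule that should stay active has to be in the same document.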

S3 block public access settings for the Phoenix backup archive

Block Public Access is enabled to prevent accidental public exposure of backup data.

S3 default encryption settings for the Phoenix backup archive

Default server-side encryption is enabled so new objects are encrypted when written to the bucket.

S3 bucket versioning settings for the Phoenix backup archive

Bucket versioning is enabled to preserve object history and provide a recovery path for accidental overwrite or deletion scenarios.

S3 lifecycle rules list for Glacier Deep Archive transitions

Lifecycle rules automate movement into Glacier Deep Archive and cleanup of older noncurrent versions.

S3 lifecycle rule for OneDrive backups moving to Glacier Deep Archive

OneDrive backup prefixes transition to Glacier Deep Archive after the configured retention window.

S3 lifecycle rule for NAS backups moving to Glacier Deep Archive

NAS backup prefixes use the same archive pattern, transitioning current and noncurrent data into Glacier Deep Archive.

S3 lifecycle rule for noncurrent version cleanup

Version cleanup rules manage noncurrent object versions so the archive does not retain stale versions forever.

5. EC2 Role Permissions for S3

The EC2 instance used role-based auth, not hardcoded keys.

That meant the EC2 instance could talk to S3 using temporary credentials from the attached IAM role. The role needed permissions like:

The EC2 S3 remote looked like this:

[aws_archive]
type = s3
provider = AWS
env_auth = true
region = us-east-2

This worked on EC2 because AWS provides role credentials automatically through the instance metadata service. It does not work the same way on a Synology or workstation outside AWS.

IAM policy for S3 backup archive access

Customer-managed IAM policy granting the backup workflow access to the S3 archive bucket and objects inside it.

IAM user permissions for NAS upload access

Dedicated IAM user for NAS uploads with console access disabled and the backup permissions attached directly.

IAM policy for NAS upload permissions to S3

NAS upload policy scoped to listing the bucket, getting bucket location, uploading objects, and managing multipart uploads.
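A policy with that scope might look roughly like the sketch below. It mirrors the four capabilities named in the caption and is not a copy of the real policy. The Sid values and the nas_upload_policy_json helper name are hypothetical.

```shell
# Hypothetical helper: print an upload-only policy for one bucket.
# $1 = bucket name
nas_upload_policy_json() (
  bucket=$1
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BucketLevel",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::$bucket"
    },
    {
      "Sid": "ObjectLevel",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::$bucket/*"
    }
  ]
}
EOF
)
```

The split matters: list and location actions apply to the bucket ARN, while upload and multipart actions apply to the objects under it.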

6. Access Key / Secret Key Usage Outside EC2

For systems outside AWS, role-style EC2 auth was not available. That applied to the Windows workstation ingest host and the Synology NAS.

So an IAM user with scoped bucket permissions was created for non-EC2 access. The handling model was straightforward:

The actual secret key is intentionally not reproduced here.

7. rclone Configuration Model

EC2 S3 remote

[aws_archive]
type = s3
provider = AWS
env_auth = true
region = us-east-2

Non-EC2 S3 remote

[aws_archive]
type = s3
provider = AWS
env_auth = false
access_key_id = <redacted>
secret_access_key = <redacted>
region = us-east-2

Keeping the remote name aws_archive identical across environments reduced command drift and made scripts portable. Only the authentication model changed depending on where rclone was running.

8. OneDrive Authentication Problem and Entra App Fix

Default OneDrive authentication through rclone was not sufficient for unattended long-running jobs. The transfers were large enough and long enough that token handling became a reliability problem.

Symptoms included:

The fix was to use a custom Microsoft Entra app registration and tenant-specific authentication endpoints instead of relying on the generic default flow.

Key settings included:

Redacted Microsoft Entra app registration overview for rclone OneDrive authentication

Custom Microsoft Entra app registration used for rclone OneDrive authentication, with tenant and application identifiers redacted.

Microsoft Entra redirect URI configured for rclone OneDrive authentication

Redirect URI configured for the local rclone authentication callback.

Microsoft Graph API permissions for rclone OneDrive access

Microsoft Graph permissions used by the custom Entra app registration, including offline access for long-running rclone jobs.

Tenant-specific endpoints used:

https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/authorize
https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token

That stabilized the authentication flow and solved the refresh-token problem cleanly.
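As a sketch of how those endpoints end up in an rclone remote: the helpers below only print text, the remote name and IDs passed in are placeholders, and a working remote also carries the token and drive fields that rclone config writes during the interactive login.

```shell
# Hypothetical helper: build a tenant-specific Entra v2.0 endpoint.
# $1 = tenant ID, $2 = "authorize" or "token"
entra_endpoint() {
  printf 'https://login.microsoftonline.com/%s/oauth2/v2.0/%s\n' "$1" "$2"
}

# Hypothetical helper: print the part of a OneDrive remote that points
# rclone at the custom app registration and tenant endpoints.
# $1 = remote name, $2 = tenant ID, $3 = application (client) ID
onedrive_remote_stanza() (
  name=$1
  tenant=$2
  client_id=$3
  cat <<EOF
[$name]
type = onedrive
client_id = $client_id
auth_url = $(entra_endpoint "$tenant" authorize)
token_url = $(entra_endpoint "$tenant" token)
EOF
)
```

client_id, auth_url, and token_url are existing rclone OneDrive options. A client_secret line is left out here on purpose, since whether one is needed depends on how the app registration is set up.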

9. OneDrive Backup Execution

Separate OneDrive remotes were maintained for separate datasets, such as od_<user1>_app and od_<user2>_app. Each remote pointed to a different OneDrive source, but both used the same S3 archive remote on the destination side.

The backup model used rclone copy, not rclone sync.

For the OneDrive backup jobs, the command pattern looked like this:

rclone copy od_<user1>_app:/ aws_archive:phoenix-backup-archive/onedrive-<user1> \
  --fast-list \
  --transfers 8 \
  --checkers 8 \
  --progress \
  --log-file ~/rclone-logs/backup-<user1>.log \
  --log-level INFO

The same model was used for the second OneDrive dataset:

rclone copy od_<user2>_app:/ aws_archive:phoenix-backup-archive/onedrive-<user2> \
  --fast-list \
  --transfers 8 \
  --checkers 8 \
  --progress \
  --log-file ~/rclone-logs/backup-<user2>.log \
  --log-level INFO

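Validation used rclone's own source and destination comparison tools rather than relying on visual usage bars in the cloud console. OneDrive and S3 do not expose a common hash type, so the check compared file sizes with --size-only.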

rclone size od_<user1>_app:/
rclone size od_<user2>_app:/

rclone check od_<user1>_app:/ aws_archive:phoenix-backup-archive/onedrive-<user1> --size-only
rclone check od_<user2>_app:/ aws_archive:phoenix-backup-archive/onedrive-<user2> --size-only

The rule became simple: trust rclone validation, not the pretty usage bar.
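To turn a check run into numbers, rclone check can write a combined report with --combined, where each line starts with a one-character status: = for a match, + for a file missing on the destination, - for a file missing on the source, * for a difference, and ! for an error. The helper below is a hypothetical sketch that counts those lines. It is not part of the original scripts.

```shell
# Hypothetical helper: summarize a report written by
#   rclone check <src> <dst> --size-only --combined <report-file>
# $1 = path to the combined report
summarize_combined() {
  awk '
    { count[substr($0, 1, 1)]++ }   # first character is the status symbol
    END {
      printf "match=%d missing_on_dst=%d missing_on_src=%d differ=%d error=%d\n",
        count["="], count["+"], count["-"], count["*"], count["!"]
    }
  ' "$1"
}
```

With rclone copy as the transfer model, missing_on_dst is the number that has to reach zero before a dataset counts as converged.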

10. EC2 Automation

Once the OneDrive backup commands were proven manually, they were converted into repeatable shell scripts on the EC2 staging box. Each dataset received its own script so logging, destination paths, and troubleshooting stayed clean.

The automation model was intentionally simple:

A cron job that has not survived real manual execution is not automation. It is roulette.

The script pattern looked like this:

#!/bin/bash

rclone copy od_<user1>_app:/ aws_archive:phoenix-backup-archive/onedrive-<user1> \
  --fast-list \
  --transfers 8 \
  --checkers 8 \
  --log-file ~/rclone-logs/backup-<user1>.log \
  --log-level INFO

The second dataset used the same structure with a different source remote, destination prefix, and log file:

#!/bin/bash

rclone copy od_<user2>_app:/ aws_archive:phoenix-backup-archive/onedrive-<user2> \
  --fast-list \
  --transfers 8 \
  --checkers 8 \
  --log-file ~/rclone-logs/backup-<user2>.log \
  --log-level INFO

After the scripts were stable, they were scheduled every six hours in a staggered pattern. That created a convergence loop where each run reduced drift between OneDrive and the S3 archive without requiring a destructive sync operation.

# Example staggered cron schedule

0 */6 * * * /home/ubuntu/rclone-scripts/backup-<user1>.sh
30 */6 * * * /home/ubuntu/rclone-scripts/backup-<user2>.sh

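The 30-minute offset kept the two jobs from starting at the same moment while still ensuring both OneDrive datasets were checked and copied four times per day. A run that lasts longer than 30 minutes can still overlap with the other job, and a run that lasts longer than six hours can overlap with its own next start.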
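If a run can outlast its slot, a simple guard is to skip the new run while the previous one is still active. The sketch below uses an mkdir-based lock, which is my own substitution and not something the original scripts do. flock(1) is a common alternative where it is available. The with_lock name and the exit code 75 are arbitrary choices.

```shell
# Hypothetical helper: run a command only if no other run holds the lock.
# $1 = lock directory path, remaining arguments = command to run
with_lock() (
  lockdir=$1
  shift
  # mkdir is atomic: it fails if the directory already exists.
  if ! mkdir "$lockdir" 2>/dev/null; then
    echo "previous run still active, skipping" >&2
    exit 75
  fi
  # Release the lock when this subshell exits, whatever the command returned.
  trap 'rmdir "$lockdir"' EXIT
  "$@"
)
```

In cron this would wrap the existing script, for example with_lock /tmp/backup-<user1>.lock /home/ubuntu/rclone-scripts/backup-<user1>.sh. A lock left behind by a killed run has to be removed by hand.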

11. NAS Ingest Strategy

The NAS footprint is large enough that it had to be treated as a phased ingestion project rather than a casual cloud upload. With tens of terabytes of data involved, the initial upload path mattered just as much as the backup destination.

Home upload bandwidth was extremely poor, which made full direct ingest from home impractical for bulk seeding. Trying to push the entire NAS directly from home would have turned the first backup into a never-ending transfer instead of a controlled archive project.
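The arithmetic behind that is worth seeing once. The figures in the example are assumptions chosen for illustration, not measurements from this environment, and upload_days is a hypothetical helper.

```shell
# Hypothetical helper: days needed to upload a dataset at a sustained rate.
# $1 = dataset size in decimal terabytes, $2 = upload rate in Mbit/s
upload_days() {
  # TB -> megabits is a factor of 8 * 1,000,000; 86,400 seconds per day
  awk -v tb="$1" -v mbps="$2" \
    'BEGIN { printf "%.1f\n", tb * 8 * 1000000 / mbps / 86400 }'
}

# Example: 20 TB at a sustained 10 Mbit/s
#   upload_days 20 10    # prints 185.2
```

That is roughly six months of a saturated uplink with no retries and no other traffic.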

The practical solution was to separate the initial bulk seed from the ongoing delta maintenance:

This approach avoided trying to brute-force the entire project through a weak home upload connection. Bulk data movement was handled where bandwidth was available, while the NAS remained responsible for long-term validation and incremental convergence.

# High-level ingest model

NAS source data
  → external disk seed
  → high-bandwidth upload location
  → S3 archive bucket
  → Glacier Deep Archive lifecycle transition

Ongoing maintenance
  → NAS rclone check/copy jobs
  → size-only validation
  → delta correction as needed

The first pass was about getting the data into the archive. Later passes were about proving it, correcting drift, and keeping the archive aligned over time.

12. Validation from Synology

Once rclone was running on the Synology, the NAS became more than just a source system. It also became a validation node. That mattered because the archive was not considered complete just because files existed in S3. The source and destination still needed to be compared.

The validation command used the Synology-mounted data path and compared it against the matching S3 prefix:

docker run --rm -it \
  -v /volume1/docker/rclone:/config/rclone \
  -v /volume1:/data \
  rclone/rclone check \
  "/data/Movies - NAS" \
  "aws_archive:phoenix-backup-archive/nas/Movies - NAS" \
  --size-only \
  --one-way \
  --fast-list \
  --checkers 4 \
  --log-file /config/rclone/logs/movies-check.log \
  --log-level INFO

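The important validation flags were --size-only, which compares file sizes instead of hashes, and --one-way, which only checks that every source file exists on the destination and ignores extra objects in S3.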

A full check against Movies - NAS showed two categories of findings. Some were Synology sidecar artifacts, such as indexing or metadata files that do not need to be preserved in the archive. Others were genuinely missing payload files, which proved the copy was not fully converged yet.

# Examples of Synology-generated artifacts excluded or ignored during backup validation

**/@eaDir/**
**/#recycle/**
**/.DS_Store
**/Thumbs.db
**/SYNOINDEX_MEDIA_INFO/**

That result was useful, not a failure. It showed the validation layer doing its job. The system identified that the dataset was not fully converged, which meant lifecycle transition should not be treated as the final proof of a complete backup.

The system did exactly what it was supposed to do: it prevented an incomplete dataset from being blindly accepted as finished.
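Separating the two categories by hand gets old quickly. A rough triage sketch, assuming a plain list of missing paths (one per line) and using the artifact patterns listed above. The helper names are hypothetical.

```shell
# Hypothetical helper: succeed if a path looks like a Synology or OS
# sidecar artifact, based on the patterns listed above.
is_sidecar() {
  # A leading slash is added so top-level entries match the same patterns.
  case "/$1" in
    */@eaDir/*|*/'#recycle'/*|*/SYNOINDEX_MEDIA_INFO/*|*/.DS_Store|*/Thumbs.db)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical helper: label each path on stdin as sidecar or payload.
split_findings() {
  while IFS= read -r path; do
    if is_sidecar "$path"; then
      printf 'sidecar\t%s\n' "$path"
    else
      printf 'payload\t%s\n' "$path"
    fi
  done
}
```

Anything labeled payload is a real gap in the archive. Anything labeled sidecar is a decision about what the archive is supposed to contain.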

13. Full Fidelity Backup Decision

After the Synology validation checks exposed both normal payload files and Synology-generated sidecar artifacts, I had to make a design decision about what this archive was supposed to be.

There were two valid paths:

The cleaner option would reduce object count, reduce noise in validation reports, and avoid preserving files that were not directly created by me. That would make the archive look better, but it would also mean the S3 copy was no longer a true representation of the NAS.

The final decision was to back up everything, including:

That decision increases object count and creates more noise during review, but it aligns with the actual disaster-recovery goal. If the NAS ever has to be rebuilt from the archive, I do not want to be guessing which supporting files mattered and which ones did not.

# Final backup philosophy

If it exists in the source dataset,
it belongs in the archive.

The backup target should represent the NAS as-is,
not a manually cleaned interpretation of it.

This changed the archive from a curated media backup into a full-fidelity recovery copy. It may not be the prettiest approach, but it is the safer one for a rebuild scenario.

14. Synology-native rclone via Docker

For the NAS side of the architecture, rclone was run directly on the Synology using Docker. That kept the setup portable and avoided needing to install rclone directly into the Synology operating system.

The container model was simple:

The basic Docker pattern looked like this:

docker run --rm -it \
  -v /volume1/docker/rclone:/config/rclone \
  -v /volume1:/data \
  rclone/rclone [rclone-command]

That mount structure made the Synology folder tree available inside the container under /data. For example, a NAS folder that exists on the Synology as:

/volume1/Movies - NAS

became available to rclone inside the container as:

/data/Movies - NAS

The Synology has multiple storage volumes, so the container mount strategy matters. In this setup, /volume1 was mounted into the rclone container as /data. Any folders living on Volume 2 need to be handled with their own mount path or a separate command that maps /volume2 into the container as well.

Synology storage usage showing Volume 1 and Volume 2 capacity

Synology storage is split across Volume 1 and Volume 2, so rclone container mounts must match where the source folders actually live.
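A small sketch of that mapping, assuming Volume 2 is mounted into the container with an extra -v /volume2:/data2 argument. The /data2 mount point and the to_container_path helper are my own illustration. Only /volume1 to /data comes from the setup described here.

```shell
# Hypothetical helper: translate a Synology host path into the path the
# rclone container sees, given these two mounts:
#   -v /volume1:/data      (from this writeup)
#   -v /volume2:/data2     (assumed for illustration)
to_container_path() {
  case $1 in
    /volume1/*) printf '/data/%s\n' "${1#/volume1/}" ;;
    /volume2/*) printf '/data2/%s\n' "${1#/volume2/}" ;;
    *) echo "unmapped path: $1" >&2; return 1 ;;
  esac
}
```

A folder on a volume that was never mounted simply does not exist inside the container, which tends to show up as an rclone "directory not found" error, not as a Docker error.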

The rclone configuration and logs persisted outside the container under the Docker config path:

/volume1/docker/rclone
/volume1/docker/rclone/logs

This allowed both direct copies and direct checks from the NAS itself. The Synology could act as the source for ongoing archive maintenance, and it could also validate whether S3 contained the expected objects.

For the NAS, Docker provided the cleanest balance: rclone stayed containerized, logs and configuration remained persistent, and the NAS data could be presented to the container without changing the underlying Synology installation.

15. Scheduled Tasks for NAS Folders

Once the Docker-based rclone model was working on the Synology, the next step was preparing scheduled tasks for the NAS folders that would eventually be maintained directly from the NAS.

Scheduled Synology tasks were prepared for:

Each task follows the same basic containerized rclone pattern: mount the rclone configuration, mount the NAS volume, copy the selected folder to the matching S3 prefix, and write logs back to the Synology.

docker run --rm \
  -v /volume1/docker/rclone:/config/rclone \
  -v /volume1:/data \
  rclone/rclone copy \
  "/data/<folder-name>" \
  "aws_archive:phoenix-backup-archive/nas/<folder-name>" \
  --fast-list \
  --transfers 4 \
  --checkers 4 \
  --log-file /config/rclone/logs/<folder-name>.log \
  --log-level INFO

The scheduled task model keeps each folder independent. That makes it easier to troubleshoot failures, review logs, rerun a single folder, or temporarily disable a noisy job without impacting the rest of the backup plan.

Because current home upload bandwidth is weak, these tasks were prepared but not treated as aggressive always-on bulk transfer jobs yet. The near-term goal is controlled validation and incremental convergence, not trying to brute-force the entire NAS through a constrained upload pipe.

Once better upload bandwidth is available, the same task structure can be used more aggressively. Until then, the schedule remains a controlled framework for maintaining the archive without overwhelming the connection.
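One way to keep a scheduled task from saturating the uplink is rclone's --bwlimit flag, which accepts either a fixed rate or a timetable. The sketch below adds it to the same task pattern. The timetable values are assumptions picked for illustration, nas_copy_limited is a hypothetical wrapper, and setting DRY_RUN prints the command without running it.

```shell
# Hypothetical wrapper: per-folder copy task with a bandwidth timetable.
# $1 = NAS folder name, $2 = optional --bwlimit value
# Assumed default: about 1 MiB/s from 07:00, unlimited from 23:00.
nas_copy_limited() (
  folder=$1
  limit=${2:-"07:00,1M 23:00,off"}
  # With DRY_RUN set, the command line is echoed and nothing is executed.
  ${DRY_RUN:+echo} docker run --rm \
    -v /volume1/docker/rclone:/config/rclone \
    -v /volume1:/data \
    rclone/rclone copy \
    "/data/$folder" \
    "aws_archive:phoenix-backup-archive/nas/$folder" \
    --fast-list \
    --transfers 4 \
    --checkers 4 \
    --bwlimit "$limit" \
    --log-file "/config/rclone/logs/$folder.log" \
    --log-level INFO
)
```

When better bandwidth arrives, the limit can be raised or dropped without touching the rest of the task.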

16. Cloudflare Tunnel for EC2

At this point, the project shifted from backup storage and data movement into the access layer. Sections 16 through 20 cover the Zero Trust Network Access side of the build: reaching infrastructure without exposing inbound services directly to the internet. Section 21 then moves into Synology SSO, which is identity-based access to DSM itself rather than a network tunnel.

Cloudflare Tunnel was used to reach the EC2 backup host without exposing SSH publicly. This established the access-plane pattern reused elsewhere: identity-gated access, no exposed inbound ports, and Cloudflare Access sitting in front of administrative entry points.

The goal was simple: the EC2 instance needed to be reachable for administration, but SSH did not need to be open to the world.

Cloudflare Tunnel connector list showing EC2 and Synology tunnels

Cloudflare Tunnel connectors provide outbound-only connectivity from the EC2 backup host and Synology NAS into Cloudflare.

The EC2 tunnel route mapped the public hostname to SSH on the local EC2 instance:

Cloudflare Tunnel configuration for EC2 SSH access

Cloudflare Tunnel maps the EC2 backup hostname to SSH on the private staging host without exposing port 22 publicly.

tunnel: aws-backup-vm
credentials-file: /home/ubuntu/.cloudflared/<tunnel-id>.json

ingress:
  - hostname: aws-backup-vm.phoenix-comp.com
    service: ssh://localhost:22
  - service: http_status:404

Cloudflare Access was then layered in front of that hostname so the administrative path required identity validation before SSH access was allowed.

Cloudflare Access policy protecting EC2 SSH access

Cloudflare Access adds the identity gate in front of the EC2 administrative path.

The tunnel setup pattern looked like this:

cloudflared tunnel login

cloudflared tunnel create aws-backup-vm

cloudflared tunnel route dns aws-backup-vm aws-backup-vm.phoenix-comp.com

Once the tunnel was created and routed, the Cloudflare service was installed so the tunnel would survive reboots:

sudo cloudflared service install
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
sudo systemctl status cloudflared

That gave the EC2 staging box a secure administrative access path without requiring public SSH exposure. From an architecture standpoint, this became the access model for the rest of the project: keep services private, publish only the Cloudflare-controlled entry point, and put identity policy in front of it.
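On the client side, SSH reaches the tunneled host through cloudflared via a ProxyCommand entry. A minimal sketch, assuming cloudflared is installed on the client and on its PATH. The ssh_config_stanza helper is hypothetical, and the ubuntu login in the example is inferred from the /home/ubuntu paths above.

```shell
# Hypothetical helper: print an ssh_config entry that sends SSH for one
# hostname through Cloudflare Access.
# $1 = Access-protected hostname, $2 = remote login user
ssh_config_stanza() {
  cat <<EOF
Host $1
  User $2
  ProxyCommand cloudflared access ssh --hostname %h
EOF
}

# Example (not executed here):
#   ssh_config_stanza aws-backup-vm.phoenix-comp.com ubuntu >> ~/.ssh/config
#   ssh aws-backup-vm.phoenix-comp.com
```

The first connection opens a browser for the Access login. After that, the session behaves like ordinary SSH.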

17. Cloudflare Tunnel for Windows RDP Host

A Windows system was also tunneled through Cloudflare for RDP access. This reused the same Zero Trust Network Access pattern from the EC2 host: keep the service off the public internet, route access through Cloudflare Tunnel, and place Cloudflare Access policy in front of the administrative entry point.

The purpose was not to publish RDP directly. The purpose was to make RDP reachable through a controlled Cloudflare path without opening inbound RDP to the internet.

The Windows tunnel route followed the same published-application pattern:

hostname: chris-razer.phoenix-comp.com
service: rdp://localhost:3389

One of the bigger lessons was that localhost and 127.0.0.1 are not always interchangeable in real troubleshooting. localhost can resolve to the IPv6 loopback address (::1) first, while the service may be listening only on IPv4. If the tunnel route is correct but the connection still fails, verify what address and port the Windows service is actually listening on.

Useful validation checks included confirming that RDP was enabled, confirming the listener was active, and confirming the local port was reachable before blaming Cloudflare.

# Confirm RDP listener locally on Windows

netstat -ano | findstr :3389

# Confirm Remote Desktop service state

sc query TermService

Once the local RDP listener was confirmed, the remaining work was Cloudflare-side: publish the route, verify the Access application, apply the correct policy, and make sure the tunnel stayed running persistently.

cloudflared tunnel login

cloudflared tunnel create chris-razer

cloudflared tunnel route dns chris-razer chris-razer.phoenix-comp.com

cloudflared service install
sc start cloudflared

The final result was RDP access that did not require exposing RDP directly to the internet. The system stayed private, Cloudflare handled the route, and Access policy controlled who could reach the administrative path.
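On the connecting machine, the RDP client does not talk to the hostname directly. cloudflared opens a local listener and forwards it through Access. A sketch, assuming cloudflared is installed on the client. The local port 3390 is an arbitrary example, rdp_forward is a hypothetical wrapper, and setting DRY_RUN prints the command without running it.

```shell
# Hypothetical wrapper: forward a local port to the tunneled RDP host.
# $1 = Access-protected hostname, $2 = free local port (default 3390)
rdp_forward() {
  # With DRY_RUN set, the command line is echoed and nothing is executed.
  ${DRY_RUN:+echo} cloudflared access rdp \
    --hostname "$1" \
    --url "rdp://localhost:${2:-3390}"
}

# Example (not executed here):
#   rdp_forward chris-razer.phoenix-comp.com 3390
# then point the RDP client at localhost:3390
```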

18. Cloudflare Tunnel for Synology DSM

The Synology DSM interface was also tunneled through Cloudflare. This followed the same access pattern as the EC2 and Windows systems: keep the service private, route access through Cloudflare Tunnel, and avoid exposing the NAS directly to the public internet.

The DSM hostname was published through a redacted Cloudflare-managed subdomain:

<redacted-subdomain>.phoenix-comp.com

The build process was:

The initial login and tunnel creation were run from Docker on the Synology:

docker run --rm -it \
  --user root \
  -v /volume1/docker/cloudflared:/root/.cloudflared \
  cloudflare/cloudflared:latest \
  tunnel login

docker run --rm -it \
  --user root \
  -v /volume1/docker/cloudflared:/root/.cloudflared \
  cloudflare/cloudflared:latest \
  tunnel create synology-nas

docker run --rm -it \
  --user root \
  -v /volume1/docker/cloudflared:/root/.cloudflared \
  cloudflare/cloudflared:latest \
  tunnel route dns synology-nas <redacted-subdomain>.phoenix-comp.com

The final tunnel configuration pointed the Cloudflare hostname to the DSM HTTPS service on the NAS:

tunnel: <tunnel-id>
credentials-file: /root/.cloudflared/<tunnel-id>.json

ingress:
  - hostname: <redacted-subdomain>.phoenix-comp.com
    service: https://<synology-lan-ip>:<dsm-port>
    originRequest:
      noTLSVerify: true
  - service: http_status:404

After manual testing succeeded, the tunnel was converted into a persistent detached Docker container:

docker run -d \
  --name cloudflared-synology \
  --restart unless-stopped \
  --user root \
  --network host \
  -v /volume1/docker/cloudflared:/root/.cloudflared \
  -v /volume1/docker/cloudflared/config:/etc/cloudflared \
  cloudflare/cloudflared:latest \
  tunnel --config /etc/cloudflared/config.yml run

Key lessons from that build:

Final result: DSM became reachable through Cloudflare Access without exposing the NAS directly to the public internet.
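Before starting the detached container, the ingress rules can be sanity-checked with cloudflared's own ingress subcommands. A sketch using the same image and config mount as above. The wrapper names are hypothetical, and setting DRY_RUN prints the command without running it.

```shell
# Hypothetical wrapper: validate the ingress section of config.yml.
validate_ingress() {
  ${DRY_RUN:+echo} docker run --rm \
    -v /volume1/docker/cloudflared/config:/etc/cloudflared \
    cloudflare/cloudflared:latest \
    tunnel --config /etc/cloudflared/config.yml ingress validate
}

# Hypothetical wrapper: show which ingress rule a given URL would match.
# $1 = full URL, e.g. https://<hostname>/
match_ingress_rule() {
  ${DRY_RUN:+echo} docker run --rm \
    -v /volume1/docker/cloudflared/config:/etc/cloudflared \
    cloudflare/cloudflared:latest \
    tunnel --config /etc/cloudflared/config.yml ingress rule "$1"
}
```

Validation catches a missing catch-all rule at the end of the list, which is an easy mistake to make when editing the file by hand.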

19. Cloudflare Access Policy Model

Once the tunnels were in place, the important control became the Cloudflare Access policy model. The tunnel creates the private path to the service, but Access determines who is allowed to use that path.

The policy model used three main controls:

Email OTP was unreliable and unnecessary in a Microsoft-heavy environment. Since the users and authentication controls already lived in Microsoft Entra, using Entra as the login method was the better fit.

This kept the access model consistent:

The end result was cleaner than relying on one-off OTP behavior. Administrative access was tied back to the same identity provider already used elsewhere in the environment, which made the policy easier to reason about and easier to maintain.

Access model:

User
  → Cloudflare Access
  → Microsoft / Entra authentication
  → MFA
  → Cloudflare Tunnel
  → Private service

That became the standard model for the tunneled administrative services in this project: private origin, Cloudflare-controlled route, and Microsoft-backed identity enforcement.

20. Entra App Registration for Cloudflare Access

Separate from the rclone OneDrive app registration, a second Microsoft Entra app registration was used for Cloudflare Access. The OneDrive app handled rclone authentication to Microsoft Graph, while this app handled identity federation between Cloudflare Access and Microsoft Entra.

The important distinction is that this app was not for backup data movement. It was for access control. Cloudflare used it as the Microsoft login method for protected applications such as the EC2 SSH tunnel, Windows RDP tunnel, and Synology DSM tunnel.

The redirect URI had to match the exact Cloudflare team domain callback path:

https://<cloudflare-team-name>.cloudflareaccess.com/cdn-cgi/access/callback

If that callback URI does not match exactly, the Microsoft login flow will fail even if the tunnel and Access policy are otherwise correct.
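Because the comparison is exact, it helps to generate the expected value and check the configured one against it as plain strings. A trivial sketch. The helper names and the team name in the example are hypothetical.

```shell
# Hypothetical helper: expected Entra redirect URI for a Cloudflare team.
# $1 = Cloudflare Zero Trust team name
access_callback_url() {
  printf 'https://%s.cloudflareaccess.com/cdn-cgi/access/callback\n' "$1"
}

# Hypothetical helper: succeed only on an exact, character-for-character match.
# $1 = URI configured in the Entra app registration, $2 = team name
callback_matches() {
  [ "$1" = "$(access_callback_url "$2")" ]
}

# Example: a trailing slash is enough to fail the comparison
#   callback_matches "https://myteam.cloudflareaccess.com/cdn-cgi/access/callback/" myteam
```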

The Entra configuration model was:

The Cloudflare side then referenced this Entra app as the Microsoft identity provider. That allowed the Cloudflare Access policy model to use Microsoft-backed identity instead of relying on email OTP.

Cloudflare Access
  → Microsoft Entra login method
  → Entra app registration
  → Cloudflare callback URI
  → Access policy evaluation
  → Protected tunnel application

This gave the tunnel access layer a cleaner authentication model. Cloudflare still controlled the application policy, but Microsoft Entra handled the actual identity proof and MFA workflow.

21. Synology SSO to Microsoft Entra

After getting DSM safely behind Cloudflare Tunnel, the next logical step was getting rid of yet another standalone local login workflow and tying DSM into Microsoft Entra for home-lab SSO. The goal here was not domain join, file share mapping, or anything that would require an added-cost service. The goal was simple: use the existing Entra tenant for DSM login without bolting on a paid identity tier.

The first pass used OIDC. That got far enough to prove the Microsoft login itself worked, but it ran into claim-mapping problems on the DSM side. The Entra sign-in would succeed, but DSM would not map the returned identity to the local NAS account the way it needed to. In this case, the local DSM account needed to match the username value DSM expected, while the Microsoft sign-in identity was still naturally email-shaped.

The clean pivot was to move from OIDC to SAML. That exposed the much more useful Entra-side Attributes & Claims controls and made it possible to send the exact username value DSM needed.

The SAML app configuration ended up using:

Identifier (Entity ID): https://<redacted-subdomain>.phoenix-comp.com
Reply URL (ACS URL): https://<redacted-subdomain>.phoenix-comp.com
Sign on URL: https://<redacted-subdomain>.phoenix-comp.com

Unique User Identifier (NameID): user.mailnickname
NameID format: Unspecified

Microsoft Entra SAML configuration for Synology DSM Web, with tenant and hostname details redacted.

Microsoft Entra SAML claim mapping using user.mailnickname

Entra claim mapping configured the NameID to use user.mailnickname, matching the local DSM username.

That meant the assertion being returned from Entra contained the DSM-compatible username value:

<NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified"><local-dsm-username></NameID>

At that point, the Entra side was correct. The sign-in logs all showed success, and the SAML assertion itself proved the NAS was being handed the exact username it needed.

The problem turned out to be on the DSM side, and it was maddeningly specific: the signature type.

DSM had originally been configured to expect:

Response signature verification: Sign SAML response

But the actual Entra SAML payload was signing the Assertion, not the outer Response. In other words, the username was right, the audience was right, the recipient was right, the Microsoft authentication was right, and DSM was still rejecting the login because it was validating the wrong layer of the SAML message.

The fix was to change DSM to verify the signed Assertion instead of only the signed Response. Once that was corrected, the login worked.

Synology DSM SAML signature verification setting

DSM had to verify the signed SAML assertion, not just the outer SAML response, before SSO would complete successfully.

That entire path is worth documenting for anyone trying to light up SSO in a home lab because it is exactly the kind of problem that wastes hours:

The practical lesson is simple: when troubleshooting SAML, do not stop at “the IdP says success.” Capture the assertion, inspect the NameID, inspect the Audience, inspect the Recipient, and confirm whether the identity provider is signing the Response, the Assertion, or both.

SAML troubleshooting checklist:

1. Confirm Entra sign-in success
2. Confirm DSM/local account username
3. Inspect NameID value
4. Confirm Audience / Entity ID
5. Confirm Recipient / ACS URL
6. Confirm whether the Response, Assertion, or both are signed
7. Match DSM signature verification mode to the actual signed SAML layer
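Step 6 can be answered quickly from a captured response. The sketch below is a text heuristic and not a real XML parser: it assumes the SAML response has already been base64-decoded into a file, and it relies on a Response-level Signature element appearing before the Assertion element while an Assertion-level one appears inside it. The saml_signed_layer name is hypothetical.

```shell
# Hypothetical helper: report which SAML layer carries a Signature element.
# $1 = file containing the decoded SAML response XML
# Prints one of: response, assertion, both, none
saml_signed_layer() {
  tr -d '\n' < "$1" |
    # Keep only opening tags for Assertion and Signature, in document order.
    grep -oE '<([A-Za-z0-9_-]+:)?(Assertion|Signature)[ >]' |
    awk '
      /Assertion/ { in_assertion = 1; next }
      /Signature/ { if (in_assertion) a = 1; else r = 1 }
      END {
        if (a && r) print "both"
        else if (a) print "assertion"
        else if (r) print "response"
        else print "none"
      }
    '
}
```

If this prints assertion while DSM is set to verify the response signature, that mismatch is the failure described above. Encrypted assertions are outside what this heuristic can see.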

For home lab builders, this is one of the bigger takeaways from the whole project: you can absolutely bring enterprise-style identity patterns into a personal environment without turning it into a six-figure science project. But once you do, you also inherit enterprise-style troubleshooting. The upside is that when you finally solve it, you end up with something a lot cleaner, safer, and frankly more satisfying than another pile of isolated local accounts.

22. Lessons Learned

23. Current State

This is no longer a lab experiment. It is an operating system.

24. Final Operating Model

Now

Later, after better connectivity

25. Closing

This build was not about “copying files to the cloud.” It was about building a system that reduces cost over time, tolerates imperfect first runs, uses identity instead of perimeter exposure, preserves recoverability, and keeps converging without constant manual intervention.

It combines AWS S3, Glacier Deep Archive, rclone, Microsoft Entra, Cloudflare Tunnel, Cloudflare Access, Synology, Docker, and a lot of stubborn troubleshooting into something that behaves like an actual backup and access architecture.

That system now exists.