Unable to configure cross-grid replication between two grids
Applies to
StorageGRID 11.7 or earlier
Issue
Cross-grid replication (CGR) setup between two grids fails. The grid federation connection cannot be approved, and approval attempts result in HTTP 500 errors and routing failures between the VIPs of the two sites.
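The routing failure between the VIPs can be checked independently of the Grid Manager with a plain TCP connection attempt toward the peer VIP. The Python sketch below is illustrative only: `probe_tcp` is a hypothetical helper name, and the host and port are supplied by the caller because this article does not state them. It reports the same broad failure classes that appear in the logs (no route to host, connection timed out).

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one TCP connection and return a short diagnosis string.

    Hypothetical helper for illustration; host and port are placeholders
    to be replaced with the peer VIP and the port used in your environment.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except socket.timeout:
        # No answer within the timeout (compare "110: Connection timed out").
        return "timed out"
    except socket.gaierror:
        # The host name could not be resolved to an address.
        return "name resolution failed"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            # Compare "113: No route to host" in the nginx error log.
            return "no route"
        if exc.errno == errno.ETIMEDOUT:
            return "timed out"
        return f"other error: {exc}"


# Example call (placeholders, not values from this article):
#   probe_tcp("<peer_VIP>", <port>)
```

A result of "no route" or "timed out" from both directions points at the network path (routing or firewall) rather than at the grid federation configuration itself.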
Error Logs
127.0.0.1 - - [02/Jul/2025:10:01:24 +0000] "POST /grid-federation/.../approve HTTP/1.1" 500 65 "-" "Faraday v1.1.0"
2025/07/02 10:01:21 [error] connect() failed (113: No route to host) while connecting to upstream...
1. Admin Node - Internal Server and Certificate Errors
Jul 2 09:51:22 <Node_Name> NMS: |2025-07-02T09:51:22.474| ERROR Unable to perform a Grid Federation connectivity check from <Source_IP> to <Destination_IP>
Jul 2 09:51:22 <Node_Name> NMS: |2025-07-02T09:51:22.474| ERROR Internal server error. The server encountered an error and could not complete your request. Try again. If the problem persists, contact support. Internal Server Error (MgmtApi::LocalizedRuntimeError)
...
Jul 2 09:59:30 <Node_Name> NMS: |2025-07-02T09:59:30.327| ERROR Connection test failed for connection <ID>
Jul 2 09:59:30 <Node_Name> NMS: |2025-07-02T09:59:30.327| ERROR Validation failed. No certificate exists on the destination grid. You must upload the verification file on the destination grid before you can test the connection. (MgmtApi::LocalizedValidationError)
2. Admin Node - Connection Timeout and DNS Issues
2025/07/02 09:53:05 [error] <ID>: *<Node_ID> upstream timed out (110: Connection timed out) while connecting to upstream, client: 127.0.0.1, server: _, request: "DELETE /grid-federation/<ID>/mgmt-api/grid-federation-connections/<ID> HTTP/1.1", upstream: "https://<IP>/grid-federation/<ID>/mgmt-api/grid-federation-connections/<ID>", host: "localhost:9999"
...
2025/07/02 09:53:10 [error] 1623554#1623554: *<Node_ID> [lua] grid_federation_balancer.lua:144: on_balance_phase(): DNS servers cannot resolve this domain name or Nginx cannot connect to any targets, exit while connecting to upstream, client: 127.0.0.1, server: _, request: "DELETE /grid-federation/<ID>/mgmt-api/grid-federation-connections/<ID> HTTP/1.1", upstream: "https://<IP>/grid-federation/<ID>/mgmt-api/grid-federation-connections/<ID>", host: "localhost:9999"
3. Peer Admin Node - Certificate and Routing Errors
Logs from the peer Admin Node show certificate mismatches and routing failures.
2025/07/02 09:50:19 [error] 1129487#1129487: *13309782 upstream SSL certificate does not match "<IP>" while SSL handshaking to upstream, client: 127.0.0.1, server: _, request: "POST /grid-federation/<ID>/mgmt-api/grid-federation-connections/<ID>/approve HTTP/1.1", upstream: "https://<IP>/grid-federation/<ID>/mgmt-api/grid-federation-connections/<ID>/approve", host: "localhost:9999"
...
2025/07/02 10:12:10 [error] 1173090#1173090: *13317988 connect() failed (113: No route to host) while connecting to upstream, client: 127.0.0.1, server: _, request: "POST /grid-federation/<ID>/mgmt-api/grid-federation-connections/<ID>/approve HTTP/1.1", upstream: "https://<IP>/grid-federation/dca8f799-7759-446a-a696-a0e5710740b1/mgmt-api/grid-federation-connections/dca8f799-7759-446a-a696-a0e5710740b1/approve", host: "localhost:9999"
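The excerpts above fall into a handful of recurring failure messages. When the full logs are long, a small script can count how often each one occurs. The following Python sketch assumes only the message fragments quoted in this article; the category labels and the function names `classify` and `summarize` are illustrative, not StorageGRID terminology, and the input is any iterable of log lines.

```python
import re
from collections import Counter

# Message fragments copied from the log excerpts in this article.
# The labels on the left are illustrative names chosen for this sketch.
PATTERNS = [
    ("no_route", re.compile(r"\(113: No route to host\)")),
    ("timeout", re.compile(r"\(110: Connection timed out\)")),
    ("cert_mismatch", re.compile(r"upstream SSL certificate does not match")),
    ("dns_or_no_target", re.compile(r"DNS servers cannot resolve this domain name")),
    ("no_certificate", re.compile(r"No certificate exists on the destination grid")),
]


def classify(line: str) -> str:
    """Return the label of the first pattern found in the line."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return "unclassified"


def summarize(lines) -> Counter:
    """Count how many lines fall into each category."""
    return Counter(classify(line) for line in lines)
```

Passing the lines of an exported log file to `summarize` gives a quick view of whether the failures are dominated by network reachability (no route, timeout) or by certificate problems.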