GH-44308: [C++][FS][Azure] Implement SAS token authentication #45021

Open · wants to merge 17 commits into base: main

Tom-Newton
Contributor

@Tom-Newton Tom-Newton commented Dec 13, 2024

Rationale for this change

SAS token auth is sometimes useful and it is the last authentication method we haven't implemented.

What changes are included in this PR?

  • Implement ConfigureSasCredential
  • Update AzureOptions::FromUri so that simply appending a SAS token to a blob storage URI works. e.g. AzureOptions::FromUri("abfs://[email protected]/?se=2024-12-12T18:57:47Z&sig=pAs7qEBdI6sjUhqX1nrhNAKsTY%2B1SqLxPK%2BbAxLiopw%3D&sp=racwdxylti&spr=https,http&sr=c&sv=2024-08-04")
    • SAS tokens are made up of a bunch of URI query parameters that I'm not sure we can exhaustively list.
    • Therefore we now assume that any unrecognised URI query parameters are part of a SAS token, instead of returning an error status.
  • Update CopyFile to use StartCopyFromUri instead of CopyFromUri
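
The fallback rule for query parameters can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `SplitQuery`, `SplitQueryForSas` and the `known_keys` set are hypothetical names made up for the illustration, and the sketch skips percent-decoding entirely. It only shows the idea that recognised keys are consumed as options while everything else is re-joined and treated as the SAS token.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch; not part of Arrow.
struct SplitQuery {
  // "key=value" pairs whose key is in the recognised set.
  std::vector<std::pair<std::string, std::string>> recognised;
  // Every other pair, re-joined with '&' and assumed to be a SAS token.
  std::string sas_token;
};

// Splits a raw URI query string (without the leading '?') into recognised
// options and a leftover string. Values are kept exactly as written; no
// percent-decoding is attempted.
inline SplitQuery SplitQueryForSas(const std::string& query,
                                   const std::set<std::string>& known_keys) {
  SplitQuery out;
  std::istringstream stream(query);
  std::string item;
  while (std::getline(stream, item, '&')) {
    if (item.empty()) continue;  // tolerate "a=1&&b=2"
    const auto eq = item.find('=');
    const std::string key = item.substr(0, eq);
    const std::string value =
        eq == std::string::npos ? std::string() : item.substr(eq + 1);
    if (known_keys.count(key) > 0) {
      out.recognised.emplace_back(key, value);
    } else {
      // Unrecognised: assume it belongs to the SAS token instead of failing.
      if (!out.sas_token.empty()) out.sas_token += '&';
      out.sas_token += item;
    }
  }
  return out;
}
```

With `known_keys = {"enable_tls"}` (an assumed key, chosen only for the example), the query `enable_tls=false&se=2024-12-12T18:57:47Z&sp=r&sr=c` yields one recognised option and the leftover `se=2024-12-12T18:57:47Z&sp=r&sr=c`. The trade-off is visible here too: a misspelled option name would silently end up in the leftover string rather than producing an error.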

Are these changes tested?

Yes

  • Added new tests for authenticating with SAS and doing some operations including CopyFile
  • Added new tests for AzureOptions::FromUri with a SAS token.

I also made sure to run the tests which connect to real blob storage.

Are there any user-facing changes?

  • SAS token authentication is now supported
  • Unrecognised URI query parameters are no longer rejected by AzureOptions::FromUri; they are assumed to be part of a SAS token instead of failing fast. IMO this is a regression, but it is still the best option for supporting SAS tokens.

@Tom-Newton Tom-Newton marked this pull request as ready for review December 13, 2024 17:07
// Assume these are part of a SAS token. It's not ideal to make such an assumption,
// but given that a SAS token is a complex set of URI parameters that could be
// tricky to exhaustively list, I think it's the best option.
credential_kind = CredentialKind::kSasToken;
Member

We don't have the SAS token specification that includes parameter names used by a SAS token, right?

Contributor Author

Yeah, I had a quick search and couldn't find what we need. If you think it's important I can try a bit harder. The closest I found seemed to be unabbreviated versions of what actually appears in the SAS token.

cpp/src/arrow/filesystem/azurefs.h (outdated, resolved)
cpp/src/arrow/filesystem/azurefs.cc (outdated, resolved)
cpp/src/arrow/filesystem/azurefs.cc (outdated, resolved)
@github-actions github-actions bot added the awaiting changes label and removed the awaiting review label Dec 14, 2024
@github-actions github-actions bot added the awaiting change review label and removed the awaiting changes label Dec 14, 2024
@Tom-Newton
Contributor Author

Wait... I might have just accidentally worked out how to avoid any of the special authentication stuff for copying...

@Tom-Newton Tom-Newton marked this pull request as draft December 14, 2024 12:56
@Tom-Newton Tom-Newton marked this pull request as ready for review December 14, 2024 14:22
@@ -311,6 +321,15 @@ Status AzureOptions::ConfigureAccountKeyCredential(const std::string& account_ke
return Status::OK();
}

Status AzureOptions::ConfigureSasCredential(const std::string& sas_token) {
Member

Suggested change
- Status AzureOptions::ConfigureSasCredential(const std::string& sas_token) {
+ Status AzureOptions::ConfigureSASCredential(const std::string& sas_token) {

@@ -690,6 +690,36 @@ class TestAzureOptions : public ::testing::Test {
ASSERT_EQ(options.credential_kind_, AzureOptions::CredentialKind::kEnvironment);
}

void TestFromUriCredentialSasToken() {
Member

Suggested change
- void TestFromUriCredentialSasToken() {
+ void TestFromUriCredentialSASToken() {

// We use StartCopyFromUri instead of CopyFromUri because it supports blobs larger
// than 256 MiB and it doesn't require generating a SAS token to authenticate
// reading a source blob in the same storage account.
auto copy_operation = dest_blob_client.StartCopyFromUri(src_url);
Member

Wow! I should have found it when I implemented this...

// than 256 MiB and it doesn't require generating a SAS token to authenticate
// reading a source blob in the same storage account.
auto copy_operation = dest_blob_client.StartCopyFromUri(src_url);
copy_operation.PollUntilDone(std::chrono::milliseconds(1000));
Member

Could you add a comment explaining what std::chrono::milliseconds(1000) means?
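
For context, as far as I can tell the argument to PollUntilDone in the Azure SDK for C++ is the polling period, i.e. how long the client waits between status checks on the long-running operation. The sketch below illustrates that meaning with a stand-in class; `FakeCopyOperation` is hypothetical and is not the SDK type, and it only mimics the "check, sleep for the period, check again" loop.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a long-running copy operation; NOT the Azure SDK
// type. It reports "done" after a fixed number of status checks.
class FakeCopyOperation {
 public:
  explicit FakeCopyOperation(int checks_needed) : checks_left_(checks_needed) {}

  // Checks the status, and if the operation is not done yet, sleeps for
  // `period` before checking again. Returns how many status checks were made.
  int PollUntilDone(std::chrono::milliseconds period) {
    assert(period.count() >= 0);
    int checks = 0;
    while (true) {
      ++checks;
      if (--checks_left_ <= 0) return checks;  // operation finished
      std::this_thread::sleep_for(period);     // wait one period, then re-check
    }
  }

 private:
  int checks_left_;
};
```

Under that reading, a smaller period detects completion sooner at the cost of more status requests, so a named constant with a short comment at the call site would probably answer the question.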

@github-actions github-actions bot removed the awaiting change review label Dec 15, 2024
@github-actions github-actions bot added the awaiting changes label Dec 15, 2024