Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

curvefs/client: local cache policy optimization #2064

Merged
merged 1 commit into from
Nov 22, 2022

Conversation

Tangruilin
Copy link
Contributor

@Tangruilin Tangruilin commented Nov 12, 2022

What problem does this PR solve?

Issue Number: #2025
Issue Number: #1280
Problem Summary:

What is changed and how it works?

What's Changed:
add use ReadCache and some Related function.

How it Works:
Add useReadCache to decide if use read cache. If do, write to the read cache and update to s3 async.

Side effects(Breaking backward compatibility? Performance regression?):

Check List

  • Relevant documentation/comments is changed or added
  • I acknowledge that all my contributions will be made under the project's license

EXPECT_CALL(*diskCacheManager_, IsDiskUsedInited()).WillOnce(Return(true));
EXPECT_CALL(*diskCacheManager_, IsDiskCacheFull()).WillOnce(Return(false));
EXPECT_CALL(*diskCacheManager_, WriteReadDirect(_, _, _))
.WillOnce(Return(context->bufferSize));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should have EXPECT_CALL(*diskCacheRead_, WriteDiskFile(_, _, _)).WillOnce(Return(context->bufferSize)); here, but the test will fail with it(It seems not enter the func which will be done ).

@zhanghuidinah
Copy link
Member

Thanks for your contribution!

  • use git commit -s -a **** to satisfy DCO requirement
    image

if (!useDiskCache) {
s3ClientAdaptor_->GetS3Client()->UploadAsync(*iter);
} else {
if (useDiskCache) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we add useReadCache, maybe we need a new name of useDiskCache?

@@ -56,13 +56,37 @@ int DiskCacheManagerImpl::Init(const S3ClientAdaptorOption option) {
}

void DiskCacheManagerImpl::Enqueue(
std::shared_ptr<PutObjectAsyncContext> context) {
std::shared_ptr<PutObjectAsyncContext> context, bool isReadWrite) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isReadWrite may cause confusion? isReadCacheOnly may be better?

int DiskCacheManagerImpl::WriteReadDirectClosure(
std::shared_ptr<PutObjectAsyncContext> context) {
VLOG(9) << "WriteReadClosure start, name: " << context->key;
int ret = WriteReadDirect(context->key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The write may have to be put after the s3 upload is successful, otherwise if the write is successful but the client exits before the upload is successful, the data will be inconsistent

@Tangruilin Tangruilin force-pushed the feature-#2025 branch 2 times, most recently from e1ae703 to fdd2804 Compare November 14, 2022 05:19
std::shared_ptr<PutObjectAsyncContext> context) {
VLOG(9) << "WriteReadClosure start, name: " << context->key;
// Upload to s3
int s3Ret = client_->Upload(context->key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

synchronous upload may affects performance, maybe we can put the enqueue(WriteReadDirect) to PutObjectAsyncCallBack which is defined in DataCache::Flush?

and i think we don't need to care if the cache read disk is successfully written

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah,I will do this

@wuhongsong wuhongsong changed the title [Issue #2025] local cache curvefs/client: local cache policy optimization Nov 14, 2022
@wuhongsong
Copy link
Contributor

cpplint isd failed
image

image

@Tangruilin

context->startTime = butil::cpuwide_time_us();

S3ClientAdaptorOption s3AdaptorOption;
s3AdaptorOption.diskCacheOpt.threads = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit test is failed because code modification not covered, I guess it should be set here(s3AdaptorOption.diskCacheOpt.diskCache.diskCacheType=1)

@Tangruilin Tangruilin force-pushed the feature-#2025 branch 2 times, most recently from c6e55af to dce170e Compare November 15, 2022 12:27
Copy link
Contributor

@wuhongsong wuhongsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job, LGTM

@ilixiaocui ilixiaocui self-requested a review November 16, 2022 03:00
@ilixiaocui
Copy link
Contributor

Please synthesize a commit to submit.

image

@Tangruilin
Copy link
Contributor Author

Please synthesize a commit to submit.

image

@ilixiaocui done

return;
}
LOG(WARNING) << "Put object failed, key: " << context->key;
s3ClientAdaptor_->GetS3Client()->UploadAsync(context);
};

std::vector<std::shared_ptr<PutObjectAsyncContext>> uploadTasks;
bool useDiskCache =
bool useReadWriteCache =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

useReadCacheOnlyuseReadWriteCache
Is it possible to add an enum, like this:

enum CachePolicy {
    NCache,
    RCache,
    WRCache,
}

To express caching strategy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good idea.

So how about now?

Copy link
Contributor

@Cyber-SiKu Cyber-SiKu Nov 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good idea.

So how about now?

IMO, the best way is bit operation, that is, 0 means NCache, 1 means RCache, 2 means WCache, and 3 means WRcache. Bitwise AND to calculate whether to open the corresponding operation.

enum CachePolicy {
    NCache = 0,
    RCache = 1<<0,
    WCache = 1<<1,
    WRCache=RCache|WCache,
};

Whether to enable read cache:

if (cachePoily & CachePolicy::RCache) {
    ....
}

But this will lose the readability of the code.
just an opinion

cachePoily & CachePolicy::RCache is not a good expression for enum class, and it will really lose the readability as you say. I think readability can be more important in function Flush.
Also, thanks for your suggestion, it's really a beautiful expression, I learn a lot

@Tangruilin
Copy link
Contributor Author

If this looks ok, I will rebase and commit again
@wuhongsong @ilixiaocui @Cyber-SiKu

@wuhongsong
Copy link
Contributor

recheck

@@ -2226,13 +2231,20 @@ CURVEFS_ERROR DataCache::Flush(uint64_t inodeId, bool toS3) {
<< ",Len:" << tmpLen << ",blockPos:" << blockPos
<< ",blockIndex:" << blockIndex;

bool useReadCacheOnly =
s3ClientAdaptor_->IsReadCache() &&
if ( s3ClientAdaptor_->IsReadCache() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool mayCache = s3ClientAdaptor_->HasDiskCache() && 
!s3ClientAdaptor_->GetDiskCacheManager()->IsDiskCacheFull() &&  !toS3 ;
if ( mayCache) {
      cachePoily = IsReadWriteCache()? cachePoily::WRCache:cachePoily::RCache;
} else {
      cachePoily = cachePoily::NCache;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool mayCache = s3ClientAdaptor_->HasDiskCache() && 
!s3ClientAdaptor_->GetDiskCacheManager()->IsDiskCacheFull() &&  !toS3 ;
if ( mayCache) {
      cachePoily = IsReadWriteCache()? cachePoily::WRCache:cachePoily::RCache;
} else {
      cachePoily = cachePoily::NCache;
}

I modify the code as this, and not the ut will fail, which is odd

@@ -2243,17 +2262,19 @@ CURVEFS_ERROR DataCache::Flush(uint64_t inodeId, bool toS3) {
}
VLOG(9) << "PutObjectAsyncCallBack: " << context->key
<< " pendingReq is: " << pendingReq;
if (cachePoily::RCache == cachePoily) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cachePoily::RCache == cachePoily || cachePoily::WRCache == cachePoily ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only ReadCache,WRCache will do s3ClientAdaptor_->GetDiskCacheManager()->Enqueue(*iter);,in which it will wirte to WCache before uploading then doing upload async.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only ReadCache,WRCache will do s3ClientAdaptor_->GetDiskCacheManager()->Enqueue(*iter);,in which it will wirte to WCache before uploading then doing upload async.

ok

return;
}
LOG(WARNING) << "Put object failed, key: " << context->key;
s3ClientAdaptor_->GetS3Client()->UploadAsync(context);
};

std::vector<std::shared_ptr<PutObjectAsyncContext>> uploadTasks;
bool useDiskCache =
bool useReadWriteCache =
Copy link
Contributor

@Cyber-SiKu Cyber-SiKu Nov 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good idea.

So how about now?

IMO, the best way is bit operation, that is, 0 means NCache, 1 means RCache, 2 means WCache, and 3 means WRcache. Bitwise AND to calculate whether to open the corresponding operation.

enum CachePolicy {
    NCache = 0,
    RCache = 1<<0,
    WCache = 1<<1,
    WRCache=RCache|WCache,
};

Whether to enable read cache:

if (cachePoily & CachePolicy::RCache) {
    ....
}

But this will lose the readability of the code.
just an opinion

cachePoily & CachePolicy::RCache is not a good expression for enum class, and it will really lose the readability as you say. I think readability can be more important in function Flush.
Also, thanks for your suggestion, it's really a beautiful expression, I learn a lot

@wuhongsong
Copy link
Contributor

recheck

@wuhongsong
Copy link
Contributor

If this looks ok, I will rebase and commit again @wuhongsong @ilixiaocui @Cyber-SiKu

I have no further questions and have reviewed +1

@wuhongsong
Copy link
Contributor

recheck

Copy link
Contributor

@Cyber-SiKu Cyber-SiKu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@wuhongsong
Copy link
Contributor

unittest has failed @Tangruilin

@Tangruilin Tangruilin force-pushed the feature-#2025 branch 2 times, most recently from 796eac3 to e198f65 Compare November 18, 2022 08:34
@Tangruilin
Copy link
Contributor Author

recheck

@Tangruilin
Copy link
Contributor Author

unittest has failed @Tangruilin

It seems there are some problem, I'll slove it these days, I have try three time and it failed for two different reasons

@Tangruilin
Copy link
Contributor Author

recheck

@Tangruilin
Copy link
Contributor Author

recheck

1 similar comment
@Cyber-SiKu
Copy link
Contributor

recheck

@Tangruilin
Copy link
Contributor Author

recheck

@Tangruilin Tangruilin force-pushed the feature-#2025 branch 4 times, most recently from 9ad3e07 to dc59822 Compare November 21, 2022 13:15
@wuhongsong
Copy link
Contributor

recheck

@ilixiaocui ilixiaocui merged commit 7d9dd88 into opencurve:master Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants