Use Scylla API for restore #4192
base: ml/scylla-api
Conversation
Force-pushed from 3475e2f to c735fb2
Force-pushed from 55cb381 to a572b27
We need to extend run progress with the Scylla task ID. Moreover, it needs to be a part of the clustering key. Otherwise, we won't be able to store multiple run progresses with the same remote sstable dir and host in the DB. That's why we need to re-create the table. It also allows us to rename manifest path and versioned progress to more suitable names. The downside is that users will lose their restore progress history on upgrade, but this shouldn't harm anybody if communicated properly.
It's good to have it specified in our codebase. E.g. the knowledge about the TOC component will be necessary for native Scylla restore.
This commit extends the SSTable structure with its TOC component, which is needed when using the Scylla restore API. Moreover, it introduces batch types, which are needed for deciding whether a given batch can be restored with the Scylla restore API or the Rclone API. It also makes sure that all SSTables within the same batch belong to the same batch type.
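The invariant from this commit (one batch type per batch) can be sketched as a simple grouping step. All names below (`batchType`, `sstable`, `groupByBatchType`) are illustrative stand-ins, not the actual scylla-manager identifiers:

```go
package main

import "fmt"

// batchType says which restore path a batch can take.
// Hypothetical names for illustration only.
type batchType string

const (
	nativeRestore batchType = "native" // restorable via the Scylla restore API
	rcloneRestore batchType = "rclone" // falls back to the Rclone API
)

type sstable struct {
	ID  string
	Typ batchType
}

// groupByBatchType splits SSTables so that every resulting group contains
// SSTables of a single batch type, preserving the stated invariant.
func groupByBatchType(ssts []sstable) map[batchType][]sstable {
	out := make(map[batchType][]sstable)
	for _, s := range ssts {
		out[s.Typ] = append(out[s.Typ], s)
	}
	return out
}

func main() {
	ssts := []sstable{{"a", nativeRestore}, {"b", rcloneRestore}, {"c", nativeRestore}}
	groups := groupByBatchType(ssts)
	fmt.Println(len(groups[nativeRestore]), len(groups[rcloneRestore])) // 2 1
}
```

Batches built from such groups can then be routed per group to either API without mixing restore mechanisms within one batch.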
It's better to do it at the start so that:
- we don't need to do it per restore type
- we don't create and immediately return the batch
With native Scylla restore, we can gradually update restored bytes progress.
This commit adds new metrics:
- restored_bytes
- restore_duration
which are counterparts of the similar downloaded and streamed metrics. It also adds a new restore state (RestoreStateNativeRestore) which describes the usage of native Scylla restore.
Force-pushed from a572b27 to da7a59b
This commit makes TestRestoreTablesPreparationIntegration check for the proper API used for restore.
ALTER TABLE backup_run_progress ADD scylla_task_id text;

DROP TABLE restore_run_progress;
As an alternative to dropping progress history, we can try to do something like:
- create a temp table
- copy data from restore_run_progress to the temp table
- drop restore_run_progress
- create restore_run_progress with the new primary key
- copy data from the temp table to restore_run_progress
- drop the temp table
It's additional work, but then we can keep user history and save some time trying to communicate why we need to drop the table :)
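The steps above could look roughly like the following CQL/cqlsh sketch. Only `scylla_task_id` comes from the diff; all other column names and the primary-key layout are hypothetical, and the real migration would live in application code (e.g. the pkg/schema/migrate package) rather than a cqlsh session:

```cql
-- 1. Create a temp table with the old schema (columns are illustrative).
CREATE TABLE restore_run_progress_tmp (
    cluster_id uuid,
    task_id uuid,
    run_id timeuuid,
    remote_sstable_dir text,
    host text,
    scylla_task_id text,
    PRIMARY KEY ((cluster_id, task_id, run_id), remote_sstable_dir, host)
);

-- 2. Copy existing rows into the temp table (cqlsh COPY shown;
--    application code would iterate over rows instead).
COPY restore_run_progress TO 'progress.csv';
COPY restore_run_progress_tmp FROM 'progress.csv';

-- 3. Recreate the table with scylla_task_id added to the clustering key.
DROP TABLE restore_run_progress;
CREATE TABLE restore_run_progress (
    cluster_id uuid,
    task_id uuid,
    run_id timeuuid,
    remote_sstable_dir text,
    host text,
    scylla_task_id text,
    PRIMARY KEY ((cluster_id, task_id, run_id), remote_sstable_dir, host, scylla_task_id)
);

-- 4. Copy rows back and drop the temp table.
COPY restore_run_progress_tmp TO 'progress_back.csv';
COPY restore_run_progress FROM 'progress_back.csv';
DROP TABLE restore_run_progress_tmp;
```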
If it was a backup/repair task, I would try harder to preserve task history, but I thought that dropping restore task history is fine for the users. Anyway, I will try to preserve it.
We even have a pkg/schema/migrate package. I'm not familiar with it, but it has some functions for rewriting the table, so it should be rather easy!
)

// batchDispatcher is a tool for batching SSTables from
// Workload across different hosts during restore.
// It follows a few rules:
//
// - all SSTables within a batch have the same have the same batchType
typo
// ValidateAllDispatched returns error if not all SSTables were dispatched.
func (bd *batchDispatcher) ValidateAllDispatched() error {
	bd.mu.Lock()
	defer bd.mu.Unlock()

	for i, rdp := range bd.workloadProgress.remoteDir {
		if rdp.RemainingSize != 0 || len(rdp.RemainingSSTables) != 0 {
			failed := rdp.RemainingSize != 0
			for _, ssts := range rdp.RemainingSSTables {
If I understand correctly, we can have a situation when RemainingSize is 0, but there are still some RemainingSSTables? How can this happen?
It shouldn't. I just wanted to keep checking both the size and the sstables count, to be on the safe side (as it was done before).
To be honest, checking size seems less robust (maybe a 0-size sstable is not realistic, but perhaps we might encounter some bug where Rclone reports 0 size by mistake).
I think that we can get rid of relying on the remaining size and use the sstable count instead.
What do you think of this idea?
Yes, I think either one of these will work. If checking the sstables count is safer, then we can use only it.
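A count-based check, as discussed, could look like the following minimal sketch. `remoteDirProgress` and its field here are simplified stand-ins for the actual batchDispatcher internals:

```go
package main

import "fmt"

// remoteDirProgress is a simplified stand-in for the per-remote-dir
// progress tracked by the batchDispatcher.
type remoteDirProgress struct {
	RemainingSSTables []string
}

// validateAllDispatched reports an error if any remote dir still has
// undispatched SSTables, relying only on the sstable count (not size).
func validateAllDispatched(dirs []remoteDirProgress) error {
	for i, rdp := range dirs {
		if n := len(rdp.RemainingSSTables); n != 0 {
			return fmt.Errorf("remote dir %d: %d SSTables were not dispatched", i, n)
		}
	}
	return nil
}

func main() {
	// Fully dispatched: no error.
	fmt.Println(validateAllDispatched([]remoteDirProgress{{}}))
	// One SSTable left behind: error.
	fmt.Println(validateAllDispatched([]remoteDirProgress{
		{RemainingSSTables: []string{"me-1-big-Data.db"}},
	}))
}
```

The count check also covers the hypothetical 0-size-sstable case mentioned above, since an undispatched SSTable is detected regardless of the size Rclone reported for it.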
Thanks @Michal-Leszczynski!
I have no comments.
I see that the latest scylla-enterprise:nightly-build supports the backup/restore API and it's tested against all integration tests, which is very good.
@@ -31,6 +33,8 @@ func NewRestoreMetrics() RestoreMetrics {
	remainingBytes: g("Remaining bytes of backup to be restored yet.", "remaining_bytes",
		"cluster", "snapshot_tag", "location", "dc", "node", "keyspace", "table"),
	state: g("Defines current state of the restore process (idle/download/load/error).", "state", "cluster", "location", "snapshot_tag", "host"),
	restoredBytes: g("Restored bytes", "restored_bytes", "cluster", "host"),
	restoreDuration: g("Restore duration in ms", "restore_duration", "cluster", "host"),
What do you need or want to use this metric for? What is its purpose?
w.metrics.IncreaseRestoreStreamDuration(w.run.ClusterID, pr.Host, timeSub(pr.RestoreStartedAt, pr.RestoreCompletedAt, timeutc.Now()))
w.metrics.IncreaseRestoreStreamDuration(w.run.ClusterID, pr.Host, timeSub(pr.DownloadCompletedAt, pr.RestoreCompletedAt, timeutc.Now()))
w.metrics.IncreaseRestoredBytes(w.run.ClusterID, pr.Host, b.Size)
w.metrics.IncreaseRestoreDuration(w.run.ClusterID, pr.Host, timeSub(pr.RestoreStartedAt, pr.RestoreCompletedAt, timeutc.Now()))
pr *RunProgress defines the progress of a single batch (set of SSTables), right?
w.metrics.IncreaseRestoreDuration(w.run.ClusterID, pr.Host, timeSub(pr.RestoreStartedAt, pr.RestoreCompletedAt, timeutc.Now()))
increases the duration after every batch completion, but the context of the metric itself is the restore task executed on a given node.
Do you want to keep this metric to measure the time a single node spent on the exact restore (downloading + l&s)?
That's fine, but what information does it provide at the end? Do you want to show the "real-time" bandwidth per node?
Right now, Scylla Manager shows the bandwidth when sctool is called. What is the benefit of having it in metrics as well?
On the other hand, it doesn't harm anyone...
OK, actually it makes sense to have the possibility of monitoring the ongoing restore without the need for calling sctool.
And it's done the same way for the Rclone restore.
EDIT: I think it was added for "debug" reasons, to see how much time a node spent on downloading and how much on streaming, right? Nevertheless, it's completely fine to keep them.
)

// batchDispatcher is a tool for batching SSTables from
// Workload across different hosts during restore.
// It follows a few rules:
//
// - all SSTables within a batch have the same have the same batchType
Suggested change:
- // - all SSTables within a batch have the same have the same batchType
+ // - all SSTables within a batch have the same batchType
This PR starts using the native Scylla restore API. The main focus points are as follows: it adds restored_bytes and restore_duration metrics as a counterpart to the already existing download/stream metrics. See 63d029a for more info.
Fixes #4144
Fixes #4137
Fixes #4142
Fixes #4194