Use Scylla API for restore #4192

Open · wants to merge 14 commits into ml/scylla-api from ml/restore-scylla-api
Conversation

@Michal-Leszczynski (Collaborator) commented Jan 8, 2025

This PR starts using the native Scylla restore API. The main focus points are as follows:

  1. We needed to drop and re-create the restore run progress table, because we needed to include the Scylla task ID as a clustering key. We can't reuse the agent job ID field for this purpose, because it's an integer, while the Scylla task ID is a string. Changing the clustering key via an ALTER statement is not allowed, which is why the table was re-created. This will need to be clearly communicated in the release notes. See 39011de for more info.
  2. This PR introduces batch types. They aim to simplify the decision of which API should be used for restoring a given batch, as the integer and versioned SSTables are still restored with the Rclone API. See 2efd8b4 for more info.
  3. This PR also adds new restore metrics: restored_bytes and restore_duration, as counterparts to the already existing download/stream metrics. See 63d029a for more info.
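To illustrate point 1: CQL does not allow changing a table's primary key with ALTER TABLE, so the only option is a drop-and-recreate. A rough sketch of the change (column names are illustrative, not the actual Scylla Manager schema):

```cql
-- Not allowed: the clustering key cannot be changed in place.
-- ALTER TABLE restore_run_progress ...;

DROP TABLE restore_run_progress;

CREATE TABLE restore_run_progress (
    cluster_id uuid,
    task_id uuid,
    run_id timeuuid,
    remote_sstable_dir text,
    host text,
    scylla_task_id text,  -- a string, so the integer agent job ID field cannot be reused
    PRIMARY KEY ((cluster_id, task_id, run_id), remote_sstable_dir, host, scylla_task_id)
);
```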

Fixes #4144
Fixes #4137
Fixes #4142
Fixes #4194

@Michal-Leszczynski force-pushed the ml/restore-scylla-api branch 4 times, most recently from 3475e2f to c735fb2 on January 9, 2025 16:18
@Michal-Leszczynski force-pushed the ml/restore-scylla-api branch 2 times, most recently from 55cb381 to a572b27 on January 30, 2025 08:19
We need to extend run progress with the Scylla task ID.
Moreover, we need it to be a part of the clustering key.
Otherwise, we won't be able to store multiple run progresses
with the same remote SSTable dir and host in the DB.
That's why we need to re-create the table.
It also allows us to rename manifest path and versioned
progress to more suitable names.
The downside is that users will lose their restore progress
history on upgrade, but this shouldn't harm anybody
if communicated properly.

It's good to have it specified in our codebase.
E.g. the knowledge about the TOC component will
be necessary for native Scylla restore.

This commit extends the SSTable structure with
its TOC component, which is needed when using
the Scylla restore API.
Moreover, it introduces batch types, which are
also needed for deciding whether a given batch
can be restored with the Scylla restore API or
the Rclone API.
It also makes sure that all SSTables within
the same batch belong to the same batch type.

It's better to do it at the start so that:
- we don't need to do it per restore type
- we don't create and immediately return the batch
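The batch-type decision described above could be sketched roughly like this (a minimal illustration; the type, field, and function names are hypothetical, not the actual Scylla Manager code — the only assumption taken from the PR is that versioned SSTables still go through Rclone):

```go
package main

import "fmt"

// batchType decides which API restores a given batch.
type batchType int

const (
	batchTypeScyllaAPI batchType = iota // restorable with the native Scylla restore API
	batchTypeRcloneAPI                  // must still be downloaded via the Rclone API
)

// sstable is an illustrative stand-in for the dispatcher's SSTable struct.
type sstable struct {
	ID        string
	Versioned bool // versioned SSTables are still restored with Rclone
}

func typeOf(s sstable) batchType {
	if s.Versioned {
		return batchTypeRcloneAPI
	}
	return batchTypeScyllaAPI
}

// sameType reports whether all SSTables in a batch share one batch type,
// which the dispatcher guarantees by construction.
func sameType(batch []sstable) bool {
	if len(batch) == 0 {
		return true
	}
	for _, s := range batch {
		if typeOf(s) != typeOf(batch[0]) {
			return false
		}
	}
	return true
}

func main() {
	batch := []sstable{{ID: "me-1-big"}, {ID: "me-2-big"}}
	fmt.Println(sameType(batch)) // true
}
```

Deciding the type per batch up front keeps the restore path a simple switch over batchType instead of per-SSTable checks.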

This commit adds code for using the Scylla restore API.
Luckily for us, handling pause/resume
is analogous to the Rclone API handling.

Fixes #4144
Fixes #4137
With native Scylla restore, we can gradually update
the restored bytes progress.
This commit adds new metrics:
- restored_bytes
- restore_duration
which are counterparts of the similar
downloaded and streamed metrics.

It also adds a new restore state (RestoreStateNativeRestore)
which describes the usage of native Scylla restore.

This commit makes TestRestoreTablesPreparationIntegration
check that the proper API is used for restore.
ALTER TABLE backup_run_progress ADD scylla_task_id text;

DROP TABLE restore_run_progress;
Collaborator
As an alternative to dropping progress history we can try to do smth like:

  1. create temp table
  2. copy data from restore_run_progress to temp table
  3. drop restore_run_progress
  4. create restore_run_progress with new primary key
  5. copy data from temp table to restore_run_progress
  6. drop temp table

It's additional work, but then we can keep user history and save some time trying to communicate why we need to drop the table :)
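The schema part of the steps above could look roughly like this (illustrative sketch; note that CQL has no INSERT ... SELECT, so steps 2 and 5 — copying the rows — have to be done application-side by paging through the old table and re-inserting):

```cql
-- 1. create a temp table with the old schema
CREATE TABLE restore_run_progress_tmp (...);  -- same columns as restore_run_progress

-- 2. copy rows to the temp table application-side (CQL has no INSERT ... SELECT)

-- 3. + 4. re-create the table with the new primary key
DROP TABLE restore_run_progress;
CREATE TABLE restore_run_progress (
    ...,
    scylla_task_id text,
    PRIMARY KEY ((cluster_id, task_id, run_id), remote_sstable_dir, host, scylla_task_id)
);

-- 5. copy rows back application-side, then 6. drop the temp table
DROP TABLE restore_run_progress_tmp;
```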

Collaborator Author

If it was a backup/repair task, I would try harder to preserve task history, but I thought that dropping restore task history is fine for the users. Anyway, I will try to preserve it.

We even have some pkg/schema/migrate pkg.
I'm not familiar with it, but it has some functions for rewriting the table, so it should be rather easy!

)

// batchDispatcher is a tool for batching SSTables from
// Workload across different hosts during restore.
// It follows a few rules:
//
// - all SSTables within a batch have the same have the same batchType
Collaborator

typo

// ValidateAllDispatched returns error if not all SSTables were dispatched.
func (bd *batchDispatcher) ValidateAllDispatched() error {
	bd.mu.Lock()
	defer bd.mu.Unlock()

	for i, rdp := range bd.workloadProgress.remoteDir {
		if rdp.RemainingSize != 0 || len(rdp.RemainingSSTables) != 0 {
			failed := rdp.RemainingSize != 0
			for _, ssts := range rdp.RemainingSSTables {
Collaborator

If I understand correctly, we can have a situation where RemainingSize is 0, but there are still some RemainingSSTables? How can this happen?

Collaborator Author

It shouldn't. I just wanted to keep checking both the size and the SSTable count, to be on the safe side (as it was done before).

To be honest, checking size seems less robust (maybe a 0-size SSTable is not realistic, but perhaps we might encounter some bug where Rclone reports 0 size by mistake).

I think we can get rid of relying on the remaining size and use the SSTable count instead.
What do you think about this idea?

Collaborator
@VAveryanov8 VAveryanov8 commented Feb 6, 2025

Yes, I think either one of these will work. If checking the SSTable count is safer, then we can use only it.
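The idea discussed above — relying only on the SSTable count instead of also cross-checking the remaining byte size — could look like this minimal Go sketch (the types and names are illustrative, not the actual Scylla Manager batch dispatcher):

```go
package main

import "fmt"

// dirProgress is an illustrative stand-in for the per-remote-dir
// progress tracked by the batch dispatcher.
type dirProgress struct {
	RemainingSSTables []string
}

// validateAllDispatched relies only on the SSTable count: a dir is fully
// dispatched exactly when no SSTables remain, regardless of reported sizes.
func validateAllDispatched(dirs map[string]dirProgress) error {
	for dir, dp := range dirs {
		if n := len(dp.RemainingSSTables); n != 0 {
			return fmt.Errorf("%d SSTables from %q were not dispatched", n, dir)
		}
	}
	return nil
}

func main() {
	dirs := map[string]dirProgress{
		"s3://backup/node1": {},
		"s3://backup/node2": {RemainingSSTables: []string{"me-3-big-Data.db"}},
	}
	fmt.Println(validateAllDispatched(dirs) != nil) // true: node2 has leftovers
}
```

Counting SSTables sidesteps the hypothetical bug mentioned above where Rclone could report a 0 size by mistake.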

Collaborator
@karol-kokoszka karol-kokoszka left a comment

Thanks @Michal-Leszczynski !
I have no comments.

I see that the latest scylla-enterprise:nightly-build supports the backup/restore API and it's tested against all integration tests, which is very good.

@@ -31,6 +33,8 @@ func NewRestoreMetrics() RestoreMetrics {
remainingBytes: g("Remaining bytes of backup to be restored yet.", "remaining_bytes",
"cluster", "snapshot_tag", "location", "dc", "node", "keyspace", "table"),
state: g("Defines current state of the restore process (idle/download/load/error).", "state", "cluster", "location", "snapshot_tag", "host"),
restoredBytes: g("Restored bytes", "restored_bytes", "cluster", "host"),
restoreDuration: g("Restore duration in ms", "restore_duration", "cluster", "host"),
Collaborator

What do you need or want to use this metric for? What is the purpose of it?

w.metrics.IncreaseRestoreStreamDuration(w.run.ClusterID, pr.Host, timeSub(pr.RestoreStartedAt, pr.RestoreCompletedAt, timeutc.Now()))
w.metrics.IncreaseRestoreStreamDuration(w.run.ClusterID, pr.Host, timeSub(pr.DownloadCompletedAt, pr.RestoreCompletedAt, timeutc.Now()))
w.metrics.IncreaseRestoredBytes(w.run.ClusterID, pr.Host, b.Size)
w.metrics.IncreaseRestoreDuration(w.run.ClusterID, pr.Host, timeSub(pr.RestoreStartedAt, pr.RestoreCompletedAt, timeutc.Now()))
Collaborator

pr *RunProgress defines the progress of a single batch (set of SSTables), right?

w.metrics.IncreaseRestoreDuration(w.run.ClusterID, pr.Host, timeSub(pr.RestoreStartedAt, pr.RestoreCompletedAt, timeutc.Now()))

This increases the duration after every batch completion, but the context of the metric itself is the restore task executed on a given node.

Do you want to keep this metric to measure the time a single node spent on the exact restore (downloading + l&s)?

That's fine, but what information does it provide in the end? Do you want to show the "real-time" bandwidth per node?
Right now, Scylla Manager shows the bandwidth when sctool is called. What is the benefit of having it in metrics as well?

On the other hand, it doesn't harm anyone...

OK, actually it makes sense to have the possibility of monitoring the ongoing restore without the need to call sctool.

And it's done the same way for the Rclone restore.

EDIT: I think it was added for "debug" reasons. To see how much time a node spent on downloading and how much on streaming, right? Nevertheless, it's completely fine to keep them.

)

// batchDispatcher is a tool for batching SSTables from
// Workload across different hosts during restore.
// It follows a few rules:
//
// - all SSTables within a batch have the same have the same batchType
Collaborator

Suggested change
// - all SSTables within a batch have the same have the same batchType
// - all SSTables within a batch have the same batchType
