-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[no sq] Split blockpos into 3 different columns in sqlite3 map database #15768
base: master
Are you sure you want to change the base?
Conversation
Note that this index will only allow SQLite to speed up the following (parts of) a query:
That is, a range query can essentially only be done along a single axis, and all preceding axes must be equality comparisons (or SQLite will effectively have to loop over all possible values for each), in the order the keys appear in the index. |
We can add more indexing if needed tho. Edit: we can however influence the order of the default index. (X, Z, Y) sounds most useful so I'll change that. |
While you are at it, we could also split out a block's timestamp into a separate column. |
If all our databases support z-order we can do approximate spatial indexing that way. (It's not hard to add it via plpgsql to Postgres for example, and then index on a function.) On the whole I would expect this change to slow things, but that's just a gut feeling. |
Overall I'm in favour of this PR, as it will significantly improve performance of my map analysis scripts, in particularly RocketLib Toolkit which includes many functions for iterating mapblocks in certain areas. https://bitbucket.org/sorcerykid/rocketlib/src/master/ Currently, I have to resort to generating a "Cache" database that just consists of a cross-reference table for indexing Block Position (X, Y, Z) to Block ID. This process can be extremely time-consuming, particularly on a map.sqlite that is several gigabytes. It would be so nice to finally eliminate that step entirely. My map viewer application also makes extensive use of position queries for rendering and navigating a density map along the X, Y, or Z axis. Currently, I have to scan the entire database and manually total up the selected mapblocks. But with separate columns for Block Position, this could be streamlined. |
Would make sense to do at once, but extracting out the timestamp has a whole other bunch of different implications.
I couldn't find much info on native support in databases for this but also I don't really see any use case. |
Can you point to where this happens? I'm genuinely curious, because I was under the impression mapblocks were only written back to the database if a flag was set for one or more changes (aside from just a new timestamp). After all, if a mapblock is already loaded in memory, then I would think the timestamp is also in memory. So It seems the solution is simply not to write the mapblock to the database if only the timestamp has changed. |
luanti/src/serverenvironment.cpp Lines 1434 to 1448 in dd0070a
Note that this only marks the block to be written when unloaded, not immediately. Someone could go and test how many byte of disk writes it would save if we could update the timestamp independently, for a typical SP or MP session. |
Created two worlds with same seed et al. Seen no measurable degradation in load speed. Good! |
Just rememberd f1349be :), before we marked every loaded block dirty. Now it's just unloaded active blocks. |
Now that I am looking a bit more, check this one: https://www.sqlite.org/lang_createtable.html#rowid, especially:
It seems all this time we had declared the sqlite table such that it had two INT keys. (Because we used INT instead of INTEGER to declared the PK). And when I change that INTEGER and create a new world (with same seed) I see that pure DB load time of a block goes from 6-7us to about 5-6us. In the scheme of things it won't make a visible difference, though. |
Wow. I can't say I'm a fan of all the subtle behaviors SQLite has (often justified with backwards compatibility). |
More of the sqlite quirkiness... According to this https://www.sqlite.org/rowidtable.html
So:
Is the same as:
And if so, we can replace the implicit rowid column with our pos column and maintain as before:
This way we get the benefit of the x, y, z columns and backwards compatibility. Need to be tested of course! And according to this: https://www.sqlite.org/lang_createtable.html it's identical to:
This also makes it clear that by adding the x, y, and z columns we essentially have just added an index. |
I'm sure that could be considered an elegant solution (by some measure), but I really don't want to make the "is this a new format or old format map" decision on a row-basis. It just complicates things for everyone and also literally breaks range queries. |
Looked at some DB sizes. Sames seed, and camera position and direction. viewing_range = 1000, mostly an open scene with some ocean. Loads about 8000 blocks.
sqlite seems to do well with this change. So while I'd prefer the old format (with the extra useless index removed for 20% space saving), I guess I'll be the only one. At least this one doesn't increase the size, and performance is on par as far as I can tell. |
This changes the map database schema for sqlite to be (x, y, z, data) instead of (pos, data).
Advantages:
Disadvantages:
To do
This PR is a Work in Progress
How to test