Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort lua components during serialization #6070

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Gliese852
Copy link
Contributor

In the serialization of the Lua objects there is a system of "one definition", when all subsequent mention of the object only refer to it. It is vital that the full definition meets before the rest of the refs to the object, otherwise an error occurs.

The same objects can now meet in different lua components, and this means that we must ensure the order of processing procedure during saving and loading.

fixes #6064

In the serialization of the Lua objects there is a system of "one
definition", when all subsequent mention of the object only refer to it.
It is vital that the full definition meets before the rest of the refs
to the object, otherwise an error occurs.

The same objects can now meet in different lua components, and this
means that we must ensure the order of processing procedure during
saving and loading.

fixes pioneerspacesim#6064
@Gliese852
Copy link
Contributor Author

Gliese852 commented Feb 13, 2025

So I looked again and now I saw some promise that the values inside the json object are sorted:

pioneer/contrib/json/json.hpp

Lines 11543 to 11546 in 89d503a

- Internally, name/value pairs are stored in lexicographical order of the
names. Objects will also be serialized (see @ref dump) in this order.
For instance, `{"b": 1, "a": 2}` and `{"a": 2, "b": 1}` will be stored
and serialized as `{"a": 2, "b": 1}`.

I wonder, can we trust this and not even use std::set? On the one hand, this is put into the docs, on the other hand, there is the word "Internally".

@markdascher
Copy link

I like the explicit sort, just in case that library sorts things slightly differently than std::set

@sturnclaw
Copy link
Member

So I looked again and now I saw some promise that the values inside the json object are sorted:

pioneer/contrib/json/json.hpp

Lines 11543 to 11546 in 89d503a

- Internally, name/value pairs are stored in lexicographical order of the
names. Objects will also be serialized (see @ref dump) in this order.
For instance, `{"b": 1, "a": 2}` and `{"a": 2, "b": 1}` will be stored
and serialized as `{"a": 2, "b": 1}`.

I wonder, can we trust this and not even use std::set? On the one hand, this is put into the docs, on the other hand, there is the word "Internally".

There is a strong guarantee that the contents of a JSON object are lexicographically sorted when written to and read from disk, and that sort order is based on the container type used in the basic_json type definition. By default (which we use) all sorting of JSON keys uses std::less<std::string>.

The problem at hand is that we are potentially deserializing keys in a different order than we are serializing them, due to the instability of Lua table traversal. This requires a sorted order for keys when serializing, but as long as that sort is performed using std::less we can deserialize directly without an additional sort step.

Note that the data referred to by std::string_view key must be assumed to go out-of-scope as soon as the underlying string key is popped from the Lua stack. Lua does not provide any guarantees that string data is valid unless the string value is present on the stack. Thus, the std::set<...> componentNames should be std::set<std::string>.

@Gliese852
Copy link
Contributor Author

@sturnclaw

Note that the data referred to by std::string_view key must be assumed to go out-of-scope as soon as the underlying string key is popped from the Lua stack. Lua does not provide any guarantees that string data is valid unless the string value is present on the stack. Thus, the std::set<...> componentNames should be std::set<std::string>.

Indeed, very specifically written, thanks! But still, maybe we can count on the fact that these strings in a sense are still on the stack, because the table containing them is on the stack? Not that I am against std::string, I just want to know your opinion on this.

@sturnclaw
Copy link
Member

But still, maybe we can count on the fact that these strings in a sense are still on the stack, because the table containing them is on the stack?

That would be relying on undocumented behavior, and so I'd somewhat prefer to avoid doing so in the general case.

In this particular case however, because strings are uniquely interned in the global Lua string table (not actually a LUA_TTABLE) and IIRC have pointer identity, as long as the string value in question is considered reachable by the Lua GC it's probably safe to do so.

Regardless of std::string or std::string_view, I'd recommend pushing all component strings into a single std::vector and then performing a sort using the std::less comparator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Most savegames can not be loaded and crash the game
3 participants