Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Logging] Add timestamp_ns to core context #49259

Merged
merged 10 commits into from
Dec 20, 2024

Conversation

GeneDer
Copy link
Contributor

@GeneDer GeneDer commented Dec 13, 2024

Why are these changes needed?

Adding timestamp_ns to core context so all structured logs, including the ones from each library, will have this attribute.

Related issue number

Closes https://github.com/anyscale/product/issues/32223

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@GeneDer GeneDer added the go add ONLY when ready to merge, run all tests label Dec 13, 2024
Signed-off-by: Gene Su <[email protected]>
@GeneDer
Copy link
Contributor Author

GeneDer commented Dec 13, 2024

Started a workspace built with the wheel generated by this PR and seeing the astime and the log's timestamp matches
image

Also start to see the field timestamp_ns showing up in structured logs
image

Comment on lines 8 to 9
timestamp_ns = int(record.created * 1e9)
setattr(record, LogKey.TIMESTAMP_NS.value, timestamp_ns)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to see how this works with _setup_log_record_factory in python/ray/_private/ray_logging/logging_config.py cc @kevin85421

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg, added a test case for it. The timestamp_ns still works as expected

…rd_factory() and in CoreContextFilter

Signed-off-by: Gene Su <[email protected]>
…) which applies to all log records

Signed-off-by: Gene Su <[email protected]>
@@ -1,6 +1,9 @@
import logging
import threading
from typing import Union
import time

LOG_KEY_TIMESTAMP_NS = "timestamp_ns"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to define LOG_KEY_TIMESTAMP_NS here instead of using the one in logging_config.py? Are there any circular dependencies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, all those code runs in init, we can't import LogKey before all modules are fully initialized. The one way to break the circular dependencies is to define this first and reuse it everywhere

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. If @jjyao is okay with this, I am fine with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to move the import to a local import within the _setup_log_record_factory function to avoid circular dependencies, if any?

Copy link
Contributor Author

@GeneDer GeneDer Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not possible, due to this was call when Ray is importing and we can't import partially imported modules

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But to be honest, we probably don't care about timestamp_ns being in LogKey or not, since it's not used anywhere expect for setting up log record factory here. Maybe we can just drop it from LogKey altogether if both of you don't mind

Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it looks good to me. I left some minor comments.

@@ -89,3 +92,21 @@ def generate_logging_config():
# See https://github.com/ray-project/ray/pull/31858 for related PR
rllib_logger = logging.getLogger("ray.rllib")
rllib_logger.setLevel(logging.WARN)

# Set up the LogRecord factory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate the logic into a function so that it is easy to unit test, and we can write tests such as class TestSetupLogRecordFactory.

@@ -1,6 +1,9 @@
import logging
import threading
from typing import Union
import time

LOG_KEY_TIMESTAMP_NS = "timestamp_ns"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. If @jjyao is okay with this, I am fine with it.

@@ -1,6 +1,9 @@
import logging
import threading
from typing import Union
import time

LOG_KEY_TIMESTAMP_NS = "timestamp_ns"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjyao has a slight concern about the collision of timestamp_ns, and suggested to use something like _ray_timestamp_ns instead.

@GeneDer mentioned that this is a backward incompatible change. Would you mind providing more details?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In product, we use timestamp_ns field to determine the timestamp at ingestion time and show on the UI. The existing version of Ray is already producing this field and will be used in the log viewer. Yes, we can accept a new field _ray_timestamp_ns, but we will always need to support timestamp_ns to be compatible with older versions of Ray regardless, and this adds more complexity on the product logic to needing to first check the existence of _ray_timestamp_ns and then check the existence of timestamp_ns before populating the timestamp.

Also, I don't think we should worry about the collision at all since with or without the changes in this PR, timestamp_ns is already populating the same way using log record factory. The only difference is before this change, the timestamp_ns field is only added when Ray Core's LoggingConfig is used, but after this change the timestamp_ns field will be added when Ray is imported.

Signed-off-by: Gene Su <[email protected]>
@jjyao jjyao merged commit fa50e74 into ray-project:master Dec 20, 2024
5 checks passed
@GeneDer GeneDer deleted the add-timestamp-ns-to-serve-logs branch December 20, 2024 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants