Notebooks 2.0 // Backend // Workspace Logs API // Fetch Workspace container logs #208

Open
Yael-F opened this issue Feb 12, 2025 · 4 comments

Yael-F commented Feb 12, 2025

Summary

Develop an API endpoint to retrieve logs from a container running in a workspace.
The API should support retrieving both the current log and the log of the 'previous' (terminated) container run, and should offer a real-time log view.

Key features:

  1. Retrieve logs of a container in a workspace.
  2. Fetch real-time logs from a container.
  3. Support fetching logs from a previously terminated container run.
  4. Limit the size of the fetched logs.
  5. RBAC: only authorized users can view the logs.
  6. No sensitive data is exposed.

Suggested Implementation:

  1. API Endpoints:

    • For log pagination and old log retrieval:
      PATH: GET /api/v1/workspaces/{namespace}/{name}/details/log
      Query Parameters:

      • container (required): The name of the container for which logs are requested.
      • previous (optional): Fetch logs from the previously terminated container.
      • offset (optional): The starting point for pagination, given as the number of lines to skip (default: max(end_of_file - limit, 0), which returns the last limit lines of the log).
      • limit (optional): The number of lines to retrieve (default: 100, max limit: 1,000).

      Response:
      { "logs": [], "pagination": { "offsetFrom": start_index, "offsetTo": end_index, "endOfFile": bool } }

    • For log streaming:
      PATH: GET /api/v1/workspaces/{namespace}/{name}/details/log/stream
      Query Parameters:

      • container (required): The name of the container for which logs are requested.
      • sinceTime (optional): Retrieve logs after a specified timestamp.

      Response:
      { "logs": [], "referenceTimestamp": last_timestamp_sent }
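The pagination rules of the first endpoint can be sketched as follows. This is a minimal illustration, not the backend implementation: the function name `paginate_logs` is hypothetical, and two behaviors are assumptions the proposal does not specify, namely that an oversized `limit` is clamped to 1,000 rather than rejected, and that `offsetTo` is an exclusive end index.

```python
from __future__ import annotations

DEFAULT_LIMIT = 100   # default from the proposal
MAX_LIMIT = 1000      # maximum from the proposal


def paginate_logs(lines: list[str], offset: int | None = None,
                  limit: int = DEFAULT_LIMIT) -> dict:
    """Return one page of log lines in the proposed response shape."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    limit = min(limit, MAX_LIMIT)  # assumption: clamp instead of rejecting
    total = len(lines)
    if offset is None:
        # Default: start `limit` lines before the end, but never below 0.
        offset = max(total - limit, 0)
    if offset < 0:
        raise ValueError("offset must not be negative")
    start = min(offset, total)
    end = min(start + limit, total)  # assumption: exclusive end index
    return {
        "logs": lines[start:end],
        "pagination": {
            "offsetFrom": start,
            "offsetTo": end,
            "endOfFile": end >= total,
        },
    }
```

With no `offset`, a 250-line log and the default limit yield lines 150 to 249, so a client that only wants the tail of the log does not need to know its length first.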

  2. Real time log retrieval:
    Implement streaming with Server-Sent Events (SSE) or WebSockets (WS).
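If SSE is chosen, each streamed response could be framed as one event. The sketch below only shows the wire framing (a `data:` field followed by a blank line) around the proposed streaming payload; the helper names and the input shape of (timestamp, line) pairs are hypothetical, and the HTTP layer is left out.

```python
import json


def format_sse_event(payload: dict) -> str:
    """Encode one payload as a Server-Sent Events frame."""
    # json.dumps emits a single line by default, so one `data:` field suffices.
    return f"data: {json.dumps(payload)}\n\n"


def stream_frames(batches):
    """Yield one SSE frame per non-empty batch of (timestamp, line) pairs."""
    for batch in batches:
        if not batch:
            continue  # nothing new to send for this batch
        yield format_sse_event({
            "logs": [line for _, line in batch],
            # Timestamp of the last line sent, so a client can resume
            # with sinceTime after a dropped connection.
            "referenceTimestamp": batch[-1][0],
        })
```

Returning `referenceTimestamp` in every frame means a reconnecting client can pass it back as `sinceTime` instead of replaying the whole log.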

  3. Add 'previous' query parameter
    Add a query parameter to request the log of the previously terminated container.
    The pod container list data should indicate, for each container, whether a previous log exists, so that the option can be disabled for containers that don't have one. If the API is called for a container with no previous log, it returns:
    HTTP 400 Bad Request → "No previous logs available for this container."
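One possible way to decide whether a previous log exists is to look at the Pod's container statuses. The sketch below assumes the backend has the `containerStatuses` list from the Pod status as plain dictionaries and treats a populated `lastState.terminated` as the signal; both the function names and that choice of signal are assumptions, not something the proposal specifies.

```python
def has_previous_log(container_statuses: list, container: str) -> bool:
    """Report whether the named container has a previously terminated run."""
    for status in container_statuses:
        if status.get("name") == container:
            # lastState may be missing or null when the container never restarted.
            return bool((status.get("lastState") or {}).get("terminated"))
    return False


def check_previous_request(container_statuses: list, container: str,
                           previous: bool):
    """Return (http_status, message) for the 'previous' pre-check."""
    if previous and not has_previous_log(container_statuses, container):
        return 400, "No previous logs available for this container."
    return 200, ""
```

The same `has_previous_log` result could populate the per-container flag in the container list, so the UI and the API agree on when the option is available.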

  4. Size of the log
    Limit the response to a maximum of 1,000 lines of log data.

  5. RBAC
    Only workspace owner and contributors can view the logs.

  6. Sensitive data
    By default, no sensitive data will be filtered.
    Possible enhancement:
    Since container patterns are not predictable (we have no control over which containers will run as part of the workspace), a potential solution is to add a LogFilterPatterns field to the WorkspaceKind so that admins can configure filters for the workspace. The backend would use these filters to mask sensitive data.
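To make the possible enhancement concrete, here is a rough sketch of how the backend might apply such filters. It assumes LogFilterPatterns would be a list of regular expressions and that every match is replaced by a fixed placeholder; the field's real shape is not defined in this issue, and `mask_sensitive` is a hypothetical helper.

```python
import re


def mask_sensitive(lines, patterns, replacement="***"):
    """Replace every match of every pattern in each log line."""
    compiled = [re.compile(p) for p in patterns]  # compile once per request
    masked = []
    for line in lines:
        for rx in compiled:
            line = rx.sub(replacement, line)
        masked.append(line)
    return masked
```

For example, the pattern `(?<=password=)\S+` would turn `login password=hunter2 ok` into `login password=*** ok` while leaving unrelated lines unchanged.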

  7. Log rotation
    Users will only have access to the latest logs and will not be able to access older logs after rotation.


Yael-F commented Feb 13, 2025

/assign

@thesuperzapper

I think we should effectively proxy to the Kubernetes log API, but with our auth:
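As a rough illustration of what such a proxy could forward, the sketch below builds the request path for the Kubernetes pod log endpoint (`GET /api/v1/namespaces/{namespace}/pods/{name}/log`) from the proposed query parameters. The function name and the mapping from the proposal's parameters to `previous`, `follow`, `sinceTime`, and `tailLines` are assumptions; authorization would happen in the backend before this call.

```python
from __future__ import annotations

from urllib.parse import quote, urlencode


def pod_log_url(namespace: str, pod_name: str, container: str,
                previous: bool = False, follow: bool = False,
                since_time: str | None = None,
                tail_lines: int | None = None) -> str:
    """Build the path and query string for the Kubernetes pod log endpoint."""
    path = (f"/api/v1/namespaces/{quote(namespace, safe='')}"
            f"/pods/{quote(pod_name, safe='')}/log")
    params = {"container": container}
    if previous:
        params["previous"] = "true"    # log of the previously terminated run
    if follow:
        params["follow"] = "true"      # keep the connection open for streaming
    if since_time:
        params["sinceTime"] = since_time  # RFC 3339 timestamp
    if tail_lines is not None:
        params["tailLines"] = str(tail_lines)
    return f"{path}?{urlencode(params)}"
```

Note that the Kubernetes endpoint addresses a Pod rather than a Workspace, so the backend would first have to resolve the workspace to its Pod name.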

@thesuperzapper

Now that #210 is merged, the Workspace status includes a podTemplatePod field, which the backend can use to tell whether the Pod exists and which containers it has.

When the Pod exists, this new status field will look something like:

status:
  podTemplatePod:
    name: ws-jupyterlab-workspace-56mlf-0
    containers:
      - name: main
    initContainers: []

When the Pod does not exist, it will look something like:

status:
  podTemplatePod:
    name: ""
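A backend could read these two shapes roughly as follows. This is a sketch that assumes the status has already been parsed into a dictionary and that an empty `name` is the only signal for a missing Pod; `pod_info` is a hypothetical helper name.

```python
def pod_info(status: dict):
    """Return (pod_name, container_names), or (None, []) when no Pod exists."""
    pod = (status or {}).get("podTemplatePod") or {}
    name = pod.get("name") or ""
    if not name:
        return None, []  # empty name: the Pod does not exist
    # List init containers first, then regular containers.
    names = [c["name"] for c in pod.get("initContainers") or []]
    names += [c["name"] for c in pod.get("containers") or []]
    return name, names
```

The returned container names give the log endpoint a way to validate its required `container` parameter before contacting the cluster.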


ederign commented Feb 18, 2025

@Yael-F, I believe that with the last PR from Mathew, we have all the data needed to proceed, right? Or do you need any further info?
