Add release quality targets on the site #281

Open
denis-tingaikin opened this issue May 21, 2024 · 7 comments
@denis-tingaikin
Member

denis-tingaikin commented May 21, 2024

To make the release delivery process clearer, more transparent, safer, and more stable, we should define the main release quality targets and the definition of done.

For now, these could be:

  1. NSM should be stable for 24 hours in a high-load scenario. (This means no memory leaks.)
  2. What are the latency criteria?
  3. TODO: Add other criteria.
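
Criterion 1 could be turned into an automated check along these lines. This is only a sketch: the sample format (hours since test start, memory in MiB), the function names, and the 1 MiB/h growth threshold are all assumptions for illustration, not agreed targets. It fits a least-squares line to the memory samples and treats a run as stable if it lasted at least 24 hours and the fitted growth rate stays under the threshold.

```python
from statistics import mean


def memory_growth_per_hour(samples):
    """Least-squares slope of memory (MiB) over time (hours).

    samples: list of (hours_since_start, memory_mib) tuples (assumed format).
    """
    times = [t for t, _ in samples]
    mems = [m for _, m in samples]
    t_mean, m_mean = mean(times), mean(mems)
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        raise ValueError("need samples taken at two or more distinct times")
    return sum((t - t_mean) * (m - m_mean) for t, m in samples) / denom


def is_memory_stable(samples, max_growth_mib_per_hour=1.0, min_duration_hours=24.0):
    """True if the run is long enough and memory growth is below the threshold.

    Both thresholds are hypothetical defaults, not project-approved values.
    """
    times = [t for t, _ in samples]
    if max(times) - min(times) < min_duration_hours:
        return False  # run too short to say anything about 24 h stability
    return memory_growth_per_hour(samples) <= max_growth_mib_per_hour
```

A slope check like this tolerates normal fluctuation but flags steady growth; the threshold would need to be tuned per component.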
@denis-tingaikin changed the title from "Add release target qualities on the site" to "Add release quality targets on the site" May 21, 2024
@szvincze

It is not easy to define the latency criteria because we only have application-level diagrams:
NSM-latency-spikes

The first spikes came after ~3.5 hours. After that, the spikes appeared more and more frequently as time progressed.
So something similar to the stability criterion would be good here as well: latency should stay under a defined limit for the whole test period.
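
A check of that kind could look roughly like the sketch below. The sample format (hours since test start, latency in ms), the function names, and the limit and window size passed in are assumptions for illustration. The first function lists every sample over the limit; the second counts spikes per time window, which would show whether they become more frequent over the run.

```python
def latency_violations(samples, limit_ms):
    """Return all (hours_since_start, latency_ms) samples above the limit."""
    return [(t, latency) for t, latency in samples if latency > limit_ms]


def spikes_per_window(samples, limit_ms, window_hours):
    """Count samples above the limit in each consecutive time window.

    Returns a dict mapping window index (0 = first window) to spike count;
    windows without spikes are omitted.
    """
    counts = {}
    for t, latency in samples:
        if latency > limit_ms:
            index = int(t // window_hours)
            counts[index] = counts.get(index, 0) + 1
    return counts


# Example with made-up numbers and a hypothetical 100 ms limit:
samples = [(0.5, 20), (3.5, 120), (5.0, 30), (6.2, 150), (6.8, 200)]
print(latency_violations(samples, limit_ms=100))
print(spikes_per_window(samples, limit_ms=100, window_hours=2))
```

The release criterion would then be that `latency_violations` returns an empty list for the whole test period.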

@denis-tingaikin
Member Author

denis-tingaikin commented May 27, 2024

At this time, we can use this picture as an acceptable latency level.

image

I still don't like the spikes here, but they can be handled and improved in the next releases.

@edwarnicke I think ideally our latency should be 0–50 ms. Based on your experience, what would you say is the ideal latency level for NSM?

@szvincze

As the diagram shows, the latency is around or under 50 ms most of the time. We have now reached a point where the system can survive the infrequent latency spikes without disconnections or significant traffic loss, which is good. If we can stabilize the system in this state and stop the memory increase, then I think it is an acceptable status for releasing, and we can work on improvements in the next releases.

@denis-tingaikin
Member Author

denis-tingaikin commented Jun 4, 2024

v1.13.1-rc.3 datapath latency picture
image

@denis-tingaikin
Member Author

v1.13.1-rc.3 memory usage

image

@denis-tingaikin
Member Author

denis-tingaikin commented Jul 22, 2024

forwarder memory consumption under high load, <= v1.13.2:

image

forwarder memory consumption under high load, tinden/cmd-forwarder-vpp:v1.13.2-fix.3:

image

@denis-tingaikin
Member Author

Acceptable memory diff for the nsmgr after 27 hours of running:

image

3 participants