
Benchmarking Hub πŸ€–


Welcome to the MaaS (LLM/SLM) benchmarking hub. This project has a dual focus: it provides key performance metrics on model latency and throughput as well as quality. It's a one-stop shop for benchmarking MaaS offerings, helping you make smarter decisions about which foundation model to choose for your AI projects based on in-depth performance analysis.

What Makes This Project Different? πŸš€

  • Intuitive User Interface: Our user-centric app simplifies the benchmarking process. Engage with performance and quality test results for models like GPT and delve into rich visualizations. Easily tailor the tool so it surfaces the results most relevant to your specific situation.

  • Accelerated Integration: As the landscape of LLM/SLM technologies rapidly evolves, staying ahead becomes a challenge. Our project serves as an agile launchpad for benchmarking foundational models, significantly reducing the time-to-integration for the latest advancements. Equip your enterprise with the tools to swiftly adapt and implement cutting-edge AI technologies.

  • BYOP (Bring Your Own Prompt) for Custom Benchmarks: This feature enables the application of the benchmarking suite to your data, offering valuable insights into model performance on real-world problems as opposed to theoretical scenarios. It's an essential tool for enterprises and individuals aiming to assess the effectiveness of foundational models against their specific datasets and challenges.

  • Light Python SDK: Tailored for performance-centric evaluations, our SDK facilitates extensive analysis across latency, throughput, and a suite of quality metrics. Designed for bulk processing, it streamlines the assessment of multiple models simultaneously, ensuring a thorough comparison.

  • Built on Expertise-Driven Design for Large Enterprise AI Systems: This project draws on deep experience building large-scale enterprise AI systems, with a special focus on Azure OpenAI (AOAI) implementations. It guides you through best practices and effective troubleshooting strategies for latency, throughput, and various quality metrics so you can optimize performance down the line.

How to Get Started πŸ”

First things first, let's get your development environment set up:

  1. Create the Conda Environment: Create a Conda environment from the environment.yaml file provided in the repository. Open your terminal and run:

    conda env create -f environment.yaml

  2. Activate the Conda Environment: After creating your environment, activate it using the command:

    conda activate <your_env_name>

Running the App πŸ’»

To deploy your Streamlit application locally, follow these steps:

  • Ensure your development environment is set up and your Conda environment is activated.

Step 1: Launch the Application: To start the Streamlit app, navigate to the repository root in your terminal and execute:

```bash
streamlit run src/app/Home.py
```
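If the default port (8501) is already in use on your machine, Streamlit can be pointed at a different one:

```bash
# Launch on an alternative port if 8501 is busy
streamlit run src/app/Home.py --server.port 8080
```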

The application should launch directly in your browser at a localhost URL. Enjoy, but please...

Provide Feedback - Your Insights Fuel Our Growth

Encountered an issue or have suggestions for improvements? We want to hear from you! Please submit an issue on our GitHub repository. Your feedback is vital to our development process.

Running the SDK πŸ’‘

  • Ensure your development environment is set up and your Conda environment is activated.

Step 1: Define Test Parameters

First, you need to define the parameters for your test:

  • Deployment Names: A list of deployment names you wish to test.
  • Token Counts: A list of maximum token counts to test against each deployment.

deployment_names = ["YourModelName1", "YourModelName2"]
max_tokens_list = [100, 500, 700, 800, 900, 1000]

Step 2: Initialize the Testing Class

Depending on whether your test is for streaming or non-streaming deployments, initialize the appropriate class. Here's how to initialize for non-streaming:

from src.performance.aoaihelpers.latencytest import AzureOpenAIBenchmarkNonStreaming

client_non_streaming = AzureOpenAIBenchmarkNonStreaming(
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
    api_version="YOUR_AZURE_OPENAI_API_VERSION"
)
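If you prefer not to hard-code credentials, the same client can be constructed from environment variables. The variable names below (AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_VERSION) are only illustrative; use whatever names your environment already defines:

```python
import os

from src.performance.aoaihelpers.latencytest import AzureOpenAIBenchmarkNonStreaming

# Illustrative environment variable names -- adjust to match your own setup.
client_non_streaming = AzureOpenAIBenchmarkNonStreaming(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)
```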

Step 3: πŸ› οΈ Execute the Tests

Run the run_latency_benchmark_bulk method with your defined parameters:

await client_non_streaming.run_latency_benchmark_bulk(
    deployment_names, max_tokens_list, iterations=1, context_tokens=1000, multiregion=False
)
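Because run_latency_benchmark_bulk is awaited, it must run inside an event loop when called from a plain Python script. A minimal end-to-end sketch, assuming the same class and parameters shown above, could look like this:

```python
import asyncio

from src.performance.aoaihelpers.latencytest import AzureOpenAIBenchmarkNonStreaming

# Parameters from Step 1 -- replace the deployment names with your own.
deployment_names = ["YourModelName1", "YourModelName2"]
max_tokens_list = [100, 500, 700, 800, 900, 1000]

client_non_streaming = AzureOpenAIBenchmarkNonStreaming(
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
    api_version="YOUR_AZURE_OPENAI_API_VERSION",
)

async def main() -> None:
    # Bulk latency benchmark across all deployments and token limits.
    await client_non_streaming.run_latency_benchmark_bulk(
        deployment_names,
        max_tokens_list,
        iterations=1,
        context_tokens=1000,
        multiregion=False,
    )

if __name__ == "__main__":
    asyncio.run(main())
```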

For detailed instructions on running throughput benchmarks, refer to HOWTO-Throughput.md. For guidance on latency benchmarks, see HOWTO-Latency.md.

Disclaimer

Important

This software is provided for demonstration purposes only. It is not intended to be relied upon for any purpose. The creators of this software make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the software or the information, products, services, or related graphics contained in the software for any purpose. Any reliance you place on such information is therefore strictly at your own risk.
