Documentation AIME API Benchmark
- class run_api_benchmark.BenchmarkApiEndpoint
Benchmark tool to measure and monitor the performance of GPUs with multiple asynchronous requests on llama2_chat and stable_diffusion_xl_txt2img endpoints.
- load_flags()
Parsing the command line arguments.
- Returns:
The argparse object containing the command line arguments
- Return type:
argparse.Namespace
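As an illustration of the return type, the following is a minimal sketch of an argparse-based parser. The function name load_flags_sketch and the flags --api_server, --total_requests and --concurrent_requests, along with their defaults, are hypothetical and are not taken from the benchmark tool.

```python
import argparse


def load_flags_sketch(argv=None):
    """Parse command-line arguments and return an argparse.Namespace.

    All flag names and defaults here are illustrative assumptions.
    """
    parser = argparse.ArgumentParser(description="API benchmark (sketch)")
    parser.add_argument("--api_server", type=str, default="http://localhost:7777",
                        help="Address of the API server (hypothetical flag)")
    parser.add_argument("--total_requests", type=int, default=4,
                        help="Total number of requests to send (hypothetical flag)")
    parser.add_argument("--concurrent_requests", type=int, default=2,
                        help="Maximum number of parallel requests (hypothetical flag)")
    # With argv=None, argparse falls back to sys.argv[1:].
    return parser.parse_args(argv)


args = load_flags_sketch(["--total_requests", "8"])
```

Passing an explicit argument list, as in the last line, makes the parser easy to exercise without touching the real command line.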
- run()
Starting the benchmark.
- async progress_callback(progress_info, progress_data)
Called when job progress is received from the API Server. Initializes or updates the job-related progress bar, updates the title and measures the number of currently running jobs.
- Parameters:
progress_info (dict) – Job progress information containing the job_id and the progress state, such as the number of tokens generated so far or a percentage.
progress_data (dict) – The content generated so far, such as tokens or interim images.
- async result_callback(result)
Called when the final job result is received. Removes the job-related progress bar, processes information about the server and the worker and updates the title.
- Parameters:
result (dict) – The final job result, such as generated text, audio or images.
- print_benchmark_summary_string()
Printing the benchmark summary and the results.
- async handle_first_batch(progress_info)
Detecting the jobs of the first batch to exclude them from the benchmark results.
- Parameters:
progress_info (dict) – Job progress information containing the job_id and the progress state, such as the number of tokens generated so far or a percentage.
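One plausible way to detect first-batch jobs is sketched below. It is an assumption, not necessarily the logic the tool uses: the first batch_size distinct job IDs that report progress are treated as the warm-up batch. The class FirstBatchTracker and its methods are hypothetical names; only the job_id key of progress_info comes from the description above.

```python
class FirstBatchTracker:
    """Remember the first `batch_size` distinct jobs that report progress."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.first_batch_jobs = set()

    def handle(self, progress_info):
        job_id = progress_info.get("job_id")
        # Only collect job IDs until the first batch is complete.
        if job_id is not None and len(self.first_batch_jobs) < self.batch_size:
            self.first_batch_jobs.add(job_id)

    def is_first_batch(self, job_id):
        # Jobs in this set would be excluded from the benchmark results.
        return job_id in self.first_batch_jobs


tracker = FirstBatchTracker(batch_size=2)
for job_id in ["a", "a", "b", "c"]:
    tracker.handle({"job_id": job_id, "progress": 10})
```

Repeated progress updates for the same job do not consume additional slots, because the IDs are stored in a set.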
- update_title(result=None)
Updating the title bars for the header containing information about the benchmark.
- update_worker_and_endpoint_data_in_title(result={})
Updating the title bars with information about the API server and the workers.
- make_benchmark_result_string()
Making a string containing the mean benchmark results.
- Returns:
Result string
- Return type:
str
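The exact metrics in the result string are not specified here. As a sketch under that caveat, a mean job duration and a mean generation rate could be formatted as follows; the function name, its parameters and the output layout are all assumptions made for illustration.

```python
from statistics import mean


def make_result_string_sketch(job_durations, units_per_job, unit):
    """Format mean duration and mean rate over all measured jobs.

    job_durations: seconds per job; units_per_job: generated units per job.
    """
    mean_duration = mean(job_durations)
    # Per-job rate first, then the mean over jobs.
    mean_rate = mean(n / d for n, d in zip(units_per_job, job_durations))
    return (f"Mean job duration: {mean_duration:.2f} s | "
            f"Mean rate: {mean_rate:.2f} {unit}/s")


summary = make_result_string_sketch([2.0, 4.0], [100, 100], "tokens")
```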
- print_start_message()
Printing benchmark parameters at the start.
- get_default_values_from_config()
Parsing the default job parameters from the related endpoint config file.
- Returns:
Job parameters for API request.
- Return type:
dict
- async do_request_with_semaphore()
Limiting the concurrent requests using asyncio.Semaphore().
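The general pattern of limiting concurrency with asyncio.Semaphore can be sketched as below. The coroutine fake_request is a stand-in for the real API request, and the names run_all and state are hypothetical; only the use of asyncio.Semaphore is taken from the description above.

```python
import asyncio


async def fake_request(job_index):
    # Placeholder for an actual API request (assumption for illustration).
    await asyncio.sleep(0.01)
    return job_index


async def do_request_with_semaphore_sketch(semaphore, job_index, state):
    # At most `concurrent_requests` coroutines can be inside this block at once.
    async with semaphore:
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        try:
            return await fake_request(job_index)
        finally:
            state["running"] -= 1


async def run_all(total_requests, concurrent_requests):
    semaphore = asyncio.Semaphore(concurrent_requests)
    state = {"running": 0, "peak": 0}
    results = await asyncio.gather(
        *(do_request_with_semaphore_sketch(semaphore, i, state)
          for i in range(total_requests))
    )
    return results, state["peak"]


results, peak = asyncio.run(run_all(total_requests=6, concurrent_requests=2))
```

All six coroutines are scheduled immediately, but the semaphore lets only two of them proceed past the async with statement at any time; the others wait until a slot is released.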
- static get_unit(args)
Getting the unit of the generated objects, such as ‘tokens’ for llama2_chat and ‘images’ for image generators.
- Returns:
The unit string of the generated objects
- Return type:
str
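A minimal sketch of such a lookup is shown below. It assumes that args carries an endpoint_name attribute, which is a hypothetical name; the tool's actual attribute may differ.

```python
from types import SimpleNamespace


def get_unit_sketch(args):
    """Return the unit of the generated objects for the selected endpoint."""
    # `endpoint_name` is an assumed attribute name, used for illustration only.
    if args.endpoint_name == "llama2_chat":
        return "tokens"
    return "images"


llama_unit = get_unit_sketch(SimpleNamespace(endpoint_name="llama2_chat"))
sdxl_unit = get_unit_sketch(SimpleNamespace(endpoint_name="stable_diffusion_xl_txt2img"))
```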