Documentation AIME API Benchmark

class run_api_benchmark.BenchmarkApiEndpoint

Benchmark tool to measure and monitor the performance of GPUs with multiple asynchronous requests on chat and stable_diffusion_xl_txt2img endpoints.

load_flags()

Parses the command-line arguments.

Returns: The argparse object containing the command-line arguments.
Return type: argparse.Namespace
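As a rough illustration of what a method returning an argparse.Namespace might look like, here is a minimal sketch. The function name load_flags_sketch and the flags --endpoint_name, --total_requests and --concurrent_requests, along with their defaults, are assumptions made for this example only and are not confirmed by this documentation; only the two endpoint names are taken from the class description above.

```python
import argparse


def load_flags_sketch(argv=None) -> argparse.Namespace:
    """Parse benchmark options; all flag names here are hypothetical."""
    parser = argparse.ArgumentParser(description="API benchmark (sketch)")
    # Assumption: the endpoint is selected by name.
    parser.add_argument(
        "--endpoint_name",
        default="chat",
        choices=["chat", "stable_diffusion_xl_txt2img"],
    )
    # Assumption: total and concurrent request counts are configurable.
    parser.add_argument("--total_requests", type=int, default=4)
    parser.add_argument("--concurrent_requests", type=int, default=2)
    # Passing argv explicitly keeps the sketch independent of sys.argv.
    return parser.parse_args(argv)


args = load_flags_sketch(["--endpoint_name", "chat", "--total_requests", "8"])
print(args.endpoint_name, args.total_requests, args.concurrent_requests)
```

Values that are not passed on the command line fall back to the defaults declared in add_argument, so the returned namespace always carries every option.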

run(): Starting the benchmark.

async process_progress_result(result)

Called when job progress is received from the API server. Initializes or updates the job-related progress bar, updates the title and counts the currently running jobs.

Parameters:

progress_info (dict) – Job progress information containing the job_id and the progress state, such as the number of tokens generated so far or the completion percentage.
progress_data (dict) – The content generated so far, such as tokens or interim images.

async process_job_result(result)

Called when the final job result is received. Removes the job-related progress bar, processes information about the server and the worker, and updates the title.

Parameters: result (dict) – The final job result, such as generated text, audio or images.

async finish_benchmark(): Prints the benchmark summary and the results.

async handle_first_batch(progress_info)

Detects the jobs of the first batch so that they can be excluded from the benchmark results.

Parameters: progress_info (dict) – Job progress information containing the job_id and the progress state, such as the number of tokens generated so far or the completion percentage.

async update_header(result={}, init=False): Updates the header title bars that show information about the benchmark.

print_start_message(): Prints the benchmark parameters at the start.

get_default_values_from_config()

Parses the default job parameters from the config file of the related endpoint.

Returns: Job parameters for the API request.
Return type: dict

async do_request_with_semaphore(): Limits the number of concurrent requests using asyncio.Semaphore().
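The general pattern of capping concurrency with asyncio.Semaphore can be sketched as follows. This is an illustrative example and not the benchmark's actual code: fake_request, run_limited and the request counts are hypothetical stand-ins for the real API calls and settings.

```python
import asyncio


async def fake_request(job_id: int) -> int:
    """Hypothetical stand-in for an API call to the server."""
    await asyncio.sleep(0.01)
    return job_id


async def run_limited(total_requests: int, concurrent_requests: int):
    # At most `concurrent_requests` coroutines may hold the semaphore at once.
    semaphore = asyncio.Semaphore(concurrent_requests)
    running = 0  # requests currently in flight
    peak = 0     # highest number of simultaneous requests observed

    async def do_request_with_semaphore(job_id: int) -> int:
        nonlocal running, peak
        async with semaphore:  # waits here while the limit is reached
            running += 1
            peak = max(peak, running)
            try:
                return await fake_request(job_id)
            finally:
                running -= 1

    results = await asyncio.gather(
        *(do_request_with_semaphore(i) for i in range(total_requests))
    )
    return list(results), peak


results, peak = asyncio.run(run_limited(total_requests=10, concurrent_requests=3))
print(results, peak)
```

All ten tasks are created up front, but only three run their request at any moment; the others wait at the async with statement until a slot is released.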

static get_unit(args)

Gets the unit of the generated objects, such as ‘tokens’ for LLMs and ‘images’ for image generators.

Returns: The unit string of the generated objects.
Return type: str
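One way such a lookup could work is a simple mapping from endpoint name to unit, sketched below. The function name get_unit_sketch, the attribute args.endpoint_name and the fallback to 'tokens' for unknown endpoints are assumptions for illustration; the documentation only states that the unit is 'tokens' for LLMs and 'images' for image generators.

```python
from argparse import Namespace

# Endpoint names taken from the class description; the mapping itself is a sketch.
UNIT_BY_ENDPOINT = {
    "chat": "tokens",
    "stable_diffusion_xl_txt2img": "images",
}


def get_unit_sketch(args: Namespace) -> str:
    # Assumption: the endpoint is stored in args.endpoint_name.
    # Assumption: unknown endpoints fall back to 'tokens'.
    return UNIT_BY_ENDPOINT.get(args.endpoint_name, "tokens")


chat_unit = get_unit_sketch(Namespace(endpoint_name="chat"))
image_unit = get_unit_sketch(Namespace(endpoint_name="stable_diffusion_xl_txt2img"))
print(chat_unit, image_unit)
```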

static align_coloured_string(input_string, length)

Aligns a string to the specified length, considering ANSI color codes.

Parameters:

input_string (str) – The string containing ANSI color codes.
length (int) – The desired length of the string.

Returns: The aligned string.
Return type: str
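ANSI color codes add characters to a string without adding visible width, so padding based on len() alone leaves coloured columns misaligned. The sketch below shows one common approach: measure the string with the escape sequences stripped, then pad. The name align_coloured_string_sketch, the regular expression and the choice of left alignment with trailing spaces are assumptions for illustration, not necessarily what the benchmark tool does.

```python
import re

# Matches SGR color/style sequences such as "\x1b[92m" or "\x1b[0m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def align_coloured_string_sketch(input_string: str, length: int) -> str:
    # Visible width excludes the escape sequences.
    visible_length = len(ANSI_ESCAPE.sub("", input_string))
    # Assumption: pad on the right; strings that are already too long are kept as-is.
    padding = max(length - visible_length, 0)
    return input_string + " " * padding


coloured = "\x1b[92mdone\x1b[0m"  # "done" in bright green
aligned = align_coloured_string_sketch(coloured, 10)
print(repr(aligned))
```

The returned string keeps its color codes intact while its visible part occupies exactly the requested width.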