# AIME API Benchmark
Benchmark tool to test, monitor and compare the performance of GPU workers running with the AIME API Server. It sends a given number of asynchronous requests using the Python client interface.
## Start

Start the benchmark tool from the root directory of the AIME API Server repository with:

```shell
python3 run_api_benchmark.py
```
Optional command line parameters:

- `-as, --api_server`: Address of the AIME API Server. Default: `http://0.0.0.0:7777`
- `-tr, --total_requests`: Total number of requests. Choose a multiple of the worker's batch size so the last batch is full. Default: `4`
- `-cr, --concurrent_requests`: Number of concurrent asynchronous requests, limited with `asyncio.Semaphore()`. Default: `40`
- `-cf, --config_file`: Path to the endpoint config file from which the default values of the job parameters are read.
- `-ep, --endpoint_name`: Name of the endpoint. Default: `llama2_chat`
- `-ut, --unit`: Unit of the generated objects. Default: `tokens` if the endpoint name is `llama2_chat`, else `images`
- `-t, --time_to_get_first_batch_jobs`: Time in seconds after start at which the number of jobs in the first batch is read. Default: `4`
- `-u, --user_name`: User name to log in on the AIME API Server. Default: `aime`
- `-k, --login_key`: Login key related to the user name, received from AIME, to log in on the AIME API Server. Default: `6a17e2a5b70603cb1a3294b4a1df67da`
- `-nu, --num_units`: Number of units to generate, e.g. images for `stable_diffusion_xl_txt2img`. Default: `1`
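For example, to benchmark a Stable Diffusion XL endpoint with 20 total requests and 5 concurrent requests (the server address here is the default from the list above; adjust it to your setup):

```shell
python3 run_api_benchmark.py --api_server http://0.0.0.0:7777 \
    --endpoint_name stable_diffusion_xl_txt2img --unit images \
    --total_requests 20 --concurrent_requests 5
```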
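The `--concurrent_requests` limit works by wrapping each request in an `asyncio.Semaphore`, as noted above. The following is a minimal sketch of that pattern, not the benchmark's actual implementation: `send_request` and `run_benchmark` are illustrative names, and `asyncio.sleep` stands in for the real API call.

```python
import asyncio

async def send_request(semaphore, job_id):
    # The semaphore ensures no more than `concurrent_requests`
    # coroutines are past this point at the same time.
    async with semaphore:
        await asyncio.sleep(0.01)  # stand-in for the actual API request
        return job_id

async def run_benchmark(total_requests=4, concurrent_requests=40):
    semaphore = asyncio.Semaphore(concurrent_requests)
    # Schedule all requests at once; the semaphore throttles them.
    tasks = [send_request(semaphore, i) for i in range(total_requests)]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_benchmark())
print(len(results))  # 4
```

With the defaults above (4 total requests, limit 40), all requests run concurrently; raising `--total_requests` past `--concurrent_requests` makes the later requests wait for a free semaphore slot.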