AIME API Benchmark
Benchmark tool to test, monitor, and compare the performance of running GPU workers with the AIME API Server. It sends a given number of asynchronous requests using the Python client interface.
Start
Start the benchmark tool from the root directory of the AIME API Server repo with:
python3 run_api_benchmark.py
Optional command line parameters:
[-as, --api_server]
: Address of the AIME API Server. Default: http://0.0.0.0:7777
[-tr, --total_requests]
: Total number of requests. Choose a multiple of the worker’s batch size so that the last batch is full. Default: 4
[-cr, --concurrent_requests]
: Number of concurrent asynchronous requests, limited with asyncio.Semaphore(). Default: 40
[-cf, --config_file]
: Location of the endpoint config file from which the default values of the job parameters are read.
[-ep, --endpoint_name]
: Name of the endpoint. Default: llama2_chat
[-ut, --unit]
: Unit of the generated objects. Default: “tokens” if endpoint_name is “llama2_chat”, else “images”
[-t, --time_to_get_first_batch_jobs]
: Time in seconds after start to get the number of jobs in the first batch. Default: 4
[-u, --user_name]
: User name to log in to the AIME API Server. Default: aime
[-k, --login_key]
: Login key for the user name, received from AIME, to log in to the AIME API Server. Default: 6a17e2a5b70603cb1a3294b4a1df67da
[-nu, --num_units]
: Number of units to generate. For stable_diffusion_xl_txt2img, the units are images. Default: 1
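The interplay of --total_requests and --concurrent_requests can be illustrated with a minimal sketch. It is not the code of run_api_benchmark.py: the helper names round_up_to_batch and run_requests are hypothetical, and asyncio.sleep stands in for a real API call. It only shows how asyncio.Semaphore caps the number of requests in flight and how a request count can be rounded up to a multiple of an assumed worker batch size.

```python
import asyncio


def round_up_to_batch(total_requests: int, batch_size: int) -> int:
    """Round total_requests up to the next multiple of batch_size (hypothetical helper)."""
    return -(-total_requests // batch_size) * batch_size


async def run_requests(total_requests: int, concurrent_requests: int) -> int:
    """Run total_requests dummy jobs, at most concurrent_requests at a time.

    Returns the highest number of jobs observed in flight at the same moment.
    """
    semaphore = asyncio.Semaphore(concurrent_requests)
    in_flight = 0
    peak = 0

    async def fake_request(index: int) -> None:
        nonlocal in_flight, peak
        # Only concurrent_requests coroutines can be inside this block at once.
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the real API call (assumption)
            in_flight -= 1

    await asyncio.gather(*(fake_request(i) for i in range(total_requests)))
    return peak


if __name__ == "__main__":
    # Assumed worker batch size of 4; 10 requests would leave the last batch half empty.
    total = round_up_to_batch(10, 4)
    peak = asyncio.run(run_requests(total, concurrent_requests=3))
    print(f"sent {total} requests, peak concurrency {peak}")
```

All requests are scheduled at once with asyncio.gather, and the semaphore alone decides how many run simultaneously, so raising the concurrency limit above the total number of requests has no further effect.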