How emissions are calculated

This widget is currently running Llama 3.2 3B at 16-bit precision on a T4 instance on Hugging Face, and uses LLMCarbon to estimate the model's overall power usage during inference. LLMCarbon takes into account the location of the inference endpoint being used and the average CO2e emissions per kWh of its power grid. The first figure returned by the widget is reported directly from the LLMCarbon Python package, and reflects the CO2e generated by the query.
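Conceptually, that per-query estimate reduces to the energy drawn during inference multiplied by the carbon intensity of the endpoint's grid. The sketch below illustrates that conversion only; the function name and both input values are placeholder assumptions for illustration, not the actual LLMCarbon API or measured figures.

```python
# Minimal sketch of the per-query conversion: energy used during inference
# multiplied by the grid's average carbon intensity. All names and numbers
# here are illustrative placeholders, not LLMCarbon internals.

inference_energy_kwh = 0.0003          # assumed energy drawn for one query on a T4
grid_intensity_gco2e_per_kwh = 400.0   # assumed average CO2e per kWh for the local grid

def co2e_per_query(energy_kwh: float, grid_gco2e_per_kwh: float) -> float:
    """Convert inference energy (kWh) into grams of CO2e."""
    return energy_kwh * grid_gco2e_per_kwh

print(f"{co2e_per_query(inference_energy_kwh, grid_intensity_gco2e_per_kwh):.3f} gCO2e")
# -> 0.120 gCO2e for these placeholder inputs
```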

The CO2e per user per day (the second returned result) scales the per-query figure by the average number of ChatGPT queries per user per day, derived from figures reported by DemandSage: approximately 8 queries per day.

The total CO2e per day (the final returned result) scales the per-query figure by the average number of overall queries per day globally, which, according to DemandSage, is over 1 billion.
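Both aggregate figures are simple multiples of the per-query estimate. The sketch below shows the scaling, using a placeholder per-query value (the real value comes from LLMCarbon at runtime) together with the two DemandSage multipliers cited above.

```python
# Scaling a per-query estimate to the two aggregate results.
# The per-query value is a placeholder; the multipliers are the
# DemandSage figures cited in the text.

QUERIES_PER_USER_PER_DAY = 8
QUERIES_PER_DAY_GLOBAL = 1_000_000_000  # "over 1 billion"

co2e_per_query_g = 0.12  # placeholder per-query estimate, in grams

co2e_per_user_per_day_g = co2e_per_query_g * QUERIES_PER_USER_PER_DAY
co2e_global_per_day_t = co2e_per_query_g * QUERIES_PER_DAY_GLOBAL / 1e6  # grams -> tonnes

print(f"per user per day: {co2e_per_user_per_day_g:.2f} gCO2e")   # 0.96 gCO2e
print(f"global per day:   {co2e_global_per_day_t:,.0f} tCO2e")    # 120 tCO2e
```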

NOTE: commercial LLM deployments are likely to have significantly higher CO2e per query, given their much larger parameter counts and the proportionately more powerful hardware required to run such models. Even rough approximations of their power usage based on smaller open-source models are difficult to make, given the lack of transparency around inference hardware, model architecture, and any code- or kernel-level optimisations made by inference providers.