Sampling lets Google Analytics generate a report from a subset of the data rather than from all of it. This way, Analytics can calculate the report faster than if it had to process every session.
For a standard report, Google Analytics prepares the data in advance by pre-calculating it and organizing it into aggregated tables, so it can return the results quickly without any need for sampling.
Sampling can come into play when we modify a standard report, for example by adding segments or secondary dimensions, or when we create a custom report with a new combination of dimensions and metrics, whether from the interface or through the reporting APIs. Analytics first checks whether it can build the report from the tables it has already processed. If it cannot, it checks how many sessions the request covers and generates the report from that set of sessions.
If the request covers only a small number of sessions, Analytics uses all of them. If the number of sessions is very high, however, Google Analytics uses a sample to generate the report.
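To make the idea concrete, here is a minimal sketch of "use everything when the data is small, otherwise sample and scale up". It is only an illustration and not Google's actual algorithm: the function name `estimate_total`, the `max_sessions` threshold, and the plain random sampling are all assumptions made for this example.

```python
import random


def estimate_total(session_values, max_sessions, seed=0):
    """Estimate the sum of a per-session value, sampling only when needed.

    session_values: one number per session (e.g. 1 if it converted, else 0).
    max_sessions:   hypothetical threshold above which a sample is used.
    Returns a tuple (estimate, was_sampled).
    """
    total_sessions = len(session_values)
    if total_sessions <= max_sessions:
        # Few sessions: use every one of them, so the result is exact.
        return sum(session_values), False

    # Many sessions: read a random subset and scale the result back up.
    rng = random.Random(seed)
    sample = rng.sample(session_values, max_sessions)
    scale = total_sessions / max_sessions
    return sum(sample) * scale, True
```

With a real mix of values the sampled estimate will be close to the true total but not identical, which is the trade-off sampling accepts in exchange for speed.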
For example, suppose we create a custom report with the dimensions city and campaign and the metrics sessions and conversion rate. This combination of dimensions and metrics is not pre-calculated in any aggregated table. If we also choose a time period that includes many sessions, the report will be calculated from a sample of the data for that period, and the number of sessions used is the sample size.
The sample size can be adjusted with a control in the Google Analytics reporting interface, or by specifying it when you send requests to the GA reporting APIs. Increasing the sample size includes more sessions in the calculation at the cost of a longer response time. Reducing it includes fewer sessions, but the response comes back faster.
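As a sketch of the API side, the snippet below builds a request body for the city and campaign report from the example above. It assumes the Analytics Reporting API v4 (the `reports.batchGet` method), where the sample size is chosen through the `samplingLevel` field with the values `DEFAULT`, `SMALL`, or `LARGE`. The helper name `build_report_request`, the view ID, and the choice of `ga:goalConversionRateAll` as the conversion rate metric are assumptions for illustration. The code only builds the dictionary and does not send anything.

```python
ALLOWED_SAMPLING_LEVELS = {"DEFAULT", "SMALL", "LARGE"}


def build_report_request(view_id, start_date, end_date, sampling_level="DEFAULT"):
    """Build a Reporting API v4 batchGet body for the city/campaign report.

    sampling_level: "SMALL" favors speed, "LARGE" favors precision.
    """
    if sampling_level not in ALLOWED_SAMPLING_LEVELS:
        raise ValueError(f"unknown sampling level: {sampling_level!r}")
    return {
        "reportRequests": [
            {
                "viewId": view_id,  # placeholder, use your own view ID
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "dimensions": [{"name": "ga:city"}, {"name": "ga:campaign"}],
                "metrics": [
                    {"expression": "ga:sessions"},
                    # Assumed metric for "conversion rate" in this sketch.
                    {"expression": "ga:goalConversionRateAll"},
                ],
                "samplingLevel": sampling_level,
            }
        ]
    }
```

For instance, `build_report_request("12345", "2015-01-01", "2015-01-31", "LARGE")` asks for the larger sample size, trading a slower response for more precise figures.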
Google Analytics sets a maximum number of sessions that it will use to calculate a report (50,000 sessions per day in standard Analytics and 75,000 sessions per day in Analytics Premium). If a request exceeds this number, the report is calculated from a sample of the data.
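When working through the API, it helps to check whether a given response was actually sampled. The sketch below assumes a Reporting API v4 response, where a sampled report carries `samplesReadCounts` and `samplingSpaceSizes` in its `data` object (one entry per date range) and an unsampled report omits them. The function name `sampling_rates` is made up for this example.

```python
def sampling_rates(response):
    """Return, for each report, its sampling rate per date range.

    A report that was not sampled yields None instead of a list.
    """
    rates = []
    for report in response.get("reports", []):
        data = report.get("data", {})
        read = data.get("samplesReadCounts")
        space = data.get("samplingSpaceSizes")
        if not read or not space:
            # Fields are absent, so this report used all of the sessions.
            rates.append(None)
            continue
        # The API returns these counts as strings, so convert before dividing.
        rates.append([int(r) / int(s) for r, s in zip(read, space)])
    return rates
```

A rate of 0.25 means the report was built from a quarter of the sessions in the selected period.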
To stay under this limit, we can use shorter time periods when generating a report. Google Analytics Premium also makes it possible to generate custom reports without sampling, even when the data exceeds the limit at which sampling would normally be applied.
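The shorter-period approach can be sketched as a small helper that cuts one long date range into consecutive chunks, each of which can then be requested separately. The name `split_date_range` and the chunk length are assumptions for illustration.

```python
from datetime import date, timedelta


def split_date_range(start, end, days_per_chunk):
    """Split [start, end] into consecutive (start, end) pairs of ISO dates.

    Each chunk spans at most days_per_chunk days, both ends inclusive.
    """
    if days_per_chunk < 1:
        raise ValueError("days_per_chunk must be at least 1")
    chunks = []
    current = start
    while current <= end:
        # The last chunk is cut short so it never runs past the end date.
        chunk_end = min(current + timedelta(days=days_per_chunk - 1), end)
        chunks.append((current.isoformat(), chunk_end.isoformat()))
        current = chunk_end + timedelta(days=1)
    return chunks
```

Keep in mind that only additive metrics such as sessions can simply be summed across the chunks. Rates and unique-user counts have to be recomputed from their underlying totals.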
In short, session sampling is effective at reducing latency when generating a report and lets Google Analytics process custom requests efficiently, so that you can quickly get answers to your questions about what is happening on your website or in your application.