The Analytics API response is returning duplicate entries

I'm extracting Guru data using the List Analytics Data endpoint. 


The problem

The response includes duplicate entries. For example, the object below appears several times, with exactly the same value for every field:

{
    'type': 'search',
    'eventType': 'search',
    'eventDate': '2021-11-03T16:00:06.299+0000',
    'user': <user email>,
    'properties': {
        'searchId': <search ID>,
        'searchTerms': <search terms>,
        'source': 'UI',
        'numberOfResults': '0'
    }
}

For the ‘search’ type shown above, I’ve been advised that a search triggered in the UI runs as a separate event for each collection the user has access to, i.e. each duplicate entry corresponds to one of the collections that was searched.


We use this response to store analytics data in a database, which users in our business then query for reporting. For these users it is not obvious why the database contains duplicate records, e.g. we get asked:

Are they genuine duplicates, or was there an issue with the data transfer or a data integrity issue in Guru?


The solution 

Ideally each element in the API response would be unique in some way, e.g. have a unique ID or a primary key which can be defined as a group of fields. 
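
To illustrate why no such key can be built today, here is a minimal Python sketch. The user, search ID and search terms are made-up placeholders, and the three identical events stand in for one UI search across three collections, as per the explanation I was given. Even when every field is folded into a composite key, the key still collides:

```python
import json
from collections import Counter

# Made-up sample: one UI search returned as three identical events.
# The user, searchId and searchTerms values are placeholders, not real data.
event = {
    "type": "search",
    "eventType": "search",
    "eventDate": "2021-11-03T16:00:06.299+0000",
    "user": "user@example.com",
    "properties": {
        "searchId": "abc-123",
        "searchTerms": "holiday policy",
        "source": "UI",
        "numberOfResults": "0",
    },
}
events = [dict(event) for _ in range(3)]


def composite_key(e):
    """Serialize every field (with sorted keys) so equal events give equal keys."""
    return json.dumps(e, sort_keys=True)


# Count how many events share each composite key.
key_counts = Counter(composite_key(e) for e in events)
distinct_keys = len(key_counts)
max_repeats = max(key_counts.values())
print(distinct_keys, max_repeats)
```

With the current fields there is only one distinct key for the three events, so nothing in the data tells a per-collection event apart from a genuine duplicate.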

In the above example, adding a ‘collectionId’ field to the ‘properties’ element would do the trick: for a given ‘searchId’ value you could then see multiple ‘collectionId’ values, and the data would become easier to interpret.
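
A rough Python sketch of how this could look on our side. Note that ‘collectionId’ is the proposed field and does not exist in the current response, and all values below are made up for illustration:

```python
from collections import defaultdict


def make_event(collection_id):
    """Build a sample search event; 'collectionId' is the PROPOSED field."""
    return {
        "type": "search",
        "eventType": "search",
        "eventDate": "2021-11-03T16:00:06.299+0000",
        "user": "user@example.com",  # placeholder
        "properties": {
            "searchId": "abc-123",  # placeholder
            "collectionId": collection_id,  # hypothetical, not in the API today
            "searchTerms": "holiday policy",  # placeholder
            "source": "UI",
            "numberOfResults": "0",
        },
    }


# One UI search, reported once per collection searched.
events = [make_event(c) for c in ("col-1", "col-2", "col-3")]

# Group the collections searched under each searchId.
collections_by_search = defaultdict(set)
for e in events:
    props = e["properties"]
    collections_by_search[props["searchId"]].add(props["collectionId"])

# (searchId, collectionId) now works as a primary key: one key per event.
primary_keys = {
    (e["properties"]["searchId"], e["properties"]["collectionId"]) for e in events
}
print(len(primary_keys) == len(events))
```

With a key like this, any rows that still share the same (searchId, collectionId) pair could be treated as true duplicates rather than per-collection events.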




