Vector: Query vector
A query vector is a tokenized version of a query sequence for a given timeframe. This method is still a work in progress. The focus so far has been on obfuscating the sequences to avoid re-identification of the client, which limits the effectiveness of the analysis. The token space is deliberately small, and tokens are generated by a hash function for which collisions are to be expected. Any new tokens found are submitted together with the vectors.
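The tokenization described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the source does not name the hash function, so BLAKE2b truncated to 4 bytes stands in for it, and the function names `token_for`, `build_vector`, and `new_words`, as well as the concatenated-token layout of a vector, are hypothetical.

```python
import hashlib

TOKEN_BYTES = 4  # 32-bit tokens, as stated in the data table


def token_for(word: str) -> bytes:
    """Return a 32-bit token for a word.

    Assumption: BLAKE2b with a 4-byte digest is used purely for
    illustration; the actual hash function is not specified.
    """
    return hashlib.blake2b(word.encode("utf-8"), digest_size=TOKEN_BYTES).digest()


def build_vector(query_sequence: list[str]) -> bytes:
    """Concatenate the tokens of a query sequence into one bytestring.

    Assumption: a vector is the plain concatenation of its tokens.
    """
    return b"".join(token_for(word) for word in query_sequence)


def new_words(query_sequence: list[str], default_wordlist: set[str]) -> list[bytes]:
    """Collect the words that are not on the default list, in first-seen order."""
    seen: set[str] = set()
    delta: list[bytes] = []
    for word in query_sequence:
        if word not in default_wordlist and word not in seen:
            seen.add(word)
            delta.append(word.encode("utf-8"))
    return delta
```

Because a token is only 32 bits wide, two different words can map to the same token. The sketch does nothing to prevent that, which matches the expectation that collisions occur.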
Data
# | Name | Type | Required | Comment |
---|---|---|---|---|
1 | start_time | Timestamp | yes | Starting point for vector |
2 | duration | Integer | yes | Vector length in seconds |
3 | vectors | list<Bytestring> | yes | Vectors for all clients for the given time window. Each vector consists of tokens, and each token is a 32-bit hash of the word it represents |
4 | Wordlist delta | list<Bytestring> | yes | Wordlist for all tokens not on the default list, i.e. the list of new words |
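As a rough illustration of how a record with these four fields might be represented in code, the following sketch uses a Python dataclass. The class name `QueryVectorRecord`, the attribute name `wordlist_delta`, the validation rules, and the assumption that each vector is a concatenation of 4-byte tokens are all hypothetical; the table only fixes the field types.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

TOKEN_BYTES = 4  # each token is a 32-bit hash


@dataclass
class QueryVectorRecord:
    start_time: datetime         # starting point for the vector
    duration: int                # vector length in seconds
    vectors: list[bytes]         # one bytestring per client
    wordlist_delta: list[bytes]  # words not on the default list

    def __post_init__(self) -> None:
        # Hypothetical sanity checks, not taken from the specification.
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")
        for vector in self.vectors:
            if len(vector) % TOKEN_BYTES != 0:
                raise ValueError("vector length must be a multiple of 4 bytes")

    def tokens(self, index: int) -> list[bytes]:
        """Split the vector at the given index into its 32-bit tokens."""
        vector = self.vectors[index]
        return [vector[i:i + TOKEN_BYTES] for i in range(0, len(vector), TOKEN_BYTES)]
```

A record could then be built with, for example, `QueryVectorRecord(datetime(2024, 1, 1, tzinfo=timezone.utc), 60, [b"\x00\x01\x02\x03"], [])`, where all values are made up for the example.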