==Approach:==
To generate the point word-clouds, the texts are analyzed using the natural language processing technique ''word2vec''. The resulting vector space has very high dimensionality and therefore cannot be easily visualized. To reduce the high-dimensional space to three dimensions, ''t-distributed stochastic neighbor embedding'' (t-SNE) is used, which keeps words close together that were close in the high-dimensional space.
[[File:VRI-LEK-Epochs.PNG|400px]]
[[File:VRI-LEK-Graph.png|400px]]
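The claim that nearby words stay nearby after the reduction can be checked with a simple neighbour-overlap score. The following is only a minimal sketch and not part of the project's pipeline: the function names, the toy vectors, and the use of Euclidean distance are all assumptions made for illustration. Real input would be the ''word2vec'' vectors and the three-dimensional coordinates produced by the embedding.

```python
import math

def nearest_neighbours(vectors, word, k):
    """Return the k words closest to `word` by Euclidean distance."""
    target = vectors[word]
    others = [w for w in vectors if w != word]
    others.sort(key=lambda w: math.dist(target, vectors[w]))
    return others[:k]

def neighbour_overlap(high_dim, low_dim, k=1):
    """Average fraction of each word's k nearest neighbours kept after reduction."""
    total = 0.0
    for word in high_dim:
        before = set(nearest_neighbours(high_dim, word, k))
        after = set(nearest_neighbours(low_dim, word, k))
        total += len(before & after) / k
    return total / len(high_dim)

# Toy data, invented for this example: 5-dimensional stand-ins for
# word vectors and hand-picked 3D coordinates for the same words.
high_dim = {
    "king":  (0.9, 0.8, 0.1, 0.0, 0.2),
    "queen": (0.8, 0.9, 0.1, 0.1, 0.2),
    "apple": (0.1, 0.0, 0.9, 0.8, 0.7),
    "pear":  (0.0, 0.1, 0.8, 0.9, 0.7),
}
low_dim = {
    "king":  (1.0, 1.0, 0.0),
    "queen": (1.1, 0.9, 0.0),
    "apple": (-1.0, -1.0, 0.5),
    "pear":  (-0.9, -1.1, 0.5),
}

score = neighbour_overlap(high_dim, low_dim, k=1)
print(score)
```

A score near 1 means that most words keep their closest neighbours in the three-dimensional layout, while a score near 0 means the local structure was lost.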
The resulting data is then imported into Unity using a ''csv'' file, and for every data point a billboard text of the word is generated. This process is repeated for every text.
[[File:VRI-LEK-5.png|400px]]
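The export step could look roughly like the sketch below. The column layout (<code>word, x, y, z</code>), the four-decimal formatting, and the helper name <code>write_points</code> are assumptions for illustration, since the actual file format used by the project is not documented here.

```python
import csv
import io

def write_points(rows, handle):
    """Write one header line, then one 'word,x,y,z' line per data point."""
    writer = csv.writer(handle)
    writer.writerow(["word", "x", "y", "z"])
    for word, (x, y, z) in rows:
        writer.writerow([word, f"{x:.4f}", f"{y:.4f}", f"{z:.4f}"])

# Hypothetical 3D coordinates for two words.
points = [
    ("king", (1.0, 1.0, 0.0)),
    ("queen", (1.1, 0.9, 0.0)),
]

# An in-memory buffer keeps the example self-contained; for a real file,
# open it with open(path, "w", newline="") and pass the handle instead.
buffer = io.StringIO()
write_points(points, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

On the Unity side, each line after the header can then be split at the commas to obtain the word for the billboard text and its position.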