==Approach:==
To generate the point word-clouds, the texts are analyzed using the natural language processing technique ''word2vec''.
[[File:VRI-LEK-Epochs.PNG|none|1000px|Learning Process]]
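As a rough illustration of what ''word2vec'' learns from, the following sketch builds the (center word, context word) pairs used by the skip-gram variant of the technique. Which variant and which library the project actually used is not stated here, so the function name <code>skipgram_pairs</code>, the window size and the sample sentence are assumptions made only for this example.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions.

    Hypothetical helper for illustration; not taken from the project code.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Assumed toy input; a real run would tokenize each analyzed text.
tokens = "the quick brown fox".split()
pairs = skipgram_pairs(tokens, window=1)
```

During training, the model adjusts the word vectors so that words appearing in similar contexts end up close to each other in the vector space.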
The resulting vector space is of very high dimensionality and thus cannot be easily visualized. To reduce the high-dimensional space to three dimensions, the method ''t-distributed stochastic neighbor embedding'' (t-SNE) is used, which keeps words close together that were close in the high-dimensional space.
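The reduction step could look roughly like the sketch below, which assumes scikit-learn's <code>TSNE</code> class and NumPy are available. The random input vectors, the 50 input dimensions and the perplexity value are placeholders chosen for the example, not values from the project.

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder data: 30 "word vectors" with 50 dimensions each.
# In the real pipeline these would come from the trained word2vec model.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(30, 50))

# Reduce to three dimensions. Perplexity must be smaller than the number
# of samples; 5 is an arbitrary choice for this small example.
tsne = TSNE(n_components=3, perplexity=5, init="random", random_state=0)
points_3d = tsne.fit_transform(vectors)
```

Each row of <code>points_3d</code> is the 3D position of one word. t-SNE mainly preserves local neighborhoods, so distances between far-apart clusters should be interpreted with care.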
[[File:VRI-LEK-Graph.png|frame|500px|Resulting Graph]]
The resulting data is then imported into Unity using a ''csv''-file, and for every data point a billboard text of the word is generated. This process is repeated for every text.
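A minimal sketch of the export side of this step, using Python's standard <code>csv</code> module, is shown below. The column layout (<code>word, x, y, z</code>), the helper name <code>write_points_csv</code> and the sample coordinates are assumptions; the actual file format read by the Unity project is not documented here.

```python
import csv
import io


def write_points_csv(words, coords, stream):
    """Write one row per word with its 3D position (hypothetical layout)."""
    writer = csv.writer(stream)
    writer.writerow(["word", "x", "y", "z"])
    for word, (x, y, z) in zip(words, coords):
        # Fixed precision keeps the file compact and easy to parse.
        writer.writerow([word, f"{x:.4f}", f"{y:.4f}", f"{z:.4f}"])


# Assumed sample data; real coordinates would come from the t-SNE output.
words = ["king", "queen"]
coords = [(0.1, -1.25, 3.0), (0.2, -1.0, 2.5)]

buffer = io.StringIO()  # stands in for a file opened with newline=""
write_points_csv(words, coords, buffer)
csv_text = buffer.getvalue()
```

On the Unity side, each row would then be read back and turned into one billboard text placed at the given position.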