==Approach==
To generate the point word-clouds, the texts are analyzed using the natural language processing technique ''word2vec''. The resulting vector space has very high dimensionality and therefore cannot be visualized directly. To reduce the high-dimensional space to three dimensions, the method ''t-distributed stochastic neighbor embedding'' (t-SNE) is used, which keeps words close together that were close in the high-dimensional space.
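The reduction step can be sketched as follows. This is a minimal illustration, not the project's actual code: random vectors stand in for trained ''word2vec'' embeddings (which could be produced, e.g., with gensim), and scikit-learn's t-SNE implementation is assumed.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
words = [f"word{i}" for i in range(50)]   # hypothetical vocabulary
vectors = rng.normal(size=(50, 100))      # stand-in for 100-d word2vec vectors

# t-SNE projects the 100-d vectors down to 3 dimensions while keeping
# points close together that were close in the high-dimensional space.
tsne = TSNE(n_components=3, perplexity=10, random_state=0)
coords = tsne.fit_transform(vectors)      # array of shape (50, 3)
```

Each row of `coords` is then the 3-D position of one word in the point cloud.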
[[File:VRI-LEK-Epochs.PNG|none|1000px|Learning Process]]
[[File:VRI-LEK-Graph.png|frame|500px|Resulting Graph]]
The resulting data is then imported into Unity via a ''CSV'' file, and for every data point a billboard text of the word is generated. This process is repeated for every text.
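The export side of this step might look like the sketch below, which writes one row per word with its 3-D coordinates; the file name and column layout are assumptions, chosen so a Unity-side importer could parse each row into a position and a label.

```python
import csv

# Hypothetical (word, x, y, z) points from the t-SNE reduction.
points = [
    ("king",  1.2, -0.5,  3.1),
    ("queen", 1.1, -0.4,  3.0),
    ("apple", -2.3, 0.8, -1.7),
]

with open("wordcloud.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["word", "x", "y", "z"])  # header row
    writer.writerows(points)                  # one data point per line
```

On the Unity side, each row would then be read back, the word spawned as a billboard text object, and its transform set from the `x`, `y`, `z` columns.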