[[File:VRI-LEK-Front.png]]
==Context:==
While there is a lot of talk about the obscure, dark or funny places on the internet, I wanted to shine a light on the places where people talk about serious topics and search for genuine advice or motivation. Furthermore, I want to explore the possibilities of analysing these spaces using word embeddings and presenting the results in a way that reflects the feeling these communities convey.
The communities I am talking about are various subreddits that have one thing in common: by taking part in or reading the discussions, one feels a little bit of internet sentimentality while being fully aware that most of what is written there consists of empty phrases, words that are deprived of their meaning and only gain a new one from the context they are presented in.
And while that was an empty phrase as well, you are invited to walk the endless plains of ''The Internet is a calm and soothing place'' and explore the various accumulations of words and letters that form the space of serious internet conversations.
----
==Concept:==
The viewer experiences the word embeddings by wandering through a minimalistic, procedurally generated world. To guide the viewer towards the "word-spheres", a swarm of letters follows the viewer; letters are born and live while the viewer is close to a word-sphere and die the further away the viewer gets. After interacting with a word-sphere, the viewer is presented with the word embedding and can explore the word relations by formulating sentences or searching for words, which are then shown in the word-cloud.
The world is empty and endless, and the player feels anonymous. No matter which direction the player goes, the flavour of the surrounding world might change, but the monolithic structures and the repetitive grassy hills stay the same forever. The subreddits analyzed are:
*[https://www.reddit.com/r/confession/ /r/confession]
*[https://www.reddit.com/r/GetMotivated/ /r/GetMotivated]
*[https://www.reddit.com/r/offmychest/ /r/offmychest]
*[https://www.reddit.com/r/quotes/ /r/quotes]
*[https://www.reddit.com/r/relationship_advice/ /r/relationship_advice]
*[https://www.reddit.com/r/relationships/ /r/relationships]
*[https://www.reddit.com/r/SeriousConversation/ /r/SeriousConversation]
*[https://www.reddit.com/r/UpliftingNew/ /r/UpliftingNew]
*[https://www.reddit.com/r/CasualConversation/ /r/CasualConversation]
All these subreddits are analyzed separately and scattered as interactable spheres all over the generated world. When interacting with one of these spheres, the viewer sees a short description of the forum and then the corresponding word-cloud.
<gallery>
File:VRI-LEK-1.png
File:VRI-LEK-6.png
File:VRI-LEK-3.png
File:VRI-LEK-4.png
</gallery>
----
==Approach:==
To analyze the relationships between words, the content of the subreddits had to be extracted first. The ''Pushshift'' API was used to crawl all posts and comments, which were then merged into one file per subreddit.
[[File:VRI-LEK-RawJSONData.png|none|1000px|Raw JSON data]]
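The merge step could look roughly like the sketch below. It assumes that every crawled batch was already saved as a JSON file containing a list of post or comment objects with a unique "id" field; the directory layout and the helper name ''merge_batches'' are hypothetical and not taken from the project.

```python
import json
from pathlib import Path


def merge_batches(batch_dir: Path, out_file: Path) -> int:
    """Merge every *.json batch in batch_dir into one JSON list on disk.

    Assumption: each batch file holds a JSON list of post/comment objects
    that carry a unique "id" field (hypothetical layout).
    Returns the number of unique items written.
    """
    merged = {}
    for batch in sorted(batch_dir.glob("*.json")):
        for item in json.loads(batch.read_text(encoding="utf-8")):
            # A later batch overwrites an earlier item with the same id,
            # which drops duplicates from overlapping crawl windows.
            merged[item["id"]] = item
    out_file.write_text(json.dumps(list(merged.values())), encoding="utf-8")
    return len(merged)
```

Writing the merged file outside of the batch directory keeps it from being picked up as a batch on the next run.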
To generate the word-clouds, the text of the posts and comments is extracted from the JSON files and analyzed using the natural language processing technique ''word2vec''.
[[File:VRI-LEK-Epochs.PNG|none|1000px|Learning Process]]
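To give an idea of what word2vec is trained on, the following sketch extracts the text of an item, tokenizes it and builds the (center, context) word pairs used by the skip-gram variant of word2vec. It does not train a model. Whether the project used skip-gram or CBOW is not stated, and the field names "title", "selftext" and "body" are the usual Reddit JSON fields, assumed here rather than confirmed by the source.

```python
import re


def item_text(item: dict) -> str:
    """Join the text fields of a post or comment.

    Assumption: posts carry "title"/"selftext" and comments carry "body".
    """
    parts = (item.get(key, "") for key in ("title", "selftext", "body"))
    return " ".join(p for p in parts if p)


def tokenize(text: str) -> list:
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def skipgram_pairs(tokens: list, window: int = 2) -> list:
    """Return (center, context) pairs for words at most `window` apart."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs
```

Words that often share a context end up with similar vectors, which is what the word-clouds later make visible.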
The resulting vector space is of very high dimensionality and thus cannot be visualized easily. To reduce it to three dimensions, ''t-distributed stochastic neighbor embedding'' (t-SNE) is used, which keeps words close together that are close in the high-dimensional space.
[[File:VRI-LEK-Graph.png|none|500px|Resulting Graph]]
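For reference, the standard t-SNE objective can be written as follows. The notation is introduced here and not taken from the project: x_i is the high-dimensional word vector, y_i the corresponding three-dimensional point, sigma_i a per-point bandwidth derived from the chosen perplexity, and N the number of words.

```latex
\begin{aligned}
p_{j\mid i} &= \frac{\exp\!\left(-\lVert x_i - x_j\rVert^2 / 2\sigma_i^2\right)}
                   {\sum_{k\neq i}\exp\!\left(-\lVert x_i - x_k\rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j\mid i} + p_{i\mid j}}{2N},\\[4pt]
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j\rVert^2\right)^{-1}}
              {\sum_{k\neq l}\left(1 + \lVert y_k - y_l\rVert^2\right)^{-1}},\\[4pt]
C &= \mathrm{KL}(P\,\Vert\,Q) = \sum_{i\neq j} p_{ij}\,\log\frac{p_{ij}}{q_{ij}}.
\end{aligned}
```

Minimizing C with respect to the points y_i penalizes placing words far apart in 3D when they are neighbours in the original space, which is why local neighbourhoods survive the reduction.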
The resulting data is then imported into Unity as a ''csv'' file, and a billboard text of the word is generated for every data point. This process is repeated for the text of every subreddit.
[[File:VRI-LEK-5.png|none|500px]]
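The hand-over between the analysis and Unity might be sketched like this. The column layout (word, x, y, z) and both helper names are assumptions for illustration; the actual file format of the project is not documented here.

```python
import csv
import io


def write_embedding_csv(words, coords, handle) -> None:
    """Write one row per word: word, x, y, z (hypothetical column layout)."""
    writer = csv.writer(handle)
    writer.writerow(["word", "x", "y", "z"])
    for word, (x, y, z) in zip(words, coords):
        writer.writerow([word, f"{x:.4f}", f"{y:.4f}", f"{z:.4f}"])


def read_embedding_csv(handle) -> dict:
    """Read the file back into {word: (x, y, z)}, as an importer would."""
    reader = csv.DictReader(handle)
    return {row["word"]: (float(row["x"]), float(row["y"]), float(row["z"]))
            for row in reader}


if __name__ == "__main__":
    buffer = io.StringIO()
    write_embedding_csv(["love", "advice"], [(1.0, 2.5, -3.0), (0.0, 0.25, 4.0)], buffer)
    buffer.seek(0)
    print(read_embedding_csv(buffer))
```

On the Unity side, each parsed row would then become one billboard text placed at the given coordinates.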
The world in Unity that the viewer walks through is generated using tileable noise as displacement for a plane. As the viewer walks through the world, new chunks are generated on the fly, giving the illusion of an infinite world.
[[File:VRI-LEK-WorldGenerator.mp4|none|500px]]
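The idea of tileable noise and on-the-fly chunks can be illustrated with a small sketch. It uses simple value noise on a wrapped lattice, which is an assumption; the project may use a different noise function, and the names ''make_tileable_noise'' and ''visible_chunks'' are hypothetical.

```python
import math
import random


def make_tileable_noise(period: int, seed: int):
    """Return noise(x, z) in [0, 1] that repeats every `period` units."""
    rng = random.Random(seed)
    lattice = [[rng.random() for _ in range(period)] for _ in range(period)]

    def smooth(t: float) -> float:
        return t * t * (3.0 - 2.0 * t)  # smoothstep easing

    def noise(x: float, z: float) -> float:
        x0, z0 = math.floor(x), math.floor(z)
        tx, tz = smooth(x - x0), smooth(z - z0)
        # Wrapping the lattice indices is what makes the noise tile.
        a = lattice[x0 % period][z0 % period]
        b = lattice[(x0 + 1) % period][z0 % period]
        c = lattice[x0 % period][(z0 + 1) % period]
        d = lattice[(x0 + 1) % period][(z0 + 1) % period]
        near = a + (b - a) * tx
        far = c + (d - c) * tx
        return near + (far - near) * tz

    return noise


def visible_chunks(player_x: float, player_z: float, chunk_size: float, radius: int):
    """Chunk coordinates within `radius` chunks of the player's chunk."""
    cx = math.floor(player_x / chunk_size)
    cz = math.floor(player_z / chunk_size)
    return [(cx + dx, cz + dz)
            for dx in range(-radius, radius + 1)
            for dz in range(-radius, radius + 1)]
```

Because the height values match at the tile borders, neighbouring chunks join without seams, and only the chunks returned by ''visible_chunks'' need to exist at any time.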
The grass, cuboids and word-spheres are generated and distributed per tile using seeded randomness. Every tile has its own noise and therefore its own distribution pattern of objects, which makes the world feel even more endless. A swarm of letters driven by the boid algorithm guides the player through the world and towards the word-spheres. The closer the viewer's view direction is to a word-sphere, the more letters are in the swarm and the closer they fly to each other. The boid algorithm simulates the flocking behaviour of birds by enforcing simple rules for every boid, namely separation, alignment and cohesion, with respect to all other boids. A compute shader is used to speed up the simulation.
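Seeded per-tile distribution can be sketched as follows: the same tile always receives the same objects, no matter when it is generated. The mixing constants and the helper names ''tile_seed'' and ''scatter_objects'' are illustrative assumptions, not the project's code.

```python
import random


def tile_seed(world_seed: int, tile_x: int, tile_z: int) -> int:
    """Mix the world seed and the tile coordinates into one integer seed."""
    return (world_seed * 73856093) ^ (tile_x * 19349663) ^ (tile_z * 83492791)


def scatter_objects(world_seed, tile_x, tile_z, tile_size, count):
    """Return `count` (x, z) positions inside the tile, reproducibly."""
    rng = random.Random(tile_seed(world_seed, tile_x, tile_z))
    origin_x, origin_z = tile_x * tile_size, tile_z * tile_size
    return [(origin_x + rng.random() * tile_size,
             origin_z + rng.random() * tile_size)
            for _ in range(count)]
```

Since nothing has to be stored per tile, a tile can be discarded when the viewer leaves and rebuilt identically on return.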
[[File:VRI-LEK-Boids.mp4|none|500px]]
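The three boid rules can be written down in a few lines. The sketch below is a plain CPU version in 2D with illustrative weights; the project runs its simulation in a compute shader, and none of the parameter values here are taken from it.

```python
import math


def boids_step(positions, velocities, dt=0.1, radius=float("inf"),
               w_sep=1.5, w_align=1.0, w_coh=1.0, max_speed=4.0):
    """Advance every boid by one time step and return (positions, velocities).

    positions and velocities are lists of (x, y) tuples. With the default
    infinite radius, every boid reacts to all other boids.
    """
    new_pos, new_vel = [], []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        sep_x = sep_y = vel_x = vel_y = cen_x = cen_y = 0.0
        count = 0
        for j, (q, u) in enumerate(zip(positions, velocities)):
            if i == j:
                continue
            dx, dy = p[0] - q[0], p[1] - q[1]
            dist = math.hypot(dx, dy)
            if dist >= radius:
                continue
            count += 1
            if dist > 0.0:
                # Separation: push away, more strongly when closer.
                sep_x += dx / (dist * dist)
                sep_y += dy / (dist * dist)
            vel_x += u[0]
            vel_y += u[1]
            cen_x += q[0]
            cen_y += q[1]
        ax = ay = 0.0
        if count:
            # Alignment steers towards the average velocity of the others,
            # cohesion steers towards their centre of mass.
            ax = (w_sep * sep_x + w_align * (vel_x / count - v[0])
                  + w_coh * (cen_x / count - p[0]))
            ay = (w_sep * sep_y + w_align * (vel_y / count - v[1])
                  + w_coh * (cen_y / count - p[1]))
        vx, vy = v[0] + ax * dt, v[1] + ay * dt
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        new_pos.append((p[0] + vx * dt, p[1] + vy * dt))
        new_vel.append((vx, vy))
    return new_pos, new_vel
```

Every boid looks at every other boid, so the cost grows quadratically with the swarm size, which is the part a compute shader parallelizes well.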
----
==Reflection / Outlook:==
The interaction with the word-cloud would not work with a real VR controller: text input would need a heads-up keyboard, while scaling and rotation could work with two controllers. To fully benefit from the word embeddings, it would be great to be able to do simple arithmetic in the word-vector space, maybe by dragging and dropping words onto each other.
To further underline the idea of an immersive walk, there should be ambient and interaction sound effects.
The distribution of things in the world should not be totally random, and interaction should have consequences.
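The arithmetic in the word-vector space mentioned above could work along these lines. The vectors are invented toy values with three dimensions, chosen only so that the example is easy to follow; real word2vec vectors have many more dimensions, and the helper names are hypothetical.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two vectors, 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(vectors, positive, negative, top_n=1):
    """Rank words by similarity to sum(positive) - sum(negative)."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    exclude = set(positive) | set(negative)
    ranked = sorted((w for w in vectors if w not in exclude),
                    key=lambda w: cosine(target, vectors[w]), reverse=True)
    return ranked[:top_n]


# Invented toy vectors: (royalty, male, female).
TOY_VECTORS = {
    "king": (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "man": (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
    "apple": (0.1, 0.1, 0.1),
}

if __name__ == "__main__":
    print(analogy(TOY_VECTORS, positive=["king", "woman"], negative=["man"]))
```

Dropping one word onto another could map to adding it to the positive list, while a second gesture could move it to the negative list.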
----
==Videos==
{{#ev:youtube|DK_HU3QfhTg|735}}
{{#ev:youtube|_ngl4XmXYD8|735}}
{{#ev:youtube|7c2ym-1aKjI|735}}
{{#ev:youtube|mmFLRg5O5zk|735}}
==Media:==
===Wordcloud===
[[File:VRI-LEK-4.png|1000px]]
===Boids===
[[File:VRI-LEK-1.png|1000px]]
===Lettertrails===
[[File:VRI-LEK-2.png|1000px]]
===Wordsphere===
[[File:VRI-LEK-3.png|1000px]]
==/r/relationship==
[[File:VRI-LEK-6.png|1000px]]
==Build:==
Proceed with caution! The build may take two to three minutes to load because of the large word-cloud files. It is not very optimised, since 30,000+ words are being rendered.
[https://drive.google.com/file/d/17MF2j1sobrcWl0_eN8dmqjfruaVi6d-A/view?usp=sharing Build]
==Further Reading:==
*[https://lvdmaaten.github.io/tsne/ t-SNE by Laurens van der Maaten]
*[https://arxiv.org/abs/1301.3781 word2vec paper]
*[http://www.red3d.com/cwr/boids/ Boids by Craig Reynolds]
Latest revision as of 10:19, 5 November 2020