Recently, the team of Professors Zhu Songchun and Zhu Yixin of the Institute of Artificial Intelligence published the paper "Emergent Graphical Conventions in a Visual Communication Game" at NeurIPS 2022.
For the first time, the work computationally simulates the emergence and evolution of a new graphical symbol system through a "you draw, I guess" game, and it proposes three properties of graphical symbols: iconicity (pictographic quality), symbolicity, and semanticity.
The experimental results show that three factors together encourage the newly formed graphical symbol system to retain iconicity and semanticity while achieving high symbolicity: cooperative training of the two players, the receiver's ability to terminate the game early, and interactive communication between the two over multiple time steps.
This work provides a new computational framework and new ideas for studying the origin and evolution of human language and writing.
Research in cognitive science suggests that writing systems formed through a progression from pictographic icons to abstract symbols [1].
As illustrated, when describing the sun, human ancestors drew sketches that resembled the sun's appearance in nature as closely as possible [2].
In this process, people gradually established a connection between visual concepts and pictograms.
In subsequent communication, these icons were reused whenever people needed to describe the sun.
To communicate more efficiently, the icons became simpler and more abstract, gradually forming the hieroglyphic writing systems we know today.
To study this process, cognitive scientists used the "you draw, I guess" game to simulate it [3].
Players must communicate through sketches from the start of the game, and as the game progresses they repeatedly see what has been communicated before. The experimental results show that, after repeated rounds of communication, the two players form a new symbol system between them.
In the figure below, to represent the British Parliament, the player first draws the parliament building and the national flag in detail; after repeated rounds of refinement, the drawing is reduced to curves and circles.
Similarly, for "soap opera", the sketch first depicts "soap" and "opera" figuratively and is later simplified to a square and a line.
This paper simulates the formation of graphical symbol systems by training two agents to play the "you draw, I guess" game, explores the trade-off between accuracy and efficiency in the formation of abstract symbols, and verifies which environmental factors are necessary for human graphical symbol systems to form.
As shown in the figure, we formulate "you draw, I guess" as a multi-agent sequential decision-making game with two players in each round. The sender observes the target to be communicated in this round (a common visual concept, such as a rabbit or a cup). The receiver observes a set of pictures, one of which belongs to the target category, and must guess from the sender's drawing which picture is the target.
At each time step, the sender adds to the drawing on the canvas based on the target; after observing the newly added strokes, the receiver decides whether to ask the sender to keep drawing or to make a guess.
The game terminates when the receiver makes a guess or when the number of steps reaches the game's maximum.
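The round structure described above can be sketched as a simple loop. This is only an illustrative sketch, not the authors' implementation: the names `play_round`, `draw_step`, and `decide`, the stroke encoding, and the choice to count a round with no guess at the final step as a failure are all assumptions made for this example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stroke encoding (x0, y0, x1, y1); the article does not specify one.
Stroke = Tuple[float, float, float, float]


def play_round(
    draw_step: Callable[[int, List[Stroke]], Stroke],
    decide: Callable[[List[Stroke], bool], Optional[int]],
    target: int,
    max_steps: int,
) -> Tuple[bool, int]:
    """Play one round and return (guess_correct, steps_used).

    draw_step(target, canvas) returns the sender's next stroke.
    decide(canvas, final) returns the index of the guessed picture,
    or None to ask the sender to keep drawing.
    """
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    canvas: List[Stroke] = []
    for step in range(1, max_steps + 1):
        # The sender adds one stroke based on the target and the canvas so far.
        canvas.append(draw_step(target, canvas))
        # The receiver either guesses or asks for more drawing.
        guess = decide(canvas, step == max_steps)
        if guess is not None:
            return guess == target, step
    # Assumption: no guess by the last step counts as a failed round.
    return False, max_steps
```

With a toy sender that encodes the target in every stroke and a toy receiver that guesses after two strokes, `play_round` returns `(True, 2)`; a receiver that never guesses runs to `max_steps` and fails.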
After the game terminates, both players receive a reward of +1 or a penalty of -1. To encourage more efficient communication, the reward or penalty is multiplied by a decay factor γ according to the total number of game steps t, so the two players finally receive γ^t or -γ^t.
The training objective of the sender and the receiver is to maximize this final game score.
In addition, we smooth the convergence process with eligibility traces [4].
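As a small worked example of the decayed reward, the function below computes γ^t for a successful round and -γ^t for a failed one. It is a sketch under stated assumptions: the function name `final_reward` and the range check on γ are illustrative, and the article does not report the value of γ that was used.

```python
def final_reward(correct: bool, steps: int, gamma: float) -> float:
    """Return gamma**steps for a correct guess and -(gamma**steps) otherwise."""
    if steps < 1:
        raise ValueError("a round lasts at least one step")
    if not 0.0 < gamma <= 1.0:
        # Assumption: the decay factor lies in (0, 1].
        raise ValueError("gamma is assumed to lie in (0, 1]")
    sign = 1.0 if correct else -1.0
    return sign * gamma ** steps
```

For example, with an illustrative γ of 0.5, a correct guess after 2 steps yields 0.25, while a correct guess after 3 steps yields only 0.125, so shorter successful rounds score higher.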
We explore how the following environmental factors influence the evolution of the symbol system:
1) whether the two agents are trained cooperatively;
2) whether the receiver can choose to terminate the game;
3) whether the two agents communicate interactively over multiple time steps.
To isolate each factor, we designed one experimental group, complete, and four control groups:
1) sender-fixed: the sender's model parameters are not updated, which controls for the cooperative training factor;
2) max-step: the receiver cannot end the game early, which controls for the receiver's ability to terminate the game;
3) one-step: the two players can communicate for only a single time step, which controls for the multi-step interactive communication factor;
4) retrieve: the sender's model parameters are not updated and the receiver cannot end the game early, which is equivalent to no communication between the two parties.
Because there is no communication in the fourth setting, the sketches are not simplified and have the highest iconicity, so we treat its experimental results as the upper bound that communication can reach.
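The five settings can be summarized as combinations of the three factors. The table below is a sketch of that summary only: the class name `Setting`, the field names, and the helper `ablated_factors` are hypothetical and do not come from the paper's code.

```python
from dataclasses import dataclass, fields
from typing import Dict, List


@dataclass(frozen=True)
class Setting:
    sender_trained: bool           # factor 1: cooperative training
    receiver_can_stop_early: bool  # factor 2: receiver may terminate the game
    multi_step: bool               # factor 3: communication over several steps


SETTINGS: Dict[str, Setting] = {
    "complete": Setting(True, True, True),
    "sender-fixed": Setting(False, True, True),
    "max-step": Setting(True, False, True),
    # With a single step, early termination has no effect; it is left True here.
    "one-step": Setting(True, True, False),
    "retrieve": Setting(False, False, True),
}


def ablated_factors(name: str) -> List[str]:
    """List the factors in which a setting differs from the complete setting."""
    baseline = SETTINGS["complete"]
    setting = SETTINGS[name]
    return [
        f.name
        for f in fields(Setting)
        if getattr(setting, f.name) != getattr(baseline, f.name)
    ]
```

Calling `ablated_factors("retrieve")` returns the two factors that setting switches off, which matches its description as a combination of sender-fixed and max-step.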
We also show how the drawings change during training (from left to right, each image is a sketch produced at between 0 and 30,000 iterations).
The sketches go from complex to simple, and for drawings of the same category the sender consistently emphasizes the most salient features of that category.
As shown in the picture, the sketches emphasize the rabbit's ears. Even though the giraffes in the pictures are in different poses (in the third image in particular, the giraffe bends its neck), the sketches still emphasize the giraffe's long vertical neck.
Communication success rate and communication efficiency
We first verify the effectiveness of the designed training framework through communication success rate and communication efficiency.
1. Communication success rate: We consider a new communication system to have formed between the agents when communication accuracy exceeds 80%.
As shown in Figure (a), the agents formed new communication systems in every experimental setting except one-step, which indicates that the training framework enables the agents to communicate successfully and illustrates the importance of interactive communication over multiple time steps.
2. Communication length: In human experiments, the number of strokes needed for a drawing decreases with repeated communication.
As shown in Figure (b), in the settings where the communication length can change (complete, sender-fixed), the communication length gradually decreases, which indicates that the implicit reward and penalty design prompts the agents to shorten their communication to improve efficiency.
3. Accuracy vs. efficiency: There are two possible reasons for the reduction in the agents' communication length: either the agents improve communication efficiency while maintaining accuracy, or they converge to short communication because long communication is hard to learn.
The first of these is the intended reason.
During training, we test the receiver's accuracy on sketches that the sender has drawn with 1, 3, 5, and 7 strokes.
As shown in the cumulative test results in Figure (c), when REINFORCE training is used as the baseline for comparison, sketches with more strokes have lower accuracy, indicating that the reduction in communication length there is due to the inability to learn in longer communications.
In contrast, under the proposed training framework, the accuracy of sketches with more strokes reaches its maximum first (guaranteeing accuracy), and the accuracy of sketches with fewer strokes then gradually rises to the 7-stroke accuracy (reducing the number of strokes to improve communication efficiency), indicating that the agents actively balance accuracy and efficiency.
Analysis of results: three attributes
To compare the strengths and weaknesses of the newly formed communication systems, we define three properties of graphical symbol systems and corresponding ways to measure them.
Iconicity: We define iconicity as a sketch lying close to its corresponding natural picture in a mapping space.
As shown in Figure 1, in the space Ψ, sketch S_A is close to its corresponding picture I_A and far from other pictures.
To measure iconicity, we test the agents' communication accuracy on unseen images or categories in each experimental setting.
As shown in the table, complete and sender-fixed can adjust the communication length according to their familiarity with the content being communicated: when encountering unfamiliar pictures and categories, the agents can increase the iconicity of their drawings by increasing the communication length.
Symbolicity: We define symbolicity as sketches of the same category being easy to tell apart from those of other categories in a high-dimensional mapping space.
As shown in Figure 2, there are clear boundaries between the different categories.
To measure symbolicity, we fine-tune an already trained VGGNet [5] to classify drawings belonging to different categories.
As shown in the bar chart, the symbol system formed under the complete setting has the highest consistency.
Semanticity: We define semanticity as the topology of the sketches in a high-dimensional mapping space being similar to the topology of their corresponding images.
As shown in Figure 3, the sketches and images of semantically similar concepts such as cats and dogs are relatively close together, while cups are far from them.
We first use word2vec [6] to project the name of each class into a vector space as feature A, and use the VGG trained for the second property to project the final evolved sketches into a vector space as feature B.
We then compute the correlation coefficient between the pairwise vector distances among all feature B vectors and the pairwise vector distances among all feature A vectors.
As the table shows, the complete setting best preserves semantics.
We also use t-SNE [7] to project the feature B vectors from the complete setting onto a two-dimensional plane. The boundaries between categories are very clear, and semantically similar categories such as cattle, deer, and horses are close to each other while being far from hamburgers, apples, etc.
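The correlation between distances in feature A and distances in feature B can be sketched as follows. The article does not say which distance or which correlation coefficient was used, so Euclidean distance and the Pearson coefficient are assumptions here, as is the use of one vector per class; the function names are illustrative.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence


def pairwise_distances(
    features: Dict[str, Sequence[float]], names: List[str]
) -> List[float]:
    """Euclidean distance for every unordered pair of classes, in a fixed order."""
    return [math.dist(features[a], features[b]) for a, b in combinations(names, 2)]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def semantic_correlation(
    word_features: Dict[str, Sequence[float]],
    sketch_features: Dict[str, Sequence[float]],
) -> float:
    """Correlate class-to-class distances in feature A with those in feature B."""
    names = sorted(word_features)
    if set(names) != set(sketch_features):
        raise ValueError("both feature sets must cover the same classes")
    return pearson(
        pairwise_distances(word_features, names),
        pairwise_distances(sketch_features, names),
    )
```

A value near 1 means that classes that are close in the word-embedding space are also close in the sketch-feature space, which is the sense in which a setting preserves semantics.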
In this work, the researchers simulated the formation of a new graphical symbol system using the "you draw, I guess" game.
The researchers verified the effectiveness of the training framework and proposed three properties of graphical symbols: iconicity, symbolicity, and semanticity.
The experimental results show that cooperative training of the players, the receiver's ability to terminate the game early, and interactive communication between the two over multiple time steps together encourage the newly formed graphical symbol system to retain iconicity and semanticity while achieving high symbolicity.
The researchers hope that this work will provide ideas for studying the evolution of hieroglyphic writing.
References:
[1] Fay, N., Ellison, M., and Garrod, S. (2014). Iconicity: From sign to system in human communication and language. Pragmatics & Cognition, 22(2):244–263.
[2] Hong, Y., Si, Z., Hu, W., Zhu, S. C., and Wu, Y. N. (2014). Unsupervised learning of compositional sparse code for natural image representation. Quarterly of Applied Mathematics, 373–406.
[3] Fay, N., Garrod, S., Roberts, L., and Swoboda, N. (2010). The interactive evolution of human communication systems. Cognitive Science, 34(3):351–386.
[4] Sutton, R. S. and Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
[5] Simonyan, K. and Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR).
[6] Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[7] Van der Maaten, L. and Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research (JMLR), 9(11).