- Communication is a goal: to transfer information from a source to a destination.
- If M and F are independent, I(M; F) = 0.
- If F deterministically maps to M, I(M; F) = H(M).
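These two boundary cases can be checked numerically. A minimal sketch (the toy joint distributions below are illustrative, not from the source): mutual information is zero when the joint factorizes, and equals H(M) when F is a deterministic function of M.

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def mutual_information(joint):
    """I(M;F) in bits from a joint distribution p(m, f) given as nested lists."""
    pm = [sum(row) for row in joint]          # marginal over M
    pf = [sum(col) for col in zip(*joint)]    # marginal over F
    return sum(
        joint[m][f] * math.log2(joint[m][f] / (pm[m] * pf[f]))
        for m in range(len(pm)) for f in range(len(pf))
        if joint[m][f] > 0
    )

# Independent M and F: the joint factorizes, so I(M;F) = 0.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

# F determined by M (a lossless code): I(M;F) = H(M) = 1 bit here.
deterministic = [[0.5, 0.0],
                 [0.0, 0.5]]

print(mutual_information(independent))    # 0.0
print(mutual_information(deterministic))  # 1.0
print(entropy([0.5, 0.5]))                # 1.0
```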
communication, information, source, destination, computational level goal, necessary subgoal
source, probability distribution over meaning, transmitter, encoder, function, meaning, form, receiver, decoder, noise source, reference resolution
information theory, information content, surprise, entropy, uncertainty, outcome, conditional entropy, mutual information, deterministic, KL divergence
- Mutual information: the information shared between two variables
- Kullback-Leibler (KL) divergence: the extra information (in bits) incurred by encoding samples from the true distribution P with a code optimized for distribution Q.
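The KL divergence definition can be sketched directly from its formula; the two example distributions below are illustrative choices, not from the source:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits: expected extra code length from encoding
    samples of P with a code optimized for Q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]   # true distribution
q = [0.9, 0.1]   # model distribution

print(kl_divergence(p, p))  # 0.0 -- identical distributions, nothing lost
print(kl_divergence(p, q))  # > 0 -- cost of assuming q when p is true
```

Note that D_KL is not symmetric: kl_divergence(p, q) and kl_divergence(q, p) generally differ.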
tool, frequent, homophone, word sense, information processing effect, reading time
- Language as a communication system seems optimized for efficiency.
- Frequently needed words tend to be:
    - short in length (Zipf's law of abbreviation)
    - close to related words in sentences
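A toy illustration of the frequency/length relationship (the law of abbreviation); the miniature "corpus" below is invented for demonstration, but the same trend holds in real corpora:

```python
from collections import Counter

# Toy check: more frequent words tend to be shorter.
text = ("the cat sat on the mat and the dog ran to the cat "
        "because the cat saw something interesting near the mat").split()

counts = Counter(text)
for word, freq in counts.most_common(5):
    print(f"{word!r}: frequency={freq}, length={len(word)}")
```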
- Homophones: Same sounds but different meanings
- Word Sense: A single meaning of a word
- Sapir-Whorf Hypothesis
- Languages carve up the world in different ways.
- Does this influence our conceptual system?
- You cannot minimize two objectives simultaneously without specifying how they trade off.
- Objective Function
- Minimize Description Length (Algorithmic Complexity)
- Minimize Communicative Cost
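The trade-off between the two objectives can be sketched as a single weighted objective. Everything here is an illustrative assumption: the parameter name `beta` and the candidate systems with their (description length, communicative cost) scores are invented to show that different trade-off weights favor different systems.

```python
def objective(description_length, communicative_cost, beta):
    """Weighted objective: beta sets the exchange rate between
    complexity (description length) and communicative cost."""
    return description_length + beta * communicative_cost

# Hypothetical candidate systems: (description length, communicative cost).
systems = {"coarse": (2.0, 3.0), "medium": (4.0, 1.5), "fine": (7.0, 0.5)}

for beta in (0.5, 2.0, 6.0):
    best = min(systems, key=lambda s: objective(*systems[s], beta))
    print(f"beta={beta}: best system is {best}")
```

A small beta favors simple (coarse) systems; a large beta favors accurate (fine-grained) ones, which is why the trade-off must be made explicit.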
- There are a variety of fields that study linguistic diversity.
- Linguistic Anthropology
- Linguistic Typology
- Semantic typology studies how different languages carve up the world.
- Recently, information theory has helped characterize and explain the diversity of semantic systems as communicative efficiency trade-offs.
- This is in line with the weak (soft) version of the Sapir-Whorf Hypothesis.
linguistic diversity, Sapir-Whorf Hypothesis, World Color Survey, Munsell Color Chart, basic color term, efficiency trade-off, principle of least effort, force of unification, force of diversification, communicative efficiency trade-offs, the information bottleneck, theoretical limit, observation, hypothetical variant, kinship system, objective function, description length, communication cost
- The information bottleneck characterizes the theoretical limit of the trade-off between complexity (description length) and communicative cost.
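A minimal numeric sketch of the information bottleneck objective, minimize I(X;T) − β·I(T;Y), where X is the meaning, T the word (the "bottleneck" variable), and Y the relevant information to preserve. All distributions below are invented toy values, and the lossless encoder shown is just one candidate:

```python
import math

def mi(joint):
    """Mutual information (bits) from a joint distribution given as nested lists."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(j * math.log2(j / (px[i] * py[k]))
               for i, row in enumerate(joint)
               for k, j in enumerate(row) if j > 0)

# Toy setup: meanings X with p(x), relevant variable Y with p(y|x),
# and an encoder p(t|x) mapping meanings to words T.
p_x = [0.5, 0.5]
p_y_given_x = [[0.9, 0.1], [0.2, 0.8]]
p_t_given_x = [[1.0, 0.0], [0.0, 1.0]]   # a lossless (fine-grained) encoder

# Joint distributions needed for the two information terms.
joint_xt = [[p_x[i] * p_t_given_x[i][t] for t in range(2)] for i in range(2)]
# p(t, y) = sum_x p(x) p(t|x) p(y|x)
joint_ty = [[sum(p_x[i] * p_t_given_x[i][t] * p_y_given_x[i][y] for i in range(2))
             for y in range(2)] for t in range(2)]

beta = 1.0  # trade-off parameter
ib = mi(joint_xt) - beta * mi(joint_ty)
print(f"complexity I(X;T) = {mi(joint_xt):.3f} bits")
print(f"accuracy   I(T;Y) = {mi(joint_ty):.3f} bits")
print(f"IB objective      = {ib:.3f}")
```

Coarser encoders lower the complexity term I(X;T) but also lower the accuracy term I(T;Y); β sets how that trade-off is resolved.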
context vector, context words, target words, distributional similarity, syntactic categories, context window, word meaning, LSA, dimensionality reduction step
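The count-based version of these ideas can be sketched in a few lines: each target word gets a context vector of co-occurrence counts within a symmetric window, and distributional similarity is cosine similarity between those vectors. The corpus and window size below are illustrative choices.

```python
from collections import defaultdict
import math

corpus = "the cat chased the mouse the dog chased the cat".split()
window = 2

vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}
vectors = defaultdict(lambda: [0] * len(vocab))

# Count context words within the window around each target word.
for i, target in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            vectors[target][index[corpus[j]]] += 1

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# "cat" and "dog" occur in similar contexts, so their vectors are similar.
print(cosine(vectors["cat"], vectors["dog"]))
```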
learn, word embedding, neural network, guess a word from its context, representation of the context, representation of the word, words within a context window, count vector, Word2Vec model, the input and output layers are one-hot encoded, each word is represented as a vector of size V, the number of words in the vocabulary, representation learning
- After training, the hidden-layer activation for a target word can be used as its context vector.
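A minimal numpy sketch of the CBOW-style setup described above (this is not the real word2vec implementation; the toy corpus, window of 1, hidden size 8, learning rate, and epoch count are all arbitrary choices): one-hot inputs of size V feed a hidden layer of size d, a softmax over V predicts the target, and after training each row of the input weight matrix serves as that word's embedding.

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
V, d = len(vocab), 8
idx = {w: i for i, w in enumerate(vocab)}

W_in = rng.normal(0, 0.1, (V, d))    # input-to-hidden weights (the embeddings)
W_out = rng.normal(0, 0.1, (d, V))   # hidden-to-output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# (context words within window 1, target word) training pairs.
pairs = [([corpus[i - 1], corpus[i + 1]], corpus[i])
         for i in range(1, len(corpus) - 1)]

lr = 0.1
for _ in range(300):
    for context, target in pairs:
        h = W_in[[idx[w] for w in context]].mean(axis=0)  # hidden layer
        p = softmax(h @ W_out)                            # predicted distribution
        err = p.copy(); err[idx[target]] -= 1.0           # softmax cross-entropy gradient
        grad_h = W_out @ err
        W_out -= lr * np.outer(h, err)
        for w in context:
            W_in[idx[w]] -= lr * grad_h / len(context)

# The learned rows of W_in are the word vectors.
vec = W_in[idx["cat"]]
print(vec.shape)  # (8,)
```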
model evaluation, semantic priming, lexical decision experiment
- learns word representations automatically from raw text;
- simple approach: all we need is a corpus and some notion of what counts as a word;
- language-independent, cognitively plausible.
- many ad-hoc parameters when creating the embeddings;
- ambiguous words: their meaning is the average of all senses;
- no representation of word order.