Wednesday, 4 April 2012


An interesting post on Language Log describes a new 'species of speech' emerging in China. Called 'stream of consciousness blather,' it is associated with the vacant, vain spoutings of politicians, and poets are satirizing it.

I am considering how to integrate poetic analysis into an open data (re)modeling paper. I heard once that lawyers should study poetry to develop their sense of nuance, complexity, and layering. Why aren't poets part of political science analysis when more and more of the data comes in the form of communication streams, like tweets and blog musings, from across cultures? Poets may be better at deciphering the ethos of online chatter than a traditionally trained analyst.

In Persian, 'seem' means wire. In Japanese, a close homophone means narrow.
The number of associations and connotations attached to these words in English makes them difficult to translate correctly. You need to know the context, the culture.
Filmmakers and novelists from Persian culture have a remarkable facility for weaving metaphors and shading context through allusion. This narrative style also runs through more everyday transactions. However, analysis of data filtered through ICT frequently ignores the invisible variable of cultural translation, or context. Research begins with the false assumption that data transmitted via ICT is universal, because a perceived quality of the technology has been conflated with that of the data transmitted across it.

When considering communication data from other cultures (in my paper I will look at Farsi and Russian), why does American analysis take words at face value? Semantic analysis tools are set up to treat inputs as equivalent: words and phrases are given equal weight in each language, even though there are many examples of 'hot button' words in one culture which don't translate. They don't translate because you need context. The contextual markers for understanding the intention behind a phrase such as a metaphor are significantly more complex than the words themselves. For those who think there's no place for poetry in data analysis, here's the concept as an equation.

If f(a) = c and f(b) = c, then a = b. Except it isn't, when c is not a constant...
This is the false premise on which current research on ICT is moving forward, and which continues to limit the communication capacities of language communities.

Let f be the function of any familiar technology which transmits communication, such as Facebook or blogging. Let a and b be any languages which are transmitted, and let c be the communication data we see at the end. c is not a constant; it is a function of culture. Two problems seem to arise from misinterpreting the nature of f. But perhaps they suggest an alternative formula, in which any language may be transmitted via a new type of ICT which neither homogenizes and sanitizes the cultural markers nor limits users to culturally prescribed formats and architectures for transmission. Ideally, an ICT would provide as fluid a space for communication as the current state of our technology allows.
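
For readers who prefer code to equations, here is one rough way the idea might be sketched in Python. Everything in it is an assumption made for illustration: the `Utterance` type, the `transmit` and `interpret` functions, the culture labels, and the connotation table are invented placeholders, not real linguistic data or any actual analysis tool. The point is only that two different inputs can look identical once the platform has dropped their context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    text: str     # the words as they appear on screen
    culture: str  # the context they were written in (hypothetical label)


def transmit(u: Utterance) -> str:
    """f: the platform passes along the text and drops the cultural context."""
    return u.text


# Hypothetical connotation table. The entries are invented for illustration.
CONNOTATION = {
    ("culture_a", "wire"): "neutral",
    ("culture_b", "wire"): "loaded",
}


def interpret(u: Utterance) -> str:
    """Read the words together with the context they came from."""
    return CONNOTATION.get((u.culture, u.text), "unknown")


a = Utterance(text="wire", culture="culture_a")
b = Utterance(text="wire", culture="culture_b")

same_on_screen = transmit(a) == transmit(b)     # f(a) and f(b) look alike
same_in_meaning = interpret(a) == interpret(b)  # but a and b are not alike
```

An analyst who only sees the output of `transmit` has no way to tell `a` from `b`; the difference only reappears when the culture is put back in, which is what `interpret` stands in for here.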


I will expand on these two problems in successive posts.  The first focuses on input architecture, and the second deals with models based on data collected through ICTs.  And the milk metaphor ends here.