Inferring social behavior and interaction on twitter by combining metadata about users & messages

Cheong, Marc

doi:10.4225/03/58b5009d3726a

thesis

posted on 2017-02-28, 04:46 authored by Cheong, Marc

Social media, and microblogging in particular, is fast becoming important in today's world. A good example is Twitter, a rich source of readily available information by, and about, people. Real-life happenings are constantly reported on Twitter; thus, it functions as a 'mirror' to the real world. These happenings range from the banal (individual thoughts, opinions, and observations), to the dramatic (celebrity announcements, scandals, and Internet memes), to real-world events with serious consequences (riots, coordination during natural disasters, response to terrorism, and political dissent). Most extant literature treats the message and user domains on Twitter independently of one another: research usually focuses on a single domain and rarely on both. It consists mostly of specialized techniques, such as opinion and sentiment mining, community detection, social network analysis, and trend mining, which are merely applied to Twitter data. My thesis combines metadata from both domains and transforms them into useful inferences for detecting hidden patterns. Metadata from both Twitter users and messages serve as the raw material from which such patterns and inferences can be discovered. These, in turn, can be combined with data mining techniques to unearth a wealth of knowledge about Twitter users in particular, and people in general. In this thesis, I investigate two aspects. First, I introduce a new framework for the large-scale gathering and collation of Twitter user and message metadata. Second, I introduce and investigate new inference algorithms that combine metadata from both domains; these algorithms are inspired by current literature but have hitherto been absent from research.
In doing so, I contributed novel inference algorithms, together with frameworks that harvest raw metadata from Twitter and so provide ample data for evaluating those algorithms. From the wealth of metadata in the two domains on Twitter, my new algorithms produce three categories of inferences: social demographics, exhibition of online presence by users, and messaging (tweeting) behavior of users. Using these new inference algorithms, I tested my findings on a large-scale real-world dataset, collected from Twitter using the data gathering frameworks I developed. Consequently, I was able to draw conclusions about the current 'state of the Twitterverse'. Following that, I introduced a novel application of pattern detection and clustering to the inferences generated by my algorithms, in order to detect latent traits and identify non-obvious patterns across the three categories of inferences. To conclude my thesis, I showed that my approaches provide useful insights about serious real-world phenomena captured on Twitter, namely environmental activism, terrorism events, and public disorder, all of which are of interest to researchers, governments, and the media alike. Using the approaches proposed throughout my thesis, I was able to discover the behavior of people in the real world, and I illustrated how such real-life behavior is translated into expression and social communication in the online realm. The results of these studies led to a better understanding of who social media consumers are, how they communicate online, and how behavioral patterns from these users 'mirror' the real world.

History

Campus location

Australia

Principal supervisor

Sid Ray

Additional supervisor 1

David Green

Year of Award

2013

Department, School or Centre

Information Technology (Monash University Clayton)

Course

Doctor of Philosophy

Degree Type

DOCTORATE

Faculty

Faculty of Information Technology

Keywords

Clustering; Social media; Microblogging; thesis(doctorate); Twitter; monash:120048; Demographics; Open access; ethesis-20130726-122141; Pattern recognition; 1959.1/892714; Inference algorithms; 2013

Licence

In Copyright
