Text Analysis - A Reflection

Dealing with Disciplinary Distress

“text”, (n)

“The wording of anything written or printed; the structure formed by the words in their order; the very words, phrases, and sentences as written.” (OED)

What is Text Analysis?

Text analysis is a computational method for extracting information from large amounts of textual data in order to track patterns and trends.

Types of Text Analysis - An Unfinished List

-Classification analysis

-Relationship analysis

-Anomaly detection

-Clustering analysis

-Sentiment analysis

-Topic modeling

-Term frequency
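
To make the last item on the list concrete, here is a minimal Python sketch of term frequency, the simplest of these techniques. The function name `term_frequency` and the sample sentence are my own illustrative choices, not part of any particular tool, and a real project would probably also filter out common words like "the".

```python
import re
from collections import Counter

def term_frequency(text, top_n=5):
    """Count how often each word appears, ignoring case and punctuation."""
    # Lowercase the text and pull out runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter tallies each word; most_common returns the top_n (word, count) pairs.
    return Counter(words).most_common(top_n)

# Hypothetical sample sentence, invented for illustration only.
sample = "The whale, the sea, and the ship: the whale returns to the sea."
print(term_frequency(sample, top_n=3))
```

Even in this toy case the most frequent word is "the", which hints at why raw counts need a human reader to decide what is actually meaningful.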

Text Analysis and Literary Studies

The introduction of new technologies into a field is often met with a certain amount of resistance. Text analysis raises anxieties about the potential obsolescence of the human in the future of an academic field. Recent technological developments allow researchers to move beyond human ability and examine collections of text larger than would otherwise be possible. As with any paradigm shift, there is often a fear within the field that these changes will cause a loss of disciplinary purity.

And yet, if we take a step back and look more broadly at the history of a discipline, it becomes apparent that its scope is constantly being adjusted. Literary studies, for instance, has shifted a number of times throughout its existence as a reputable field of study, moving from a ranking system meant to determine what should be included within the canon to its contemporary form, which focuses on the study, evaluation, and interpretation of a much broader range of texts. Disciplines are in a state of constant evolution, and the introduction of new technologies is just another step in the development of the field.

Text analysis creates new possibilities for academic fields, as well as a number of other areas, but it cannot be separated from human interpretation. Without the researcher, the data that has been mined from the text cannot be understood. Beyond this, literary studies has traditionally been built on “close reading”: a detailed analysis of a single piece that looks for specific features such as word use, repetition, and literary devices. The ability of text analysis to allow for “distant reading”, which looks for patterns in large amounts of data, is often perceived as a threat. Using only “distant reading”, researchers can get a sense of broad patterns, but those patterns often lack context. This is why distant reading cannot be separated completely from “close reading”: a sample of the texts still needs to be read closely as a representation of the broader set of data. And so, text analysis can be seen as a tool to be used in conjunction with other disciplinary research methods, not one that makes them obsolete.
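
One way to picture how the two kinds of reading meet is a keyword-in-context listing, which pulls every occurrence of a word out of a text along with a few words on either side so that each hit can be read closely. The Python sketch below is only an illustration under my own assumptions: the function name `keyword_in_context`, the `window` parameter, and the sample sentence are all hypothetical, and this is not how any specific concordancing tool is implemented.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of keyword with `window` words of context per side."""
    # Split the text into word tokens, keeping the original capitalisation.
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, token in enumerate(tokens):
        # Match the keyword regardless of case.
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            # Mark the keyword with brackets so it stands out in the listing.
            lines.append(f"{left} [{token}] {right}".strip())
    return lines

# Hypothetical sample text, invented for illustration only.
sample = ("The ship left the harbour at dawn. "
          "By noon the ship had vanished, and the harbour was quiet.")
for line in keyword_in_context(sample, "ship", window=2):
    print(line)
```

The computer finds the occurrences; deciding what they mean is still the reader's job.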

Tools:

-AntConc: A freeware corpus analysis toolkit for concordancing and text analysis. http://www.laurenceanthony.net/software/antconc/

-Word Cloud: Create free word clouds that represent the frequency of words used. https://www.wordclouds.com/

-corpus.byu.edu: The most widely used online corpora. https://corpus.byu.edu/corpora.asp

Links:

Cottom, Tressie McMillan ‘Nascent Thoughts on Text Analysis Across Disciplines’ http://tressiemc.com/2015/08/06/nascent-thoughts-on-text-analysis-across-disciplines/

Clement, Tanya et al., 2008 “How Not to Read a Million Books”. http://www.people.virginia.edu/~jmu2m/hownot2read.html.

Clement, Tanya. 2013. “Text Analysis, Data Mining, and Visualizations in Literary Scholarship” https://dlsanthology.mla.hcommons.org/text-analysis-data-mining-and-visualizations-in-literary-scholarship/

Graham, S., I. Milligan, and S. Weingart. The Historian’s Macroscope: Big Digital History http://www.themacroscope.org/?page_id=113

Healy, Kieran. ‘Using Metadata to Find Paul Revere’ https://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/

Michel, Jean-Baptiste, et al. 2011 “Quantitative Analysis of Culture Using Millions of Digitized Books,” Science Vol. 331, 176 (14 January 2011). http://www.librarian.net/wp-content/uploads/science-googlelabs.pdf

Moravec, Michelle. ‘Corpus Linguistics for Historians’ http://historyinthecity.blogspot.ca/2013/12/corpus-linguistics-for-historians.html

Rockwell, Geoffrey “What is Text Analysis, Really?,” Literary and Linguistic Computing 18.2 (2003): 209-220. https://doi.org/10.1093/llc/18.2.209

Journal of Cultural Analytics http://culturalanalytics.org/

Schmidt, Ben. 2015 ‘Vector Space Models for the Digital Humanities’ http://bookworm.benschmidt.org/posts/2015-10-25-Word-Embeddings.html

Speer, Rob. 2017 ‘How to make a racist AI without really trying’ ConceptNet Blog https://blog.conceptnet.io/2017/07/13/how-to-make-a-racist-ai-without-really-trying/

The Syuzhet Episode: http://www.matthewjockers.net/2015/02/02/syuzhet/

My annotations can be found at https://hypothes.is/users/kirstenbussiere.

Written on October 29, 2017