Because I Don’t Want to Be a Prospector: Towards a Better Term than “Data Mining”
I have a confession.
I hate the term “data mining.” Don’t get me wrong, I appreciate the concept of
data mining and its importance in certain situations, but the term itself is
problematic to me, especially when applied to humanities research.
I received my autumn
2011 copy of Victorian Studies in the
mail yesterday and read the articles on interpretation in a digital age with
glee. The three articles all used data
mining (or text mining) to look at larger patterns in term (word) usage in nineteenth-century
texts. Their work uses tools such as Google Ngram Viewer to provide signals that
ideally lead to an exploration of larger concepts. I think this is fascinating
work that definitely has value and should be continued, especially since, as all
the articles suggest, nineteenth-century texts are a large corpus of work that
is ideal for this kind of investigation.
Though I re-emphasize that this is all very valuable and
necessary work, the term “data mining” seems to connote a type of forcefulness
in analysis, akin to trying to force a puzzle piece into a space where it does
not belong. Heuser and Le-Khac warn against mistaking signal for data when doing
this kind of work (81). However, the term “data mining” suggests that digital
humanities scholars should prospect for data, set up stakes around the
perimeter, make sure not to overlap into another prospector’s claim, dig, and
hope for the best.
Maybe it is because I
grew up in a mining town, and so the term “mining” has less-than-ethical
associations for me, but I feel that we should have a better term to describe the
important work being done. I suggest we
use something like data exegesis (too religious?) or maybe text (term?)
curating.
My main concern is that
the use (or overuse) of the term “data mining” will open the field of digital
humanities to critique by those who do not understand the work that digital
humanists do. Data mining has such an unethical connotation to start with, especially
in relation to the type of information stripping for profit and advertisement that
is done in social media, that scholars need to be savvier with the terminology
we choose to use in the field.
What do you propose we use instead? Do you think data mining
is an appropriate term for the type of work being done?
Work Cited
Heuser, Ryan and Long Le-Khac. “Learning to Read Data:
Bringing out the Humanistic in the Digital Humanities.” Victorian Studies 54.1 (Autumn 2011): 79-86. Print.