(Research terms are in Bold)
(duh, title is borrowed… from one of my favorite books: Murakami's What I Talk About When I Talk About Running)
I use a program called Transcriptions for Mac. It's not the greatest, but it works. I don't have a foot pedal (you use them to stop, play, or rewind the audio as you type… it makes things easier)… Oh yeah, transcription… what is it? A transcript is the written or printed version of something (usually audio)… in qualitative research, transcription is the process of transforming audio into text for later coding and analysis.
When you research in the humanities or social sciences, you have to make choices about your data collection and analysis methodology (how you are doing your research): you'll choose either 1) a quantitative approach (numbers… how many did what, when, and where?), 2) a qualitative approach (why, how, context), or 3) a mixture of both called the mixed methods approach… sounds fun… actually it is! But I AM weird like that…
So this post is dedicated to the lovely process of transcribing your interviews. Interviews are used as a mechanism or method in qualitative data collection. They can be used quantitatively, but interviews aren't a useful way to do quantitative research and may undermine your entire study unless you can justify the choice… a survey will serve you better if you choose to go with numbers. Data collection is gathering the information that will help answer your "hypothesis" or Research Problem, the very thing you set out to explore. It involves "getting" your data, organizing it, and preserving it for the next step in the research process: Analysis.
Interviews are basically the same as what journalists or talk show hosts do… they sit someone (or a bunch of people… which is then a focus group) down and talk to them. The interviewer (the researcher) can choose to ask the interviewee (the participant) a number of predefined questions (you have thought about this beforehand and planned what those questions are). This is known as structured interviewing. Alternatively, you can wing it with an unstructured interview (you would need to be experienced to do this, as you may lose track of what you are trying to figure out or miss "leads" that your interviewee gives you). And yet another option for you ('cause social science rocks like that and isn't so black-and-white, yes-or-no, math-like research) is semi-structured interviewing! Basically, that's where you prepare a bunch of points you would like to go over with the person you are interviewing, the interviewee, just as a reminder during your session… think of it as the bullet points on a slide presentation or your presentation cue cards… easy? Cool.
Ok, so then you have your interview recorded and transferred as WAV or MP3 files onto your computer, converted to whatever format your transcription software likes (in my case I converted WAV to MP3), and you're ready to start transcribing… wait, what!? The software doesn't do that for you?! You're kidding! Nope, unfortunately, amigo, this is reality… if you would like that kind of technology, you have got to wait… "5 years" for it to be commercialized–says every AI researcher (when they actually mean anything between "20 years" and "I don't know").
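By the way, before you convert anything, it can help to sanity-check what you actually recorded (sample rate, channels, length). Here's a minimal sketch using Python's standard-library wave module; the filename interview.wav is just a placeholder, not a real file from my project:

```python
import wave

def describe_wav(path):
    """Return the basic properties of a WAV recording as a dict."""
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),
            "sample_rate_hz": rate,
            "duration_s": round(frames / rate, 2),  # length in seconds
        }

# e.g. describe_wav("interview.wav")  <- hypothetical recording
```

Knowing the duration up front also helps you budget transcription time, since an hour of audio takes several hours to type out.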
The truth is, a computer understands what you allow it to understand… and when you introduce symbols–images, text, or other visible content that is almost static–then you are working with a limited, albeit LARGE, amount of information. THIS IS NOT THE CASE FOR AUDIO. So unless you "tell" it that the sound "AAAA-OOOOO" means ouch, a sound you make to express pain, it won't process it as that… it will process it as frequency… fine… what if it understands that "aaaa–oooo" is a human sound that expresses pain? But then you introduce sounds from your trip to Algonquin Park (north of Ottawa), and the recording happens to catch wolves howling? How do you teach the computer to distinguish between the sounds? Let alone teach it that one is an animal and the other is human? You can introduce other variables like intonation, depth, length, and so on… but what about ambiguities such as sarcasm? It gets harder, doesn't it? This leaves you with an infinite number of possibilities, which forces you to find a way for the computer to become "self-taught" rather than feeding it the information… Seemingly, with things like Siri and Amazon Echo, we are getting close… but technology still has a lot of Semantic intelligence to catch up on before it can deliver flawless performance.
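To make "it will process it as frequency" concrete, here's a small sketch (assuming NumPy is available) that builds a one-second tone and recovers its dominant frequency with an FFT. To the machine, an "ouch" and a wolf howl are both just curves like this one, peaking at some frequency:

```python
import numpy as np

def dominant_frequency(samples, sample_rate):
    """Return the strongest frequency (Hz) in a mono signal via FFT."""
    spectrum = np.abs(np.fft.rfft(samples))          # magnitude per frequency bin
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]                # bin with the most energy

# A 440 Hz sine wave: to the computer, just numbers oscillating 440 times a second.
rate = 8000
t = np.arange(rate) / rate                           # one second of sample times
tone = np.sin(2 * np.pi * 440 * t)
print(dominant_frequency(tone, rate))                # → 440.0
```

Everything past this point–deciding that a 440 Hz-ish wail is a wolf and not a person–is exactly the hard part that speech recognition systems have to learn rather than be told.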
With that tangent complete, the unfortunate reality is that when you conduct qualitative research and use interviews as a data collection method, you are left with no choice but to rely on human effort. The obvious way is to do it yourself; the not-so-obvious way is to hire a company that does it for you… you'll need to double-check their privacy terms and make sure you include them in your ethical clearance application (if your university has one… which it most likely does!)… of course, when you are a poor student with literally nothing to your name except heaps and heaps of loans and debts and credit and overdraft… you get the picture… that's not an option… so you do it yourself… you sit for three hours going over a 10-minute segment because you don't understand the interviewee's British accent a couple of weeks after you recorded the interview, and after so many others in between… but you just dive into it and keep going, even past the two-month deadline you gave yourself… It gets easier. Monotonous, but easier. "It doesn't get shorter unless you start" is what I keep telling myself during this tedious endeavour.
So there you have it! What it’s like to go through this boring process.