
I had a great meeting with my second supervisor yesterday. I think retired life agrees with her – she was in a better mood than I’d ever seen her in before.

We talked about my analysis together, talked stats, went through the calculations I’d made. It turns out my stats are right, but it’s difficult to justify why the effects I’ve found should be of theoretical interest. My supervisor asked if I’d looked at some of my other variables instead – ones I had originally been interested in, but didn’t bother exploring very much because the nature of the data didn’t fit the tests I could do and, more importantly, it didn’t occur to me that there could be a way to change the nature of the data. Of course, it turns out there is, and now, thanks to her, I know how to do it.

I love talking about statistics. The philosophy of statistics is bizarre, ironic, and contradictory. Statistics can be mind-blowing – just when you think you have the answer, it escapes you. There are strengths and weaknesses, advantages and disadvantages, pros and cons to everything. There are assumptions and conditions in which you can violate assumptions. There are multiple ways of doing the same thing, and multiple ways of deciding which way to do it. I still have the undergraduate reflex of flinching when I see a significant p value, but I have matured enough to put my excitement to one side and check other indicators of significance, and dilute my enthusiasm with caution for sample sizes, skewed distributions, and Type I errors.

For all her coldness, I have a great second supervisor. She knows her stuff, and she likes it when you share her passion for stats. We had some great ideas, and she showed me how to do things I hadn’t even thought about before. My final study was going to go in some bizarre, barely-justifiable direction I wasn’t even sure I was interested in, simply because that was the only area in which I could find results worth reporting. Now I see results worth reporting aren’t merely the significant ones – they’re the ones that spark theoretical interest. I’m not going to do what I thought I had to do – I’m going to go back to what I was originally interested in, and reanalyse that data. I didn’t find much of interest in it the first time round, but, thanks to my supervisor, and the beauty of stats, I find there are things in my data worth talking about.

Wow.

This is a great feeling.

I’m in the game again! Maybe I’ll even get my head around this thing!

My supervisor is going to be here in just over 4 hours. Perhaps I should clarify – my second supervisor, with whom I am meeting this afternoon, is a retired emeritus professor and lives in a small village in the middle of nowhere, a good three hours’ commute away from London. Fortunately, after several distasteful altercations with our head of department, she got permission to claim for travel expenses to come to London once in a while and discuss stats with me. She wouldn’t hear of me being given a replacement supervisor. “I will supervise you no matter what,” she said. God bless.

Except that now that she lives three hours away (on a good day), even though she has much more time to spend on our research, I feel guilty about calling her in to see me because of all the time and stress it involves. And now that I have called her in for our meeting today, the pressure is on to show her that it was worth it!

My second supervisor is a little different from my first, although ironically, the two have known each other for donkey’s years and are the best of friends. My second supervisor is very focused, likes to get down to business immediately, and hates it when you make a fuss about anything. Until recently, she seemed to be irritated even by simple social conventions like saying “How are you?” at the start of a meeting. I always felt silly asking her this, even though I would ask out of genuine interest rather than just paying lip service to British politeness, because she would give me a cold reply like “OK.” and not even return the enquiry. Fortunately though, perhaps because we have had some very in-depth debates about stats and psychometric theory in which she really seemed to enjoy herself, she has warmed up a bit and now actually asks me how I am in return.

Now that’s progress.

Anyway, the fact that she has warmed to me isn’t the point here. The point is that she has a very focused way of working in which she likes to examine things in detail in advance, have a think about it, and only then hold a meeting. I’ve known this for some time and have, since then, always emailed her my datafiles and notes in advance. Whilst this helps her understand my questions better, and allows her to come prepared, I’ve found I feel very stressed between emailing her my stuff and meeting her, simply because of my anxiety about all the embarrassing mistakes I imagine she’ll find in my work. I keep thinking, “I’m a psychologist. Psychologists have rigorous academic training in statistics and research methods from year 1 right up to PhD level. I’m supposed to be on the ball with everything stats related. And here I am still having to look up ANOVAs in a textbook! I’m hopeless! My supervisor is going to eat me alive! I’ll never amount to anything! My thesis is going to suck! I’m going to fail my viva! And end up homeless and penniless on the streets!”

Et cetera, et cetera, ad infinitum, ad nauseam.

These irrational thoughts are still stuck in my head even now, as I write this. It’s maddening. I know I have put in a good effort to try my hand at the analysis, so as not to make my supervisor feel like I am dumping my work at her feet and saying “Here. Just tell me the answer.” She hates that. She hates dumb, needy students coming to her and begging her to just tell them the answer, or, worse, to actually do their work for them. But still, I feel like I’m not going to be able to live up to her standards, like I have not done enough work to impress her, and like I am going to be left feeling like an idiot – not just for not being smart enough, but for wasting her time.

I have 4 hours to get my head straightened. I have to review my analysis, make sure all my datafiles are saved on my flashdrive, reread my notes, pick up the keys to the meeting room, and get everything set up early. I concede these things will not actually do much to get my head straightened, but they will, hopefully, distract me from the madness that’s brewing inside.

I came across a strangely delightful quote from Scott Fitzgerald today:

“To write it, it took three months; to conceive it three minutes; to collect the data in it all my life.”

Poor, tragic Scott. I wonder if writing novels is as mentally exhausting as writing a thesis?

There are interesting parallels between the literary process and thesis-writing. The most obvious (to me) is that both cause irreversible madness. But more than that, when you think about how long it takes to write, and the lengths you have to go to just to get to a stage where you can write, you see the process is the same.
 
Sure, I will write the (almost) final draft of my thesis in three months, but to get to the stage where I can do that, I spent six months trying to work out what a PhD is all about, three months collecting and analysing data for my first study, nine months writing up my first study and running my second study, and another six months running my third study and coming back to trying to work out what a PhD is all about. I spent the best part of 2 years swimming in a mental sea of data – words, numbers, statistics, software packages, charts, tables and diagrams. I just swam around, trying to interpret it, and trying to make my interpretations actually make sense, and maybe even an original contribution to knowledge. Then there’s the fact that I conceived of the original idea for this whole project in the space of about 20 minutes.
 
If only I’d known what I was getting myself into.
 
No matter what sort of writers we are – artistic, academic, or a bizarre blend of both – there is a lot that goes into our work besides just writing the words. There’s a lot of thinking and a lot of data collection, and a lot of interpretation and reinterpretation and a lot of madness.
 
Struggling thesis writers, novelists, madmen and women – unite! We shall conquer these great seas of chaos and emerge brighter, stronger, more learned, at the helm of this mighty ship.
 

I spent the weekend preparing slides for a lecture I’m giving to a group of undergrads in a few weeks’ time.

I, the supposed-to-be-submitting-in-May PhD candidate.

Over the years, the more immersed I’ve become in my very narrow, very specific area of research, the more complex my understanding of the world has become, and the less I am now able to see the world in simple (or simplistic?) terms. Where, as an undergraduate, people, places, events seemed reasonably clear to me in what they were, now I always seem to be saying “but only if”, “based on the assumption”, “may have a different perspective”, “if we hypothesise”, “insufficient evidence to suggest”, “need further research” and “remains an open question”.

Even about things like what the weather’s going to be like today.

I’ve forgotten how to think like a lay person. Science has taken over my thoughts. I can’t resist the logic, the rationality, the stoic procedural calmness of thinking like a scientist.

So it’s not surprising that I find it difficult – infuriating, even – to write lecture material for an undergrad cohort mostly newly out of high school and unaware of the basic things many of us academics would expect they ought to be aware of. At an undergraduate statistics tutorial last year I only just managed to hide my incredulity at a student who didn’t know how to round numbers to two decimal places when the purpose of the tutorial was to construct a simple 2D correlation matrix using output from statistical software.

“So when you’ve got 0.972, you look at the 2 and then what?” she asked. I stared for a second, unsure if she was serious or joking.

“Then because the 2 is a number 4 or under, you leave the 7 as it is, and your answer is 0.97,” I said.

I thought that would address her confusion, but a while later the same student called me over again and this time asked me what to do if the third decimal place was a number 5 or over.
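
The rule in this exchange can be written down in a few lines. What follows is a minimal sketch in Python, assuming the standard `decimal` module; the helper name `round_half_up` is made up for illustration. Python's built-in `round()` is avoided here because it works on binary floats and rounds exact ties to the nearest even digit, so it does not always follow the schoolbook rule.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: str, places: int = 2) -> Decimal:
    """Round a decimal string the schoolbook way: 4 or under stays, 5 or over goes up."""
    # Build the target precision, e.g. 0.01 for two decimal places.
    quantum = Decimal(10) ** -places
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


# Third decimal place is 2 (4 or under), so the 7 stays as it is.
print(round_half_up("0.972"))
# Third decimal place is 5 (5 or over), so the 7 becomes 8.
print(round_half_up("0.975"))
```

The values are passed in as strings so that they are read exactly as typed, the way they would appear in software output.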

Honestly, I remember learning about decimal numbers in 6th grade. At primary school. Where have these students been all their lives? What do schools teach them these days? And I’m not even that old – in fact, most of the students I teach are just about my age, in their early twenties. It’s not like I was educated in a different era.

So, in what should theoretically be a straightforward research methods lecture, I have, deliberately, included words like “paradigm”, “constructivist” and “empirical” and suggested reading original articles dating to the 1960s. In short, I’ve included material that, in comparison to the relatively ‘soft’ lectures other staff seem to give, will shock and repulse many undergrads and fill them with the horror of actually having to look up an article themselves and read it in all its 1960s snobby white upper middle class style of English. And, imagine them being forced to look up “paradigm” in the dictionary! Oh, the torture!

So what do we conclude? Am I a bad lecturer for raising the level of complexity in my material even when I know many students won’t be able to understand it completely without, shock horror, doing extra reading, researching, or investigating? Or is the system to blame for so many of the students coming to university without knowing how to round decimal numbers, write essays, or address lecturers respectfully? Or, conversely, are all undergraduates at a degree of understanding that is somehow ideal, and instead I’m the one who’s gone nuts because my PhD has made me far too scientifically knowledgeable?

In a data set of a thousand or more cases, I’ve been told to conduct a test once, and once only, and the result of that test is to be reported as the definitive result of the study. Because conducting a test several times on different randomly drawn subsamples of the data set and comparing them for replicability of the result is apparently unscientific.

I think what’s unscientific is choosing not to acknowledge that a test result is not replicable on randomly drawn subsamples when it indeed isn’t. Because you’re insisting that something you chance upon in the first instance is gospel, to however small an extent, when you have the possibility of showing that it isn’t. But you’re scared to do it, because science dictates you can’t.

That sucks.

And however much traditional science embraces an epistemology of empiricism, insofar as that empiricism depends on statistics to show that effects exist in the world, there will always be an intellectual community that embraces single-test philosophy as the only feasible way to report the existence of an effect. And thus, for so many data sets, we can never really know whether such effects still exist, by way of replication, in multiple randomly drawn subsamples of the data.

Well, that’s exactly what I’ve just done and the results are variable. Meaning what you find first isn’t always the right answer.

Rather, ‘truth’, insofar as science can ascertain it, should be built on the repeated discovery of the same effect, not just a single instance of it. If an effect only exists in a data set in one instance, I’m just not inclined to believe it’s really there.
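
For anyone wondering what this kind of check might look like in practice, here is a minimal sketch in Python using only the standard library. It rests on several assumptions: the data are simulated (1,000 cases with a weak built-in relationship), a Pearson correlation stands in for whatever test is actually being run, and the function names, the 20 subsamples and the subsample size of 200 are all arbitrary choices made for illustration.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def subsample_correlations(x, y, n_subsamples=20, subsample_size=200, seed=0):
    """Recompute the correlation on repeated random subsamples of the cases."""
    rng = random.Random(seed)  # fixed seed so the draws can be reproduced
    results = []
    for _ in range(n_subsamples):
        # Draw case indices without replacement within each subsample.
        chosen = rng.sample(range(len(x)), subsample_size)
        results.append(pearson_r([x[i] for i in chosen], [y[i] for i in chosen]))
    return results


# Simulated data (an assumption): a weak true relationship plus a lot of noise.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(1000)]
y = [0.1 * a + data_rng.gauss(0, 1) for a in x]

rs = subsample_correlations(x, y)
print(f"full sample r = {pearson_r(x, y):.2f}")
print(f"subsample r ranges from {min(rs):.2f} to {max(rs):.2f}")
```

Comparing the single full-sample value with the spread across subsamples gives a rough picture of how stable the effect is. It is a sketch of the idea only, not a substitute for a formal resampling procedure.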

Or maybe I’ve just been staring at my data for too long.


The Final Countdown

Submission of PhD Thesis: May 1st, 2013
The big day is here. Joy to the world!