06 August 2017


by Cecily
Everybody* is always** talking about how much less women and various minority groups make than white men, and often they do it by saying something like "Women in this group earned 90 cents for every dollar a man in the same age group earned." And then I am indignant but not surprised, and brood over the inequality that is still pervasive in our world, and consider history, and look up things on Wikipedia. But I find the how-much-of-a-dollar thing boring and irritating. It is overly simplistic. Yes, the nation and the world continue to be in a state of inexcusable inequity and iniquity. That is not new information. (Also, getting women up to the full dollar would not rid the world of iniquity and inequity. I dislike the dollar thing slightly less when it includes minorities.)

But I'm just bored by the white-man's-dollar thing, not the topic in general. I like thinking about the statistics that are behind that, and what it might look like, and wondering how the data*** was organized, and where it came from, and what would happen if you binned it in different ways and did more statistics to it.

This particular white-man's-dollar thing made the Facebook rounds recently, so I started thinking about where those numbers came from, and what kind of average does it represent, and how much variation is being obscured by the average, and does that archetype white guy who earned the dollar include the super rich white guys? What does the top of the income scale look like, compared to the bottom?

(citation needed)

And also I started thinking about what I would do with the data if I had it.

Here's the study I want to see: a huge, huge sample size, collected nationwide (we'll do the rest of the world later) coded for a ton of things. Income is still the dependent variable, but I want way more independent variables than just gender and race. The things I have thought of so far are
  • number of years experience (0, 1, 2, 5, 10+)
  • degrees held (none, BA, MA, etc.)
  • specific job type (e.g. nurse, lawyer, tour guide)
  • general field (medicine, technology, sportsball)
  • geographic location (city, state)
  • type of area (urban/rural)
  • race
  • disability
  • parental income
  • (these independent variables are getting a little out of hand. We might need to save some of them for another project using the same data set.)
  • with/without children
  • single/married
  • height? weight?
  • plus probably some more
I would send out my army of minions (or maybe just use Facebook for most of it?) to find out all these things about lots and lots of people. And after the minions had visited thousands and thousands of people, I would take all of that data and put it into a huge, beautiful spreadsheet, and I'd do statistics to it. I'd create a bunch of study sets, ranging from extremely narrow (black nurses with RNs and 5 years experience, in California) to very broad (everyone with 2 years experience at any job). I'd add a column for [modal income minus actual income], and one for [male income minus female income]. Then I'd make another spreadsheet, with all the numbers from the first one, with subgroups as tokens, instead of people. And I'd calculate some things about the subgroups, like their modes and means and medians.
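Here is roughly what the two spreadsheets might look like as code: a minimal sketch in Python (not my rusty R), using only the standard library. Every row is invented for illustration, and the field names (`job`, `gender`, `years`, `income`) and the `summarize` helper are placeholders I made up, not part of any real data set.

```python
from collections import defaultdict
from statistics import mean, median, multimode

# Invented toy rows, one per person. None of these numbers are real.
people = [
    {"job": "nurse", "gender": "F", "years": 5, "income": 60000},
    {"job": "nurse", "gender": "F", "years": 5, "income": 60000},
    {"job": "nurse", "gender": "F", "years": 5, "income": 66000},
    {"job": "nurse", "gender": "M", "years": 5, "income": 64000},
    {"job": "nurse", "gender": "M", "years": 5, "income": 64000},
    {"job": "nurse", "gender": "M", "years": 5, "income": 70000},
]

def summarize(rows, keys):
    """Second spreadsheet: one row per subgroup instead of per person."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["income"])
    return {
        group: {
            "n": len(incomes),
            "mean": mean(incomes),
            "median": median(incomes),
            # multimode returns every most-common value;
            # assumption: on ties, take the smallest one
            "mode": min(multimode(incomes)),
        }
        for group, incomes in groups.items()
    }

subgroups = summarize(people, keys=("job", "gender", "years"))

# First spreadsheet's extra column: the modal income of the person's
# subgroup minus that person's actual income.
for person in people:
    key = (person["job"], person["gender"], person["years"])
    person["mode_minus_actual"] = subgroups[key]["mode"] - person["income"]

# Male minus female, compared subgroup to subgroup (medians here).
median_gap = (subgroups[("nurse", "M", 5)]["median"]
              - subgroups[("nurse", "F", 5)]["median"])

for group, stats in subgroups.items():
    print(group, stats)
print("male minus female median:", median_gap)
```

Changing `keys` is how you would get from the extremely narrow study sets to the very broad ones: fewer keys, bigger groups.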

(Isn't this a realistic project? Among other things, we're going to need a huge travel budget for all the places we have to go and then find sufficiently large numbers of participants for each most-specific subgroup: at least 30 male and 30 female nurses who have RNs and 5 years experience, in Chicago, from each minority category, and also 30 of each with 1 year of experience, and also with whatever other nursing degrees there are. And all the variations of all the other jobs (mechanics, high school teachers, circus performers, etc.)**** Just the search for minions may take a while.)

And then I'd get out my trusty, rusty, dusty old R program on my computer and do some statistics to my statistics and create some visual representations! I have been thinking about what kind of graph would fit best for various questions. Lines? Scatter plot? Chopped up dollars? The ones I've been thinking about would have groups of people on the x axis and plot the income disparities on y. (I also want to know what would happen if we put modal incomes on x and looked at race and gender, and if we put actual income on x and looked at disparity. But I haven't been thinking about how to make graphs of those, yet.)

For example: We want to know which makes more of an impact on a nurse's salary, experience or education, and if it is the same for men and women. We would look at the education and experience of nurses of all races and abilities, nationwide, and sort them into groups twice, first by how many degrees (with each degree group split by gender), and again by years of experience (same). There will be 50 groups each time, for a total of 100 groups. Now we have a study set! Each line on our spreadsheet will represent one of those subgroups of nurses like this: [male, 0 years, RN]; [male, 1 year, RN]; [male, 2 years, RN]. And we'll get, from our other spreadsheet, the modes and medians and means (let's be very thorough) for each subgroup, and put them in the new spreadsheet. Now it's time to make a graph!

This super useful and precise example graph has all our subgroups on the x axis, and the mean difference between modal and actual incomes is plotted on y. Like this!

This graph bears no relation to reality; I made up the point placement out of thin air and was too lazy to make up what units to use: dollars? percentages? standard deviations?

Then I'd look at my graph and see if it seems interesting in any way, like something that might be a line or a shape or any pattern of any kind. If it does, I'd figure out what and write a paper about it. If it doesn't, I'd start over with different variable groups.
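The numbers behind a graph like that could be computed along these lines. This is only a sketch under my own assumptions: the nurses are invented, the `mean_mode_gap` helper is a name I made up, ties between modes are broken by taking the lowest one, and the results are printed rather than plotted.

```python
from collections import defaultdict
from statistics import mean, multimode

# Invented nurses: (gender, degree, years of experience, income). Not real data.
nurses = [
    ("male", "RN", 0, 52000), ("male", "RN", 0, 52000), ("male", "RN", 0, 55000),
    ("female", "RN", 0, 50000), ("female", "RN", 0, 50000), ("female", "RN", 0, 56000),
    ("male", "RN", 5, 64000), ("male", "RN", 5, 64000), ("male", "RN", 5, 70000),
    ("female", "RN", 5, 60000), ("female", "RN", 5, 60000), ("female", "RN", 5, 66000),
]

def mean_mode_gap(rows, key):
    """For each subgroup, the mean of (modal income - actual income)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[3])
    gaps = {}
    for group, incomes in groups.items():
        mode = min(multimode(incomes))  # assumption: lowest mode wins a tie
        gaps[group] = mean([mode - income for income in incomes])
    return gaps

# Sort the same nurses into groups twice, each time split by gender.
by_degree = mean_mode_gap(nurses, key=lambda r: (r[0], r[1]))
by_experience = mean_mode_gap(nurses, key=lambda r: (r[0], r[2]))

# x axis: the subgroup label; y axis: the gap.
for group, gap in sorted(by_experience.items()):
    print(group, gap)
```

One thing the sketch makes obvious: the mean of (mode minus actual) is just the subgroup's mode minus its mean, so the y axis is really measuring how lopsided each subgroup's incomes are.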

I would like to know much more about wage inequity. Are there exceptions to the general rule? Is there a difference between how much a master's boosts black teachers' salaries vs. white teachers'? Do years of experience and/or degrees ever make up for being disabled? Which fields of work are the worst? Which state is the best?

And so on. Basically my interest is in looking at the details of the big picture, rather than just the final, single average. (Also I wish everyone would use more precise language when they talk about the "averages" of things.) "White men" is too heterogeneous to be the baseline. Are there any subgroups of white men where they are paid less than women? If there are, that would be very interesting. Looking at the big picture is great, but you have to check the details first to make sure the big picture isn't hiding anything.

I started thinking about this specifically because, while looking at the dollar picture above, I also thought about the extremely wealthy white men who have obscene salaries because they are CEOs, and how there are more men named John among them than there are women CEOs. That's a different kind of disparity, which is also very disheartening, but not within the scope of this project (don't get distracted!). Anyway, because all these CEOs and college presidents and whatever are being paid absurd amounts of money, the disparity might be much larger if you just look at the amount of money, but smaller as a percentage (the difference between $50 and $100 seems way more dramatic than the difference between $500,000 and $500,050). I think the absurdly high salaries at the far end of the scale might screw up the data. So I was thinking about extreme outliers, and what stuff was accounted for and what wasn't, and where those parts of dollars came from. And for that you need more information about the data than the dollar picture gives you. And here we are!
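To put numbers on both worries, here is a small Python sketch. The `gaps` helper is a name I made up, and the ten incomes at the end are invented purely to show what one absurd salary does to a mean; none of it is real data.

```python
from statistics import mean, median

def gaps(low, high):
    """Return the gap in dollars and as a percentage of the higher figure."""
    dollars = high - low
    percent = 100 * dollars / high
    return dollars, percent

# The same $50 gap, at two very different income levels.
small = gaps(50, 100)            # half of the higher figure
large = gaps(500_000, 500_050)   # about a hundredth of a percent
print(small, large)

# Invented incomes: nine ordinary salaries plus one absurd CEO.
incomes = [40_000] * 9 + [10_000_000]
print("mean:", mean(incomes), "median:", median(incomes))
```

With those made-up incomes the mean lands above a million dollars while the median stays at 40,000, which is exactly why it matters what kind of "average" the dollar picture is using.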

During one of my many Adventures in Wikipedia, I looked at a lot of studies about this. (This disparity has been known, but not fixed, for a pretty long time, so there are quite a few studies.) Most of the ones I found were pretty careful and targeted specific populations. Many of them match jobs and years of experience. But none of them answer my details-of-the-big-picture questions from up above. And the Telephone game from actual research to media report to striking pictures on Facebook is a lossy, lossy transmission, resulting in the chopped-up dollar picture I am complaining about.

Maybe one of the silver linings of our impending transition to a dictatorship will be that the government will require everyone to report every detail of our lives anyway, and I can sweet-talk the dictator into letting me see the records to make some spreadsheets.

In conclusion, the things I am interested in knowing about income disparity in America are not adequately addressed in the chopped-up-dollar picture.

And now you know a new fact about me: one of the ways I entertain myself while bedridden is to invent unrealistic studies and think about how I would arrange the data in a spreadsheet, and what to do with the data, and what kind of visual representation of the results would work best.

*Some people, with whom I occasionally interact in some way

**Once in a while

***I refuse to treat data as a plural. We're speaking English, damn it! It's a singular mass noun and Latin can keep its stupid inflections to itself.

****Some of these subgroups may be empty sets. Disabled black lawyers with LLMs who live in Montana, for example.

It is really, really stupid that the white men at the very top make as much money as they do.
