Infotropism Kirrily Robert’s blog

Posted
23 May 2008 @ 10am

Categories
Work

Freebase, notability, and minority data

You know I went to work for Freebase, right? Well, this is a personal post about a work-related topic. Consider that a disclaimer; these are my views, not official company ones, though I trust and believe that there is significant overlap between the two.

Freebase is a free database of the world’s information. That’s something we say a lot around here but I just wanted to spend a few minutes unpacking it.

Free. Freebase data is licensed under the Creative Commons CC-BY license, which means you can download it, re-use it, remix it, regurgitate it, and do whatever you want with it as long as you acknowledge you got it from us. Easy enough.

Database. Freebase is structured. Unlike most websites, which you’d need to crawl and parse to extract information, we provide an API which lets you ask for things like cloud classifications or CFOs of San Francisco companies with > $1m in revenue in a structured way.
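To make "structured" a bit more concrete, here is a minimal Python sketch. It does not call the real Freebase API; the records, the field names, and the `cfos_of_companies` helper are all invented for illustration. The point is only that when the data already has structure, a question like the CFO one becomes a simple filter rather than a crawl-and-parse job.

```python
# Hypothetical records standing in for structured data.
# These are made-up examples, not real Freebase output.
COMPANIES = [
    {"name": "Example Widgets", "city": "San Francisco",
     "revenue_usd": 5_000_000, "cfo": "A. Person"},
    {"name": "Tiny Startup", "city": "San Francisco",
     "revenue_usd": 200_000, "cfo": "B. Person"},
    {"name": "Elsewhere Inc", "city": "Oakland",
     "revenue_usd": 9_000_000, "cfo": "C. Person"},
]


def cfos_of_companies(companies, city, min_revenue_usd):
    """Return (company, CFO) pairs for companies in `city`
    whose revenue is strictly above `min_revenue_usd`."""
    return [
        (c["name"], c["cfo"])
        for c in companies
        if c["city"] == city and c["revenue_usd"] > min_revenue_usd
    ]


# CFOs of San Francisco companies with more than $1m in revenue.
print(cfos_of_companies(COMPANIES, "San Francisco", 1_000_000))
```

With unstructured web pages, each of those fields would first have to be scraped and guessed at; here the query is a single expression over named properties.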

The world’s information. This is where it gets interesting. Unlike Wikipedia, Freebase doesn’t have strong standards of notability. On Freebase, if it’s interesting to you, it’s probably interesting to us. Our contribution guidelines actually ask for any data that might be of interest to other people.

On Wikipedia — as in much of society, so please don’t think I’m harping on Wikipedia particularly here — there have been accusations of systemic gender bias. Similarly, topics of interest to any kind of minority will tend to be treated as non-notable by the majority, and excluded from the system. Freebase gives us the opportunity to redress this imbalance.

This week I launched a Freebase data mob on the subject of Ethnicity. Data mobs are short-term efforts to gather information on a given topic. The ethnicity data mob gives us an opportunity to ensure that people from all ethnic backgrounds are included in Freebase, and lets us make queries about ethnicity as it relates to other facets of life. The data on ethnicity is only just beginning to grow, so here’s an example using gender instead: Female CEOs of public companies in order of market cap. Soon, we’ll be able to ask for things like “ethnic breakdown of university faculties in the US” or “award-winning Native American authors”.

Ethnicity is only one part of the minority data world. We’re making enormous leaps in how we deal with geodata, and we are starting to be able to ask things like “show me schools and shopping malls within five miles of my home”. But with more minority data contributed by people who care about the subject, we could ask for childcare facilities near a conference centre, or the nearest Planned Parenthood clinic, or for data on toxic waste hazards in a given area.
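For the curious, a "within five miles of my home" question can be sketched in a few lines of Python once places carry coordinates. This is an illustration under my own assumptions, not how Freebase does it: the `miles_between` and `places_within` helpers, the place names, and the rough coordinates are all made up, and the distance uses the standard haversine great-circle formula with an approximate Earth radius.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean radius of the Earth


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def places_within(places, home, radius_miles):
    """Return the names of places no farther than `radius_miles` from `home`."""
    home_lat, home_lon = home
    return [
        p["name"]
        for p in places
        if miles_between(home_lat, home_lon, p["lat"], p["lon"]) <= radius_miles
    ]


# Hypothetical places with rough, illustrative coordinates.
PLACES = [
    {"name": "Nearby School", "lat": 37.7800, "lon": -122.4200},
    {"name": "Faraway Mall", "lat": 37.3382, "lon": -121.8863},
]
HOME = (37.7749, -122.4194)

print(places_within(PLACES, HOME, 5))
```

The same filter works for any kind of place, which is the point: once someone contributes the childcare centres or the clinics, the query itself does not need to change.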

Ethan Zuckerman spoke at ETech this year on The Cute Cat Theory of Digital Activism.

I’d offer the hypothesis that any sufficiently advanced read/write technology will get used for two purposes: pornography and activism. Porn is a weak test for the success of participatory media - it’s like tapping a mike and asking, “Is it on?” If you’re not getting porn in your system, it doesn’t work. Activism is a stronger test - if activists are using your tools, it’s a pretty good indication that your tools are useful and usable.

Freebase is full of porn, just like Wikipedia is (warning: link goes to Christian Newswire). Here, have some anal sex, nipple clamps, and foot fetishism. Now we’ve got the basics covered, I feel like Freebase is on the cusp of something bigger, and I’m hoping that we’re going to start seeing activism using Freebase data. The way we can draw connections between points is phenomenal, and is exactly the sort of free information that can really empower people to do amazing, maybe world-changing things.

But for now, please help us gather data on ethnicity. Consider it your good deed for the day. And subscribe to the Freebase blog if you want to keep up with what I’m doing over there.
