Commons Metadata Statistics

Posted by striatic in Statistics

Patrick Peccatte of the incredible PhotosNormandie has just published an article that provides metadata statistics for all Commons institutions. The article also includes detailed information regarding how each institution uses machine tags and photo descriptions, so if you want all the details, be sure to check out the Google translation of the original article.

Here are the statistics relating to comments, tags, and notes. The institutions are displayed in the order in which they joined The Commons. Links are also provided to the photo at the top of each category within an institution. These are useful for discovering photos that have received a lot of attention. [data collected between February 7 and 8, 2009]

Library of Congress, Washington, DC, United States

Launched on 16 January 2008, currently has 5,421 photos in 5 sets.
11,675 comments, for an average of 2.15 per photo. Max = 133
75,143 tags, for an average of 13.86 per photo. Max = 72
2,712 notes, for an average of 0.50 per photo. Max = 33

Powerhouse Museum, Sydney, Australia

Launched on 7 April 2008, currently has 1,101 photos in 27 sets.
1,464 comments, for an average of 1.33 per photo. Max = 97
4,619 tags, for an average of 4.20 per photo. Max = 34
305 notes, for an average of 0.28 per photo. Max = 19

Brooklyn Museum, New York, United States

Launched on 28 May 2008, currently has 677 Commons images in 6 sets.
[The following statistics were re-collected today, Feb 21]
1,508 comments, for an average of 2.23 per photo. Max = 107
4,875 tags, for an average of 7.20 per photo. Max = 65
373 notes, for an average of 0.55 per photo. Max = 20

Smithsonian Institution, Washington, DC, United States

Launched on 16 June 2008, currently has 1,403 photos in 12 sets.
1,468 comments, for an average of 1.05 per photo. Max = 68
5,687 tags, for an average of 4.05 per photo. Max = 43
238 notes, for an average of 0.17 per photo. Max = 19

Bibliothèque de Toulouse, France

Launched on 26 June 2008, currently has 640 photos in 39 sets.
297 comments, for an average of 0.46 per photo. Max = 30
1,193 tags, for an average of 1.86 per photo. Max = 42
26 notes, for an average of 0.04 per photo. Max = 7

George Eastman House, Rochester, NY, United States

Launched on 17 July 2008, currently has 592 photos in 9 sets.
2,431 comments, for an average of 4.11 per photo. Max = 116
4,217 tags, for an average of 7.12 per photo. Max = 36
350 notes, for an average of 0.59 per photo. Max = 10

Biblioteca de Arte-Fundação Calouste Gulbenkian, Lisboa, Portugal

Launched on 14 August 2008, currently has 3,027 photos in 76 sets.
Most photos are under copyright; only 80 photos are labeled “no known restrictions”.
423 comments, for an average of 0.14 per photo. Max = 35
611 tags, for an average of 0.20 per photo. Max = 11
23 notes, for an average of 0.007 per photo. Max = 4

National Media Museum, Bradford, West Yorkshire, UK

Launched on 27 August 2008, currently has 130 photos in 7 sets.
858 comments, for an average of 6.6 per photo. Max = 143
958 tags, for an average of 7.37 per photo. Max = 33
83 notes, for an average of 0.64 per photo. Max = 6

National Maritime Museum, Greenwich, UK

Launched on 17 September 2008, currently has 191 photos in 8 sets.
415 comments, for an average of 2.17 per photo. Max = 37
877 tags, for an average of 4.60 per photo. Max = 26
23 notes, for an average of 0.12 per photo. Max = 3

State Library of New South Wales, Australia

Launched on 29 September 2008, currently has 249 photos in 38 sets.
1,436 comments, for an average of 5.77 per photo. Max = 149
2,944 tags, for an average of 11.82 per photo. Max = 44
163 notes, for an average of 0.65 per photo. Max = 14

Library of Virginia, Richmond, Virginia, United States

Launched on 6 October 2008, currently has 314 photos in 2 sets.
233 comments, for an average of 0.74 per photo. Max = 29
1,172 tags, for an average of 3.73 per photo. Max = 22
70 notes, for an average of 0.22 per photo. Max = 11

Musée McCord Museum, Montreal, Canada

Launched on 13 October 2008, currently has 236 photos in 3 sets.
255 comments, for an average of 1.08 per photo. Max = 25
851 tags, for an average of 3.61 per photo. Max = 31
18 notes, for an average of 0.08 per photo. Max = 3

Nationaal Archief, The Hague, The Netherlands

Launched on 21 October 2008, currently has 590 photos in 14 sets.
1,407 comments, for an average of 2.38 per photo. Max = 41
1,552 tags, for an average of 2.63 per photo. Max = 27
20 notes, for an average of 0.03 per photo. Max = 3

Australian War Memorial, Canberra, Australia

Launched on 10 November 2008, currently has 42 photos in 2 sets.
155 comments, for an average of 3.69 per photo. Max = 42
258 tags, for an average of 6.14 per photo. Max = 21
8 notes, for an average of 0.19 per photo. Max = 4

Imperial War Museum, London, UK

Launched on 11 November 2008, currently has 10 photos, no sets.
50 comments, for an average of 5.0 per photo. Max = 22
71 tags, for an average of 7.1 per photo. Max = 22
7 notes, for an average of 0.7 per photo. Max = 3

National Library of New Zealand, Wellington, New Zealand

Launched on 27 November 2008, currently has 149 photos in 8 sets.
181 comments, for an average of 1.21 per photo. Max = 23
663 tags, for an average of 4.45 per photo. Max = 14
6 notes, for an average of 0.04 per photo. Max = 4

New York Public Library, New York, United States

Launched on 15 December 2008, currently has 1,300 photos in 17 sets.
457 comments, for an average of 0.35 per photo. Max = 18
2,579 tags, for an average of 1.98 per photo. Max = 30
24 notes, for an average of 0.02 per photo. Max = 5

National Galleries of Scotland, Edinburgh, Scotland, UK

Launched on 14 January 2009, currently has 107 photos in 7 sets.
209 comments, for an average of 1.95 per photo. Max = 21
811 tags, for an average of 7.58 per photo. Max = 21
30 notes, for an average of 0.28 per photo. Max = 4

State Library of Queensland, Brisbane, Australia

Launched on 26 January 2009, currently has 150 photos in 8 sets.
61 comments, for an average of 0.41 per photo. Max = 9
110 tags, for an average of 0.73 per photo. Max = 14
7 notes, for an average of 0.05 per photo. Max = 1
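
Each average above is simply the total count divided by the number of photos in the stream, rounded to two decimal places. A minimal Python sketch of that arithmetic, using the Library of Congress figures; the function name `per_photo_average` is made up for illustration and is not taken from Patrick's article:

```python
def per_photo_average(total: int, photos: int, digits: int = 2) -> float:
    """Return total / photos rounded to `digits` places (0.0 for an empty stream)."""
    if photos == 0:
        return 0.0
    return round(total / photos, digits)


# Library of Congress figures from the listing above.
loc_photos = 5421
loc_counts = {"comments": 11675, "tags": 75143, "notes": 2712}

# One average per kind of metadata.
loc_averages = {
    kind: per_photo_average(count, loc_photos)
    for kind, count in loc_counts.items()
}
print(loc_averages)
```

Note that the averages are taken over every photo in a stream, so an institution with many untagged photos will show a low average even if a handful of its photos are heavily annotated.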


14 Responses to “Commons Metadata Statistics”

  1. Shelley Says:

    hmmm. I think these stats are a little off – looks like the API is querying unreleased sets or perhaps he’s not querying just the commons material? Regarding The Commons – we’ve only released 6 sets and approx 700 photos. If he’s querying everything, then even more is wonky because we tag non-commons material, so it’s a bit misleading.

  2. striatic Says:

    This set of data looks at all photos from each Commons institution, as opposed to only photos in the Commons collection.

    I’m not sure if Patrick’s data is misleading though, as he does make specific mention that, in the case of the Brooklyn Museum, “The collection contains 1,153 photos that are not licensed Commons”.

  3. striatic Says:

    Although that would suggest over 1000 “Commons” photos in the Brooklyn Museum stream, not 700, so either your approximation is a bit off or the data is.

    Checking the Brooklyn Museum flickr sets page, it looks like 700 is closer to the current number.

  4. Shelley Says:

    I don’t know. As one of the only mixed accounts in the Commons, to me it is a bit misleading. The participation is really varied and presenting stats in this way does not account for or tell that story. It’s the problem with straight stats with no interpretation (something that I really have a problem with generally). I’m not sure where that 300 discrepancy is – I’d have to look more carefully tomorrow.

  5. Shelley Says:

    Another thought – really, this is a simple issue to fix – if he’d just query the commons assets, then we’d be comparing apples to apples and it would be a lot less of an issue and the metric would be a lot more accurate given what he’s trying to show. I believe this can now be done, no?

  6. striatic Says:

    That can be done now, although it was much more obscure before Flickr added the “Is Commons” flag to the API, which is a fairly recent development.

    If I was to hazard a guess, the 300 discrepancy might be due to taking the entire stream count and then subtracting all the photos under Creative Commons licenses, but not subtracting the photos in the Brooklyn Museum stream that are under copyright. There are a few here and there in the stream, and it is conceivable they could add up to 300.

  7. Patrick Says:

    Yes, I have found 1153 photos that are not licensed ‘Commons’ in the Brooklyn Museum collection.
    Maybe it would be more useful to give statistics only for photos licensed ‘Commons’. I agree.
    However, I found only 80 photos with this ‘Commons’ license in the Biblioteca de Arte-Fundação Calouste Gulbenkian collection, and I assumed the project is also active for the other 2947 photos.
    Patrick Peccatte

  8. 100th Post | new curator Says:

    [...] Common Metadata Statistics Save to: [...]

  9. Shelley Says:

    Hi Patrick!

    Yay, nice of you to weigh in here. Yes, I think generally if you can query only commons materials through the API, it will give us all a much more accurate picture of what is going on with regards to the oddball blended accounts like ours. I had thought that Brooklyn Museum and maybe the Maritime were the only blended, but I’m not sure about Biblioteca de Arte-Fundação Calouste Gulbenkian. Someone at Flickr may be able to give us a run down on who’s blended and who’s not.

    But, wait 1153 photos that are not licensed commons? I don’t get it – the math is not adding up. Right now we have 2414 public assets in the entire stream and approx 700 of them are commons:

    Paris 270
    Chicago 97
    Egypt 88+101
    Turkey 110+11

    Total Commons 677
    Total non-commons 1737

    I can’t figure out where these discrepancies are…

  10. Patrick Says:

    Hi Shelley,

    It seems there is a problem with Flickr API.
    I use the flickr.photos.search API method to count the different kinds of licenses, values 1 to 7, according to the flickr.photos.licenses.getInfo documentation; see
    http://www.flickr.com/services/api/flickr.photos.licenses.getInfo.html

    Following are my results collected today Feb 20 for Brooklyn Museum:

    licence 1, “Attribution-NonCommercial-ShareAlike License”, 0 photos
    licence 2, “Attribution-NonCommercial License”, 0 photos
    licence 3, “Attribution-NonCommercial-NoDerivs License”, 1154 photos, ex. 3276831950
    licence 4, “Attribution License”, 8 photos, ex. 3264857348
    licence 5, “Attribution-ShareAlike License”, 0 photos
    licence 6, “Attribution-NoDerivs License”, 1 photo, ex. 160909843
    licence 7, “No known copyright restrictions” [Commons], 677 photos, ex. 2828649270

    The total is 1840, so the difference 2414-1840=574 must be the number of photos with an “All rights reserved” license.
    Using flickr.photos.getInfo to get info about an “All rights reserved” photo, the string “license=0” is displayed, but this value is not documented and searching for license=0 does not work:
    when I search for license=0 on the BM collection using flickr.photos.search, I find 2414 photos, which is the full set of assets.
    For me, something is wrong with the Flickr API on licenses. Or maybe I am missing a point…

  11. striatic Says:

    Patrick,

    Flickr recently moved away from the license number system as the supported method for querying The Commons.

    You can now scope all queries to report back only Commons photos by using the “is_commons” parameter on your query.

    See here:

    http://flickr.com/groups/flickrcommons/discuss/72157611463307739/

  12. striatic Says:

    Using API explorer to search the Brooklyn Museum, restricted to “is_commons” being “true” gives the following result:

    http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=ef7e596424a94df7eab47333b9ceaf7b&user_id=83979593%40N00&is_commons=true&auth_token=72157614116292977-a719cd7a97f60770&api_sig=e7ebc9e1497bc68784cac3a8f192fcf1

    Which is 677 photos.

  13. Patrick Says:

    Striatic,
    The “is_commons” parameter has the same effect as searching for “license=7”.
    On Brooklyn Museum both return 677 photos.
    The problem for me is that there is no way to search only for “All rights reserved” photos, which seem to correspond to “license=0”.

  14. Shelley Says:

    Hi Patrick, striatic,

    I think this is good, though. At least we know we are accurately getting 677 for Commons images and if we use that as the metric from now on it will be the equivalent to looking at things apples to apples. It actually will help me quite a bit and thanks for your run down of the Commons stats for us here.
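
For anyone who wants to repeat the counts discussed in this thread, here is a rough Python sketch of the two steps: building a `flickr.photos.search` query scoped with `is_commons` (or `license`), and working out how many photos the license counts leave unaccounted for. It is a sketch under assumptions, not Patrick's actual script: the helper names, the `YOUR_API_KEY` placeholder, the HTTPS endpoint, and the hand-written sample response are all illustrative, and no network request is made.

```python
import json
from typing import Mapping
from urllib.parse import urlencode

# Assumed REST endpoint; the thread's own example used the http:// form.
REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key: str, user_id: str, **filters: str) -> str:
    """Build a flickr.photos.search URL; filters can be is_commons or license."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "user_id": user_id,
        "per_page": "1",        # only the total is needed, not the photo list
        "format": "json",
        "nojsoncallback": "1",  # ask for plain JSON rather than JSONP
    }
    params.update(filters)
    return REST_ENDPOINT + "?" + urlencode(params)


def parse_total(response_body: str) -> int:
    """Read the 'total' field from a JSON search response."""
    payload = json.loads(response_body)
    if payload.get("stat") != "ok":
        raise ValueError(payload.get("message", "Flickr API error"))
    return int(payload["photos"]["total"])


def unaccounted(stream_total: int, counts_by_license: Mapping[int, int]) -> int:
    """Photos left over once every counted license is subtracted."""
    return stream_total - sum(counts_by_license.values())


# Brooklyn Museum user id as it appears in the query quoted in the thread.
commons_url = build_search_url("YOUR_API_KEY", "83979593@N00", is_commons="true")

# Hand-written sample body (an assumption, not a captured API response).
sample_body = '{"photos": {"page": 1, "pages": 677, "perpage": 1, "total": "677", "photo": []}, "stat": "ok"}'
commons_total = parse_total(sample_body)

# Patrick's Feb 20 counts for licenses 1 to 7, against a stream of 2414.
brooklyn_counts = {1: 0, 2: 0, 3: 1154, 4: 8, 5: 0, 6: 1, 7: 677}
remainder = unaccounted(2414, brooklyn_counts)
print(commons_url, commons_total, remainder)
```

To run it for real, fetch `commons_url` with a valid API key and pass the response text to `parse_total`. The remainder is only an inference about “All rights reserved” photos, since, as noted above, searching for license=0 did not return a usable count.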
