I am trying to collect sample datasets to test in OpenVisuals.org. To be useful for that, these sample datasets should:
- represent generic types of datasets.
- come in different but common formats, the kind you get from copying and pasting, exporting from Excel, etc. (right now, I can think of differences in how "comma-separated" values are actually delimited: sometimes a comma, sometimes a tab, sometimes a space).
- invite comparison among their data.
I will use these datasets to make sure that the website (and hence the applets) works fine with them. So far I have fetched a couple of datasets that would be suitable:
MDF (medium-density fibreboard, a wood product) production per continent per year (gathered from the United Nations FAO website)
years	subject	commodity	America +	Asia +	Europe +	Oceania +
1995	Production Quantity	MDF	2398000.00	1582000.00	3363300.00	540000.00
1996	Production Quantity	MDF	2725000.00	2190000.00	3604300.00	786000.00
1997	Production Quantity	MDF	3285036.00	3737000.00	4587300.00	853000.00
1998	Production Quantity	MDF	3899694.00	3484000.00	6557913.00	899000.00
1999	Production Quantity	MDF	4635000.00	3854000.00	7288500.00	1025000.00
2000	Production Quantity	MDF	4773000.00	4652000.00	8380493.00	1241000.00
2001	Production Quantity	MDF	5101472.00	8002300.00	9163590.00	1350000.00
2002	Production Quantity	MDF	5601924.00	10135100.00	10386690.00	1419000.00
2003	Production Quantity	MDF	6327229.00	13970200.00	11110990.00	1627000.00
2004	Production Quantity	MDF	7729545.00	18658026.00	11846105.00	1639000.00
2005	Production Quantity	MDF	7751384.00	24133523.00	12704400.00	1622000.00
2006	Production Quantity	MDF	8467770.00	27017523.00	13383800.00	1635000.00
In the table above, the generated dataset is delimited not with commas but with tabs (although the file is a .csv). This also happens when you copy and paste from Excel.
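Since the file extension cannot be trusted, one option is to guess the delimiter from the text itself. The following is only a minimal sketch of that idea, not how OpenVisuals.org actually does it: the function names (`count_fields`, `guess_delimiter`, `parse_table`) and the list of candidate delimiters are my own assumptions. It picks the candidate that gives every line the same number of fields, ignoring delimiters inside double quotes.

```python
import csv
import io

# Candidate delimiters are an assumption, based on the cases mentioned above.
CANDIDATES = [",", "\t", " "]

def count_fields(line, delim):
    """Count fields in one line, ignoring delimiters inside double quotes."""
    in_quotes = False
    fields = 1
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == delim and not in_quotes:
            fields += 1
    return fields

def guess_delimiter(text, candidates=CANDIDATES):
    """Return the candidate giving every line the same field count (> 1), or None."""
    lines = [line for line in text.splitlines() if line.strip()]
    best, best_fields = None, 1
    for delim in candidates:
        widths = {count_fields(line, delim) for line in lines}
        if len(widths) == 1:
            width = next(iter(widths))
            # Prefer the candidate that yields the most columns.
            if width > best_fields:
                best, best_fields = delim, width
    return best

def parse_table(text):
    """Split pasted text into rows using the guessed delimiter."""
    delim = guess_delimiter(text)
    if delim is None:
        raise ValueError("could not guess a delimiter")
    return list(csv.reader(io.StringIO(text), delimiter=delim))

# First rows and columns of the MDF table, tab-delimited as exported.
sample = (
    "years\tsubject\tcommodity\tAmerica +\tAsia +\n"
    "1995\tProduction Quantity\tMDF\t2398000.00\t1582000.00\n"
    "1996\tProduction Quantity\tMDF\t2725000.00\t2190000.00\n"
)
rows = parse_table(sample)
print(repr(guess_delimiter(sample)))
print(rows[1])
```

The space candidate is rejected here because "Production Quantity" and "America +" contain spaces, so the number of space-separated pieces differs between the header and the data rows. A real importer would probably need more care, for example with ragged rows or a single-line paste.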
Below is a very typical example of a dataset with empty cells (unknown data), on one of the most popular topics lately.
"State","Barack Obama","Hillary Rodham Clinton ","John Edwards","John McCain ","Mike Huckabee","Mitt Romney","Ron Paul"
"Alabama",10,19,0,16,21,0,0
"Arizona",21,25,0,50,0,0,0
"Arkansas",3,10,0,1,29,1,0
"California",155,195,0,146,0,3,0
"Connecticut",26,22,0,27,0,0,0
"Delaware",9,6,0,18,0,0,0
"Georgia",22,12,0,3,45,0,0
"Illinois",68,35,0,54,0,3,0
"Kansas",15,6,0,,,,0
"Massachusetts",38,55,0,18,0,22,0
"Missouri",36,36,0,58,0,0,0
"Montana",,,,0,0,25,0
"New Jersey",46,54,0,52,0,0,0
"New Mexico",12,14,0,,,,0
"New York",80,121,0,101,0,0,0
"North Dakota",,,,5,5,8,5
"Oklahoma",14,24,0,32,6,0,0
"Tennessee",14,24,0,19,25,8,0
"Utah",14,9,0,0,0,36,0
"West Virginia",,,,,18,0,0
"Florida",,,,57,0,0,0
"South Carolina",25,12,8,19,5,0,0
"Michigan",,,,6,1,23,0
"New Hampshire",9,9,4,7,1,4,0
Here is also the imported version of this dataset in OpenVisuals.org.
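The empty cells are the tricky part of this dataset: an empty cell means "unknown", which is not the same as 0. As a rough sketch of how an importer might keep that distinction (the function name `read_counts` and the choice of `None` for unknown values are my assumptions, not how the site handles it), the rows can be read like this:

```python
import csv
import io

def read_counts(text):
    """Parse comma-separated counts; empty cells become None, not 0."""
    reader = csv.reader(io.StringIO(text))
    # Some header names carry stray trailing spaces, so strip them.
    header = [name.strip() for name in next(reader)]
    table = {}
    for row in reader:
        label, cells = row[0], row[1:]
        table[label] = {
            name: (int(cell) if cell.strip() else None)
            for name, cell in zip(header[1:], cells)
        }
    return header, table

# A few rows and columns taken from the dataset above.
sample = (
    '"State","Barack Obama","Hillary Rodham Clinton ","John McCain "\n'
    '"Kansas",15,6,\n'
    '"Montana",,,0\n'
)
header, table = read_counts(sample)
print(table["Kansas"])
print(table["Montana"])
```

Keeping `None` separate from 0 lets a chart skip or mark the unknown values instead of drawing them as zero-height bars.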