Holes in the Wolfram Knowledgebase

Wolfram touts its Knowledgebase as “the world’s largest and broadest repository of computable knowledge” and “carefully curated expert knowledge directly derived from primary sources.” There’s certainly a lot in there, but there are some inexplicable holes that could be filled with little effort.

I used the Knowledgebase last year in my post about orbital curvature. Things like

Entity["Planet", "Earth"]["AverageOrbitDistance"]

and

Entity["Planet", "Earth"]["Mass"]

pulled information out of the Knowledgebase so I didn’t have to look it up outside of Mathematica and paste it into my code. Very convenient.

But I learned a few months ago that another part of the Knowledgebase was missing data, which could get in the way of other types of calculation. I was testing out the kind of state- and county-level information I could access, and my initial explorations focused on where I live: DuPage County, Illinois.

To be sure, the Knowledgebase has lots of info on DuPage County. It knows, for example, the area, the population, the per capita income, and the number of annual births and deaths. But it doesn’t know the county seat, which I would think is easier to determine and enter into the Knowledgebase than most of the other stuff—not to mention more stable than transient figures like population and income.

Broadening my exploration to all the counties in Illinois, I learned that of our 102 counties, Wolfram knew the capitals of all of them except DuPage and DeKalb counties. So this command

EntityValue[
 Entity["AdministrativeDivision", {"CookCounty", "Illinois", 
  "UnitedStates"}], "CapitalName"]

returns Chicago, as expected, while both of these commands,

EntityValue[
 Entity["AdministrativeDivision", {"DuPageCounty", "Illinois", 
  "UnitedStates"}], "CapitalName"]

EntityValue[
 Entity["AdministrativeDivision", {"DeKalbCounty", "Illinois", 
  "UnitedStates"}], "CapitalName"]

return Missing["NotAvailable"].

This is not exactly obscure information, and there are reliable sources from which to get it. Here, for example, is a map from the Illinois Blue Book, an official publication of the state.

[Image: Illinois county map]

As you (and the folks at Wolfram) can see, the DuPage and DeKalb county seats are Wheaton and Sycamore, respectively.

I sent an email to Wolfram about the missing county seat data and got a boilerplate reply saying their development team would review it. That was in August; the Knowledgebase still returns Missing["NotAvailable"].
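A check like this is easy to repeat. What follows is a minimal sketch, not necessarily how the results above were produced. It asks for the "CapitalName" of the three Illinois counties named above and keeps the ones that come back missing. The names ilCounty, names, and seats are hypothetical, chosen only for illustration; EntityValue, Pick, and MissingQ are standard Wolfram Language functions.

```wolfram
(* ilCounty is a hypothetical helper: build a county entity from its canonical name *)
ilCounty[name_String] := 
  Entity["AdministrativeDivision", {name, "Illinois", "UnitedStates"}]

(* The three counties discussed in the text *)
names = {"CookCounty", "DuPageCounty", "DeKalbCounty"};

(* Ask the Knowledgebase for each county seat in turn *)
seats = EntityValue[ilCounty[#], "CapitalName"] & /@ names;

(* Keep only the counties whose seat comes back as Missing[...] *)
Pick[names, MissingQ /@ seats]
```

If the Knowledgebase still has the holes described here, the result should be the DuPage and DeKalb names; once the data is filled in, the list would come back empty.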

Recently, I decided to look for missing county seats in every state. Here are all the counties (or administrative divisions that Wolfram treats like counties) that are missing their capitals in the Knowledgebase:

County w/o seat                      State
DeKalb County                        Alabama
Aleutians West                       Alaska
Bethel                               Alaska
Chugach                              Alaska
Copper River                         Alaska
Dillingham                           Alaska
Hoonah-Angoon                        Alaska
Nome                                 Alaska
Prince of Wales-Hyder                Alaska
Petersburg                           Alaska
Skagway                              Alaska
Southeast Fairbanks                  Alaska
Kusilvak Census Area                 Alaska
Wrangell                             Alaska
Yukon-Koyukuk                        Alaska
Mono County                          California
Sierra County                        California
Conejos County                       Colorado
Wakulla County                       Florida
Columbia County                      Georgia
Crawford County                      Georgia
DeKalb County                        Georgia
Echols County                        Georgia
Kalawao County                       Hawaii
Owyhee County                        Idaho
DeKalb County                        Illinois
DuPage County                        Illinois
DeKalb County                        Indiana
LaPorte County                       Indiana
Plaquemines Parish                   Louisiana
St. James Parish                     Louisiana
Keweenaw County                      Michigan
Lake of the Woods County             Minnesota
DeSoto County                        Mississippi
Franklin County                      Mississippi
DeKalb County                        Missouri
McPherson County                     Nebraska
Esmeralda County                     Nevada
Eureka County                        Nevada
Lincoln County                       Nevada
Storey County                        Nevada
Burlington County                    New Jersey
Mora County                          New Mexico
Rio Arriba County                    New Mexico
Bronx County (The Bronx)             New York
Broome County                        New York
Kings County (Brooklyn)              New York
New York County (Manhattan)          New York
Queens County (Queens)               New York
Richmond County (Staten Island)      New York
Camden County                        North Carolina
Currituck County                     North Carolina
Hyde County                          North Carolina
Dunn County                          North Dakota
Bristol County                       Rhode Island
Kent County                          Rhode Island
Buffalo County                       South Dakota
DeKalb County                        Tennessee
Borden County                        Texas
Glasscock County                     Texas
Kenedy County                        Texas
King County                          Texas
Loving County                        Texas
McMullen County                      Texas
Montague County                      Texas
Palo Pinto County                    Texas
Young County                         Texas
Rich County                          Utah
Alexandria (independent city)        Virginia
Amelia County                        Virginia
Bath County                          Virginia
Bland County                         Virginia
Bristol (independent city)           Virginia
Buckingham County                    Virginia
Buena Vista (independent city)       Virginia
Charles City County                  Virginia
Charlottesville (independent city)   Virginia
Chesapeake (independent city)        Virginia
Colonial Heights (independent city)  Virginia
Covington (independent city)         Virginia
Cumberland County                    Virginia
Danville (independent city)          Virginia
Dinwiddie County                     Virginia
Emporia (independent city)           Virginia
Fairfax (independent city)           Virginia
Falls Church (independent city)      Virginia
Fluvanna County                      Virginia
Franklin (independent city)          Virginia
Fredericksburg (independent city)    Virginia
Galax (independent city)             Virginia
Goochland County                     Virginia
Hampton (independent city)           Virginia
Hanover County                       Virginia
Harrisonburg (independent city)      Virginia
Hopewell (independent city)          Virginia
Isle of Wight County                 Virginia
King and Queen County                Virginia
King George County                   Virginia
King William County                  Virginia
Lancaster County                     Virginia
Lexington (independent city)         Virginia
Lunenburg County                     Virginia
Lynchburg (independent city)         Virginia
Manassas (independent city)          Virginia
Manassas Park (independent city)     Virginia
Martinsville (independent city)      Virginia
Mathews County                       Virginia
Middlesex County                     Virginia
Nelson County                        Virginia
New Kent County                      Virginia
Newport News (independent city)      Virginia
Norfolk (independent city)           Virginia
Northumberland County                Virginia
Norton (independent city)            Virginia
Nottoway County                      Virginia
Petersburg (independent city)        Virginia
Poquoson (independent city)          Virginia
Portsmouth (independent city)        Virginia
Powhatan County                      Virginia
Prince George County                 Virginia
Radford (independent city)           Virginia
Richmond County                      Virginia
Roanoke County                       Virginia
Salem (independent city)             Virginia
Stafford County                      Virginia
Staunton (independent city)          Virginia
Suffolk (independent city)           Virginia
Sussex County                        Virginia
Virginia Beach (independent city)    Virginia
Waynesboro (independent city)        Virginia
Williamsburg (independent city)      Virginia
Winchester (independent city)        Virginia

Quite a list. Now there are legitimate (or at least arguable) reasons some of these counties are missing their county seat:

  1. Some counties really don’t have a county seat. Kalawao County in Hawaii, for example.
  2. Some administrative districts in Alaska don’t have a capital. Alaska has boroughs rather than counties, and one of them, called the Unorganized Borough,[1] is further subdivided into “census areas,” none of which have a capital.
  3. Virginia has, in addition to counties, “independent cities,” which appear to be at the same level as counties.[2] While I think Wolfram would be justified in saying these independent cities are their own capitals, it’s decided to say the capitals are missing, which is a legitimate choice, too.
  4. Five counties in New York match up with the boroughs of New York City. Pretty hard to choose the capital of a portion of a city, so these counties also have missing capitals.

But most of the counties with missing county seats are like DuPage and DeKalb counties: regular counties with regular county seats that are simply absent from the Knowledgebase, even though they’re easy to look up and verify. There are fewer than 100 of them. I don’t know why they’re missing, but filling in missing values like this is a standard data cleaning operation. And as I said earlier, it’s essentially a one-time job; counties just don’t change seats very often.
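Until the holes are filled upstream, a user can patch them locally. The following is a minimal sketch of that kind of fill-in, assuming a hand-maintained table of seats. The names knownSeats and countySeat are hypothetical, and the two entries are the Illinois seats read off the Blue Book map above.

```wolfram
(* Hand-entered seats for counties the Knowledgebase lacks (hypothetical local table) *)
knownSeats = <|"DuPageCounty" -> "Wheaton", "DeKalbCounty" -> "Sycamore"|>;

(* Use the Knowledgebase value when it exists; otherwise fall back to the local table. *)
(* If the county is in neither place, the original Missing[...] value is returned.     *)
countySeat[name_String] := Module[{kb},
  kb = EntityValue[
    Entity["AdministrativeDivision", {name, "Illinois", "UnitedStates"}],
    "CapitalName"];
  If[MissingQ[kb], Lookup[knownSeats, name, kb], kb]
]

countySeat /@ {"CookCounty", "DuPageCounty", "DeKalbCounty"}
```

Checking the Knowledgebase first means the local table stops mattering as soon as Wolfram fills in the real value.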

I haven’t sent this list off to Wolfram. If the people on its development team can’t be bothered to clean the data in their own home state, how likely is it that they’ll fill in all the other states’ data? But they should.

Update 4 Dec 2023 11:45 AM
Chon Torres on Mastodon informed me that the two California counties, Mono and Sierra, do have county seats, but they’re unincorporated, and that might explain why they’re missing from the Knowledgebase. That’s a good explanation, but I would argue with Wolfram that it’s a poor reason for excluding a capital. A county seat should have county government offices—Chon mentioned that he’s been at the Sierra County Courthouse in Downieville—but I don’t see why it needs a municipal government.

Adding to the madness of Virginia government, Sam Davies told me that an independent city can also be the county seat of a county that it’s been carved out of. The example he gave was Charlottesville, which is both an independent city and the capital of Albemarle County. To me, a more disturbing example is Fairfax, which is an independent city but also the county seat of (yes, that’s right) Fairfax County.

Thanks to Chon and Sam for the local government expertise.

  1. Which is apparently not actually a borough itself, despite its name. This is more than I wanted to know about Alaska’s government. 

  2. As with the Unorganized Borough in Alaska, this is more than I wanted to know about Virginia’s government.