That seems like more of what WolframAlpha caters to. Personally, I don't like assuming an engine has interpreted what I'm looking for correctly - I'd prefer to maintain some of the load of personally understanding the source of information and its context. So here is what I do:
>> distance from Los Angeles to New York
!m los angeles to new york
>> Joe Biden age
!w joe biden
>> knives out cast
!imdb knives out
>> capital of South Africa
!w South Africa
And that last one is a really good example of why I don't want to trust an engine to interpret what I'm looking for, because WolframAlpha just tells you Pretoria, and if I hadn't spent a large part of my life there I wouldn't know that's probably not what people are looking for. Economically? They probably want to know that Johannesburg is the largest city. Just like how people are sometimes surprised when they learn that New York City and Los Angeles are not state capitals, even though they're really important cities. Politically? Well, the roles of the government are split between 3 cities and that's not a simple thing for an engine to comprehend. And I didn't even know Bloemfontein held that kind of status until I just read it on Wikipedia. Neither I, a former citizen of the country, nor WolframAlpha, was aware of that.
It definitely still has a long way to go. For example, it cannot understand area Pittsburgh / area Kanazawa or population density Pittsburgh / population density Kanazawa which seems less complex than the above queries. Alpha is, however, aware of the populations of each city.
“Every county on the map above gets a score for each place based on a combination of proximity, population, and Wikipedia article length, then normalized by share.”
This score does not tell you “what place someone is most likely referring to, depending on where they are.”
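To make the quoted description concrete, here is a rough sketch of what "a combination of proximity, population, and Wikipedia article length, then normalized by share" could look like for a single county. The article does not say how the three factors are combined, so the product, the `1 / (1 + distance)` proximity term, and the names `place_shares`, `county_xy` and `places` are all my assumptions, not the author's method.

```python
from math import hypot


def place_shares(county_xy, places):
    """Return each candidate place's share of the total score for one county.

    places maps a place name to (xy, population, article_length).
    ASSUMPTION: the raw score is proximity * population * article_length;
    the article only says the factors are "combined".
    """
    raw = {}
    for name, (xy, population, article_length) in places.items():
        distance = hypot(county_xy[0] - xy[0], county_xy[1] - xy[1])
        proximity = 1.0 / (1.0 + distance)  # assumed decay, closer is higher
        raw[name] = proximity * population * article_length
    total = sum(raw.values())
    # "Normalized by share": each score becomes a fraction of the county total.
    return {name: score / total for name, score in raw.items()}
```

Whatever the real formula is, the output is a relative weight built from size and fame, which is why it can't be read as the probability that someone means a given place.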
That might seem logical on the surface, but dig just a little deeper and the flaws become readily apparent.
For example if you're looking for the population of New York between 1850 and 1950, you would see a potentially dramatic shift of the population around the turn of the century, which would be misleading if you didn't already know the caveat about the consolidation of the city in 1898.
However that shouldn't discourage the tagging of data by city. It just means that a city, as metadata, is no more or less functional than country, as the grandparent comment suggests.
I wish the article had explained how the individual regions on the map are determined. It probably has to do with how the searches are geo-located but it seems like it could've skewed the results for some places.
Use of city boundaries instead of MSA (metropolitan statistical area) makes this inaccurate and not terribly useful as a tool for understanding anything about these markets, let alone for making decisions.
Isn't that the case anywhere? Even city centre / suburbs will have different values, much less a whole province where A may contain an empty field or a group of residential buildings with 50 floors and shared lifts.
I could see how it would be useful for a map of a city, or maybe even at a scale of some regions... but not for comparing totals between regions.
From the article:
"Google Now has a huge knowledge graph—you can ask questions like ‘Where was Abraham Lincoln born?’ And it can name the city. You can also say, ‘What is the population?’ of a city and it’ll bring up a chart and answer. But you cannot say, ‘What is the population of the city where Abraham Lincoln was born?’”
Out of interest I tried this question on WolframAlpha and it happily returned the answer[0].
> rational boundaries like states, counties, localities, suburbs, or any other reasonably controlled, surveyed, and relatively consistent dataset.
Those boundaries aren't necessarily rational either.
Not only does your post code not reference your political boundaries; your postal city may not be a political city, or may not match your political city. I've lived in several places where I needed to write the city that the post office serving my house was in, if I wanted to receive mail. It's really more the name of the post office: there are plenty of post offices in unincorporated county land, which doesn't belong to any city.
Thanks for mentioning this. To be fair, locality of information is a pretty common assumption but I should definitely qualify this a bit more in the future.
100%. I live in a hamlet of a larger town in the US, and was curious what the population of my hamlet is.
There’s a Wikipedia page for the hamlet, but it’s empty. No population data, etc.
I’d much rather see no data than an LLM’s best guess. I’m guessing an LLM using the data would also perform better without approximated or “probably right” information.
They explicitly say that in their "data and methods" blurb:
> Person/city associations were based on the thousands of “People from X city” pages on Wikipedia. The top person from each city was determined by using median pageviews (with a minimum of 1 year of traffic). We chose to include multiple occurrences for a single person because there is both no way to determine which is more accurate and people can “be from” multiple places.
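For anyone wondering what that selection rule amounts to, here is a minimal sketch. It assumes pageviews arrive as one count per month, so "a minimum of 1 year of traffic" becomes at least 12 entries; the names `top_person_by_city`, `people_by_city` and `monthly_views` are made up for illustration and are not from the article.

```python
from statistics import median

MIN_MONTHS = 12  # ASSUMPTION: "1 year of traffic" means 12 monthly counts


def top_person_by_city(people_by_city, monthly_views):
    """Pick, for each city, the person with the highest median pageviews.

    people_by_city maps a city to the people listed as being from it.
    monthly_views maps a person to a list of monthly pageview counts.
    A person may appear under several cities and can win more than one.
    """
    top = {}
    for city, people in people_by_city.items():
        eligible = [
            person for person in people
            if len(monthly_views.get(person, [])) >= MIN_MONTHS
        ]
        if eligible:  # cities with no eligible person are left out
            top[city] = max(eligible, key=lambda p: median(monthly_views[p]))
    return top
```

Using the median rather than the mean keeps a single viral month from deciding the winner, and nothing here deduplicates people across cities, which matches the "multiple occurrences" note in the blurb.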
Thank you! At least someone understands my frustration. When I try to compare countries, it somehow manages to insert New York along with a few other US states as countries. He/She/They might be a good React/GraphQL ninja(s), but definitely skipped out on the ETL course.
The idea is good but the tool is not good enough -- it only notices the population centres that the author has taken note of, which looks like unintentional confirmation bias.