Gazetteer
Revision as of 18:08, 23 February 2024
Around cartography, a gazetteer is analogous to a dictionary or encyclopedia: look up a location, and find things about it.
They were seemingly introduced to know about parts of the world, and there were different types for different interests (e.g. a broad world overview, detailed descriptions of parts of countries, or almost tourist-like thematic information).
In modern use it sometimes just means a name-to-coordinates lookup.
https://en.wikipedia.org/wiki/Gazetteer
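As a rough illustration of the name-to-coordinates sense, a minimal sketch could look like the following. The table contents, the rounded coordinates, and the function name `lookup` are all made up for this example and do not come from any real gazetteer service.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy gazetteer: lowercased place name -> (latitude, longitude).
# Coordinates are approximate and rounded, for illustration only.
GAZETTEER: Dict[str, Tuple[float, float]] = {
    "paris": (48.86, 2.35),
    "london": (51.51, -0.13),
    "amsterdam": (52.37, 4.90),
}


def lookup(name: str) -> Optional[Tuple[float, float]]:
    """Return approximate coordinates for a place name, or None if it is not listed."""
    return GAZETTEER.get(name.strip().lower())
```

A real gazetteer would probably also have to deal with ambiguous names (several places called the same thing) and alternative names, which this sketch ignores.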
Around computational linguistics, a gazetteer is largely just a list of already known names, probably one list per label/category you are interested in, optionally with some basic rules around it.
Around the task of Named Entity Recognition, this may provide good precision (whatever is matched is usually right, because we already knew those names precisely), but poor recall, in that unknown names, and e.g. variant spellings, will always be ignored.
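The precision/recall tradeoff can be seen in a small sketch of gazetteer-style tagging. This is only one simple way to do it (case-insensitive, longest-match-first lookup over word n-grams); the lists, the labels, and the function name `tag_entities` are hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical per-label lists of already known names (lowercased).
GAZETTEERS: Dict[str, Set[str]] = {
    "LOCATION": {"amsterdam", "new york"},
    "PERSON": {"ada lovelace"},
}

# Longest listed name, in words, so we know how large a span to try.
MAX_WORDS = max(len(name.split()) for names in GAZETTEERS.values() for name in names)


def tag_entities(text: str) -> List[Tuple[str, str]]:
    """Return (matched text, label) pairs for spans found in the gazetteers."""
    tokens = re.findall(r"\w+", text)
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, so "New York" wins over "New".
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            label = next(
                (lab for lab, names in GAZETTEERS.items() if candidate.lower() in names),
                None,
            )
            if label is not None:
                found.append((candidate, label))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # nothing matched at this position
    return found
```

With these toy lists, `tag_entities("Ada Lovelace never visited New York or Amsterdam.")` finds all three listed names, while `tag_entities("She moved to Amsterdamm and Rotterdam.")` finds nothing: the misspelling and the unlisted city are both silently missed, which is the recall problem described above.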