Gazetteer: Difference between revisions

Revision as of 18:08, 23 February 2024

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Around cartography, a gazetteer is analogous to a dictionary or encyclopedia: you look up a location, and find things about it.

They seem to have been introduced as a way to learn about parts of the world, and there were different types for different interests, e.g. from a broad world overview, to detailed descriptions of parts of countries, to almost tourist-like thematic information.

In some modern use it sometimes just means a name-to-coordinates lookup.
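
As a minimal sketch of that name-to-coordinates sense: the table below is hypothetical toy data with approximate coordinates, and `lookup` is an illustrative name rather than any particular library's API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy gazetteer: name -> approximate (latitude, longitude)
# in decimal degrees. A real one would be loaded from a dataset.
GAZETTEER: Dict[str, Tuple[float, float]] = {
    "amsterdam": (52.37, 4.90),
    "paris": (48.86, 2.35),
    "nairobi": (-1.29, 36.82),
}


def lookup(name: str) -> Optional[Tuple[float, float]]:
    """Return coordinates for a known name, or None if it is not listed."""
    # Normalise whitespace and case so "Paris" and " paris " both match.
    return GAZETTEER.get(name.strip().casefold())


print(lookup("Paris"))     # a known name
print(lookup("Atlantis"))  # an unknown name gives None
```

Anything not in the table simply comes back empty, which is the same limitation that shows up in the linguistic use below.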

https://en.wikipedia.org/wiki/Gazetteer


Around computational linguistics, a gazetteer largely just means using a list of already known names (which in the case of geographical names might come from a classical gazetteer), probably one list per label/category you are interested in, optionally with some basic rules around it.

Around the task of Named Entity Recognition, this may provide good precision (what gets matched is usually correct, because we already knew those names precisely), but poor recall, in that unknown names, and e.g. variant spellings, will always be missed.
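
A rough sketch of such list-based tagging, to make the precision/recall trade-off concrete. The name lists, labels, and the `tag` function are all made up for illustration. The "longest match first" rule is one possible basic rule, not a standard one.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical per-label name lists, stored lowercased.
GAZETTEER: Dict[str, Set[str]] = {
    "LOC": {"amsterdam", "new york", "paris"},
    "ORG": {"united nations"},
}


def tag(text: str, max_len: int = 3) -> List[Tuple[str, str]]:
    """Return (surface text, label) pairs for names found in the gazetteer."""
    # Word tokens with their character offsets in the original text.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate first, so "New York" wins over "York".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(t[0] for t in tokens[i:i + n]).casefold()
            for label, names in GAZETTEER.items():
                if candidate in names:
                    start, end = tokens[i][1], tokens[i + n - 1][2]
                    match = (text[start:end], label)
                    break
            if match:
                break
        if match:
            found.append(match)
            i += n  # skip past the tokens that were just matched
        else:
            i += 1
    return found


print(tag("She flew from New York to Amsterdam."))
print(tag("He moved from Amsterdm to Utrecht."))
```

The first call finds both names because they are listed exactly. The second finds nothing: "Amsterdm" is a variant spelling and "Utrecht" is simply not in the list, which is the recall problem described above.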