AARoads Forum

Non-Road Boards => Off-Topic => Topic started by: Scott5114 on May 14, 2024, 03:14:47 AM

Title: Ideal test toponyms
Post by: Scott5114 on May 14, 2024, 03:14:47 AM
Leudimin mentioned on the wiki Discord that he has a set of standard test names that he uses for road-related fictional projects. Since I do a lot of type design for my job, that got me thinking about what characteristics a battery of "standard" test names should have. For type design and lettering purposes, you would probably want to have a lot of common letter combinations (bigrams, trigrams, and quadrigrams). If you can make a word with a lot of these look good, you will be able to make most words look good, because the same letter combinations appear often in many different words.

What are the most common letter combinations? Someone at Notre Dame ran an analysis on English-language works on Project Gutenberg (https://www3.nd.edu/~busiforc/handouts/cryptography/Letter%20Frequencies.html), and came up with the following lists:

So, using the above lists...
1. What toponym, letter for letter, is the most efficient representation of as many of the above letter combinations as possible? I imagine the easiest way to express this would be to count the number of common letter combinations and divide it by the number of letters. I ran a quick analysis on the list of Nevada municipalities (easy since there's only 19 of them) and found that the most efficient municipality in Nevada is Henderson (he, nd, de, er, on, nde are all on the list, making 6 matches, divided by 9 letters, producing a score of 0.67).
2. What is the most efficient toponym in each state?
3. What set of toponyms would most efficiently cover the entire list above?
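
For anyone who wants to try questions 1 and 3 in code, here is a minimal Python sketch of the matches-per-letter score and a greedy pass at the coverage question. Several things in it are assumptions rather than anything settled above: `COMMON_NGRAMS` is a placeholder holding only the six combinations confirmed for Henderson (the real lists would have to be pasted in), the function names are made up for illustration, each combination is counted once per name no matter how often it occurs, and combinations are not matched across spaces. The greedy loop is a standard set-cover heuristic, so it gives a reasonable cover, not a guaranteed smallest one.

```python
import re

# Placeholder: only the combinations confirmed for "Henderson" in the post.
# Replace with the full bigram/trigram/quadrigram lists (assumption).
COMMON_NGRAMS = {"he", "nd", "de", "er", "on", "nde"}


def words_of(name):
    """Lowercase runs of letters, so matches never span a space or hyphen."""
    return re.findall(r"[a-z]+", name.lower())


def matched_ngrams(name, ngrams=COMMON_NGRAMS):
    """Distinct listed combinations that occur somewhere in the name."""
    words = words_of(name)
    return {g for g in ngrams if any(g in w for w in words)}


def efficiency(name, ngrams=COMMON_NGRAMS):
    """Number of distinct matches divided by number of letters."""
    letters = sum(len(w) for w in words_of(name))
    return len(matched_ngrams(name, ngrams)) / letters if letters else 0.0


def greedy_cover(names, ngrams=COMMON_NGRAMS):
    """Repeatedly pick the name covering the most still-uncovered combinations.

    Returns (chosen names, combinations no name could cover).
    """
    remaining = set(ngrams)
    chosen = []
    while remaining and names:
        best = max(names, key=lambda n: len(matched_ngrams(n, ngrams) & remaining))
        gained = matched_ngrams(best, ngrams) & remaining
        if not gained:
            break  # nothing left in the candidate list helps
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


if __name__ == "__main__":
    print(efficiency("Henderson"))
    print(greedy_cover(["Henderson", "Reno", "Elko"]))
```

Question 2 is then just `max(names_in_state, key=efficiency)` run once per state, given a list of names for each state.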
Title: Re: Ideal test toponyms
Post by: CNGL-Leudimin on May 14, 2024, 09:03:24 AM
Not only do I use the set for road-related fictional things, but also for many other things; for example, I have them set as additional places on the weather app :sombrero:. However, a recent event led me to replace one of them, with the result that they are now all geographically in Europe, although I'm not sure if the new entry will stick around.
Title: Re: Ideal test toponyms
Post by: kurumi on May 16, 2024, 12:49:55 PM
This might be a good technical interview question
Title: Re: Ideal test toponyms
Post by: Dirt Roads on May 28, 2024, 04:34:07 PM
Geesh, I miss the days where I could look at a list of facts/figures and quickly solve these types of problems.  "West Virginia" doesn't score well in the list of states and territories with respect to the number of toponyms.  Not surprisingly, "Northern Marianas Islands" ranks at the top with "Washington" (state) coming in at a distant second.

Not sure if the OP would approve, but here is the ranking system that I've used:
This gives quadrigrams a much larger percentage of the total score than using a letter count ranking.

Welcome to West Virginia:

*The town of Webster Springs is officially named Addison. 

Toponyms such as counties, mountains, lakes and parks in West Virginia did not score well.  The top five counties (Mingo County, Pendleton County, Preston County, Randolph County and Wyoming County) only got 5 points each, and sadly the word "County" drove up those scores.  Shavers Mountain was the biggest in this ranking with 9 points, with Shenandoah Mountain, Nathaniel Mountain and Patterson Creek Mountain right behind at 8 points.  Jennings Randolph Lake also dives in with 9 points, and Stony River Reservoir swims away in second with only 7 points.  In addition to Panther State Forest listed above, Watters Smith Memorial State Park comes in at second in the parks category with 9 points.

** I stubbornly refused to include the titles of the state and national parks, forests and rec areas in the count totals.  The term "National Recreation Area" scores quite well all by itself.  That may have been a mistake, since the words "river", "creek", "fork", "mountain" and a few others also added a little bit to the other scores.  For the record, "Spruce Knob-Seneca Rocks National Recreation Area" (49 letters) only picked up 1 point in this system.


Title: Re: Ideal test toponyms
Post by: Scott5114 on May 30, 2024, 11:36:02 PM
Worthington is a good one—ington shows up in a good number of place names.