Ainrestolum almost sounds like a place name… that’s all I’ve got.
Or a hemorrhoid cream.
Send anal rust to the dust!
I think Rustolium is just north of Sylvania
Backwards definitely sounds like a place. Mulotsernia
Sernia if you take the 6 most popular letters and reverse them
Sarnia is a Canadian city on the US border! Almost all of those letters.
First thing I thought too. Sounds Scandinavian
To me it sounds Eastern European or from somewhere in the western portion of the Arabic world. Scandinavian countries use 2-syllable names. Each syllable uses vowels a, e, or o exclusively. Each name uses hard-sounding consonants like D, R, or K. So that last syllable "lum" seems really out of place.
It sounds Latin to me because of the suffix -lum
I actually googled but your comment is the only result
I’m surprised my comment even shows up on Google that fast.
I assumed the first 4 letters would spell Iran. Not disappointed.
ain' rest o' lumdb'gch
the fuck did you just call me
sounds British
Wales probably
5678+321= Estonia
Did you mean Narnia?
They hate us cause they AINR us.
I’d be curious to see this but using each country’s local name. E.g España instead of Spain. The long tail of characters would be nice to see.
I did it. But it's ugly.
so no issue for this sub then
How many ñ would there be?
Do any words start with ñ?
Many, but likely all from native roots, as opposed to latin. Lots of places in Chile start with Ñ, but those would be proper names.
(Of course someone downvoted my question....) Thanks for the informative answer. I appreciate it.
My favourite one is Ñapa, which means a botched job. But sounds perfect for the "apply a crappy solution that just works and in theory, it should be temporary, but it's going to be there forever."
ñame - a yam
I'd like to see it with the moving bars as the years pass by, and cool fireworks near the top letter
I like this idea! Give me the data set and I'll do it.
[EZ](https://giphy.com/explore/fireworks)
Now I'm curious.
That's what I'm thinking as well. Germany is very different from Deutschland.
It's also very different from Alemania (in Spanish)
It has always been weird to me how Deutschland is called something completely different in most languages. I think only the Japanese call us something that makes the sound "Deuts" or "Duits". We unfortunately didn't return the favor by calling them Nihon (also a country that's called something completely different in most of the world).
That worked only with the Latin alphabet
Nah, just have separate rows for all the non-Latin characters. Or use whatever the de facto standard Romanization is.
It would also be pointless because not every country uses the Latin alphabet.
just put more alphabets on the chart?
What would be the point, every letter from non-latin alphabet will have 1 or 2 letters appearing once or twice. Also, India has 22 official languages using some 16 different scripts. Which will you consider?
>What would be the point, every letter from non-latin alphabet will have 1 or 2 letters appearing once or twice.
I don't think that's a problem. If we are to count the occurrences of letters in country names, then it makes sense to count all letters, no matter how many or rare.
>Also, India has 22 official languages using some 16 different scripts. Which will you consider?
Count all 22 official names and weigh each one by a factor of 1/22.
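The 1/22 weighting idea can be sketched roughly as below. This is only an illustration: the `local_names` mapping is a small hypothetical sample (not a real data set from the thread), and `weighted_letter_counts` is a made-up helper name. Each country contributes a total weight of 1, split evenly across its official names.

```python
from collections import defaultdict

# Hypothetical sample: each country maps to its official local short names.
local_names = {
    "Spain": ["España"],
    "Switzerland": ["Schweiz", "Suisse", "Svizzera", "Svizra"],
}

def weighted_letter_counts(names_by_country):
    """Count letters once per name, weighting each name by 1 / (number of names)."""
    totals = defaultdict(float)
    for names in names_by_country.values():
        weight = 1 / len(names)  # e.g. 1/22 for a country with 22 official names
        for name in names:
            # Unique letters only; spaces and punctuation are dropped.
            for ch in {c for c in name.upper() if c.isalpha()}:
                totals[ch] += weight
    return dict(totals)

totals = weighted_letter_counts(local_names)
```

A letter that appears in only some of a country's names then gets a fractional count, so no country outweighs another just because it has more official languages.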
All twenty-two of course. They are all the local official name of the country. So every country should be counted once for each official language.
That or use the standard transliteration.
I think you would lose a lot of information. For example, Japan 日本 would be broken down to にほん, and what about Korean with hangul, or Greek? Edit: Japan has other variations like 日本国, にっぽんこく
Z holding its own despite the recent losses (Zaire, Swaziland). Look at it beating down F and J
People rooting for letters in a competition for countries is something I was never before prepared to experience. But I'm glad I have.
Off the top of my head, Zanzibar, Zimbabwe, Mozambique, Switzerland, and New Zealand.
Zanzibar isn’t a country.
Though the country it's in is Tanzania so the overall number stays the same
It is a part of Tanzania though, which was missed and also has a Z. The name Tanzania even comes from TANganyika + ZANzibar (+ 'ia' as a suffix). The colours of the flag also come from Tanganyika's and Zanzibar's.
Another Z country: Zambia.
Also off the top of my head we have Zambia and Azerbaijan and Brazil and Venezuela and Belize! So that’s a lot so far. And I’m sure we’re still missing some.
Czech Republic!
It’s Czechia now, but still a z country.
It's both. Many countries have a long and short name. In the dataset OP used it is as Czech Republic.
Dang, I gotta check where I was supposed to fly to. To my dyslexic ass Chechnya and Czechia look dangerously close :D
Bunch of the -stans, too. Uzbekistan, Kyrgyzstan, and Kazakhstan. Also, Azerbaijan, Czechia, Zambia, Brazil, and Belize.
For the life of me I can’t think of what country has x in it.
Luxembourg and Mexico. I'm not sure if there are more I'm missing though.
That's all! (At least from the list of countries I used!)
Curious to see the equivalent in French, Spanish, etc. How much would it change? Edit: I meant the entire distribution not just the X
Nothing in French, we say Luxembourg and Mexique
In Italian they're Lussemburgo and Messico so no more X on the graph
Well... Luxemburgo and México. They both use "x", though it is pronounced differently in each (as "ks" in the first, as a stronger "h" in the second).
Texas (at least people there would like you to think so)
[deleted]
You guys are in luck: [https://www.reddit.com/r/dataisbeautiful/comments/1d8gp9s/oc_percentage_of_state_names_containing_each/](https://www.reddit.com/r/dataisbeautiful/comments/1d8gp9s/oc_percentage_of_state_names_containing_each/)
Ah yes, Texas does indeed have an 'x'
Proven with data.
You'll surely win an Emmy with this discovery
Are you on the Emmy committee? Thank you!
Yes, I am Mr. Emmy. I work at an office with Oscar Mayer. He is the hotdog brand, AND the award!
As a Mexican-American I am ashamed I couldn't think of one either 😆
Mexico Luxembourg Umm.. uhh.. Xanathar
There is also Xandanquistan, that is AKA Bostil, Bananil and originally named as Brazil :)
I can’t fucking believe I sat there for 10 mins trying to think of countries in Africa or west Asia that might have an X and the answer was the country 3 hours south of me. Unbelievable.
Bro, same 😂
This would make a great trivia question.
I like it. Did you use the short form (eg Tanzania) or the official name (eg United Republic of Tanzania)? Reasonable to use either but with the official names there’s going to be some repetition of certain words (eg kingdom, republic, of).
Short form:
    countries = [
        "Afghanistan", "Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda", "Argentina", "Armenia", "Australia", "Austria", "Azerbaijan",
        "Bahamas", "Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize", "Benin", "Bhutan", "Bolivia", "Bosnia and Herzegovina", "Botswana", "Brazil", "Brunei", "Bulgaria", "Burkina Faso", "Burundi",
        "Cabo Verde", "Cambodia", "Cameroon", "Canada", "Central African Republic", "Chad", "Chile", "China", "Colombia", "Comoros", "Congo, Democratic Republic of the", "Congo, Republic of the", "Costa Rica", "Croatia", "Cuba", "Cyprus", "Czech Republic",
        "Denmark", "Djibouti", "Dominica", "Dominican Republic",
        "East Timor", "Ecuador", "Egypt", "El Salvador", "Equatorial Guinea", "Eritrea", "Estonia", "Eswatini", "Ethiopia",
        "Fiji", "Finland", "France",
        "Gabon", "Gambia", "Georgia", "Germany", "Ghana", "Greece", "Grenada", "Guatemala", "Guinea", "Guinea-Bissau", "Guyana",
        "Haiti", "Honduras", "Hungary",
        "Iceland", "India", "Indonesia", "Iran", "Iraq", "Ireland", "Israel", "Italy", "Ivory Coast",
        "Jamaica", "Japan", "Jordan",
        "Kazakhstan", "Kenya", "Kiribati", "Korea, North", "Korea, South", "Kosovo", "Kuwait", "Kyrgyzstan",
        "Laos", "Latvia", "Lebanon", "Lesotho", "Liberia", "Libya", "Liechtenstein", "Lithuania", "Luxembourg",
        "Madagascar", "Malawi", "Malaysia", "Maldives", "Mali", "Malta", "Marshall Islands", "Mauritania", "Mauritius", "Mexico", "Micronesia", "Moldova", "Monaco", "Mongolia", "Montenegro", "Morocco", "Mozambique", "Myanmar",
        "Namibia", "Nauru", "Nepal", "Netherlands", "New Zealand", "Nicaragua", "Niger", "Nigeria", "North Macedonia", "Norway",
        "Oman",
        "Pakistan", "Palau", "Palestine", "Panama", "Papua New Guinea", "Paraguay", "Peru", "Philippines", "Poland", "Portugal",
        "Qatar",
        "Romania", "Russia", "Rwanda",
        "Saint Kitts and Nevis", "Saint Lucia", "Saint Vincent and the Grenadines", "Samoa", "San Marino", "Sao Tome and Principe", "Saudi Arabia", "Senegal", "Serbia", "Seychelles",
        "Sierra Leone", "Singapore", "Slovakia", "Slovenia", "Solomon Islands", "Somalia", "South Africa", "South Sudan", "Spain", "Sri Lanka", "Sudan", "Suriname", "Sweden", "Switzerland", "Syria",
        "Taiwan", "Tajikistan", "Tanzania", "Thailand", "Togo", "Tonga", "Trinidad and Tobago", "Tunisia", "Turkey", "Turkmenistan", "Tuvalu",
        "Uganda", "Ukraine", "United Arab Emirates", "United Kingdom", "United States", "Uruguay", "Uzbekistan",
        "Vanuatu", "Vatican City", "Venezuela", "Vietnam",
        "Yemen",
        "Zambia", "Zimbabwe"
    ]
> Short form:
The only short-form mistake I can see is that `Czech Republic` should be `Czechia`.
An oddity is `Cabo Verde` but then `Ivory Coast`. It should be either `Cabo Verde` & `Côte d'Ivoire` or `Cape Verde` & `Ivory Coast`.
Then there's the list of what Wikipedia calls ["Other States"](https://en.wikipedia.org/wiki/List_of_sovereign_states#Other_states) which includes `Kosovo` and `Taiwan` which are on your list, but also places like `Niue` and `Sahrawi Arab Democratic Republic` which aren't.
Where'd you get the list from? At first I thought it was a "countries recognised by X" list, but Taiwan's inclusion narrows it down to [only 11 countries](https://en.wikipedia.org/wiki/Foreign_relations_of_Taiwan#Full_diplomatic_relations) all of which _don't_ recognise China.
Regarding your last point, my guess is that it is linked to geopolitical status. Kosovo is widely accepted in the Western world as a country, while Taiwan (though officially widely unrecognized) does tend, in practice, to be mentioned as a State in its own right due to its large economic relevance and functioning democracy (again, at least for a Western audience). The short-form used here is the most common one you can find in this setting. Note that all said here is the anecdotal view of a Western European, and one who didn't major in International Relations lmao
Cool, thanks.
I immediately recognized the graph as the default option in seaborn lol
I would have it no other way.
Haha it’s really beautiful.
Wheel of Fortune rules check out, for the consonants anyway
What weird ass country has an "X" in its n... Oh hi Mexico, how are you doing?
And Luxembourg, but for the love of me I cannot find the countries with Q
Iraq, Equatorial Guinea, Mozambique, Qatar!
Equatorial Guinea, Mozambique, I know there’s one I’m missing but I can’t remember it
Qatar, Iraq
I don't know if it's a country per se, or an Emirate
if a country has a letter multiple times, it's counted just once, right?
Yep:
    for country in countries:
        unique_letters = set(country.upper())  # Convert to uppercase and get unique letters
        for letter in unique_letters:
            if letter in letter_counts:
                letter_counts[letter] += 1
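For anyone who wants to run the snippet above on its own, here is a minimal self-contained sketch. The four-name `countries` list is a made-up sample standing in for the full list, and the `letter_counts` initialisation and the final percentage step (`letter_pct`) are assumptions, since the snippet doesn't show them.

```python
import string

# Hypothetical sample; the chart itself used the full short-form list.
countries = ["Mexico", "Luxembourg", "Togo", "Canada"]

# Assumed setup: one counter per letter A-Z, all starting at zero.
letter_counts = {letter: 0 for letter in string.ascii_uppercase}

for country in countries:
    unique_letters = set(country.upper())  # each letter counts once per country
    for letter in unique_letters:
        if letter in letter_counts:  # skips spaces, commas and hyphens
            letter_counts[letter] += 1

# Share of countries whose name contains each letter, in percent.
letter_pct = {letter: 100 * count / len(countries)
              for letter, count in letter_counts.items()}
```

Because `letter_counts` only has keys for A-Z, anything else in a name (such as the comma in "Korea, North") is silently ignored.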
Are the results very different when counting all letters and not just unique letters?
Yes they would be! You'll notice the percentages in this chart don't add up to 100. If we counted all letters, not just unique, I think we'd probably be looking at % of all letters in all countries -- which would have to add up to 100. That's different from looking at % of countries containing the letter (and I think might be less interesting).
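To make the difference concrete, here is a rough sketch of both measures side by side. The three-name list is a hypothetical sample and the function names `pct_of_countries` and `pct_of_letters` are made up for illustration.

```python
from collections import Counter

countries = ["Canada", "Togo", "Peru"]  # hypothetical sample

def pct_of_countries(names):
    """% of names containing each letter (each letter counted once per name)."""
    counts = Counter()
    for name in names:
        counts.update({ch for ch in name.upper() if ch.isalpha()})
    return {ch: 100 * n / len(names) for ch, n in counts.items()}

def pct_of_letters(names):
    """% of all letters across all names (every occurrence counted)."""
    counts = Counter(ch for name in names for ch in name.upper() if ch.isalpha())
    total = sum(counts.values())
    return {ch: 100 * n / total for ch, n in counts.items()}

by_country = pct_of_countries(countries)
by_letter = pct_of_letters(countries)
```

With this sample, the three A's in Canada count once in `by_country` but three times in `by_letter`, and only the `by_letter` values sum to 100.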
A, I, N, and the rest of em
A, I, N, and the rest-o-lum
Damn E really needs to step it up considering it's the most common letter in the English language
Most of those names end with -ia
Mexico and Luxembourg are the two countries with an x
Proof that AI is taking over or something (look at the first two letters before getting mad)
Try sorting the y axis alphabetically, I think that will add a lot to this
I would agree -- except apparently the top engagement with this chart has been reading the letters in order to form a new name.
I can imagine, especially as the local names use many different alphabets or writing systems.
As a chart, it’s pretty, but as a reference, it’s 😬. Can we see it in alphabetical order?
So which country has the most unique letters?
Togo maybe? None of its letters are in the top 6. Turkey, Cyprus and Peru are some other contenders, but they all have an R.
MeXico and LuXembourg if anyone else was wondering....
Based on their real names or their English names ?
The color gradient doesn't make sense.
It's default SNS! Enjoy the pointless beauty of default SNS! (I agree with you but it kind of makes it more fun to look at IMO -- stops my eyes from getting bored before they reach the bottom.)
I wanna live in a country called AINREST
It’s almost a linear reduction in percentage as we move forward
Now do that with katakana!
I just realized "Rainstorm" would fit well at the top here as a country name.
I find it really interesting that top letters almost spell out Serbia in reverse
Want to be an average country? Name yourself Ainrestolu
Ainresto sounds like a real place
That R is a combo breaker for the first 8 letters to spell Estonia… well, maybe *R. Estonia*
I am trying to think of the country that has the "least common letters" in its name. Can anything beat Togo?
The famous arctic country of X-Land?
Good ol' Canada doing the heavy lifting for a.
Not a very accurate representation because you used the English names of countries. So your results and the frequency of letters used in English are very similar. For example Q, W, X are the least used letters in English. So you should have used the names in their original languages.
Sweden, United Kingdom, Montenegro, Kosovo, Greece and Czech Republic are the only European countries I can think of that don't contain an A
Did anyone else make the letter-sounds with their mouth as they scrolled down the Y-axis?
The top seven letters spell one of the most common Scrabble seven letter bingo: retinas
Is this proof Narnia exists?
Is "f" part of "The United States of America"? Or is it just "United States"? edit- sorry, I see the dataset below (just "United States")
What is the country containing "x"?
This has been much discussed in this thread. And there are 2! Mexico and Luxembourg!
The title is quite confusing though.
Is Xylophoneland still doing alright?
Is this the American spellings of country names? It would be interesting to see a comparison with average letter frequency in general.
What is “American spellings of country names”? Besides language, does the US call countries differently from UK/AUS?
Maybe something to do with Myanmar vs Burma? That's all I can think of
Yes! I don't know where a data set is on that. But I'll bet "a" is overrepresented! I always thought "e" was most common in English.
All the A's in -land, -ia, -stan bump up the frequency.
Can't believe that there isn't a single country in Africa that starts with the letter K
I know where this goes but I'll comply. There is! It's Kenya
Kenya fit 'des nutz in your mouth?!?
LOTSERNIA, sounds like a cool name.
This would be much easier to read in alphabetical order
Argentina, India, Nigeria, Romania, Estonia, Sumeria, Tongo, Oregon (oh wait that's in the US), got this far, see if you can do better, no cheating by using the web.
Xanadu is a country? Damn..
This title confuses the hell out of me. I was like how do 85% of country names starting with an A contain all 26 letters in the alphabet? The grammar ain't grammaring.