update dictionary + genc fixes
vincentarelbundock committed Oct 1, 2024
1 parent b4d17e6 commit 9191a32
Showing 6 changed files with 4 additions and 27 deletions.
1 change: 1 addition & 0 deletions NEWS.md
@@ -3,6 +3,7 @@
## development

* The `simplify` argument in `apply` was introduced in R 4.1.0. We used it, which broke usage of countrycode on older versions of R.
+* `genc` code improvements.

## countrycode 1.6.0

Binary file modified data/codelist.rda
Binary file not shown.
Binary file modified data/codelist_panel.rda
Binary file not shown.
2 changes: 1 addition & 1 deletion dictionary/codelist_without_cldr.csv
@@ -279,7 +279,7 @@ OECD1990,.va,Europe,Vatikanstaat,vatikan|heilig.*stuhl,Vatican City,holy.?see|va
LAM,.ve,Americas,Venezuela,venezuela,Venezuela,venezuela,Venezuela,venezuela|vénézuélien,Venezuela,venezuela,Venezuela,VEN,101,Bolívar Soberano,,VE,,Southern America,South-Atlantic,VE,236,VE,263,VE,VEN,862,VEN,101,CARSAM,299,VEN,Venezuela (Bolivarian Republic of),Venezuela (République bolivarienne du),VE,VEN,862,VED,926,Venezuela,VEN,101,VEN,101,Latin America & Caribbean,South America,862,فنزويلا (جمهورية - البوليفارية),Venezuela (Bolivarian Republic of),Venezuela (República Bolivariana de),Venezuela (République bolivarienne du),Венесуэла (Боливарианская Республика),委内瑞拉(玻利瓦尔共和国),19,Americas,5,South America,419,Latin America and the Caribbean,VEN,The Americas,🇻🇪,862,51,Venezuela,VEN,VE,VEN,862
ASIA,.vn,Asia,Vietnam,^(?!.*republik).*viet.?nam|^(?=.*sozialist).*viet.?nam,Vietnam,^(?!south)(?!republic).*viet.?nam(?!.*south)|democratic.republic.of.vietnam|socialist.republic.of.viet.?nam|north.viet.?nam|viet.?nam.north,Vietnam,^(?!.*republique).*viet.?nam|^(?=.*socialist).*viet.?nam,Vietnam,^(?!.*republik).*viet.?nam|^(?=.*sozialist).*viet.?nam,Vietnam,DRV,816,Dong,VN,VN,,Asia,Asia/Pacific,VN,237,VM,264,VN,VNM,704,RVN,817,ASIAPAC,582,VIE,Viet Nam,Viet Nam (le),VN,VNM,704,VND,704,Vietnam,VIE,818,VIE,818,East Asia & Pacific,South-Eastern Asia,704,فييت نام,Viet Nam,Viet Nam,Viet Nam,Вьетнам,越南,142,Asia,,,35,South-eastern Asia,SRV,Asia and the Pacific,🇻🇳,704,34,Vietnam,VNM,VN,VNM,704
ASIA,.wf,Oceania,Wallis und Futuna,futuna|wallis,Wallis & Futuna,futuna|wallis,Wallis-et-Futuna,futuna|wallis,Wallis e Futuna,futuna|wallis,,,,CFP Franc,,WF,,Asia,Asia/Pacific,WF,243,WF,266,WF,WLF,876,,,,,,Wallis and Futuna,Wallis-et-Futuna,WF,WLF,876,XPF,953,,,,,,,Polynesia,876,,,,,,,9,Oceania,,,61,Polynesia,,,🇼🇫,876,,,,,,
-MAF,.eh,Africa,Westsahara,west.*sahara,Western Sahara,western.sahara,Sahara occidental,west.*sahara|sahara.*(occidental|ouest),Sahara occidentale,west.*sahara|sahara.*occidentale,,,,Moroccan Dirham,,EH,,Africa,Southern Africa,EH,205,WI,268,EH,ESH,732,,,AFI,,,Western Sahara,Sahara occidental (le),EH,ESH,732,MAD,504,,,,,,,Northern Africa,732,,,,,,,2,Africa,,,15,Northern Africa,WSH,Middle East and North Africa,🇪🇭,732,,,,,,
+MAF,.eh,Africa,Westsahara,west.*sahara,Western Sahara,western.sahara,Sahara occidental,west.*sahara|sahara.*(occidental|ouest),Sahara occidentale,west.*sahara|sahara.*occidentale,,,,Moroccan Dirham,,EH,,Africa,Southern Africa,EH,205,WI,268,,,,,,AFI,,,Western Sahara,Sahara occidental (le),EH,ESH,732,MAD,504,,,,,,,Northern Africa,732,,,,,,,2,Africa,,,15,Northern Africa,WSH,Middle East and North Africa,🇪🇭,732,,,,,,
,,,Württemberg,w(ue|ü)rttemberg,Wuerttemburg,w(ue|ü)rttemburg,,w.rt.?emberg,,w.rt.?emberg,Wuerttemburg,WRT,271,,,,,,,,,,,,,,,,,,,,,,,,,,Wuerttemburg,WRT,271,WRT,271,Europe & Central Asia,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,Würtemberg,w.rtemberg,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Europe & Central Asia,,,,,,,,,,,,,,,,,,,355,Würtemberg,,,,
MAF,.ye,Asia,Jemen,^(?!.*arab)(?!.*nord)(?!.*sana)(?!.*peo)(?!.*dem)(?!.*süd)(?!.*aden)(?!.*\bp\.?d\.?r).*jemen,Yemen,^(?!.*arab)(?!.*north)(?!.*sana)(?!.*peo)(?!.*dem)(?!.*south)(?!.*aden)(?!.*\bp\.?d\.?r).*yemen,Yémen,y(e|é)men,Yemen,yemen,Yemen,YEM,679,Yemeni Rial,YE,YE,,Middle East,Middle-East,YE,249,YM,269,YE,YEM,887,,,MID,474,YEM,Yemen,Yémen (le),YE,YEM,887,YER,886,Yemen,YEM,679,YEM,679,Middle East & North Africa,Western Asia,887,اليمن,Yemen,Yemen,Yémen,Йемен,也门,142,Asia,,,145,Western Asia,YEM,Middle East and North Africa,🇾🇪,887,14,Yemen,YEM,YE,YEM,887
18 changes: 0 additions & 18 deletions dictionary/data_genc.csv
@@ -1,8 +1,5 @@
country,genc2c,genc3c,genc3n
-ABU MUSA AND TUNB ISLANDS,AA,XAA,935
-ABYEI,QN,XQN,936
AFGHANISTAN,AF,AFG,004
-AKSAI CHIN AND OTHER AREAS,QO,XQO,937
ALBANIA,AL,ALB,008
ALGERIA,DZ,DZA,012
AMERICAN SAMOA,AS,ASM,016
@@ -32,7 +29,6 @@ BOSNIA AND HERZEGOVINA,BA,BIH,070
BOTSWANA,BW,BWA,072
BOUVET ISLAND,BV,BVT,074
BRAZIL,BR,BRA,076
-BRAZILIAN ISLAND,QP,XQP,938
BRITISH INDIAN OCEAN TERRITORY,IO,IOT,086
BRUNEI,BN,BRN,096
BULGARIA,BG,BGR,100
@@ -53,12 +49,9 @@ CHRISTMAS ISLAND,CX,CXR,162
COCOS (KEELING) ISLANDS,CC,CCK,166
COLOMBIA,CO,COL,170
COMOROS,KM,COM,174
-CONEJO ISLAND,QQ,XQQ,939
CONGO (BRAZZAVILLE),CG,COG,178
CONGO (KINSHASA),CD,COD,180
-CONGO RIVER ISLANDS,QR,XQR,940
COOK ISLANDS,CK,COK,184
-CORISCO BAY ISLANDS,QT,XQT,941
COSTA RICA,CR,CRI,188
CÔTE D’IVOIRE,CI,CIV,384
CROATIA,HR,HRV,191
@@ -70,8 +63,6 @@ DENMARK,DK,DNK,208
DJIBOUTI,DJ,DJI,262
DOMINICA,DM,DMA,212
DOMINICAN REPUBLIC,DO,DOM,214
-DOUMEIRA ISLANDS,QV,XQV,942
-DRAMANA AND SHAKHATOE,QY,XQY,943
ECUADOR,EC,ECU,218
EGYPT,EG,EGY,818
EL SALVADOR,SV,SLV,222
@@ -92,7 +83,6 @@ GABON,GA,GAB,266
"GAMBIA, THE",GM,GMB,270
GEORGIA,GE,GEO,268
GERMANY,DE,DEU,276
-GEYSER REEF,XF,XXF,944
GHANA,GH,GHA,288
GIBRALTAR,GI,GIB,292
GREECE,GR,GRC,300
@@ -106,9 +96,7 @@ GUINEA,GN,GIN,324
GUINEA-BISSAU,GW,GNB,624
GUYANA,GY,GUY,328
HAITI,HT,HTI,332
-HANS ISLAND,XI,XXI,945
HEARD ISLAND AND MCDONALD ISLANDS,HM,HMD,334
-HEIPETHES ISLANDS,XN,XXN,946
HONDURAS,HN,HND,340
HONG KONG,HK,HKG,344
HUNGARY,HU,HUN,348
@@ -125,11 +113,9 @@ JAMAICA,JM,JAM,388
JAPAN,JP,JPN,392
JERSEY,JE,JEY,832
JORDAN,JO,JOR,400
-KALAPANI,XO,XXO,947
KAZAKHSTAN,KZ,KAZ,398
KENYA,KE,KEN,404
KIRIBATI,KI,KIR,296
-KOALOU / KOUROU,XX,XXX,948
"KOREA, NORTH",KP,PRK,408
"KOREA, SOUTH",KR,KOR,410
KOSOVO,XK,XKS,901
@@ -158,7 +144,6 @@ MAURITIUS,MU,MUS,480
MAYOTTE,YT,MYT,175
MEXICO,MX,MEX,484
"MICRONESIA, FEDERATED STATES OF",FM,FSM,583
-MINERVA REEFS,XY,XXY,949
MOLDOVA,MD,MDA,498
MONACO,MC,MCO,492
MONGOLIA,MN,MNG,496
@@ -186,7 +171,6 @@ PALAU,PW,PLW,585
PANAMA,PA,PAN,591
PAPUA NEW GUINEA,PG,PNG,598
PARAGUAY,PY,PRY,600
-PARSLEY ISLAND,XZ,XXZ,950
PERU,PE,PER,604
PHILIPPINES,PH,PHL,608
PITCAIRN ISLANDS,PN,PCN,612
@@ -207,12 +191,10 @@ SAINT VINCENT AND THE GRENADINES,VC,VCT,670
SAMOA,WS,WSM,882
SAN MARINO,SM,SMR,674
SAO TOME AND PRINCIPE,ST,STP,678
-SAPODILLA CAYES,ZZ,XZZ,951
SAUDI ARABIA,SA,SAU,682
SENEGAL,SN,SEN,686
SERBIA,RS,SRB,688
SEYCHELLES,SC,SYC,690
-SIACHEN,SQ,XSQ,952
SIERRA LEONE,SL,SLE,694
SINGAPORE,SG,SGP,702
SINT MAARTEN,SX,SXM,534
10 changes: 2 additions & 8 deletions dictionary/get_genc.R
@@ -10,14 +10,8 @@ tmp <- tempfile()
httr::GET(url, write_disk(tmp, overwrite=TRUE))
genc <- readxl::read_excel(tmp, sheet = 'Codes_for_GE_Names', skip = 2)

-bad <- c('AKROTIRI', 'ASHMORE AND CARTIER ISLANDS', 'BAKER ISLAND',
-'CLIPPERTON ISLAND', 'CORAL SEA ISLANDS', 'DHEKELIA', 'DIEGO GARCIA',
-'ENTITY 1', 'ENTITY 2', 'ENTITY 3', 'ENTITY 4', 'ENTITY 5', 'ENTITY 6',
-'EUROPA ISLAND', 'GLORIOSO ISLANDS', 'GUANTANAMO BAY NAVAL BASE',
-'HOWLAND ISLAND', 'JAN MAYEN', 'JARVIS ISLAND', 'JOHNSTON ATOLL',
-'JUAN DE NOVA ISLAND', 'KINGMAN REEF', 'MIDWAY ISLANDS',
-'NAVASSA ISLAND', 'PALMYRA ATOLL', 'PARACEL ISLANDS', 'SAINT MARTIN',
-'SPRATLY ISLANDS', 'TROMELIN ISLAND', 'UNKNOWN', 'WAKE ISLAND', 'BASSAS DA INDIA', 'GAZA STRIP')
+bad <- c('AKROTIRI', 'ASHMORE AND CARTIER ISLANDS', 'BAKER ISLAND', 'CLIPPERTON ISLAND', 'CORAL SEA ISLANDS', 'DHEKELIA', 'DIEGO GARCIA', 'ENTITY 1', 'ENTITY 2', 'ENTITY 3', 'ENTITY 4', 'ENTITY 5', 'ENTITY 6', 'EUROPA ISLAND', 'GLORIOSO ISLANDS', 'GUANTANAMO BAY NAVAL BASE', 'HOWLAND ISLAND', 'JAN MAYEN', 'JARVIS ISLAND', 'JOHNSTON ATOLL', 'JUAN DE NOVA ISLAND', 'KINGMAN REEF', 'MIDWAY ISLANDS', 'NAVASSA ISLAND', 'PALMYRA ATOLL', 'PARACEL ISLANDS', 'SAINT MARTIN', 'SPRATLY ISLANDS', 'TROMELIN ISLAND', 'UNKNOWN', 'WAKE ISLAND', 'BASSAS DA INDIA', 'GAZA STRIP', "ABU MUSA AND TUNB ISLANDS", "ABYEI", "AKSAI CHIN AND OTHER AREAS", "CONEJO ISLAND", "CORISCO BAY ISLANDS", "DOUMEIRA ISLANDS", "DRAMANA AND SHAKHATOE", "GEYSER REEF", "HANS ISLAND", "HEIPETHES ISLANDS", "KALAPANI", "KOALOU / KOUROU", "MINERVA REEFS", "PARSLEY ISLAND", "SAPODILLA CAYES", "SIACHEN", "BRAZILIAN ISLAND", "CONGO RIVER ISLANDS")
+
genc <-
genc %>%
dplyr::select(country = Name, genc2c = `2-character Code`,