File format #2

Open
IohannesIohannium opened this issue Nov 30, 2021 · 5 comments

What the dictionary file should look like (example table):

| English | en_article | en_class | en_adj | French | fr_article | fr_class | fr_adj | German | de_article | de_class | de_adj | Spanish | es_article | es_class | es_adj | Russian | ru_class | ru_adj | Polish | pl_class | pl_adj | pl_adj_a | Korean | Chinese | Czech | cz_class | cz_adj | cz_adj_a | Italian | it_article | it_class | it_adj | Hungarian | hu_class | Dutch | nl_article | nl_class | nl_adj | Portuguese | pt_article | pt_class | pt_adj |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| England | | en_s | English | Angleterre | L' | fr_fsav | anglaise | England | | de_n | englisch | Inglaterra | La | es_fsa | inglesa | Англия | ru_f | английск | Anglia | pl_fs | angielsk | angielsc | 잉글랜드 | 英国 | Anglie | cz_fs | anglick | angličt | Inghilterra | L' | it_fsav | inglese | Anglia | hu_s | Engeland | | nl_n | Engels | Inglaterra | | pt_fsn | inglesa |
| United States | The | en_pl | American | États Unis | Les | fr_mpav | américaine | Vereinigte Staaten | Die | de_pl | amerikanisch | Estados Unidos | Los | es_mpa | americana | Соединённые Шта́ты | ru_pl | американск | Stany Zjednoczone | pl_mp | amerykańsk | amerykańsc | 미국 | 美国 | Spojené Státy | cz_mp | americk | američt | Stati Uniti | Gli | it_mpac | americana | Egyesült Államok | hu_p | Verenigde Staten | De | nl_pl | Amerikaans | Estados Unidos | Os | pt_mpa | americana |
| Chicago | | en_s | Chicagoan | Chicago | | fr_fsn | chicagoise | Chicago | | de_n | chicagoerisch | Chicago | | es_fsn | chicagüense | Чикаго | ru_m | чикагск | Chicago | pl_ns | chicagowsk | chicagowsc | 시카고 | 芝加哥 | Chicago | cz_ns | chicagsk | chicagšk | Chicago | | it_fsn | cicaghese | Chicago | hu_s | Chicago | | nl_n | Chicagoer | Chicago | | pt_fsn | chicaguense |

Explanation:

  • (Language name): The polity's localized name in that language
  • _article: The article used with the country name, space included ("The ", not "The")
  • _class: Grammar category. Starts with the two-letter language code (e.g. en), followed by optional flags for gender (m, f, n), number (s, p/pl), whether an article is used (a, n), and whether the name starts with a vowel or a consonant (v, c)
  • _adj: The adjective this country uses. Notes:
    • French, Spanish, Italian, Portuguese: feminine singular form used
    • Russian, German, Dutch: form devoid of ending used
    • Polish, Czech: form devoid of ending used, with a separate _adj_a field holding the stem with the consonant change of the masculine nominative plural
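
A minimal sketch of how a _class value could be decoded, assuming the flags always appear in the order gender, number, article, onset, that a leading n means neuter rather than "no article", and that pl is a standalone plural marker. The names GrammarClass and parse_class are hypothetical and not part of the converter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GrammarClass:
    lang: str
    gender: Optional[str] = None    # "m", "f" or "n"
    number: Optional[str] = None    # "s" or "p"
    article: Optional[bool] = None  # True for "a", False for "n"
    onset: Optional[str] = None     # "v" (vowel) or "c" (consonant)


def parse_class(code: str) -> GrammarClass:
    """Decode a class such as 'fr_fsav' (assumed flag order, see above)."""
    lang, _, flags = code.partition("_")
    result = GrammarClass(lang=lang)
    if flags == "pl":  # standalone plural marker, as in en_pl or de_pl
        result.number = "p"
        return result
    rest = flags
    if rest and rest[0] in "mfn":  # assumption: a leading "n" is neuter
        result.gender, rest = rest[0], rest[1:]
    if rest and rest[0] in "sp":
        result.number, rest = rest[0], rest[1:]
    if rest and rest[0] in "an":
        result.article, rest = rest[0] == "a", rest[1:]
    if rest and rest[0] in "vc":
        result.onset, rest = rest[0], rest[1:]
    if rest:
        raise ValueError(f"unrecognized flags {rest!r} in {code!r}")
    return result
```

With these assumptions, parse_class("fr_fsav") yields feminine, singular, article used, vowel onset, while parse_class("de_n") yields only the neuter gender.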

IohannesIohannium commented Nov 30, 2021

Outputs for CK3

  • ⟨title⟩: ⟨Language⟩
  • ⟨title⟩_adj: ⟨lang⟩_adj
  • ⟨title⟩_article: ⟨lang⟩_article (for languages where this exists)

Outputs for EU4

  • ⟨tag⟩: ⟨lang⟩_article⟨Language⟩
  • ⟨tag⟩_ADJ: ⟨lang⟩_adj
  • As a country flag in file history, output all ⟨lang⟩_classes

Outputs for Vic2

  • ⟨tag⟩: ⟨lang⟩_article⟨Language⟩
  • ⟨tag⟩_ADJ: ⟨lang⟩_adj

Outputs for HoI4

  • ⟨tag⟩_⟨ideology⟩: ⟨Language⟩
  • ⟨tag⟩_⟨ideology⟩_DEF: ⟨lang⟩_article⟨Language⟩
  • ⟨tag⟩_⟨ideology⟩_ADJ: ⟨lang⟩_adj
  • As a country flag in file history, output all ⟨lang⟩_classes
  • For Polish localization only, ⟨tag⟩_⟨ideology⟩_ADJ_PL: ⟨lang⟩_adj_a, along with a (yet to be defined) special way of triggering the custom loc
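
The key layouts above can be sketched as one function per row. This is only an illustration: loc_entries and its parameters are hypothetical names, the country flags and the Polish _ADJ_PL key are left out, and the ideology value used in the example is an assumption.

```python
from typing import Dict, Optional


def loc_entries(game: str, key: str, name: str, adj: str,
                article: str = "", ideology: Optional[str] = None) -> Dict[str, str]:
    """Build the localisation keys for one polity in one language."""
    if game == "CK3":
        out = {key: name, f"{key}_adj": adj}
        if article:  # only for languages where an article exists
            out[f"{key}_article"] = article
        return out
    if game in ("EU4", "Vic2"):
        # The article already carries its trailing space ("The ").
        return {key: article + name, f"{key}_ADJ": adj}
    if game == "HoI4":
        if ideology is None:
            raise ValueError("HoI4 keys need an ideology")
        base = f"{key}_{ideology}"
        return {base: name, f"{base}_DEF": article + name, f"{base}_ADJ": adj}
    raise ValueError(f"unknown game {game!r}")


print(loc_entries("EU4", "USA", "United States", "American", "The "))
print(loc_entries("HoI4", "USA", "United States", "American", "The ", "democratic"))
```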

IohannesIohannium commented Nov 30, 2021

Special things:

  • hu_adj is set to be ⟨Hungarian⟩i, in all cases
  • ko_adj is set to be ⟨Korean⟩ in all cases, zh_adj to be ⟨Chinese⟩
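
As a small sketch, these derived adjectives could be computed instead of stored. The function name derived_adj is hypothetical, and the Hungarian rule is applied literally as stated (name plus i) with no further spelling adjustments.

```python
from typing import Optional


def derived_adj(lang: str, name: str) -> Optional[str]:
    """Return the adjective for languages that have no _adj column."""
    if lang == "hu":
        return name + "i"  # ⟨Hungarian⟩i, in all cases
    if lang in ("ko", "zh"):
        return name        # the adjective equals the name
    return None            # other languages read their own _adj column


print(derived_adj("hu", "Anglia"))  # Angliai
```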

How the dictionary file should work:

  • Converter looks up English localization for a polity (tag, title, etc.)
  • IF the game is EU4 or Vic2, it'll try to match it to a composed entry of en_article English (e.g. The United States)
  • ELSE it'll try to match it to an entry in English (e.g. United States)
  • THEN, it'll treat the polity as matching that row and output the row's localisation, even if that overrides any other input
    In other words, if a game localises "United States" in French as "EE.UU.", the converter will disregard that and output "États Unis" anyway.

Matching is case-insensitive, diacritic-sensitive.
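
A minimal sketch of the lookup described above, assuming each dictionary row is a mapping keyed by column name. The names norm and find_row are hypothetical. Normalizing to NFC before comparing is an added assumption, so that the same accented letter matches whether it is stored composed or decomposed; accents are never stripped, which keeps the match diacritic-sensitive.

```python
import unicodedata
from typing import Dict, List, Optional


def norm(text: str) -> str:
    # Case-insensitive but diacritic-sensitive comparison key.
    return unicodedata.normalize("NFC", text).casefold()


def find_row(rows: List[Dict[str, str]], english_loc: str,
             game: str) -> Optional[Dict[str, str]]:
    wanted = norm(english_loc)
    for row in rows:
        if game in ("EU4", "Vic2"):
            # These games are matched against article + name.
            candidate = row.get("en_article", "") + row["English"]
        else:
            candidate = row["English"]
        if norm(candidate) == wanted:
            return row
    return None


rows = [
    {"English": "England", "en_article": ""},
    {"English": "United States", "en_article": "The "},
]
print(find_row(rows, "the united states", "EU4"))
print(find_row(rows, "UNITED STATES", "HoI4"))
```

Under these assumptions, "United States" without the article does not match in EU4 or Vic2, because those games are compared against the composed form.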

Output will be restricted to the languages a game actually has (the Vic2 Finnish column will be skipped because Finnish is impossible to handle) and will not overwrite vanilla localisations: if a game already localizes RUS_ADJ, the converter will NOT output RUS_ADJ.
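
The two restrictions can be sketched as a filter over the generated entries. The function name filter_output and the shape of its arguments (per-language dictionaries of key to text, and per-language sets of vanilla keys) are assumptions made for illustration.

```python
from typing import Dict, Set


def filter_output(generated: Dict[str, Dict[str, str]],
                  game_languages: Set[str],
                  vanilla_keys: Dict[str, Set[str]]) -> Dict[str, Dict[str, str]]:
    """Drop unsupported languages and keys the game already localizes."""
    kept: Dict[str, Dict[str, str]] = {}
    for lang, entries in generated.items():
        if lang not in game_languages:
            continue  # the game has no such language
        existing = vanilla_keys.get(lang, set())
        kept[lang] = {k: v for k, v in entries.items() if k not in existing}
    return kept


generated = {
    "french": {"RUS": "Russie", "RUS_ADJ": "russe"},
    "finnish": {"RUS": "Venäjä"},
}
print(filter_output(generated, {"english", "french"}, {"french": {"RUS_ADJ"}}))
```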

@IohannesIohannium

Grammar category will be used to define:

  • Adjective agreement (La France est grande / Les États Unis sont grands)
  • Verb agreement (La France est grande / Les États Unis sont grands)
  • Output of articles in games where the name does not naturally have them in the polity's bare localization (La France est grande / Les États Unis sont grands)
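
A hedged sketch of how one French category could drive all three choices at once. It assumes the class always carries gender and number as its first two flags, uses fr_fsac for France by analogy with the Géorgie example later in this thread, and hard-codes the adjective grand. The function name describe_as_big is hypothetical.

```python
def describe_as_big(name: str, fr_class: str) -> str:
    """Build '<article><name> est/sont grand(e)(s)' from a French class."""
    flags = fr_class.split("_", 1)[1]      # "fr_fsac" -> "fsac"
    gender, number = flags[0], flags[1]    # assumed to be always present
    uses_article = len(flags) > 2 and flags[2] == "a"
    starts_with_vowel = flags.endswith("v")

    if not uses_article:
        article = ""
    elif number == "p":
        article = "Les "
    elif starts_with_vowel:
        article = "L'"
    else:
        article = "La " if gender == "f" else "Le "

    verb = "sont" if number == "p" else "est"
    adjective = "grand" + ("e" if gender == "f" else "") + ("s" if number == "p" else "")
    return f"{article}{name} {verb} {adjective}"


print(describe_as_big("France", "fr_fsac"))      # La France est grande
print(describe_as_big("États Unis", "fr_mpav"))  # Les États Unis sont grands
```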

All other grammar shenanigans, such as adjective declension (français, française, ...) and declension of the polity name (Франция, Франции, ...), are to be handled separately through the respective adj_rules.txt and decl_rules.txt files.

@IohannesIohannium

As an update, most of the separate article columns have been removed, as articles can be inferred from the categories anyway.
Czech, Hungarian and Dutch have also been dropped.

@IohannesIohannium

DROPPED: now resorting to simply giving the converter grammatical categories for each string (e.g. telling the converter that the French string Géorgie is in category fsac, that the Spanish string Puerto Rico is in category msn, etc.)
