Hana - A language in which all syntax and commands are expressed in Korean and translated to Python-style code
Please cd 1_LexicalAnalyzer or go to the 1_LexicalAnalyzer directory for the Readme on the Lexer.
Please cd 2_SyntaticAnalyzer or go to the 2_SyntaticAnalyzer directory for the Readme on the Parser.
Please cd 3_CodeGen or go to the 3_CodeGen directory for the Readme on Code Generation.
These tokens correspond to the following keywords in Python:
We allow identifiers to consist of valid Hangul characters, digits, and underscores. Digits must be preceded by an underscore. Identifiers may include Hangul Jamo, Hangul Compatibility Jamo, and pre-composed Hangul syllables. The Unicode ranges for each of these character types are:
- \u1100-\u11FF: Hangul Jamo (used for composing Hangul syllables).
- \u3130-\u318F: Hangul Compatibility Jamo (used for compatibility with older encodings).
- \uAC00-\uD7AF: Pre-composed Hangul syllables (the most commonly used Korean characters in modern texts).
We also allow:
- Digits (0-9) only after an underscore.
- Underscore (_) only between two valid Hangul characters or digits.
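The identifier rules above can be sketched as a Python regular expression. This is only an illustration, not the pattern used by the Hana lexer: the names HANGUL, IDENTIFIER, and is_identifier are hypothetical, and the sketch assumes that a run of digits must come directly after an underscore and that an identifier starts with a Hangul character.

```python
import re

# The three Hangul ranges listed above, written as a character-class body.
HANGUL = r"\u1100-\u11FF\u3130-\u318F\uAC00-\uD7AF"

# Assumed reading of the rules: one or more Hangul characters, then zero or
# more segments that each begin with a single underscore. A segment is either
# digits (optionally followed by Hangul) or Hangul only, so an underscore
# always sits between two Hangul characters or digits.
IDENTIFIER = re.compile(
    rf"[{HANGUL}]+(?:_(?:[0-9]+[{HANGUL}]*|[{HANGUL}]+))*"
)


def is_identifier(text):
    """Return True if the whole string is an identifier under this sketch."""
    return IDENTIFIER.fullmatch(text) is not None
```

Under these assumptions, is_identifier("변수_1") is accepted, while is_identifier("변수1"), is_identifier("_변수"), and is_identifier("변수_") are rejected.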
Regular Expression:
Valid examples:
- '+', '-', '*', '**', '/', '%', '=', '==', '!=', '<', '>', '<=', '>='
- Order of Evaluation
Arithmetic operators follow the PEMDAS rule. This means parentheses have the highest precedence, followed by multiplication (*), division (/), and modulus (%), which take precedence over addition (+) and subtraction (-).
- Addition and Subtraction
We permit the addition and subtraction operators between two variables of type int, or between two expressions that result in int. The operands on both sides of the operator can be of different types (e.g. int + float). We do not support shorthand operators such as ++ for incrementing or decrementing by 1.
- Multiplication and Division
Only int and float operands are allowed (traditional mathematical multiplication and division). Division keeps the quotient and discards the remainder.
- Modulo
Modulo (%) returns the remainder of a division. Only int operands are allowed.
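The operand rules above can be illustrated with a small Python helper. This is a hedged sketch, not the code Hana generates: apply_arithmetic is a hypothetical name, and "the remainder is removed" is assumed here to mean Python floor division (//), which rounds toward negative infinity for negative operands.

```python
def apply_arithmetic(op, left, right):
    """Apply one arithmetic operator to two already-evaluated operands."""
    for operand in (left, right):
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(operand, bool) or not isinstance(operand, (int, float)):
            raise TypeError(f"operator {op!r} needs int or float operands")

    if op == "%":
        # Modulo is restricted to int operands.
        if isinstance(left, float) or isinstance(right, float):
            raise TypeError("modulo only accepts int operands")
        return left % right
    if op == "+":
        return left + right  # mixed int + float is allowed
    if op == "-":
        return left - right
    if op == "*":
        return left * right
    if op == "/":
        # Assumption: keep the quotient and drop the remainder.
        return left // right
    raise ValueError(f"unsupported operator: {op!r}")
```

For example, apply_arithmetic("/", 7, 2) returns 3 and apply_arithmetic("+", 1, 2.5) returns 3.5, while apply_arithmetic("%", 7.0, 2) raises TypeError.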
- ( : Open Parenthesis
- ) : Close Parenthesis
- { : Open Brace
- } : Close Brace
- , : Comma
- : : Colon (used after control flow statements like 만약 or after function definitions like 함수)
- ; : Semicolon (optional terminator, for clarity or separation)
Regex Rule: [\(\)\{\},:;]
- Comment Tokens: Single-line Comments: In Hana, comments can start with a hash # (similar to Python). Everything following # on that line is ignored by the interpreter. Pattern: #.* (matches everything after # to the end of the line).
- String Tokens: String Literals: Strings are enclosed in double quotes ("). All characters within the quotes are considered part of the string until the closing quote. Pattern: "(?:[^"\\]|\\.)*" (matches any sequence of characters between double quotes, allowing escaped characters like \").
- Integer: Sequence of digits: '[0-9]+'
- Float: Sequence of digits, decimal point, sequence of digits: '[0-9]+\.[0-9]+'
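The punctuation, comment, string, and number patterns can be combined into one scanner. The following is a minimal sketch rather than the actual Hana lexer: TOKEN_SPEC, tokenize, and the token kind names are hypothetical, the decimal point and the string escape backslash are written explicitly (\. and \\.), and identifiers, keywords, and operators are left out.

```python
import re

# Order matters: FLOAT is tried before INTEGER so "3.14" is not split.
TOKEN_SPEC = [
    ("COMMENT", r"#.*"),                 # hash to end of line
    ("STRING", r'"(?:[^"\\]|\\.)*"'),    # double-quoted, with escapes
    ("FLOAT", r"[0-9]+\.[0-9]+"),
    ("INTEGER", r"[0-9]+"),
    ("PUNCT", r"[\(\)\{\},:;]"),
    ("SKIP", r"\s+"),                    # whitespace between tokens
    ("MISMATCH", r"."),                  # anything this sketch does not cover
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Return a list of (kind, text) pairs, dropping comments and whitespace."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind in ("SKIP", "COMMENT"):
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens
```

Because the scanner moves left to right, a # inside a string literal is consumed as part of the STRING token rather than starting a comment.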
Je Yang (jy3342) and Ella Kim (yk3040)