This module implements all the Unicode Normalization Form algorithms.
The normalization is buffered, so the algorithm takes O(n) time and O(1) space, which makes it suitable for untrusted text and streaming.
The buffered result is not guaranteed to equal the unbuffered one, although the two usually differ only for malformed text. The buffer may be flushed before it is completely full.
NFD applies canonical decomposition. NFC applies canonical decomposition, then canonical composition. NFKD applies compatibility decomposition. NFKC applies compatibility decomposition, then canonical composition.
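As a rough illustration of how the four forms differ, the sketch below normalizes a precomposed letter and a ligature. It assumes the module is imported as `normalize` (the import name is not stated in this page); the expected values follow from the Unicode decomposition data for U+00E9 and U+FB01.

```nim
import normalize  # assumption: the module's import name

# U+00E9 is precomposed "é"; the other spelling is "e" + U+0301 (combining acute).
let composed = "\u00E9"
let decomposed = "e\u0301"

# NFD: canonical decomposition only.
doAssert toNFD(composed) == decomposed
# NFC: canonical decomposition, then canonical composition.
doAssert toNFC(decomposed) == composed

# U+FB01 is the "fi" ligature; it has a compatibility decomposition
# but no canonical one.
let ligature = "\uFB01"
doAssert toNFD(ligature) == ligature  # canonical forms leave it unchanged
doAssert toNFKD(ligature) == "fi"     # compatibility forms expand it
doAssert toNFKC(ligature) == "fi"
```

The compatibility forms are lossy (the ligature cannot be recovered from "fi"), so they are usually a better fit for search or comparison than for storing text.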
Procs
proc toNFD(s: string): string {...}{.raises: [], tags: [].}
- Return the NFD-normalized input. The result may be up to 3 times the size of the input.
proc toNFD(s: seq[Rune]): seq[Rune] {...}{.deprecated: "Use toNFD(string)", raises: [], tags: [].}
- Return the NFD-normalized input. The result may be up to 4 times the size of the input.
proc toNFC(s: string): string {...}{.raises: [], tags: [].}
- Return the NFC-normalized input. The result may be up to 3 times the size of the input.
proc toNFC(s: seq[Rune]): seq[Rune] {...}{.deprecated: "Use toNFC(string)", raises: [], tags: [].}
- Return the NFC-normalized input. The result may be up to 3 times the size of the input.
proc toNFKD(s: string): string {...}{.raises: [], tags: [].}
- Return the NFKD-normalized input. The result may be up to 11 times the size of the input.
proc toNFKD(s: seq[Rune]): seq[Rune] {...}{.deprecated: "Use toNFKD(string)", raises: [], tags: [].}
- Return the NFKD-normalized input. The result may be up to 18 times the size of the input.
proc toNFKC(s: string): string {...}{.raises: [], tags: [].}
- Return the NFKC-normalized input. The result may be up to 11 times the size of the input.
proc toNFKC(s: seq[Rune]): seq[Rune] {...}{.deprecated: "Use toNFKC(string)", raises: [], tags: [].}
- Return the NFKC-normalized input. The result may be up to 18 times the size of the input.
proc isNFC(s: string): bool {...}{.inline, raises: [], tags: [].}
- Return whether the Unicode characters are NFC-normalized. For some inputs the result is always false, even if the input is normalized.
proc isNFC(s: seq[Rune]): bool {...}{.inline, deprecated: "Use isNFC(string)", raises: [], tags: [].}
- Return whether the Unicode characters are NFC-normalized. For some inputs the result is always false, even if the input is normalized.
proc isNFD(s: string): bool {...}{.inline, raises: [], tags: [].}
- Return whether the Unicode characters are NFD-normalized. For some inputs the result is always false, even if the input is normalized.
proc isNFD(s: seq[Rune]): bool {...}{.inline, deprecated: "Use isNFD(string)", raises: [], tags: [].}
- Return whether the Unicode characters are NFD-normalized. For some inputs the result is always false, even if the input is normalized.
proc isNFKC(s: string): bool {...}{.inline, raises: [], tags: [].}
- Return whether the Unicode characters are NFKC-normalized. For some inputs the result is always false, even if the input is normalized.
proc isNFKC(s: seq[Rune]): bool {...}{.inline, deprecated: "Use isNFKC(string)", raises: [], tags: [].}
- Return whether the Unicode characters are NFKC-normalized. For some inputs the result is always false, even if the input is normalized.
proc isNFKD(s: string): bool {...}{.inline, raises: [], tags: [].}
- Return whether the Unicode characters are NFKD-normalized. For some inputs the result is always false, even if the input is normalized.
proc isNFKD(s: seq[Rune]): bool {...}{.inline, deprecated: "Use isNFKD(string)", raises: [], tags: [].}
- Return whether the Unicode characters are NFKD-normalized. For some inputs the result is always false, even if the input is normalized.
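Because the `isNF*` checks may report false for text that is in fact normalized, a false result is best read as "not known to be normalized" and not as "definitely not normalized". A minimal sketch of that usage pattern follows; `ensureNFC` is a hypothetical helper, not part of the module, and the `normalize` import name is an assumption.

```nim
import normalize  # assumption: the module's import name

proc ensureNFC(s: string): string =
  ## Hypothetical helper: skip the normalization pass when the check succeeds.
  ## A false from isNFC only means the check could not confirm the input,
  ## so the fallback is to normalize.
  if isNFC(s):
    result = s
  else:
    result = toNFC(s)

# "e" + U+0301 composes to U+00E9 under NFC.
doAssert ensureNFC("e\u0301") == "\u00E9"
```

Whichever branch is taken, the output is the same; the check is only an optimization.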
proc cmpNfd(a, b: openArray[char]): bool {...}{.raises: [], tags: [].}
- Return whether two strings are canonically equivalent. This is more efficient than normalizing and then comparing, as it does not create temporary strings (i.e. it won't allocate).
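A short sketch of `cmpNfd` next to the normalize-then-compare approach it replaces, again assuming the module is imported as `normalize`. String literals are passed directly, relying on Nim's implicit conversion from `string` to `openArray[char]`.

```nim
import normalize  # assumption: the module's import name

let composed = "\u00E9"     # precomposed "é"
let decomposed = "e\u0301"  # "e" + combining acute accent

# The bytes differ, but the strings are canonically equivalent.
doAssert composed != decomposed
doAssert cmpNfd(composed, decomposed)

# Equivalent check that allocates two temporary strings.
doAssert toNFD(composed) == toNFD(decomposed)

# A plain "e" is not equivalent to "é".
doAssert not cmpNfd("e", composed)
```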
Iterators
iterator toNFD(s: string): Rune {...}{.inline, raises: [], tags: [].}
- Iterate over each NFD-normalized Unicode character.
iterator toNFD(s: seq[Rune]): Rune {...}{.inline, deprecated: "Use toNFD(string)", raises: [], tags: [].}
- Iterate over each NFD-normalized Unicode character.
iterator toNFC(s: string): Rune {...}{.inline, raises: [], tags: [].}
- Iterate over each NFC-normalized Unicode character.
iterator toNFC(s: seq[Rune]): Rune {...}{.inline, deprecated: "Use toNFC(string)", raises: [], tags: [].}
- Iterate over each NFC-normalized Unicode character.
iterator toNFKD(s: string): Rune {...}{.inline, raises: [], tags: [].}
- Iterate over each NFKD-normalized Unicode character.
iterator toNFKD(s: seq[Rune]): Rune {...}{.inline, deprecated: "Use toNFKD(string)", raises: [], tags: [].}
- Iterate over each NFKD-normalized Unicode character.
iterator toNFKC(s: string): Rune {...}{.inline, raises: [], tags: [].}
- Iterate over each NFKC-normalized Unicode character.
iterator toNFKC(s: seq[Rune]): Rune {...}{.inline, deprecated: "Use toNFKC(string)", raises: [], tags: [].}
- Iterate over each NFKC-normalized Unicode character.
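The iterators yield normalized runes one at a time, which avoids building the whole output string. A minimal sketch, assuming the `normalize` import name and using `std/unicode` for the `Rune` type:

```nim
import std/unicode
import normalize  # assumption: the module's import name

# Collect the runes produced by the NFD iterator for precomposed "é" (U+00E9).
var runes: seq[Rune]
for r in toNFD("\u00E9"):
  runes.add r

# Canonical decomposition yields "e" followed by the combining acute accent.
doAssert runes.len == 2
doAssert int32(runes[0]) == 0x65
doAssert int32(runes[1]) == 0x0301
```

In a `for` loop Nim selects the iterator overload; in an ordinary expression the same name resolves to the proc that returns a string.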