# from Markus Kuhn's UTF-8 decoder capability and stress test
# https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

# 1  The Greek word 'kosme'
κόσμε

# 2  Boundary condition test cases
# 2.1  First possible sequence of a certain length

ࠀ
𐀀


# 2.2  Last possible sequence of a certain length

߿
￿



# 2.3  Other boundary conditions
퟿

�
􏿿


# 3  Malformed sequences
# 3.1  Unexpected continuation bytes









# 3.1.9  Sequence of all 64 possible continuation bytes (0x80-0xbf):


# 3.2  Lonely start characters
# 3.2.1  All 32 first bytes of 2-byte sequences (0xc0-0xdf), each followed by a space character:
               
# 3.2.2  All 16 first bytes of 3-byte sequences (0xe0-0xef), each followed by a space character:
                
# 3.2.3  All 8 first bytes of 4-byte sequences (0xf0-0xf7), each followed by a space character:
        
# 3.2.4  All 4 first bytes of 5-byte sequences (0xf8-0xfb), each followed by a space character:
    
# 3.2.5  All 2 first bytes of 6-byte sequences (0xfc-0xfd), each followed by a space character:
  

# 3.3  Sequences with last continuation byte missing











# 3.4  Concatenation of incomplete sequences


# 3.5  Impossible bytes




# 4  Overlong sequences
# 4.1  Examples of an overlong ASCII character





# 4.2  Maximum overlong sequences





# 4.3  Overlong representation of the NUL character






# 5  Illegal code positions
# 5.1  Single UTF-16 surrogates







# 5.2  Paired UTF-16 surrogates








# 5.3  Noncharacter code positions
￾
￿
# 5.3.3  Other noncharacters (U+FDD0..U+FDEF):
﷐﷑﷒﷓﷔﷕﷖﷗﷘﷙﷚﷛﷜﷝﷞﷟﷠﷡﷢﷣﷤﷥﷦﷧﷨﷩﷪﷫﷬﷭﷮﷯
# 5.3.4  U+nFFFE U+nFFFF (for n = 1..10)
🿾🿿𯿾𯿿𿿾𿿿񏿾񏿿񟿾񟿿񯿾񯿿񿿾񿿿򏿾򏿿
