Difference between revisions of "Ar Tonelico III"
Blutorange (talk | contribs) |
Revision as of 17:17, 7 February 2011
Lots and lots of text and lots of obscure kanji! First I'll finish the game, then I can re-watch the cosmosphere events from the extra menu. I will probably do the cosmospheres (they're ridiculously funny) and perhaps some talk events; if there is any interest in more, let's see.
Translations
(or should I translate to German??)
Tips
I used the Japanese IME "canna" under ubuntu, compiled from source, and changed kana-kanji dictionaries, so that kanji+furigana is written upon entering and converting Japanese te
Script
Now I managed to dump the script. Much better than having to write by hand:) And I also ripped the voice clips and bgm, character poses, the textures, I can view a few models (not the character models though)... As of now, it is only a "pure" script dump without any indication of speakers and poses, but there is a 37-byte delimiter between the UTF8 sequences, so this info is probably right there. The dump can be found here: http://www.2shared.com/file/uvB6AgDM/at3sd_utf8tar.html or http://www.megaupload.com/?d=SH4L2U1H or http://uploading.com/files/8e1m1bmm/at3sd_utf8.tar.gz/
I found the speaker information now, so I'll update the links soon. For anyone who's interested, what I found out:
AT3 ebd script files

• consists of EVENT_MESSAGE_SW[2digit-NUMBER]_[3digit-NUMBER].ebm (called DIAG from now on) and EVENT_SW[2digit-NUMBER]_[3digit-NUMBER].ebm (called CTRL from now on)
• each DIAG corresponds to a CTRL file with the same NUMBERs
• DIAG contains the main dialogue lines, while CTRL is probably system-related
• DIAG files are also usually only a few hundred bytes long
• DIAG has a header of 3 bytes, then comes the main part

• the first 26 bytes of CTRL are as follows (decimal):
[#1] 000 000 000 000 000 000 000 005 000 000 000 110 097 109 101 000 005 000 000 000 144 224 150 190 000 [#2] [#3] 000 000 [#4] 000 [#5] [#6] [#7] [#8] [#8] 000 [#9] 000
where the [#n] are
-- #1 takes many different values, 001 is very frequent (~50%)
-- #2 takes many different values
-- #3 mostly 000, a few times 001, 002, 4 times 003, 3 times 004
-- #4 mostly small bytes <=021, 021 and 00x frequently occur in adjacent files together, takes 044 in two instances
-- #5 either 000, 016, 049, or 064
-- #6 always either 113, 116, 117, 119, or 127
-- #7 bytes <=025, either 00x or 02x with x<=5 except a handful of times
-- #8 almost always 000, except in 10 and 7 files respectively
-- #9 either 000, 001, 002, 003, 004, 005, 017, 019, 021, or 025, with the lower bytes much more common
• the last byte of CTRL always seems to be <bh:7f>, the last 26 bytes being only somewhat similar
• in general, CTRL displays a high ratio of <bh:00>
• CTRL contains no UTF8 chars
• the main part of CTRL, apart from the many 0's, contains only ASCII chars, most of which are LATIN characters and punctuation, with a few special chars such as <bh:f4>, <bh:dc> (Ü)

• the main part of DIAG is in the following format, after the 3-byte header comes:
[SEPARATOR][UTF8-sequence][SEPARATOR][UTF8-sequence] ... [UTF8-sequence][SEPARATOR]
• as the text is Japanese, [UTF8-sequence] is usually a multiple of 3-byte blocks, each block being the multi-byte encoding of one Japanese character; it terminates on a zero-byte
• the main text may contain a ※削除※ ("deleted") line, [LEADING] is then <bh:ff>
• [SEPARATOR] always consists of 36 bytes (not counting the <bh:00> byte that terminates the UTF8 sequence), each byte smaller than <bd:192>, with the only exception that it may also contain <bh:ff>
• [SEPARATOR]: most bytes are constant, except the following meaningful bytes
-- the 25th byte: it is a [LEADING] number, counting the dialogue lines; a [LEADING] byte of <bh:ff> means this line is outside the "normal" dialogue flow, i.e. a system message ("You got item..") or "Party member xyz joined." or "……。" or "…!?" &c.
-- the 13th byte: this indicates the [SPEAKER]. [SPEAKER] is <bh:ff> when there is no speaker

TO SUMMARIZE
• dialogue in EVENT-MESSAGE file: [3 byte header][36-byte separator][UTF8 byte sequence, terminating on <bh:00>], repeat
• 13th byte of [SEPARATOR] is speaker, 26th byte of [SEPARATOR] marks "normal" spoken text
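The DIAG layout described above can be sketched as a small Python reader. This is only a rough sketch based on the notes, not a tested tool: the names parse_diag and DialogueLine are made up for illustration, and the byte positions are assumptions taken from the list (13th byte = speaker, 25th byte = leading number). The summary mentions the 26th byte instead, so LEADING_INDEX may need to be changed to 25.

```python
from dataclasses import dataclass
from typing import List

HEADER_LEN = 3       # DIAG header length, per the notes
SEPARATOR_LEN = 36   # separator length, not counting the 0x00 text terminator
SPEAKER_INDEX = 12   # 13th separator byte (0-based index 12)
LEADING_INDEX = 24   # 25th separator byte; assumption, the summary says 26th
NONE_MARKER = 0xFF   # "no speaker" / "outside the normal dialogue flow"


@dataclass
class DialogueLine:
    speaker: int   # raw speaker byte, 0xFF when there is no speaker
    leading: int   # raw line counter byte, 0xFF for system messages etc.
    text: str

    @property
    def is_normal(self) -> bool:
        """True when the line belongs to the "normal" dialogue flow."""
        return self.leading != NONE_MARKER


def parse_diag(data: bytes) -> List[DialogueLine]:
    """Split a DIAG blob into [separator][UTF-8 text][0x00] records."""
    lines: List[DialogueLine] = []
    pos = HEADER_LEN
    while pos + SEPARATOR_LEN <= len(data):
        sep = data[pos:pos + SEPARATOR_LEN]
        pos += SEPARATOR_LEN
        end = data.find(b"\x00", pos)   # text runs up to the next zero byte
        if end == -1:
            end = len(data)
        raw = data[pos:end]
        pos = end + 1                   # skip the terminating zero byte
        if not raw:
            continue                    # trailing separator without text
        text = raw.decode("utf-8", errors="replace")
        lines.append(DialogueLine(sep[SPEAKER_INDEX], sep[LEADING_INDEX], text))
    return lines
```

To try it on a real file, something like parse_diag(pathlib.Path("EVENT_MESSAGE_SW01_001.ebm").read_bytes()) should do; the numbers in that file name are just an example following the naming pattern. Undecodable bytes are replaced rather than raising an error, which makes it easier to spot files where the assumed layout does not hold.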