I wrote a [link] because the classic XML format is tedious to program against. I'm going to use this to build a smaller client. The v62 client is 1GB in size, but it could be much smaller after removing unused content.
Code:
423mb Map.wz
255mb Sound.wz
196mb Mob.wz
136mb Character.wz
45mb Skill.wz
29mb Npc.wz
27mb Reactor.wz
12mb UI.wz
12mb Item.wz
7mb Effect.wz
3mb Morph.wz
2mb Quest.wz
2mb String.wz
0mb Etc.wz
0mb List.wz
0mb Base.wz
0mb TamingMob.wz
I started off looking into Harepacker resurrected. However, its UI is too limited for my needs, both for exploration and for bulk deletes. Fortunately, Harepacker provides an option to dump the files to a text format like XML. I expected this to be straightforward to parse, but quickly became dismayed after exploring the schema.
Code:
root
|-- _name: string (nullable = true)
|-- imgdir: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- _name: string (nullable = true)
| | |-- _value_tag: string (nullable = true)
| | |-- canvas: struct (nullable = true)
| | | |-- _basedata: string (nullable = true)
| | | |-- _height: long (nullable = true)
| | | |-- _name: string (nullable = true)
| | | |-- _value_tag: string (nullable = true)
| | | |-- _width: long (nullable = true)
| | |-- float: struct (nullable = true)
| | | |-- _name: string (nullable = true)
| | | |-- _value: double (nullable = true)
| | | |-- _value_tag: string (nullable = true)
| | |-- imgdir: array (nullable = true)
| | | |-- element: struct (containsNull = true)
...
Example schema of Map.wz/Map/*/* via PySpark
Element names encode the type of each value, and attributes hold the node's name and value. If you wanted to filter all maps that contain a particular bgm ID, you would have to open each map's XML file and look for a string named bgm, inside the imgdir named info, inside the imgdir named after the map's ID.
Code:
<imgdir name="info">
<int name="version" value="10"/>
<int name="cloud" value="0"/>
<int name="town" value="1"/>
<float name="mobRate" value="1.0"/>
<string name="bgm" value="Bgm00/FloralLife"/>
<int name="returnMap" value="100000000"/>
<int name="forcedReturn" value="999999999"/>
<int name="hideMinimap" value="0"/>
<int name="moveLimit" value="0"/>
<string name="mapMark" value="Henesys"/>
</imgdir>
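For illustration, here is a minimal sketch of that lookup with Python's standard xml.etree.ElementTree. The sample is trimmed from the dump above. The outer imgdir name ("100000000.img") and the helper map_bgm are my own assumptions, so adjust them to whatever your dump actually produces.

```python
import xml.etree.ElementTree as ET

# Trimmed sample; the outer <imgdir> name is an assumption about how the
# dump wraps each map, not something taken from a real file.
SAMPLE = """
<imgdir name="100000000.img">
  <imgdir name="info">
    <int name="town" value="1"/>
    <string name="bgm" value="Bgm00/FloralLife"/>
    <int name="returnMap" value="100000000"/>
  </imgdir>
</imgdir>
"""


def map_bgm(root):
    """Return the bgm value of a map <imgdir>, or None if it has none."""
    # string[name=bgm] inside imgdir[name=info] inside the map's imgdir
    node = root.find("./imgdir[@name='info']/string[@name='bgm']")
    return None if node is None else node.get("value")


root = ET.fromstring(SAMPLE)
print(root.get("name"), map_bgm(root))
```

Every lookup has to spell out both the tag (the type) and the name attribute, which is what makes the format tedious to query in bulk.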
This format is too flexible for its own good. While data is lossless in the classic XML format, it's [link]. The folks who wrote MCDB had the right idea, but I haven't seen the source to compile wz's or a copy of the v62 client. I forked MapleLib and wrote a JSON serializer that generates data that can be queried like this:
Code:
>>> import json
>>> from pathlib import Path
>>> d = json.loads(Path("Map.wz/Map/Map0/000000000.img.json").read_text())
>>> d["payload"]["info"]["bgm"]
'BgmJp/FirstStepMaster'
It's much easier to work with relationships between all of the IDs in the JSON when values are elements of the tree instead of attributes of a node. The downside is that the JSON can't be transformed back into the original binary format because the types are implicit (e.g. should 1 be a float, an int, a short, or a double?). I'd probably go with Avro if I wanted to do anything more complex outside of MapleLib that requires repacking.
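To make the type problem concrete, here is a small sketch that walks a typed XML node twice: once collapsing it to bare values, and once keeping the tag next to each value. This is not how the forked MapleLib serializer works and it is not Avro. The function names, the tagged layout, the short element named "tick", and the assumption that short and double are tag names are all mine.

```python
import json
import xml.etree.ElementTree as ET

# The <short> element "tick" is made up to sit next to a real <int>.
SAMPLE = """
<imgdir name="info">
  <int name="version" value="10"/>
  <short name="tick" value="10"/>
  <float name="mobRate" value="1.0"/>
  <string name="bgm" value="Bgm00/FloralLife"/>
</imgdir>
"""

# Converters for scalar tags; "short" and "double" are assumed tag names.
SCALARS = {"int": int, "short": int, "float": float, "double": float, "string": str}


def to_plain(node):
    """Collapse a typed node into a bare Python value (types become implicit)."""
    if node.tag == "imgdir":
        return {child.get("name"): to_plain(child) for child in node}
    return SCALARS[node.tag](node.get("value"))


def to_tagged(node):
    """Same walk, but keep the tag so the original type could be rebuilt."""
    if node.tag == "imgdir":
        return {child.get("name"): to_tagged(child) for child in node}
    return {"type": node.tag, "value": SCALARS[node.tag](node.get("value"))}


info = ET.fromstring(SAMPLE)
plain = to_plain(info)
tagged = to_tagged(info)
print(json.dumps(plain))
print(json.dumps(tagged["version"]), json.dumps(tagged["tick"]))
```

In the plain form, version and tick are indistinguishable, so a repacker would have to guess. The tagged form keeps the information but gives up most of the convenience that made the JSON attractive in the first place.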
One nice feature of the JSON data is that things that look like arrays are treated like arrays. Here's a look at the portals section:
Code:
| |-- portal: array (nullable = true)
| | |-- element: struct (containsNull = true)
| | | |-- index: long (nullable = true)
| | | |-- item: struct (nullable = true)
| | | | |-- delay: long (nullable = true)
| | | | |-- hideTooltip: long (nullable = true)
| | | | |-- image: string (nullable = true)
| | | | |-- onlyOnce: long (nullable = true)
| | | | |-- pn: string (nullable = true)
| | | | |-- pt: long (nullable = true)
| | | | |-- script: string (nullable = true)
| | | | |-- tm: long (nullable = true)
| | | | |-- tn: string (nullable = true)
| | | | |-- x: long (nullable = true)
| | | | |-- y: long (nullable = true)
This means you can do something along these lines in SQL:
Code:
SELECT
name, portal.item.x, portal.item.y
FROM maps,
UNNEST(payload.portal) portal
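If you don't have a SQL engine handy, the same flattening can be sketched in plain Python. The documents below are made up but follow the portal schema shown earlier (each array element has an index and an item). The top-level name key and the helper portal_rows are assumptions on my part.

```python
# Hypothetical documents shaped like the portal schema above.
maps = [
    {
        "name": "000000000.img",
        "payload": {
            "portal": [
                {"index": 0, "item": {"pn": "sp", "tm": 999999999, "x": -120, "y": 35}},
                {"index": 1, "item": {"pn": "out00", "tm": 1000000, "x": 480, "y": 35}},
            ]
        },
    },
    {"name": "000000001.img", "payload": {}},  # a map without a portal section
]


def portal_rows(docs):
    """Yield one (map name, x, y) row per portal, like UNNEST does in SQL."""
    for doc in docs:
        for entry in doc["payload"].get("portal", []):
            item = entry["item"]
            yield doc["name"], item["x"], item["y"]


rows = list(portal_rows(maps))
for row in rows:
    print(row)
```

Maps without portals simply contribute no rows, which matches what a comma join against UNNEST does.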
I moved my attention to looking at the [link]. This was built by taking the map ID, the return map ID, and the portal map ID of every map (with the filter "pn <> 'sp' and tm <> 999999999") and plotting them in Gephi. I made a smaller one by filtering out all of the maps in Victoria.
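As a rough sketch of how such an edge list could be produced for Gephi, the snippet below applies the same portal filter and writes a CSV with Source and Target columns. The input layout, the IDs, and the function names are hypothetical. Skipping return maps that point back at the same map is my own choice, not part of the original filter.

```python
import csv
import io

NO_MAP = 999999999  # the "no map" value used in the filter above

# Hypothetical input: map id -> (returnMap, [(pn, tm), ...])
maps = {
    100000000: (100000000, [("sp", NO_MAP), ("east00", 100000001)]),
    100000001: (100000000, [("west00", 100000000), ("sp", NO_MAP)]),
}


def edges(maps):
    """Yield (source, target) pairs from return maps and portals."""
    for map_id, (return_map, portals) in maps.items():
        # Assumption: a return map equal to the map itself adds nothing.
        if return_map not in (NO_MAP, map_id):
            yield map_id, return_map
        for pn, tm in portals:
            if pn != "sp" and tm != NO_MAP:
                yield map_id, tm


def to_csv(pairs):
    """Serialize de-duplicated edges as a Source,Target CSV string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(sorted(set(pairs)))
    return buf.getvalue()


edge_csv = to_csv(edges(maps))
print(edge_csv)
```

Saving that string to a file gives something Gephi's spreadsheet importer should accept as an edge table.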
So with this, I have a list of all the map IDs I want to keep.
There are a few things on my TODO list. The first is to omit directories during repacking given a list of IDs. The next is to list connected maps/mobs/sounds given a set of maps (say, the maps in Ludibrium). Finally, I need to test this with a real client and server emulator, because I'm not familiar enough to know what will happen if there are missing IDs.
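For the second item, a breadth-first walk over the map graph is probably enough to start with. This is only a sketch: the adjacency dict and its IDs are made up, and it covers maps only, not mobs or sounds.

```python
from collections import deque

# Hypothetical adjacency: map id -> map ids reachable via portal or return map.
graph = {
    220000000: [220000100, 220000200],
    220000100: [220000000],
    220000200: [220000300],
    220000300: [],
    100000000: [100000001],
}


def reachable(graph, seeds):
    """Return every map id reachable from the seed maps, seeds included."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


keep = reachable(graph, [220000000])
print(sorted(keep))
```

The resulting set is the kind of keep-list the repacking step would need. Mobs and sounds could be collected the same way by adding their IDs as extra node types.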