Parsing the info XML

Having completed the tooltip XML parsing, it's time to work on the info XML parsing. So far I have learned two things:
1. There's a lot of extraneous crap in the info XML that likely isn't going to do anyone much good.
2. 65,535 bytes is not enough to hold the info XML for some items (e.g. Rune Thread).

The second one took me a little while to figure out. Suddenly REXML was crashing and I couldn't quite figure out why. It turns out that I originally stored the XML in a TEXT column, which MySQL limits to 65,535 bytes (bytes, not characters, so multi-byte text hits the limit even sooner). Anything longer was presumably getting cut off on the way in, leaving REXML with a document that was no longer well-formed. Changing the column to a MEDIUMTEXT raises the capacity to 16,777,215 bytes (about 16 MB), which should be more than sufficient.
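To make the size limits concrete, here is a minimal Ruby sketch that checks a string against the MySQL text column capacities before it gets stored. The helper names (smallest_text_column, fits_in?) are made up for illustration and are not part of the actual parser. The limits are the documented MySQL maximums in bytes.

```ruby
# MySQL text column capacities, in bytes (not characters).
TEXT_COLUMN_LIMITS = {
  "TINYTEXT"   => 255,
  "TEXT"       => 65_535,
  "MEDIUMTEXT" => 16_777_215,
  "LONGTEXT"   => 4_294_967_295
}.freeze

# Hypothetical helper: returns the name of the smallest column type
# that can hold the string, or nil if nothing is big enough.
def smallest_text_column(str)
  size = str.bytesize
  name, _limit = TEXT_COLUMN_LIMITS.find { |_name, limit| size <= limit }
  name
end

# Hypothetical helper: true if the string fits in the given column type.
# Uses bytesize, because String#length counts characters and would
# under-report multi-byte UTF-8 text.
def fits_in?(str, column_type)
  str.bytesize <= TEXT_COLUMN_LIMITS.fetch(column_type)
end
```

Running something like fits_in?(xml, "TEXT") before each insert would have flagged the oversized items up front instead of leaving REXML to trip over them later. The column change itself is a one-line statement along the lines of ALTER TABLE items MODIFY info_xml MEDIUMTEXT, where the table and column names are placeholders for whatever the real schema uses.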

I'm letting the parse checker run now. If it completes, that'll give me everything I need to work on the next step: extracting the item schema.