XML parsing in Perl

Although you’re strongly advised against writing your own XML parser, the standard ones don’t cope well with malformed XML. The input I was trying to use wasn’t well formed, so I had to write something that didn’t care. It’s only four lines; I can’t see what all the fuss is about. Encodings, blah.

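# Chris's recursive XML parser. It expects one tag per line, only understands
# double-quoted attributes, and ignores text content and mismatched tags.

use strict;
use warnings;

my $d = 0;   # current nesting depth, used for indentation

while (<>) {
p($_);
}

# Print one tag. Returns 1 for a closing tag so the parent's loop can unwind, 0 otherwise.
sub p {
my ($t,$attrs,$c) = $_[0] =~ m:<(/?\w+) ?(\w+=".*")* ?(/)? ?>: or return 0;
if ($t =~ m:^/:) { print "\t" x $d,"+++$t+++ \n\n\n"; $d-- if $d; return 1; } else { print "\t" x $d,"+++$t+++\n"; }
if (defined($attrs)) { while ($attrs =~ m/(\w+)="([^"]*)"/g) { print "\t" x $d,"$1 = $2\n"; } }
if (defined($c)) { print "\n\n"; } else { print "\n\n"; $d++; while (<>) { last if p($_); } } return 0;
}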
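
For anyone who wants the result as data rather than printed text, here is a rough sketch of the same idea without the recursion. It is not part of the original script: `scan_tags` and the sample lines are made up for illustration, and it keeps the same assumptions (one tag per line, double-quoted attributes only, closing tags never checked against their opening tags).

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original script.
# Takes a list of lines and returns one record per opening or
# self-closing tag: [depth, tag name, { attribute => value, ... }].
sub scan_tags {
    my @lines = @_;
    my ($depth, @records) = (0);
    for my $line (@lines) {
        # Tag name, optional attribute run, optional "/" for self-closing.
        # Lines with no recognisable tag are skipped.
        my ($tag, $attrs, $selfclose) =
            $line =~ m{<(/?\w+) ?(\w+=".*")* ?(/)? ?>} or next;
        if ($tag =~ m{^/}) {
            # Closing tag: step back out, but never below zero on bad input.
            $depth-- if $depth;
            next;
        }
        my %attr;
        if (defined $attrs) {
            # Pull out each name="value" pair; values may contain spaces.
            $attr{$1} = $2 while $attrs =~ m/(\w+)="([^"]*)"/g;
        }
        push @records, [$depth, $tag, \%attr];
        $depth++ unless defined $selfclose;
    }
    return @records;
}

# Made-up sample input, including a stray closing tag at the end.
my @sample = (
    '<?xml version="1.0"?>',
    '<library name="city branch">',
    '<book id="1" title="Perl and XML" />',
    '</library>',
    '</stray>',
);

for my $r (scan_tags(@sample)) {
    my ($depth, $tag, $attr) = @$r;
    print "\t" x $depth, $tag, "\n";
    print "\t" x $depth, "  $_ = $attr->{$_}\n" for sort keys %$attr;
}
```

The XML declaration and the stray closing tag are both ignored, so the sample yields two records: `library` at depth 0 and `book` at depth 1. An explicit depth counter in a loop avoids growing the call stack on deeply nested input, at the cost of the "recursive" bragging rights.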
