[Catalyst] Alien::Dojo uses regexes to parse HTML, so what?

Dominique Quatravaux dom at idealx.com
Tue May 30 13:40:56 CEST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thomas Hartman wrote:

> I hate to throw fuel on the fire, but what personally convinced
> *me* that regexes are a bad idea for parsing html was the issue of
> html comments.

This point is not being debated (at least not by me). What I am trying
to do is to fetch a single URI, which is a strict sub-problem of
parsing the HTML. It only needs a (IMO) fairly benign set of assumptions
about what the author of the Dojo homepage will refrain from doing. No
guarantees, of course.
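
To make the kind of assumption concrete, here is a minimal sketch (in
Python, purely for illustration). The pattern, the function name and the
sample markup are all hypothetical; none of them are taken from
Alien::Dojo or from the actual Dojo homepage. It assumes the link is a
double-quoted href ending in .tar.gz and that the first such link is the
one wanted.

```python
import re

# Hypothetical pattern: a double-quoted href whose value ends in .tar.gz.
# This encodes the "benign assumptions": the attribute is double-quoted,
# and the first match on the page is the link we are after.
LINK_RE = re.compile(r'href="([^"]+\.tar\.gz)"', re.IGNORECASE)


def find_download_uri(html):
    """Return the first matching URI, or None if nothing matches."""
    match = LINK_RE.search(html)
    return match.group(1) if match else None


# Made-up sample markup, not the real page.
SAMPLE = '<p><a href="http://example.org/files/release-1.0.tar.gz">Download</a></p>'

if __name__ == "__main__":
    print(find_download_uri(SAMPLE))
```

A link sitting inside an HTML comment would be matched just the same,
which is exactly the kind of thing the page author is assumed not to do.
When nothing matches, the function returns None, so a change in page
layout is at least detectable.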

- --
Dominique QUATRAVAUX                           Senior Engineer
01 44 42 00 08                                 IDEALX

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFEfC9HMJAKAU3mjcsRAqNTAJ9KRw0LC+x3Q87k6EPZvnozypEiVwCfbYRe
npe5+nqH39/9i76fZiYbJSQ=
=9JVF
-----END PGP SIGNATURE-----




