[Catalyst] Alien::Dojo uses regexes to parse HTML, so what?
Dominique Quatravaux
dom at idealx.com
Tue May 30 13:40:56 CEST 2006
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Thomas Hartman wrote:
> I hate to throw fuel on the fire, but what personally convinced
> *me* that regexes are a bad idea for parsing html was the issue of
> html comments.
That point is not in dispute (at least not by me). What I am trying
to do is extract a single URI from the page. That is a strict
sub-problem of parsing the HTML, and a tractable one under a set of
(IMO) fairly benign assumptions about what the author of the Dojo
homepage will refrain from doing. No guarantees, of course.
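
To make the "single URI under benign assumptions" idea concrete, here is
a minimal sketch in Python. It is not the code from Alien::Dojo. The
function name, the .tar.gz link pattern and the sample page are all
hypothetical. It assumes the link is a plain quoted href inside an <a>
tag and that comments are not nested or hidden inside scripts.

```python
import re
from typing import Optional

# Assumption (hypothetical, not taken from Alien::Dojo): the link we want is a
# quoted href inside an <a> tag whose target ends in ".tar.gz".
HREF_RE = re.compile(
    r"""<a\s[^>]*?href\s*=\s*(["'])(?P<uri>[^"']+?\.tar\.gz)\1""",
    re.IGNORECASE,
)

# Drop HTML comments first so that a commented-out link is not picked up.
# Assumption: comments are well-formed and do not appear inside <script>.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def find_download_uri(html: str) -> Optional[str]:
    """Return the first matching URI outside comments, or None if absent."""
    visible = COMMENT_RE.sub("", html)
    match = HREF_RE.search(visible)
    return match.group("uri") if match else None


# Made-up sample page; example.invalid is a reserved placeholder domain.
SAMPLE = """
<!-- <a href="http://example.invalid/old/dojo-0.1.tar.gz">old</a> -->
<p><a class="dl" href="http://example.invalid/dojo-0.3.tar.gz">Download</a></p>
"""

if __name__ == "__main__":
    print(find_download_uri(SAMPLE))
```

If the page breaks any of these assumptions, the function returns None
or the wrong link, which is exactly the "no guarantees" part.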
- --
Dominique QUATRAVAUX Ingénieur senior
01 44 42 00 08 IDEALX
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFEfC9HMJAKAU3mjcsRAqNTAJ9KRw0LC+x3Q87k6EPZvnozypEiVwCfbYRe
npe5+nqH39/9i76fZiYbJSQ=
=9JVF
-----END PGP SIGNATURE-----