Romain Bignon
|
59dfe3083a
|
delete 'remove_html_tags' global function, and create IParser.tocleanstring and IParser.strip abstract methods.
|
2011-10-25 13:28:43 +02:00 |
|
Romain Bignon
|
2cc992a8bc
|
new parser 'json'
|
2011-09-23 10:00:46 +02:00 |
|
Laurent Bachelier
|
92fc86a033
|
Add support for xpath in LxmlHtmlParser.select
The returned results are similar to those of the cssselect method,
so there wasn't much to do except call it.
|
2011-04-12 01:00:28 +02:00 |
|
Romain Bignon
|
9afb301ebe
|
move select() in parser
|
2011-04-09 11:25:13 +02:00 |
|
Romain Bignon
|
7e2bb91b3b
|
change license to AGPLv3+
|
2011-04-08 12:48:07 +02:00 |
|
Christophe Benz
|
2ab29ac070
|
implement tostring for html5lib parser
|
2010-11-16 15:30:13 +01:00 |
|
Romain Bignon
|
6de583c4ca
|
Revert "do not strip cdata"
This reverts commit 8bd0ebbea2.
|
2010-11-09 13:39:43 +01:00 |
|
Nicolas Duhamel
|
8bd0ebbea2
|
do not strip cdata
|
2010-11-09 12:03:35 +01:00 |
|
Christophe Benz
|
b4c672fa46
|
new select() helper
|
2010-07-14 17:14:53 +02:00 |
|
Christophe Benz
|
470f2a9fe2
|
use real comments for licence header
|
2010-06-22 16:27:33 +02:00 |
|
Romain Bignon
|
89c11ca4a0
|
fix pyflakes errors
|
2010-05-20 10:42:20 +02:00 |
|
Christophe Benz
|
a9c8c93965
|
add new lxmlsoup parser
|
2010-05-20 01:33:54 +02:00 |
|
Romain Bignon
|
fbf639993b
|
misc
|
2010-05-01 14:41:09 +02:00 |
|
Romain Bignon
|
77044dd4be
|
fix typo
|
2010-04-20 21:11:14 +02:00 |
|
Romain Bignon
|
3f9083df27
|
documentation
|
2010-04-16 20:11:36 +02:00 |
|
Christophe Benz
|
f8e2016d59
|
get_parser returns class instead of object
|
2010-04-16 19:41:06 +02:00 |
|
Romain Bignon
|
384e3521c7
|
factorization
|
2010-04-16 18:44:55 +02:00 |
|
Christophe Benz
|
8638024756
|
rename parser/parsers module, add get_parsers() with preference_order
|
2010-04-16 18:11:52 +02:00 |
|