Laurent Bachelier
b8d1a52732
Use simplejson first, and centralize import
simplejson is supposed to be faster:
http://stackoverflow.com/questions/712791/json-and-simplejson-module-differences-in-python
2012-03-16 16:27:22 +01:00
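The import order described in the simplejson commit above can be sketched as follows. This is a minimal illustration of the usual try/except fallback pattern, not the code from the commit itself; binding the module to the name `json` and the helper `roundtrip` are assumptions made for the example.

```python
# Prefer simplejson when it is installed; otherwise fall back to the
# standard library json module. Both expose loads() and dumps().
try:
    import simplejson as json
except ImportError:
    import json


def roundtrip(obj):
    """Serialize obj to a JSON string and parse it back.

    Hypothetical helper, only here to show that callers use the shared
    `json` name without caring which implementation was imported.
    """
    return json.loads(json.dumps(obj))
```

Keeping this try/except in a single module and importing `json` from there is one way to "centralize" the choice, so other modules do not repeat the fallback logic.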
Laurent Bachelier
006e97a8be
PEP8 style fixes and other small style fixes
I used autopep8 on some files and carefully checked the changes.
I ignored E501,E302,E231,E225,E222,E221,E241,E203 in my search, and at
least E501 on any autopep8 run.
Other style fixes not related to PEP8:
* Only use new-style classes. I don't think the usage of old-style
classes was intentional. Old-style classes are removed in Python 3.
* Convert an if/else to a one-liner in mediawiki, change the docstring
style, and turn into a comment something that wasn't really appropriate
for a docstring.
* Unneeded first if condition in meteofrance
2012-03-14 04:51:46 +01:00
Romain Bignon
1fa64bf5f1
default parsers are now only lxml and lxmlsoup, to prevent bad behaviors with bad parsers
2012-02-02 10:17:41 +01:00
Romain Bignon
59dfe3083a
delete 'remove_html_tags' global function, and create IParser.tocleanstring and IParser.strip abstract methods.
2011-10-25 13:28:43 +02:00
Romain Bignon
2cc992a8bc
new parser 'json'
2011-09-23 10:00:46 +02:00
Laurent Bachelier
92fc86a033
Add support for xpath in LxmlHtmlParser.select
The returned results are similar to those of the cssselect method
so there wasn't much to do except calling it.
2011-04-12 01:00:28 +02:00
Romain Bignon
9afb301ebe
move select() in parser
2011-04-09 11:25:13 +02:00
Romain Bignon
7e2bb91b3b
change license to AGPLv3+
2011-04-08 12:48:07 +02:00
Christophe Benz
2ab29ac070
implement tostring for html5lib parser
2010-11-16 15:30:13 +01:00
Romain Bignon
6de583c4ca
Revert "do not strip cdata"
This reverts commit 8bd0ebbea2.
2010-11-09 13:39:43 +01:00
Nicolas Duhamel
8bd0ebbea2
do not strip cdata
2010-11-09 12:03:35 +01:00
Christophe Benz
b4c672fa46
new select() helper
2010-07-14 17:14:53 +02:00
Christophe Benz
470f2a9fe2
use real comments for licence header
2010-06-22 16:27:33 +02:00
Romain Bignon
89c11ca4a0
fix pyflakes errors
2010-05-20 10:42:20 +02:00
Christophe Benz
a9c8c93965
add new lxmlsoup parser
2010-05-20 01:33:54 +02:00
Romain Bignon
fbf639993b
misc
2010-05-01 14:41:09 +02:00
Romain Bignon
77044dd4be
fix typo
2010-04-20 21:11:14 +02:00
Romain Bignon
3f9083df27
documentation
2010-04-16 20:11:36 +02:00
Christophe Benz
f8e2016d59
get_parser returns class instead of object
2010-04-16 19:41:06 +02:00
Romain Bignon
384e3521c7
factorization
2010-04-16 18:44:55 +02:00
Christophe Benz
8638024756
rename parser/parsers module, add get_parsers() with preference_order
2010-04-16 18:11:52 +02:00
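Several commits above mention selecting a parser by preference order and returning a class instead of an object. A minimal sketch of that idea follows, under stated assumptions: the `LOADERS` registry, the loader functions, the `"stdlib"` backend name, and the use of `LookupError` are all hypothetical and chosen for illustration; they are not taken from the project's code.

```python
from html.parser import HTMLParser


def load_lxml():
    # Raises ImportError when lxml is not installed.
    from lxml import etree
    return etree.HTMLParser


def load_stdlib():
    # Always available: the standard library HTML parser class.
    return HTMLParser


# Hypothetical registry mapping a parser name to a function that
# imports the backend lazily and returns the parser *class*.
LOADERS = {"lxml": load_lxml, "stdlib": load_stdlib}


def get_parser(preference_order=("lxml", "stdlib")):
    """Return the first parser class whose backend can be imported."""
    if isinstance(preference_order, str):
        preference_order = (preference_order,)
    for name in preference_order:
        loader = LOADERS.get(name)
        if loader is None:
            continue  # unknown name: try the next preference
        try:
            return loader()
        except ImportError:
            continue  # backend not installed: try the next preference
    raise LookupError("no parser available among %r" % (preference_order,))
```

Returning the class rather than an instance leaves it to the caller to decide when to instantiate the parser and with which arguments.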